Publication Date

2006

Abstract

Policy evaluation is a critical step in the approximate solution of large Markov decision processes (MDPs), typically requiring O(|S|³) to directly solve the Bellman system of |S| linear equations (where |S| is the state space size in the discrete case, and the sample size in the continuous case). In this paper we apply a recently introduced multiscale framework for analysis on graphs to design a faster algorithm for policy evaluation. For a fixed policy π, this framework efficiently constructs a multiscale decomposition of the random walk P^π associated with the policy π. This enables efficiently computing medium and long term state distributions, approximation of value functions, and the direct computation of the potential operator (I − γP^π)⁻¹ needed to solve Bellman's equation. We show that even a preliminary nonoptimized version of the solver competes with highly optimized iterative techniques, requiring in many cases a complexity of O(|S|).
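For orientation, the baseline problem the abstract refers to can be sketched in a few lines. The example below is not the paper's multiscale solver: it only shows the dense direct solve of the Bellman system (I − γP^π)V = R, which costs O(|S|³), next to a plain fixed-point iteration of the kind the paper compares against. The function names, the three-state transition matrix, the reward vector, and the discount factor are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def evaluate_policy_direct(P, R, gamma):
    """Solve (I - gamma * P) V = R with a dense linear solve, O(|S|^3)."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, R)


def evaluate_policy_iterative(P, R, gamma, tol=1e-10, max_iter=10_000):
    """Iterate V <- R + gamma * P V until successive iterates differ by < tol."""
    V = np.zeros_like(R, dtype=float)
    for _ in range(max_iter):
        V_next = R + gamma * (P @ V)
        if np.max(np.abs(V_next - V)) < tol:
            return V_next
        V = V_next
    return V


# Hypothetical 3-state chain under a fixed policy (rows sum to 1).
P = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.1, 0.9],
])
R = np.array([0.0, 0.0, 1.0])  # assumed reward: 1 in the last state only
gamma = 0.9                    # assumed discount factor

V_direct = evaluate_policy_direct(P, R, gamma)
V_iter = evaluate_policy_iterative(P, R, gamma)
```

Both routines should agree up to the iteration tolerance. The direct solve is exact but scales cubically with |S| for dense matrices, which is the cost the abstract's multiscale construction is intended to avoid.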

Comments

This paper was harvested from CiteSeer
