Publication Date

1996

Abstract

We introduce two new temporal difference (TD) algorithms based on the theory of linear leastsquares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove probability-one convergence when it is used with a function approximator linear in the adjustable parameters. We then define a recursive version of this algorithm, Recursive Least-Squares TD (RLS TD). Although these new TD algorithms require more computation per time-step than do Sutton's TD(A) algorithms, they are more efficient in a statistical sense because they extract more information from training experiences. We describe a simulation experiment showing the substantial improvement in learning rate achieved by RLS TD in an example Markov prediction problem. To quantify this improvement, we introduce the TD error variance of a Markov chain, arc,, and experimentally conclude that the convergence rate of a TD algorithm depends linearly on ~ro. In addition to converging more rapidly, LS TD and RLS TD do not have control parameters, such as a learning rate parameter, thus eliminating the possibility of achieving poor performance by an unlucky choice of parameters.

Comments

This paper was harvested from CiteSeer

Recommended Citation

Bradtke, Steven J. and Barto, Andrew G., "Linear Least-Squares Algorithms for Temporal Difference Learning" (1996). Computer Science Department Faculty Publication Series. 9.
Retrieved from https://scholarworks.umass.edu/cs_faculty_pubs/9

Download

Included in

Computer Sciences Commons

COinS

ScholarWorks@UMass Amherst

Computer Science Department Faculty Publication Series

Linear Least-Squares Algorithms for Temporal Difference Learning

Publication Date

Abstract

Comments

Recommended Citation

Included in

Browse

Author Corner

Links

ScholarWorks@UMass Amherst

Computer Science Department Faculty Publication Series

Linear Least-Squares Algorithms for Temporal Difference Learning

Authors

Publication Date

Abstract

Comments

Recommended Citation

Included in

Share

Browse

Author Corner

Links