A Simple Convergence Proof For Stochastic Approximation Using Converse Lyapunov Theory
Mathukumalli Vidyasagar (Indian Institute of Technology Hyderabad)
Calvin Lab Auditorium
Traditionally, convergence theorems for the stochastic approximation (SA) algorithm were along the lines of "IF the iterations are almost surely bounded, THEN the iterations converge to a solution" (subject, of course, to other conditions). A breakthrough paper by Borkar and Meyn (2000) made the almost sure boundedness of the iterates a part of the conclusions, rather than a part of the hypotheses. Their approach was based on the so-called ODE method, which shows that the sample paths of the SA algorithm converge to the deterministic solution trajectories of an ODE. In this talk, I will use an entirely different approach based on martingale convergence theory, originally proposed by Gladyshev (1965) for a limited class of problems. I will present a new converse Lyapunov theorem for globally exponentially stable ODEs. Using this, I am able to prove the convergence of the SA algorithm in a very streamlined fashion, and essentially reproduce the results of Borkar-Meyn. The new approach can, in principle, be extended to variations such as two-time-scale (actor-critic) and function-approximation versions of SA. That research is in progress.
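As a point of reference for the algorithm under discussion (not material from the talk itself), the classical Robbins-Monro form of stochastic approximation seeks a zero of a function f from noisy evaluations, via the iteration x_{k+1} = x_k + a_k (f(x_k) + noise) with diminishing step sizes. The minimal sketch below uses the illustrative choice f(x) = -x, whose associated ODE x' = -x is globally exponentially stable, so the iterates should converge to the equilibrium at 0:

```python
import numpy as np

# Illustrative sketch of Robbins-Monro stochastic approximation.
# f(x) = -x is a hypothetical example; its ODE x' = -x is globally
# exponentially stable, with unique equilibrium x* = 0.
rng = np.random.default_rng(0)
x = 5.0
for k in range(1, 20001):
    a_k = 1.0 / k           # step sizes: sum a_k = inf, sum a_k^2 < inf
    noise = rng.normal()    # zero-mean measurement noise
    x = x + a_k * (-x + noise)

print(abs(x))  # small: the iterates have settled near x* = 0
```

The step-size conditions (divergent sum, summable squares) are the standard hypotheses under which such convergence theorems are stated; the boundedness question addressed by Borkar-Meyn, and by the converse-Lyapunov approach of this talk, concerns whether the iterates can be guaranteed to remain bounded without assuming it.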