Learning to Control Unknown Systems

Monday, March 26th, 2018, 2:45 pm–3:15 pm

Event:

Societal Networks

Speaker:

Rahul Jain, University of Southern California

Location:

Calvin Lab Auditorium

We consider the problem of designing "learning to control" algorithms for stochastic systems whose model parameters are unknown. Two models are considered: Markov decision processes and linear stochastic systems. A Thompson-sampling-based regret-minimization learning framework is developed for trading off exploration versus exploitation. Sampling from a posterior distribution on the unknown parameters at regular intervals provides the necessary exploration for learning. This obviates the need for expensive computation for exploration and makes the algorithms suitable for real-time decision-making and learning. The key is designing a suitable exploration schedule. The proposed algorithms achieve O(√T) expected regret, which is order-optimal. The approach differs from classical adaptive control algorithms in its focus on non-asymptotic regret optimality as opposed to asymptotic stability. Numerical evaluation suggests robust performance of the algorithms.
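
As a rough illustration of the posterior-sampling idea described in the abstract, the following Python sketch runs Thompson sampling on a toy two-state, two-action Markov decision process. Everything specific here is an assumption made for illustration and is not taken from the talk: the toy transition matrix TRUE_P, the known reward table R, the Dirichlet prior on transitions, the fixed resampling interval of 50 steps, and the use of discounted value iteration to plan on each sampled model. The function names are hypothetical.

```python
import random

# Hypothetical toy MDP: TRUE_P[s][a] is the next-state distribution, R[s][a] the known reward.
TRUE_P = [[[0.9, 0.1], [0.2, 0.8]],
          [[0.7, 0.3], [0.1, 0.9]]]
R = [[0.0, 0.0],
     [1.0, 0.5]]

def sample_dirichlet(counts, rng):
    # Draw from Dirichlet(counts) by normalizing independent Gamma variates.
    draws = [rng.gammavariate(c, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]

def q_value(P, R, V, s, a, gamma):
    # One-step lookahead value of action a in state s under model P.
    return R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))

def greedy_policy(P, R, gamma=0.95, iters=200):
    # Discounted value iteration on the sampled model (an assumed planner).
    n_s, n_a = len(R), len(R[0])
    V = [0.0] * n_s
    for _ in range(iters):
        V = [max(q_value(P, R, V, s, a, gamma) for a in range(n_a))
             for s in range(n_s)]
    return [max(range(n_a), key=lambda a: q_value(P, R, V, s, a, gamma))
            for s in range(n_s)]

def run_posterior_sampling(true_P, R, horizon=2000, interval=50, seed=0):
    rng = random.Random(seed)
    n_s, n_a = len(R), len(R[0])
    # Dirichlet(1, ..., 1) prior on each transition row, stored as pseudo-counts.
    counts = [[[1.0] * n_s for _ in range(n_a)] for _ in range(n_s)]
    state, total_reward, policy = 0, 0.0, None
    for t in range(horizon):
        if t % interval == 0:
            # Exploration step: sample a model from the posterior and plan on it.
            sampled_P = [[sample_dirichlet(counts[s][a], rng) for a in range(n_a)]
                         for s in range(n_s)]
            policy = greedy_policy(sampled_P, R)
        action = policy[state]
        total_reward += R[state][action]
        next_state = rng.choices(range(n_s), weights=true_P[state][action])[0]
        counts[state][action][next_state] += 1.0  # posterior update
        state = next_state
    return total_reward, counts
```

Calling run_posterior_sampling(TRUE_P, R) returns the accumulated reward and the posterior pseudo-counts. Between resampling times the policy is fixed, so the only planning cost is one value-iteration solve per interval. The fixed interval is a simplification; the abstract notes that the choice of exploration schedule is the key design question.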

Attachment	Size
Learning to Control Unknown Systems	2.12 MB