Learning Automata as Building Blocks for MARL

Tuesday, May 3rd, 2022, 2:00 pm–2:30 pm

Event:

Multi-Agent Reinforcement Learning and Bandit Learning

Speaker:

Ann Nowe (Vrije Universiteit Brussel)

Location:

Calvin Lab Auditorium

In this talk I will show that Learning Automata (LA), and more precisely Reward-Inaction update schemes, are interesting building blocks for multi-agent RL, both in bandit settings and in stateful RL. Based on the theorem of Narendra and Wheeler, we have convergence guarantees in n-person non-zero-sum games. However, LA have also been shown to be robust in more relaxed settings, such as queueing systems, where updates happen asynchronously and the feedback sent to the agents is delayed.
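
For readers unfamiliar with the update scheme named in the abstract, the following is a minimal sketch of a linear reward-inaction step and of two independent automata playing a repeated game. It is an illustration under stated assumptions, not the speaker's implementation: the function names, the learning rate, the Bernoulli reward model, and the identical-payoff setup are all hypothetical choices made for the example.

```python
import random


def reward_inaction_update(probs, action, reward, rate=0.05):
    """One linear reward-inaction step (illustrative helper).

    Probability mass moves toward `action` in proportion to `reward`,
    assumed to lie in [0, 1]. A zero reward leaves `probs` unchanged,
    which is the "inaction" part of the scheme.
    """
    return [
        p + rate * reward * (1.0 - p) if i == action else p - rate * reward * p
        for i, p in enumerate(probs)
    ]


def play_common_payoff_game(payoff, steps=5000, rate=0.05, seed=0):
    """Two automata learn independently from a shared stochastic reward.

    Assumption: payoff[a][b] is the probability that the joint action
    (a, b) yields reward 1 for both players. Neither automaton observes
    the other's action, only the reward.
    """
    rng = random.Random(seed)
    n_row, n_col = len(payoff), len(payoff[0])
    p_row = [1.0 / n_row] * n_row  # start from uniform action probabilities
    p_col = [1.0 / n_col] * n_col
    for _ in range(steps):
        a = rng.choices(range(n_row), weights=p_row)[0]
        b = rng.choices(range(n_col), weights=p_col)[0]
        r = 1.0 if rng.random() < payoff[a][b] else 0.0
        p_row = reward_inaction_update(p_row, a, r, rate)
        p_col = reward_inaction_update(p_col, b, r, rate)
    return p_row, p_col
```

Calling `play_common_payoff_game([[0.9, 0.1], [0.1, 0.6]])` returns the two action-probability vectors after learning. Each agent's update uses only its own action and the received reward, which is what makes the scheme decentralized; a smaller `rate` trades learning speed for a closer match to the behavior the convergence results describe.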