Distributed Learning Dynamics Convergence in Routing Games
With the emergence of smartphone based sensing for mobility as the main paradigm for sensing in the last decade, radically new information sets have become available for the driving public. This information enables commuters to make repeated decisions on a daily basis based on anticipated state of the network. This repeated decision-making process creates interesting patterns for the transportation network, in which users might (or might not) reach an equilibrium, depending on the information at their disposal (for example knowing or not what other users of the network are experiencing or doing).
The present talk starts with a brief presentation of the state of the art in traffic monitoring, leading to a new results in routing games. Routing games offer a simple yet powerful model of congestion in traffic networks, both in transportation and communication systems. The congestion in such systems is affected by the combined decision of the agents (drivers or routers), so modeling the decision process of the agents is important, not only to estimate and predict the behavior of the system, but also to be able to control it. This decision process is often called learning, as agents "learn" information about the system or about the other agents.
We propose and study different models of learning with the following requirement: the joint learning dynamics should converge asymptotically to the Nash equilibrium of the game. In particular, we focus on two important properties: Is the model robust to stochastic perturbations (such as measurement noise)? And does the model allow heterogeneous learning (different agents may follow different learning strategies)? We study these questions using tools from online learning theory and stochastic approximation theory.
We then present experimental results obtained with an online gaming application in which distributed players can play the routing game: they connect to the web app and participate in the game, by iteratively making decisions about their routes and observing outcomes. We show preliminary results from data collected from the application. In particular, we propose and solve a model estimation problem to estimate the learning dynamics of the players, and compare the predictions of the model to the actual behavior of the players, and discuss extensions and open questions.