Data-Driven Decision Processes - Seminar
Yifan Wang (Georgia Tech)
Room 116
Title: Bandit Algorithms for Prophet Inequality and Pandora's Box
Abstract: The Prophet Inequality and Pandora's Box problems are fundamental stochastic problems. A usual assumption for both problems is that the probability distributions of the underlying random variables are given as input to the algorithm. In this talk, we assume the distributions are unknown, and study them in the Multi-Armed Bandits model: We interact with the unknown distributions over multiple rounds. In each round we play a policy and receive only bandit feedback. The goal is to minimize the regret, which is the difference in the total value of the optimal algorithm that knows the distributions vs. the total value of our algorithm that learns the distributions from the bandit feedback. Our main results give near-optimal total regret algorithms for both Prophet Inequality and Pandora's Box. This is joint work with Khashayar Gatmiry, Thomas Kesselheim, and Sahil Singla.