Optimizing Intended Reward Functions: Extracting All the Right Information From All the Right Places

Monday, August 31st, 2020, 3:30 pm–4:30 pm

Event:

Theory of Reinforcement Learning Boot Camp

Speaker:

Anca Dragan (UC Berkeley)

Location:

Zoom

AI work tends to focus on how to optimize a specified reward function, but rewards that consistently lead to the desired behavior are not so easy to specify. Rather than optimizing the specified reward, which is already hard, robots have the much harder job of optimizing the intended reward. While the specified reward does not carry as much information as we make our robots pretend it does, the good news is that humans constantly leak information about what the robot should optimize. In this talk, we will explore how to read the right amount of information from different types of human behavior, and even from the lack thereof.

Attachment:

Slides (6.43 MB)