Creating a Zoo of Atari-Playing Agents to Catalyze the Understanding of Deep Reinforcement Learning

Some of the most exciting advances in AI recently have come from the field of deep reinforcement learning (deep RL), where deep neural networks learn to perform complicated tasks from reward signals. RL operates similarly to how you might teach a dog to perform a new trick: treats are offered to reinforce improved behavior. Recently, deep RL agents have exceeded human performance in benchmarks like classic video games (such as Atari 2600 games), the board game Go, and modern computer games like DOTA 2.

One common setup (which our work targets) is for an algorithm to learn to play a single video game, learning only from raw pixels, guided by increases in the game score. Looking beyond video games, we believe RL has great potential for beneficial real-world applications. This is true both at Uber (for example, in improving Uber Eats recommendations or for applications in self-driving cars), and in business and society at large.

However, there is currently much more research focused on improving deep RL performance (e.g., how many points an agent receives in a game) than on understanding the agents trained by deep RL (e.g., whether slight changes in the game an agent is trained on will catastrophically confuse it). Understanding the agents we create helps us develop confidence and trust in them, which is needed before putting RL into sensitive real-world situations.

Source: uber.com