Montezuma’s Revenge Solved by Go-Explore, a New Algorithm for Hard-exploration Problems

  • November 27, 2018

In deep reinforcement learning (RL), solving the Atari games Montezuma’s Revenge and Pitfall has been a grand challenge. These games represent a broad class of challenging, real-world problems called “hard-exploration problems,” where an agent has to learn complex tasks with very infrequent or deceptive feedback. The state-of-the-art algorithm on Montezuma’s Revenge achieves an average score of 11,347 and a maximum score of 17,500, and it solved the first level in only one of ten tries.

Surprisingly, despite considerable research effort, so far no algorithm has obtained a score greater than 0 on Pitfall. Today we introduce Go-Explore, a new family of algorithms capable of scoring over 2,000,000 on Montezuma’s Revenge, with an average score of over 400,000! Go-Explore reliably solves the entire game, meaning all three unique levels, and then generalizes to the nearly identical subsequent levels (which differ only in the timing of events and the score on the screen).

We have even seen it reach level 159!

Source: uber.com

