Learning Pokémon With Reinforcement Learning

Since 2020, a team has been developing a reinforcement learning (RL) agent to beat the 1996 game Pokémon Red. As of February 2025, the agent can complete the game using a policy with fewer than 10 million parameters, several orders of magnitude smaller than large language models such as DeepSeekV3. The goal is to become the in-game "champion," and the project relies on the work of the Pokémon Reverse Engineering Team (PRET) and on PyBoy, a Game Boy emulator written in Python, to inspect and drive the game. The team chose RL over other machine learning approaches because the agent generates fresh training data by playing the game, with no need for extensive datasets or pretraining.
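To make the "fresh training data" point concrete, here is a minimal sketch of an RL loop that learns purely from its own interaction with an environment. It is not the team's code and does not use PyBoy: the `CorridorEnv` class, the `train` function, and all hyperparameters are hypothetical stand-ins, and tabular Q-learning is used only because it fits in a few lines.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk right along a corridor to reach the goal."""

    def __init__(self, length=6):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action 0 moves left, action 1 moves right; walls clamp the position.
        delta = 1 if action == 1 else -1
        self.pos = min(max(self.pos + delta, 0), self.length - 1)
        done = self.pos == self.length - 1
        reward = 1.0 if done else 0.0
        return self.pos, reward, done


def train(episodes=300, max_steps=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: every transition used for learning is generated on the fly."""
    rng = random.Random(seed)
    env = CorridorEnv()
    q = [[0.0, 0.0] for _ in range(env.length)]
    transitions_seen = 0
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Explore with probability epsilon, or when the two actions are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            transitions_seen += 1
            # One-step Q-learning update from the transition just experienced.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q, transitions_seen


q_table, n = train()
# Greedy action for each non-terminal state (1 means "move right").
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(greedy_policy, n)
```

No dataset is loaded anywhere: the only training signal is the stream of `(state, action, reward, next_state)` transitions the agent produces by acting. In the real project the environment would be the emulated game rather than a six-cell corridor, and the policy a neural network rather than a table.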

The project aims to demonstrate the potential of JRPGs as benchmarks for AI development, given their complexity, nonlinearity, and need for multi-task reasoning. The team has open-sourced its code so that others can explore and contribute to the project. They credit Mads Ynddal for creating PyBoy and thank other collaborators for their support. The project is ongoing, with updates and improvements documented in a changelog.

Key takeaways:

As of February 2025, the team successfully developed a reinforcement learning agent capable of beating Pokémon Red with a policy under 10 million parameters.
Pokémon Red is used as a benchmark for AI development due to its complexity, requiring multi-task reasoning and having non-obvious reward functions.
Reinforcement learning was chosen over other machine learning approaches due to its ability to generate fresh training data without the need for large, labeled datasets.
The project uses the work of the Pokémon Reverse Engineering Team (PRET) and the PyBoy Game Boy emulator for game introspection and data extraction.
