The development of Eureka builds on Nvidia's previous work with AI agents, including Voyager, an AI agent that can autonomously play Minecraft. According to the accompanying research paper, Eureka uses the zero-shot generation, code-writing, and in-context improvement capabilities of GPT-4 to perform evolutionary optimization over reward code. The resulting rewards can then be used to train complex skills through reinforcement learning, and they outperform expert human-engineered rewards on 83% of the tasks in a suite of 29 open-source RL environments.
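To make "evolutionary optimization over reward code" more concrete, here is a minimal toy sketch of the general idea: propose several candidates, score each one, keep the best, and refine it in the next round. It is not Eureka's implementation. The names `Candidate`, `TARGET`, `fitness`, `propose`, and `search` are all hypothetical, a pair of numeric weights stands in for a reward program, random perturbation stands in for the GPT-4 call, and a closed-form score stands in for reinforcement learning training.

```python
import random
from typing import List, Tuple

# Hypothetical stand-in: two reward-term weights play the role of "reward code".
Candidate = Tuple[float, ...]

# Hypothetical "ideal" weights, used only by the toy fitness function below.
TARGET: Candidate = (1.0, -0.5)


def fitness(candidate: Candidate) -> float:
    """Toy stand-in for 'train a policy with this reward, then score the task'."""
    return -sum((c - t) ** 2 for c, t in zip(candidate, TARGET))


def propose(parent: Candidate, k: int, rng: random.Random) -> List[Candidate]:
    """Toy stand-in for asking a language model for k revised reward programs."""
    return [tuple(w + rng.gauss(0.0, 0.3) for w in parent) for _ in range(k)]


def search(iterations: int = 5, k: int = 16, seed: int = 0):
    """Keep the best candidate seen so far and refine it each iteration."""
    rng = random.Random(seed)
    best: Candidate = (0.0, 0.0)
    best_score = fitness(best)
    history = [best_score]
    for _ in range(iterations):
        candidates = propose(best, k, rng)   # sample k variations of the current best
        top = max(candidates, key=fitness)   # evaluate each and pick the strongest
        top_score = fitness(top)
        if top_score > best_score:           # replace the best only on improvement
            best, best_score = top, top_score
        history.append(best_score)
    return best, best_score, history


if __name__ == "__main__":
    best, score, history = search()
    print("best candidate:", best)
    print("best score per iteration:", history)
```

Because the best candidate is replaced only when a new one scores higher, the recorded score never decreases from one iteration to the next. In a real system, the expensive step would be the evaluation, since each candidate reward requires training a policy in simulation.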
Key takeaways:
- Nvidia Research has developed a new AI agent, Eureka, powered by OpenAI’s GPT-4, which can autonomously teach robots complex skills such as pen-spinning tricks, opening drawers and cabinets, and manipulating scissors.
- Eureka autonomously writes reward algorithms and represents a step towards developing new algorithms that integrate generative and reinforcement learning methods to solve complex tasks.
- The Eureka library of AI algorithms has been published for people to experiment with using Nvidia Isaac Gym, a physics simulation reference application for reinforcement learning research.
- According to the research paper, Eureka generates reward functions that outperform expert human-engineered rewards, with an average normalized improvement of 52% across a diverse suite of 29 open-source RL environments.