JARVIS-1: Open-Ended Multi-task Agents with Memory-Augmented Multimodal Language Models

Nov 14, 2023 - news.bensbites.co
The article introduces JARVIS-1, an open-ended agent developed for the Minecraft universe that can perceive multimodal input, generate plans, and perform embodied control. The agent is built on pre-trained multimodal language models that map visual observations and textual instructions to plans. It is also equipped with a multimodal memory, so planning draws on both pre-trained knowledge and the agent's actual in-game survival experience. JARVIS-1 has shown near-perfect performance across more than 200 varied tasks in Minecraft and has achieved a completion rate of 12.5% on the long-horizon diamond pickaxe task, a significant increase over previous records.

JARVIS-1 can self-improve under a life-long learning paradigm thanks to its growing multimodal memory, a feature the article credits with enabling more general intelligence and improved autonomy. The article demonstrates the performance of JARVIS-1 at different learning stages on the same task, and shows that it can execute human instructions in diverse environments. It concludes with additional results for JARVIS-1 in Minecraft and an introduction to some related projects.

Key takeaways:

  • JARVIS-1 is an open-ended agent that can perceive multimodal input, generate sophisticated plans, and perform embodied control in the Minecraft universe. It is built on top of pre-trained multimodal language models and is equipped with a multimodal memory.
  • The agent can self-improve following a life-long learning paradigm, demonstrating improved performance over time. For example, it learned to mine an extra log for fuel by the third epoch of a task.
  • JARVIS-1 has shown near-perfect performance across more than 200 varied tasks in Minecraft, ranging from entry to intermediate level. It achieved a completion rate of 12.5% on the long-horizon diamond pickaxe task, a significant increase over previous records.
  • The agent can execute human instructions in diverse environments, demonstrating its ability to adapt to different biomes in the Minecraft universe.
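
The idea of memory-augmented planning described above can be illustrated with a toy sketch. This is not JARVIS-1's implementation: the real agent uses multimodal language models and a multimodal memory, whereas the sketch below is text-only and uses simple word overlap for retrieval. The names `PlanMemory`, `MemoryEntry`, and `plan_with_memory`, as well as the example task and plan strings, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    task: str        # textual task description (hypothetical format)
    plan: list       # sequence of sub-goals that was executed
    success: bool    # whether the plan completed the task


@dataclass
class PlanMemory:
    """Toy text-only memory; an assumption, not the agent's real memory."""
    entries: list = field(default_factory=list)

    def add(self, task, plan, success):
        # Store a copy so later edits to the caller's list do not leak in.
        self.entries.append(MemoryEntry(task, list(plan), success))

    def retrieve(self, task, k=1):
        """Return up to k successful entries ranked by word overlap."""
        query = set(task.lower().split())

        def score(entry):
            words = set(entry.task.lower().split())
            union = query | words
            return len(query & words) / len(union) if union else 0.0

        # Keep only successful experiences that share at least one word.
        scored = [(score(e), e) for e in self.entries if e.success]
        scored = [(s, e) for s, e in scored if s > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [e for _, e in scored[:k]]


def plan_with_memory(task, memory, default_plan):
    """Reuse the closest successful past plan, else fall back to a default."""
    hits = memory.retrieve(task, k=1)
    return list(hits[0].plan) if hits else list(default_plan)


memory = PlanMemory()
first = plan_with_memory("obtain charcoal", memory, ["mine log", "smelt log"])
# After a successful episode, the refined plan is written back to memory.
memory.add("obtain charcoal",
           ["mine log", "mine extra log for fuel", "smelt log"], True)
later = plan_with_memory("obtain charcoal", memory, ["mine log", "smelt log"])
```

With an empty memory the sketch falls back to the default plan. Once a successful episode has been stored, the same task retrieves the refined plan, which loosely mirrors the takeaway about the agent learning to mine an extra log for fuel.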