The team has also introduced RT-Trajectory, a system that uses video input for robotic learning, often overlaying a 2D sketch of the robot's arm on the video as a visual hint. In testing across 41 tasks, RT-Trajectory reached a 63% success rate, double that of its predecessor, RT-2. The team believes that RT-Trajectory is a step towards building robots that can move efficiently and accurately in novel situations, and that it can unlock knowledge from existing datasets.
Key takeaways:
- Google DeepMind's robotics researchers are exploring the potential of generative AI and large foundation models in robotics, with a focus on helping robots better understand human needs.
- The newly announced AutoRT is designed to manage a fleet of robots, using a visual language model (VLM) for situational awareness and a large language model (LLM) to suggest tasks that the hardware can accomplish.
- AutoRT has been tested over the past seven months. It can orchestrate up to 20 robots at once and has been used with a total of 52 different devices, with DeepMind collecting some 77,000 trials covering more than 6,000 tasks.
- DeepMind has also developed RT-Trajectory, which uses video input for robotic learning, overlaying a two-dimensional sketch of the arm in action on the video, and has been shown to double the success rate of RT-2.