Over seven months, Google deployed 53 AutoRT robots in four office buildings and conducted over 77,000 trials. The robots, each equipped with a camera, a robot arm, and a mobile base, used a visual language model (VLM) to understand their environment and a large language model (LLM) to suggest tasks. Google also introduced SARA-RT, a neural network architecture designed to improve the accuracy and speed of the existing Robotic Transformer RT-2, and RT-Trajectory, which adds 2D outlines to help robots perform specific physical tasks.
Key takeaways:
- The DeepMind robotics team has revealed three new advances to help robots make faster, better, and safer decisions, including a system for gathering training data with a “Robot Constitution”.
- Google’s data gathering system, AutoRT, uses a visual language model (VLM) to understand its environment and a large language model (LLM) to decide on appropriate tasks, while avoiding tasks that involve humans, animals, sharp objects, and electrical appliances.
- DeepMind programmed the robots to stop automatically if the force on their joints exceeds a certain threshold and included a physical kill switch for human operators. Over seven months, Google deployed 53 AutoRT robots in four office buildings and conducted over 77,000 trials.
- DeepMind's other new tech includes SARA-RT, a neural network architecture designed to make the existing Robotic Transformer RT-2 more accurate and faster, and RT-Trajectory, which adds 2D outlines to help robots better perform specific physical tasks.
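
The safety behavior described in the takeaways (rejecting tasks that involve humans, animals, sharp objects, or electrical appliances, and stopping when joint force passes a threshold or the kill switch is pressed) can be illustrated with a minimal Python sketch. This is not DeepMind's implementation: the keyword list, the threshold value, and the function names `is_task_allowed`, `filter_tasks`, and `should_stop` are all assumptions made for illustration, and the real system presumably reasons over language rather than matching keywords.

```python
from typing import Iterable, List

# Hypothetical keywords standing in for the forbidden categories named in the
# article (humans, animals, sharp objects, electrical appliances).
FORBIDDEN_KEYWORDS = (
    "human", "person", "animal", "knife", "scissors", "outlet", "appliance",
)

# Hypothetical value; the article only says "a certain threshold".
MAX_JOINT_FORCE_N = 50.0


def is_task_allowed(task: str) -> bool:
    """Return False if the task text mentions any forbidden keyword."""
    lowered = task.lower()
    return not any(keyword in lowered for keyword in FORBIDDEN_KEYWORDS)


def filter_tasks(proposed: Iterable[str]) -> List[str]:
    """Keep only the proposed tasks that pass the keyword check."""
    return [task for task in proposed if is_task_allowed(task)]


def should_stop(
    joint_forces: Iterable[float],
    kill_switch_pressed: bool = False,
    limit: float = MAX_JOINT_FORCE_N,
) -> bool:
    """Stop if the kill switch is pressed or any joint force exceeds the limit."""
    return kill_switch_pressed or any(abs(f) > limit for f in joint_forces)


if __name__ == "__main__":
    tasks = [
        "pick up the sponge",
        "hand the knife to the person",
        "wipe the table",
    ]
    print(filter_tasks(tasks))
    print(should_stop([10.0, 60.0]))
```

Keeping the task filter separate from the force check mirrors the layering in the article: one safeguard acts before a task is attempted, and the other acts while the robot is moving.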