Gemini Live offers users a choice of 10 voices, created with the help of voice actors. It can handle complex requests, such as finding family-friendly wineries with outdoor areas and playgrounds, but it sometimes provides inaccurate information. Google does not allow Gemini Live to sing or to mimic any voice other than the 10 it provides, likely to avoid copyright issues. The company is also not prioritizing having the AI understand emotional intonation in a user's voice. Gemini Live is a step towards Project Astra, a fully multimodal AI model, and Google plans to add real-time video understanding in the future.
Key takeaways:
- Google launched Gemini Live, a feature that lets users hold a spoken conversation with an AI chatbot. It is Google's response to OpenAI’s Advanced Voice Mode.
- Gemini Live responds to questions in less than two seconds and can pivot quickly when interrupted, providing a more natural, hands-free phone experience.
- The feature offers 10 different voice options, created in collaboration with voice actors, and is capable of handling complex tasks such as finding family-friendly wineries with specific amenities.
- Google plans to further develop Gemini Live as part of Project Astra, aiming to add real-time video understanding in the future.