The New ChatGPT Can ‘See’ and ‘Talk.’ Here’s What It’s Like.

OpenAI has announced two new features for its popular chatbot, ChatGPT, allowing it to "see, hear and speak." The first feature enables ChatGPT to analyze and respond to images, while the second allows users to converse with the chatbot and receive responses in a synthetic AI voice. These updates are part of an industry trend toward multimodal AI systems that can handle text, photos, and videos. The features will initially be available to paying ChatGPT Plus and Enterprise customers, and will later be made more widely accessible.

The image-recognition feature can analyze text within images and identify objects, although it is designed to avoid answering questions about human faces. The voice feature, which uses OpenAI's speech-recognition system Whisper, allows for more natural and fluid conversations compared to older AI voice assistants. Despite some technical issues, the voice feature provides a more intimate user experience, potentially changing the way people interact with AI chatbots.

Key takeaways:

OpenAI has announced new features for its AI chatbot, ChatGPT, allowing it to analyze and respond to images, and to interact with users through a synthetic AI voice.
The image-recognition feature can analyze objects and text within images, but to avoid potential misuse and bias, it is designed not to answer questions about human faces.
The voice feature uses OpenAI’s speech-recognition system, Whisper, and a new text-to-speech algorithm to provide fluid and natural-sounding responses.
These features are currently available to paying ChatGPT Plus and Enterprise customers, with wider availability planned for the future.
