The company has integrated its latest image generation model, DALL-E 3, into ChatGPT, enhancing its visual content creation capabilities. OpenAI's voice technology can create lifelike synthetic voices from a few seconds of speech, but to mitigate risks such as impersonation and fraud, the company is restricting it to one specific use: ChatGPT's voice chat. The company also said its vision-based models underwent rigorous testing to ensure responsible usage.
Key takeaways:
- OpenAI is adding voice and image capabilities to its AI chatbot, ChatGPT, allowing it to see, hear, and speak, and enabling users to converse with it using images and speech.
- The voice feature works much like Amazon's Alexa, and users can choose from five different voices. The image feature lets users show ChatGPT photos to get information or advice.
- The new features will be rolled out to Plus and Enterprise customers over the next two weeks, with plans for expansion to all users at a later date. The image feature will be accessible on all platforms, while voice functionality will be limited to iOS and Android.
- OpenAI's latest voice technology can create lifelike synthetic voices from just a few seconds of speech, and its vision-based models have undergone rigorous testing for responsible usage.