The Advanced Voice Mode differs from the existing Voice Mode in that it uses GPT-4o, a multimodal model that can process speech tasks without auxiliary models, resulting in lower-latency conversations. It can also sense emotional intonation in the user's voice. The new feature will be rolled out gradually so OpenAI can monitor its usage, and it will be limited to four preset voices made in collaboration with paid voice actors. OpenAI has also introduced new filters that block requests to generate music or other copyrighted audio, to avoid copyright infringement issues.
Key takeaways:
- OpenAI has started rolling out the Advanced Voice Mode of ChatGPT, featuring GPT-4o’s hyper-realistic audio responses, to a small group of ChatGPT Plus users.
- The Advanced Voice Mode is multimodal, capable of processing tasks without auxiliary models, and can sense emotional intonation in a user's voice.
- The Advanced Voice Mode will be limited to four preset voices (Juniper, Breeze, Cove, and Ember) and will block outputs that deviate from these preset voices to avoid deepfake controversies.
- OpenAI has introduced new filters that block certain requests to generate music or other copyrighted audio, in an effort to avoid legal issues related to copyright infringement.