The Advanced Voice Mode differs from the existing Voice Mode in that it uses GPT-4o, a multimodal model that can process speech tasks without auxiliary models, resulting in lower-latency conversations. It can also sense emotional intonation in the user's voice. The new feature will be rolled out gradually so OpenAI can monitor its usage, and it will be limited to four preset voices made in collaboration with paid voice actors. OpenAI has also introduced new filters that block requests to generate music or other copyrighted audio, to avoid copyright infringement issues.
Key takeaways:
- OpenAI has started rolling out the Advanced Voice Mode of ChatGPT, featuring GPT-4o’s hyper-realistic audio responses, to a small group of ChatGPT Plus users.
- The Advanced Voice Mode is multimodal, capable of processing tasks without auxiliary models, and can sense emotional intonation in a user's voice.
- The Advanced Voice Mode will be limited to four preset voices (Juniper, Breeze, Cove, and Ember) and will block outputs that deviate from these preset voices to avoid deepfake controversies.
- OpenAI has introduced new filters that block certain requests to generate music or other copyrighted audio, in an effort to avoid legal issues related to copyright infringement.