Navigating the Challenges and Opportunities of Synthetic Voices

OpenAI has developed a model called Voice Engine that uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker. The technology, first developed in late 2022, powers the preset voices in the text-to-speech API as well as ChatGPT Voice and Read Aloud. OpenAI is nonetheless cautious about a broader release because synthetic voices can be misused. The company is testing the technology with a small group of trusted partners and exploring its potential uses, such as providing reading assistance, translating content, reaching global communities, supporting non-verbal individuals, and helping patients recover their voice.

OpenAI is committed to building Voice Engine safely, with partners agreeing to usage policies that prohibit impersonating another person without consent or legal right. The company has implemented safety measures, including watermarking to trace the origin of generated audio and proactive monitoring of how the technology is used. OpenAI believes any broad deployment of synthetic voice technology should include voice authentication experiences that verify the original speaker is knowingly adding their voice to the service, and a no-go voice list that detects and prevents the creation of voices too similar to prominent figures. The company is not planning a wide release of the technology at this time, but is encouraging steps like phasing out voice-based authentication, exploring policies to protect individuals' voices in AI, educating the public about AI capabilities and limitations, and accelerating the development of techniques for tracking the origin of audiovisual content.

Key takeaways

OpenAI has developed a model called Voice Engine that uses text input and a 15-second audio sample to generate natural-sounding speech resembling the original speaker. It has been used in applications such as reading assistance, content translation, reaching global communities, supporting non-verbal individuals, and helping patients recover their voice.
The company is cautious about a broader release due to the potential for synthetic voice misuse and is engaging in dialogue about responsible deployment of synthetic voices.
Partners testing Voice Engine have agreed to usage policies prohibiting impersonation without consent or legal right, and OpenAI has implemented safety measures including watermarking and proactive monitoring of usage.
OpenAI is not planning a wide release of this technology at this time, but is encouraging steps such as phasing out voice-based authentication, exploring policies to protect individuals' voices in AI, educating the public about AI capabilities and limitations, and accelerating the development of techniques for tracking the origin of audiovisual content.
