The company is taking steps to ensure responsible use of the technology, including watermarking synthetic speech files and launching a proactive monitoring initiative. If made commercially available, Voice Engine could compete with existing synthetic speech services, such as Eleven Labs. However, to offer more customization options, OpenAI may need to develop a more advanced version of Voice Engine, which could raise the model's price. In 2022, OpenAI released the code for a second AI system, Whisper, which can transcribe and translate speech.
Key takeaways:
- OpenAI has developed Voice Engine, an AI model that can generate synthetic speech based on user-provided audio samples, and uses it to power ChatGPT features.
- Voice Engine can analyze a sample of a user’s voice and generate synthetic speech that closely resembles it, requiring only 15 seconds of audio to imitate the speaker.
- OpenAI has not yet made Voice Engine publicly available, but in late 2023 it opened access to the model for a limited number of partners, who have applied it to tasks such as generating voiceovers for educational content and translating videos.
- If OpenAI decides to make Voice Engine commercially available, it could create more competition for the existing synthetic speech services on the market, such as Eleven Labs Inc.