Podcasting platform Podcastle launches a text-to-speech model with more than 450 AI voices

Podcastle has launched its own AI-powered text-to-speech model, Asyncflow v1.0, joining companies such as ElevenLabs, Speechify, and WellSaid in the AI voice technology space. The new model offers more than 450 AI voices and is designed to keep both training and inference costs low. An API will be available so developers can integrate the model into their apps. Podcastle's founder, Arto Yeritsyan, said that advancements in large language models enabled the company to develop a high-quality voice model without extensive data. The company, which raised $13.5 million in a Series A funding round last year, charges $40 for 500 minutes of text-to-speech conversion, compared with ElevenLabs' $99.

Podcastle is also improving its voice cloning feature, cutting the training process from reading 70 sentences aloud to recording just a few seconds of speech. The new process uses Podcastle’s Magic Dust AI to improve audio quality, though the resulting voice can sound slightly robotic, and the company plans to refine the feature over time. Podcastle aims to differentiate itself by offering a comprehensive suite of tools for audio, video, podcasts, and AI-powered narration on a single platform. Most users focus on audio content, but video usage is increasing.

Key takeaways

Podcastle has released its own AI model, Asyncflow v1.0, for text-to-speech conversion, offering over 450 AI voices.
The company developed its model with low training and inference costs, providing a competitive advantage.
Podcastle's voice cloning feature now requires only a few seconds of recording to create a voice clone.
Podcastle aims to differentiate itself by offering tools for audio, video, podcasts, and AI-powered narration on one platform.

Podcasting platform Podcastle launches a text-to-speech model with more than 450 AI voices | TechCrunch
