Meta releases SeamlessM4T AI model for 100-language text and speech translation

Facebook's parent company, Meta, has launched SeamlessM4T, a multimodal multilingual translation and transcription model that offers text and speech translation for nearly 100 languages. The model, which Meta claims is the first of its kind, supports speech-to-speech translation for about 100 input languages and 36 output languages, and text-to-speech translation for almost 100 input languages and 35 output languages. The company has labeled the model a significant breakthrough in speech-to-speech and speech-to-text translation and transcription.

SeamlessM4T is available under a research license, allowing researchers and developers to build upon it. Alongside the model, Meta has also released its multimodal translation dataset, SeamlessAlign. The model was trained using tens of billions of sentences and 4 million hours of speech from publicly available open or licensed sources on the web. It can be tried through a research demo hosted on Meta’s website, where users can test it with input clips 2 to 15 seconds long.

Key takeaways

Meta has released a new AI model called SeamlessM4T, a multimodal multilingual translation and transcription model that can translate text and speech in nearly 100 languages.
Meta describes the model as a significant breakthrough in speech-to-speech and speech-to-text translation and transcription; its speech-to-speech translation supports about 100 input languages and 36 output languages.
SeamlessM4T is being made available under a research license, and Meta has also released its multimodal translation dataset, SeamlessAlign.
The model was trained using tens of billions of sentences and 4 million hours of speech from publicly available open or licensed sources on the web.
