SeamlessM4T is available under a research license, allowing researchers and developers to build upon it. Alongside the model, Meta has also released its multimodal translation dataset, SeamlessAlign. The model was trained using tens of billions of sentences and 4 million hours of speech from publicly available open or licensed sources on the web. It can be tried through a research demo hosted on Meta’s website, where users can test it with input clips of 2 to 15 seconds.
Key takeaways:
- Meta has released a new AI model called SeamlessM4T, a multimodal, multilingual translation and transcription model that can translate text and speech across nearly 100 languages.
- The model is a significant breakthrough in speech-to-speech and speech-to-text translation and transcription, with speech-to-speech translation supporting about 100 input languages and 36 output languages.
- SeamlessM4T is being made available under a research license, and Meta has also released its multimodal translation dataset, SeamlessAlign.
- The model was trained using tens of billions of sentences and 4 million hours of speech from publicly available open or licensed sources on the web.