Meta open sources framework for generating sounds and music

Meta has announced AudioCraft, a framework for generating high-quality, realistic audio and music from short text prompts. The framework includes three generative AI models: MusicGen, AudioGen, and EnCodec. MusicGen, which Meta open-sourced in June, learns from existing music to produce similar effects, which raises potential ethical and legal issues. AudioGen focuses on generating environmental sounds and sound effects, while EnCodec improves on a previous Meta model, generating music with fewer artifacts.

However, AudioCraft has raised concerns about potential misuse, especially the creation of deepfake voices and music. Meta has stated that the pretrained version of MusicGen was trained on Meta-owned and specifically licensed music and that vocals were removed from the training data, but the company has not expressly prohibited commercial applications. Despite these potential drawbacks and legal issues, Meta plans to keep improving the performance and controllability of generative audio models and to mitigate their limitations and biases.

Key takeaways:

Meta has announced AudioCraft, a framework for generating high-quality, realistic audio and music from short text descriptions, building on its earlier AI-powered music generator, MusicGen.
AudioCraft contains three generative AI models: MusicGen, AudioGen, and EnCodec. AudioGen focuses on generating environmental sounds and sound effects, while EnCodec improves music generation with fewer artifacts.
While Meta emphasizes the potential benefits of AudioCraft, such as providing inspiration for musicians and helping people iterate on their compositions, there are concerns about misuse, including deepfaking a person’s voice and potential copyright violations.
Despite these concerns, Meta plans to continue improving the performance of generative audio models and mitigating their limitations and biases, while also being open about their development to help users understand their capabilities and limitations.

Meta open sources framework for generating sounds and music | TechCrunch
