The company says Fugatto differs from other audio-generation models because it can absorb and modify existing sounds and create original soundscapes by overlaying two distinct audio effects. Bryan Catanzaro, Nvidia's VP of Applied Deep Learning Research, believes generative AI could affect music production in the same way that electronic synthesizers did. However, Nvidia has no immediate plans to release the model because of potential risks and copyright issues.
Key takeaways:
- Nvidia has developed a new generative AI model called Fugatto, designed to create new music and audio from human language prompts.
- Fugatto can modify human voices and create novel sounds, transform musical segments into different instruments or voices, and alter the accent and mood of a human voice recording.
- The model can create original soundscapes by overlaying two distinct audio effects, a capability that Nvidia says sets it apart from other audio-generation models.
- Nvidia has not released the model publicly because of safety concerns and potential copyright issues, and it is still considering how to do so safely.