"The world’s most flexible sound machine" — with text and audio inputs, this new #generativeAI model, named Fugatto, can create any combination of music, voices, and sounds. Read more in our blog by Richard Kerris: https://lnkd.in/gmgvqPkn #NVIDIAResearch Note: Some music, sounds, and the voice of NVIDIA CEO Jensen Huang used in this video are AI generated.
Fugatto is truly a game-changer in generative AI for sound! Transforming text and audio into limitless creative possibilities, it’s not just a tool: it’s a new frontier for music, gaming, and storytelling. Excited to see how this innovation redefines the soundscape of the future.
Interesting
David Ding, this would be an excellent tool for DJs.
Co-Founder & CEO at GPU Audio | Accelerated Audio Computing Using Graphics Cards
NVIDIA, congrats on the release! I did notice a phase issue at this timestamp (https://youtu.be/qj1Sp8He6e4?t=58), between 0:58 and 1:02, and the broken phase is audible throughout the first 90 seconds of the video. There’s a noticeable wobble there that I bet everyone can hear. It happens because generative AI models are ‘unaware’ of what they are actually producing, and it’s the kind of artifact that stands out to producers and regular listeners alike and can be a bit off-putting. Some extra processing and polishing is definitely needed before it goes to production, to make it studio-quality, so to speak. Also, running Demucs (https://github.com/facebookresearch/demucs) isn’t enough if you want to source-separate and create stems for further re-use: many more instruments would need to be supported to make it work in real scenarios, with any song (this is one of the reasons Meta eventually abandoned the model). Just my two cents here, and always happy to help!
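For anyone wondering what a "ruined phase" means in measurable terms: one common quick check is the inter-channel (Pearson) correlation of a stereo signal — near +1 means the channels are in phase, near -1 means heavily out of phase, which collapses badly to mono and causes the audible wobble. This is a minimal sketch using only NumPy on a synthetic tone; the helper name is my own and is not part of Fugatto or Demucs:

```python
import numpy as np

def phase_correlation(left: np.ndarray, right: np.ndarray) -> float:
    """Pearson correlation between stereo channels.

    ~+1.0: channels in phase (mono-compatible)
    ~ 0.0: decorrelated (wide, but diffuse)
    ~-1.0: out of phase (cancels when summed to mono)
    """
    l = left - left.mean()
    r = right - right.mean()
    denom = np.sqrt((l * l).sum() * (r * r).sum())
    return float((l * r).sum() / denom) if denom else 0.0

# Synthetic demo: a 1-second 440 Hz tone at 44.1 kHz.
sr = 44100
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

clean = phase_correlation(tone, tone)      # identical channels -> ~ +1.0
flipped = phase_correlation(tone, -tone)   # polarity-inverted  -> ~ -1.0
print(clean, flipped)
```

In practice you would run this over short sliding windows of the generated audio rather than the whole file, so a phase flip that only affects a few seconds (like the 0:58–1:02 stretch) still shows up as a dip toward -1.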