Microsoft Research Asia: VASA-1 - Lifelike Audio-Driven Talking Faces
David Cronshaw
Sr. Product Manager @Disney Streaming | Co-Founder Chatmosa chatmosa.bsky.social | AI, Generative AI | Revenue Generation | Former Microsoft and T-Mobile | Co-Founder UltimateTV.com - Zap2it.com
Microsoft Research Asia has released a paper on VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.
VASA can generate a wide spectrum of expressive facial nuances and natural head motions, all from a single photo and a one-minute audio clip.
Microsoft Research Asia Researchers:
Using a single photo and a one-minute audio clip, they can produce highly realistic #video with closely synced lip movements and nuanced expressions. #vasa
The demo takes a number of AI-generated faces and adds MP3 audio and lifelike movements. It is amazing how good these look.
VASA-1 Paper Summary
Introduction
Methodology
Experiments
Results
Conclusion
Social Impact and Responsible AI
User Reactions
Reactions on X were mixed:
1) the violent head jerks are unrealistic
2) the teeth change size as the avatar is speaking
#ai #aiavatars #microsoft #vasa