Driving Video: A New Approach to Portrait Animation
What Is "LivePortrait"?
A new research paper from a team in China introduces LivePortrait, an experimental AI tool you can try for free on Hugging Face. This innovative technology creates realistic animations from a single still image.
I've written before about tools that animate still photos with ease. These technologies are fascinating and innovative, but they raise complex ethical questions. We'll dive into the research, and then wade through the murky ethical waters.
How Does It Work?
LivePortrait uses a two-stage process to create its animations: first, a base model learns to extract and combine appearance and motion information; second, stitching and retargeting modules are trained on top of it to smooth transitions between expressions and give fine control over features like the eyes and lips.
What Is a "Driving Video?"
Unlike the videos I insist on capturing with my Tesla's dashcam, a "driving video" means something different in the context of LivePortrait. Instead of showing ladders in the middle of Highway 101 or spectacularly bad driving, a driving video here is a key part of LivePortrait's animation process: it provides the motion data that drives the animation of the still image. Essentially, the driving video shows a person or animal making various facial expressions and head movements.
LivePortrait uses this video to map these movements onto the still image, making it appear as though the image is performing the same actions. This technique helps the animated portrait look natural and lifelike. This is one of the sample driving videos provided by LivePortrait.
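To make the mapping idea concrete, here is a minimal conceptual sketch of that loop: read each frame of the driving video, estimate what the face in that frame is doing, and apply that motion to the still source image. This is not LivePortrait's actual code; the two motion functions and the filenames are placeholders standing in for the model's learned components.

```python
# Conceptual sketch of driving-video animation (NOT LivePortrait's actual code).
# Frame I/O uses OpenCV; the two motion functions and filenames are placeholders.
import cv2
import numpy as np

def estimate_motion(driving_frame: np.ndarray) -> np.ndarray:
    """Placeholder: a real model would return learned motion features,
    e.g. a small set of implicit 3D keypoints, for this frame."""
    return np.zeros((21, 3), dtype=np.float32)

def apply_motion_to_source(source_image: np.ndarray, motion: np.ndarray) -> np.ndarray:
    """Placeholder: a real model would warp and render the source portrait
    so that it matches the motion. Identity keeps the sketch runnable."""
    return source_image

source = cv2.imread("source_portrait.jpg")            # the still "source portrait"
driving = cv2.VideoCapture("driving_video.mp4")       # the "driving video"

fps = driving.get(cv2.CAP_PROP_FPS) or 25
h, w = source.shape[:2]
writer = cv2.VideoWriter("animated.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while True:
    ok, frame = driving.read()
    if not ok:
        break                                # end of the driving video
    motion = estimate_motion(frame)          # what is the face doing in this frame?
    writer.write(apply_motion_to_source(source, motion))  # make the portrait do it too

driving.release()
writer.release()
```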
If you create a free Hugging Face account, you can try this tool out for yourself. Hugging Face is an open source data science and machine learning platform. Recognizable by an adorable little smiley face emoji with hands, the platform acts as a collaboration space for new models, datasets and applications.
Testing Out LivePortrait
Step #1: Choose a "Source Portrait." I chose a favorite image of my son, taken about a decade ago.
Step #2: Add the driving video. I chose one of the sample videos whose expression matched the serious expression of Dr. Jones.
Step #3: Click the "Animate" button and wait for the magic to happen. It rendered very quickly on a Mac with an "M"-series chip. (On a high-end GPU, it can render a frame in as little as 12.8 milliseconds.)
The combination of the driving video and the still portrait = animated young Dr. Jones.
This is a playable video showing the driving video in action. (The rest of the shots below are screencaps for those who cannot access video.)
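If you would rather run LivePortrait locally than through the Hugging Face demo, the researchers have open-sourced the code. The sketch below shows roughly what local inference looks like; the -s (source image) and -d (driving video) flags are my reading of the project's README at the time of writing, so treat them as assumptions and check the repository before running.

```python
# Hedged sketch: running LivePortrait locally after cloning its open-source repo
# and installing its dependencies. Run this from inside the cloned repository.
import subprocess

subprocess.run(
    [
        "python", "inference.py",
        "-s", "my_source_portrait.jpg",  # hypothetical path to your still image
        "-d", "my_driving_video.mp4",    # hypothetical path to your driving video
    ],
    check=True,  # raise an error if the script exits with a non-zero status
)
```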
Other Key Features:
Versatility: It works on various styles of images, from realistic photos to oil paintings and even 3D renderings.
Fine Control: Users can adjust specific aspects of the animation, like how wide the eyes open or how much the lips move (a small illustrative sketch follows this list).
Generalization: With some additional training, the system can even animate animals like cats, dogs, and pandas.
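Here is that illustrative sketch: a few invented parameters suggesting the kind of per-feature knobs the retargeting modules expose. None of these names come from LivePortrait's actual API; they are placeholders meant to show the shape of "fine control."

```python
# Purely illustrative: invented names suggesting the kind of per-feature control
# LivePortrait's retargeting modules provide. This is NOT its real API.
from dataclasses import dataclass

@dataclass
class RetargetingControls:
    eye_openness: float = 1.0   # 1.0 = copy the driving video's eye motion exactly
    lip_motion: float = 1.0     # scales how strongly mouth movement is transferred
    head_movement: float = 1.0  # scales head rotation taken from the driving video

# Example: tone the eyes down a little and exaggerate the lips slightly.
controls = RetargetingControls(eye_openness=0.7, lip_motion=1.2)
print(controls)
```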
Technical Innovations
Under the hood, LivePortrait represents facial motion with compact implicit keypoints, trains on a mix of still images and videos, and layers stitching and retargeting modules on top of its base model. Together, these choices keep the system fast while still giving users fine-grained control.
Potential Applications
- Creating virtual avatars for video calls or online presentations
- Bringing historical figures to life in educational content
- Enhancing storytelling in digital media and entertainment
- Generating animated content for social media and marketing
OK, now let's get into those murky ethical areas.
Ethical Considerations
"I see dead people." - Cole. The Sixth Sense.
Using LivePortrait, you can take any image and combine it with a driving video.
Animating the Deceased
The ability to bring historical figures or recently deceased individuals "back to life" is a double-edged sword. If you have a photo of someone who has died, you can combine it with a driving video and move their face around. This is a huge and thriving business in China, and I fully respect that other cultures have different views on how they regard their dearly departed. For myself, I'd be uncomfortable animating a photo of the brother I lost to cancer. The memories I have of him are very special to me, and I don't wish to "reanimate" him in AI form.
Potential Benefits: Preserving memories, educational uses (as with the historical figures mentioned above), and comfort for families who find this kind of remembrance healing.
Ethical Concerns: The deceased cannot consent, surviving family members may find the results distressing, and an animated likeness can be made to express things the person never did.
Deepfakes and Political Manipulation
In an election year, the implications of such technology become even more concerning:
Potential for Misinformation: An animated likeness of a candidate or public figure could be made to appear to say or do things they never did, and such clips spread quickly on social media.
To be fair, the public these days should be continually reminded that "seeing is not believing." Part of why I like the slightly dystopian style of much AI video is that I don't have to wonder what is real and what is not. Flying goats are clearly not real, but they sure are fun.
Consent and Ownership
The issue of consent becomes complicated when dealing with historical figures or the deceased: they cannot agree to, or refuse, the use of their likeness, so that decision falls to estates, platforms, or sometimes no one at all.
Potential Safeguards and Regulations
To address these ethical concerns, several measures could be considered:
Legal Frameworks: Developing laws that govern the use of AI-generated likenesses, especially of deceased individuals
Ethical Guidelines: Creating industry standards for the responsible use of portrait animation technology
Digital Watermarking: Implementing mandatory markers to identify AI-generated content (a toy example follows this list)
Education: Promoting digital literacy to help the public critically evaluate AI-generated media
Consent Mechanisms: Establishing clear processes for obtaining permission from individuals (or their estates) for the use of their likeness
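Here is that toy example of digital watermarking: it hides a short tag such as "AI-GEN" in the least significant bit of an image's pixels and reads it back. Real provenance and watermarking systems are far more robust than this; the filenames below are placeholders.

```python
# Toy least-significant-bit (LSB) watermark: hides a short text tag in the lowest
# bit of an image's pixels and reads it back. For illustration only; the filenames
# are placeholders, and a lossless format (e.g. PNG) is required to preserve bits.
import numpy as np
from PIL import Image

def embed_tag(image_path: str, tag: str, out_path: str) -> None:
    pixels = np.array(Image.open(image_path).convert("RGB"))
    bits = np.unpackbits(np.frombuffer(tag.encode("utf-8"), dtype=np.uint8))
    flat = pixels.reshape(-1)
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits  # overwrite the lowest bit
    Image.fromarray(flat.reshape(pixels.shape)).save(out_path)

def read_tag(image_path: str, length: int) -> str:
    flat = np.array(Image.open(image_path).convert("RGB")).reshape(-1)
    bits = flat[: length * 8] & 1
    return np.packbits(bits).tobytes().decode("utf-8")

embed_tag("animated_frame.png", "AI-GEN", "animated_frame_marked.png")
print(read_tag("animated_frame_marked.png", len("AI-GEN")))  # -> "AI-GEN"
```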
Final Thoughts
As we continue to develop and use technologies like LivePortrait, it's crucial to engage in ongoing ethical discussions and establish frameworks that protect individuals' rights while allowing for innovation and creative expression.
I am a retired educator who enjoys writing about AI.
Learn something new every day with #DeepLearningDaily.
Key Terms From This Article:
Driving Video
A video used as a reference for motion and expressions, which the AI uses to animate the still image. It provides the movement patterns that are then applied to the source image.
Base Model
The foundational AI model in LivePortrait that learns to extract and combine appearance and motion information.
Stitching Module
A component in LivePortrait that enhances the quality of the animation by improving the transition between different facial expressions.
Retargeting Modules
Features in LivePortrait that allow fine-tuned control over specific facial elements like eyes and lips in the animated output.
Deepfake
Synthetic media where a person's likeness is replaced with someone else's in existing images or videos, often using AI techniques.
GPU (Graphics Processing Unit)
A specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images, often used in AI processing for faster computations.
Mixed Training Strategy
An approach in AI where the model is trained on both still images and videos to improve its ability to handle various types of input.
Implicit Keypoints
A technique used in LivePortrait to represent facial movements efficiently with minimal data.
Fine-tuning
The process of making small adjustments to a pre-trained AI model to improve its performance on a specific task or dataset.
Virtual Avatar
A digital representation of a person, often used in video calls, online presentations, or virtual environments.
Digital Right of Publicity
A proposed concept extending an individual's right to control the commercial use of their name, image, or likeness to the digital realm, potentially even after death.
Digital Watermarking
The process of embedding information into digital media, which could be used to identify AI-generated content.
FAQs
Additional Resources for Inquisitive Minds:
#AITechnology, #PortraitAnimation, #LivePortrait, #DeepfakeEthics, #AIEthics, #DeepLearning