Driving Video: A New Approach to Portrait Animation
This screencap from the LivePortrait research paper shows still images brought to life.


What Is "LivePortrait"?

A new paper from researchers in China introduces LivePortrait, an experimental AI tool available for free on Hugging Face. This innovative technology creates realistic animations from a single still image.

I've written before about tools that animate still photos with ease. These technologies are fascinating and innovative, but they bring up complex ethical questions. We'll dive into the research before swimming through the murky ethical waters.

How Does It Work?

LivePortrait uses a two-stage process to create its animations:

  1. Base Model Training: The system learns to extract appearance information from the source image and motion information from a driving video. It then combines these elements to create a realistic animation.
  2. Stitching and Retargeting: After the base model is trained, LivePortrait adds extra control features. The "stitching module" helps improve the quality of the animation. The "retargeting modules" allow for fine-tuned control over specific facial features like eyes and lips.
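
As a rough sketch of how these two stages fit together, here is some illustrative Python. Every function below is a stand-in for a trained network; none of these names come from LivePortrait's actual code:

```python
import numpy as np

# Illustrative sketch of the two-stage design; all names are hypothetical,
# not LivePortrait's actual API.

def extract_appearance(source_image: np.ndarray) -> np.ndarray:
    """Stage 1a: encode identity/texture features from the still image."""
    return source_image.mean(axis=(0, 1))  # stand-in for a learned encoder

def extract_motion(driving_frame: np.ndarray) -> np.ndarray:
    """Stage 1b: encode expression/pose features from one driving frame."""
    return driving_frame.std(axis=(0, 1))  # stand-in for a learned encoder

def render(appearance: np.ndarray, motion: np.ndarray) -> np.ndarray:
    """Stage 1c: combine appearance and motion into an output frame."""
    return appearance + motion  # stand-in for a generator network

def stitch_and_retarget(frame: np.ndarray, eye_scale=1.0, lip_scale=1.0):
    """Stage 2: stitching cleans up seams; retargeting scales eye/lip motion."""
    return frame * eye_scale * lip_scale  # stand-in for the control modules

source = np.zeros((256, 256, 3))                          # the still portrait
driving = [np.ones((256, 256, 3)) * i for i in range(3)]  # 3 driving frames

appearance = extract_appearance(source)        # computed once per portrait
animation = [stitch_and_retarget(render(appearance, extract_motion(f)))
             for f in driving]                 # one output per driving frame
print(len(animation))  # 3 frames
```

The key point of the structure is that appearance is extracted once per portrait, while motion is re-extracted for every frame of the driving video.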

What Is a "Driving Video"?

Unlike the videos I insist on capturing with my Tesla's dashcam, "driving video" means something different in the context of LivePortrait. Instead of showing ladders in the middle of Highway 101 or spectacularly bad driving, a "driving video" here is a key concept in LivePortrait's animation process: it provides the motion data that drives the animation of the still image. Essentially, the driving video shows a person or animal making various facial expressions and head movements.

LivePortrait uses this video to map these movements onto the still image, making it appear as though the image is performing the same actions. This technique ensures the animated portrait looks natural and lifelike. This is one of the sample driving videos provided by LivePortrait.


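The paper describes this mapping in terms of implicit facial keypoints: the source portrait's keypoints are shifted by however much the driving face's keypoints have moved since the driving video's first frame. A minimal sketch with made-up coordinates (the keypoints and offsets below are invented for illustration):

```python
import numpy as np

# Hypothetical illustration of relative motion transfer: the source
# portrait's keypoints move by however much the driving face's keypoints
# have moved relative to the driving video's first frame.

source_kp = np.array([[0.30, 0.40], [0.70, 0.40], [0.50, 0.70]])  # eyes, mouth
driving_first = np.array([[0.32, 0.38], [0.68, 0.38], [0.50, 0.72]])
driving_now = driving_first + np.array([[0.0, 0.02]] * 3)  # face tilts down

offset = driving_now - driving_first   # motion since the first frame
animated_kp = source_kp + offset       # apply that motion to the source
print(animated_kp)
```

Using offsets relative to the driving video's first frame, rather than the driving face's absolute positions, is what lets one driving video animate a face with very different proportions.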
If you create a free Hugging Face account, you can try this tool out for yourself. Hugging Face is an open source data science and machine learning platform. Recognizable by an adorable little smiley face emoji with hands, the platform acts as a collaboration space for new models, datasets and applications.

Testing Out LivePortrait

Step #1 involves choosing a "Source Portrait." I chose a favorite image of my son, taken about a decade ago.

If you want to be a good archeologist, you gotta get out of the library!

Step #2 involves adding the driving video. I chose one of the sample videos whose expression matched the serious expression of Dr. Jones.

Default examples offered on the Hugging Face page

Step #3: Click the "Animate" button and wait for the magic to happen. It rendered very quickly on a Mac with an "M"-series chip. (On a high-end GPU, it can generate a frame in just 12.8 milliseconds.)
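
For context, the 12.8 milliseconds per frame quoted above works out to roughly 78 frames per second, well above the 24-30 fps of typical video:

```python
ms_per_frame = 12.8            # per-frame generation time quoted above
fps = 1000 / ms_per_frame      # milliseconds in a second / time per frame
print(fps)                     # ~78 frames per second, faster than real time
```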

The combination of the driving video and the still portrait = animated young Dr. Jones.

This is a playable video showing the driving video in action. (The rest of the shots below are screencaps for those who cannot access video.)

Screencap: Driving video awkwardly smiles and young Indy awkwardly smiles.
Screencap: Driving video slightly opens mouth and looks off to the side, and young Indiana does the same. The eye movements are not an exact match, likely due to a difference in eye shape. Young Indy is only half-Asian.
Screencap: These movements synced up well. Driving video looks to side and does big cheesy grin. Young Indiana perfectly follows suit with a big toothy grin.


Other Key Features:

Versatility: It works on various styles of images, from realistic photos to oil paintings and even 3D renderings.

In this example provided by LivePortrait, an oil painting is brought to life

Fine Control: Users can adjust specific aspects of the animation, like how wide the eyes open or how much the lips move.

Generalization: With some additional training, the system can even animate animals like cats, dogs, and pandas.

And, yes, you could create some incredible cat videos for YouTube with this technology. Screenshot taken from the LivePortrait research paper.

Technical Innovations

  • Large-Scale Training: The researchers used about 69 million high-quality video frames to train the system, significantly improving its performance.
  • Mixed Training Strategy: By using both still images and videos during training, LivePortrait learned to handle a wider range of inputs.
  • Compact Representation: The system uses a clever method to represent facial movements with minimal data, making it very efficient.
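
To get a feel for why a compact motion representation matters, compare the data needed to describe one frame of motion as a dense per-pixel flow field versus a small set of 3D keypoints. The image size and keypoint count below are illustrative assumptions, not figures from the paper:

```python
# Illustrative comparison; the image size and keypoint count are
# assumptions for this example, not the paper's exact figures.
height, width = 512, 512
dense_flow_values = height * width * 2     # an (x, y) offset for every pixel
num_keypoints = 21
keypoint_values = num_keypoints * 3        # an (x, y, z) position per keypoint

print(dense_flow_values)                     # 524288 values per frame
print(keypoint_values)                       # 63 values per frame
print(dense_flow_values // keypoint_values)  # thousands of times fewer values
```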

Potential Applications

  • Creating virtual avatars for video calls or online presentations
  • Bringing historical figures to life in educational content
  • Enhancing storytelling in digital media and entertainment
  • Generating animated content for social media and marketing

OK, now let's get into those murky ethical areas.

Ethical Considerations

"I see dead people." - Cole. The Sixth Sense.

Using LivePortrait, you can take any image and combine it with a driving video.

Animating the Deceased

The ability to bring historical figures or recently deceased individuals "back to life" is a double-edged sword. If you have a photo of someone who has died, you can combine it with a driving video and move their face around. This is a huge and thriving business in China, and I fully respect that other cultures have different views on how they regard their dearly departed. For myself, I'd be uncomfortable animating a photo of the brother I lost to cancer. The memories I have of him are very special to me, and I don't wish to "reanimate" him in AI form.

Potential Benefits:

  • Educational value in history lessons
  • Preserving memories for grieving families
  • Creating unique experiences in museums or cultural events

Ethical Concerns:

  • Violation of the deceased's privacy and dignity
  • Emotional distress for family members and loved ones
  • Blurring the line between remembrance and exploitation

Deepfakes and Political Manipulation

In an election year, the implications of such technology become even more concerning:

Potential for Misinformation:

  • Creation of fake videos showing politicians saying or doing things they never did
  • Rapid spread of false narratives through social media
  • Erosion of trust in visual evidence

Although, to be fair, the public these days should be continually reminded that "seeing is not believing." The reason I probably like the slightly dystopian style of AI video is that I don't have to think about what is real and what is not. Flying goats are clearly not real, but they sure are fun.

Consent and Ownership

The issue of consent becomes complicated when dealing with historical figures or the deceased:

  • Who has the right to authorize the use of a person's likeness? Before using my son's photo for this article, I checked with him to make sure it was okay.
  • Does a fan have a right to manipulate a celebrity's photo without their consent? Do these rights change when a public figure/celebrity passes on? Should there be a "digital right of publicity" that extends beyond death?
  • How do we balance public interest with individual privacy rights?


Potential Safeguards and Regulations

To address these ethical concerns, several measures could be considered:

Legal Frameworks: Developing laws that govern the use of AI-generated likenesses, especially of deceased individuals

Ethical Guidelines: Creating industry standards for the responsible use of portrait animation technology

Digital Watermarking: Implementing mandatory markers to identify AI-generated content

Education: Promoting digital literacy to help the public critically evaluate AI-generated media

Consent Mechanisms: Establishing clear processes for obtaining permission from individuals (or their estates) for the use of their likeness
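
As a toy illustration of the watermarking idea above, here is a least-significant-bit marker embedded in an image with NumPy. Real provenance schemes (such as the C2PA standard) are far more robust than this; the sketch only shows how a hidden tag can ride along invisibly inside pixel data:

```python
import numpy as np

def embed_lsb(image: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Hide one bit per pixel in the least-significant bit of a grayscale image."""
    flat = image.flatten().copy()
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits  # overwrite lowest bit
    return flat.reshape(image.shape)

def extract_lsb(image: np.ndarray, n_bits: int) -> np.ndarray:
    """Read the hidden bits back out."""
    return image.flatten()[:n_bits] & 1

image = np.random.randint(0, 256, (8, 8), dtype=np.uint8)
mark = np.array([1, 0, 1, 1, 0, 1, 0, 0], dtype=np.uint8)  # hypothetical AI tag

tagged = embed_lsb(image, mark)
recovered = extract_lsb(tagged, mark.size)
print(recovered)  # [1 0 1 1 0 1 0 0]
```

Because only the lowest bit of each pixel changes, the tagged image is visually identical to the original, yet the marker survives and can be read back by anyone who knows where to look.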


Final Thoughts

As we continue to develop and use technologies like LivePortrait, it's crucial to engage in ongoing ethical discussions and establish frameworks that protect individuals' rights while allowing for innovation and creative expression.


I am a retired educator who enjoys writing about AI.

Learn something new every day with #DeepLearningDaily.



Key Terms From This Article:

Driving Video

A video used as a reference for motion and expressions, which the AI uses to animate the still image. It provides the movement patterns that are then applied to the source image.

Base Model

The foundational AI model in LivePortrait that learns to extract and combine appearance and motion information.

Stitching Module

A component in LivePortrait that enhances the quality of the animation by improving the transition between different facial expressions.

Retargeting Modules

Features in LivePortrait that allow fine-tuned control over specific facial elements like eyes and lips in the animated output.

Deepfake

Synthetic media where a person's likeness is replaced with someone else's in existing images or videos, often using AI techniques.

GPU (Graphics Processing Unit)

A specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images, often used in AI processing for faster computations.

Mixed Training Strategy

An approach in AI where the model is trained on both still images and videos to improve its ability to handle various types of input.

Implicit Keypoints

A technique used in LivePortrait to represent facial movements efficiently with minimal data.

Fine-tuning

The process of making small adjustments to a pre-trained AI model to improve its performance on a specific task or dataset.

Virtual Avatar

A digital representation of a person, often used in video calls, online presentations, or virtual environments.

Digital Right of Publicity

A proposed concept extending an individual's right to control the commercial use of their name, image, or likeness to the digital realm, potentially even after death.

Digital Watermarking

The process of embedding information into digital media, which could be used to identify AI-generated content.


FAQs

  • What is LivePortrait? It’s a tool that creates animations from a single image using AI.
  • How does it work? It uses AI to add realistic movements to still images.
  • What are its key features? Control over movements, high processing speed, and versatility.
  • What are the practical applications? Virtual avatars, animated stories, and more.
  • What are the ethical considerations? Emotional impact, potential misuse, and consent issues.


Additional Resources for Inquisitive Minds:


#AITechnology #PortraitAnimation #LivePortrait #DeepfakeEthics #AIEthics #DeepLearning

