Face Rigging

To capture the motion of any driving video and reproduce it in an anime character with AI.

In brief, the module transfers motion onto anime characters with the help of three components: a face shifter, a rotator, and a combiner. All of this is made possible by a 68-point facial landmark detection approach. The model works well with or without a webcam and supports custom background effects.

My scope is to adapt the model so that the motion of a video drives any anime character. Initially I worked on the model configuration, focusing on facial landmark detection, which is easily achievable. The next step is to animate those facial landmarks, which is done by the face shifter and rotator using the landmark values. Finally, all of these data points are merged through a model combiner, which produces the output.

Goal and Vision: 

To visualise and reproduce the motion of a driving video in custom characters, with customisation and effects that turn it into a cartoonish world. Initially I have tried this with anime characters; if it works, I will extend it to human characters and to a cartoonish world in both 2D and 3D.

Model brief and Implementation:

The approach works with animation characters (a 3D-model approach) and consists of a face shifter and a face rotator, which trace out the facial expressions and forward the data points to the face combiner, where the actual motion results can be seen. The model does not generate any files during execution; it transfers the data points from one model to the next, which improves performance.

I tested the model with predefined files to animate the characters, then approached it with custom characters. For custom characters, a lot of morphing and editing is needed locally, under specific constraints, with other software such as GNU GIMP, but after that the model works with custom inputs.

Technique:

The images fed from VoxCeleb2 are resized from 224x224 to 256x256 using zero-padding. This is done so that the spatial dimensions are not affected when passing through the downsampling layers.
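The zero-padding step above can be sketched as follows; this is a minimal NumPy illustration (the article's actual preprocessing code is not shown), padding a 16-pixel zero border on each side:

```python
import numpy as np

def zero_pad_to_256(image: np.ndarray) -> np.ndarray:
    """Pad a 224x224xC image to 256x256xC with a 16-pixel zero border."""
    pad = (256 - 224) // 2  # 16 pixels on each side
    return np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="constant")

frame = np.ones((224, 224, 3), dtype=np.uint8)
padded = zero_pad_to_256(frame)
print(padded.shape)  # (256, 256, 3)
```

Padding with zeros (rather than interpolating up to 256x256) leaves the original pixel content untouched, which is why the spatial information survives the downsampling layers.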

The embedder uses 6 downsampling residual blocks, with a self-attention layer added in the middle; at the end, the output of the last residual block is reduced to a vector of size 512 through max pooling.
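To see why six downsampling blocks pair naturally with a 256x256 input, here is a small sketch of the spatial-size arithmetic, assuming each residual block halves the resolution (a common convention; the article does not state the stride explicitly):

```python
def spatial_sizes(input_size: int, num_blocks: int) -> list:
    """Spatial size after each stride-2 downsampling residual block."""
    sizes = [input_size]
    for _ in range(num_blocks):
        sizes.append(sizes[-1] // 2)
    return sizes

# 256x256 input through 6 halving blocks
print(spatial_sizes(256, 6))  # [256, 128, 64, 32, 16, 8, 4]
```

The final 4x4 feature map is then collapsed by max pooling into the 512-dimensional embedding vector.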

The generator and discriminator use the same architecture as the embedder.

Module with code sample:

Code description: It takes a driving video and an anime image as input, processes them, and outputs the anime image animated with the driving motion.

The code works both with a webcam and with a video file.

Code workflow:

The code imports the necessary packages and libraries.

I provide the driving video, the input anime image, and the output path as command-line arguments. The driving video must be in MP4 format; alternatively, I can provide a live webcam feed through its index (--human_video 0). The input anime image must be in RGBA format with a transparent background and a size of 256 x 256 px.
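A minimal argparse sketch of this interface is shown below. Only `--human_video` appears in the article; the `--anime_image` and `--output` flag names are my assumptions for illustration:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(
        description="Drive an anime image with a video file or webcam feed")
    # accepts an .mp4 path, or a webcam index such as 0
    p.add_argument("--human_video", default="driving.mp4",
                   help="driving video (.mp4) or webcam index, e.g. 0")
    # hypothetical flag names for the remaining inputs
    p.add_argument("--anime_image", default="character.png",
                   help="256x256 RGBA image with a transparent background")
    p.add_argument("--output", default="out.mp4", help="output path")
    return p

opt = build_parser().parse_args(["--human_video", "0"])
print(opt.human_video)
```

Note that argparse delivers the value as a string ("0" here), so the code must later decide whether it names a file or a webcam index.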

I can also provide a background image to customise the background of the final output.

The model processes the driving video frame by frame and extracts the facial landmarks from each frame using the landmark detection method. It then calculates the head, eye, and mouth positions from them using Euler angles and passes the pose to the next stage.

Note: this section is covered in the function update_image(self).
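The head-pose part of this step can be illustrated with the standard Euler-angle construction. This is a generic NumPy sketch, not the article's own code; the axis convention (roll about z, yaw about y, pitch about x) is an assumption:

```python
import numpy as np

def euler_to_rotation(yaw: float, pitch: float, roll: float) -> np.ndarray:
    """Compose a head-pose rotation matrix R = Rz(roll) @ Ry(yaw) @ Rx(pitch)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])   # pitch
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])   # yaw
    rz = np.array([[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]])   # roll
    return rz @ ry @ rx

# zero angles give the identity: the head is facing straight ahead
R = euler_to_rotation(0.0, 0.0, 0.0)
```

Three such angles, plus scalar openness values for the eyes and mouth, are enough to describe the pose that gets handed to the poser.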

Next, the anime image is passed to the poser module, which returns the generated anime image:

posed_image = self.poser.pose(self.source_image, self.current_pose).detach().cpu()

The algorithm then performs post-processing, which improves the image quality by removing noise and adds the background to the processed image, by calling the function:

pil_image = self.post_processing(pil_image, resized_frame_bg)
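The background-compositing half of this post-processing step amounts to alpha blending the RGBA output over the chosen background. Here is a hedged NumPy sketch (the function name and shapes are my assumptions, not the article's implementation):

```python
import numpy as np

def composite_over_background(rgba: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Alpha-blend an RGBA foreground over an RGB background of the same size."""
    alpha = rgba[..., 3:4].astype(np.float32) / 255.0
    fg = rgba[..., :3].astype(np.float32)
    bg = background.astype(np.float32)
    return (alpha * fg + (1.0 - alpha) * bg).astype(np.uint8)

fg = np.zeros((256, 256, 4), dtype=np.uint8)       # fully transparent foreground
bg = np.full((256, 256, 3), 200, dtype=np.uint8)   # flat grey background
out = composite_over_background(fg, bg)
print(out[0, 0])  # [200 200 200]
```

Because the anime image is required to have a transparent background, its alpha channel tells the compositor exactly where the custom background should show through.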

The final output image is returned to the Tk module, which displays the processed result.

The result is updated and regenerated by the command below, which reschedules update_image roughly every 16 ms (about 60 frames per second):

self.master.after(1000 // 60, self.update_image)

For processing with a webcam, I can edit the code and change the index to 0 so it works with the webcam as well:

video_capture = cv2.VideoCapture(opt.human_video)
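cv2.VideoCapture opens a webcam when given an int index and a file when given a path string, so the command-line value has to be converted before this call. A small helper like the one below (my own sketch; the article does not show this conversion) makes --human_video work for both cases:

```python
def parse_video_source(value: str):
    """Return an int webcam index for digit strings, else the file path unchanged.

    cv2.VideoCapture(0) opens the default webcam, while
    cv2.VideoCapture("video.mp4") opens a video file, so the raw
    command-line string must be converted first.
    """
    return int(value) if value.isdigit() else value

print(parse_video_source("0"))          # 0
print(parse_video_source("drive.mp4"))  # drive.mp4
```

The result is then passed straight to cv2.VideoCapture(opt.human_video) as in the line above.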

Results:


Customised Results:

To exercise these features, I have made some changes in the modules so that they work with a custom image background, a custom video background, and a webcam.


Extensions:

This section has some ideas for extending this article that you may wish to explore.

  • Different Datasets - Update the example to work with other datasets.
  • Without Identity Mapping - Update the example to train the generator models without the identity mapping, then compare results.
  • Software Package - Extend this module into a complete software package.

If you explore any of these extensions, I’d love to know.

Thank You.

