MODNet: Remove Background in real-time [Demo and code included]
Ibrahim Sobh - PhD
Senior Expert of Artificial Intelligence, Valeo Group | LinkedIn Top Voice | Machine Learning | Deep Learning | Data Science | Computer Vision | NLP | Developer | Researcher | Lecturer
Do you really need a green screen for real-time portrait matting?
Introduction
MODNet is a lightweight matting objective decomposition network that performs portrait matting from a single input image in real time.
- MODNet is much faster than contemporaneous matting methods and runs at 63 frames per second.
- MODNet achieves remarkable results on everyday photos and videos.
- MODNet is easy to train end-to-end.
Finding the person in an image and removing the background is not an easy task. Many techniques rely on basic computer vision algorithms that do this quickly but not precisely.
MODNet is simple, fast, and effective, and it removes the need for a green screen in real-time portrait matting.
Related Work
Deep Image Matting, by Adobe Research, is an example of applying deep learning to this task. However, its approach is more complicated than MODNet's. Deep Image Matting consists of two stages. The first is a deep convolutional encoder-decoder network that takes an image patch and a trimap as inputs and predicts the alpha matte of the image. The second is a small convolutional network that refines the alpha matte predictions of the first network to produce more accurate alpha values and sharper edges.
MODNet Approach
By taking only RGB images as input, MODNet enables the prediction of alpha mattes under changing scenes.
MODNet can process trimap-free portrait matting in realtime under changing scenes.
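To make the role of the alpha matte concrete, here is a minimal sketch of how a predicted matte is typically used to place a person on a new background. It uses the standard compositing equation, output = alpha * foreground + (1 - alpha) * background. The function name `composite` and the tiny 2x2 arrays are illustrative assumptions, not part of the MODNet code.

```python
import numpy as np

def composite(foreground, alpha, background):
    """Blend foreground over background with a per-pixel alpha matte.

    foreground, background: float arrays of shape (H, W, 3) in [0, 1]
    alpha: float array of shape (H, W) in [0, 1]; 1 means fully foreground
    """
    a = alpha[..., np.newaxis]  # add a channel axis so alpha broadcasts over RGB
    return a * foreground + (1.0 - a) * background

# Hypothetical 2x2 example: a gray "person" composited over a green background.
person = np.full((2, 2, 3), 0.8)
green = np.zeros((2, 2, 3))
green[..., 1] = 1.0
alpha = np.array([[1.0, 0.5],
                  [0.0, 1.0]])

out = composite(person, alpha, green)
print(out[0, 1])  # a half-transparent pixel mixes both colors
```

Swapping `green` for any other image of the same size changes the background without touching the matte.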
- (a) MODNet is trained on a labeled dataset to learn matting sub-objectives from RGB images (supervised).
- (b) To adapt to real-world data, MODNet is fine-tuned on unlabeled data by using the consistency between sub-objectives (self-supervised).
- (c) For video matting, a one-frame delay (OFD) trick is used to smooth the predicted alpha mattes across the video sequence.
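The one-frame delay idea in step (c) can be sketched as follows. This is a simplified reading of the trick, not the authors' implementation: a pixel in the current matte is treated as flicker when its two neighboring frames agree with each other but the current frame disagrees with both, and it is then replaced by the neighbors' average. The function name and the threshold value of 0.1 are assumptions made for illustration.

```python
import numpy as np

def one_frame_delay(prev_alpha, cur_alpha, next_alpha, threshold=0.1):
    """Smooth flickering pixels in cur_alpha using the previous and next mattes.

    All inputs are float arrays of the same shape with values in [0, 1].
    The threshold is an assumed value chosen for this sketch.
    """
    # Neighboring frames agree with each other at this pixel.
    neighbors_agree = np.abs(prev_alpha - next_alpha) <= threshold
    # The current frame disagrees with both neighbors.
    differs_from_prev = np.abs(cur_alpha - prev_alpha) > threshold
    differs_from_next = np.abs(cur_alpha - next_alpha) > threshold
    flicker = neighbors_agree & differs_from_prev & differs_from_next

    smoothed = cur_alpha.copy()
    smoothed[flicker] = 0.5 * (prev_alpha[flicker] + next_alpha[flicker])
    return smoothed

# Hypothetical 1x3 mattes for three consecutive frames.
prev_a = np.array([[1.0, 0.0, 0.5]])
cur_a = np.array([[0.2, 0.0, 0.9]])   # first pixel flickers
next_a = np.array([[1.0, 0.0, 0.9]])

smoothed = one_frame_delay(prev_a, cur_a, next_a)
print(smoothed)
```

Because the correction needs the next frame, the output lags the input by one frame, which is where the name comes from.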
To overcome the domain shift problem, MODNet uses a self-supervised strategy based on sub-objective consistency (SOC). This strategy exploits the consistency among the sub-objectives to reduce artifacts in the predicted alpha matte.
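A rough sketch of what a consistency objective of this kind could look like is shown below. It is an illustration under stated assumptions rather than the loss used by the authors: the alpha matte is downsampled by plain average pooling and compared with the predicted semantics, and the alpha matte is compared with the predicted details only inside a boundary mask. The helper names, the pooling factor, and the omission of any blurring step are all assumptions.

```python
import numpy as np

def downsample(alpha, factor):
    """Average-pool a (H, W) matte by an integer factor (H, W divisible by factor)."""
    h, w = alpha.shape
    return alpha.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def consistency_loss(semantics, details, alpha, boundary_mask, factor):
    """Penalize disagreement between the three predictions on an unlabeled image."""
    # Coarse agreement: downsampled alpha should match the low-resolution semantics.
    semantic_term = 0.5 * np.mean((downsample(alpha, factor) - semantics) ** 2)
    # Fine agreement: alpha should match the details inside the boundary region only.
    n_boundary = max(int(boundary_mask.sum()), 1)
    detail_term = np.sum(boundary_mask * np.abs(alpha - details)) / n_boundary
    return semantic_term + detail_term

# Hypothetical 4x4 prediction with a soft edge in the third column.
alpha = np.tile(np.array([1.0, 1.0, 0.5, 0.0]), (4, 1))
mask = ((alpha > 0.0) & (alpha < 1.0)).astype(float)

consistent = consistency_loss(downsample(alpha, 2), alpha, alpha, mask, factor=2)

bad_details = alpha.copy()
bad_details[:, 2] = 0.9  # details branch disagrees with alpha on the edge
inconsistent = consistency_loss(downsample(alpha, 2), bad_details, alpha, mask, factor=2)
print(consistent, inconsistent)
```

No ground-truth matte appears anywhere in this loss, which is what allows fine-tuning on unlabeled data.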
Architecture of MODNet
Given an input image I, MODNet predicts:
- Human semantics sp
- Boundary details dp
- Final alpha matte αp
These are predicted through three interdependent branches, S, D, and F, which are constrained by specific supervisions generated from the ground truth matte αg. Because the decomposed sub-objectives are correlated and help strengthen each other, MODNet can be optimized end-to-end.
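As an illustration of how supervision might be derived from the ground truth matte, the sketch below builds a boundary mask that marks the transition region between foreground and background, which is where a detail branch would be supervised. The construction (threshold, then dilation minus erosion) is a common approach and an assumption here; the 0.5 threshold, the radius, and the function names are hypothetical.

```python
import numpy as np

def _windows(mask, radius):
    """Stack every shifted copy of mask within a (2*radius+1)^2 neighborhood."""
    padded = np.pad(mask, radius, mode="edge")
    h, w = mask.shape
    size = 2 * radius + 1
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(size) for dx in range(size)])

def boundary_mask(alpha_gt, radius=1):
    """Return a boolean mask of pixels near the foreground/background boundary."""
    foreground = alpha_gt > 0.5          # assumed binarization threshold
    windows = _windows(foreground, radius)
    dilated = windows.any(axis=0)        # any neighbor is foreground
    eroded = windows.all(axis=0)         # all neighbors are foreground
    return dilated & ~eroded             # the band between the two

# Hypothetical 5x6 ground truth: background on the left, foreground on the right.
alpha_gt = np.zeros((5, 6))
alpha_gt[:, 3:] = 1.0

mask = boundary_mask(alpha_gt, radius=1)
print(mask.astype(int))
```

A larger `radius` widens the band, so more pixels around the silhouette count as boundary.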
MODNet + SOC + OFD
Online supplementary video for more results
Conclusions
By taking only RGB images as input, MODNet enables the prediction of alpha mattes under changing scenes.
- MODNet suffers less from the domain shift problem in practice due to the proposed SOC and OFD.
- MODNet is shown to perform well on the carefully designed PPM-100 benchmark and on a variety of real-world data.
- MODNet cannot handle unusual costumes or strong motion blur that are not covered by the training set.
- One possible direction for future work is to address video matting under motion blur through additional sub-objectives, e.g., optical flow estimation.
Try it yourself!
Related online tools
Regards