MODNet: Remove Background in real-time [Demo and code included]

Ibrahim Sobh - PhD

Senior Expert of Artificial Intelligence, Valeo Group | LinkedIn Top Voice | Machine Learning | Deep Learning | Data Science | Computer Vision | NLP | Developer | Researcher | Lecturer

Published: December 28, 2020

Do you really need a green screen for real-time portrait matting?

Introduction

MODNet is a lightweight matting objective decomposition network that performs portrait matting from a single input image in real time.

MODNet is much faster than contemporaneous matting methods and runs at 63 frames per second.
MODNet achieves remarkable results on everyday photos and videos.
MODNet is easy to train end-to-end.

Finding the person and removing the background is not an easy task. Many techniques rely on basic computer vision algorithms that do this quickly but not precisely.

MODNet is a simple, fast, and effective way to avoid using a green screen in real-time portrait matting.

Related Work

Deep Image Matting, by Adobe Research, is an example of applying deep learning to this task. However, it is a more complicated approach than MODNet. Deep Image Matting consists of two stages. The first stage is a deep convolutional encoder-decoder network that takes an image patch and a trimap as inputs and predicts the alpha matte of the image. The second stage is a small convolutional network that refines the alpha matte predicted by the first network to produce more accurate alpha values and sharper edges.

MODNet Approach

By taking only RGB images as input, MODNet enables the prediction of alpha mattes under changing scenes.

MODNet can perform trimap-free portrait matting in real time under changing scenes.

(a) MODNet is trained on a labeled dataset to learn the matting sub-objectives from RGB images. (supervised)
(b) To adapt to real-world data, MODNet is fine-tuned on unlabeled data by using the consistency between sub-objectives. (self-supervised)
(c) For video matting, a one-frame delay (OFD) trick is used to help smooth the predicted alpha mattes of the video sequence.
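The one-frame delay idea in (c) can be sketched as follows. This is a minimal illustration, not the authors' code: when the previous and next frames agree at a pixel but the current frame differs from both, the current value is treated as flicker and replaced by the average of its neighbours. The function name `one_frame_delay` and the threshold `eps=0.1` are assumptions made for this example.

```python
import numpy as np

def one_frame_delay(prev_alpha, cur_alpha, next_alpha, eps=0.1):
    """Smooth the current alpha matte using its two neighbouring frames.

    All inputs are arrays of the same shape with values in [0, 1].
    eps is an illustrative threshold, not a value taken from the article.
    """
    prev_alpha = np.asarray(prev_alpha, dtype=np.float32)
    cur_alpha = np.asarray(cur_alpha, dtype=np.float32)
    next_alpha = np.asarray(next_alpha, dtype=np.float32)

    # Pixels where the previous and next frames agree with each other.
    neighbours_agree = np.abs(prev_alpha - next_alpha) <= eps
    # Pixels where the current frame differs from both neighbours.
    cur_is_outlier = (np.abs(cur_alpha - prev_alpha) > eps) & (
        np.abs(cur_alpha - next_alpha) > eps
    )
    flicker = neighbours_agree & cur_is_outlier

    # Replace flickering pixels with the average of the two neighbours.
    smoothed = cur_alpha.copy()
    smoothed[flicker] = 0.5 * (prev_alpha[flicker] + next_alpha[flicker])
    return smoothed
```

Because the correction for frame t needs frame t+1, the output lags the input by one frame, which is where the name comes from.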

To overcome the domain shift problem, the authors introduce a self-supervised strategy based on sub-objective consistency (SOC) for MODNet. This strategy uses the consistency among the sub-objectives to reduce artifacts in the predicted alpha matte.
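To make the consistency idea concrete, here is a rough sketch of a penalty that needs no ground-truth labels. It is an illustration under stated assumptions, not the paper's exact loss: the coarse version of the predicted alpha should match the semantic prediction, and inside a boundary region the alpha should match the detail prediction. The names `downsample` and `consistency_penalty`, the average-pooling downsampler, the `factor`, and the equal weighting of the two terms are all hypothetical choices.

```python
import numpy as np

def downsample(x, factor):
    """Average-pool a 2-D map by an integer factor (a stand-in downsampler)."""
    h, w = x.shape
    h2, w2 = h // factor, w // factor
    x = x[: h2 * factor, : w2 * factor]
    return x.reshape(h2, factor, w2, factor).mean(axis=(1, 3))

def consistency_penalty(semantic, detail, alpha, boundary_mask, factor=4):
    """Penalize disagreement between the three predictions of one image.

    semantic:      coarse map of shape (H // factor, W // factor)
    detail, alpha: full-resolution maps of shape (H, W)
    boundary_mask: (H, W) map, 1 inside the boundary region, 0 elsewhere
    """
    semantic = np.asarray(semantic, dtype=np.float64)
    detail = np.asarray(detail, dtype=np.float64)
    alpha = np.asarray(alpha, dtype=np.float64)
    boundary_mask = np.asarray(boundary_mask, dtype=np.float64)

    # Semantic term: the downsampled alpha should match the semantic map.
    sem_term = 0.5 * np.mean((downsample(alpha, factor) - semantic) ** 2)
    # Detail term: inside the boundary region, alpha should match the detail map.
    det_term = np.sum(boundary_mask * np.abs(alpha - detail)) / max(
        boundary_mask.sum(), 1.0
    )
    return float(sem_term + det_term)
```

The penalty is zero when the three predictions agree and grows as they diverge, so minimizing it on unlabeled images pushes the branches toward mutually consistent outputs.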

Architecture of MODNet

Given an input image I, MODNet predicts:

Human semantics s_p
Boundary details d_p
Final alpha matte α_p

These are predicted through three interdependent branches, S, D, and F, each constrained by specific supervision generated from the ground-truth matte α_g. Because the decomposed sub-objectives are correlated and help strengthen each other, MODNet can be optimized end-to-end.
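Once a final alpha matte is available, replacing the background is a standard alpha-compositing step: output = α · foreground + (1 − α) · background. The sketch below is an illustration rather than part of MODNet; the function name `composite` is hypothetical, and it uses the input image itself as the foreground, which is a common simplification.

```python
import numpy as np

def composite(image, alpha, background):
    """Blend an RGB image over a new background using an alpha matte.

    image, background: (H, W, 3) arrays with values in [0, 255]
    alpha:             (H, W) array with values in [0, 1]
    """
    image = np.asarray(image, dtype=np.float32)
    background = np.asarray(background, dtype=np.float32)
    # Add a channel axis so alpha broadcasts across the three colour channels.
    alpha = np.asarray(alpha, dtype=np.float32)[..., None]

    out = alpha * image + (1.0 - alpha) * background
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

Passing a solid white array as `background` gives a white backdrop; any other image of the same size can be substituted.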

MODNet + SOC + OFD

Online supplementary video for more results

Conclusions

By taking only RGB images as input, MODNet enables the prediction of alpha mattes under changing scenes.

MODNet suffers less from the domain shift problem in practice due to the proposed SOC and OFD.
MODNet performs well on the carefully designed PPM-100 benchmark and on a variety of real-world data.
MODNet cannot handle unusual costumes or strong motion blur that are not covered by the training set.
One possible direction for future work is to address video matting under motion blur through additional sub-objectives, e.g., optical flow estimation.

Try yourself!

Colab Image Demo

Colab WebCam Video Demo

Regards

Ali Gad

Data Scientist @Spinneys Egypt

3 years ago

Very good job... but can I change that white background? How?
