Deep Learning: InfoGAN
Ibrahim Sobh - PhD
Senior Expert of Artificial Intelligence, Valeo Group
InfoGAN is an information-theoretic extension to the Generative Adversarial Network (GAN) that is able to learn disentangled representations in a completely unsupervised manner.
Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing supervised methods. InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset and background digits from the central digit on the SVHN dataset. It also discovers visual concepts such as hair styles, the presence or absence of eyeglasses, and emotions on the CelebA face dataset, all in an unsupervised manner.
Disentangled representation explicitly represents the salient attributes of a data instance. For example, for a dataset of faces, a useful disentangled representation may allocate a separate set of dimensions for each of the following attributes: facial expression, eye color, hairstyle, presence or absence of eyeglasses, and the identity of the corresponding person.
Basic Idea:
InfoGAN splits the generator input into two parts: the traditional noise vector z and a new "latent code" vector c. It proposes a simple modification to the GAN objective that encourages learning interpretable and meaningful representations by maximizing the mutual information between the latent code and the generator's output. Despite its simplicity, the authors found the method to be surprisingly effective.
This framework is implemented by merely adding a regularization term to the original GAN objective:

min_G max_D V_InfoGAN(D, G) = V(D, G) − λ I(c; G(z, c))

where λ (lambda) is the regularization constant and I(c; G(z, c)) is the mutual information between the latent code c and the generator output G(z, c). The mutual information is hard to compute explicitly, so a variational lower bound is maximized instead, using an auxiliary distribution Q(c|x) to approximate the true posterior P(c|x).
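For a discrete code with a uniform prior, the lower bound reduces to L_I = E[log Q(c|x)] + H(c), i.e. a cross-entropy between the code fed to the generator and the auxiliary network's prediction, plus a constant entropy term. Below is a minimal NumPy sketch of that bound; the `q_logits` input stands in for the output of a hypothetical Q head that, as in the paper, would share layers with the discriminator:

```python
import numpy as np

def log_softmax(x):
    # Numerically stable log-softmax over the last axis.
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def mi_lower_bound(q_logits, c_onehot):
    """Variational lower bound L_I = E[log Q(c|x)] + H(c) for a discrete code.

    q_logits: (batch, n_cat) logits from the auxiliary network Q (assumed head).
    c_onehot: (batch, n_cat) one-hot codes that were fed to the generator.
    """
    n_cat = q_logits.shape[1]
    expected_log_q = (c_onehot * log_softmax(q_logits)).sum(axis=1).mean()
    entropy_c = np.log(n_cat)  # H(c) for a uniform categorical prior (a constant)
    return expected_log_q + entropy_c

# If Q recovers the code perfectly, the bound reaches its maximum H(c) = log(10);
# if Q is uninformative (uniform), the bound is 0 and carries no gradient reward.
c = np.eye(10)[np.array([0, 3, 7])]
print(mi_lower_bound(50.0 * c, c))          # ~ log(10) ≈ 2.3026
print(mi_lower_bound(np.zeros((3, 10)), c))  # ~ 0.0
```

The generator's loss then subtracts λ times this bound, so maximizing it amounts to minimizing a standard cross-entropy, which is why the extra cost over a plain GAN is negligible.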
GANs are known to be difficult to train, so the experiments build on the existing training techniques introduced by DCGAN.
Results:
For MNIST (a hand-written digit dataset), the authors specified a 10-state discrete code c1 (hoping it would map to the digit class) and two continuous codes, c2 and c3, each ranging from −1 to +1.
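A minimal sketch of sampling this split generator input, using the 62-dimensional noise vector from the paper's MNIST setup (the dimensions and helper name are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(batch, noise_dim=62, n_cat=10, n_cont=2):
    """Sample the MNIST InfoGAN input: noise z, one-hot c1, continuous c2, c3."""
    z = rng.standard_normal((batch, noise_dim))        # incompressible noise
    c1 = np.eye(n_cat)[rng.integers(0, n_cat, batch)]  # 10-state discrete code
    c23 = rng.uniform(-1.0, 1.0, (batch, n_cont))      # continuous codes in [-1, 1]
    return np.concatenate([z, c1, c23], axis=1)        # one vector fed to G

print(sample_latent(4).shape)  # (4, 74): 62 noise + 10 discrete + 2 continuous
```

To "manipulate" a code at test time, one fixes z and varies a single entry of c, e.g. sweeping c2 across [−1, 1] while holding c1 and c3 constant.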
Manipulating latent codes on 3D Chairs:
In (a), the continuous code captures the pose of the chair while preserving its shape; in (b), the continuous code captures the widths of different chair types and smoothly interpolates between them.
Change in emotion, roughly ordered from sad to happy.
In conclusion ...
InfoGAN is completely unsupervised and learns interpretable and disentangled representations on challenging datasets.
InfoGAN adds only negligible computation cost on top of GAN and is easy to train.
Best Regards