Deep Learning: Teach Your Network How to Draw!

Teaching your kids! It is common for parents to teach their kids how to draw by showing them some images and asking them to draw those images. Usually, kids do not copy pixel by pixel; they build a more compressed internal representation of the input image and use it to reconstruct the original. (Some do copy pixel by pixel, but that is not our case :) )

Neural Networks: In our case, the input is just the pixels of an image, and the desired output is that same image. Training a network means adjusting its parameters to achieve a certain objective, and our objective is clear: ask the network to regenerate the input image from its internal/compressed representation.
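To make this objective concrete: a common choice is the mean squared error between the input pixels and the reconstructed pixels. A minimal sketch in Python (the function name here is my own illustration; the article does not specify a loss):

import numpy as np

def reconstruction_loss(x, x_hat):
    # Mean squared error between the input and its reconstruction.
    return np.mean((x - x_hat) ** 2)

x = np.random.rand(4096)            # a flattened 64*64 grayscale image
print(reconstruction_loss(x, x))    # a perfect reconstruction gives 0.0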

We will train neural networks to do the same. For example, consider the following fully connected network (a code sketch follows the list):

  • Input: grayscale 64*64 image, vector length = 4096
  • Internal/compressed representation length = 1024 < 4096
  • Output: the same grayscale 64*64 image, vector length = 4096
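As a rough sketch of the architecture above, assuming Keras (the article does not name a framework), the whole model is just two dense layers; the activations and optimizer are my assumptions, not choices stated in the article:

from tensorflow import keras
from tensorflow.keras import layers

# Fully connected autoencoder: 4096 -> 1024 -> 4096
inputs = keras.Input(shape=(4096,))                       # flattened 64*64 grayscale image
code = layers.Dense(1024, activation="relu")(inputs)      # internal/compressed representation
outputs = layers.Dense(4096, activation="sigmoid")(code)  # reconstruction, pixel values in [0, 1]

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(x_train, x_train, ...)  # note: the target is the input itself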

 

Here, after training the network, I gave it some new images and asked it to reconstruct them (the first row shows the input images; the second row shows the reconstructed images). The dataset used in the following examples is here.

Internal layer size = 1024 (looks very nice!)

 

Internal layer size = 512 (looks good)

 

Internal layer size = 128

 

Internal layer size = only 4 (looks like the average of all faces)

Wait a minute! This looks like a lossy data compression method, where we need only a 1024-dimensional vector (for example) to represent a 4096-dimensional one. In other words, the 1024 vector holds the important features needed to reconstruct the original image. This is not generic compression; it depends on the training process. Moreover, we can control the compression ratio and the quality of the reconstructed images by changing the internal vector length: with 1024 out of 4096, the code is only 25% of the original size. Increasing the internal length reconstructs better-looking images but increases the number of parameters in the model.

Convolutional Networks

Since our inputs are images, it makes sense to use ConvNets. For example (a code sketch follows the list):

  • Input: grayscale 64*64 image, vector length = 4096
  • Convolution, size 8
  • Max pooling
  • Convolution, size 8
  • Max pooling
  • Output: the same grayscale 64*64 image, vector length = 4096
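Here is a hedged Keras sketch of such a convolutional autoencoder. I read "size 8" as 8 filters; the 3*3 kernels, the upsampling, and the decoder layers are my assumptions (the article only lists the encoder side, but a mirrored decoder is implied by the output):

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(64, 64, 1))                            # grayscale image
x = layers.Conv2D(8, 3, activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D(2)(x)                                      # down to 32*32
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D(2)(x)                                # 16*16 compressed feature maps

x = layers.Conv2D(8, 3, activation="relu", padding="same")(encoded)
x = layers.UpSampling2D(2)(x)                                      # back to 32*32
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
x = layers.UpSampling2D(2)(x)                                      # back to 64*64
outputs = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

conv_autoencoder = keras.Model(inputs, outputs)
conv_autoencoder.compile(optimizer="adam", loss="mse")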

 

As expected, we get better results even with a smaller number of features.

 

Colored images

Grayscale images are stored as 2-dimensional arrays, where each point holds the gray level of the corresponding pixel (from 0 to 1, or from 0 to 255). Colored images, on the other hand, are stored as 3-dimensional arrays, with three values per pixel representing the red, green, and blue (R, G, B) components.

For neural networks, colored images are very similar to gray images: both are just vectors (see the sketch below).
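A quick illustration of both representations, and of the fact that each flattens to a single vector (shapes follow the article's 64*64 grayscale images and CIFAR-10's 32*32 color images):

import numpy as np

gray = np.zeros((64, 64))       # 2D array: one gray level per pixel
color = np.zeros((32, 32, 3))   # 3D array: R, G, B values per pixel (CIFAR-10 size)

print(gray.reshape(-1).shape)   # (4096,) -> one vector
print(color.reshape(-1).shape)  # (3072,) -> also just one (longer) vector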

Here I used the CIFAR-10 dataset and ConvNets. 
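Loading CIFAR-10 takes one line in Keras (again, my assumed framework); for a ConvNet like the sketch above, the input shape becomes (32, 32, 3), and pixels are scaled to [0, 1] before training:

from tensorflow.keras.datasets import cifar10

(x_train, _), (x_test, _) = cifar10.load_data()  # labels are unused: the targets are the images
x_train = x_train.astype("float32") / 255.0      # scale pixels to [0, 1]
x_test = x_test.astype("float32") / 255.0
# conv_autoencoder.fit(x_train, x_train, epochs=5, validation_data=(x_test, x_test))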

(The first row shows the original input images; the second row shows the reconstructed images.)

After 1 epoch (looks good)

 

After 5 epochs (looks better)

 

 

Autoencoders

What we have implemented above is known as an “autoencoder”. Training deep neural networks used to be very difficult, and autoencoders were used for greedy layer-wise pre-training of deep convolutional neural networks. Nowadays we usually apply better random weight initialization schemes, ReLU activations, and batch normalization, which enable training even deeper networks directly.

Tip: Sometimes the decoder and encoder share weights as a regularization strategy.
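One way to implement this tying, as a minimal sketch (TiedDense is a custom layer of my own, not a stock Keras feature): the decoder reuses the transpose of the encoder's weight matrix and only learns its own bias.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class TiedDense(layers.Layer):
    # Dense layer whose kernel is the transpose of another Dense layer's kernel.
    def __init__(self, tied_to, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.tied_to = tied_to
        self.activation = keras.activations.get(activation)

    def build(self, input_shape):
        # Only a bias is learned here; the kernel is shared with the encoder layer.
        units = self.tied_to.kernel.shape[0]
        self.bias = self.add_weight(shape=(units,), initializer="zeros", name="bias")

    def call(self, inputs):
        out = tf.matmul(inputs, self.tied_to.kernel, transpose_b=True) + self.bias
        return self.activation(out)

encoder_layer = layers.Dense(1024, activation="relu")
inputs = keras.Input(shape=(4096,))
code = encoder_layer(inputs)                                    # builds the encoder kernel
outputs = TiedDense(encoder_layer, activation="sigmoid")(code)  # reuses it, transposed
tied_autoencoder = keras.Model(inputs, outputs)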

Autoencoders are usually used for dimensionality reduction. With appropriate settings, autoencoders can learn data projections that are more interesting than PCA.
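To use a trained autoencoder this way, keep only the encoder half and compare it with PCA at the same dimensionality. A sketch reusing inputs and code from the fully connected model above (x_flat stands for some flattened image matrix and is hypothetical):

from sklearn.decomposition import PCA

encoder = keras.Model(inputs, code)      # just the encoder half of the autoencoder
ae_features = encoder.predict(x_flat)    # nonlinear projection, shape (n_samples, 1024)

pca_features = PCA(n_components=1024).fit_transform(x_flat)  # linear baseline

It is worth noting that a single-hidden-layer autoencoder with linear activations learns essentially the same subspace as PCA; the nonlinear activations are what allow it to learn more interesting projections.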

Autoencoders Summary 

  • Try to reconstruct the input
  • Used to learn features or summarization of the data
  • Features can be used for supervised tasks (a sketch follows the list)
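As a sketch of the last point: train a simple classifier on top of the encoder's output (encoder comes from the sketch above; x_flat and y_train are hypothetical data and labels):

from sklearn.linear_model import LogisticRegression

features = encoder.predict(x_flat)                              # learned, compressed features
clf = LogisticRegression(max_iter=1000).fit(features, y_train)  # supervised model on top
print(clf.score(features, y_train))                             # training accuracy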

 

Regards 
