Demystify Deep Learning - Vanishing Gradient Problem

In the last article, you saw how the vanishing gradient was a problem while training a deep net. You also came to know that Restricted Boltzmann Machines solved the problem.

So, what is a Restricted Boltzmann Machine?

It is a method that can automatically find patterns in the data by reconstructing its input.

Did you catch that?

No? Don’t worry!!!

Go through this article till the end, and I assure you that you will understand the whole concept.

Geoffrey Hinton from the University of Toronto was among the first researchers to come up with a breakthrough idea for training a deep net. His approach produced the Restricted Boltzmann Machine, also known as the RBM.

An RBM is a shallow, two-layer net: the first layer is called the visible layer and the second the hidden layer. Each node in the visible layer is connected to every node in the hidden layer. The RBM is considered "restricted" because no two nodes in the same layer share a connection.

The RBM is the mathematical equivalent of a two-way translator. In the forward pass, the RBM takes the inputs and translates them into a set of numbers that encodes the inputs. In the backward pass, it takes that set of numbers and translates it back to form the reconstructed inputs. A well-trained net can perform the backward translation with a high degree of accuracy. In both passes, forward and backward, weights and biases play a very important role: they let the RBM decide the interrelations among the input features, and they help it decide which input features are the most important when detecting patterns.
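To make the two-way translator idea concrete, here is a minimal sketch in Python with NumPy. The layer sizes, variable names, and random initialization are illustrative assumptions rather than anything from the article. Note that the same weight matrix serves both directions, while each layer has its own bias.

```python
import numpy as np

def sigmoid(x):
    # Squashes any number into the range (0, 1), read as an activation probability.
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical sizes: 6 visible nodes, 3 hidden nodes.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(6, 3))  # one weight per visible-hidden connection
b_hidden = np.zeros(3)                  # overall bias for the hidden layer
b_visible = np.zeros(6)                 # overall bias for the visible layer

def encode(v):
    # Forward pass: translate an input into hidden-layer activations.
    return sigmoid(v @ W + b_hidden)

def decode(h):
    # Backward pass: translate hidden activations back into a reconstruction.
    return sigmoid(h @ W.T + b_visible)

v = np.array([1, 0, 1, 1, 0, 0], dtype=float)  # an example binary input
print(decode(encode(v)))                        # the reconstructed input
```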

Through several forward and backward passes, the RBM is trained to reconstruct the input data. The following three steps are repeated over and over during training:

1. In the forward pass, every input is combined with an individual weight and one overall bias, and the result is passed to the hidden layer, which may or may not activate (a code sketch of all three steps follows after step 3).

2. In the backward pass, each activation is combined with an individual weight and an overall bias, and the result is passed to the visible layer for reconstruction.

3. At the visible layer, the reconstruction is compared against the original input to determine the quality of the result. For this, the RBM uses a measure called "KL divergence", which captures the difference between the actual input and its recreation.

Steps 1 to 3 are repeated with adjusted weights and biases until the inputs and their recreations are as close as possible.
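Below is a minimal sketch of this loop, reusing the sigmoid, W, b_hidden, b_visible, rng, and v objects from the earlier snippet. One caveat: while step 3 describes the comparison in terms of KL divergence, in practice RBMs are commonly trained with an update rule called contrastive divergence (CD-1), which approximates that objective; the sketch uses CD-1, and the learning rate and epoch count are illustrative assumptions.

```python
learning_rate = 0.1  # illustrative hyperparameter

for epoch in range(1000):
    # Step 1 - forward pass: input times weights plus the hidden bias; each
    # hidden node then activates stochastically, i.e. it may or may not fire.
    h0 = sigmoid(v @ W + b_hidden)
    h0_sample = (rng.random(h0.shape) < h0).astype(float)

    # Step 2 - backward pass: activations times weights plus the visible bias,
    # giving the reconstruction, followed by one more forward pass from it.
    v1 = sigmoid(h0_sample @ W.T + b_visible)
    h1 = sigmoid(v1 @ W + b_hidden)

    # Step 3 - compare the reconstruction with the original and nudge the
    # weights and biases so the two move closer together (CD-1 update).
    W += learning_rate * (np.outer(v, h0) - np.outer(v1, h1))
    b_visible += learning_rate * (v - v1)
    b_hidden += learning_rate * (h0 - h1)
```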

The fascinating part of the RBM, as you may have realized from the training flow, is that the data does not need to be labelled for the RBM to be trained. This unique quality makes RBMs very useful for real-world datasets like photos, videos, voice recordings, and sensor data (for IoT), all of which tend to be unlabeled.

Rather than relying on manually labelled data, which can introduce errors, the RBM automatically sorts through the data and, by properly adjusting its weights and biases, extracts the important features and reconstructs the inputs. Most importantly, the RBM is capable of deciding which input features are important and how they should be combined to form patterns.
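As a short illustration of that point, again reusing the hypothetical sketch above: once the RBM is trained, the forward pass alone turns a new, unlabeled input into a learned feature vector.

```python
# Hypothetical new, unlabeled sample; its hidden activations are the features.
new_input = np.array([0, 1, 1, 0, 0, 1], dtype=float)
features = sigmoid(new_input @ W + b_hidden)
print(features)  # values near 1 mark strongly detected patterns
```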

Within the neural net family, the RBM belongs to the feature-extractor nets, all of which are designed to recognize hidden patterns in the data. These feature-extractor nets are also called autoencoders.

At this point you have learned what a Restricted Boltzmann Machine is and how it works.

But how does this help solve the problem of the vanishing gradient?

You will find out in the next article.

Till then, write your comments in the comment section and follow #demystifydeep

#demystifydeep #deeplearning #machinelearning #classification #technology #artificial_intelligence #ml #ai #neuralnetworks #artificialintelligence #algorithms #algorithm #econometrics
