Demystify Deep Learning - Vanishing Gradient Problem
Subhasish Bhattacharjee
Director Business Solution | Co-Founder & VP | Product & Program Manager | Big Data | Data Science
In the last article you learned how the vanishing gradient was a problem while training a deep net. You also came to know that Restricted Boltzmann Machines solved the problem.
So, what is a Restricted Boltzmann Machine?
It is a method that automatically finds patterns in the data by reconstructing the input.
Did you catch that?
No? Don’t worry!!!
Go through this article till the end, and I assure you that you will understand the whole concept.
Geoffrey Hinton of the University of Toronto was among the first researchers to come up with a breakthrough idea for training a deep net, and his approach created the Restricted Boltzmann Machine, also known as the RBM.
An RBM is a shallow, two-layer net: the first layer is called the visible layer and the second is called the hidden layer. Each node in the visible layer is connected to every node in the hidden layer. The RBM is considered "restricted" because no two nodes in the same layer share a connection.
An RBM is the mathematical equivalent of a two-way translator. In the forward pass, the RBM takes the inputs and translates them into a set of numbers that encodes the inputs. In the backward pass, it takes that set of numbers and translates it back to form the reconstructed inputs. A well-trained net will be able to perform the backward translation and reconstruct the inputs with a high degree of accuracy. In both the forward and the backward pass, weights and biases play a very important role: they allow the RBM to decipher the interrelations among the input features, and they help the RBM decide which input features are the most important when detecting patterns.
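To make the two-way translation concrete, here is a minimal sketch in Python with NumPy. The layer sizes, the sigmoid activation and the names (W, visible_bias, hidden_bias) are illustrative assumptions rather than details from this article; the point is only that the same weights serve both directions.

import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 6, 3  # illustrative layer sizes
W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))  # one weight per visible-hidden pair
visible_bias = np.zeros(n_visible)  # overall bias for the visible layer
hidden_bias = np.zeros(n_hidden)    # overall bias for the hidden layer

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(v):
    # Forward pass: translate the input into an encoded set of numbers.
    return sigmoid(v @ W + hidden_bias)

def backward(h):
    # Backward pass: translate the encoding back into a reconstruction.
    return sigmoid(h @ W.T + visible_bias)

v = rng.integers(0, 2, size=n_visible).astype(float)  # a toy binary input
h = forward(v)         # encode
v_recon = backward(h)  # reconstruct

Notice that the backward pass reuses the transpose of the same weight matrix, which is exactly what makes the RBM behave like a two-way translator.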
Through several forward and backward passes, the RBM is trained to reconstruct the input data. The following three steps are repeated over and over throughout the training process (a code sketch follows the list):
1. In the forward pass, every input is combined with an individual weight and one overall bias, and the result is passed to the hidden layer, which may or may not activate.
2. In the backward pass, each activation is combined with an individual weight and an overall bias, and the result is passed to the visible layer for reconstruction.
3. At the visible layer, the reconstruction is compared against the original input to determine the quality of the result. For this, the RBM uses a measure called KL divergence, which captures the difference between the actual input and the recreation.
Steps 1 to 3 are repeated with adjusted weights and biases until the inputs and the recreations are as close as possible.
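Here is a hedged sketch of that training loop, reusing forward, backward, W and the biases from the sketch above. The article does not spell out the exact update rule, so this uses one-step contrastive divergence (CD-1), the standard RBM update; the learning rate, epoch count and toy dataset are assumptions, and mean squared reconstruction error stands in for the KL divergence mentioned in step 3.

learning_rate = 0.1  # assumed hyperparameter
data = rng.integers(0, 2, size=(20, n_visible)).astype(float)  # toy unlabeled inputs

for epoch in range(100):
    for v0 in data:
        h0 = forward(v0)   # step 1: forward pass to the hidden layer
        v1 = backward(h0)  # step 2: backward pass, reconstruct the visible layer
        h1 = forward(v1)
        # Step 3 and repeat: nudge the weights and biases so the
        # reconstruction moves closer to the input (CD-1 update).
        W += learning_rate * (np.outer(v0, h0) - np.outer(v1, h1))
        visible_bias += learning_rate * (v0 - v1)
        hidden_bias += learning_rate * (h0 - h1)
    # Mean squared reconstruction error: a simple stand-in for the
    # divergence between the actual inputs and the recreations.
    error = np.mean((data - backward(forward(data))) ** 2)

Note that nothing in this loop ever consults a label: the training signal comes entirely from comparing each input with its own reconstruction.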
The fascinating part of the RBM is that, as you may have already realized from the training flow, the data does not need to be labelled for the RBM to be trained. This unique quality makes RBMs very useful for real-world datasets like photos, videos, voices and sensor data (for IoT), all of which tend to be unlabeled.
Rather than having the data manually labelled, which introduces errors, the RBM automatically sorts through the data and, by properly adjusting its weights and biases, is able to extract the important features and reconstruct the inputs. The most important part is that the RBM is actually capable of deciding which input features are important and how they should be combined to form the patterns.
Within the neural net family, the RBM belongs to a group of feature-extractor neural nets, all of which are designed to recognize the hidden patterns in the data. These feature-extractor neural nets are also called autoencoders.
At this point you have learned what a Restricted Boltzmann Machine is and how it works.
But how does this help solve the problem of the vanishing gradient?
You will find out in the next article.
Till then, write your comments in the comment section and follow #demystifydeep
#demystifydeep #deeplearning #classification #technology #artificial_intelligence #ml #ai #neuralnetworks #artificialintelligence #algorithms #algorithm #econometrics #machinelearning