Demystify Deep Learning - Vanishing Gradient Problem
Subhasish Bhattacharjee
Director Business Solution | Co-Founder & VP | Product & Program Manager | Big Data | Data Science
In the last article you learned how the vanishing gradient was a problem while training a deep net. You also came to know that Restricted Boltzmann Machines solved the problem.
So, what is a Restricted Boltzmann Machine?
It is a method that automatically finds patterns in the data by reconstructing the input.
Did you catch that?
No? Don’t worry!!!
Go through this article till the end, and I assure you that you will understand the whole concept.
Geoffrey Hinton of the University of Toronto was among the first researchers to come up with a breakthrough idea for training a deep net, and his approach created the Restricted Boltzmann Machine, also known as the RBM.
An RBM is a shallow, two-layer net: the first layer is called the visible layer and the second is called the hidden layer. Each node in the visible layer is connected to every node in the hidden layer. The RBM is considered "restricted" because no two nodes in the same layer share a connection.
An RBM is the mathematical equivalent of a two-way translator. In the forward pass, the RBM takes the inputs and translates them into a set of numbers that encodes the inputs. In the backward pass, it takes that set of numbers and translates it back to form the reconstructed inputs. A well-trained net will be able to perform the backward translation and reconstruct the inputs with a high degree of accuracy. In both the forward and the backward pass, weights and biases play a very important role: they allow the RBM to decipher the interrelations among the input features, and they help the RBM decide which input features are the most important when detecting patterns.
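To make the two-way translation concrete, here is a minimal sketch in Python with NumPy. The layer sizes, the sigmoid activation and the names (W, visible_bias, hidden_bias) are illustrative assumptions rather than details from this article; the point is only that the same weights serve both directions.

import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 6, 3  # illustrative layer sizes
W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))  # one weight per visible-hidden pair
visible_bias = np.zeros(n_visible)  # overall bias for the visible layer
hidden_bias = np.zeros(n_hidden)    # overall bias for the hidden layer

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(v):
    # Forward pass: translate the input into an encoded set of numbers.
    return sigmoid(v @ W + hidden_bias)

def backward(h):
    # Backward pass: translate the encoding back into a reconstruction.
    return sigmoid(h @ W.T + visible_bias)

v = rng.integers(0, 2, size=n_visible).astype(float)  # a toy binary input
h = forward(v)         # encode
v_recon = backward(h)  # reconstruct

Notice that the backward pass reuses the transpose of the same weight matrix, which is exactly what makes the RBM behave like a two-way translator.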
Through several forward and backward passes, the RBM is trained to reconstruct the input data. The following three steps are repeated over and over throughout the training process (a code sketch follows the list):
1. In the forward pass, every input is combined with an individual weight and one overall bias, and the result is passed to the hidden layer, which may or may not activate.
2. In the backward pass, each activation is combined with an individual weight and an overall bias, and the result is passed to the visible layer for reconstruction.
3. At the visible layer, the reconstruction is compared against the original input to determine the quality of the result. For this, the RBM uses a measure called KL divergence, which captures the difference between the actual input and the recreation.
Steps 1 to 3 are repeated with adjusted weights and biases until the inputs and the recreations are as close as possible.
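Here is a hedged sketch of that training loop, reusing forward, backward, W and the biases from the sketch above. The article does not spell out the exact update rule, so this uses one-step contrastive divergence (CD-1), the standard RBM update; the learning rate, epoch count and toy dataset are assumptions, and mean squared reconstruction error stands in for the KL divergence mentioned in step 3.

learning_rate = 0.1  # assumed hyperparameter
data = rng.integers(0, 2, size=(20, n_visible)).astype(float)  # toy unlabeled inputs

for epoch in range(100):
    for v0 in data:
        h0 = forward(v0)   # step 1: forward pass to the hidden layer
        v1 = backward(h0)  # step 2: backward pass, reconstruct the visible layer
        h1 = forward(v1)
        # Step 3 and repeat: nudge the weights and biases so the
        # reconstruction moves closer to the input (CD-1 update).
        W += learning_rate * (np.outer(v0, h0) - np.outer(v1, h1))
        visible_bias += learning_rate * (v0 - v1)
        hidden_bias += learning_rate * (h0 - h1)
    # Mean squared reconstruction error: a simple stand-in for the
    # divergence between the actual inputs and the recreations.
    error = np.mean((data - backward(forward(data))) ** 2)

Note that nothing in this loop ever consults a label: the training signal comes entirely from comparing each input with its own reconstruction.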
The fascinating part of the RBM is that, as you may have already realized from the training flow, the data does not need to be labelled for the RBM to be trained. This unique quality makes RBMs very useful for real-world datasets like photos, videos, voices and sensor data (for IoT), all of which tend to be unlabeled.
Rather than having the data manually labelled, which introduces errors, the RBM automatically sorts through the data and, by properly adjusting its weights and biases, is able to extract the important features and reconstruct the inputs. The most important part is that the RBM is actually capable of deciding which input features are important and how they should be combined to form the patterns.
Within the neural net family, the RBM belongs to a group of feature-extractor neural nets, all of which are designed to recognize the hidden patterns in the data. These feature-extractor neural nets are also called autoencoders.
At this point you have learned what a Restricted Boltzmann Machine is and how it works.
But how does this help solve the problem of the vanishing gradient?
You will find out in the next article.
Till then, write your comments in the comment section and follow #demystifydeep
#demystifydeep #deeplearning #classification #technology #artificial_intelligence #ml #ai #neuralnetworks #artificialintelligence #algorithms #algorithm #econometrics #machinelearning