Simplifying Deep Learning - Part II
Rohan Chikorde
VP - AIML at BNY Mellon | AIML Corporate Trainer | University Professor | Speaker
Outline of Deep Belief Nets Algorithm
An RBM can extract features and reconstruct input data, but it still lacks the ability to combat the vanishing gradient problem. However, by cleverly combining several stacked RBMs with a classifier, you can form a neural net that solves the problem. This net is known as a Deep Belief Network.
The Deep Belief Network, or DBN, was also conceived by Geoff Hinton. These powerful nets are believed to be used by Google for their work on the image recognition problem. In terms of structure, a DBN is identical to a Multilayer Perceptron, but structure is where the similarities end – a DBN has a radically different training method, which allows it to tackle the vanishing gradient problem.
The method is known as Layer-wise, unsupervised, greedy pre-training. Essentially, the DBN is trained two layers at a time, and these two layers are treated like an RBM. Throughout the net, the hidden layer of an RBM acts as the input layer of the adjacent one. So the first RBM is trained, and its outputs are then used as inputs to the next RBM. This procedure is repeated until the output layer is reached.
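The greedy pre-training loop described above can be sketched in a few lines. This is a minimal, illustrative NumPy implementation (not Hinton's original code): layer sizes, the learning rate, and the epoch count are all assumptions, and each RBM is trained with single-step contrastive divergence (CD-1), the standard shortcut for RBM training.

```python
# Sketch of greedy, layer-wise pre-training of stacked RBMs.
# Hypothetical minimal example; sizes and hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, lr=0.1):
    """Train one RBM with single-step contrastive divergence (CD-1)."""
    n_visible = data.shape[1]
    W = rng.normal(0, 0.01, size=(n_visible, n_hidden))
    b_v = np.zeros(n_visible)          # visible-unit biases
    b_h = np.zeros(n_hidden)           # hidden-unit biases
    for _ in range(epochs):
        # Positive phase: hidden activations given the data
        h_prob = sigmoid(data @ W + b_h)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: reconstruct the visible units, re-infer hidden
        v_recon = sigmoid(h_sample @ W.T + b_v)
        h_recon = sigmoid(v_recon @ W + b_h)
        # CD-1 update: difference between data and reconstruction statistics
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
        b_v += lr * (data - v_recon).mean(axis=0)
        b_h += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_h

# Greedy stacking: each trained RBM's hidden layer feeds the next RBM
layer_sizes = [64, 32, 16]
x = rng.random((100, 64))              # toy unlabeled data
weights = []
for n_hidden in layer_sizes[1:]:
    W, b_h = train_rbm(x, n_hidden)
    weights.append((W, b_h))
    x = sigmoid(x @ W + b_h)           # outputs become inputs for the next RBM

print([w.shape for w, _ in weights])   # [(64, 32), (32, 16)]
```

Note how no labels appear anywhere in this loop – the entire stack is pre-trained unsupervised, one pair of layers at a time.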
After this training process, the DBN is capable of recognizing the inherent patterns in the data. In other words, it's a sophisticated, multilayer feature extractor. The unique aspect of this type of net is that each layer ends up learning the full input structure. In other types of deep nets, layers generally learn progressively complex patterns – for facial recognition, early layers could detect edges and later layers would combine them to form facial features. A DBN, on the other hand, learns the hidden patterns globally, like a camera slowly bringing an image into focus.
In the end, a DBN still requires a set of labels to apply to the resulting patterns. As a final step, the DBN is fine-tuned with supervised learning and a small set of labeled examples. After making minor tweaks to the weights and biases, the net will achieve a slight increase in accuracy.
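The final supervised step can be sketched as a simple classifier trained on top of the pre-extracted features. Everything here is an illustrative assumption (random toy features, a plain softmax head, a handful of gradient steps) – just enough to show that only this last stage needs labels:

```python
# Sketch of supervised fine-tuning: a softmax classifier trained on a
# small labeled set of features produced by the pre-trained stack.
# All sizes and data here are illustrative.
import numpy as np

rng = np.random.default_rng(1)
features = rng.random((20, 16))        # outputs of the pre-trained layers
labels = rng.integers(0, 3, size=20)   # small labeled set, 3 classes

W = np.zeros((16, 3))                  # classifier weights
for _ in range(100):                   # a few supervised gradient steps
    logits = features @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)  # softmax probabilities
    onehot = np.eye(3)[labels]
    # Cross-entropy gradient: small tweaks to the weights
    W -= 0.1 * features.T @ (p - onehot) / len(features)

accuracy = ((features @ W).argmax(axis=1) == labels).mean()
```

In a full DBN, this backprop pass would also slightly adjust the pre-trained RBM weights, but the labeled set needed for it stays small because the heavy lifting was already done unsupervised.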
This entire process can be completed in a reasonable amount of time using GPUs, and the resulting net is typically very accurate. Thus the DBN is an effective solution to the vanishing gradient problem. As an added real-world bonus, the training process only requires a small set of labelled data.
Outline of Convolutional Neural Network (CNN) Algorithm:
Out of all the current Deep Learning applications, machine vision remains one of the most popular. Since Convolutional Neural Nets (CNN) are one of the best available tools for machine vision, these nets have helped Deep Learning become one of the hottest topics in AI.
CNNs are deep nets used for image, object, and even speech recognition. Pioneered by Yann LeCun at New York University, these nets are widely utilized in the tech industry – Facebook, for example, uses them for facial recognition. If you start reading about CNNs you will quickly discover the ImageNet challenge, a project started to showcase the state of the art and to help researchers access high-quality image data. Every top Deep Learning team in the world enters the competition, but each time it's a CNN that ends up taking first place.
CNNs have multiple types of layers, the first of which is the convolutional layer. To visualize this layer, imagine a set of evenly spaced flashlights all shining directly at a wall. Every flashlight is looking for the exact same pattern through a process called convolution. A flashlight’s area of search is fixed in place, and it is bounded by the individual circle of light cast on the wall. The entire set of flashlights forms one filter, which is able to output location data of the given pattern. A CNN typically uses multiple filters in parallel, each scanning for a different pattern in the image. Thus the entire convolutional layer is a 3-dimensional grid of these flashlights.
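The flashlight analogy can be made concrete with a tiny sketch: one filter slides over an image and "lights up" wherever its pattern appears. This is a minimal, assumed example (a 5x5 toy image, a 2x2 filter, valid cross-correlation, which is what deep learning frameworks call convolution):

```python
# Minimal sketch of the "flashlight" idea: a single filter scans an image
# and produces strong responses where its pattern is found.
import numpy as np

def convolve2d(image, kernel):
    """Valid 2-D cross-correlation: slide the kernel over every position."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.zeros((5, 5))
image[1:3, 1:3] = 1.0                  # a 2x2 bright square in the image
kernel = np.array([[1.0, 1.0],
                   [1.0, 1.0]])        # filter searching for that square

response = convolve2d(image, kernel)
print(response.max(), np.unravel_index(response.argmax(), response.shape))
# strongest response (4.0) at position (1, 1), exactly where the pattern sits
```

The output of the filter is exactly the "location data of the given pattern" described above; a real convolutional layer runs many such filters in parallel.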
Connecting some dots
- A series of filters forms layer one, called the convolutional layer. The weights and biases in this layer determine the effectiveness of the filtering process.
- Each flashlight represents a single neuron. Typically, neurons in a layer activate or fire. On the other hand, in the convolutional layer, neurons search for patterns through convolution. Neurons from different filters search for different patterns, and thus they will process the input differently.
- Unlike the nets we've seen thus far where every neuron in a layer is connected to every neuron in the adjacent layers, a CNN has the flashlight effect. A convolutional neuron will only connect to the input neurons that it “shines” upon.
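The last bullet is worth quantifying. A rough back-of-the-envelope comparison (the layer sizes here are illustrative assumptions, e.g. a 28x28 image) shows why local, weight-shared connectivity is so much cheaper than full connectivity:

```python
# Rough comparison of weight counts: fully connected vs. convolutional
# "flashlight" connectivity. All sizes are illustrative assumptions.
input_h, input_w = 28, 28          # e.g. a 28x28 grayscale image
n_hidden = 100                     # fully connected hidden layer size
fully_connected = input_h * input_w * n_hidden

kernel = 5                         # each conv neuron "shines" on a 5x5 patch
n_filters = 100                    # weights are shared within each filter
conv_weights = kernel * kernel * n_filters

print(fully_connected, conv_weights)   # 78400 vs 2500
```

Because every flashlight in a filter shares the same weights, the convolutional layer needs orders of magnitude fewer parameters than a dense layer of the same width.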
The convolved input is then sent to the next layer for activation. CNNs use backprop for training, but because a special activation function called ReLU (rectified linear unit) is used, these nets don't suffer from the vanishing gradient problem.
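A deliberately simplified sketch shows why ReLU helps here: its derivative is exactly 1 for positive inputs, while sigmoid's derivative never exceeds 0.25, so multiplying sigmoid gradients across many layers shrinks the backpropagated signal toward zero. (This toy example just multiplies the same local gradient ten times; a real backward pass is more involved.)

```python
# Illustrative sketch of the vanishing gradient: the product of local
# gradients across 10 layers, for ReLU vs. sigmoid activations.
import numpy as np

def relu_grad(x):
    return (x > 0).astype(float)       # 1 for positive inputs, 0 otherwise

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)               # at most 0.25

x = np.array([2.0])                    # a positive pre-activation
depth = 10
sig_signal = np.prod([sigmoid_grad(x)[0] for _ in range(depth)])
relu_signal = np.prod([relu_grad(x)[0] for _ in range(depth)])
print(relu_signal, sig_signal)         # 1.0 vs a vanishingly small number
```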
In real-world applications, image convolution results in hundreds of millions of weights and biases, which has an adverse effect on performance. Thus, after ReLU, the activations are typically pooled in an adjacent layer to reduce dimensionality. Afterwards, there is usually a fully connected layer that acts as a classifier.
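Pooling itself is simple. A minimal sketch of 2x2 max pooling (the most common choice; the activation values below are made up) shows dimensionality dropping by a factor of four while the strongest responses survive:

```python
# Minimal sketch of 2x2 max pooling: each 2x2 block of activations is
# replaced by its maximum, quartering the number of values.
import numpy as np

def max_pool_2x2(x):
    h, w = x.shape
    # Group the array into 2x2 blocks, then take the max of each block
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

acts = np.array([[1., 3., 2., 0.],     # toy post-ReLU activations
                 [4., 2., 1., 1.],
                 [0., 0., 5., 6.],
                 [1., 2., 7., 8.]])
pooled = max_pool_2x2(acts)
print(pooled)   # [[4. 2.] [2. 8.]]
```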
CNNs in use today typically have an architecture with repeated sets of layers. Set 1 is a convolutional layer followed by a ReLU, and this set can be repeated a few times. The repeated structure is followed by a pooling layer; this combination forms set 2, which is itself repeated a few more times. The final structure is then attached to a fully connected layer at the end. This architecture allows the net to continuously build complex patterns from simple ones, all while lowering computing costs through dimensionality reduction.
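The repeated-set structure can be traced just in terms of output shapes. The concrete numbers below (32x32 input, 3x3 "same" convolutions, 2x2 pooling, two conv+ReLU sets per pooling stage, three stages) are illustrative assumptions, not a specific published architecture:

```python
# Shape trace of the repeated-set CNN architecture described above.
# Layer counts and sizes are illustrative assumptions.
def conv_out(size, kernel=3, pad=1):   # "same" convolution keeps the size
    return size + 2 * pad - kernel + 1

def pool_out(size):                    # 2x2 pooling halves each side
    return size // 2

size = 32                              # e.g. a 32x32 input image
trace = []
for _ in range(3):                     # set 2: repeated three times
    for _ in range(2):                 # set 1: conv + ReLU, repeated twice
        size = conv_out(size)
    size = pool_out(size)              # pooling closes each stage
    trace.append(size)

print(trace)   # [16, 8, 4] -> then flattened into a fully connected layer
```

Each pooling stage halves the spatial size, which is exactly the dimensionality reduction that keeps computing costs manageable as the filters grow more abstract.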
CNNs are a powerful tool, but there is one drawback – they require tens of millions of labelled data points for training. They also must be trained on GPUs for the process to complete in a reasonable amount of time.
In the next article we will look at some more deep learning algorithms. Please feel free to like, share, and comment. Thanks!