BxD Primer Series: Deep Residual Neural Networks
Hey there!
Welcome to the BxD Primer Series, where we cover topics such as machine learning models, neural nets, GPT, ensemble models, and hyper-automation in a 'one-post-one-topic' format. Today's post is on Deep Residual Neural Networks. Let's get started:
The What:
Deep Residual Neural Networks, also known as ResNets, were developed to address the vanishing gradients problem in traditional deep neural networks. The vanishing gradients problem occurs when the gradient signal used to update the network weights becomes very small as it propagates backward through the network. As a result, the weights in the earlier layers are not updated effectively, and the network fails to learn useful representations of the input data.
ResNets represent a significant departure from traditional architectures. Instead of trying to learn the input-output mapping directly, each block of a ResNet learns the difference (the residual) between its input and the desired output. This is accomplished using skip connections, which allow the network to learn residual functions.
Skip connections bypass one or more layers in the network, letting the gradient flow backward more easily and reducing the risk of vanishing gradients.
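To make the idea concrete, here is a minimal sketch of a residual block in PyTorch. The 3x3 convolutions, batch normalization, ReLU activation, and equal channel count are illustrative assumptions rather than part of the definition above; the essential piece is the "out + x" addition that implements the skip connection.

```python
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Two 3x3 convolutions form the residual branch
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))  # first conv + batch norm + activation
        out = self.bn2(self.conv2(out))        # output of the residual branch
        return F.relu(out + x)                 # skip connection: add the input, then activate
```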
Anatomy of a ResNet:
Pre-activation vs. Post-activation ResNet:
In a standard (post-activation) ResNet, the skip connection is added to the output of the weight layers and the activation function is applied after the addition. This is the original ResNet design.
In a pre-activation ResNet, batch normalization and the activation function are applied before the weight layers inside the residual branch, and the skip connection is added to the branch output with no further activation. The skip path therefore stays a pure identity mapping.
Post-activation was the original ResNet; the advantages of the pre-activation variant were discovered later. For very deep networks, the clean identity skip path lets gradients propagate more easily and tends to make optimization more stable.
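The difference is easiest to see in code. Below is an illustrative PyTorch sketch of both orderings; the 3x3 convolutions, batch normalization, ReLU, and equal channel counts are assumptions for the example, and projection shortcuts for shape changes are omitted.

```python
import torch.nn as nn
import torch.nn.functional as F

class PostActBlock(nn.Module):
    """Original (post-activation) ordering: conv-BN-ReLU-conv-BN, add the skip, then a final ReLU."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = self.bn2(self.conv2(F.relu(self.bn1(self.conv1(x)))))
        return F.relu(out + x)      # activation applied AFTER the addition

class PreActBlock(nn.Module):
    """Pre-activation ordering: BN-ReLU-conv twice; the skip path remains a pure identity."""
    def __init__(self, channels):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))
        out = self.conv2(F.relu(self.bn2(out)))
        return out + x              # no activation after the addition
```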
The How:
ResNets can be implemented on top of many different base neural network architectures. They are particularly relevant for convolutional neural networks. Here are the general steps to implement a ResNet-based architecture:
Step 1: Initialize the weights and biases of the ResNet randomly or using a dedicated initialization technique.
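As a concrete (assumed) example for the fully connected notation used in the steps below, He-style initialization of one layer could be sketched in NumPy as follows; the fan-in scaling and zero biases are common-practice assumptions, not something specific to ResNets.

```python
import numpy as np

def init_layer(n_in, n_out, seed=0):
    """He-style initialization: weight variance scaled by fan-in, biases set to zero."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_out, n_in))
    b = np.zeros(n_out)
    return W, b
```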
Step 2: Forward Pass: Given an input sample x, perform a forward pass through the network to compute the output y^.
Let's denote the input to the ResNet as a[0] = x.
For each layer l from 1 to L, a plain (non-residual) layer would compute its output a[l] as:
z[l] = W[l] · a[l-1] + b[l]
a[l] = g(z[l])
Where,
W[l] and b[l] are the weights and biases of layer l, and g(·) is the activation function.
In the case of ResNets, we introduce a skip connection that adds the input a[l-1] directly to the output of the residual block:
a_skip[l] = F(a[l-1])
The output of the l-th layer, taking the skip connection into account, becomes:
a[l] = g(z[l] + a_skip[l])
Here, F(·) is the identity function, so a_skip[l] is simply a[l-1].
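Mapping these equations directly into NumPy, the forward pass through one residual layer could be sketched as below; the ReLU choice for g(·), the identity F(·), and the 4-unit toy sizes are illustrative assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward_residual_layer(a_prev, W, b):
    """Compute a[l] = g(z[l] + a_skip[l]) with z[l] = W a[l-1] + b and a_skip[l] = a[l-1]."""
    z = W @ a_prev + b       # z[l] = W[l] · a[l-1] + b[l]
    a_skip = a_prev          # identity skip: F(a[l-1]) = a[l-1]
    return relu(z + a_skip)  # a[l] = g(z[l] + a_skip[l])

# Example: one residual layer with 4 units (W must be square for the identity skip to match shapes)
rng = np.random.default_rng(0)
a0 = rng.normal(size=4)
W1, b1 = 0.1 * rng.normal(size=(4, 4)), np.zeros(4)
a1 = forward_residual_layer(a0, W1, b1)
```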
Step 3: Compute the loss function L between the predicted output y^ and the ground truth labels y.
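As a placeholder example for Step 3, a mean squared error loss between y^ and y can be written as below; the choice of MSE is an assumption for this sketch, and classification ResNets more commonly use cross-entropy.

```python
import numpy as np

def mse_loss(y_hat, y):
    """Mean squared error L(y^, y), averaged over the output units."""
    return 0.5 * np.mean((y_hat - y) ** 2)
```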
Step 4: Backpropagation: Compute the gradients of the loss function w.r.t. the network parameters (weights and biases) using back-propagation.
Starting from the output layer, calculate the gradients recursively using the chain rule.
For each layer l from L to 1, the gradient is computed as:
δ[l] = ( (W[l+1])^T · δ[l+1] + δ_skip[l+1] ) ⊙ g'(z[l] + a_skip[l])
Where,
δ[l] is the gradient of the loss w.r.t. the pre-activation sum z[l] + a_skip[l] of layer l, g'(·) is the derivative of the activation function, and ⊙ denotes element-wise multiplication. At the output layer, δ[L] is obtained directly from the derivative of the loss. The parameter gradients then follow as ∂L/∂W[l] = δ[l] · (a[l-1])^T and ∂L/∂b[l] = δ[l].
Additionally, the gradient flowing backward through the skip connection is:
δ_skip[l] = δ[l]
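Continuing the NumPy sketch, the backward pass through one residual layer (with the identity skip and ReLU assumed earlier) could look like this; delta_out stands for the gradient of the loss w.r.t. this layer's output a[l].

```python
import numpy as np

def relu_grad(z):
    return (z > 0).astype(float)

def backward_residual_layer(delta_out, a_prev, W, b):
    """Given dL/da[l] (delta_out), return the parameter gradients and dL/da[l-1]."""
    z = W @ a_prev + b                          # recompute z[l] (or cache it from the forward pass)
    delta = delta_out * relu_grad(z + a_prev)   # δ[l]: gradient at the pre-activation sum
    dW = np.outer(delta, a_prev)                # dL/dW[l] = δ[l] · (a[l-1])^T
    db = delta                                  # dL/db[l] = δ[l]
    delta_skip = delta                          # δ_skip[l] = δ[l]
    delta_prev = W.T @ delta + delta_skip       # dL/da[l-1]: weight path + identity skip path
    return dW, db, delta_prev
```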
Step 5: Gradient Descent: Update the network parameters using the gradients from Step 4. The update rule for the parameters of each layer l is:
W[l] := W[l] − α · ∂L/∂W[l]
b[l] := b[l] − α · ∂L/∂b[l]
Where, α is the learning rate.
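Written out for one layer's parameters, the update in Step 5 is a single subtraction per parameter; the default learning rate here is an arbitrary assumption.

```python
def sgd_update(W, b, dW, db, alpha=0.01):
    """Plain gradient descent step: parameter := parameter - alpha * gradient."""
    W = W - alpha * dW   # W[l] := W[l] - α · dL/dW[l]
    b = b - alpha * db   # b[l] := b[l] - α · dL/db[l]
    return W, b
```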
Step 6: Repeat steps 2-5 for a fixed number of iterations or until convergence.
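Putting Steps 2-5 into a loop (reusing the helper functions from the NumPy sketches above) gives a toy training loop like the one below; the single residual layer, random data, epoch count, and learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W, b = init_layer(4, 4)                             # Step 1: initialization
x, y = rng.normal(size=4), rng.normal(size=4)       # one toy input/target pair

for epoch in range(100):                            # fixed number of iterations
    a1 = forward_residual_layer(x, W, b)            # Step 2: forward pass
    loss = mse_loss(a1, y)                          # Step 3: loss
    delta_out = (a1 - y) / y.size                   # Step 4: dL/da[1] for the MSE loss
    dW, db, _ = backward_residual_layer(delta_out, x, W, b)
    W, b = sgd_update(W, b, dW, db, alpha=0.05)     # Step 5: parameter update
```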
Step 7: Prediction: Once training is complete, use the trained ResNet model to make predictions on unseen data. Perform a forward pass through the network to obtain the predicted output y^.
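Using the PyTorch ResidualBlock sketched earlier, prediction amounts to a forward pass with gradients disabled; the untrained block and random input here are placeholders for a trained model and real data.

```python
import torch

model = ResidualBlock(channels=16)        # in practice, a trained ResNet model
model.eval()                              # use running BatchNorm statistics at inference time
with torch.no_grad():                     # no gradients are needed for prediction
    x_new = torch.randn(1, 16, 32, 32)    # one sample: batch, channels, height, width
    y_hat = model(x_new)                  # forward pass gives the predicted output y^
```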
The Why:
Reasons for using Deep Residual Neural Networks:
The Why Not:
Reasons for not using Deep Residual Neural Networks:
Time for you to support:
In the next edition, we will cover Capsule Neural Networks.
Let us know your feedback!
Until then,
Have a great time!