Nesterov Accelerated Gradient Descent
Dr. A. Sumithra Gavaskar
Associate Professor at SNS College of Technology, Research Co-ordinator, Dept. of CSE
Gradient descent
It is essential to understand gradient descent before we look at the Nesterov Accelerated Gradient algorithm. Gradient descent is an optimization algorithm that is used to train our model. The performance of a machine learning model is measured by its cost function: the lower the cost, the better our ML model is performing. Optimization algorithms are used to reach the minimum point of the cost function, and gradient descent is the most common of them. It starts from some initial parameter values and then changes them iteratively to reach the minimum point of the cost function.
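As a rough illustration of that update rule, here is a minimal sketch in Python; the one-parameter cost function and the starting value are hypothetical choices for demonstration, not taken from the article.

```python
# Toy cost function with its minimum at w = 3 (illustrative choice only).
def cost(w):
    return (w - 3.0) ** 2

# Gradient (derivative) of the cost with respect to w.
def gradient(w):
    return 2.0 * (w - 3.0)

learning_rate = 0.1   # step-size hyperparameter
w = 0.0               # some initial weight

for step in range(100):
    w = w - learning_rate * gradient(w)   # move against the gradient

print(w, cost(w))     # w ends up very close to 3, the minimum of the cost
```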
We start with some initial weight, which places us at some point on our cost function. Gradient descent then tweaks the weight in each iteration, and we move towards the minimum of our cost function accordingly.
The size of our steps depends on the learning rate of our model. The higher the learning rate, the larger the step size. Choosing the correct learning rate for our model is very important, as a poor choice can cause problems while training.
A low learning rate ensures we reach the minimum point, but it takes many iterations to train, while a very high learning rate can cause us to jump past the minimum point, a problem commonly known as overshooting.
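A quick way to see this trade-off is to run the same loop with different learning rates on an assumed quadratic cost, cost(w) = w², whose gradient is 2w and whose minimum is at w = 0; the numbers here are purely illustrative.

```python
# Gradient descent on cost(w) = w**2; only the learning rate changes per run.
def run_gd(learning_rate, steps=20, w=5.0):
    for _ in range(steps):
        w = w - learning_rate * 2.0 * w
    return w

print(run_gd(0.01))  # low rate: still far from 0 after 20 steps (slow training)
print(run_gd(0.4))   # suitable rate: very close to the minimum at 0
print(run_gd(1.1))   # too high: every step overshoots and |w| keeps growing
```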
Drawbacks of gradient descent
The main drawback of gradient descent is that the update depends only on the learning rate and the gradient at that particular step. On a plateau, or at the saddle points of our function, the gradient is close to zero, so the step size becomes very small or even zero. Thus, the update of our parameters is very slow on a gentle slope.
Let us look at an example. The starting point of our model is ‘A’. The loss function decreases rapidly along the path from A to B because of the higher gradient, but as the gradient shrinks from B to C, the learning becomes negligible. The gradient at point ‘C’ is zero; it is a saddle point of our function. Even after many iterations, we will be stuck at ‘C’ and will not reach the desired minimum ‘D’.
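To make this concrete, here is a small sketch under an assumed cost function cost(w) = w³, which has a saddle point at w = 0 where the gradient 3w² vanishes; the function, starting point, and hyperparameters are hypothetical, chosen only to mimic the A-to-D story above.

```python
# Gradient of the illustrative cost(w) = w**3.
def gradient(w):
    return 3.0 * w ** 2

learning_rate = 0.01
w = 1.0   # playing the role of the starting point 'A'

for step in range(10_000):
    w = w - learning_rate * gradient(w)

# Even after 10,000 iterations, w has only crept down to a small positive
# value: plain gradient descent stalls near the saddle point ('C') and never
# reaches the lower-cost region beyond it ('D').
print(w)
```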
Gradient descent with momentum
The issue discussed above can be solved by including the previous gradients in our calculation. The intuition behind this is that if we are repeatedly pushed in the same direction, we can take bigger steps in that direction.
A weighted average of all the previous gradients is added to our update equation, and it acts as momentum for our step.
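A minimal sketch of that idea, reusing the hypothetical w³ saddle-point example from above; the decay factor beta = 0.9 is a commonly used default, not a value given in the article.

```python
# Gradient of the illustrative cost(w) = w**3 (saddle point at w = 0).
def gradient(w):
    return 3.0 * w ** 2

learning_rate = 0.01
beta = 0.9        # how strongly previous gradients are remembered
w = 1.0           # starting point 'A' again
velocity = 0.0    # exponentially weighted sum of past gradients

for step in range(30):
    velocity = beta * velocity + gradient(w)   # accumulate past gradients
    w = w - learning_rate * velocity           # step using the accumulated momentum

# The accumulated momentum carries w past the saddle point at w = 0 into the
# lower-cost region, whereas plain gradient descent with the same settings
# would still be stuck at a positive value.
print(w)
```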