Top 10 Activation Functions' Advantages & Disadvantages

Sigmoid:-

Normally used as the output activation in binary classification, since it maps any real value to a probability between 0 and 1.

Advantages:

-> Gives a smooth gradient, which helps stable convergence.

-> Normalises its output to the range (0, 1).

-> Gives a clear prediction (classification), with outputs pushed towards 0 and 1.

Disadvantages:

-> Prone to the vanishing gradient problem, since the gradient saturates for large positive or negative inputs.

-> Not a zero-centred function (its output is always positive).

-> Computationally expensive (it involves an exponential).
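
As a rough illustration (not from the original article), here is a minimal NumPy sketch of the Sigmoid function; the helper name and sample values are my own:

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into (0, 1); large |x| saturates the gradient.
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # ~[0.007, 0.5, 0.993]
```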


Tanh:-

Normally used in hidden layers; it squashes its input to the zero-centred range (-1, 1).

Advantages:

-> Zero-centred, unlike Sigmoid (outputs lie between -1 and 1).

-> Gives a smooth gradient, which helps stable convergence.

Disadvantages:

-> Prone to the vanishing gradient problem.

-> Computationally expensive (it involves exponentials).
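
A minimal NumPy sketch of Tanh for comparison (illustrative only; the sample values are my own):

```python
import numpy as np

def tanh(x):
    # Zero-centred: output lies in (-1, 1), unlike Sigmoid's (0, 1).
    return np.tanh(x)

print(tanh(np.array([-2.0, 0.0, 2.0])))  # ~[-0.964, 0.0, 0.964]
```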

ReLU:- (Rectified Linear Unit)

Advantages:

-> Helps avoid the vanishing gradient problem (the gradient is 1 for all positive inputs).

-> Computationally inexpensive (a simple linear/threshold operation).

Disadvantages:

-> Not a zero-centred function.

-> Outputs zero for every negative input, so those neurons can become inactive (the "dying ReLU" problem).
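
A minimal sketch of ReLU in NumPy (illustrative; the helper name is my own):

```python
import numpy as np

def relu(x):
    # Identity for positive inputs, zero otherwise; gradient is 1 or 0, so no saturation on the positive side.
    return np.maximum(0.0, x)

print(relu(np.array([-3.0, 0.0, 3.0])))  # [0. 0. 3.]
```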

Leaky ReLU:-

It is the same as the ReLU function, except that it gives a small output on the negative axis (a fixed slope of 0.01 instead of ReLU's zero).
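
An illustrative NumPy sketch of Leaky ReLU, assuming the commonly used slope of 0.01:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # A small fixed slope (alpha) on the negative side keeps the gradient from becoming exactly zero.
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-3.0, 0.0, 3.0])))  # [-0.03  0.    3.  ]
```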

ELU:- (Exponential Linear Unit)

Advantages:

-> Gives a smooth, saturating curve on the negative axis instead of a hard zero, which helps convergence.

-> For any positive input it behaves like the identity (the same as ReLU).
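
An illustrative NumPy sketch of ELU, assuming the common choice alpha = 1.0:

```python
import numpy as np

def elu(x, alpha=1.0):
    # Identity for x > 0; a smooth exponential curve approaching -alpha for very negative x.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

print(elu(np.array([-3.0, 0.0, 3.0])))  # ~[-0.950, 0.0, 3.0]
```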

SoftMax:-

Normally used as the output in multi-class classification problems to produce a probability for each class (unlike Sigmoid, which is preferred for binary classification).
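
An illustrative NumPy sketch of Softmax; subtracting the maximum logit is a standard numerical-stability trick, not something the article mentions:

```python
import numpy as np

def softmax(logits):
    # Shift by the max for numerical stability, exponentiate, then normalise so the outputs sum to 1.
    z = logits - np.max(logits)
    exps = np.exp(z)
    return exps / np.sum(exps)

print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.659, 0.242, 0.099]
```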

PReLU:- (Parametric ReLU)

The advantage of PReLU is that the slope on the negative axis is a learnable parameter that is fine-tuned during training (unlike the fixed zero of ReLU and the fixed 0.01 of Leaky ReLU).
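
An illustrative NumPy sketch of PReLU; in a real network the slope `a` would be a trainable parameter, here it is simply passed in by hand:

```python
import numpy as np

def prelu(x, a):
    # 'a' is learned during training (per channel or shared), unlike Leaky ReLU's fixed 0.01.
    return np.where(x > 0, x, a * x)

print(prelu(np.array([-3.0, 0.0, 3.0]), a=0.25))  # [-0.75  0.    3.  ]
```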

SWISH:-

Also known as a self-gated function. This activation function was inspired by the way the Sigmoid function is used for gating inside LSTM (Long Short-Term Memory) networks.

Advantages:

-> Can deal with Vanishing Gradient problem.

-> Its output blends ReLU and Sigmoid behaviour: roughly linear for large positive inputs and smoothly gated towards zero elsewhere, which helps keep activations well behaved.

Disadvantage:

-> Computationally expensive (it contains a Sigmoid).
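
An illustrative NumPy sketch of Swish (x times Sigmoid of x), written in the common self-gated form with beta = 1 by default:

```python
import numpy as np

def swish(x, beta=1.0):
    # x * sigmoid(beta * x): smooth and non-monotonic, approaching ReLU for large positive x.
    return x / (1.0 + np.exp(-beta * x))

print(swish(np.array([-3.0, 0.0, 3.0])))  # ~[-0.142, 0.0, 2.858]
```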

MaxOut:-

Also known as a learnable activation function.

It has all the advantages of the ReLU function without its main disadvantages (such as dying units), since each unit outputs the maximum over several learned linear functions.
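
An illustrative NumPy sketch of a Maxout unit, assuming k = 2 linear pieces; the shapes and random weights are my own example, not from the article:

```python
import numpy as np

def maxout(x, W, b):
    # Element-wise max over k learned affine maps; with k = 2 it can recover ReLU as a special case.
    # W has shape (k, in_dim, out_dim) and b has shape (k, out_dim).
    return np.max(x @ W + b, axis=0)

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))          # a single 4-dimensional input
W = rng.normal(size=(2, 4, 3))     # k = 2 pieces, mapping 4 inputs to 3 units
b = np.zeros((2, 3))
print(maxout(x, W, b).shape)       # (3,)
```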

SoftPlus:-

Advantages:

-> Its gradient is smoother than ReLU's (there is no kink at zero).

-> It can handle the Vanishing Gradient problem.

Disadvantage:

-> More computationally expensive than ReLU (it is exponential in nature).
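
An illustrative NumPy sketch of Softplus, ln(1 + e^x), written with logaddexp for numerical stability (an implementation detail, not from the article):

```python
import numpy as np

def softplus(x):
    # Smooth approximation of ReLU: log(1 + exp(x)), computed stably as log(exp(0) + exp(x)).
    return np.logaddexp(0.0, x)

print(softplus(np.array([-3.0, 0.0, 3.0])))  # ~[0.049, 0.693, 3.049]
```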


Thanks for going through this article. This is just a brief overview of the advantages and disadvantages of the ten most frequently used activation functions. Although there is a lot more to cover about each of them, I hope this was meaningful to all of you.

I will be sharing my knowledge every now and then based on my availability.

Peace :)




