Machine Learning & Activation Functions
Machine learning uses neural networks as the structural framework for its engine of learning and prediction. Each neuron in a neural network is built from weights, biases and an activation function, and together these form the core skeleton of the learning model.
The aim of ML and neural networks is to mimic the learning process of the human brain, so that machines can learn from training data the way the brain learns from experience and then predict results for new input data.
Weights and biases are the linear part of a neuron: they map the inputs to a weighted sum. But the real-world settings where ML programs are used, such as image classification, NLP (Natural Language Processing) and other practical problems, are more non-linear than linear. Just as the human brain filters out less useful information and keeps what matters before taking the next step or arriving at a better decision, a neuron uses its activation function to decide whether its input should be passed on for further processing or discarded at the first step.
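As a rough sketch of how the linear and non-linear parts fit together, the snippet below computes the output of a single neuron in Python with NumPy; the input values, weights and bias are made up purely for illustration.

import numpy as np

def sigmoid(z):
    # squash the pre-activation into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# hypothetical inputs and parameters for one neuron with three inputs
x = np.array([0.5, -1.2, 3.0])   # input features
w = np.array([0.8, 0.1, -0.4])   # weights (linear part)
b = 0.2                          # bias (linear part)

z = np.dot(w, x) + b             # linear combination: w . x + b
a = sigmoid(z)                   # non-linear activation decides what gets passed on
print(f"pre-activation z = {z:.3f}, activated output a = {a:.3f}")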
There are many activation functions currently used by practitioners to meet this need while training a network. A few common examples are listed below, followed by a short code sketch that compares them –
· Sigmoid
It is an activation function of the form f(x) = 1 / (1 + exp(-x)). Its range is between 0 and 1, and it is an S-shaped curve. It is easy to understand and apply, but two major problems have made it fall out of popularity –
o Vanishing gradient problem: for inputs far from zero the sigmoid saturates and its gradient is nearly zero, so the weights in earlier layers receive very small updates.
o Secondly, its output is not zero-centred: since 0 < output < 1, gradient updates tend to be pushed too far in one direction, which makes optimization harder.
· Tanh – Hyperbolic Tangent
o Its mathematical formula is f(x) = (1 - exp(-2x)) / (1 + exp(-2x)). Its output is zero-centred because its range is between -1 and 1, i.e. -1 < output < 1. Optimization is therefore easier, and in practice tanh is generally preferred over the sigmoid function. However, it still suffers from the vanishing gradient problem.
· ReLU – Rectified Linear Unit
o It has become very popular in the past couple of years, and it has been reported to converge about six times faster than tanh. It is simply R(x) = max(0, x), i.e. if x < 0 then R(x) = 0, and if x >= 0 then R(x) = x. As the formula shows, it is very simple and efficient to compute, and for positive inputs its gradient is constant, so it avoids the vanishing gradient problem. Almost all deep learning models use ReLU nowadays.
o Its limitation is that it should only be used within the hidden layers of a neural network model.
· Leaky ReLU
o Another problem with ReLU is that some gradients can be fragile during training and die: a weight update can push a neuron into a region where it never activates again on any data point. Put simply, ReLU can result in dead neurons.
o To fix this problem of dying neurons, a modification called Leaky ReLU was introduced. It gives negative inputs a small slope instead of a flat zero, which keeps the gradient updates alive.
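As a minimal sketch, the code below implements the four activation functions discussed above using NumPy; the leaky-ReLU slope of 0.01 is a common default, not a value fixed here.

import numpy as np

def sigmoid(x):
    # S-shaped curve, output in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # zero-centred, output in (-1, 1)
    return np.tanh(x)

def relu(x):
    # R(x) = max(0, x)
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # small slope alpha for negative inputs keeps gradients alive
    return np.where(x >= 0, x, alpha * x)

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu), ("leaky_relu", leaky_relu)]:
    print(name, fn(x))

Running it on the sample inputs makes the ranges easy to compare: sigmoid stays within (0, 1), tanh within (-1, 1), ReLU zeroes out the negative values, and Leaky ReLU keeps a small negative signal.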
The success of a machine learning program depends on how quickly and accurately it learns the hidden patterns in the training data. Activation functions play a major role in the back propagation of errors from the training set, and hence in how well the weights are updated. The activation functions listed above each serve specific purposes, but no single one fits every scenario of data classification, regression or NLP. One can also devise activation functions for a specific problem, depending on how well one understands the training data and the patterns behind the learning model.
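To make the role of the derivative concrete, here is a small sketch (again assuming NumPy, with an arbitrary pre-activation value chosen for illustration) of why back propagation behaves so differently with sigmoid and ReLU: the gradient flowing back through a layer is scaled by the activation's derivative, and the sigmoid's derivative is at most 0.25, so repeated multiplication shrinks the signal, while an active ReLU unit passes a derivative of 1.

import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)              # peaks at 0.25 when x = 0

def relu_grad(x):
    return np.where(x > 0, 1.0, 0.0)  # 1 where the unit is active, 0 where it is "dead"

x = 2.0        # arbitrary pre-activation value
layers = 10    # imagine the gradient passing back through 10 such activations
print("sigmoid:", sigmoid_grad(x) ** layers)  # shrinks towards zero
print("relu:   ", relu_grad(x) ** layers)     # stays at 1 while the unit is active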
Hence this knowledge area of understanding and devising activation functions, so that the neural network in an ML program learns efficiently and delivers predictions as close as possible to 100% accuracy, holds many opportunities and challenges for the ML discipline in the coming years.