Machine Learning & Activation Function

Machine learning uses neural networks as the structural framework for building its engine of learning and prediction. Each neuron in a neural network has weights, biases and an activation function as the core skeleton from which the learning model is built.

The whole discipline of ML and neural networks aims to mimic the learning process of the human brain, so that machines can learn from training data in much the same way the brain learns from experience, and then predict results for new input data.

Weights and biases are linear operations that map a neuron's inputs so that learning can take place. But the real-world problems where we apply ML programs, such as image classification, NLP (Natural Language Processing) and other real-life tasks, are far more non-linear than linear. Just as the human brain filters, segregates and discriminates useful information from less useful information before taking the next step or arriving at a better decision in a problem scenario, a neuron uses its activation function to decide whether its inputs should be passed on for further processing towards an output or discarded at the first step.
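
As a small illustrative sketch (the example inputs, weights and function names below are my own, not from the article), a single neuron combines the linear part (weights and bias) with a non-linear activation such as the sigmoid described further down:

```python
import numpy as np

def sigmoid(z):
    # Non-linear activation: squashes the linear result into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical inputs, weights and bias for a single neuron
x = np.array([0.5, -1.2, 3.0])   # inputs to the neuron
w = np.array([0.4, 0.1, -0.6])   # weights (linear part)
b = 0.2                          # bias (linear part)

z = np.dot(w, x) + b             # linear combination w.x + b
y = sigmoid(z)                   # activation decides how strongly the neuron "fires"
print(z, y)                      # -1.52 and roughly 0.18
```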

There are many activation functions currently used by practitioners to achieve this while training the network. A few examples are listed below, followed by a short code sketch of all four after the list –

·        Sigmoid

It is an activation function of the form f(x) = 1 / (1 + exp(-x)). Its range is between 0 and 1, and its graph is an S-shaped curve. It is easy to understand and apply, but two major drawbacks have made it fall out of popularity –

o  It suffers from the vanishing gradient problem: for large positive or negative inputs the curve saturates, so the gradient becomes almost zero and weight updates in earlier layers nearly stop.

o  Secondly, its output isn't zero-centered (0 < output < 1). This makes the gradient updates go too far in different directions and makes optimization harder.

·        Tanh – Hyperbolic Tangent

o  Its mathematical formula is f(x) = (1 - exp(-2x)) / (1 + exp(-2x)). Its output is zero-centered because its range is between -1 and 1, i.e. -1 < output < 1. Hence optimization is easier with this function, and in practice it is generally preferred over the Sigmoid function. But it still suffers from the vanishing gradient problem.

·        ReLU – Rectified Linear Unit

o  It has become very popular in the past couple of years. It has been reported to give around a 6-times improvement in convergence over the Tanh function. It is simply R(x) = max(0, x), i.e. if x < 0 then R(x) = 0, and if x >= 0 then R(x) = x. From its mathematical form we can see that it is very simple and efficient, and it helps avoid the vanishing gradient problem because the gradient for positive inputs does not saturate. Almost all deep learning models use ReLU nowadays.

o  But its limitation is that it should only be used within the hidden layers of a neural network model.

·        Leaky ReLU

o  Another problem with ReLU is that some gradients can be fragile during training and can die: a large weight update can push a neuron into a region where it never activates on any data point again. Simply put, ReLU can result in dead neurons.

o  To fix this problem of dying neurons, a modification called Leaky ReLU was introduced. It gives negative inputs a small slope instead of zero, which keeps the gradient and the weight updates alive.
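
The four functions above can be written down directly. Here is a minimal sketch in plain Python/NumPy (the helper names and example values are mine, not from the article):

```python
import numpy as np

def sigmoid(x):
    # f(x) = 1 / (1 + exp(-x)): S-shaped, output in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # f(x) = (1 - exp(-2x)) / (1 + exp(-2x)): zero-centered, output in (-1, 1)
    return (1.0 - np.exp(-2.0 * x)) / (1.0 + np.exp(-2.0 * x))

def relu(x):
    # R(x) = max(0, x): zero for negative inputs, identity for positive ones
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Like ReLU, but negative inputs keep a small slope alpha so the
    # gradient does not die completely
    return np.where(x >= 0.0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x))      # all values between 0 and 1
print(tanh(x))         # all values between -1 and 1
print(relu(x))         # negatives clipped to 0
print(leaky_relu(x))   # negatives scaled by 0.01
```

In practice NumPy's built-in np.tanh would be used directly; the explicit form above simply mirrors the formula in the text.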

The success of a machine learning program depends on how fast it can learn the hidden patterns in the training data appropriately and efficiently. Activation functions play a major role in back-propagating the errors measured against the training data set and hence in updating the weights optimally. The activation functions listed above each serve specific learning purposes, but no single one fits every scenario of classification, regression or NLP. One can also devise activation functions for a specific ML problem, depending on how well one understands the training data and the patterns behind the learning model.
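
To make the back-propagation point concrete, here is a small comparison sketch (my own illustration, not from the article) of the gradient each activation passes back during the weight update:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: s(x) * (1 - s(x)); it is at most 0.25 and
    # nearly 0 for large |x|, which is the vanishing gradient problem
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise, so active
    # neurons pass the error signal back without shrinking it
    return (x > 0).astype(float)

x = np.array([-5.0, -1.0, 0.5, 5.0])
print(sigmoid_grad(x))   # about 0.007, 0.20, 0.24, 0.007: shrinks quickly
print(relu_grad(x))      # 0, 0, 1, 1: either passes the gradient or blocks it
```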

Hence this knowledge area of understanding and devising activation functions, so that the neural network in an ML program learns efficiently and delivers predictions close to 100% accuracy, holds many opportunities and challenges for the ML discipline in the coming years.
