Activation Functions in Machine Learning and Neural Networks
Juan David Tuta Botero
Data Science | Machine Learning | Artificial Intelligence
It’s been a while since my last article. Today we are going to talk about activation functions, one of the pillars of machine learning and of how we build neural networks and get them to work properly. This topic tends to be one of the first barriers for those starting their adventure in the world of Artificial Intelligence, which is why it is worth revisiting the most basic concepts.
Remember the structure of a basic neuron: we have an input, represented by X; a weight, represented by W; and finally a bias, represented by the letter b, as we can see in the next picture.
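In code, a neuron of this kind is just a weighted sum plus a bias. Here is a minimal sketch in Python with NumPy (the specific numbers are made up for illustration):

```python
import numpy as np

x = np.array([0.5, -1.2, 3.0])   # input vector X
w = np.array([0.8, 0.1, -0.4])   # weights W
b = 0.25                         # bias b

z = np.dot(w, x) + b             # the neuron's raw (linear) output
print(z)                         # -0.67
```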
This is the usual way to represent a neuron, and it is very helpful when we are trying to solve binary classification problems, that is, problems with only two possible outcomes, true or false, or mathematically speaking, positive or negative. It works very well in systems similar to this one, where the green zone represents success and the red zone represents failure:
But what happens when the system becomes more complicated? Suppose that instead of just one red zone we have two. In this case the solution is quite intuitive: add a new neuron to the network, as sketched below.
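To make that concrete, here is a small sketch with made-up weights and boundaries: each perceptron-style neuron draws one linear boundary, and combining the two neurons carves out the green zone that sits between the two red regions:

```python
def fires(x, w, b):
    """Perceptron-style neuron: outputs 1.0 when w*x + b > 0, else 0.0."""
    return float(w * x + b > 0)

# Two neurons, each drawing one boundary (values are illustrative):
# neuron 1 fires for x > 1, neuron 2 fires for x < 4.
def in_green_zone(x):
    h1 = fires(x, w=1.0, b=-1.0)    # boundary at x = 1
    h2 = fires(x, w=-1.0, b=4.0)    # boundary at x = 4
    return h1 == 1.0 and h2 == 1.0  # green only between the boundaries

for x in (0.0, 2.5, 5.0):
    print(x, in_green_zone(x))      # False, True, False
```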
As the problems we want to solve grow more complex, the plain neuron starts to show its limitations. As you may have noticed, these neurons all compute linear functions, and if we remember our calculus course, the sum of linear functions is itself a linear function, so stacking them gains us nothing. This is where activation functions shine: we take the output of our neuron, which from now on we will represent by the letter Z, and pass it through an activation function whose output we will represent by the letter A. Here we are going to use the sigmoid activation function.
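The sigmoid squashes Z into the interval (0, 1), so the activation A can be read as a probability. A minimal sketch of the function itself:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: a = 1 / (1 + e^(-z)), output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])   # pre-activation outputs Z
a = sigmoid(z)                   # activations A
print(a)                         # [0.119  0.5  0.881] (rounded)
```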
There are many activation functions, and each has its specific uses. But how exactly do these functions work? Rather than only bending a decision boundary in a 2D diagram, they shape surfaces in 3D as well, and by combining, shifting, and mixing different ones we can build a wide variety of forms, flexible enough to fit very general problems, as presented below.
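To make the combine-and-shift idea concrete: the difference of two shifted sigmoids produces a localized bump, and sums of such bumps can approximate very general shapes (this is the intuition behind the universal approximation theorem). A small sketch, with made-up shift and steepness values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bump(x, left, right, steepness=10.0):
    """Difference of two shifted sigmoids: near 1 inside [left, right], near 0 outside."""
    return sigmoid(steepness * (x - left)) - sigmoid(steepness * (x - right))

x = np.linspace(-1.0, 4.0, 6)
print(bump(x, left=1.0, right=2.0))  # rises toward 1 only inside [1, 2]
```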
In the next example, we look at a visualization problem where we want to separate different points arranged in a circular formation. This kind of pattern generalizes to biological behavior in animals or even in cells, where similar separations help analyze and treat diseases such as cancer.
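As a stand-in for that experiment (the article doesn't name a library, so this sketch uses scikit-learn purely for illustration): generate two concentric rings of points and compare a linear model against a tiny network using a nonlinear activation. The linear model cannot separate the rings, while a few hidden units with tanh can:

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# Two concentric rings of points: inner ring = one class, outer = the other.
X, y = make_circles(n_samples=500, noise=0.05, factor=0.5, random_state=0)

linear = LogisticRegression().fit(X, y)            # linear boundary only
mlp = MLPClassifier(hidden_layer_sizes=(8,), activation='tanh',
                    max_iter=2000, random_state=0).fit(X, y)

print(linear.score(X, y))  # close to 0.5 -- a straight line can't separate rings
print(mlp.score(X, y))     # close to 1.0 -- nonlinear activations can
```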
Activation functions
The following table compares the properties of several common activation functions, each taking a single input x from the previous layer:
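| Name | Formula f(x) | Range |
| --- | --- | --- |
| Identity | x | (-∞, ∞) |
| Sigmoid (logistic) | 1 / (1 + e^(-x)) | (0, 1) |
| Tanh | (e^x - e^(-x)) / (e^x + e^(-x)) | (-1, 1) |
| ReLU | max(0, x) | [0, ∞) |
| Leaky ReLU | x if x > 0, else 0.01x | (-∞, ∞) |
| Softplus | ln(1 + e^x) | (0, ∞) |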
Bibliography
https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6
https://en.wikipedia.org/wiki/Activation_function