Activation Functions in Neural Networks

When someone decides to read more about how artificial intelligence works, the phrase "activation functions" comes up again and again.

In simple terms, activation functions are mathematical functions that determine the output of a neuron. The activation function is applied to the weighted sum of the inputs plus the bias, and the resulting output is passed on to the next layer of the neural network.
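As a rough illustration, here is a minimal NumPy sketch of a single neuron; the input values, weights, bias, and the choice of sigmoid as the activation are placeholders, not part of any specific network:

import numpy as np

# Illustrative values only: inputs, weights, and bias are made up.
x = np.array([0.5, -1.2, 3.0])   # inputs from the previous layer
w = np.array([0.8, 0.1, -0.4])   # weights of this neuron
b = 0.2                          # bias

z = np.dot(w, x) + b             # weighted sum of inputs plus bias
output = 1 / (1 + np.exp(-z))    # activation function applied to z (sigmoid here)
print(output)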


We use activation functions to introduce non-linearity into the output of a neuron, which means the neural network can learn complex functions and relationships between the input and the output.

There are different types of activation functions, and each has its own strengths and weaknesses. In this article I will compare and contrast some of the most commonly used activation functions, including binary, linear, sigmoid, tanh, ReLU, and softmax.


1-Binary Activation Function:

This is a simple function that maps every input to either 0 or 1, depending on whether it is above or below a certain threshold. The binary (step) function is rarely used in modern ANNs: it can only produce two output values, and its gradient is zero almost everywhere, so it cannot be trained with gradient-based methods.
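As a quick sketch (assuming a threshold of 0, which is just a common default), the step function can be written as:

import numpy as np

def binary_step(x, threshold=0.0):
    # 1 where the input is above the threshold, 0 otherwise
    return np.where(x > threshold, 1, 0)

print(binary_step(np.array([-2.0, 0.5, 3.0])))  # [0 1 1]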


2-Linear Activation Function:

The linear function simply returns the input value as its output, without any transformation. This function is used in regression problems, where the output is a continuous value rather than a binary or categorical one. Note that stacking layers with only linear activations still produces a linear mapping, so the linear activation is typically used in the output layer rather than in hidden layers.
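In code the linear (identity) activation is trivial; this is just a sketch:

def linear(x):
    # Identity: the output is the input, unchanged
    return x

print(linear(2.7))   # 2.7
print(linear(-1.3))  # -1.3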


3-Sigmoid Activation Function:

The sigmoid function is a common choice for binary classification problems, where the output is interpreted as the probability of belonging to one of two classes. It maps the input to a value between 0 and 1 using the following formula:

f(x) = 1 / (1 + exp(-x))

The sigmoid function gives an S-shaped curve, which allows it to introduce non-linearity into the output of the neural network. However, it suffers from the problem of vanishing gradients: the gradient becomes very small when the output saturates near 0 or 1 (that is, for large positive or negative inputs), which can slow down the learning process.
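A minimal sketch of the sigmoid and its derivative, using made-up input values to show how the gradient shrinks for large inputs:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: s * (1 - s)
    s = sigmoid(x)
    return s * (1 - s)

for x in [0.0, 2.0, 5.0, 10.0]:
    print(x, sigmoid(x), sigmoid_grad(x))
# The gradient falls from 0.25 at x = 0 to roughly 4.5e-05 at x = 10,
# which is the vanishing-gradient effect described above.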


4-Tanh Activation Function:

The tanh function is similar to the sigmoid function, but it maps the input to a value between -1 and 1, using the following formula:

f(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))

Like the sigmoid function, the tanh function introduces non-linearity into the output of the neural network, but it has a stronger gradient, which can help speed up the learning process. However, it suffers from the same problem of vanishing gradients as the sigmoid function.
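For comparison, a small sketch of tanh and its derivative (1 - tanh(x)^2), again with arbitrary inputs:

import numpy as np

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2
    return 1 - np.tanh(x) ** 2

for x in [0.0, 2.0, 5.0]:
    print(x, np.tanh(x), tanh_grad(x))
# At x = 0 the gradient is 1.0 (versus 0.25 for the sigmoid),
# but it still vanishes for large |x|.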



5-ReLU Activation Function:

The ReLU (Rectified Linear Unit) activation function is currently the most popular choice for deep learning problems, and is used in most modern neural networks.

Formula:

f(x) = max(0, x)

It returns the input value if it is greater than 0, and returns 0 otherwise. The ReLU function has a simple and efficient implementation and introduces non-linearity into the output of the neural network. It is popular because it allows faster training of deep neural networks compared to other activation functions, but it can suffer from the "dying ReLU" problem, where a large number of neurons become inactive and stop contributing to the network's output.
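A minimal NumPy sketch of ReLU and its gradient; the zero gradient for negative inputs is what makes the "dying ReLU" problem possible:

import numpy as np

def relu(x):
    return np.maximum(0, x)

def relu_grad(x):
    # 1 for positive inputs, 0 otherwise; a neuron whose inputs stay
    # negative receives no gradient and stops learning ("dying ReLU")
    return np.where(x > 0, 1.0, 0.0)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(relu(x))       # [0. 0. 0. 2.]
print(relu_grad(x))  # [0. 0. 0. 1.]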


6-Softmax Activation Function:

The softmax activation function is commonly used in the output layer of neural networks for multi-class classification problems.

Formula:

f(x_i) = exp(x_i) / sum_j exp(x_j)

It transforms the output of each neuron into a probability distribution over all possible classes. The function ensures that the sum of the probabilities of all classes is equal to 1, making it useful for determining the most likely class. Softmax is often used with the cross-entropy loss function to train neural networks for classification tasks.
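A small sketch of softmax over some made-up logits; subtracting the maximum before exponentiating is a standard numerical-stability trick and does not change the result:

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # shift for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # example raw outputs of the last layer
probs = softmax(logits)
print(probs)        # roughly [0.659, 0.242, 0.099]
print(probs.sum())  # 1.0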
