20 Deep Learning Terminologies You Must Know
Malini Shukla
Deep Learning Terminologies
a. Recurrent Neuron
A recurrent neuron is one whose output is sent back to the neuron as input for t timesteps. If you draw these steps side by side, the result looks like an unrolled neuron: t copies of the same neuron connected in a chain. Because the neuron sees its own previous outputs, it produces a more generalized output that takes earlier context into account.
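To make this concrete, here is a minimal sketch in NumPy (the weights and the input value are made up purely for illustration, not trained) of a single recurrent neuron whose output is fed back in for t timesteps:

```python
import numpy as np

def recurrent_neuron(x, t, w_in=0.5, w_rec=0.8, b=0.0):
    """Unroll one recurrent neuron for t timesteps.

    The neuron's previous output is fed back in as an extra input
    at the next step (toy, untrained weights for illustration).
    """
    out = 0.0
    outputs = []
    for _ in range(t):
        out = np.tanh(w_in * x + w_rec * out + b)   # previous output reused
        outputs.append(out)
    return outputs

print(recurrent_neuron(x=1.0, t=5))
```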
b. RNN (Recurrent Neural Network)
A recurrent neural network (RNN) is used mainly for sequential data, because it uses the previous outputs to help predict the next one. The network contains loops: a hidden neuron with a loop can store information, for example the previous words in a sentence, and use it when predicting the next output.
The output of the hidden layer is fed back into it for t timesteps, which is why the unfolded network looks like a chain of copies of the same neuron. Once the neuron has run through all the timesteps, its output is passed on to the next layer. The result is a more generalized output, and information seen earlier in the sequence is retained for a long time.
To update the weights of this unfolded network, the error has to be propagated backwards through every timestep. This is called backpropagation through time (BPTT).
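Here is a minimal sketch, assuming a vanilla RNN cell with a tanh activation and randomly initialized weights (the sizes and names are illustrative, not from the article), of how a hidden state is carried across a sequence:

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 4, 3, 5

# Randomly initialized weights (in practice they would be learned with BPTT)
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

x_seq = rng.normal(size=(seq_len, input_size))   # toy input sequence
h = np.zeros(hidden_size)                        # hidden state starts at zero

for x_t in x_seq:
    # The same weights are reused at every timestep; unrolling this loop
    # gives the "unfolded" network that BPTT propagates errors through.
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)

print("final hidden state:", h)
```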
c. Vanishing Gradient Problem
This problem arises when the gradient of the activation function is very small. During backpropagation, these small gradients are multiplied together with the weights layer after layer, so they shrink towards zero the deeper they travel into the network. As a result, the network effectively forgets long-range dependencies, even though remembering them is exactly what we need it to do.
A common fix is to use activation functions such as ReLU, whose gradient does not become vanishingly small.
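As a rough illustration (the layer count and the weight value are made up), here is what repeatedly multiplying by a sigmoid gradient does compared with a ReLU gradient:

```python
import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)          # never larger than 0.25

def relu_grad(x):
    return 1.0 if x > 0 else 0.0  # 1 for any positive input

layers = 20
sig_signal, relu_signal = 1.0, 1.0
for _ in range(layers):
    sig_signal *= sigmoid_grad(0.0) * 0.9   # small gradient times a weight
    relu_signal *= relu_grad(1.0) * 0.9     # gradient of 1 times a weight

print("sigmoid path after 20 layers:", sig_signal)   # ~1e-13, vanished
print("ReLU path after 20 layers:   ", relu_signal)  # ~0.12, still usable
```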
d. Exploding Gradient Problem
This is the opposite of the vanishing gradient problem: here the gradients become too large rather than too small, which makes the weight updates for particular nodes blow up. A common remedy is gradient clipping, which caps the gradient so that it never exceeds a certain value.
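A minimal sketch of norm-based gradient clipping in NumPy (the threshold of 5.0 is just an example value):

```python
import numpy as np

def clip_gradient(grad, max_norm=5.0):
    """Rescale the gradient so its norm never exceeds max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([30.0, -40.0])   # an "exploding" gradient with norm 50
print(clip_gradient(g))       # rescaled to norm 5 -> [ 3. -4.]
```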
e. Pooling
Pooling layers are usually introduced in between convolution layers. They are used to reduce the number of parameters and to help prevent over-fitting. The most common type is a max-pooling layer with a (2, 2) filter: it slides over the feature map and keeps only the maximum value of each 2x2 block of the original image.
Other kinds of pooling, such as average pooling, can be used as well.
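Here is a minimal NumPy sketch of 2x2 max pooling with stride 2 (the 4x4 input matrix is made up for illustration):

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the maximum of every non-overlapping 2x2 block."""
    h, w = feature_map.shape
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 1],
                 [3, 4, 5, 6]])
print(max_pool_2x2(fmap))
# [[6 4]
#  [7 9]]
```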
f. Padding
Padding means adding an extra border of zeros around the image, so that after convolution the output image has the same size as the input; this is usually called same (zero) padding. If no extra pixels are added and the convolution only uses the actual, valid pixels of the image, it is called valid padding.
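A minimal NumPy sketch of zero padding (the 3x3 image and the one-pixel border are just examples):

```python
import numpy as np

image = np.arange(1, 10).reshape(3, 3)   # toy 3x3 "image"

# "Same"-style zero padding: one extra border of zeros on every side,
# so a 3x3 convolution would keep the output at 3x3.
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

print(padded)
# [[0 0 0 0 0]
#  [0 1 2 3 0]
#  [0 4 5 6 0]
#  [0 7 8 9 0]
#  [0 0 0 0 0]]
```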
g. Data Augmentation
It refers to creating new training data derived from the data we already have, which can prove beneficial for prediction.
For example:
Suppose we have an image of the digit "9". It remains a 9 even if it is rotated or tilted a little, so adding such rotated copies to the training set increases the amount and variety of the data and helps improve the accuracy of our model. This is what we call data augmentation.
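As a minimal illustration (the "image" is a made-up NumPy array, and np.rot90/np.fliplr are only stand-ins; a real pipeline would use an image library for small-angle rotations):

```python
import numpy as np

digit = np.array([[0, 1, 0],
                  [0, 1, 1],
                  [0, 0, 1]])   # toy 3x3 "image"

# Each transformed copy is added to the training data alongside the original.
augmented = [digit,
             np.rot90(digit),   # rotated copy
             np.fliplr(digit)]  # mirrored copy

for img in augmented:
    print(img, end="\n\n")
```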
h. Softmax
The softmax activation function is used in the output layer for classification problems. It is similar to the sigmoid function, with the difference that its outputs are normalized so that they sum up to 1.
The sigmoid function works when we have a binary output, but for a multiclass classification problem softmax makes it easy to assign a value to each class, and those values can be interpreted as probabilities.
An easy way to see it: suppose you are trying to identify a 6 that also looks a bit like an 8. Softmax assigns a probability to each digit, with the highest probability going to 6, the next highest to 8, and so on.
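A minimal NumPy sketch of softmax (the raw scores for each digit are made-up numbers, chosen so that 6 gets the highest probability and 8 the next highest):

```python
import numpy as np

def softmax(scores):
    """Exponentiate and normalize so the outputs sum to 1."""
    exps = np.exp(scores - np.max(scores))   # subtract max for stability
    return exps / exps.sum()

digits = np.arange(10)   # classes 0..9
scores = np.array([0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 4.0, 0.2, 2.5, 0.3])

probs = softmax(scores)
for d, p in zip(digits, probs):
    print(f"digit {d}: {p:.3f}")
print("sum of probabilities:", probs.sum())   # 1.0
```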
i. Neural Network
Neural networks form the backbone of deep learning. The goal of a neural network is to approximate an unknown function. It is built from interconnected neurons, each with weights and a bias that are updated during training depending on the error. Each neuron applies an activation function, a nonlinear transformation of the linear combination of its inputs, and the combination of these activated neurons produces the network's output.
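A minimal sketch, assuming a single layer with made-up weights and biases, of the "nonlinear transformation of a linear combination" described above:

```python
import numpy as np

def layer_forward(x, W, b):
    """One layer of a neural network: nonlinearity on a linear combination."""
    z = W @ x + b          # linear combination of inputs, plus bias
    return np.tanh(z)      # nonlinear activation

x = np.array([0.5, -1.0, 2.0])        # toy input
W = np.array([[0.1, 0.2, -0.3],
              [0.4, -0.5, 0.6]])      # illustrative weights
b = np.array([0.01, -0.02])           # illustrative biases

print(layer_forward(x, W, b))
```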
j. Input layer/ Output layer / Hidden layer
The input layer is the first layer of the network and is the one that receives the input. The output layer is the final layer of the network and produces the prediction. The layers in between are the hidden layers: they process the incoming data and pass the generated output on to the next layer. The input and output layers are visible to us, while the intermediate layers are hidden, which is how they get their name.
k. MLP (Multi-Layer perceptron)
A single neuron cannot perform highly complex tasks, so we stack neurons into layers to generate the desired outputs. The simplest such network has an input layer, a hidden layer, and an output layer, each with multiple neurons, and every neuron in one layer is connected to every neuron in the next layer. Networks like this are called fully connected networks.
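A minimal NumPy sketch of a fully connected multi-layer perceptron with one hidden layer (the layer sizes and random weights are illustrative, not trained):

```python
import numpy as np

rng = np.random.default_rng(1)

# Layer sizes: 3 inputs -> 4 hidden neurons -> 2 outputs (fully connected)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

def mlp_forward(x):
    hidden = np.tanh(W1 @ x + b1)    # input layer -> hidden layer
    output = W2 @ hidden + b2        # hidden layer -> output layer
    return output

print(mlp_forward(np.array([0.2, -0.4, 1.0])))
```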
l. Neuron
Just as neurons form the basic elements of the brain, they also form the basic building blocks of a neural network. In the brain, when new information arrives, a neuron processes it and generates an output.
A neural network works the same way: as soon as a neuron receives an input, it processes it, generates an output, and sends that output on to other neurons for further processing, or else the output is treated as the final result.
m. Weights
As soon as an input enters the neuron, it is multiplied by a weight.
For example:
If a neuron has two inputs, each input gets its own associated weight. The weights are initialized randomly and then updated during the model training process; after training, the model ends up assigning a higher weight to the input it considers more important.
Let's assume the input is a and its associated weight is W1. After passing through the node, the input becomes a*W1.
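A tiny sketch of this (the input values and the random initialization are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

a, b = 2.0, 3.0               # two inputs to the neuron
W1, W2 = rng.normal(size=2)   # their weights, initialized randomly

# Each input is multiplied by its associated weight before being combined.
weighted_sum = a * W1 + b * W2
print("a*W1 =", a * W1)
print("a*W1 + b*W2 =", weighted_sum)
```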