A Friendly Guide to Neural Networks!

Deep Learning has grown rapidly over the last 10 years, because it solves more and more complex problems in the real world. In simple words, Deep Learning works like a human brain: it can learn about music, images, patterns, and more, and so it can solve complex real-world problems with relative ease.

How does it learn? Internally it uses Neural Networks (NNs). Neural networks help us learn from complex data, and even a simple network can solve many different kinds of problems.

In this post, we will explore the ins and outs of a simple neural network. By the end, hopefully, you (and I) will have gained a deeper and more intuitive understanding of how neural networks do what they do.



We will learn what a neural network is and how it learns. Before that, we will refresh some of the math behind neural networks.

Basics of Neural Networks:

Vector

  • Scalar: a quantity with only magnitude, such as a distance or a volume of water.
  • Vector: a quantity that also has magnitude, but with a direction as well.
  • Vectors help us interpret higher dimensions easily.
  • [1, 2] - a two-dimensional vector
  • [1, 2, 3] - a three-dimensional vector
  • [1, ..., n] - an n-dimensional vector

Feature Vector - an n-dimensional vector that contains the information (features) describing an example and its label.


Dot Product:

  • The dot product multiplies two vectors element by element and sums the results.
  • Both vectors need to be the same size.
  • The output of the dot product is always a single value (a scalar), as in the sketch below.
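
A minimal NumPy sketch with made-up values, showing how the dot product collapses two same-sized vectors into one number:

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Element-wise multiply, then sum: 1*4 + 2*5 + 3*6 = 32
dot = np.dot(a, b)
print(dot)  # 32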


Matrix Multiplication

  • Similar to the dot product, but here we multiply matrices (a matrix is a rectangular array of numbers arranged in rows and columns).
  • Condition: the number of columns in the first matrix must equal the number of rows in the second matrix.



  • It works like a series of dot products: each entry of the result is the dot product of a row of the first matrix with a column of the second, as in the sketch below.
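
A minimal NumPy sketch (with made-up numbers) showing the shape condition and the row-by-column dot products:

import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])         # shape (3, 2)

# Valid because A has 3 columns and B has 3 rows.
C = A @ B                        # result has shape (2, 2)
print(C)
# [[ 58  64]
#  [139 154]]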


Exponent

  • An exponent is repeated multiplication: if you want to multiply 2 by itself four times, you write 2^4, where 4 is the exponent.
  • Any non-zero number raised to the power 0 equals 1, e.g. 4^0 = 1.
  • As the exponent grows, the value grows towards infinity!


Graph: the exponential curve rises faster and faster as x increases.

Logarithm

  • The logarithm is the inverse of the exponent.
  • It is a small change in notation: the log lets us write the exponent's output back in terms of its input. Confusing? See the example below.


  • The log is a non-linear function that helps optimize deep learning models.
  • It is a monotonic function: whenever x goes up, log(x) also goes up.

  1. The exponent grows towards infinity.
  2. The log is the inverse of the exponent: log(exp(x)) = x, as the sketch below shows.
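
A tiny sketch of that inverse relationship, using Python's math module:

import math

x = 4
y = 2 ** x              # exponent: 2 to the power 4 = 16
print(math.log(y, 2))   # logarithm base 2 gives back 4.0

# The natural exp/log pair behaves the same way:
print(math.log(math.exp(3.0)))  # 3.0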


Linear:

  • Linear simply means addition and multiplication. If you are familiar with Linear Regression, it is just a multiply-and-add expression: y = mx + c.
  • Geometrically, linear refers to a straight line.
  • It can solve linear problems, like separating data with a line or fitting a linear regression.
  • Important: never use linear equations alone to solve non-linear problems.

Non-Linear:

  • Everything other than a linear equation is considered a non-linear operation.
  • A non-linear function is not a straight line. Look at this equation: 3x² + 2x + 1 = 0.
  • Non-linear functions can solve more complex problems; they add non-linearity to our model.
  • Some examples are sigmoid and tanh. If you don't know about these, see the Activation Functions article linked at the end.
  • Important: never use non-linear equations to solve linear problems. A quick comparison follows this list.
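
A minimal sketch (with hypothetical values) showing how a linear function and the non-linear sigmoid behave differently:

import numpy as np

def linear(x, m=2.0, c=1.0):
    # Straight line: just multiply and add
    return m * x + c

def sigmoid(x):
    # Non-linear: squashes any input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
print(linear(x))   # [-3.  1.  5.]            -> grows without bound
print(sigmoid(x))  # approx [0.119 0.5 0.881] -> bounded, curved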


Now you are ready to understand a simple neural network. Let's break it down!

Let’s start with a really high-level overview so we know what we are working with. Neural networks are multi-layer networks of neurons (the blue and magenta nodes in the chart below) that we use to classify things, make predictions, etc. Below is the diagram of a simple neural network with five inputs, five outputs, and two hidden layers of neurons.


  1. The input layer of our model is orange.
  2. Our first hidden layer of neurons is blue.
  3. Our second hidden layer of neurons is magenta.
  4. The output layer (a.k.a. the prediction) of our model is green.

The arrows that connect the dots show how all the neurons are interconnected and how data travels from the input layer all the way through to the output layer.

The lines connect the neurons to each other: every node in one layer is connected to every node in the next layer, and each node passes its output to all of those neighboring nodes.

In fact, within a layer the neurons are independent of each other: they have no connections between themselves. Each node only receives input from the nodes of the previous layer and does nothing with its neighbors in the same layer.


Each node works individually: it just takes the inputs it receives from the previous layer. This is important, because each node learns on its own from those inputs.

Let me break down the entire architecture.

I will take a single node, and we will learn what each node does and how it learns.


A single node is referred to as a perceptron. The first image shows a human neuron. A human neuron contains dendrites, a nucleus, and an axon, and these help it learn new things. In the same way, each perceptron learns through its weights, bias, and loss function. Don't worry, we will look at each of these clearly!

A single perceptron contains weights, a bias, a summation, and a non-linear function.


σ (sigma) - activation function, b - bias

X - input, W - weights


A single perceptron performs two operations: a summation, and adding non-linearity (sigma, the activation function) to the result.

The output of this node is sent to the next node, and the same process continues until the last node, as in the sketch below.
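
Here is a minimal sketch of a single perceptron, assuming a sigmoid activation and made-up weights:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(x, w, b):
    # Step 1: summation - weighted sum of inputs plus bias
    z = np.dot(w, x) + b
    # Step 2: non-linearity - squash the sum with the activation function
    return sigmoid(z)

x = np.array([0.5, -1.0, 2.0])   # inputs
w = np.array([0.1, 0.4, -0.2])   # weights (random in practice)
b = 0.0                          # bias starts at 0
print(perceptron(x, w, b))       # a single activation value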


I hope you understand this: this is forward propagation, where values flow from the input to the output. Backpropagation is the opposite, where values flow from the output back to the input, and it runs over many loops.

During forward propagation, the weights are initialized randomly for each node, the bias is 0 for each node, each node performs its operations, and the last layer gives the output.

Let’s Add a Bit of Complexity Now

Now that we have our basic framework, let’s go back to our slightly more complicated neural network and see how it goes from input to output. Here it is again for reference:


The first hidden layer comprises two neurons. So, connecting all five inputs to the neurons in Hidden Layer 1, we need ten connections. The next image (below) shows just the connections between Input 1 and Hidden Layer 1.



Note our notation for the weights that live in the connections: W1,1 denotes the weight on the connection between Input 1 and Neuron 1, and W1,2 denotes the weight on the connection between Input 1 and Neuron 2. So the general notation I will follow is that Wa,b denotes the weight on the connection between Input a (or Neuron a) and Neuron b.

Now let’s calculate the outputs of each neuron in Hidden Layer 1 (known as the activations). We do so using the following formulas (W denotes weight, In denotes input).


Z1 = W1,1*In1 + W2,1*In2 + W3,1*In3 + W4,1*In4 + W5,1*In5 + Bias_Neuron1
Neuron 1 Activation = Sigmoid(Z1)

We can use matrix math to summarize this calculation (remember our notation rules — for example, W4,2 denotes the weight that lives in the connection between Input 4 and Neuron 2):


For any layer of a neural network where the prior layer is m elements deep and the current layer is n elements deep, this generalizes to:

[W] @ [X] + [Bias] = [Z]


Where [W] is your n by m matrix of weights (the connections between the prior layer and the current layer), [X] is your m by 1 matrix of either starting inputs or activations from the prior layer, [Bias] is your n by 1 matrix of neuron biases, and [Z] is your n by 1 matrix of intermediate outputs. In the previous equation, I follow Python notation and use @ to denote matrix multiplication. Once we have [Z], we can apply the activation function (sigmoid in our case) to each element of [Z], and that gives us our neuron outputs (activations) for the current layer.
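
A minimal NumPy sketch of one layer's forward pass under these shapes (the sizes and random values here are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

m, n = 5, 2                      # prior layer is 5 elements deep, current layer is 2
X = np.random.randn(m, 1)        # m by 1 inputs (or prior-layer activations)
W = np.random.randn(n, m)        # n by m weights
Bias = np.zeros((n, 1))          # n by 1 biases

Z = W @ X + Bias                 # n by 1 intermediate outputs
A = sigmoid(Z)                   # n by 1 activations for the current layer
print(A.shape)                   # (2, 1)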

Finally before we move on, let’s visually map each of these elements back onto our neural network chart to tie it all up ([Bias] is embedded in the blue neurons).

By repeatedly calculating [Z] and applying the activation function to it for each successive layer, we can move from input to output. This process is known as forward propagation. Now that we know how the outputs are calculated, it’s time to evaluate the quality of the outputs and train our neural network.

Back Propagation:

Neural networks learn during backpropagation using optimizers. Optimizers help to reduce the cost function, which measures the gap between the actual value and the predicted output. If that difference is very high, the optimizer works to bring the cost value down; a small example of a cost function follows.
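
For intuition, here is a minimal sketch of one common cost function, mean squared error (this particular choice is an assumption; the point is simply that the cost compares actual and predicted values):

import numpy as np

def mse_cost(y_actual, y_predicted):
    # Average of the squared differences between actual and predicted values
    return np.mean((y_actual - y_predicted) ** 2)

y_actual = np.array([1.0, 0.0, 1.0])
y_predicted = np.array([0.8, 0.2, 0.6])
print(mse_cost(y_actual, y_predicted))  # 0.08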



Optimizers are a very vast topic, but here we will just build some intuition for them. Understand that an optimizer helps to reduce the cost value, and it also makes learning faster or slower based on that cost value.

Actually, backpropagation is essentially gradient descent, but with many nested functions involved (a multidimensional space), so we use the chain rule to update the weights.


Attention! Read carefully!

Step 1: Forward pass (compute the output and find the cost value)



The formula above calculates the forward pass.

Here, X is constant and W is a learnable parameter.

Step 2: Backward pass (update the weights using the chain rule)


The weight update formula.


Eta (η) is the learning rate, and dL is the derivative of the loss.
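
Written out in the same style as the earlier formulas, the standard gradient descent update rule this describes is:

W_new = W_old - η * (∂L/∂W)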

Let's see how to calculate the derivative of the loss (cost) value.


If you look carefully, it is a nested (composite) function.

Partial derivatives

∂W - the weights, ∂L - the loss or cost function (based on the gap between the actual y and the predicted ŷ)




This is the formula for the derivative of the loss.

After calculating the derivatives, the weight update occurs for every neuron in the network, and then the forward and backward passes repeat for however many iterations you have set.

Small reminder:

  • During the forward pass, the weights are randomly initialized with a bias of 0, and the summation and non-linear function do their work, producing the output.
  • We then compute the cost value. If the cost is too high, we backpropagate through the network, which means updating the weights based on the cost value using optimizers. After updating the weights, we run the forward and backward passes again until the cost value becomes very low, as in the training-loop sketch below.
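
Putting those steps together, here is a minimal training-loop sketch for a single perceptron, assuming a sigmoid activation, a squared-error loss, and plain gradient descent (the numbers and choices are illustrative, not the article's exact setup):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])   # one training example
y = 1.0                          # its target value
w = np.random.randn(3)           # weights initialized randomly
b = 0.0                          # bias starts at 0
eta = 0.1                        # learning rate (eta)

for step in range(100):
    # Forward pass: summation + non-linearity, then the cost
    z = np.dot(w, x) + b
    y_hat = sigmoid(z)
    cost = (y_hat - y) ** 2

    # Backward pass: chain rule through the cost, the sigmoid, and the summation
    dcost_dyhat = 2 * (y_hat - y)
    dyhat_dz = y_hat * (1 - y_hat)
    grad_w = dcost_dyhat * dyhat_dz * x
    grad_b = dcost_dyhat * dyhat_dz

    # Weight update: W_new = W_old - eta * dL/dW
    w -= eta * grad_w
    b -= eta * grad_b

print(cost)  # the cost should now be much lower than at the start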


I hope you understood!

Did you like this article? Don't forget to share:

Look at our latest articles:




Activation Functions





Gentle Introduction to Inferential Statistics!





Name: R.Aravindan

Company: Artificial Neurons.AI

Position: Content Writer

