Perceptron:

The perceptron is the simplest form of a neural network and its single basic unit. It is a mathematical representation of a biological neuron. (If you want to know what a neural network is at a high level, please refer to https://www.dhirubhai.net/posts/birbal-ekka_activity-7207942840222453760-fwwf?utm_source=share&utm_medium=member_desktop .)

A perceptron contains a single input layer and one output node. Let's take a mathematical approach to how the inputs and weights are used to predict the output.


Say we have training data with N input features and a single output y. The input layer contains N nodes that transmit the N features [x1, x2, x3, ..., xN]. This can be represented as a row vector.

The input layer has edges with weights [w1, w2, w3, ..., wN], contained in a column vector.

All these edges are linked to a single output Node.
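This setup can be sketched with NumPy. The feature values and weights below are illustrative placeholders, not taken from any real dataset:

```python
import numpy as np

# Hypothetical example with N = 3 features.
# The inputs form a row vector X = [x1, x2, x3].
X = np.array([0.5, -1.0, 2.0])

# The edge weights form a vector W = [w1, w2, w3].
W = np.array([0.2, 0.4, -0.1])

# Every weighted edge feeds the single output node,
# which aggregates them as the dot product W . X.
weighted_sum = np.dot(W, X)
print(weighted_sum)  # 0.5*0.2 + (-1.0)*0.4 + 2.0*(-0.1) = -0.5
```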

In a linear perceptron model, the linear function y = mx + b is generally used at the output node.

A linear function has the form

y = mx + b

where:

y = coordinate on the Y axis

x = coordinate on the X axis

m = slope of the line

b = y-intercept (the value of y when x = 0)
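As a quick sanity check, here is the linear function in code; the values of m, b, and x are arbitrary choices for illustration:

```python
# Minimal sketch of the linear function y = m*x + b.
def linear(x, m, b):
    return m * x + b

m, b = 2.0, 1.0           # slope and y-intercept (illustrative)
print(linear(0.0, m, b))  # at x = 0, y equals the intercept b: 1.0
print(linear(3.0, m, b))  # 2*3 + 1 = 7.0
```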

As we can see from the above equation, for a given value of x, y can be calculated when m (the slope of the line) and b (the y-intercept) are known. For simplicity of understanding we will ignore the y-intercept b for the time being; we will return to it later.

This results in the even simpler equation y = mx. Let's replace m with w, resulting in the equation y = wx. This brings out the similarity to our initial feature vector [x1, ..., xN] and weights [w1, ..., wN].

So for any given values of x and w, y can be calculated using the linear equation. Further, we replace y with y', which will be our predicted output, while y remains the expected output.

Elaborating on this, our final equation is:

y' = sign(W · X) = sign(w1·x1 + w2·x2 + ... + wN·xN)

Here the sign function maps a real value to +1 or -1, which is appropriate for binary classification. It serves as an activation function (more on this later).
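The forward pass described above can be sketched as follows. The weights and inputs are illustrative, and mapping a weighted sum of exactly zero to +1 is just one common convention:

```python
import numpy as np

# Perceptron forward pass: the sign of the weighted sum W . X
# gives the predicted class, +1 or -1.
def predict(W, X):
    return 1 if np.dot(W, X) >= 0 else -1

W = np.array([0.3, -0.2])
print(predict(W, np.array([1.0, 0.5])))   # 0.3 - 0.1 = 0.2 >= 0, so +1
print(predict(W, np.array([-1.0, 1.0])))  # -0.3 - 0.2 = -0.5 < 0, so -1
```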

We can already see that the neural network forms a basic function that needs to be learned. The weights are the unknowns, to be inferred in a data-driven manner from the training data.

Initially the weights W in the column vector are unknown, and they may be set randomly. As a result, the prediction y' will be essentially random for any given input and may not match the expected output y. The goal of training is to adjust the weights W to minimize the difference between y and y' (the expected and predicted outputs, respectively), so that the predictions become more accurate. When the prediction differs from the expected value, the weight vector is updated as:

W ← W + α(y − y')X

The rationale behind this update is that it always increases the dot product of the weight vector W and the input X by a quantity proportional to (y − y'), which tends to improve the prediction for that training instance. The parameter α (alpha) is the learning rate.
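Putting the forward pass and the update rule together gives a minimal training loop. The tiny dataset, zero initialization, and learning rate below are illustrative assumptions (labels must be +1 or -1):

```python
import numpy as np

# Forward pass: sign of the weighted sum.
def predict(W, X):
    return 1 if np.dot(W, X) >= 0 else -1

# Perceptron training with the update W <- W + alpha*(y - y')*X,
# applied only when the prediction is wrong.
def train(data, labels, alpha=0.1, epochs=20):
    W = np.zeros(data.shape[1])  # start at zero; random init also works
    for _ in range(epochs):
        for X, y in zip(data, labels):
            y_pred = predict(W, X)
            if y_pred != y:
                W = W + alpha * (y - y_pred) * X
    return W

# Tiny linearly separable example: the label is the sign
# of the first feature.
data = np.array([[2.0, 1.0], [1.0, -1.0], [-2.0, 1.0], [-1.0, -1.0]])
labels = np.array([1, 1, -1, -1])
W = train(data, labels)
print([predict(W, X) for X in data])  # [1, 1, -1, -1]
```

Note that the update fires only on misclassified points; once the data are separated, the weights stop changing.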

This is the general principle in machine learning: a basic function of two or more variables is chosen and some parameters are left unknown. These parameters are then learned in a data-driven manner, so that the functional relationship between the variables is consistent with the observed data.

Remember, we left b (the y-intercept) out of the linear equation for simplicity of understanding. This term is referred to as the bias; more on it in the next article.

Until then, thank you for reading.
