Neural Network
In this article, I am going back to the basics: Neural Networks!
Most readers must have seen the picture above and heard of neural networks, perceptrons, neurons, hidden layers, etc. I would like to take this opportunity to explain each of those terms and how they connect back to the network.
Some Background -
The neural network concept has been around for a long time; it was first proposed in 1944, but not until the arrival of graphics chips with greatly increased processing power did neural networks really pick up steam.
Neural networks are, in loose terms, a model architecture intended for machine learning, in which the model learns to perform a task by analyzing training data. For example, suppose you have pictures of the handwritten digit '4' and have labelled them. Using these as training data, the model identifies patterns in the handwritten pictures during training and associates them with the digit '4'. If you next show it a picture of '5', it will be able to tell that it is not a '4' - simple classification, just like any other model such as decision trees or logistic regression.
Then what makes them powerful?
What are Neural Networks?
In simple terms, a NN is a grouping of densely interconnected processing nodes. Nodes are organized into layers so that data can move through them to the next layer. An individual node adds little value on its own; the strength is in numbers. A node assigns a weight to each of its incoming connections. When the network is active, the node receives a data value over each connection and multiplies it by that connection's weight. The weighted inputs are summed into a single number, and if that number exceeds the node's threshold value, the node sends the result along all its outgoing connections.
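To make this concrete, here is a minimal sketch of what a single node computes - a weighted sum of inputs compared against a threshold. This is my own illustration in plain Python with NumPy; the values and function name are made up for the example.

```python
import numpy as np

def node_output(inputs, weights, threshold):
    """A single node: sum the weighted inputs, and pass the result
    along only if it exceeds the threshold (a step-style activation)."""
    weighted_sum = np.dot(inputs, weights)
    return weighted_sum if weighted_sum > threshold else 0.0

# Example: three incoming connections with arbitrary values and weights
x = np.array([0.5, -1.2, 3.0])   # incoming data values
w = np.array([0.8, 0.1, 0.4])    # weights assigned to each connection
print(node_output(x, w, threshold=1.0))  # 1.48 exceeds 1.0, so the node fires
```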
At the beginning, a NN has random values assigned to the weights and thresholds of its nodes; these are optimized during the training process as the network learns. Training data is fed through the input layer, and the data passes through the succeeding layers until it reaches the output layer. During this process, weights and thresholds are adjusted until the training labels are consistently predicted for the corresponding feature combinations.
You might ask: what are these nodes, weights, thresholds, layers, etc.? I've got you covered.
Nodes - This unit goes by multiple names - neuron, perceptron, node - but they all refer to the basic unit of a neural network. Each node receives a set of inputs and a bias value; when an input arrives, it is multiplied by a weight value.
Connection - Each node may have connections to nodes in the preceding and following layers. The link that carries the output of a node in one layer as input to a node in the receiving layer is called a connection.
Input (x) - Can be the output of a node in the previous layer or a value from the training dataset received by the node. Each input is associated with a corresponding weight.
Weight (w) - It represents the magnitude of influence of an input on the node.
Activation Function f(z) - A nonlinear transformation of a node's input value. There are various types of functions; the most used are sigmoid, tanh, ReLU, softmax, etc. These functions keep the output within a well-defined range - for example, sigmoid maps values to between 0 and 1, while tanh maps them to between -1 and 1.
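As a quick sketch (my own NumPy versions, for illustration only), the common activation functions mentioned above look like this:

```python
import numpy as np

def sigmoid(z):
    # Maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Maps any real number into (-1, 1)
    return np.tanh(z)

def relu(z):
    # Zero for negative inputs, identity for positive (unbounded above)
    return np.maximum(0.0, z)

def softmax(z):
    # Turns a vector of scores into probabilities that sum to 1
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), relu(z), softmax(z))
```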
Bias - It ensures a node can still activate even when all its input values are zero. It is an extra input to the neuron that is always 1 and carries its own weight.
Layers - Layers are a logical grouping of nodes based on their input and output connections. There are broadly three major layer types - Input, Hidden & Output.
Forward Propagation - The process by which samples from the training dataset move from the input layer through the nodes of each hidden layer, transformed by the weights, biases, and activation functions, until they reach the final output layer to predict the label in the training set.
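Here is a minimal forward-pass sketch through a tiny, hypothetical network (3 inputs, 4 hidden nodes, 1 output); the weights are random, just as they would be before training, and all names and shapes are my own assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A tiny network: 3 inputs -> 4 hidden nodes -> 1 output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)   # hidden-layer weights & biases
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # output-layer weights & biases

def forward(x):
    # Hidden layer: weighted sum + bias, then the activation function
    h = sigmoid(x @ W1 + b1)
    # Output layer: same pattern again
    return sigmoid(h @ W2 + b2)

x = np.array([0.2, 0.7, -0.1])  # one training sample
print(forward(x))               # the network's (untrained) prediction
```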
Loss/Cost Function - A function that estimates the deviation of the predicted values from the actual values. A neural network's effectiveness depends on minimizing the loss function's total value (the error). The choice of loss function depends on the use case we are trying to solve: for regression we can think of MSE; for classification, the cross-entropy family, etc.
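A minimal sketch of the two loss functions just mentioned (my own NumPy implementations, with made-up example values):

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error - common for regression
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Cross-entropy - common for classification; eps avoids log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.7])
print(mse(y_true, y_pred), binary_cross_entropy(y_true, y_pred))
```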
Backward Propagation - The process of tracing steps back from the output layer to the input layer to adjust the weights in a way that reduces the error from the loss function. Starting from the error after forward propagation, we can calculate the derivatives of the loss with respect to the weights of the last layer; these derivatives are called gradients. Gradients from one layer can be used to derive the gradients of the previous layer, and so on until we reach the input layer. This gives us a gradient for every weight; to reduce the error, we subtract the gradient (scaled by a learning rate) from each weight and rerun forward propagation. This allows the model to descend toward a local minimum. Gradient descent is one simple optimization algorithm; there are other types, such as stochastic gradient descent, chosen based on the cost function, learning rate & regularization used.
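Full multi-layer backpropagation is too long to show here, but the core gradient-descent step is the same in a single-layer model. Below is a sketch that trains a logistic-regression-style single node with gradient descent; the toy data, learning rate, and step count are all my own assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 4 samples, 2 features, with an AND-style binary target
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 0., 0., 1.])

rng = np.random.default_rng(1)
w, b = rng.normal(size=2), 0.0  # random starting weights, as in a fresh NN
lr = 0.5                        # learning rate

for step in range(2000):
    y_pred = sigmoid(X @ w + b)     # forward pass
    error = y_pred - y              # derivative of cross-entropy loss w.r.t. z
    grad_w = X.T @ error / len(y)   # gradient for the weights
    grad_b = error.mean()           # gradient for the bias
    w -= lr * grad_w                # descend: subtract the scaled gradient
    b -= lr * grad_b

print(np.round(sigmoid(X @ w + b), 2))  # predictions move toward [0, 0, 0, 1]
```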
Batch Size - If you use Keras/TensorFlow, etc., you must have seen "batch" being used; it is the number of training examples in one forward/backward pass. The higher the batch size, the more memory is required.
Epoch - One forward pass and one backward pass of all the training examples. The number of training epochs is the number of times the model is exposed to the full training dataset; the sketch below shows how batches and epochs fit together.
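A minimal training-loop skeleton, with dummy data and arbitrary batch/epoch values of my own choosing, to show how the two terms relate:

```python
import numpy as np

X = np.random.rand(100, 3)   # 100 samples, 3 features (dummy data)
y = np.random.rand(100)

batch_size, epochs = 32, 5
n = len(X)

for epoch in range(epochs):              # one epoch = one full pass over the data
    idx = np.random.permutation(n)       # shuffle samples between epochs
    for start in range(0, n, batch_size):
        batch = idx[start:start + batch_size]
        X_batch, y_batch = X[batch], y[batch]
        # one forward pass + one backward pass would run here, per batch
    print(f"epoch {epoch + 1}: {int(np.ceil(n / batch_size))} batches processed")
```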
I have generalized and simplified the language for ease of understanding, so please take it with a grain of salt. There are multiple variants of neural networks depending on the use case. I hope this serves as a one-stop reference for most neural network terminology; I have also added references to the picture and GIFs, which can be a useful resource if you are interested in a deeper dive.