An Introduction to Neural Networks
The renaissance of Artificial Intelligence, driven by the rise of Machine Learning and Deep Learning, has captured minds around the world, especially in academia. Artificial Intelligence as a whole encompasses the idea of creating machines that can mimic human-like behaviour and show intelligence comparable to (or even exceeding) human intelligence. Machine Learning is a part of it in which a model is trained on existing data, learns the important patterns in that data, and generates predictions for new inputs. Deep Learning and Neural Networks, on the other hand, were once seen as the "black sheep" of the field, but have gained a lot of traction and become buzzwords as Artificial Intelligence has risen in popularity.
So what makes Neural Networks so popular? The concept of Neural Networks has existed for decades, but for most of that time it lived mainly in mathematical theory and research work. Neural Networks took a very long time to evolve, mainly for two reasons: they require a lot of data to train, and they need substantial computing power to process and analyze that data. Neural Networks were regarded as computationally expensive, and it was not until Geoffrey Hinton and his team used Restricted Boltzmann Machines that Neural Networks, and Deep Learning as a whole, were kick-started.
What is a Net?
A Neural Network is inspired by how the human brain works, or at least by how we expect it to work. The simplest Neural Network is a perceptron: a single artificial "neuron" that takes a number of inputs and produces a single output. The human brain consists of billions of neurons, and this is what a Neural Network tries to mimic. Just as the neurons in our brain process information from the inputs they receive, a traditional Artificial Neural Network takes raw features as inputs, which are then processed by "Hidden Layers" before a numerical vector is generated as the output.
A Neural Network primarily consists of the following parts:
- Input Layer: Information enters the Neural Network through the Input Layer. This is the initial data layer, and it passes the raw pieces of information on to the Hidden Layers.
- Hidden Layer: The neurons in the Input Layer take the raw features as inputs and feed them to the Hidden Layers. A Hidden Layer takes a weighted combination of its inputs and passes it through an activation function such as the sigmoid, which is discussed a little later. Each node in the Hidden Layer holds a set of weights that is multiplied with the input vector, and a bias is added to the result.
- Output Layer: The Output Layer constitutes the output of the Neural Network. It simply presents the information that has been computed by the Hidden Layers.
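The three parts listed above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a full implementation: the function names (`sigmoid`, `dense_forward`), the layer sizes, and every weight and bias value are assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_forward(inputs, weights, biases):
    """Compute one layer: sigmoid(w . x + b) for every neuron.

    weights[j] holds the weights of neuron j, one weight per input.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(z))
    return outputs


# Illustrative values only: 2 raw features, 3 hidden neurons, 1 output neuron.
features = [0.5, -1.0]                                   # input layer
hidden_weights = [[0.1, 0.4], [-0.3, 0.2], [0.7, -0.6]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[0.5, -0.2, 0.3]]
output_biases = [0.05]

hidden = dense_forward(features, hidden_weights, hidden_biases)    # hidden layer
prediction = dense_forward(hidden, output_weights, output_biases)  # output layer
print(hidden, prediction)
```

Because every neuron ends in a sigmoid, each value in `hidden` and `prediction` lies strictly between 0 and 1.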
How does a Neural Network work?
Neural Networks are trained using an iterative algorithm called Backpropagation. Training works by feeding the network many input vectors along with their corresponding expected output vectors. During this process, the weights of the hidden layers that sit between the Input Layer and the Output Layer are tweaked and optimized, so that the layers learn to pick out the aggregate features in the data and combine them with suitable weights. So how is Backpropagation made successful? The secret lies in the Activation Function: the error at the output is propagated backward through the network using gradients, which requires an activation function that can be differentiated.
The key part of a Neural Network is how a neuron determines its output from its numerous inputs, and the answer again lies in the Activation Function. While there are many activation functions, the one most commonly found in classification algorithms is the Sigmoid function. Each neuron in such a system is its own logistic regression function, and a Neural Network without any hidden layer would just be a collection of logistic regressors.
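To make the link between a sigmoid neuron and logistic regression concrete, here is a small sketch. The names `sigmoid_derivative` and `neuron` and the sample weights are assumptions made for illustration. The derivative is included because gradient-based training needs it.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    """Slope of the sigmoid at z, using the identity s'(z) = s(z) * (1 - s(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


def neuron(inputs, weights, bias):
    """One sigmoid neuron: the same form as a logistic regression model."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


# Illustrative weights; in practice they would be learned from data.
probability = neuron([2.0, -1.0], weights=[0.8, 0.3], bias=-0.5)
print(probability, sigmoid_derivative(0.0))
```

The output of `neuron` can be read as a class probability, which is why the sigmoid is a natural fit for classification.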
Neural Networks also benefit greatly from Graphics Processing Units (GPUs). Small networks train fine on an ordinary CPU, but GPUs are specialized hardware with massive parallel processing capabilities, and they have gained an almost god-like status in Deep Learning because they make training large networks practical. The entire aim of training a Neural Network is to minimize the cost (error) function, which is done through forward and backward propagation.
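As a rough sketch of what minimizing the cost function through forward and backward propagation looks like, the example below trains a single one-input sigmoid neuron with gradient descent. The mean squared error cost, the toy data, the learning rate of 0.5, and the 200 steps are all assumptions chosen for illustration. Real networks repeat the same idea across many neurons and layers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost(w, b, data):
    """Mean squared error of a one-input sigmoid neuron over the data."""
    return sum((sigmoid(w * x + b) - y) ** 2 for x, y in data) / len(data)


def train_step(w, b, data, learning_rate):
    """One forward pass plus one backward pass (a gradient descent update)."""
    grad_w = grad_b = 0.0
    for x, y in data:
        a = sigmoid(w * x + b)             # forward propagation
        delta = 2 * (a - y) * a * (1 - a)  # backward: d(cost)/dz via the chain rule
        grad_w += delta * x
        grad_b += delta
    n = len(data)
    return w - learning_rate * grad_w / n, b - learning_rate * grad_b / n


# Toy data (assumed): negative inputs belong to class 0, positive to class 1.
data = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
w, b = 0.0, 0.0
history = [cost(w, b, data)]
for _ in range(200):
    w, b = train_step(w, b, data, learning_rate=0.5)
    history.append(cost(w, b, data))

print(history[0], history[-1])
```

Comparing the first and last entries of `history` shows the cost falling as the weight is adjusted.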
"Hello World" of Neural Network
In this code snippet, the "Hello World" of Neural Networks is written with Keras, a high-level Neural Network API that runs on top of popular Deep Learning libraries like TensorFlow or Theano. The snippet fits the simple function y = 2*x by training a densely connected Neural Network.
What flows between these layers are Tensors. Tensors can be thought of as multi-dimensional arrays (a matrix is the two-dimensional case), so the Input Layer itself is not really a layer but a Tensor shaped like the training data. Given the inputs and the corresponding outputs, the model is trained iteratively for 500 epochs, during which the weights are adjusted to reduce the error between its predictions and the expected outputs.
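The training loop described here can also be sketched without any framework. The example below is not the Keras code; it is a plain-Python illustration of the same idea, fitting y = 2*x with a single linear unit over 500 epochs. The data range, the learning rate, and the variable names are assumptions.

```python
# Training data for y = 2 * x (illustrative values).
xs = [float(x) for x in range(-5, 6)]
ys = [2.0 * x for x in xs]

weight, bias = 0.0, 0.0  # parameters of a single linear unit
learning_rate = 0.01     # assumed value
epochs = 500

for epoch in range(epochs):
    grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        error = (weight * x + bias) - y  # forward pass and residual
        grad_w += 2 * error * x          # d(MSE)/d(weight), summed over samples
        grad_b += 2 * error              # d(MSE)/d(bias), summed over samples
    # Gradient descent update using the mean gradient over the data set.
    weight -= learning_rate * grad_w / len(xs)
    bias -= learning_rate * grad_b / len(xs)

prediction = weight * 10.0 + bias  # the model's estimate for x = 10
print(weight, bias, prediction)
```

After training, `weight` should sit very close to 2 and `bias` very close to 0, so the prediction for an input of 10 lands near 20.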
In a dense layer, as implemented above, the inputs are multiplied by the weights. This means the layer holds a weight matrix with one row per input and one column per unit. After training the model, we can feed it standard test values to check its predictions, which come out almost 97% accurate.
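The shape of a dense layer's weights is easiest to see in a tiny example. The sketch below assumes the Keras-style layout, in which the weight matrix (the "kernel") has one row per input and one column per unit. The function name `dense` and all of the numbers are hypothetical.

```python
def dense(inputs, kernel, bias):
    """Linear dense layer: output[j] = sum_i inputs[i] * kernel[i][j] + bias[j]."""
    n_inputs, n_units = len(kernel), len(kernel[0])
    if len(inputs) != n_inputs or len(bias) != n_units:
        raise ValueError("shape mismatch between inputs, kernel and bias")
    return [
        sum(inputs[i] * kernel[i][j] for i in range(n_inputs)) + bias[j]
        for j in range(n_units)
    ]


# 3 inputs feeding 2 units, so the kernel has 3 rows and 2 columns.
kernel = [
    [1.0, 0.0],
    [0.0, 1.0],
    [2.0, -1.0],
]
bias = [0.5, 0.0]

outputs = dense([1.0, 2.0, 3.0], kernel, bias)
print(outputs)
```

Each column of `kernel` holds the weights of one unit, so adding a unit adds a column and adding an input feature adds a row.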
This is not all there is to Neural Networks. With the dawn of the 21st century, aided by huge computational power and the data generated with every click on our devices, Neural Networks have expanded in scope and functionality and are seeing widely varied uses. Forms of Neural Networks such as Convolutional Neural Networks, Generative Adversarial Networks, and Recurrent Neural Networks are being used in applications all around the world. With the rise of Deep Learning libraries like the popular TensorFlow, PyTorch, Keras and more, the reach of Deep Learning has expanded beyond academia to everyday practitioners who may not understand much of the mathematics behind it, but can put the pre-defined algorithms to societal use and help solve real-world problems.