Deep Learning 1: ANN (Artificial Neural Network) Architecture

Neuron and perceptron

Deep learning is heavily inspired by our own nervous system, and its basic building blocks operate in a broadly similar way. To see this, let's compare a perceptron (the basic building block of an ANN) with a single biological neuron.

Take a look at this image. On the right-hand side we have a perceptron; on the left-hand side, a neuron. Comparing them side by side, we can see clear similarities, which is why we say the perceptron is inspired by the neuron. The inputs can be compared to the dendrites, the weights to the synaptic connections (whose strengths decide how much each input matters), the summation to the cell body, and the outgoing output to the axon. Neurons connect with each other to form the nervous system, and similarly, multiple perceptrons connect to form a neural network.

Before diving into the ANN architecture, it's important to understand the perceptron.

To start, let's look at the structure of a perceptron. First we'll understand its components, then how it operates. Inputs are provided on one side, represented as x1, x2, ..., xn. There is also a bias input, fixed at 1. The connections between the inputs and the summation block carry the weights, represented by W1, W2, ..., Wn; the weight on the bias input is the bias term, represented here by X0 (often written as b or w0). The summation block adds up the inputs after they are multiplied by their respective weights.



The operation within this block is: multiply each input by its corresponding weight, sum the results, and add the bias term. For two inputs x1 and x2, this gives z = W1·x1 + W2·x2 + X0. The resulting value z is then passed to the activation function, which squashes the output into a specific range, for example between 0 and 1, or between -1 and 1, depending on the function used.

A simple activation function is the step function: the output is 1 if z is greater than or equal to 0, and 0 otherwise. There are other activation functions as well, such as the sigmoid and ReLU.
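
To make the forward pass concrete, here is a minimal sketch in Python; the inputs, weights, and bias are illustrative values, not taken from the figure:

```python
import numpy as np

def step(z):
    """Step activation: 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0

def perceptron(x, w, b):
    """Weighted sum of inputs plus bias, passed through the activation."""
    z = np.dot(w, x) + b          # z = W1*x1 + W2*x2 + ... + bias
    return step(z)

# Illustrative example with two inputs
x = np.array([1.0, 2.0])
w = np.array([0.5, -0.4])   # W1, W2 (assumed values)
b = 0.1                     # bias term (written as X0 above)
print(perceptron(x, w, b))  # z = 0.5*1.0 + (-0.4)*2.0 + 0.1 = -0.2 -> output 0
```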

Identification of Weights and Biases

Training Process

The goal of the training process is to find the correct values for the weights (W1, W2, ...) and the bias (X0, often written b). Training adjusts these values until the perceptron makes accurate predictions.

How a perceptron is trained, using forward propagation and backpropagation, is covered in detail in the article linked below (and will also be covered in detail later in this series).

https://www.dhirubhai.net/pulse/logistic-regression-deep-learning-approach-jitender-malik-7vekc/?trackingId=L0b%2F77PUQtmR%2F19HU9yVaQ%3D%3D
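
For intuition, here is a minimal sketch of the classic perceptron learning rule (an illustrative variant, not the exact procedure from the linked article): on each mistake, the weights are nudged toward the misclassified point.

```python
import numpy as np

def train_perceptron(X, y, lr=0.03, epochs=100):
    """Classic perceptron learning rule: update weights only on mistakes."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if np.dot(w, xi) + b >= 0 else 0
            error = yi - pred            # 0 if correct, +/-1 if wrong
            w += lr * error * xi         # nudge weights toward the point
            b += lr * error
    return w, b

# Linearly separable toy data (logical AND, assumed for illustration)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
print(w, b)   # a separating line is found quickly
```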

Problems with Perceptron

The perceptron's key limitation is that it only works on linearly separable data; it cannot handle non-linear data. No matter how much time we give the perceptron to train, it will never be able to classify non-linear data correctly.

If we look closely, the dataset below is linearly separable because the two classes can be divided by a straight line. The inputs are X1 and X2, and the learning rate is 0.03. When we run the training, the perceptron quickly finds a separating line and gives the correct result.


Now look at this second dataset. No matter how much time we give it or how many epochs we run, we will not get a correct result; the perceptron cannot separate the two classes.
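
One way to see this failure is to run the same learning rule on XOR, the textbook non-linearly-separable dataset (assumed here for illustration; the figures above use a different dataset). Since no straight line separates XOR, at least one point is always misclassified, no matter how many epochs we run:

```python
import numpy as np

# XOR: the classic non-linearly-separable dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

w, b = np.zeros(2), 0.0
for epoch in range(1000):                  # far more epochs than should be needed
    for xi, yi in zip(X, y):
        pred = 1 if np.dot(w, xi) + b >= 0 else 0
        w += 0.03 * (yi - pred) * xi
        b += 0.03 * (yi - pred)

preds = [1 if np.dot(w, xi) + b >= 0 else 0 for xi in X]
print(preds, "vs", list(y))   # at least one point stays misclassified
```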

Multi-Layer Perceptron (MLP)

The perceptron works, but its one major problem is that if there is non-linearity in our data, it cannot capture it.

Take the example below: a dataset with two classes, green and red. To separate these two classes, we need a decision boundary like the one shown, which is not a straight line.


The perceptron, by contrast, creates a straight-line boundary like this. So the problem is that we need an algorithm that can capture the kinds of non-linearity our perceptron cannot.


The most challenging part is that we need to build that algorithm out of perceptrons. So, we will try to create a network of multiple perceptrons that can capture any kind of non-linearity. To get an idea of the multi-layer perceptron, I have trained two separate perceptrons on the data.


As shown in the image, in model 1, W1 is 2, W2 is 3, and the bias is 6, so the line of this perceptron looks like this.

The equation of the line is 2x + 3y + 6 = 0.

In model 2, W1 is 5, W2 is 4, and the bias is 3.

The equation of the line is 5x + 4y + 3 = 0.
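
Each perceptron classifies a point by checking which side of its line the point falls on, i.e. the sign of W1·x + W2·y + bias. A small sketch (the test point is an illustrative assumption):

```python
# Which side of each perceptron's line does a point fall on?
def model_1(x, y):
    return 2 * x + 3 * y + 6      # line of model 1: 2x + 3y + 6 = 0

def model_2(x, y):
    return 5 * x + 4 * y + 3      # line of model 2: 5x + 4y + 3 = 0

point = (1.0, -1.0)               # illustrative point
print(model_1(*point) >= 0, model_2(*point) >= 0)  # 5 >= 0, 4 >= 0 -> True, True
```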

Now, if we somehow combine the outputs of these two perceptrons, we can create a new decision boundary like this. Imagine superimposing this image over the previous one: we get a new, piecewise decision boundary of this kind.



Now we smooth it, and once smoothed, it looks like this.


This is exactly the decision boundary we needed. That is the basic idea of the multi-layer perceptron.

Each node in this whole construction is still doing the work of a perceptron; what is happening is that we are creating a combination of three perceptrons, which we call a multi-layer perceptron.

We call this a multi-layer perceptron because there are multiple layers: the first is the input layer, then the hidden layer, then the output layer. So now we understand how to organize more than one perceptron in a way that captures non-linear interactions. The concept is simple: we create a linear combination of multiple perceptrons, passed through a non-linear activation.
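
Here is a hedged sketch of that three-perceptron combination, reusing the two models above with sigmoid smoothing; the combining weights in the output perceptron are illustrative assumptions, not trained values:

```python
import numpy as np

def sigmoid(z):
    """Smooth activation in (0, 1); this is the 'smoothing' step above."""
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x):
    """Three perceptrons: two in the hidden layer, one combining them."""
    h1 = sigmoid(2 * x[0] + 3 * x[1] + 6)     # model 1 from above
    h2 = sigmoid(5 * x[0] + 4 * x[1] + 3)     # model 2 from above
    # Output layer: linear combination of the hidden outputs (assumed weights)
    return sigmoid(1.0 * h1 + 1.0 * h2 - 1.5)

print(mlp_forward(np.array([0.5, -0.5])))     # a value in (0, 1)
```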

ANN Architecture

First of all, it is important to understand what is meant by the architecture of a neural network. Architecture simply means how the nodes (also called perceptrons) are connected to each other: the arrangement of nodes and the weighted connections between them.


Now we will look at how we can change our neural network architecture to gain additional flexibility.

So, what can we change in the architecture as requirements change? There are four ways, listed below (see the code sketch after this list).

1. Increase the number of nodes in the hidden layer: we can add more perceptrons. For example, in the image below, the middle layer is the hidden layer, where we are adding one more perceptron.

Until now we had two perceptrons in the hidden layer; here there are three. Everything else remains the same; we have only added an extra perceptron to the hidden layer. The effect is that if the data is very non-linear, the additional node helps capture that complexity. When we take a linear combination of these three, we get the new output. Nothing else changes; one extra node was added, and each extra node brings its own weights as well. The things to understand are that we can add as many nodes as we want to the hidden layer, and that the more nodes we add, the more complex the non-linear decision boundaries we can create.

2. Increase the number of nodes in the input layer. This happens when the number of input columns in the data increases: the more columns (features) the input has, the more nodes the input layer needs, one node per feature.

3. Increase the number of units in the output layer. In all the examples so far, there was a single perceptron at the end, but that is not required; we can have more than one. We generally do this for multi-class classification. For example, to identify whether a photo contains a dog, a cat, or a human, we create three output perceptrons: one for the dog, one for the cat, and one for the human. Whichever produces the highest probability decides the label. So the learning point is that in multi-class classification we can have multiple nodes in the output layer.


4. Increase the number of hidden layers (deep neural network). Until now we increased the nodes within a layer, not the number of layers, but we can do that too. In the figure, there is an input layer and an output layer, but two hidden layers in between. What benefit do we get from this? When the data is very complex and non-linear, requiring very complex decision boundaries, we can only create those boundaries by adding layers. The first layer creates only a limited amount of complexity, but as we go deeper into the network, each layer takes linear combinations of the previous layer's outputs and captures progressively more complex relationships. No matter how complex or non-linear the data is, if we keep adding layers to the network and give it enough training time, the neural network will be able to capture the relationships.
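
To make these four knobs concrete, here is a minimal sketch using Keras (assumed available); the layer sizes, the four input features, and the three output classes are illustrative assumptions, not taken from the figures:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(4,)),                  # 2. input layer: one node per feature
    layers.Dense(8, activation="relu"),       # 1. hidden layer: 8 nodes (add more for complexity)
    layers.Dense(8, activation="relu"),       # 4. a second hidden layer (deeper network)
    layers.Dense(3, activation="softmax"),    # 3. output layer: one node per class
])
model.summary()
```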

For example, the non-linear data which the single perceptron could not classify, a multi-layer perceptron captured easily, producing the non-linear boundaries shown below.
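
As a hedged end-to-end check, the sketch below trains both a single perceptron and a small MLP on scikit-learn's make_moons dataset, an assumed stand-in for the dataset in the figures:

```python
from sklearn.datasets import make_moons
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

# A non-linearly-separable dataset (stand-in for the one in the figure)
X, y = make_moons(n_samples=500, noise=0.15, random_state=0)

linear = Perceptron().fit(X, y)
mlp = MLPClassifier(hidden_layer_sizes=(10, 10), max_iter=2000,
                    random_state=0).fit(X, y)

print("perceptron accuracy:", linear.score(X, y))   # stuck well below 1.0
print("MLP accuracy:", mlp.score(X, y))             # close to 1.0
```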




