AlexNet Architecture
Asif Tandel
Software Engineer (Data & AI) @ Bahwan Cybertek | Python | Generative AI | Data Science | Machine Learning | Predictive Analytics | AWS
Introduction:-
AlexNet was designed by Alex Krizhevsky, a student of Geoffrey Hinton, and it won the 2012 ImageNet competition. It was after that year that ever deeper neural networks were proposed, such as the excellent VGG and GoogLeNet. The official model reaches a top-1 accuracy of 57.1% and a top-5 accuracy of 80.2% on ImageNet, which is already quite outstanding compared to traditional machine learning classification algorithms.
AlexNet's network structure consists of five convolutional layers followed by three fully connected layers.
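As a concrete reference, here is a minimal PyTorch sketch of that layer structure. The channel counts, kernel sizes, and strides follow the original paper, but the code itself is an illustrative reimplementation, not the author's:

```python
import torch
import torch.nn as nn

class AlexNet(nn.Module):
    """Sketch of AlexNet for 3x227x227 inputs, following the 2012 paper."""
    def __init__(self, num_classes: int = 1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4),    # conv1 -> 96x55x55
            nn.ReLU(inplace=True),
            nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),
            nn.MaxPool2d(kernel_size=3, stride=2),         # -> 96x27x27
            nn.Conv2d(96, 256, kernel_size=5, padding=2),  # conv2 -> 256x27x27
            nn.ReLU(inplace=True),
            nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),
            nn.MaxPool2d(kernel_size=3, stride=2),         # -> 256x13x13
            nn.Conv2d(256, 384, kernel_size=3, padding=1), # conv3
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), # conv4
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), # conv5
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),         # -> 256x6x6
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),                  # fc6
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),                         # fc7
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),                  # fc8
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

model = AlexNet()
print(model(torch.randn(1, 3, 227, 227)).shape)  # torch.Size([1, 1000])
```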
Why does AlexNet achieve better results?
1. ReLU function: f(x) = max(0, x)
ReLU-based deep convolutional networks train several times faster than their tanh- and sigmoid-based counterparts. In the original paper, a four-layer convolutional network trained on CIFAR-10 reaches 25% training error several times faster with ReLU than with tanh.
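To see why, here is a small, illustrative PyTorch sketch comparing the gradients of the two activations (not an experiment from the paper):

```python
import torch

x = torch.linspace(-4.0, 4.0, 9, requires_grad=True)

# ReLU: the gradient is exactly 1 for every positive input, so it never saturates.
torch.relu(x).sum().backward()
print(x.grad)  # tensor([0., 0., 0., 0., 0., 1., 1., 1., 1.])

x.grad = None
# tanh: the gradient (1 - tanh(x)^2) collapses toward 0 as |x| grows,
# which slows gradient descent for saturated units.
torch.tanh(x).sum().backward()
print(x.grad)  # near 0 at the extremes, close to 1 only around x = 0
```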
2. Standardization (Local Response Normalization)
After applying ReLU, f(x) = max(0, x), you will find that the activation values are unbounded, unlike those of the tanh and sigmoid functions, so a normalization step is usually applied after ReLU. LRN (Local Response Normalization) was proposed based on a concept from neuroscience called "lateral inhibition", which describes the effect an excited neuron has on its surrounding neurons.
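A minimal sketch of LRN in PyTorch, assuming the hyperparameters reported in the AlexNet paper (n=5, k=2, alpha=1e-4, beta=0.75):

```python
import torch
import torch.nn as nn

# Each activation is divided by a term summed over the 5 neighboring channels
# at the same spatial position, damping uniformly large responses.
lrn = nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0)

feature_map = torch.relu(torch.randn(1, 96, 55, 55))  # e.g. the output of conv1
normalized = lrn(feature_map)
print(normalized.shape)  # torch.Size([1, 96, 55, 55]) -- shape is unchanged
```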
3. Dropout
Dropout is another frequently mentioned concept, and it effectively prevents overfitting in neural networks. Whereas a typical linear model uses a regularization term to keep the model from overfitting, Dropout works by modifying the structure of the neural network itself: for a given layer, some neurons are randomly dropped with a defined probability, while the input and output layers are kept unchanged, and the parameters are then updated according to the usual learning procedure. In the next iteration, a different random subset of neurons is removed, and this repeats until training ends.
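A minimal PyTorch sketch of this behavior; p=0.5 matches the probability AlexNet used in its fully connected layers:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)  # each unit is zeroed with probability 0.5

x = torch.ones(1, 8)
drop.train()
print(drop(x))  # roughly half the units zeroed; survivors scaled by 1/(1-p) = 2
drop.eval()
print(drop(x))  # at inference Dropout is a no-op: all ones
```

The 1/(1-p) scaling during training ("inverted dropout", as PyTorch implements it) keeps the expected activation the same, so no rescaling is needed at test time.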
4. Data Augmentation
In deep learning, when the amount of data is not large enough, there are generally four solutions:
1) Data augmentation: artificially increase the size of the training set by creating a batch of "new" data from existing data through translation, flipping, and adding noise (see the sketch after this list).
2) Regularization: a relatively small amount of data causes the model to overfit, making the training error small and the test error particularly large. Adding a regularization term to the loss function suppresses overfitting; the disadvantage is that it introduces a hyper-parameter that must be tuned manually.
3) Dropout: also a regularization method, but unlike the above, it works by randomly setting the output of some neurons to zero.
4) Unsupervised pre-training: use an Auto-Encoder or the convolutional form of an RBM to pre-train the network layer by layer without labels, then add a classification layer on top and do supervised fine-tuning.
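A minimal sketch of the augmentation idea, assuming torchvision; the 227x227 crop size matches AlexNet's input, but the other parameters are illustrative:

```python
import torch
from torchvision import transforms

# Label-preserving transforms: each epoch sees a slightly different version
# of every training image, effectively enlarging the dataset.
augment = transforms.Compose([
    transforms.RandomResizedCrop(227),       # random crop, resized to 227x227
    transforms.RandomHorizontalFlip(p=0.5),  # mirror the image half the time
    transforms.ToTensor(),                   # PIL image -> float tensor in [0, 1]
])

def add_noise(img: torch.Tensor, std: float = 0.02) -> torch.Tensor:
    """Illustrative noise injection applied after the transforms above."""
    return (img + std * torch.randn_like(img)).clamp(0.0, 1.0)
```

For the regularization point, the usual concrete instance is L2 weight decay (e.g. the weight_decay argument of torch.optim.SGD); AlexNet itself was trained with a weight decay of 0.0005.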
Summary:-
1) What I observed is that training is faster when compared to LeNet; this model achieves better accuracy much sooner.
2) The loss also decreases at a faster rate than LeNet's.
3) The increase in network depth and the introduction of ReLU had a major impact on neural networks, and this model inspired the research behind future models.
Advantages:-
1) AlexNet was the first major CNN model that used GPUs for training, which led to faster model training.
2) AlexNet is a deeper architecture, with 8 layers, which means it is better able to extract features than LeNet. For its time, it also worked well with color images.
3) The ReLU activation function used in this network has two advantages. First, it does not bound the output, unlike other activation functions, so there is less loss of feature information.
4) Second, ReLU zeroes out negative activations rather than the data itself, so not all neurons are active at once; this sparsity further improves training speed.
Disadvantages:-
1) Compared to the models used later in this article, this model is quite shallow, and hence it struggles to learn features from more complex image sets.
2) We can also see that it takes more time to achieve high accuracy compared to later models.