Going Deeper with Convolutions (Inception | GoogLeNet)

1. What is an Inception model?

Inception is an image recognition model that has been shown to attain greater than 78.1% accuracy on the ImageNet dataset. The model is the culmination of many ideas developed by multiple researchers over the years.

The model comprises symmetric and asymmetric building blocks, including convolutions, average pooling, max pooling, concatenations, dropouts, and fully connected layers. Batch normalization is used extensively throughout the model and applied to activation inputs. Loss is computed using Softmax.


2. Inception-V1:

Stacking many deep convolutional layers in a model tended to overfit the data. To avoid this, the Inception-V1 model applies multiple filters of different sizes at the same level. Instead of adding ever-deeper layers, the Inception models use parallel layers, making the network wider rather than deeper.


The naïve Inception module performs 1×1 convolutions, 3×3 convolutions, 5×5 convolutions, and a 3×3 max-pooling operation in parallel.

It then concatenates the outputs of all these operations along the channel dimension to build the next feature map. The architecture therefore does not follow the sequential approach in which each operation, such as pooling or convolution, is performed one after another.

The Inception module with dimension reduction works in the same way as the naïve one, with one difference: features are first compressed at the pixel level with 1×1 convolutions before the 3×3 and 5×5 convolutions. A 1×1 convolution leaves the spatial dimensions of the feature map unchanged while reducing the number of channels, so the expensive convolutions that follow see fewer channels; in practice this reduced module also achieves better accuracy.
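The saving from the 1×1 bottleneck can be checked with a quick weight count. The channel numbers below are illustrative assumptions, not the exact GoogLeNet figures:

```python
def conv_params(in_ch, out_ch, k):
    """Number of weights in a k x k convolution (biases ignored)."""
    return k * k * in_ch * out_ch

in_ch = 192       # channels entering the module (assumed)
out_5x5 = 32      # filters in the 5x5 branch (assumed)
reduce_1x1 = 16   # 1x1 bottleneck placed before the 5x5 (assumed)

# Naive branch: 5x5 directly on all 192 input channels
naive = conv_params(in_ch, out_5x5, 5)                                  # 153600

# Reduced branch: 1x1 compresses 192 -> 16 channels, then 5x5 runs on 16
reduced = conv_params(in_ch, reduce_1x1, 1) + conv_params(reduce_1x1, out_5x5, 5)  # 15872

print(naive, reduced)  # the bottleneck cuts the weight count by roughly 10x
```

The same logic applies to the 3×3 branch; the spatial output size is identical in both cases, only the channel mixing is cheaper.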

Inception architecture:


3. Inception-V2:

In the Inception-V2 architecture, the 5×5 convolution is replaced by two stacked 3×3 convolutions. This reduces computational cost, because a 5×5 convolution is 25/9 ≈ 2.78 times more expensive than a single 3×3 convolution, while two 3×3 convolutions together need only 18 of those 25 multiplications. Using two 3×3 layers instead of one 5×5 therefore improves the efficiency of the architecture.
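The 2.78 figure comes straight from counting multiplications per output position and channel pair; a short arithmetic check:

```python
# Multiplications per output position for one input/output channel pair
cost_5x5 = 5 * 5              # 25 multiplications
cost_3x3 = 3 * 3              # 9 multiplications
cost_two_3x3 = 2 * cost_3x3   # 18 multiplications for the stacked pair

print(cost_5x5 / cost_3x3)      # ~2.78: one 5x5 vs one 3x3
print(cost_two_3x3 / cost_5x5)  # 0.72: the 3x3 pair still costs 28% less
```

The two stacked 3×3 convolutions also cover the same 5×5 receptive field while adding an extra non-linearity between them.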


This architecture also factorizes an n×n convolution into a 1×n convolution followed by an n×1 convolution. For example, a 3×3 convolution can be replaced by a 1×3 convolution followed by a 3×1 convolution, which is about 33% cheaper in computational cost than the 3×3.
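The 33% saving is again simple kernel arithmetic:

```python
# Multiplications per output position for one input/output channel pair
cost_3x3 = 3 * 3                    # 9 multiplications for the square kernel
cost_1x3_plus_3x1 = 1 * 3 + 3 * 1   # 6 multiplications for the factorized pair

savings = 1 - cost_1x3_plus_3x1 / cost_3x3
print(savings)  # ~0.33: the asymmetric pair is about a third cheaper
```

The saving grows with n: for an n×n kernel the factorized pair costs 2n instead of n² multiplications.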


To deal with the representational bottleneck, the filter banks of the module were expanded (made wider) instead of making the module deeper. This prevents the loss of information that occurs when the module is made deeper.


4. Inception-V3:

Inception-v3 focuses mainly on reducing computational cost by modifying the earlier Inception architectures. The idea was proposed in the paper Rethinking the Inception Architecture for Computer Vision, published in 2015 and co-authored by Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, and Jonathon Shlens.


Inception-v3 architecture:

Inception-v3 is similar to Inception-v2 with some updates in loss functions, optimizer, and batch normalization.


What’s new?

These are the main updates in Inception-v3 relative to Inception-v2:

  • The RMSProp optimizer is used.
  • Batch normalization is used in the auxiliary classifier.
  • Label smoothing (a regularizing term added to the loss function that prevents the network from becoming over-confident and overfitting).
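Label smoothing itself is a one-line transformation of the targets. A minimal sketch (the smoothing factor 0.1 is a commonly used value, assumed here):

```python
def smooth_labels(one_hot, eps=0.1):
    """Soften hard 0/1 targets: the true class gets 1 - eps + eps/K,
    every other class gets eps/K, where K is the number of classes."""
    k = len(one_hot)
    return [(1 - eps) * y + eps / k for y in one_hot]

# With K = 4 classes and eps = 0.1: true class -> 0.925, others -> 0.025
print(smooth_labels([0.0, 1.0, 0.0, 0.0]))
```

Most frameworks expose this directly; for example, Keras accepts a `label_smoothing` argument on its cross-entropy losses, so hand-rolling it is rarely needed in practice.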

5. Inception-V4:

In Inception-v4 the network was made deeper, the stem (the initial part of the Inception architecture, before the first Inception block) was changed, and uniform choices were made for the Inception blocks.


What’s new?

  • Change in the stem part
  • The number of Inception modules is increased.
  • Inception modules are made more uniform, i.e. the same number of filters is used across corresponding modules.
  • Three types of Inception modules are used, named A, B, and C (similar to the Inception modules in Inception-v2).

6. Inception-ResNet-v2:

Inspired by the performance of ResNet, residual connections are introduced into the Inception modules.

For the residual addition, the input and the concatenated output of the branch operations must have the same dimensions. Padding is therefore applied in each operation, and at the end a 1×1 convolution is applied to make the number of channels match.
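The channel-matching 1×1 convolution can be sketched in NumPy (all shapes and weights below are illustrative assumptions, not the real Inception-ResNet-v2 values). A 1×1 convolution is just a per-pixel matrix multiply over the channel axis:

```python
import numpy as np

def conv1x1(x, w):
    """A 1x1 convolution: per-pixel channel mix, (H, W, Cin) @ (Cin, Cout)."""
    return x @ w

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 64))           # module input (assumed shape)
branch_out = rng.standard_normal((8, 8, 96))  # pretend concatenated branch outputs

# Project the 96 branch channels back to 64 so the shortcut can be added
w_proj = rng.standard_normal((96, 64))
y = x + conv1x1(branch_out, w_proj)           # residual addition

print(y.shape)  # (8, 8, 64): same shape as the input, as required
```

Padding keeps the spatial size at 8×8 in every branch; the 1×1 projection handles the remaining channel mismatch.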


Performance of Inception:

  • My GitHub
  • Inception-V1 Paper
  • Inception-V3 Paper
  • Inception-V4 Paper
