CNN Anatomy: Revealing Layer Output Dynamics


One of the most significant classes of neural networks is the Convolutional Neural Network, commonly known as the CNN. CNNs are specifically tailored for image processing tasks and find extensive use in applications such as image classification and object recognition. In this article, we delve into the architecture of CNNs and dissect the role and impact of each layer.

This article presupposes a basic understanding of concepts like filter maps and max-pooling. CNN models operate by learning features from training images through the application of diverse filters at each layer. Notably, the features learned at each convolutional layer exhibit considerable variation. It's widely observed that initial layers predominantly capture low-level features such as edges, image orientation, and colors—essentially, the fundamental aspects of the image. As the network deepens, higher-level features are extracted, aiding in the discrimination between different image classes.


Let's begin by constructing a basic CNN with 4 convolutional layers, 4 max-pooling layers, 1 dense layer, and an output layer. This model is designed for a binary classification task distinguishing between cats and dogs, so the sigmoid activation function is employed at the output. The model architecture is depicted in the image below. The input image size is 150 x 150, and being a color image, it comprises 3 channels representing Red, Green, and Blue.

A Simple CNN Model


The CNN model consists of convolutional layers with 3x3 filters. The first layer contains 32 filters, while the second layer has 64. It's essential to note that each filter in the first layer has 3 channels, corresponding to the RGB channels of the input. Below is the code detailing the CNN layers:

Dimensions of Convolutional layers
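The model code appears above only as an image, so here is a minimal runnable sketch of the described architecture, assuming Keras/TensorFlow. Only the first two filter counts (32 and 64) are stated in the article; the remaining layer sizes are assumptions.

```python
# A minimal sketch of the 4-conv / 4-pool / 1-dense architecture described
# above. Filter counts after the second conv layer are assumptions.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),   # assumed size
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),   # assumed size
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),            # assumed size
    layers.Dense(1, activation='sigmoid'),           # binary output: cat vs dog
])
model.summary()
```

Calling `model.summary()` prints the dimensions of each layer, matching the kind of table shown in the image above.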

After training our model, our next objective is to understand how each filter in the convolutional layer extracts specific features from the image. Before delving into the impact of these filters on the image, however, let's first visualize the filters themselves. Below, we present 6 of the 32 filters in the Conv2D layer (layer 0) mentioned in the image above. Each row corresponds to one of the 6 visualized filters, while the 3 columns correspond to the 3 channels of each filter. In the visualization below, dark squares indicate small or inhibitory weights, while light squares represent large or excitatory weights.

Filter Maps.
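A sketch of how such a filter visualization can be produced is shown below. It assumes a Keras model like the one above; here an untrained first layer stands in for the trained network so the snippet runs on its own.

```python
# Sketch: visualize 6 of the 32 first-layer filters, one column per RGB
# channel. In practice `model` would be the trained network; an untrained
# first layer is used here as a stand-in.
import matplotlib.pyplot as plt
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
])

filters, biases = model.layers[0].get_weights()   # filters: (3, 3, 3, 32)
# rescale weights to [0, 1] so they can be drawn as images:
# dark squares = small weights, light squares = large weights
f_min, f_max = filters.min(), filters.max()
filters = (filters - f_min) / (f_max - f_min)

n_filters = 6
fig, axes = plt.subplots(n_filters, 3, figsize=(4, 8))
for i in range(n_filters):
    for ch in range(3):                           # R, G, B channel of filter i
        axes[i, ch].imshow(filters[:, :, ch, i], cmap='gray')
        axes[i, ch].set_xticks([])
        axes[i, ch].set_yticks([])
plt.show()
```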

Now, let's proceed to apply the filters to a new image presented to the model. Since this is a cat-vs-dog classifier, we have downloaded a cat picture from the web. Through code, we will preprocess the image to make it suitable for input into our trained model.

Code for Test Image Preprocessing
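The preprocessing code is shown above as a screenshot; the following is a runnable sketch, assuming Keras utilities and a placeholder filename ('cat.jpg' is an assumption, not the original path).

```python
# Sketch of test-image preprocessing: resize to the model's 150 x 150 input,
# add a batch dimension, and scale pixel values.
import numpy as np
from tensorflow.keras.utils import load_img, img_to_array

def preprocess(path, target_size=(150, 150)):
    """Load an image file and turn it into a model-ready batch."""
    img = load_img(path, target_size=target_size)  # resize to the input size
    x = img_to_array(img)                          # (150, 150, 3) float array
    x = np.expand_dims(x, axis=0)                  # batch axis -> (1, 150, 150, 3)
    return x / 255.0                               # scale pixels to [0, 1]

# x = preprocess('cat.jpg')   # 'cat.jpg' is a placeholder filename
```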

The output of the above code is shown below:

Cat Image for our model.

Below is the code snippet to process the output of each layer of our model. For reference, let's inspect the output of a convolutional layer with 64 filters; we create an 8 x 8 grid to visualize all of its filter maps. You can analyze any layer of the model by simply changing the layer number.

Code for generation of the 8 x 8 filter map grid
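Since this code also appears only as an image, here is a runnable sketch of the idea: build a sub-model that outputs a chosen layer's activations and plot them in an 8 x 8 grid. The untrained single-layer model and the random input are stand-ins (assumptions) so the snippet is self-contained; in practice you would use the trained model and the preprocessed cat image.

```python
# Sketch: visualize the 64 feature maps of one layer as an 8 x 8 grid.
# Stand-ins: an untrained 64-filter conv layer for the trained model, and a
# random array for the preprocessed cat image.
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(64, (3, 3), activation='relu', input_shape=(150, 150, 3)),
])
x = np.random.rand(1, 150, 150, 3)           # stand-in for the preprocessed image

layer_no = 0                                 # change this to inspect other layers
activation_model = models.Model(inputs=model.input,
                                outputs=model.layers[layer_no].output)
feature_maps = activation_model.predict(x)   # shape: (1, 148, 148, 64)

rows = cols = 8                              # 8 x 8 grid for 64 filter maps
fig, axes = plt.subplots(rows, cols, figsize=(12, 12))
for i, ax in enumerate(axes.flat):
    ax.imshow(feature_maps[0, :, :, i], cmap='viridis')
    ax.set_xticks([])
    ax.set_yticks([])
plt.show()
```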

Below is the output grid of the above code:

8 x 8 Filter Map Grid of the 1st Convolutional Layer.

We can see how applying the filters to the image produces feature maps that highlight different areas, thus helping the model classify. For instance, the 3rd map traces an outline of the cat, while the 7th highlights its eyes. As already mentioned, the initial layers of a model focus on learning minute details like curves and edges.

If we change the layer number to 7, we can see that this deeper layer focuses on broader features.

8 x 8 Filter Map Grid of the 7th Convolutional Layer.

In this manner, we can change the layer number and observe each layer's effect on the image. Throughout this article, we've elucidated the influence of each CNN layer on the image. Given our emphasis on CNN layers, we haven't explicitly predicted the outcome of the model. However, you're encouraged to utilize the model to predict the content of the image. If you're interested in viewing the prediction results, you can access the code for this article in the following GitHub repository.

https://github.com/vishal91-hub/CatvsDog_ImageClassifier/blob/main/CNNLayer_Anatomy.ipynb

If you are interested in reading about transfer learning in CNN models, you can refer to my article on the same here.

https://medium.com/@vishal025/transfer-learning-in-convolution-neural-network-b69504f1d052






