CNN Anatomy: Revealing Layer Output Dynamics
Convolutional Neural Networks, commonly known as CNNs, are among the most significant classes of neural networks. CNNs are specifically tailored for image processing and find extensive use in applications such as object detection and image classification. In this article, we delve into the architecture of CNNs and dissect the role and impact of each layer.
This article presupposes a basic understanding of concepts like feature maps and max-pooling. CNN models learn features from training images by applying a different set of filters at each layer, and the features learned at each convolutional layer vary considerably. It's widely observed that the initial layers predominantly capture low-level features such as edges, orientation, and colors—the fundamental aspects of the image. As the network deepens, higher-level features are extracted, aiding in the discrimination between different image classes.
Let's begin by constructing a basic CNN with 4 convolutional layers, 4 max-pooling layers, 1 dense layer, and an output layer. This model is designed for a binary classification task distinguishing between cats and dogs, so the sigmoid activation function is employed at the output. The model architecture is depicted in the image below. The input image size is 150 x 150, and being a color image, it comprises 3 channels representing Red, Green, and Blue.
The CNN model consists of convolutional layers with 3x3 filters. The first layer contains 32 filters, while the second layer has 64 filters. It's essential to note that each filter encompasses 3 channels, corresponding to the primary colors. Below is the code detailing the CNN layers:
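A sketch of the architecture in Keras is shown below. The first two convolutional layers use 32 and 64 filters as described above; the filter counts of the deeper layers, the width of the dense layer, and the choice of optimizer are assumptions for illustration, since only the first two layers are specified here.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(150, 150, 3)),              # 150x150 RGB input
    layers.Conv2D(32, (3, 3), activation='relu'),  # 32 filters of 3x3
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),  # 64 filters of 3x3
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'), # deeper filter counts assumed
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(1, activation='sigmoid'),         # sigmoid for cat vs dog
])

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()
```

After compiling, the model would be trained with `model.fit(...)` on the cat/dog training images before the visualizations below make sense.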
After training our model, our next objective is to understand how each filter in the convolutional layer extracts specific features from the image. However, before delving into the impact of these filters on the image, let's first visualize the filters themselves. Below, we present the visualization of 6 of the 32 filters in the first Conv2D layer (layer 0), as mentioned in the image above. Each row corresponds to one of the 6 filters being visualized, while the 3 columns correspond to the 3 channels present in each filter. In the visualization below, dark squares indicate small or inhibitory weights, while light squares represent large or excitatory weights.
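The filter visualization can be produced roughly as follows. For a self-contained sketch, an untrained copy of the first layer stands in for the trained model; in practice you would load your trained network instead (e.g. with `keras.models.load_model`, using whatever filename you saved it under).

```python
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras import layers

# Untrained stand-in with the same first layer as the article's network;
# replace with your trained model to reproduce the visualization.
model = keras.Sequential([
    keras.Input(shape=(150, 150, 3)),
    layers.Conv2D(32, (3, 3), activation='relu'),
])

filters, biases = model.layers[0].get_weights()   # filters: (3, 3, 3, 32)
# Rescale weights to 0-1 so dark squares = small/inhibitory weights,
# light squares = large/excitatory weights
filters = (filters - filters.min()) / (filters.max() - filters.min())

n_show = 6  # visualize the first 6 of the 32 filters
fig, axes = plt.subplots(n_show, 3, figsize=(4, 8))
for i in range(n_show):
    for ch in range(3):  # one column per RGB channel
        axes[i, ch].imshow(filters[:, :, ch, i], cmap='gray')
        axes[i, ch].axis('off')
plt.show()
```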
Now, let's proceed to apply the filters to a new image presented to the model. Given that it is a Cat Vs Dog classifier, we have downloaded a picture from the web. Through code, we will preprocess the image to make it suitable for input into our trained model.
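The preprocessing can be sketched as below. The filename `cat.jpg` is a placeholder for the downloaded picture; so that the snippet runs end to end even without a download, a plain gray stand-in image is generated when the file is missing.

```python
from pathlib import Path
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from tensorflow.keras.utils import load_img, img_to_array

# 'cat.jpg' is a placeholder name; create a gray stand-in if the
# downloaded picture is not present, so the snippet runs end to end.
if not Path('cat.jpg').exists():
    Image.new('RGB', (400, 300), (128, 128, 128)).save('cat.jpg')

img = load_img('cat.jpg', target_size=(150, 150))  # resize to the model's input
img_tensor = img_to_array(img)                     # (150, 150, 3)
img_tensor = np.expand_dims(img_tensor, axis=0)    # add batch axis: (1, 150, 150, 3)
img_tensor /= 255.0                                # same rescaling as during training

plt.imshow(img_tensor[0])
plt.axis('off')
plt.show()
```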
We can see the output of the above code as below:
Below is the code snippet to process the output of each layer of our model. For reference, let's inspect the output of the first layer. Since the inspected layer produces 64 feature maps, we create an 8x8 grid to visualize them all. You can analyze any layer of the model by simply changing the layer number.
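A sketch of this step is shown below. It builds a second model that returns the activations of every convolutional and pooling layer at once, then tiles one layer's feature maps in a square grid (8x8 when the layer yields 64 maps). An untrained copy of the architecture and a random tensor stand in for the trained model and the preprocessed cat image, so the snippet is self-contained.

```python
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras import layers

# Untrained stand-in for the article's network (dense head omitted since
# only conv/pool outputs are inspected); load your trained model instead.
model = keras.Sequential([
    keras.Input(shape=(150, 150, 3)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
])

# Stand-in for the preprocessed image tensor from the previous step
img_tensor = np.random.rand(1, 150, 150, 3).astype('float32')

# A model that returns the output of every conv/pooling layer at once
conv_pool = [l for l in model.layers
             if isinstance(l, (layers.Conv2D, layers.MaxPooling2D))]
activation_model = keras.Model(inputs=model.inputs,
                               outputs=[l.output for l in conv_pool])
activations = activation_model.predict(img_tensor)

layer_index = 0                     # change this to inspect a deeper layer (e.g. 7)
maps = activations[layer_index][0]  # (height, width, n_filters)
n_filters = maps.shape[-1]
grid = int(np.ceil(np.sqrt(n_filters)))  # e.g. 8x8 when the layer has 64 maps

fig, axes = plt.subplots(grid, grid, figsize=(12, 12))
for i, ax in enumerate(axes.flat):
    if i < n_filters:
        ax.imshow(maps[:, :, i], cmap='viridis')
    ax.axis('off')
plt.show()
```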
Below is the output grid of the above code:
We can clearly see that applying the filters to the image produces outputs highlighting different areas, which helps the model classify. For instance, the 3rd image traces an outline of the cat, while the 7th highlights its eyes. As already mentioned, the initial layers of a model focus on learning minute details like curves and edges.
If we change the layer number to 7, we can see that this deeper layer focuses on broader, more abstract features.
In this manner, we can step through the CNN layers and observe their effect on the image. Throughout this article, we've illustrated the influence of each CNN layer on the image. Given our emphasis on the layers themselves, we haven't explicitly predicted the outcome of the model; however, you're encouraged to use the model to predict the content of the image. If you're interested in viewing the prediction results, you can access the code for this article in the following GitHub repository.
If you are interested in reading about transfer learning in CNN models, you can refer to my article on the same here.