#11 Finding Nemo: Exploring pre-trained Keras models

Now that we have a foundation on the basic models, let's experiment with them on a classification task: detecting anemone fish (clownfish).

For this task, I explored the following pre-trained models (a short loading-and-prediction sketch follows the list):

  1. VGG16: A convolutional neural network with 16 weight layers (13 convolutional and 3 fully connected).
  2. VGG19: Similar to VGG16 but with 19 weight layers. Its deeper architecture might capture more intricate features in images.
  3. ResNet50: A variant of the ResNet (Residual Network) architecture. It introduces residual connections that allow the network to be much deeper without suffering from the vanishing gradient problem. ResNet50 has 50 layers and is known for its excellent performance in image classification tasks.
  4. Inception V3: Uses convolutional layers with several different filter sizes in parallel (Inception modules) to capture features at various scales in the image.
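
As a rough sketch of how these four models can be loaded with their ImageNet weights and queried in Keras (the image path `clownfish.jpg` is a placeholder, not a file from the notebook), something along these lines works:

```python
import numpy as np
from tensorflow.keras.applications import vgg16, vgg19, resnet50, inception_v3
from tensorflow.keras.preprocessing import image

# (constructor, preprocessing fn, label decoder, expected input size) per model
MODELS = {
    "VGG16":       (vgg16.VGG16,              vgg16.preprocess_input,        vgg16.decode_predictions,        (224, 224)),
    "VGG19":       (vgg19.VGG19,              vgg19.preprocess_input,        vgg19.decode_predictions,        (224, 224)),
    "ResNet50":    (resnet50.ResNet50,        resnet50.preprocess_input,     resnet50.decode_predictions,     (224, 224)),
    "InceptionV3": (inception_v3.InceptionV3, inception_v3.preprocess_input, inception_v3.decode_predictions, (299, 299)),
}

def classify(img_path):
    """Print each model's top-3 ImageNet predictions for one image."""
    for name, (ctor, preprocess, decode, size) in MODELS.items():
        model = ctor(weights="imagenet")                  # downloads weights on first use
        img = image.load_img(img_path, target_size=size)  # resize to the model's input size
        x = preprocess(np.expand_dims(image.img_to_array(img), axis=0))
        preds = model.predict(x)
        print(name, decode(preds, top=3)[0])              # (class_id, label, score) triples

classify("clownfish.jpg")  # placeholder path, not from the notebook
```

ImageNet-1k already contains an anemone fish class, so each model can be used off the shelf for this experiment, with no fine-tuning.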

Before seeing the results, I expected ResNet50 to outperform the others.

Let's have a look at how the experiment went!

Sample images:

Images of fish taken from the internet

Results

Both VGG models (VGG16 and VGG19) are taken from the same paper, "Very Deep Convolutional Networks for Large-Scale Image Recognition" by Simonyan and Zisserman.

The paper explores the effect of increasing depth on accuracy while using very small (3 x 3) convolutional filters. The authors evaluated very deep convolutional networks (up to 19 weight layers) for large-scale image classification and demonstrated that representation depth is beneficial for classification accuracy, and that state-of-the-art performance on the ImageNet challenge dataset can be achieved using a conventional ConvNet architecture with substantially increased depth.

They also showed that their models generalised well to a wide range of tasks and datasets, matching or outperforming more complex recognition pipelines built around less deep image representations.
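
As an illustration (not part of the original notebook), the depth and the 3 x 3 filter stacks are easy to verify directly from the Keras model summaries:

```python
from tensorflow.keras.applications import VGG16, VGG19

# VGG16 has 13 conv + 3 dense weight layers; VGG19 has 16 conv + 3 dense.
# Every conv layer in both networks uses 3x3 filters.
for ctor in (VGG16, VGG19):
    model = ctor(weights="imagenet")
    model.summary()  # lists each layer with its output shape and parameter count
```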

ResNet50

Inception V3

The output for Inception V3 is quite interesting. Firstly, it misclassified a pufferfish as a goldfish. Secondly, there is a visible pattern in its accuracy when detecting fish other than anemone fish.
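
One way to dig into a misclassification like this is to check where the expected class ranks among all 1000 ImageNet predictions. This is a sketch under the assumption that Keras's decoded labels for these classes are "anemone_fish" and "puffer"; the image path is hypothetical:

```python
import numpy as np
from tensorflow.keras.applications import inception_v3
from tensorflow.keras.preprocessing import image

def rank_of_label(img_path, target):
    """Return (rank, score) of `target` among all 1000 ImageNet classes (rank 1 = top prediction)."""
    model = inception_v3.InceptionV3(weights="imagenet")
    img = image.load_img(img_path, target_size=(299, 299))
    x = inception_v3.preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    decoded = inception_v3.decode_predictions(model.predict(x), top=1000)[0]  # sorted by score
    for rank, (_, label, score) in enumerate(decoded, start=1):
        if label == target:
            return rank, score
    return None

# "puffer" is assumed to be the decoded ImageNet label for pufferfish; the path is a placeholder
print(rank_of_label("pufferfish.jpg", target="puffer"))
```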

Discussion

Why VGG19 (possibly) outperformed the others:

  • Depth: VGG19 is deeper than VGG16 (see the comparison sketch after this list). Deeper networks can capture more complex patterns and features, which might be crucial for accurately identifying anemone fish in diverse images.
  • Feature Extraction: The additional layers in VGG19 let it extract more detailed and discriminative features from images, enhancing its ability to distinguish Nemo from other objects and backgrounds.
  • Training Dataset: It's possible that VGG19's architecture and parameters are better suited to the characteristics of ImageNet, the wide-ranging dataset used for pre-training, and therefore transfer better to this task.
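
As a minimal check of the depth argument (illustrative only, not taken from the notebook), the layer and parameter counts of the two VGG variants can be compared directly:

```python
from tensorflow.keras.applications import VGG16, VGG19

for ctor in (VGG16, VGG19):
    model = ctor(weights="imagenet")
    print(f"{model.name}: {len(model.layers)} Keras layers, {model.count_params():,} parameters")
# VGG19 adds three extra 3x3 conv layers over VGG16, giving it more parameters
# and a somewhat deeper feature hierarchy.
```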


GitHub:

https://github.com/RiyaChhikara/100daysofComputerVision/blob/main/Day11_FindingNemo.ipynb
