#11 Finding Nemo: Exploring pre-trained Keras models

Now that we have a foundation on the basic models, let's experiment with them on a classification task: detecting anemone fish (clownfish).

For this task, I explored the following pre-trained models (a short loading-and-prediction sketch follows the list):

  1. VGG16: A convolutional neural network with 16 weight layers (13 convolutional and 3 fully connected).
  2. VGG19: Similar to VGG16 but with 19 weight layers. Its deeper architecture might capture more intricate features in images.
  3. ResNet50: A variant of the ResNet (Residual Network) architecture. It introduces residual connections that allow the network to be much deeper without suffering from the vanishing gradient problem. ResNet50 has 50 layers and is known for its excellent performance in image classification tasks.
  4. Inception V3: Uses convolutional layers with several different filter sizes in parallel (Inception modules) to capture features at various scales in the image.
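
As a rough sketch of how these four models can be loaded with their ImageNet weights and queried in Keras (the image path `clownfish.jpg` is a placeholder, not a file from the notebook), something along these lines works:

```python
import numpy as np
from tensorflow.keras.applications import vgg16, vgg19, resnet50, inception_v3
from tensorflow.keras.preprocessing import image

# (constructor, preprocessing fn, label decoder, expected input size) per model
MODELS = {
    "VGG16":       (vgg16.VGG16,              vgg16.preprocess_input,        vgg16.decode_predictions,        (224, 224)),
    "VGG19":       (vgg19.VGG19,              vgg19.preprocess_input,        vgg19.decode_predictions,        (224, 224)),
    "ResNet50":    (resnet50.ResNet50,        resnet50.preprocess_input,     resnet50.decode_predictions,     (224, 224)),
    "InceptionV3": (inception_v3.InceptionV3, inception_v3.preprocess_input, inception_v3.decode_predictions, (299, 299)),
}

def classify(img_path):
    """Print each model's top-3 ImageNet predictions for one image."""
    for name, (ctor, preprocess, decode, size) in MODELS.items():
        model = ctor(weights="imagenet")                  # downloads weights on first use
        img = image.load_img(img_path, target_size=size)  # resize to the model's input size
        x = preprocess(np.expand_dims(image.img_to_array(img), axis=0))
        preds = model.predict(x)
        print(name, decode(preds, top=3)[0])              # (class_id, label, score) triples

classify("clownfish.jpg")  # placeholder path, not from the notebook
```

ImageNet-1k already contains an anemone fish class, so each model can be used off the shelf for this experiment, with no fine-tuning.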

Before seeing the results, I expected ResNet50 to outperform the others.

Let's have a look at how the experiment went!

Sample images:

Images of fish taken from the internet

Results

Both VGG models (VGG16 and VGG19) are taken from the same paper, "Very Deep Convolutional Networks for Large-Scale Image Recognition" by Simonyan and Zisserman.

The paper explores the effect of increasing depth on accuracy while using very small (3 x 3) convolutional filters. The authors evaluated very deep convolutional networks (up to 19 weight layers) for large-scale image classification and demonstrated that representation depth is beneficial for classification accuracy, and that state-of-the-art performance on the ImageNet challenge dataset can be achieved using a conventional ConvNet architecture with substantially increased depth.

They also showed that their models generalised well to a wide range of tasks and datasets, matching or outperforming more complex recognition pipelines built around less deep image representations.
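
As an illustration (not part of the original notebook), the depth and the 3 x 3 filter stacks are easy to verify directly from the Keras model summaries:

```python
from tensorflow.keras.applications import VGG16, VGG19

# VGG16 has 13 conv + 3 dense weight layers; VGG19 has 16 conv + 3 dense.
# Every conv layer in both networks uses 3x3 filters.
for ctor in (VGG16, VGG19):
    model = ctor(weights="imagenet")
    model.summary()  # lists each layer with its output shape and parameter count
```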

ResNet50

Inception V3

The output for Inception V3 is quite interesting. Firstly, it misclassified a pufferfish as a goldfish. Secondly, there is a visible pattern in its accuracy when detecting fish other than anemone fish.
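
One way to dig into a misclassification like this is to check where the expected class ranks among all 1000 ImageNet predictions. This is a sketch under the assumption that Keras's decoded labels for these classes are "anemone_fish" and "puffer"; the image path is hypothetical:

```python
import numpy as np
from tensorflow.keras.applications import inception_v3
from tensorflow.keras.preprocessing import image

def rank_of_label(img_path, target):
    """Return (rank, score) of `target` among all 1000 ImageNet classes (rank 1 = top prediction)."""
    model = inception_v3.InceptionV3(weights="imagenet")
    img = image.load_img(img_path, target_size=(299, 299))
    x = inception_v3.preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    decoded = inception_v3.decode_predictions(model.predict(x), top=1000)[0]  # sorted by score
    for rank, (_, label, score) in enumerate(decoded, start=1):
        if label == target:
            return rank, score
    return None

# "puffer" is assumed to be the decoded ImageNet label for pufferfish; the path is a placeholder
print(rank_of_label("pufferfish.jpg", target="puffer"))
```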

Discussion

Why VGG19 (possibly) outperformed the others:

  • Depth: VGG19 is deeper than VGG16 (see the comparison sketch after this list). Deeper networks can capture more complex patterns and features, which might be crucial for accurately identifying anemone fish in diverse images.
  • Feature Extraction: The additional layers in VGG19 let it extract more detailed and discriminative features from images, enhancing its ability to distinguish Nemo from other objects and backgrounds.
  • Training Dataset: It's possible that VGG19's architecture and parameters are better suited to the characteristics of ImageNet, the wide-ranging dataset used for pre-training, and therefore transfer better to this task.
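
As a minimal check of the depth argument (illustrative only, not taken from the notebook), the layer and parameter counts of the two VGG variants can be compared directly:

```python
from tensorflow.keras.applications import VGG16, VGG19

for ctor in (VGG16, VGG19):
    model = ctor(weights="imagenet")
    print(f"{model.name}: {len(model.layers)} Keras layers, {model.count_params():,} parameters")
# VGG19 adds three extra 3x3 conv layers over VGG16, giving it more parameters
# and a somewhat deeper feature hierarchy.
```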


GitHub:

https://github.com/RiyaChhikara/100daysofComputerVision/blob/main/Day11_FindingNemo.ipynb
