Image Recognition with Transfer Learning
Abstract
This article explores the concept of transfer learning in machine learning. Transfer learning means taking the knowledge acquired from solving one problem and applying it to a different but related problem. Instead of starting the learning process from scratch for each new task, this technique lets models reuse previously learned representations or features, so they benefit from existing knowledge and reach better performance on the new task.
The full source code is in my GitHub repository.
Introduction
While studying machine learning at Holberton School, I was assigned the task of developing a model to classify the CIFAR-10 dataset. CIFAR-10 is a commonly used dataset consisting of 32x32 color images in 10 classes, with 6,000 images per class. The model must reach a validation accuracy of 87% or higher; that is, on a set of images the model has never seen, it has to classify at least 87% of them correctly. The assignment also involves preprocessing the data for the model: the training images that serve as input to the network, and the labels the model has to learn to predict.
The overall idea is to use another, previously trained model to achieve better results with mine. In other words, I would transfer the learning from another network to my own, gaining its capacity without having to build it from scratch, to achieve satisfactory results.
Materials and Methods
The first thing I did was code the function to preprocess the data. For this I took advantage of the Keras ResNet-50 application, whose preprocessing function normalizes the CIFAR-10 input images the way that network expects. The expected outputs, or labels, should also be in a form the model can easily interpret, so I used the Keras ‘to_categorical’ function to turn the labels into one-hot encoded arrays. Here’s what the preprocess function looks like (TensorFlow’s Keras is imported as K):
def preprocess_data(X, Y):
    # Normalize the images the way ResNet-50 expects its inputs
    X_p = K.applications.resnet50.preprocess_input(X)
    # One-hot encode the labels for the 10 CIFAR-10 classes
    Y_p = K.utils.to_categorical(Y, 10)
    return X_p, Y_p
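Here’s a quick usage sketch (my addition, assuming the standard Keras CIFAR-10 loader) showing the function applied to both splits of the dataset:

(x_train, y_train), (x_test, y_test) = K.datasets.cifar10.load_data()
# Preprocess images and one-hot encode labels for both splits
x_train, y_train = preprocess_data(x_train, y_train)
x_test, y_test = preprocess_data(x_test, y_test)
print(x_train.shape, y_train.shape)  # (50000, 32, 32, 3) (50000, 10)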
Now that the dataset can be preprocessed for training, I began to build the model. The standard approach for image recognition is a convolutional neural network architecture. But first we must make sure the model takes the input matrices at a suitable size, so the first layer of the model must be a resizing layer. Keras provides a ‘Lambda’ layer that allows you to apply a custom function to the input data. In this case, a lambda function defined inline resizes the images with the Keras backend method ‘resize_images’, scaling the input image tensor x by a factor of 2 in both the height and width dimensions, like this:
model.add(K.layers.Lambda(lambda x: K.backend.resize_images(x, 2, 2, 'channels_last'), input_shape=(32, 32, 3)))
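To sanity-check this layer (a small sketch of my own, not part of the original code), you can build a one-layer model and confirm that the output doubles the spatial dimensions:

probe = K.models.Sequential()
probe.add(K.layers.Lambda(
    lambda x: K.backend.resize_images(x, 2, 2, 'channels_last'),
    input_shape=(32, 32, 3)))
# The 32x32 inputs should come out as 64x64, channels unchanged
print(probe.output_shape)  # (None, 64, 64, 3)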
With that in place, I had to bring in another neural network, in this case ResNet-50, which has already been trained to recognize images. The idea is to take advantage of an already trained model for a new task without leaving behind the learning it has already achieved. The name ResNet-50 indicates that the model belongs to the family of Residual Networks and is 50 layers deep, combining convolutional layers with pooling, fully connected layers and shortcut connections. Since the model is so convenient to reuse, I extended it with new layers with the objective of classifying CIFAR-10. I also froze the ResNet-50 layers that were already suitable for my model, since I didn’t want those useful pieces to be modified during training.
This technique of building a new network on top of an already trained one is called transfer learning. This is how I applied ResNet-50 to my network, right after the Lambda layer:
input_t = K.Input(shape=(32, 32, 3))
resnet50 = K.applications.ResNet50(include_top=False, input_tensor=input_t)
# Freeze the first 143 layers so their pretrained weights stay untouched
for layer in resnet50.layers[:143]:
    layer.trainable = False
model.add(K.layers.Lambda(lambda x: K.backend.resize_images(x, 2, 2, 'channels_last'), input_shape=(32, 32, 3)))
model.add(resnet50)
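As a quick sanity check (my addition), you can count how many layers ended up frozen versus trainable:

frozen = sum(1 for layer in resnet50.layers if not layer.trainable)
print(f"{frozen} of {len(resnet50.layers)} ResNet-50 layers are frozen")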
What a convolutional neural network does is take an image as input, represented as a matrix of pixel values, and convolve it to extract important features, such as the shapes within the image. Think of how humans see things: we don’t necessarily need to look at every detail or every color in an image; the silhouette of something is often enough to know what it is. A machine can do the same if you give it the right tools.
I also added some regularization and supporting layers: Batch Normalization for good training speed and stability, and Dropout to help the model generalize to unseen data. As the purpose of this model is classification, I chose the RMSprop optimizer with categorical cross-entropy loss.
Here’s the code of the full architecture and compilation of the model:
model = K.models.Sequential()

input_t = K.Input(shape=(32, 32, 3))
resnet50 = K.applications.ResNet50(include_top=False, input_tensor=input_t)
# Freeze the first 143 layers of the pretrained base
for layer in resnet50.layers[:143]:
    layer.trainable = False

# Upscale the 32x32 inputs to 64x64 before feeding them to ResNet-50
model.add(K.layers.Lambda(lambda x: K.backend.resize_images(x, 2, 2, 'channels_last'), input_shape=(32, 32, 3)))
model.add(resnet50)
model.add(K.layers.Flatten())

# Classifier head: dense blocks with batch normalization and dropout
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(256, activation='relu'))
model.add(K.layers.Dropout(0.5))
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(128, activation='relu'))
model.add(K.layers.Dropout(0.5))
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(64, activation='relu'))
model.add(K.layers.Dropout(0.5))
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(10, activation='softmax'))

model.compile(optimizer=K.optimizers.RMSprop(lr=2e-5),
              loss='categorical_crossentropy', metrics=['accuracy'])
For the training I added a callback that monitors the validation accuracy and saves the best result seen so far. I trained for 15 epochs with a batch size of 512.
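Here is a sketch of that training setup; the checkpoint file name and the exact callback configuration are my assumptions, since the text only states what is monitored:

# Save the weights that achieve the best validation accuracy so far
# (the metric is named 'val_acc' in older Keras versions)
checkpoint = K.callbacks.ModelCheckpoint('cifar10_best.h5',
                                         monitor='val_accuracy',
                                         save_best_only=True,
                                         mode='max')
history = model.fit(x_train, y_train,
                    batch_size=512,
                    epochs=15,
                    validation_data=(x_test, y_test),
                    callbacks=[checkpoint])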
Results
After some hours of training I obtained successful results: roughly 90% validation accuracy, higher than requested. In the end it was an interesting experiment to build my own model on top of a previously trained one. It could likely achieve an even better outcome, since it uses regularizers like Dropout that are excellent at preventing overfitting, so it could safely be trained for several hours longer.
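For reference, the reported figure can be checked with a standard evaluation call (a sketch assuming the preprocessed test split from earlier serves as the validation set):

loss, acc = model.evaluate(x_test, y_test, batch_size=512)
print(f"validation accuracy: {acc:.2%}")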
Discussion
This is an excellent showcase of how powerful and creative machine learning models can be built, and the best part is that you don’t need to create everything from scratch. Even my model could serve as the starting point for other tasks more complex than classifying CIFAR-10.