Creating a CNN Model for Image Classification with TensorFlow
Shanza Khan
Artificial neural networks are artificial intelligence models inspired by the functioning of the human brain. They consist of interconnected neurons that receive input data, process it, and produce output. This structure is organized in layers: each layer processes the raw data it receives, extracting more meaningful information that helps solve the problem at hand. An artificial neural network consists of three types of layers: the Input Layer, the Hidden Layer, and the Output Layer.
Input Layer: The layer through which data enters the neural network. No changes are made to the input data in this layer; the values of the input observations are passed on to the next layer, the Hidden Layer.
Hidden Layer: The learning process takes place in this layer. The information received is processed here.
Output Layer: The layer where the network's result appears. What the network should learn from the input data is compared with the values it has actually learned, and once these operations are completed, the result of the process emerges in the output layer.
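As a minimal sketch of this three-layer structure in Keras (the layer sizes here are arbitrary and chosen only for illustration):
from tensorflow.keras import layers, models

toy_net = models.Sequential([
    layers.Input(shape=(4,)),              # Input Layer: passes the 4 input values through unchanged
    layers.Dense(8, activation="relu"),    # Hidden Layer: where the learning takes place
    layers.Dense(3, activation="softmax")  # Output Layer: produces the final result
])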
In this article, I discuss the structure of Convolutional Neural Networks (CNNs) and the steps for creating a CNN model.
A Convolutional Neural Network (CNN) is used in the field of deep learning for tasks such as image recognition, image classification, and object detection. The CNN algorithm, which classifies images by capturing the features that distinguish them, consists of several different layers.
Let’s examine the functions of the CNN layers:
Convolution Layer: This layer is where features are detected. It picks out the parts of an image that distinguish it from other images. Filtering operations are performed on the image in this layer.
Pooling Layer: It is used to reduce the size of the image while retaining its important features. Typically, operations such as max pooling or average pooling are used to preserve the most significant features. This way, a large image, for example, is represented with far fewer pixels, which reduces computational complexity and speeds up processing: the pooling layer shrinks the data while keeping the features that matter most.
Flattening Layer: It reshapes the incoming data to prepare it for the neural network: the feature-map matrix is converted into a one-dimensional vector.
Fully-Connected Layer: This is the step where the artificial neural network model itself is built on top of the extracted features. A short sketch of these layer types follows below.
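To make these roles concrete, the sketch below passes one dummy image through each layer type and prints how the shape changes (the filter counts are illustrative, not the final model):
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.uniform((1, 32, 32, 3))                # one dummy 32x32 RGB image
x = layers.Conv2D(32, (3, 3), activation="relu")(x)
print(x.shape)                                        # (1, 30, 30, 32): features detected by 32 filters
x = layers.MaxPooling2D((2, 2))(x)
print(x.shape)                                        # (1, 15, 15, 32): spatial size halved
x = layers.Flatten()(x)
print(x.shape)                                        # (1, 7200): the matrix becomes a vector
x = layers.Dense(10, activation="softmax")(x)
print(x.shape)                                        # (1, 10): one score per class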
Let’s look at the steps to create a CNN algorithm using a dataset.
1. Get to Know the Dataset
In practice, we will use the CIFAR-10 dataset, which is available on the Kaggle platform. CIFAR-10 is a popular dataset in the field of computer vision.
The CIFAR-10 dataset consists of 60,000 color images with a resolution of 32x32 pixels, divided into 10 different classes. The training set contains 50,000 images, while the test set contains 10,000 images.
The 10 classes are Plane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, and Truck.
TensorFlow and Keras are popular libraries for creating and training deep learning and artificial intelligence models, and they are often used together. Let’s install them (note that TensorFlow 2 already bundles Keras as tensorflow.keras, so the separate Keras install is optional).
pip install tensorflow
pip install keras
Let’s import the libraries we will use in the project.
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
Let’s load the dataset. The CIFAR-10 dataset is available within Keras. “xtrain” and “ytrain” contain the training images and labels, while “xtest” and “ytest” contain the test images and labels.
(xtrain, ytrain), (xtest, ytest) = datasets.cifar10.load_data()
ytest = ytest.reshape(-1,)
ytest
The expression “ytest.reshape(-1,)” reshapes the “ytest” variable into a one-dimensional array; the -1 tells NumPy to infer the dimension automatically. This way, the label data is formatted appropriately.
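A quick shape check illustrates the effect (the zeros array is a stand-in with the same shape as the CIFAR-10 test labels):
import numpy as np

labels = np.zeros((10000, 1), dtype=int)  # labels load as a column vector
print(labels.shape)                        # (10000, 1)
print(labels.reshape(-1,).shape)           # (10000,)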
Classifications are present in the dataset as numerical values, so let’s name them. Then let’s define a function to visualize a specific data point and its corresponding label.
classname = ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"]
def example(x, y, index):
    # x: dataset containing the images
    # y: dataset containing the labels; specifies the class of each image
    # index: index of the data point to be visualized
    plt.figure(figsize=(15, 2))
    plt.imshow(x[index])
    plt.xlabel(classname[y[index]])
example(xtest, ytest, 8)
The image at index 8 of the “xtest” dataset and its label from the “ytest” dataset will be visualized, and the output will be displayed.
2. Normalization Operations
Let’s set up the normalization steps and the layer properties of the neural network. To create a CNN (Convolutional Neural Network) model, we will use Keras’s Sequential model. The model we create is a deep learning model designed to classify the provided 32x32 color images.
It is a common practice to scale the pixel values of images in the dataset to the range [0, 1]. With the following code, we normalize the pixel values of the images, which are between 0 and 255, to values between 0 and 1.
xtrain = xtrain / 255
xtest = xtest / 255
from tensorflow.keras.layers import Input

model = models.Sequential([
    Input(shape=(32, 32, 3)),  # the input layer is defined here
    layers.Conv2D(filters=32, kernel_size=(3, 3), activation='relu'),
    layers.MaxPooling2D(2, 2),
    layers.Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
    layers.MaxPooling2D(2, 2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax")
])
The Conv2D layer extracts features from the images. The ReLU (Rectified Linear Unit) activation function is used, which sets the layer’s outputs to zero for negative input values and passes positive values through unchanged.
The MaxPooling2D layer is applied as the second step. A 2x2 max pooling operation reduces the size of the image by taking the maximum pixel value from each 2x2 pixel block, which helps preserve the important features.
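As a small standalone illustration (separate from the model code), here is 2x2 max pooling applied to a 4x4 feature map with NumPy:
import numpy as np

fmap = np.array([[1, 3, 2, 0],
                 [5, 6, 1, 2],
                 [7, 2, 9, 4],
                 [0, 8, 3, 1]])

# Keep only the maximum of each non-overlapping 2x2 block.
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6 2]
#  [8 9]]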
The same Conv2D and MaxPooling2D pair appears twice in the model. Stacking these operations transforms the features into higher-level features and further reduces the size.
The Flatten layer converts all feature maps into a one-dimensional vector. This allows the CNN to transition to the dense layers.
Dense layers come after the Flatten layer: first a dense layer with 64 neurons and a ReLU activation function, then another dense layer with 10 neurons for classification. The outputs of this last layer are computed with a softmax activation function, which gives the probabilities of the classes. For a binary classification problem, a sigmoid activation function could be used instead.
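To verify the architecture, model.summary() prints each layer’s output shape and parameter count; the shapes below follow from the 32x32x3 input (the exact layer names may differ between runs):
model.summary()
# Conv2D       -> (None, 30, 30, 32)
# MaxPooling2D -> (None, 15, 15, 32)
# Conv2D       -> (None, 13, 13, 64)
# MaxPooling2D -> (None, 6, 6, 64)
# Flatten      -> (None, 2304)
# Dense        -> (None, 64)
# Dense        -> (None, 10)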
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
“model.compile” configures the model for training. In the application, the Adam optimization algorithm is used as the optimizer. The “loss” parameter measures how far the model’s predictions are from the actual labels. The “accuracy” metric calculates the rate of correct classifications and measures the classification performance of the model.
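“sparse_categorical_crossentropy” is chosen because the CIFAR-10 labels are plain integers (0–9) rather than one-hot vectors. A quick standalone check of how it scores a single prediction:
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
y_true = [2]                                                    # integer label, no one-hot encoding needed
y_pred = [[0.1, 0.1, 0.7, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]  # softmax-style probabilities
print(loss_fn(y_true, y_pred).numpy())                          # ~0.357, i.e. -log(0.7)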
3. Training the Model
The length of training is specified with epochs. An epoch is one complete pass of the model over the entire training dataset. For example, 30 epochs mean the model will go through the training dataset 30 times.
history = model.fit(xtrain, ytrain, epochs=30, validation_data=(xtest, ytest))
“model.evaluate” computes the loss value and the specified metric values of the model on the provided test dataset and labels. You can create the Loss and Accuracy charts with the following code.
loss, acc = model.evaluate(xtest, ytest, verbose=0)

# Draw both curves side by side in a single figure
plt.figure(figsize=(20, 5))
plt.subplot(1, 2, 1)
plt.plot(history.history["accuracy"], color="b", label="Training Accuracy")
plt.plot(history.history["val_accuracy"], color="r", label="Validation Accuracy")
plt.legend(loc="lower right")
plt.xlabel("Epoch", fontsize=16)
plt.ylabel("Accuracy", fontsize=16)
plt.ylim([min(plt.ylim()), 1])
plt.title("Training and Test Performance Graph", fontsize=16)
plt.subplot(1, 2, 2)
plt.plot(history.history["loss"], color="b", label="Training Loss")
plt.plot(history.history["val_loss"], color="r", label="Validation Loss")
plt.legend(loc="upper right")
plt.xlabel("Epoch", fontsize=16)
plt.ylabel("Loss", fontsize=16)
plt.ylim([0, max(plt.ylim())])
plt.title("Training and Test Loss Graph", fontsize=16)
plt.show()
ypred = model.predict(xtest)
ypred[:3]
The “np.argmax” function returns the index of the largest element in an array; in this case, it returns the index of the class with the highest probability.
ypred1 = [np.argmax(element) for element in ypred]
ypred1[:3]
Let’s assess how close the model’s predictions are to the actual data in order to evaluate the model’s performance. If “ytest” and “ypred1” largely agree, it can be said that the model performs well.
# Compare the first three true classes with the predicted classes
for true, pred in zip(ytest[:3], ypred1[:3]):
    print("True Class:", classname[true], "\tPredicted Class:", classname[pred])