Image Classification Using SVM: A Quick Reference for Beginners

Image classification using an SVM is an efficient way to build a model, and it is a rarely used approach for image processing and modelling.

Tips for using SVM for image classification

  • Your image data should be 2-D rather than 4-D: scikit-learn's SVM accepts inputs with at most two dimensions, so the images have to be flattened (a quick sketch follows this list, and the full conversion is shown later in this notebook).
  • An SVM is a good choice when there is a shortage of data in the dataset.
  • If we have a good amount of image data, a CNN model is usually the better option.
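
For example, a batch of images stored as a 4-D NumPy array of shape (num_images, height, width, channels) can be flattened to 2-D in one call. A minimal sketch with made-up shapes (the real conversion for this dataset is done later):

import numpy as np

# Hypothetical batch: 10 RGB images of 100x100 pixels -> shape (10, 100, 100, 3)
images = np.zeros((10, 100, 100, 3))

# Flatten each image into a single row -> shape (10, 30000)
flat = images.reshape(len(images), -1)
print(flat.shape)  # (10, 30000)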

Info about the dataset

The dataset is named 'Color Classification' and was created by Aydin Ayanzadeh. We are provided with images of different colors, labelled with color names such as red, blue, etc. Link: https://www.kaggle.com/ayanzadeh93/color-classification

Importing the dataset

In [1]:

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

Importing the basic packages

In [2]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import cv2
import os
from tqdm import tqdm

Now we assign the data directory to the DATADIR variable and the color labels to the CATEGORIES variable for later use.

In [3]:

DATADIR = '../input/color-classification/ColorClassification'
CATEGORIES = ['orange','Violet','red','Blue','Green','Black','Brown','White']
IMG_SIZE = 100  # every image will be resized to 100x100 pixels

An example image from the dataset is shown below.

In [4]:

# Show the first image of the first category as a quick sanity check
for category in CATEGORIES:
    path = os.path.join(DATADIR, category)
    for img in os.listdir(path):
        img_array = cv2.imread(os.path.join(path, img))
        plt.imshow(img_array)
        plt.show()
        break  # only the first image of this category
    break      # only the first category
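
One caveat: cv2.imread loads images in BGR channel order, while matplotlib expects RGB, so the colors in the plot above can look swapped. The fix is a one-line conversion (optional for training, since every image gets the same consistent ordering either way):

# Convert OpenCV's BGR ordering to RGB before plotting
img_rgb = cv2.cvtColor(img_array, cv2.COLOR_BGR2RGB)
plt.imshow(img_rgb)
plt.show()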


Performing the preprocessing steps:

In [5]:

training_data = []
def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)  # label = index in CATEGORIES
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img))
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                training_data.append([new_array, class_num])
            except Exception:
                pass  # skip unreadable or corrupt images
create_training_data()

In [6]:

print(len(training_data))
107

Storing the number of training images for later use.

In [7]:

lenofimage = len(training_data)

Before training, we have to convert the images to an array so the model can train on them.

X must be reshaped to (training_data_length, -1) because the SVM takes 2-D input.

In [8]:

X = []
y = []

# Split each [image, label] pair into features (X) and labels (y)
for features, label in training_data:
    X.append(features)
    y.append(label)

# Flatten each 100x100x3 image into one row of 30,000 values
X = np.array(X).reshape(lenofimage, -1)
# Alternative normalization (unused): X = tf.keras.utils.normalize(X, axis=1)

In [9]:

X.shape

Out[9]:

(107, 30000)

Scaling the pixel values from [0, 255] down to [0, 1] (each row is already a flattened image of 100 × 100 × 3 = 30,000 values):

In [10]:

X = X/255.0

Example of a flattened, scaled image vector:

In [11]:

X[1]

Out[11]:

array([1., 1., 1., ..., 1., 1., 1.])

Note: y must also be converted to a NumPy array.

In [12]:

y=np.array(y)

In [13]:

y.shape

Out[13]:

(107,)

Now that the independent features (X) and dependent labels (y) are ready, it is time for modelling.

Applying train_test_split to our data

In [14]:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y)
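
With only 107 images, the default 75/25 split can leave a class with no test samples at all (class 0 ends up with zero support in the report below). A stratified, seeded split is a reasonable alternative, assuming every class has at least a couple of images:

# stratify keeps the class proportions similar in both halves;
# random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)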

Fitting the SVM model to our data

In [15]:

from sklearn.svm import SVC

# gamma only affects non-linear kernels, so it is ignored by 'linear'
svc = SVC(kernel='linear', gamma='auto')
svc.fit(X_train, y_train)

Out[15]:

SVC(C=1.0, break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto', kernel='linear',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)
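
The linear kernel with the default C=1.0 is just a starting point. If you want to tune the model, a small grid search is the usual next step. A sketch (the parameter values here are illustrative):

from sklearn.model_selection import GridSearchCV

# Try a few kernels and regularization strengths; cv=3 keeps each
# fold reasonably sized on this small dataset
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
grid = GridSearchCV(SVC(), param_grid, cv=3)
grid.fit(X_train, y_train)
print(grid.best_params_, grid.best_score_)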

Predicting labels for X_test

In [16]:

y2 = svc.predict(X_test)

In [17]:

from sklearn.metrics import accuracy_score
print("Accuracy on unknown data is",accuracy_score(y_test,y2))
Accuracy on unknown data is 0.6666666666666666

An accuracy of about 66.7% on unseen data, which is a reasonable baseline for an 8-class problem trained on only 107 images.
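
With only 27 test images, a single split is a noisy estimate: one image moves the score by almost 4 percentage points. Cross-validation over the whole dataset gives a steadier number. A sketch:

from sklearn.model_selection import cross_val_score

# 5-fold cross-validation: fit and score on 5 different splits
scores = cross_val_score(SVC(kernel='linear'), X, y, cv=5)
print("Mean CV accuracy:", scores.mean())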

Generating the classification report

In [18]:

from sklearn.metrics import classification_report
print(classification_report(y_test, y2))
              precision    recall  f1-score   support

           0       0.00      0.00      0.00         0
           1       0.57      0.67      0.62         6
           2       1.00      0.60      0.75         5
           3       1.00      0.40      0.57         5
           4       1.00      1.00      1.00         1
           5       0.00      0.00      0.00         1
           6       0.80      0.80      0.80         5
           7       0.80      1.00      0.89         4

    accuracy                           0.67        27
   macro avg       0.65      0.56      0.58        27
weighted avg       0.80      0.67      0.70        27

/opt/conda/lib/python3.7/site-packages/sklearn/metrics/_classification.py:1272: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
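
The warning appears because the model predicted class 0 ('orange') for some test images even though no true class-0 samples landed in the test set. As the message suggests, the zero_division parameter controls this, and passing the label names makes the report easier to read:

# zero_division=0 reports 0.0 instead of warning for undefined metrics;
# labels/target_names map class indices back to the color names
print(classification_report(
    y_test, y2,
    labels=list(range(len(CATEGORIES))),
    target_names=CATEGORIES,
    zero_division=0))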

In [19]:

result = pd.DataFrame({'original' : y_test,'predicted' : y2})

In [20]:

result

Out[20]:

    original  predicted
0          5          1
1          7          7
2          1          1
3          6          0
4          1          1
5          2          2
6          4          4
7          1          5
8          1          1
9          3          1
10         7          7
11         2          2
12         3          3
13         2          2
14         7          7
15         2          6
16         3          3
17         1          7
18         1          1
19         3          1
20         6          6
21         6          6
22         6          6
23         7          7
24         6          6
25         2          0
26         3          5
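
The numeric labels are just indices into CATEGORIES, so mapping them back to color names makes the table easier to read:

# Replace class indices with their color names for readability
result['original'] = result['original'].map(lambda i: CATEGORIES[i])
result['predicted'] = result['predicted'].map(lambda i: CATEGORIES[i])
result.head()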

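To close the loop, here is how the trained model could classify a brand-new image. A minimal sketch ('new_image.jpg' is a hypothetical path; the image goes through the same resize / flatten / scale steps as the training data):

# Preprocess a new image exactly like the training data:
# read -> resize to 100x100 -> flatten to one row -> scale to [0, 1]
img = cv2.imread('new_image.jpg')  # hypothetical file
img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
sample = img.reshape(1, -1) / 255.0

pred = svc.predict(sample)[0]
print("Predicted color:", CATEGORIES[pred])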

We classified most of the images correctly. Classification on a limited dataset is always a challenging task, but the SVM handled it reasonably well.

