SVM (Support Vector Machine)
An internet search for the top Machine Learning algorithms will list SVM in the top five (if not the top three) of most listings. One reason is that SVM can be used as a classification as well as a regression algorithm, and it handles both linear and non-linear problems.
What makes it so powerful is the flexibility of its tuning parameters, such as the Kernel, Cost and Gamma. Tuning them helps in overcoming the over-fitting problem and makes the model robust to outliers.
Introduction:
In the SVM algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate. We then perform classification by finding the hyper-plane that best separates the classes. As the first diagram shows, all three lines divide the two categories well, but (as shown in the second diagram) we choose the line that gives the maximum margin. The distance between the positive and negative hyperplanes is called the margin, and SVM selects the separator that maximizes it.
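As a minimal sketch of this idea (assuming Python with scikit-learn, whose SVC class exposes the Kernel, Cost and Gamma parameters discussed below), the snippet fits a linear SVM to a toy 2-D dataset and reports the support vectors that sit on the margin:

```python
# Minimal sketch: fit a linear SVM on a toy 2-D dataset (assumes scikit-learn).
import numpy as np
from sklearn.svm import SVC

# Two small clusters, one per class.
X = np.array([[1, 2], [2, 3], [2, 1], [6, 5], [7, 7], [6, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# Only the support vectors (points on the margin boundaries)
# determine the separating hyperplane.
print("Support vectors:\n", clf.support_vectors_)

# For a linear kernel, the margin width is 2 / ||w||.
w = clf.coef_[0]
print("Margin width:", 2 / np.linalg.norm(w))
```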
Detailing:
- Underlying Model - To use SVM for regression, we use SVR (Support Vector Regression) instead of the SVM classifier, as sketched after this list.
- Tuning - There are primarily three parameters for model tuning: Kernel, Cost and Gamma.
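A rough illustration of the regression variant (again assuming scikit-learn, where the class is called SVR) on a noisy sine curve:

```python
# Sketch: SVR, the regression counterpart of SVC (assumes scikit-learn).
import numpy as np
from sklearn.svm import SVR

# A noisy sine curve as a toy regression target.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 40)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 40)

reg = SVR(kernel="rbf", C=1.0, gamma="scale")
reg.fit(X, y)
print(reg.predict([[2.5]]))  # should be close to sin(2.5) ≈ 0.6
```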
2.1 Kernel Function:
It gives the option to use a non-linear decision boundary. The kernel options available for tuning include "linear", "poly" and "rbf"; "rbf" and "poly" are useful for a non-linear hyper-plane, and the default value is "rbf". The figures below clearly demarcate the difference between linear and non-linear kernels.
The kernel helps to transform the data into a higher dimension, for example from 1D to 2D or from 2D to 3D. The two diagrams below show these scenarios. If the data is in, say, n dimensions, the SVM lifts it into a higher dimension (n+1 or more) where a separating hyper-plane can be found.
It is tough to visualize higher-dimensional graphs; the figure below attempts this to some extent.
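To see the kernel choice in action, here is a small comparison (using scikit-learn's make_circles as an illustrative toy dataset; it generates two concentric rings that no straight line can separate):

```python
# Sketch: linear vs. RBF kernel on non-linearly separable data
# (assumes scikit-learn).
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not separable by any straight line.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X, y)
    print(kernel, "training accuracy:", clf.score(X, y))
# The linear kernel struggles; the RBF kernel separates the rings easily.
```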
2.2 Gamma Parameter:
It defines how far the influence of a single training example reaches.
2.2.1 High Gamma: The higher the value of Gamma, the closer the reach, implying that only the points close to the demarcation line influence it. With a high gamma, the model tries to fit the training data set exactly, which increases generalization error and causes over-fitting.
2.2.2 Low Gamma: The lower the value of Gamma, the farther the reach, implying that far-away points also influence the line. As the figure suggests, the pull of distant points makes the boundary smoother and closer to linear.
The comparison below gives a good gist of the implications of varying the Gamma parameter.
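A quick numeric comparison (assuming scikit-learn's RBF SVC, with make_moons as an illustrative noisy toy dataset) shows the over-fitting pattern described above:

```python
# Sketch: effect of gamma on train vs. test accuracy (assumes scikit-learn).
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for gamma in (0.1, 1, 100):
    clf = SVC(kernel="rbf", gamma=gamma).fit(X_tr, y_tr)
    print(f"gamma={gamma}: train={clf.score(X_tr, y_tr):.2f}, "
          f"test={clf.score(X_te, y_te):.2f}")
# A very high gamma fits the training set almost perfectly while the
# test score drops -- the over-fitting described above.
```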
2.3 Cost Parameter 'C':
This parameter trades off getting every training point classified correctly against getting the points it does classify 'very' right (i.e., with a wide margin).
- With a high C, the demarcation hyperplane tries to classify every training point correctly, even at the cost of a narrow margin.
- With a medium C, the aim is a larger separation between the classes, even if training accuracy decreases a bit.
- With a low C, the priority is maximizing the margin.
The Cost parameter C is used for error control: it governs the trade-off between a smooth decision boundary and classifying the training points correctly.
A low C makes the decision surface smooth, while a high C aims at classifying all training examples correctly by giving the model freedom to select more samples as support vectors.
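The same toy setup (make_moons data again, assuming scikit-learn) makes the trade-off visible:

```python
# Sketch: effect of C on the strictness of the decision boundary
# (assumes scikit-learn).
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=0)

for C in (0.01, 1, 100):
    clf = SVC(kernel="rbf", C=C).fit(X, y)
    print(f"C={C}: support vectors={clf.n_support_.sum()}, "
          f"train accuracy={clf.score(X, y):.2f}")
# As C grows, training accuracy climbs because the boundary bends to
# fit every point; the support-vector counts show the margin changing.
```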
These model tuning parameters make SVM a very powerful Machine Learning algorithm.
Steps to Implement an SVM:
1. Split the data into training and test datasets
2. Fit the SVM to the training dataset
3. Predict on the test dataset
4. Evaluate the model
5. Perform model tuning based on the above parameters (all five steps are sketched in code below)
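A sketch of the five steps end to end (assuming scikit-learn and its built-in iris dataset as a stand-in for real data; the grid of tuning values is illustrative, not prescriptive):

```python
# Sketch: the five implementation steps (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# 1. Split the data into training and test sets.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# 2. Fit an SVM to the training set.
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)

# 3. Predict on the test set.
y_pred = clf.predict(X_te)

# 4. Evaluate the model.
print("Test accuracy:", accuracy_score(y_te, y_pred))

# 5. Tune kernel, C and gamma via cross-validated grid search.
grid = GridSearchCV(SVC(), {"kernel": ["linear", "rbf"],
                            "C": [0.1, 1, 10],
                            "gamma": ["scale", 0.1, 1]})
grid.fit(X_tr, y_tr)
print("Best parameters:", grid.best_params_)
```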
To implement SVM in R, use the e1071 package.
Applications of SVM:
1. Spam Classification
2. Text Categorization
3. Handwriting Recognition
4. Image Classification
5. Speaker Identification
6. Face Detection
7. Bioinformatics
Disadvantages:
- In SVM, we do not get a view of each feature's weight or its individual impact on the model, so the feature influence is a black box
- Since the final model is not easy to interpret, we cannot make small calibrations to it, making it tough to incorporate business logic. This issue is aggravated because we cannot see the impact of the weights
- SVM also does not perform well when the data set has a lot of noise