SVM (Support Vector Machine)

An internet search for the top Machine Learning algorithms will list SVM in the top five (if not the top three) of most listings. The reason is that SVM can be used for classification as well as regression, and it handles both linear and non-linear problems.

What makes it so flexible is the control it gives over its tuning parameters. Tuning the kernel, cost and gamma parameters helps overcome the over-fitting problem and makes the model robust to outliers.

Introduction:

[Figure: candidate separating lines between two classes, and the maximum-margin hyper-plane]

In the SVM algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate. We then perform classification by finding the hyper-plane that best separates the classes. As the first diagram reflects, all three lines divide the two categories well; but, as the second diagram shows, we choose the line that gives the maximum margin. The margin is the distance between the positive and negative margin hyper-planes, and SVM selects the separator that maximizes it.
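As a small sketch of the maximum-margin idea (assuming scikit-learn is available; the toy data here is made up for illustration): for a linear kernel the margin width can be read off the fitted model as 2/||w||.

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated classes in 2-D (n = 2 features).
X = np.array([[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1000.0)   # large C approximates a hard margin
clf.fit(X, y)

w = clf.coef_[0]                       # normal vector of the separating hyper-plane
margin = 2 / np.linalg.norm(w)         # distance between the two margin hyper-planes
print(round(margin, 2))
```

Among all hyper-planes that separate the training data perfectly, SVM returns the one with the largest such margin.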

Detailing:

  1. Underlying Model - To use SVM for regression we use SVR (Support Vector Regression) instead of the classifier. 
  2. Tuning - There are primarily three parameters for model tuning - the kernel, the cost parameter C and gamma.
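The regression counterpart in point 1 can be sketched as follows (a minimal example assuming scikit-learn; the sine-curve data is illustrative). Note that SVR exposes the same tuning knobs as the classifier:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 40)).reshape(-1, 1)
y = np.sin(X).ravel()                      # smooth target to regress

reg = SVR(kernel="rbf", C=100, gamma=0.5)  # same kernel/C/gamma parameters
reg.fit(X, y)
print(round(reg.score(X, y), 2))           # R^2 on the training data
```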

2.1 Kernel Function:

The kernel function gives the option of a non-linear decision boundary. The options available for tuning the kernel include "linear", "poly" and "rbf"; "rbf" and "poly" are useful for a non-linear hyper-plane. In scikit-learn the default is "rbf". The figures below demarcate the difference between linear and non-linear kernels.

[Figure: linear vs non-linear kernel decision boundaries]
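The kernel is simply a constructor argument, so the options above are easy to compare. A sketch (assuming scikit-learn; the concentric-circles dataset is an illustrative choice, not from the article): on data that no straight line can split, "rbf" succeeds where "linear" cannot.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not separable by any straight line in 2-D.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

acc = {}
for kernel in ("linear", "poly", "rbf"):
    acc[kernel] = SVC(kernel=kernel).fit(X, y).score(X, y)
    print(kernel, round(acc[kernel], 2))
```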

The kernel in effect transforms the data into a higher dimension - from 1D to 2D, or 2D to 3D. The two diagrams below show these scenarios: if the data is in n dimensions, the SVM maps it into n+1 (or more) dimensions, where a linear separator may exist.

[Figures: projecting data into a higher dimension to make it linearly separable]
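The 1D-to-2D transformation can be made concrete with a toy feature map (plain NumPy; the points and the map phi(x) = (x, x²) are illustrative, not from the article):

```python
import numpy as np

# Class 0 sits between the class-1 points: no single threshold works in 1-D.
x = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([1, 1, 0, 0, 1, 1])

# Lift to 2-D with the feature map phi(x) = (x, x^2).
phi = np.column_stack([x, x ** 2])

# In the lifted space, the horizontal line x^2 = 2.5 is a linear separator.
pred = (phi[:, 1] > 2.5).astype(int)
print(bool((pred == y).all()))   # True
```

A kernel such as "poly" or "rbf" performs this kind of lift implicitly, without ever computing the higher-dimensional coordinates.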

It is tough to visualize higher-dimensional graphs; the figure below attempts this to some extent.

[Figure: a higher-dimensional decision surface]

2.2 Gamma Parameter:

Gamma defines how far the influence of a single training example reaches.

[Figure: effect of the gamma parameter]

2.2.1 High Gamma: The higher the gamma, the closer the reach - only points near the demarcation line have an influence on it. With a very high gamma, the model tries to fit the training data set exactly, which hurts generalization and causes the over-fitting problem.

[Figure: high gamma - the boundary tightly fits the training points]

2.2.2 Low Gamma: The lower the gamma, the farther the reach - points far from the demarcation line also influence it. As the figure implies, the influence of far-away points makes the boundary smoother and closer to linear.

The comparison below gives a good gist of the implications of varying the gamma parameter.

[Figure: decision boundaries for low, medium and high gamma]
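The gamma trade-off can also be seen numerically. A sketch (assuming scikit-learn; the two-moons dataset and the gamma values are illustrative choices): an extreme gamma scores near-perfectly on training data but worse on held-out data.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Noisy two-moons data: some class overlap, so memorizing it is a mistake.
X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for gamma in (0.1, 1, 1000):
    clf = SVC(kernel="rbf", gamma=gamma).fit(X_tr, y_tr)
    scores[gamma] = (clf.score(X_tr, y_tr), clf.score(X_te, y_te))
    print(gamma, scores[gamma])   # (train accuracy, test accuracy)
```

The gap between train and test accuracy at gamma = 1000 is the over-fitting described above.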

2.3 Cost Parameter 'C':

This parameter controls the trade-off between the SVM model getting everything right and getting the things it does classify 'very' right (i.e. with a wide margin).

  • With a high C, the demarcation hyper-plane tries to classify every training point correctly, even at the cost of a narrow margin.
  • With a medium C, the aim is a larger separation between the classes even if accuracy decreases a bit.
  • With a low C, the priority is maximizing the margin, tolerating more misclassified training points.
[Figure: decision boundaries for low, medium and high C]

The Cost Parameter C is used for error control: it governs the trade-off between a smooth decision boundary and classifying the training points correctly.

A low C makes the decision surface smooth, while a high C aims at classifying all training examples correctly, giving the model freedom to select more samples as support vectors.
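One way to see the C trade-off numerically (a sketch assuming scikit-learn; the overlapping-blobs data is illustrative): with a linear kernel the margin width is 2/||w||, and it shrinks as C grows.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping blobs, so a perfect separation is impossible.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=3.0, random_state=0)

margin = {}
for C in (0.01, 100):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin[C] = 2 / np.linalg.norm(clf.coef_[0])  # width between margin planes
    print(C, round(margin[C], 3))
```

The low-C model keeps a wide, smooth margin at the price of training errors; the high-C model shrinks the margin to chase individual training points.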

[Figure: low C (smooth boundary) vs high C (strict boundary)]

These Model Tuning Parameters make SVM a very powerful Machine Learning Algorithm.

Steps to Implement a SVM:

1. Split the data into Train and Test DataSet

2. Fit SVM to the Training DataSet

3. Predict on the Test DataSet

4. Do Model Evaluation

5. Perform Model Tuning based on the above parameters
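The five steps above can be sketched end-to-end (assuming scikit-learn; the Iris dataset, the 25% test split and the parameter grid are illustrative choices, not from the article):

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# 1. Split the data into Train and Test DataSet
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# 2. Fit SVM to the Training DataSet, and
# 5. Perform Model Tuning over kernel, C and gamma via grid search
grid = GridSearchCV(SVC(), {"kernel": ["linear", "rbf"],
                            "C": [0.1, 1, 10],
                            "gamma": ["scale", 0.1, 1]})
grid.fit(X_tr, y_tr)

# 3. Predict on the Test DataSet
y_pred = grid.predict(X_te)

# 4. Do Model Evaluation
acc = accuracy_score(y_te, y_pred)
print(grid.best_params_, round(acc, 2))
```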

To implement SVM in R, use the e1071 package.

Applications of SVM:


1. Spam Classification

2. Text Categorization

3. Handwriting Recognition

4. Image Classification

5. Speaker Identification

6. Face Detection

7. Bioinformatics

Disadvantages:

  • In SVM (particularly with non-linear kernels), we cannot directly see each variable/feature's weightage or its individual impact on the model, so feature influence is a black box
  • Since the final model is not easy to inspect, we cannot make small calibrations to it, so it is tough to incorporate our business logic. This issue is aggravated by not being able to see the feature weights
  • SVM also doesn't perform well when the data set is noisy, i.e. when the classes overlap
