Introduction to Support Vector Machines (SVM)
(Figure: the optimal separating hyperplane, from https://docs.opencv.org/_images/optimal-hyperplane.png)

Support Vector Machines are supervised learning models with associated algorithms that analyze data and recognize patterns. They are used for both classification and regression.

Support Vector Machines (SVM) Introductory Overview

Support Vector Machines are based on the concept of decision hyperplanes that define decision boundaries. A decision plane is one that separates a set of objects belonging to different classes. An illustrative example is given below.

Let's take a look at a simple schematic example where every object belongs to either class G or class R. The separating line defines a decision boundary: every object to its right belongs to class G, and every object to its left belongs to class R. Any new object falling to the right of the line is classified as G, and any falling to the left is classified as R.

For a two-class, separable training data set, there are many possible linear separators; intuitively, a decision boundary drawn down the middle of the gap between the two classes seems best. While some learning methods such as the perceptron algorithm find just any linear separator, others, like Naive Bayes, search for the best linear separator according to some criterion.

The SVM in particular defines the criterion as looking for a decision surface that is maximally far away from any data point. The distance from the decision surface to the closest data point determines the margin of the classifier. This method of construction necessarily means that the decision function for an SVM is fully specified by a (usually small) subset of the data, which defines the position of the separator. These points are referred to as the support vectors. The figure below shows the margin and support vectors for a sample problem.

In an SVM, the other data points play no part in determining the decision surface that is chosen.
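
To make this concrete, here is a minimal sketch using scikit-learn (an assumed dependency, not mentioned in the article) that fits a linear SVM on a tiny two-class set and prints the support vectors that fully specify the separator:

# Minimal sketch: the fitted decision surface is specified by a small
# subset of the training data, the support vectors.
import numpy as np
from sklearn.svm import SVC

# Tiny linearly separable toy set (class G = +1, class R = -1).
X = np.array([[2.0, 2.0], [2.5, 3.0], [3.0, 2.5],   # class G
              [0.0, 0.0], [0.5, 1.0], [1.0, 0.5]])  # class R
y = np.array([1, 1, 1, -1, -1, -1])

# A large C approximates the hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

print("support vectors:\n", clf.support_vectors_)  # the few points that pin the separator
print("per-class counts:", clf.n_support_)
print("prediction for [2.8, 2.8]:", clf.predict([[2.8, 2.8]]))

Removing any non-support point and refitting would leave the separator unchanged, which is exactly the property described above.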

Maximizing the margin is desirable because points near the decision surface represent very uncertain classification decisions: there is almost a 50% chance of the classifier deciding either way. A classifier with a large margin makes no low-certainty classification decisions. This gives you a classification safety margin: a slight error in measurement or a slight variation in the input will not cause a misclassification.

By construction, an SVM classifier insists on a large margin around the decision boundary rather than accepting just any separating hyperplane. As the margin widens, there are fewer choices of where the separator can be placed, and as a result the classifier's ability to generalize correctly to test data is increased.
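
For readers who want the formal statement, here is a sketch of the standard hard-margin optimization problem (the symbols are the conventional ones and do not appear elsewhere in this article: w is the weight vector, b the bias, and (x_i, y_i) the training points with labels y_i in {-1, +1}):

\min_{w,\,b} \; \tfrac{1}{2}\lVert w \rVert^2
\quad \text{subject to} \quad
y_i \left( w^\top x_i + b \right) \ge 1, \qquad i = 1, \dots, n

Each constraint keeps a training point at distance at least 1/||w|| from the separator, so the full margin is 2/||w||; minimizing ||w|| is therefore the same as maximizing the margin.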

The Kernel Trick

The kernel trick is what makes SVMs really powerful. It is unlikely that a linear dividing boundary always exists in the original input space, and forcing one would result in misclassified labels. One way out is to map each input vector, via a kernel function, into a different space in which a linear dividing hyperplane is feasible.
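
As a concrete illustration, the following sketch (again assuming scikit-learn) trains SVMs on two concentric rings, a data set no straight line can separate. The linear kernel performs near chance, while the RBF kernel's implicit mapping makes a linear separator feasible in its feature space:

# Kernel trick sketch: concentric rings are not linearly separable in the
# 2-D input space, but become separable under the RBF kernel's mapping.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    print(kernel, "test accuracy:", clf.score(X_test, y_test))
# Typically the linear kernel scores near 0.5 (chance) and the
# RBF kernel scores near 1.0 on this data set.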

Kernel Functions

There are a number of kernels that can be used in Support Vector Machine models. These include linear, polynomial, radial basis function (RBF), and sigmoid kernels. We may even write customized kernels.
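
Here is a brief sketch of those choices using scikit-learn's SVC, including a customized kernel passed in as a callable; the quadratic kernel below is a hypothetical example for illustration, not one prescribed by the article:

# Comparing the built-in kernels and a custom one on a synthetic problem.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

def quadratic_kernel(A, B):
    # K(a, b) = (a . b + 1)^2, evaluated for all pairs of rows of A and B.
    return (A @ B.T + 1.0) ** 2

for kernel in ("linear", "poly", "rbf", "sigmoid", quadratic_kernel):
    clf = SVC(kernel=kernel).fit(X, y)
    name = kernel if isinstance(kernel, str) else "custom quadratic"
    print(name, "training accuracy:", clf.score(X, y))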

Shailendra Singh Kathait

Co-Founder & Chief Data Scientist @ Valiance | Envisioning a Future Transformed by AI | Harnessing AI Responsibly | Prioritizing Global Impact |

9y

We have successfully used SVMs on 100M+ customer records to predict the next ad to show. The high level of accuracy motivates finding ways around the practical hurdles. @Valiance solutions

Shailendra Singh Kathait

Co-Founder & Chief Data Scientist @ Valiance | Envisioning a Future Transformed by AI | Harnessing AI Responsibly | Prioritizing Global Impact |

9y

Very true, the high memory requirement is a challenge. But with the advent of the cloud and its scalability it can be overcome, and newer, smarter implementations help you circumvent it as well. That said, it is a trade-off between accuracy and ease.

Amro A.

AI | Distributed Systems | Innovation Leadership | MIT

9y

However, from a practical perspective, I believe that one limitation of SVMs is the high algorithmic complexity and the extensive memory requirements of the quadratic programming step in large-scale tasks.

Shailendra Singh Kathait

Co-Founder & Chief Data Scientist @ Valiance | Envisioning a Future Transformed by AI | Harnessing AI Responsibly | Prioritizing Global Impact |

9y

Completely agreed. The kernel is the key, and we should also consider the VC dimension. SVMs are most powerful when understood together with kernels.

Amjad Zaim

Serial AI Entrepreneur & Advocate of Ethical AI for the Public Good

9y

Good tutorial! I agree that SVM is a powerful classifier with good generalization ability, especially compared to ANNs, which suffer from convergence to local minima; but the choice of kernel function can severely alter model performance and therefore has to be made carefully.
