Support Vector Machine - A Simple Analysis
Debi Prasad Rath
@AmazeDataAI- Technical Architect | Machine Learning | Deep Learning | NLP | Gen AI | Azure | AWS | Databricks
Hi connections. Trust you are doing well. In this article we will be discussing support vector machines. Let us get started. Happy learning.
A support vector machine, often abbreviated as "SVM", is a supervised learning algorithm. It can be used to solve both regression and classification tasks. For a classification task, the idea of SVM is to find the hyperplane that best separates the data into two classes. For regression tasks it applies the same principle with minor changes. Keep in mind that SVM seeks the hyperplane that maximizes the margin, while errors smaller than a tolerance value epsilon are simply ignored. In other words, the goal is to find an approximation function g(x) ≈ f(x) whose predictions differ from the actual y values by no more than the tolerance epsilon at each data point.
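The epsilon-tolerance idea for regression can be seen in a minimal sketch using scikit-learn's `SVR` (the toy data and the epsilon value here are illustrative choices, not from the article):

```python
import numpy as np
from sklearn.svm import SVR

# Toy 1-D regression data: y = 2x plus a little noise
rng = np.random.RandomState(0)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = 2 * X.ravel() + rng.normal(scale=0.1, size=50)

# epsilon defines the tolerance tube: deviations smaller than
# epsilon contribute nothing to the loss
model = SVR(kernel="linear", epsilon=0.5)
model.fit(X, y)

print(model.predict([[5.0]]))  # close to 2 * 5 = 10
```

Only points that fall outside the epsilon tube become support vectors and influence the fitted function.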
There exist different kernel functions that take input data and transform it into the required form for processing. Precisely, a kernel function lets us treat a non-linear decision boundary in the input space as a linear one in a higher-dimensional space. Basically, a kernel function computes the dot product between two points in that feature space without ever materializing the higher-dimensional coordinates. Among the different types of kernels are the linear, polynomial, RBF (radial basis function), and sigmoid kernels, to name a few.
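To make the "dot product in feature space" idea concrete, here is a small sketch comparing a hand-computed RBF kernel value against scikit-learn's implementation (the two points and the gamma value are arbitrary examples):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Two points in 2-D input space
x = np.array([[1.0, 2.0]])
z = np.array([[2.0, 3.0]])

gamma = 0.5
# RBF kernel: k(x, z) = exp(-gamma * ||x - z||^2)
# This equals a dot product in an (infinite-dimensional) feature space,
# computed without ever constructing that space.
manual = np.exp(-gamma * np.sum((x - z) ** 2))

# sklearn computes the same quantity
lib = rbf_kernel(x, z, gamma=gamma)[0, 0]
print(manual, lib)  # both exp(-1) ≈ 0.3679
```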
Support vectors are the data points that lie closest to the hyperplane. In fact, if the positions of the support vectors change, the position of the hyperplane changes with them. On a similar note, you can think of a hyperplane as a line that separates the data points so they are classified appropriately. Intuitively, the remaining data points should be far from the hyperplane so that we can be confident the classification is correct.
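A quick way to see this is to fit a linear classifier on two separable clusters and inspect which points scikit-learn reports as support vectors (the cluster coordinates below are a made-up example):

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters
X = np.array([[1, 1], [2, 1], [1, 2],
              [5, 5], [6, 5], [5, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1000.0)  # large C approximates a hard margin
clf.fit(X, y)

# Only the points nearest the separating hyperplane are support vectors;
# the rest of the data could move (away from the boundary) without
# changing the hyperplane at all.
print(clf.support_vectors_)
```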
The task at hand is to find the hyperplane that maximizes the margin, that is, the distance between the nearest data points and the hyperplane. Therefore the objective is to find, from the training data, the hyperplane such that any new data point is classified correctly.
But wait, there is a trick. Real-world data is rarely that simple. Often the data is not linearly separable, making it impossible to split with a flat boundary in its original space. Because of these different data forms, there is a need for a "kernel function": a non-linear mapping that transforms input data points into higher dimensions. The optimization then proceeds iteratively until a hyperplane is found that classifies the data points correctly.
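The classic illustration of this is two concentric circles: no straight line can separate them, but an RBF kernel handles them easily. A sketch using scikit-learn's synthetic `make_circles` data (dataset parameters are arbitrary choices for the demo):

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: no straight line separates the two classes
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)  # default gamma="scale"

# The linear kernel is stuck near chance level; the RBF kernel,
# which implicitly maps the data to a higher-dimensional space,
# separates the circles almost perfectly.
print("linear accuracy:", linear.score(X, y))
print("rbf accuracy:", rbf.score(X, y))
```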
Thanks for reading this article. Happy learning. See you soon.