登录查看更多内容

Support vector machine classifier with regularisation

Jakub Polec

20+ yrs in Tech & Finance & Quant | ex-Microsoft/Oracle/CERN | IT / Cloud Architecture Leader | AI/ML Data Scientist | SaaS & Fintech

发布日期: 2024年3月19日

The SVM is a powerful, yet intuitive machine learning model that excels in binary classification tasks. It operates on a simple principle: find the best boundary that separates classes of data with the maximum margin. But what makes SVM stand out is its versatility — by incorporating different kernel functions, it can tackle both linear and nonlinear datasets.

Our focus here was on the linear SVM with L2 regularization, which brings an added twist: it not only strives for the best classification boundary but also balances model complexity and generalization by preventing overfitting. The 'hinge loss' component, a staple in SVM models, ensures that predictions that are correct but close to the boundary are penalised, nudging the boundary away from the data points to ensure a buffer zone or 'margin'.

Basically:

L1 regularization (Lasso) adds the sum of the absolute values of the weights to the loss function, which encourages sparsity—meaning some weights can become exactly zero. This can be beneficial when you have many features but believe that only a few are actually important, as it helps with feature selection.

L2 regularization (Ridge) adds the sum of the squares of the weights to the loss function. It tends to spread the error among all the weights, shrinking them closer to zero but rarely to zero. This is helpful when most features have some influence on the output and you want to keep all of them in the model.

In essence, use L1 when you want to reduce the number of features, and L2 when you want to penalize large weights more severely without discarding features entirely.

领英推荐

Data Science #27

Andriy Burkov 5 个月前

Machine learning made easy with 'Lazy Predict'

Sharvari Avhad 1 年前

The math of similarity - Cohesion

Manu Nellutla 4 年前

Let's see code with L2 penalty and high loss, as svm_clf:

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# We'll use the Iris dataset for demonstration
# SVMs are binary classifiers, so we need to filter the data to only use two classes
iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # we only take petal length and width for simplicity
y = (iris["target"] == 2).astype(np.float64)  # Iris-Virginica

# Set regularization parameter
l1_regularization = 0.5

# Create a LinearSVC model with L2 penalty and hinge loss
svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=l1_regularization, loss="hinge", penalty="l2", dual=True, max_iter=10000))
])

# Fit the model
svm_clf.fit(X, y)

# Get the parameters
beta = svm_clf.named_steps["linear_svc"].coef_[0]
intercept = svm_clf.named_steps["linear_svc"].intercept_[0]

# Visualize the dataset and the decision boundary
plt.figure(figsize=(10, 6))
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.bwr)
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()

# Create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = (np.dot(xy, beta) - intercept).reshape(XX.shape)

# Plot decision boundary and margins
contours = ax.contour(XX, YY, Z, levels=[0], linestyles=["-"], colors='black')
plt.xlabel("Petal length")
plt.ylabel("Petal width")
plt.title("SVM Classifier with L2 Regularization and Hinge Loss")
plt.show()

and output:

The visualization paints a clear picture: data points on one side belong to one class, and those on the other side to another. The separation is so distinct that even a layperson can appreciate the underlying patterns the SVM has unearthed.

To achieve this, we leveraged Python's rich ecosystem, particularly sklearn, to build and train our SVM model. The process was straightforward: scale the features, fit the model, and draw the boundary. Despite the simplicity in implementation, the outcome is profound — a testament to the power of combining robust algorithms with effective data scaling.

Support vector machine classifier with regularisation

Jakub Polec

20+ yrs in Tech & Finance & Quant | ex-Microsoft/Oracle/CERN | IT / Cloud Architecture Leader | AI/ML Data Scientist | SaaS & Fintech

领英推荐

更多精彩文章

社区洞察

其他会员也浏览了

?? Day12 of #100DaysOfPython ??

Navigating the Realm of Support Vector Machines: A Symphony of Data Harmony

Algorithms, why when where

k-Nearest Neighbors in Machine Learning (k-NN)

Let's get dive into the Regression-Bit by bit from scratch.

Will highly correlated variables impact Linear Regression?

How to predict the future

Fast Automated Machine Learning

领英推荐

FAQ in Quantitative Finance

2024年6月9日

Order Execution in Trading: Asynchronous Programming, Threading, and Multiprocessing

2024年5月12日

How to sell stocks wisely - the code for Almgren-Chriss optimal execution

2024年3月6日

Financial Data in Oracle OCI with kdb+ Time-Series DB

2024年3月2日

Stock Market Decisions with Thompson Sampling

2024年2月23日

Portfolio optimization: from the highest Sharpe Ratio to minimum volatility

2024年1月30日

On Signal, Alpha and Strategy

2024年1月28日

The Rolling Hurst Exponent in Python (Trading)

2024年1月23日

Volatility Modeling (part 2): Journey from ARCH to NN and DL

2024年1月19日

Volatility Modeling - journey from ARCH to NN and MCMC

2024年1月17日

社区洞察

其他会员也浏览了

?? Day12 of #100DaysOfPython ??

Navigating the Realm of Support Vector Machines: A Symphony of Data Harmony

Algorithms, why when where

k-Nearest Neighbors in Machine Learning (k-NN)

Let's get dive into the Regression-Bit by bit from scratch.

Will highly correlated variables impact Linear Regression?

How to predict the future

Fast Automated Machine Learning