Regularization in Machine Learning
RISHABH SINGH
Regularization is a technique used in machine learning to prevent overfitting, which occurs when a model learns the training data too well, including its noise and outliers, and performs poorly on new, unseen data. Regularization helps create models that generalize better to new data by adding a penalty to the loss function (the function the model tries to minimize during training), which keeps the model’s parameters (like weights in a neural network) smaller and simpler.
How Is Regularization Used?
Regularization is implemented by modifying the loss function. In a typical machine learning model, the loss function measures how well the model’s predictions match the actual data. Regularization adds an extra term to this loss function that penalizes large weights.
Loss = Original Loss + λ × Regularization Term
By minimizing this new loss function, the model not only fits the data but also keeps the weights small, which helps in generalizing better to new data.
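As a minimal sketch of the idea above, here is a hypothetical loss function in Python (the function name, data, and λ value are illustrative, not from the article) that adds an L1 or L2 penalty term to a mean-squared-error loss:

```python
import numpy as np

# Illustrative sketch: Loss = Original Loss + lambda * Regularization Term,
# using mean squared error as the "original loss".
def regularized_loss(y_true, y_pred, weights, lam, penalty="l2"):
    mse = np.mean((y_true - y_pred) ** 2)   # original loss
    if penalty == "l1":
        reg = np.sum(np.abs(weights))       # L1: sum of |w_i|
    else:
        reg = np.sum(weights ** 2)          # L2: sum of w_i^2
    return mse + lam * reg

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.5, 6.5])
w = np.array([0.5, -2.0])
print(regularized_loss(y_true, y_pred, w, lam=0.1))
```

Larger weights make the penalty term bigger, so the optimizer is pushed toward smaller weights even when they fit the training data slightly worse.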
Types of Regularization: L1 and L2
The most common types of regularization are L1 regularization and L2 regularization. They differ in how they penalize the model’s weights.
L1 Regularization (Lasso Regression)
It adds the sum of the absolute values of the weights to the loss function.
Loss = Original Loss + λ ∑ᵢ |Wᵢ|
Imagine we have a dataset with many features (variables), but not all of them are important for predicting the output. Using L1 regularization can help the model focus on the most significant features by reducing the weights of less important ones to zero.
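This sparsity effect can be seen with scikit-learn's Lasso on a synthetic dataset (the data, the number of features, and the `alpha` value, which plays the role of λ, are all assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Synthetic data: only the first two features actually influence y.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1)  # alpha plays the role of lambda
lasso.fit(X, y)
print(lasso.coef_)  # weights on the three irrelevant features shrink toward zero
```

The L1 penalty drives the weights of the three uninformative features to (near) zero, effectively performing feature selection.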
L2 Regularization (Ridge Regression)
It adds the sum of the squares of the weights to the loss function.
Loss = Original Loss + λ ∑ᵢ Wᵢ²
Suppose we’re building a model to predict house prices based on various features like size, number of rooms, age, location, etc. L2 regularization helps ensure that the model doesn’t assign too much importance to any one feature and considers all of them in a balanced way.
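A quick way to see this shrinkage is to compare ordinary least squares with Ridge regression on synthetic, correlated house features (the data-generating process and the `alpha` value are assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(1)
n = 100
size = rng.uniform(50, 250, size=n)                    # house size
rooms = size / 30 + rng.normal(scale=0.5, size=n)      # correlated with size
X = np.column_stack([size, rooms])
price = 2.0 * size + 10.0 * rooms + rng.normal(scale=20, size=n)

ols = LinearRegression().fit(X, price)
ridge = Ridge(alpha=10.0).fit(X, price)
# The L2 penalty keeps the overall weight magnitude smaller than plain OLS
print(np.sum(ols.coef_ ** 2), np.sum(ridge.coef_ ** 2))
```

Because the penalty grows with the square of each weight, Ridge keeps the total weight magnitude small and distributes influence across correlated features instead of loading it onto one.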
Example: House Price Prediction
Scenario: Imagine we’re building a model to predict the price of a house based on various features like size, number of rooms, age, location, and the presence of a swimming pool.
Without regularization, our model might give too much importance to some features, like the presence of a swimming pool or the age of the house, even if those features don’t significantly influence the price. This could lead to overfitting, especially if the training data contains houses with unusual characteristics (outliers). For example, maybe one very expensive house has a large swimming pool, and the model might learn that “swimming pools” lead to a high price, which isn’t true in general.
How Regularization Helps: By penalizing large weights, regularization discourages the model from relying heavily on any single feature. The weight on the swimming-pool feature is shrunk (L2) or driven to zero entirely (L1), so one unusual, very expensive house no longer dominates the model’s predictions, and the model generalizes better to typical houses.
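The swimming-pool scenario can be simulated directly (the dataset, the single-outlier setup, and the `alpha` value are all assumptions made for this sketch): without regularization, the rare pool feature absorbs the outlier’s entire price premium, while Ridge shrinks its weight sharply.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(42)
n = 50
size = rng.uniform(80, 200, size=n)
pool = np.zeros(n)
pool[0] = 1.0                                  # only one house has a pool
price = 1000.0 * size + rng.normal(scale=5000, size=n)
price[0] += 300_000.0                          # and it happens to be an outlier

X = np.column_stack([size, pool])
ols = LinearRegression().fit(X, price)
ridge = Ridge(alpha=50.0).fit(X, price)
# The unregularized model blames the pool for the outlier's price;
# the regularized model shrinks that weight drastically.
print(ols.coef_[1], ridge.coef_[1])
```

The unregularized fit assigns the pool a weight near the outlier's full premium, because the dummy feature is active for only that one house; the L2 penalty makes holding such a large weight too costly.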
Summary: When to Use These Regularization Methods: Use L1 (Lasso) when you suspect only a subset of features matters and you want the model to perform feature selection by driving unimportant weights to zero. Use L2 (Ridge) when all features are potentially relevant and you want to keep their weights small and balanced rather than eliminate any of them.