Standardization and Normalization Techniques in Machine Learning - Part 07
Vinod Kumar G R
Co-founder of ApexIQ | Driving AI Innovation with LLMs & GenAI | Passionate about Transformative AI Solutions
Data is rarely perfect, and it often comes in various shapes and forms, with values that span different scales and ranges. Ensuring that your data is in the right form can make all the difference when training machine learning models.
This is where standardization and normalization come into play, offering strategies to prepare your data for optimal model performance.
In this article, we will explore these techniques, their differences, and the scenarios where each is best applied. Whether you’re dealing with feature scaling in the broader context or looking to understand how to make your data machine-learning-ready, the insights you gain here will be invaluable.
In the last article, we discussed feature scaling and its different types in machine learning. That deep dive provided a solid foundation for understanding the fundamental concepts clearly.
I’ll give you a simple example of when we use scaling methods.
Suppose you are dealing with an image dataset: the data consists of image pixels with values ranging from 0 to 255. Although these are continuous numeric values, their wide range makes it harder for the model to capture all the patterns, so we apply a scaling method to bring the data into a common, smaller range where the model can learn more easily. A small sketch of this idea follows below.
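As a minimal sketch (assuming the image is already loaded as a NumPy array of 8-bit pixel values; the array here is made up for illustration), dividing by 255 rescales every pixel into the [0, 1] range:
import numpy as np
# a hypothetical 2x3 grayscale image with pixel values in [0, 255]
pixels = np.array([[0, 64, 128],
                   [192, 230, 255]], dtype=np.uint8)
# rescale into [0, 1] by dividing by the maximum possible pixel value
scaled_pixels = pixels.astype(np.float32) / 255.0
print(scaled_pixels)
# [[0.         0.2509804  0.5019608 ]
#  [0.7529412  0.9019608  1.        ]]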
Now let’s get into the topic,
1. Standardization
Note: Standardization does not scale data to the range (0, 1); instead, it rescales the data to have a mean of 0 and a standard deviation of 1. [This distinction will become clearer when we discuss normalization.]
The mathematical formula for standardization:
x' = (x - mean(x)) / std(x)
where x is the original value, mean(x) is the mean of the feature, and std(x) is its standard deviation.
Explanation
You might have doubts about where you need to use this standardization technique.
I got the image from Google. The top graph in the image shows the original data, which is skewed (right-skewed data has a long tail extending to the right, while left-skewed data has a long tail extending to the left).
You can see that the values range roughly between 100 and 200. With such large, widely spread values, it is challenging for the model to capture the important patterns in the data.
When you apply the standardization, it scales the data to have a mean of 0 and a standard deviation of 1, which brings the data to the center of the graph.
You can see in the bottom graph that the data now falls at the center, around 0. This standardization step improves the model's training process.
Practical Implementation
"""
This is the basic code for standard scaling implementation
"""
# import the StandardScaler library from sklearn
from sklearn.preprocessing import StandardScaler
# load Sample data
data = [[1.0], [2.0], [3.0], [4.0], [5.0]]
# Create a StandardScaler instance
scaler = StandardScaler()
# Fit the scaler to the data and transform it in one step
scaled_data = scaler.fit_transform(data)
# Print the scaled data
print(scaled_data)
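To confirm what the scaler does, here is a minimal sketch that reproduces the same result manually with NumPy (StandardScaler uses the population standard deviation, i.e. ddof=0):
import numpy as np
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# apply the standardization formula directly: x' = (x - mean(x)) / std(x)
manual = (data - data.mean()) / data.std(ddof=0)
print(manual)
# [-1.41421356 -0.70710678  0.          0.70710678  1.41421356]
print(manual.mean(), manual.std())  # mean ~0, standard deviation ~1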
I have written the full code in the Colab notebook linked below.
Google Colaboratory
You can see the practical implementation of standardization in the Colab link given above.
Please go through the Colab notebook once, and if you have any questions, ping me at the email provided at the end of this article; I’ll try to clarify.
2. Normalization
Let me first define what normalization is.
Normalization is the process of transforming the features (variables) in a dataset to a common scale, typically within the range of (0, 1) or (-1, 1).
The objective of normalization is to ensure that all features have similar scales, which helps prevent certain features from dominating the modeling process due to their larger numerical values.
We have seen why feature scaling is important for machine learning in a previous article. If you haven’t read the previous article, please go through the Feature Scaling in Machine Learning article.
I got this image from Google. You can see the plots before scaling (the actual data) and after applying normalization and standardization. When you apply normalization, the data falls into the range (0, 1).
Above, in the standardization content, I mentioned a note: “StandardScaler does not scale data into a range of (0, 1); instead, it scales the data to have a mean of 0 and a standard deviation of 1.” This is the main difference between standardization and normalization.
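Here is a minimal sketch that makes this difference concrete (assuming a single toy feature with values 1 through 5): StandardScaler produces an unbounded, zero-centered result, while MinMaxScaler bounds the result to [0, 1].
from sklearn.preprocessing import StandardScaler, MinMaxScaler
data = [[1.0], [2.0], [3.0], [4.0], [5.0]]
# standardization: mean 0, standard deviation 1, values are NOT bounded
print(StandardScaler().fit_transform(data).ravel())
# [-1.41421356 -0.70710678  0.          0.70710678  1.41421356]
# normalization (min-max): values are bounded to [0, 1]
print(MinMaxScaler().fit_transform(data).ravel())
# [0.   0.25 0.5  0.75 1.  ]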
Different Methods in Normalization
Yes, there are mainly 4 different methods of normalization, and they are:
1. Min-Max Scaling
2. Mean Normalization Scaling
3. Max-Absolute Scaling
4. Robust Scaling (Using IQR Method)
The choice of method depends on the characteristics of your dataset and the requirements of your modeling task.
1. Min-Max Scaling
Min-Max scaling, also known as min-max normalization, transforms data into a specific range, often [0, 1] or [-1, 1]. It rescales the data so that the minimum value maps to 0 and the maximum value maps to 1 (or the minimum to -1 and the maximum to 1 when using the [-1, 1] range).
Mathematical Formula
For [0, 1] range:
x_normalized = (x - min(x)) / (max(x) - min(x))
For [-1, 1] range:
x_normalized = 2 * ((x - min(x)) / (max(x) - min(x))) - 1
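As a minimal sketch, assuming the same toy feature as above, both ranges can be produced with sklearn's MinMaxScaler via its feature_range parameter:
from sklearn.preprocessing import MinMaxScaler
data = [[1.0], [2.0], [3.0], [4.0], [5.0]]
# default range [0, 1]
print(MinMaxScaler().fit_transform(data).ravel())
# [0.   0.25 0.5  0.75 1.  ]
# range [-1, 1]
print(MinMaxScaler(feature_range=(-1, 1)).fit_transform(data).ravel())
# [-1.  -0.5  0.   0.5  1. ]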
Advantages:
- Preserves the relationships between the original data points.
- Produces values in a fixed, bounded range, which suits distance-based models and neural networks.
Disadvantages:
- Highly sensitive to outliers, since a single extreme value stretches the range and squashes the rest of the data.
- The minimum and maximum learned from training data may not cover unseen values.
2. Mean Normalization Scaling
Mean normalization centers the data around 0 by subtracting the mean, and then scales it by the range of the data (max - min), so the values fall roughly within (-1, 1). Note that it is distinct from Z-score standardization, which divides by the standard deviation instead of the range.
Mathematical Formula:
x_normalized = (x - mean(x)) / (max(x) - min(x))
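sklearn has no dedicated mean-normalization scaler, so here is a minimal NumPy sketch of the formula above:
import numpy as np
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# mean normalization: center on the mean, scale by the data range
normalized = (data - data.mean()) / (data.max() - data.min())
print(normalized)
# [-0.5  -0.25  0.    0.25  0.5 ]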
Advantages:
- Centers the data around 0, which can help optimization algorithms such as gradient descent converge.
- Keeps values in a bounded range of roughly (-1, 1).
Disadvantages:
- Still sensitive to outliers, because the range (max - min) is used as the scaling factor.
- Less commonly supported out of the box by libraries than min-max or standard scaling.
3. Max-Absolute Scaling
Max-absolute scaling scales data to the [-1, 1] range by dividing each data point by the maximum absolute value in the dataset.
Mathematical Formula
x_normalized = x / max(|x|)
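A minimal sketch of the formula, written directly in NumPy (the combined sklearn code further below uses MaxAbsScaler for the same purpose); the mixed-sign values here are made up to show the [-1, 1] output range:
import numpy as np
data = np.array([1.0, -2.0, 3.0, -4.0, 5.0])
# max-absolute scaling: divide by the largest absolute value
scaled = data / np.max(np.abs(data))
print(scaled)
# [ 0.2 -0.4  0.6 -0.8  1. ]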
Advantages:
- Does not shift or center the data, so zero entries stay zero, which preserves sparsity in sparse datasets.
- Keeps all values within [-1, 1].
Disadvantages:
- Very sensitive to outliers, since a single extreme value determines the scaling factor.
4. Robust Scaling (Using IQR Method)
Robust scaling, often referred to as IQR scaling, is a method that scales data using the interquartile range (IQR). It is robust to outliers because it relies on the middle 50% of the data.
Mathematical Formula
x_normalized = (x - median(x)) / IQR(x)
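A minimal sketch of the formula with NumPy (sklearn's RobustScaler implements the same idea, using the 25th-75th percentile range by default):
import numpy as np
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# robust scaling: center on the median, scale by the interquartile range
q1, q3 = np.percentile(data, [25, 75])
scaled = (data - np.median(data)) / (q3 - q1)
print(scaled)
# [-1.  -0.5  0.   0.5  1. ]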
Advantages:
- Robust to outliers, since the median and IQR are barely affected by extreme values.
Disadvantages:
- The output is not bounded to a fixed range such as [0, 1].
In these formulas, x represents the original data point; min(x) and max(x) are the minimum and maximum values of the feature; mean(x) and median(x) are its mean and median; std(x) is its standard deviation; and IQR(x) is its interquartile range (the difference between the 75th and 25th percentiles).
Practical Implementation
"""
This is the basic code for MinMaxScaler implementation
"""
# import the MinMaxScaler library from sklearn
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import Normalizer
from sklearn.preprocessing import MaxAbsScaler
from sklearn.preprocessing import RobustScaler
# load Sample data
data = [[1.0], [2.0], [3.0], [4.0], [5.0]]
# Create a MinMaxScaler instance
min_max_scaler = MinMaxScaler()
normalize_scaler = Normalizer()
max_abs_scaler = MaxAbsScaler()
robust_Scaler = RobustScaler()
# Fit and Transform the scaler to the data and transform the data
scaled_data = min_max_scaler.fit_transform(data)
# Print the scaled data
print(scaled_data)
This is simple code to implement the normalization methods. Now let's apply them to a small sample and see the insights in the data; a short sketch follows below, and the Colab notebook walks through a fuller example dataset.
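As a minimal sketch (the feature values here are made up for illustration), compare min-max scaling and robust scaling on a feature that contains one outlier; the outlier squashes the min-max output, while robust scaling keeps the bulk of the data well spread:
import numpy as np
from sklearn.preprocessing import MinMaxScaler, RobustScaler
# a hypothetical feature with one large outlier (100.0)
data = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])
print(MinMaxScaler().fit_transform(data).ravel())
# [0.         0.01010101 0.02020202 0.03030303 1.        ]  <- bulk squashed near 0
print(RobustScaler().fit_transform(data).ravel())
# [-1.  -0.5  0.   0.5  48.5]  <- bulk stays well spread; only the outlier is extreme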
Google Colaboratory
Please go through this Colab notebook, where I have explained the normalization methods with an example dataset.
If you have any queries just write a mail(mentioned below), and I’ll try to respond.
That’s it for today's topic; we’ll discuss another topic in the next article.
Thank you for taking the time to read this article.
I hope it has provided you with valuable insights into the world of feature scaling and how it can be used to enhance the performance of machine learning models. I’m excited to share these hands-on insights and make the content more engaging.
Stay tuned for upcoming articles.
EMAIL -> [email protected]
Previous article: 6. Feature Scaling and Different Feature Scaling Methods in ML.
Next article: 8. Data Encoding in ML.
YouTube Channel