Feature Transformation Techniques

Introduction:

Data preprocessing is an essential step in machine learning projects. Real-world data is often messy and unorganized, so we need to clean it up before feeding it to our models. Feature transformation is one such preprocessing technique: by reshaping the data, it helps our models, especially distribution-sensitive ones like linear models, perform better.

What is feature transformation?

Feature transformation is a technique we use to boost the performance of a machine learning algorithm with the help of mathematical formulas. We apply mathematical functions to features to transform them into a form that directly improves the algorithm's performance.

How does feature transformation increase the performance of a machine learning algorithm?

Often the distribution of our data is not normal, which has a very large impact on linear models such as linear regression and logistic regression. Feature transformation uses mathematical formulas to make the distribution closer to normal, and in that way it boosts the performance of these algorithms.

Before applying feature transformation:

Figure 1: Before applying feature transformation

After applying feature transformation:

Figure 2: After applying feature transformation

How does a normal distribution boost the performance of a machine learning algorithm?

Statistics is the foundation of machine learning. When a statistician sees a normal distribution, they see an easy way of solving a particular problem, and the same holds for machine learning algorithms. When we give normally distributed data to an algorithm, the calculations it has to make become simpler, so training takes less time and accuracy improves.

Accuracy before applying feature transformation:

Figure 3: Before applying feature transformation model accuracy

Accuracy after applying feature transformation:

Figure 4: After applying feature transformation model accuracy

As we can see, there is a clear boost in the accuracy of logistic regression.

Types of transformers:

There are three types of transformers available in the sklearn library:

  • Function transformer
  • Power transformer
  • Quantile transformer
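As a quick orientation, here is a minimal sketch of where these three transformers live in scikit-learn and how they are invoked; the toy column X is made up for illustration:

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer, PowerTransformer, QuantileTransformer

# A made-up, positively skewed feature column (shape: n_samples x 1).
X = np.array([[1.0], [10.0], [100.0], [1000.0]])

ft = FunctionTransformer(np.log1p)                                    # apply an arbitrary function
pt = PowerTransformer(method="yeo-johnson")                           # Box-Cox / Yeo-Johnson
qt = QuantileTransformer(n_quantiles=4, output_distribution="normal") # map quantiles to a normal

X_ft = ft.fit_transform(X)
X_pt = pt.fit_transform(X)
X_qt = qt.fit_transform(X)
print(X_ft.ravel())
```

All three follow the usual fit/transform API, so they can be dropped into a sklearn pipeline.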

Function Transformer:

With the FunctionTransformer, there are multiple transforms we can apply. The most commonly used are:

  • Log transform
  • Reciprocal transform
  • Square transform

Log transform:

In the log transform, we apply the logarithm to every value of a particular column to make its distribution closer to normal, so that the performance of the machine learning algorithm improves.

Where to use?

  • When the column has only positive values, because we can't take the log of zero or negative values.
  • When the data is positively skewed.

How it works?

Some columns have a much larger scale than others. Applying the log transform compresses the large values into a range comparable with the rest of the data, and in the process the distribution becomes closer to normal.
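A minimal sketch of this idea on a made-up, positively skewed column; np.log1p (log(1 + x)) is used instead of np.log so that zeros are handled safely:

```python
import numpy as np
from scipy.stats import skew
from sklearn.preprocessing import FunctionTransformer

# A made-up, heavily right-skewed column (e.g. incomes).
x = np.array([[1.0], [2.0], [3.0], [10.0], [500.0], [10_000.0]])

# log1p(x) = log(1 + x): safe for zeros, still requires values > -1.
log_tf = FunctionTransformer(np.log1p, validate=True)
x_log = log_tf.fit_transform(x)

print("skew before:", skew(x.ravel()))
print("skew after :", skew(x_log.ravel()))
```

The skewness after the transform is much smaller, i.e. the distribution has moved closer to symmetric.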

Before applying log transformation:

Figure 5: Before applying log transform

Accuracy of machine learning model before log transformation:

Figure 6: Model Accuracy

Applying log transformation:

Figure 7: Importing FunctionTransformer from sklearn
Figure 8 : Applying log transform

Results after applying log transform:

Figure 9: Little improvements in the distribution
Figure 10: Machine Learning Algorithm accuracy

We can clearly see the improvement after applying the log transform.

Reciprocal transform 1/x:

In the reciprocal transform, we replace every value of a particular column with its reciprocal to make the distribution closer to normal, so that the performance of the machine learning algorithm improves.

When to use?

  • Skewed Data: If your feature exhibits a heavily skewed distribution, with a long tail of large values, taking the reciprocal can help normalize the distribution and reduce the impact of extreme values.
  • Proportional Relationships: In some cases, the relationship between the feature and the target variable may be inversely proportional. Taking the reciprocal can help capture this relationship more accurately and improve model performance.
  • Stabilizing Variance: The reciprocal transform can help stabilize the variance of a feature, particularly if the variance increases as the feature values increase. This can be helpful in models that assume constant variance, such as linear regression.

How it works?

  • Skewness correction: If the original feature has a skewed distribution with a long tail of large values, taking the reciprocal can compress these larger values. This helps in making the distribution more symmetrical and reducing the impact of extreme outliers.
  • Value scaling: The reciprocal transform can effectively scale down larger values and scale up smaller values. This is because the reciprocal of a large value is smaller, while the reciprocal of a small value is larger. This can be useful when the range of values in the feature is very large, allowing for better representation of values across the feature space.
  • Proportional relationship capture: In some cases, a reciprocal relationship may exist between the feature and the target variable. By taking the reciprocal of the feature values, this inverse relationship can be captured more effectively. For example, if the feature represents time, and the target variable decreases as time increases, the reciprocal transform can help model this relationship more accurately.

It's important to note that the reciprocal transform may not be suitable for all types of data or all situations. It should be applied judiciously and with an understanding of the underlying data characteristics and the specific problem at hand. Additionally, it's crucial to handle potential issues that may arise, such as division by zero or close-to-zero values, which can impact the effectiveness of the reciprocal transform.
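A minimal sketch of the reciprocal transform with FunctionTransformer, using made-up strictly positive values and the guard against zeros mentioned above:

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Made-up strictly positive values; the reciprocal is undefined at zero,
# so guard against zeros before applying it.
x = np.array([[0.5], [1.0], [2.0], [4.0], [100.0]])
assert np.all(x != 0), "reciprocal transform requires non-zero values"

recip_tf = FunctionTransformer(np.reciprocal, validate=True)
x_recip = recip_tf.fit_transform(x)
print(x_recip.ravel())  # large values are compressed toward 0, small ones blown up
```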

Applying reciprocal transform before and after:

Figure 11: Before and after applying reciprocal transform

Square transform:

The square transform computes the square of each feature value. For a given feature value x, the square transform is calculated as x^2.

Where to Use:

The square transform can be useful in various scenarios:

  • Non-linear Relationships: If there is a non-linear relationship between a feature and the target variable, applying the square transform can help capture this relationship. It can enable a linear model to better fit the data, as it can model curved or quadratic patterns.
  • Scaling Differences: The square transform can be employed when there are significant differences in the scale or magnitude of feature values. Squaring the values can help balance out these differences and bring them to a more comparable range.
  • Variance Stabilization: If the variance of a feature increases as the values increase, applying the square transform can help stabilize the variance. This can be beneficial in situations where modeling assumptions, such as constant variance, need to be met.

How it Works:

When the square transform is applied to a feature, it has several effects on the data:

  • Non-linearity: The squared values introduce non-linearity into the relationship between the feature and the target variable. This allows for modeling more complex patterns and capturing curved relationships that a linear model might struggle to represent.
  • Magnitude Amplification: The square transform amplifies the differences between smaller values, while compressing the differences between larger values. This can be useful in situations where the magnitude of the feature values carries important information.
  • Impact on Outliers: The square transform can magnify the impact of outliers, as squaring extremely large or small values can result in even larger values. This effect should be taken into consideration and handled appropriately.

As with any feature transformation technique, the square transform should be applied thoughtfully and in consideration of the data characteristics and the specific problem at hand. It may not always be suitable or beneficial, and it's important to evaluate its impact on the data and model performance.
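A minimal sketch of the square transform with FunctionTransformer; the values are illustrative:

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Illustrative values; note that squaring maps negatives to positives.
x = np.array([[-3.0], [-1.0], [0.0], [2.0], [5.0]])

square_tf = FunctionTransformer(np.square, validate=True)
x_sq = square_tf.fit_transform(x)
print(x_sq.ravel())  # -> [ 9.  1.  0.  4. 25.]
```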

Applying square transform before and after:

Figure 12: Square transform

Custom transformer:

Figure 13: Custom transformer

You can use this piece of code as a starting point for building a custom mathematical transformer.
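A sketch of one way such a custom transformer can look; cube_root here is a hypothetical example function, not necessarily the one shown in Figure 13:

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Hypothetical custom function; substitute any mathematical formula you need.
def cube_root(X):
    return np.cbrt(X)

custom_tf = FunctionTransformer(cube_root, validate=True)

x = np.array([[8.0], [27.0], [-64.0]])
print(custom_tf.fit_transform(x).ravel())  # -> [ 2.  3. -4.]
```

Because FunctionTransformer wraps an arbitrary function, the same pattern works for any element-wise formula.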

Power Transformer:

The power transformer is used when we want the output to be closer to a Gaussian distribution. The power transform has two variants:

  • Box-Cox transform
  • Yeo-Johnson transform

Box-Cox transform:

Box-Cox requires the data to be strictly positive; it does not even accept zeros. The formula on which the Box-Cox transform works is:

x(λ) = (x^λ − 1) / λ,  if λ ≠ 0
x(λ) = ln(x),          if λ = 0
Figure 14: Formula for Box-Cox

The exponent here is a parameter called λ that varies over the range −5 to 5, and the search examines the candidate values of λ. Finally, we choose the optimal value (the one resulting in the best approximation to a normal distribution) for that particular feature.

Scope:

Applied only to values greater than zero (positive values only; zero excluded).

Internal working techniques:

  • Maximum likelihood
  • Bayesian statistics
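The λ search described above can be sketched with scikit-learn's PowerTransformer, which fits λ by maximum likelihood; the log-normal sample here is synthetic:

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)
# Strictly positive, right-skewed synthetic data (log-normal).
x = rng.lognormal(mean=0.0, sigma=1.0, size=(500, 1))

# method="box-cox" requires data > 0; lambda is fitted by maximum likelihood,
# and standardize=True rescales the result to zero mean and unit variance.
pt = PowerTransformer(method="box-cox", standardize=True)
x_bc = pt.fit_transform(x)

print("fitted lambda:", pt.lambdas_[0])
```

For exactly log-normal data the fitted λ comes out close to 0, which corresponds to the pure log transform.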

Distribution Before applying Box-Cox:

Figure 15: Distribution Before applying Box-Cox

Algorithm Accuracy before applying Box-Cox:

Figure 16: Model Accuracy before applying Box-Cox

Distribution after applying Box-Cox:

Figure 17: After applying Box-Cox

Algorithm Accuracy after applying Box-Cox:

Figure 18: Model Accuracy after applying Box-Cox

Yeo-Johnson transform:

This transformation is an extension of the Box-Cox transform: Yeo-Johnson can also be applied to zero and negative values. The formula for Yeo-Johnson is:

y = ((x + 1)^λ − 1) / λ,               if λ ≠ 0, x ≥ 0
y = ln(x + 1),                         if λ = 0, x ≥ 0
y = −((−x + 1)^(2−λ) − 1) / (2 − λ),   if λ ≠ 2, x < 0
y = −ln(−x + 1),                       if λ = 2, x < 0
Figure 19: Formula for Yeo Johnson
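A sketch of Yeo-Johnson on synthetic mixed-sign data, which Box-Cox could not handle:

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(1)
# Mixed-sign, skewed synthetic data: Box-Cox would fail here, Yeo-Johnson does not.
x = np.concatenate([rng.exponential(2.0, 400),
                    -rng.exponential(0.5, 100)]).reshape(-1, 1)

pt = PowerTransformer(method="yeo-johnson", standardize=True)
x_yj = pt.fit_transform(x)
print("fitted lambda:", pt.lambdas_[0])
```

Note that "yeo-johnson" is the default method of PowerTransformer precisely because it works on any real-valued data.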

Distribution before applying Yeo-Johnson:

Figure 20: Before applying Yeo-Johnson

Algorithm Accuracy before applying Yeo-Johnson:

Figure 21: Model Accuracy before applying Yeo-Johnson

Distribution after applying Yeo-Johnson:

Figure 22: Distribution after applying Yeo-Johnson

Algorithm Accuracy after applying Yeo-Johnson:

Figure 23: Model Accuracy after applying Yeo-Johnson

Conclusion:

When working with linear models, it is important to normalize the distribution of the data for better performance. The feature transformation toolbox has many variants available for this task; at the end of the day, it is up to us which approach to choose for a given dataset.
