Vanilla Regression vs. Robust Regression
Regression is one of the most widely used algorithms for forecasting, and it's usually the first thing you learn in the ML world. For small problems, the basic versions of regression work fine, but there are cases where the vanilla technique breaks down. If you've worked with regression, you may have noticed its sensitivity to outliers. This comes from the least-squares criterion it uses to measure error: squaring magnifies large residuals, so a single residual of 10 contributes as much to the loss as a hundred residuals of 1, and the fit gets dragged toward the outliers.
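To see this concretely, here's a tiny NumPy sketch (the data is made up purely for illustration): a single outlier pulls the least-squares slope far away from the trend the other points agree on.

```python
import numpy as np

# Five points lying almost exactly on y = 2x, plus one outlier at the end.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.1, 5.9, 8.0, 10.1, 40.0])  # last point is an outlier

slope_all, _ = np.polyfit(x, y, 1)              # least squares, outlier included
slope_clean, _ = np.polyfit(x[:-1], y[:-1], 1)  # same fit with the outlier removed

print(slope_all)    # roughly 6 -- pulled far above 2 by the single outlier
print(slope_clean)  # close to the true slope of 2
```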
To address this sensitivity, robust alternatives to vanilla regression have been developed as its use cases have grown. In this article, we'll look at two of them: M-estimation and R-estimation.
M-Estimation
The idea is to mimic the least-squares loss when the residuals are near 0 and switch to an absolute-value (linear) penalty for larger residuals, so extreme points stop dominating the fit. Let's have a look at Huber's dispersion function to understand M-estimation.
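Huber's function is commonly written as follows, where e is a residual and c the tuning constant:

$$
\rho(e) =
\begin{cases}
\dfrac{e^2}{2} & \text{if } |e| \le c \\[4pt]
c\,|e| - \dfrac{c^2}{2} & \text{if } |e| > c
\end{cases}
$$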
Here c is the tolerance: the crossover point between the quadratic and linear regions. Empirically, robust regression is found to work best when 1 < c < 2.
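As a minimal sketch of M-estimation in practice, assuming statsmodels is available: its RLM class implements M-estimation, and the HuberT norm's tuning constant t plays the role of c above (1.345 is a commonly cited value, inside the 1 < c < 2 range).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=50)
y[-3:] += 25.0  # contaminate a few points with large outliers

X = sm.add_constant(x)  # design matrix with an intercept column

# Ordinary least squares, for comparison
ols = sm.OLS(y, X).fit()

# M-estimation with Huber's function
huber = sm.RLM(y, X, M=sm.robust.norms.HuberT(t=1.345)).fit()

print(ols.params)    # intercept and slope distorted by the outliers
print(huber.params)  # much closer to the true values (1.0, 2.0)
```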
R-Estimation
Here each residual is multiplied by a weight derived from its rank among all the residuals. This rank-based weighting limits the influence of the largest residuals on the fit.
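A widely used form of this objective is Jaeckel's dispersion, shown here with Wilcoxon scores as the score function (one common choice):

$$
D(\beta) = \sum_{i=1}^{n} a_n(R_i)\, e_i,
\qquad
a_n(i) = \sqrt{12}\left(\frac{i}{n+1} - \frac{1}{2}\right)
$$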
Where "a" is the weight used to create the rank effect.
Those are the basics of two of the most widely used techniques for handling outliers in regression.