Vanilla Regression vs. Robust Regression
Regression is one of the most widely used algorithms for forecasting, and it's usually the first thing you learn in the ML world. For small problems, the basic versions of regression work fine, but there are cases where the vanilla technique breaks down. If you've worked with regression, you may have noticed its sensitivity to outliers. This comes from the least-squares criterion it uses to measure error: squaring magnifies large residuals, so a single residual of 10 contributes as much to the loss as a hundred residuals of 1, and the fit gets dragged toward the outliers.
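To see this concretely, here's a tiny NumPy sketch (the data is made up purely for illustration): a single outlier pulls the least-squares slope far away from the trend the other points agree on.

```python
import numpy as np

# Five points lying almost exactly on y = 2x, plus one outlier at the end.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.1, 5.9, 8.0, 10.1, 40.0])  # last point is an outlier

slope_all, _ = np.polyfit(x, y, 1)              # least squares, outlier included
slope_clean, _ = np.polyfit(x[:-1], y[:-1], 1)  # same fit with the outlier removed

print(slope_all)    # roughly 6 -- pulled far above 2 by the single outlier
print(slope_clean)  # close to the true slope of 2
```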
To address this sensitivity, robust alternatives to vanilla regression have been developed as its use cases have grown. In this article, we'll look at two of them: M-estimation and R-estimation.
M-Estimation
The idea is to mimic the least-squares loss when the residuals are near 0 and switch to an absolute-value (linear) penalty for larger residuals, so extreme points stop dominating the fit. Let's have a look at Huber's dispersion function to understand M-estimation.
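Huber's function is commonly written as follows, where e is a residual and c the tuning constant:

$$
\rho(e) =
\begin{cases}
\dfrac{e^2}{2} & \text{if } |e| \le c \\[4pt]
c\,|e| - \dfrac{c^2}{2} & \text{if } |e| > c
\end{cases}
$$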
Here c is the tolerance: the crossover point between the quadratic and linear regions. Empirically, robust regression is found to work best when 1 < c < 2.
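As a minimal sketch of M-estimation in practice, assuming statsmodels is available: its RLM class implements M-estimation, and the HuberT norm's tuning constant t plays the role of c above (1.345 is a commonly cited value, inside the 1 < c < 2 range).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=50)
y[-3:] += 25.0  # contaminate a few points with large outliers

X = sm.add_constant(x)  # design matrix with an intercept column

# Ordinary least squares, for comparison
ols = sm.OLS(y, X).fit()

# M-estimation with Huber's function
huber = sm.RLM(y, X, M=sm.robust.norms.HuberT(t=1.345)).fit()

print(ols.params)    # intercept and slope distorted by the outliers
print(huber.params)  # much closer to the true values (1.0, 2.0)
```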
R-Estimation
Here each residual is multiplied by a weight derived from its rank among all the residuals. This rank-based weighting limits the influence of the largest residuals on the fit.
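A widely used form of this objective is Jaeckel's dispersion, shown here with Wilcoxon scores as the score function (one common choice):

$$
D(\beta) = \sum_{i=1}^{n} a_n(R_i)\, e_i,
\qquad
a_n(i) = \sqrt{12}\left(\frac{i}{n+1} - \frac{1}{2}\right)
$$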
Where "a" is the weight used to create the rank effect.
Those are the basics of two of the most widely used techniques for handling outliers in regression.