R - Advanced Regression Models
Anandh Shanmugaraj
Group CEO & MD at Gladwin International & Company, India's leading Interim Leadership Consulting, Executive Search and Leadership Advisory Firm.
----------------------------------------------------------------------------------
Each of the regression analyses below contains working code examples along with a brief explanation of the use case for each regression type. Many of these code snippets are generic enough to serve as a base template that you can build on for your own analyses.
Please note that the information presented here should not be construed as a full and complete analysis, but rather as a template and quick guide to the available modeling options. You are advised to carry out independent and thorough research before drawing conclusions.
Robust Regression
Robust regression can be used in any situation where OLS regression can be applied. Because it down-weights influential observations rather than letting them dominate the fit, it typically produces more reliable estimates than OLS when outliers are present. It is particularly useful when there is no compelling reason to exclude the outliers from your data.
Robust regression can be implemented with the rlm() function in the MASS package. Influential observations can be weighted down using the psi.huber, psi.hampel or psi.bisquare weighting functions, specified via the psi argument.
How To Specify A Robust Regression Model
library(MASS)
rlm_mod <- rlm(stack.loss ~ ., stackloss, psi = psi.bisquare) # robust reg model
summary(rlm_mod)
#> Call: rlm(formula = stack.loss ~ ., data = stackloss)
#> Residuals:
#> Min 1Q Median 3Q Max
#> -8.91753 -1.73127 0.06187 1.54306 6.50163
#>
#> Coefficients:
#> Value Std. Error t value
#> (Intercept) -41.0265 9.8073 -4.1832
#> Air.Flow 0.8294 0.1112 7.4597
#> Water.Temp 0.9261 0.3034 3.0524
#> Acid.Conc. -0.1278 0.1289 -0.9922
#>
#> Residual standard error: 2.441 on 17 degrees of freedom
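The weighting mechanism described above can be inspected directly: the fitted rlm object stores the final IWLS weights in its w component, so observations that were down-weighted as influential are easy to spot. The snippet below is a minimal sketch of that inspection, and it also refits the model with the psi.huber and psi.hampel weighting functions for comparison; the object names rlm_huber and rlm_hampel are illustrative.
# Inspect the final IWLS weights; values well below 1 mark down-weighted observations
round(rlm_mod$w, 2)
which(rlm_mod$w < 0.5)  # indices of the most heavily down-weighted rows
# Refit using the alternative weighting functions mentioned above
rlm_huber <- rlm(stack.loss ~ ., data = stackloss, psi = psi.huber)   # psi.huber is the default
rlm_hampel <- rlm(stack.loss ~ ., data = stackloss, psi = psi.hampel)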
Compare Performance of rlm() with lm()
Let's build the equivalent lm() model so we can compare the errors against the respective fitted values.
lm_mod <- lm(stack.loss ~ ., stackloss) # lm reg model
Calculate the Errors
# Errors from lm() model
DMwR::regr.eval(stackloss$stack.loss, lm_mod$fitted.values)
#> mae mse rmse mape
#> 2.3666202 8.5157125 2.9181694 0.1458878
# Errors from rlm() model
DMwR::regr.eval(stackloss$stack.loss, rlm_mod$fitted.values)
#> mae mse rmse mape
#> 2.1952232 9.0735283 3.0122298 0.1317191
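If the DMwR package is not available on your system, the same four metrics can be computed directly in base R. The helper below, regr_metrics(), is a hypothetical stand-in for DMwR::regr.eval and is shown only as a sketch.
# Base-R equivalent of the four metrics reported by DMwR::regr.eval
regr_metrics <- function(actual, predicted) {
  err <- actual - predicted
  c(mae  = mean(abs(err)),            # mean absolute error
    mse  = mean(err^2),               # mean squared error
    rmse = sqrt(mean(err^2)),         # root mean squared error
    mape = mean(abs(err / actual)))   # mean absolute percentage error
}
regr_metrics(stackloss$stack.loss, lm_mod$fitted.values)
regr_metrics(stackloss$stack.loss, rlm_mod$fitted.values)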
The robust regression model gives a lower MAE and MAPE, while the lm() fit has a slightly lower MSE and RMSE. This is expected: OLS minimizes the sum of squared residuals by construction, whereas the robust fit trades some squared-error performance for reduced sensitivity to the influential observations.
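To see where the two fits actually differ, it can also help to put the coefficient estimates side by side; the snippet below is a small illustrative comparison.
# Compare the OLS and robust coefficient estimates side by side
cbind(OLS = coef(lm_mod), Robust = coef(rlm_mod))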