Several ways to use the lm function

We are given data with columns (random variables) and we want to tell whether there is a relationship or dependency between the variables. The simplest relationship, and the one people usually think of first, is a linear relationship.

However, the linear model can be extended to handle non-linear and generalized linear relationships. This means we can do many useful things just by understanding the linear model. Let us use the lm function in R to demonstrate what I mean.

Learning (a, b) of the model Y = a*X + b from data

In the simplest case, we want to estimate the values of a and b from the data, and we can use the lm function to do that.

https://raw.githubusercontent.com/tutrunghieu/sharing/master/lm1/lm1-xy.R
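
The linked script is not reproduced here, so the sketch below only illustrates the idea: it simulates data from Y = a*X + b with made-up coefficients and random noise, then recovers (a, b) with lm. The coefficient values and sample size are assumptions for illustration.

# Minimal sketch: simulate Y = a*X + b plus noise, then estimate (a, b) with lm.
set.seed(1)
a <- 2; b <- 1                         # true coefficients (chosen for illustration)
X <- runif(100, 0, 10)                 # 100 random inputs
Y <- a * X + b + rnorm(100, sd = 0.5)  # noisy linear response
fit <- lm(Y ~ X)                       # lm estimates the slope and the intercept
coef(fit)                              # (Intercept) is close to b, X is close to a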

Learning (a, b, c) of the model Y = a*X1 + b*X2 + c from data

When we have two input variables, the lm function still works and gives the linear coefficients of the model.

https://raw.githubusercontent.com/tutrunghieu/sharing/master/lm1/lm2-x1x2-y.R
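
Again, the linked script is not reproduced here; a minimal sketch under the same assumptions (made-up coefficients, simulated data) would look like this. The formula simply lists both input variables.

# Minimal sketch: simulate Y = a*X1 + b*X2 + c plus noise, then estimate (a, b, c) with lm.
set.seed(2)
a <- 1.5; b <- -0.7; c <- 3            # true coefficients (chosen for illustration)
X1 <- runif(200); X2 <- runif(200)
Y <- a * X1 + b * X2 + c + rnorm(200, sd = 0.1)
fit <- lm(Y ~ X1 + X2)                 # one term per input variable
coef(fit)                              # (Intercept) ~ c, X1 ~ a, X2 ~ b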

Learning (a, b, c) of the generalized linear model Y = exp(a*X1 + b*X2 + c) from data #output-transformation

This is not a linear model anymore; it is a non-linear model. If you apply lm directly to the dataset, the error will be very high.

However, we can still make it linear by taking the log of both sides. Then we have log(Y) = a*X1 + b*X2 + c instead of the original model, and we see that log(Y) depends linearly on X1 and X2.

As a result, we learn from (X1, X2, log(Y)) instead of (X1, X2, Y), and then we predict by taking exp(y), where y is the predicted value on the log scale. This is called output transformation.

https://raw.githubusercontent.com/tutrunghieu/sharing/master/lm1/lm3-logY.R
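
The sketch below, again with made-up coefficients and simulated data, shows the output transformation: fit lm on log(Y), then exponentiate the prediction to get back to the original scale.

# Minimal sketch: Y = exp(a*X1 + b*X2 + c), fitted by regressing log(Y) on X1 and X2.
set.seed(3)
a <- 0.8; b <- -0.4; c <- 0.5          # true coefficients (chosen for illustration)
X1 <- runif(200); X2 <- runif(200)
Y <- exp(a * X1 + b * X2 + c + rnorm(200, sd = 0.05))
fit <- lm(log(Y) ~ X1 + X2)            # output transformation: model log(Y) linearly
coef(fit)                              # (Intercept) ~ c, X1 ~ a, X2 ~ b
exp(predict(fit, newdata = data.frame(X1 = 0.3, X2 = 0.6)))  # predict on the original scale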

Learning (a, b, c, d) of the model Y = a*X^3 + b*X^2 + c*X + d from data #input-transformation

We can use the lm function to learn the polynomial coefficients as well. In this case, we add the derived input variables X2 = X^2 and X3 = X^3, so the input data contains (X, X2, X3, Y) instead of (X, Y). This is called input transformation: the input (X) is transformed into (X, X2, X3).

https://raw.githubusercontent.com/tutrunghieu/sharing/master/lm1/lm4-poly.R
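
A minimal sketch of the input transformation, with made-up coefficients and simulated data: build the derived columns X2 and X3, then fit an ordinary linear model on (X, X2, X3).

# Minimal sketch: cubic model Y = a*X^3 + b*X^2 + c*X + d via input transformation.
set.seed(4)
a <- 0.5; b <- -1; c <- 2; d <- 3      # true coefficients (chosen for illustration)
X <- runif(200, -2, 2)
Y <- a * X^3 + b * X^2 + c * X + d + rnorm(200, sd = 0.2)
X2 <- X^2; X3 <- X^3                   # derived inputs: the input transformation
fit <- lm(Y ~ X3 + X2 + X)             # still linear in the transformed inputs
coef(fit)                              # (Intercept) ~ d, X3 ~ a, X2 ~ b, X ~ c
# Equivalent without creating new columns: lm(Y ~ I(X^3) + I(X^2) + X)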

Conclusion

The linear model is very simple, yet very effective. It can help us understand not only linear relationships between variables, but also non-linear relationships if we use it wisely.

The linear model is also an effective building block for more complicated learning models with many layers, as we can see in model combining and averaging.
