Sample Housing Market Problem
(Courtesy: M Yasser H of Kaggle (https://www.kaggle.com/yasserh))
Description:
A simple yet challenging project: predict the housing price from factors such as house area, number of bedrooms, furnishing status, and nearness to the main road. The dataset is small, yet its complexity arises from strong multicollinearity among the predictors. Can you overcome these obstacles and build a decent predictive model?
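One simple way to see the multicollinearity mentioned above is to look for pairs of predictors with a high pairwise correlation. The sketch below is only an illustration, not the method used in this project: the helper names (`pearson`, `correlated_pairs`), the 0.8 threshold, the column names, and the toy values are all assumptions and are not taken from the Kaggle dataset.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every column pair with |r| >= threshold."""
    pairs = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, round(r, 3)))
    return pairs


# Made-up toy values with hypothetical column names, NOT the real dataset.
toy = {
    "area": [1000, 1500, 2000, 2500, 3000, 3500],
    "bedrooms": [1, 2, 2, 3, 4, 4],
    "parking": [1, 0, 2, 0, 1, 2],
}
flagged = correlated_pairs(toy)
print(flagged)
```

In this toy data only the area/bedrooms pair is flagged. Pairwise correlation is a first check only; it can miss collinearity that involves three or more predictors at once.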
Acknowledgement:
Harrison, D. and Rubinfeld, D.L. (1978) Hedonic prices and the demand for clean air. J. Environ. Economics and Management 5, 81–102.
Belsley, D.A., Kuh, E. and Welsch, R.E. (1980) Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. New York: Wiley.
Objective:
Housing Jittered Scatter Plot:
To get a first look at how this dataset behaves, I plotted it two ways: a jittered scatter plot (above) and a plain scatter plot (below).
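Jittering adds a small amount of random noise to each point so that observations sharing the same value (for example, the same bedroom count) no longer sit exactly on top of each other. The sketch below shows the idea only; the `jitter` helper, its `amount` and `seed` parameters, and the sample values are assumptions, and the plots in this write-up may have been produced differently.

```python
import random


def jitter(values, amount=0.2, seed=0):
    """Return a copy of `values` with uniform noise in [-amount, +amount] added."""
    rng = random.Random(seed)  # fixed seed so the jitter is reproducible
    return [v + rng.uniform(-amount, amount) for v in values]


# Made-up bedroom counts: many points share the same x value and would overlap.
bedrooms = [2, 2, 2, 3, 3, 3, 3, 4, 4]
jittered = jitter(bedrooms, amount=0.2)
```

The jittered values can then be passed as the x coordinates to any plotting library. Keeping `amount` well below the gap between neighbouring values keeps the groups visually separate.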
House Correlation Circle Plot:
Dotplot Comparing House Algorithms:
The models that I felt could give the best possible predictions are as follows:
I tested each of those models and evaluated them on three measures:
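The three measures are not named in this text. As a rough sketch of how regression models are commonly scored, the example below assumes mean absolute error (MAE), root mean squared error (RMSE), and R-squared; these choices, the `regression_metrics` helper, and the sample numbers are assumptions for illustration and may not match the measures actually used here.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R^2 for paired actual and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Made-up prices and predictions, for illustration only.
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
scores = regression_metrics(actual, predicted)
```

Lower MAE and RMSE and a higher R-squared indicate a better fit. Computing the same three numbers for every candidate model on held-out data gives a like-for-like comparison.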
Conclusion:
Overall, based on the dotplot above, I believe three models can best predict where housing trends will go with minimal error. If you would like to know more about these models in detail, please visit my GitHub site at www.github.com/pc1991/Housing. Thank you very much for your time and consideration. I hope to see you again next time.