1 - Overview of Statistical Modelling

Functions of Variables

In our model, the response variable is on the left side. It is the focus of our research: the variable that we try to predict. The predictor variable(s), which are used to predict the response variable, are on the right side.

Types of Variables

Variables can be continuous, categorical or ordinal. Continuous variables are numeric measurements, like the sale price of a home. Categorical variables take specific non-numeric levels, such as the heating quality of a home: average, fair, good or excellent. Ordinal variables are similar to categorical variables but have a natural hierarchy, like small, medium and large coffee sizes.
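As a rough illustration of how these variable types might be prepared for a model, the sketch below turns an ordinal value into a rank and a categorical value into dummy (0/1) indicators. The function names and the choice of the first level as the reference level are assumptions made for this example, not something prescribed by the text.

```python
# Levels taken from the examples in the text.
COFFEE_SIZES = ["small", "medium", "large"]                 # ordinal: order matters
HEATING_LEVELS = ["average", "fair", "good", "excellent"]   # treated as categorical here


def ordinal_rank(value, ordered_levels):
    """Map an ordinal value to its position in the ordered list of levels."""
    return ordered_levels.index(value)


def dummy_code(value, levels):
    """Return 0/1 indicators for every level except the first (the reference level)."""
    return [1 if value == level else 0 for level in levels[1:]]


size_rank = ordinal_rank("medium", COFFEE_SIZES)
heating_dummies = dummy_code("good", HEATING_LEVELS)
reference_dummies = dummy_code("average", HEATING_LEVELS)

print(size_rank, heating_dummies, reference_dummies)
```

A continuous variable such as sale price needs no encoding of this kind and can enter the model as a number.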

Which Model to Use

If we have a continuous response and categorical predictor variable(s), we typically use an ANOVA model. If both the response and the predictor variable(s) are continuous, we typically use an Ordinary Least Squares (OLS) regression model:

Y = β0 + β1X1 + … + βkXk + ε
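To make the regression equation concrete, here is a minimal sketch of the one-predictor case (k = 1), where the least squares estimates have a simple closed form. The data values and the function name fit_simple_ols are invented for illustration only.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1) that minimise the sum of squared residuals for y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: living area (hundreds of square feet) and sale price (thousands).
area = [10, 12, 15, 18, 20]
price = [152, 168, 201, 229, 250]

b0, b1 = fit_simple_ols(area, price)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(area, price)]

print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```

The residuals are the sample counterpart of the error term in the equation. With an intercept in the model they sum to zero, up to rounding.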

Finally, if we have a binary categorical response variable and any type of predictor variable(s), we use logistic regression. Here we estimate the probability p of the desired outcome:

logit(p) = β0 + β1X1 + … + βkXk
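The logit is the log of the odds, log(p / (1 - p)), where p is the probability of the desired outcome. The sketch below shows how a value of the linear predictor is converted back into a probability. The coefficients and the predictor value are hypothetical, chosen only to show the arithmetic.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1 - p))


def inverse_logit(z):
    """Convert a log-odds value z back into a probability."""
    return 1 / (1 + math.exp(-z))


# Hypothetical coefficients for a single predictor X1 (not estimated from data).
b0, b1 = -4.0, 0.02
x1 = 250

linear_predictor = b0 + b1 * x1          # this is logit(p)
p = inverse_logit(linear_predictor)      # estimated probability of the outcome

print(f"logit(p) = {linear_predictor:.2f}, p = {p:.3f}")
```

Because inverse_logit always returns a value strictly between 0 and 1, the model's output can be read as a probability whatever values the predictors take.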

Explanatory and Predictive Modelling

No matter which statistical model we use, we need to differentiate between explanatory and predictive modelling.

In explanatory modelling, we try to understand how X is related to Y. Our main concern is to estimate the model parameters accurately, and we use p-values and confidence intervals to judge how well we have done so. We typically have small sample sizes and few variables.

In predictive modelling, we predict future values of a response variable. Our main concern is to make accurate predictions, and we use a holdout or validation data set to check them. We typically have larger sample sizes and many variables.
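The holdout idea can be sketched as follows: fit the model on one part of the data and measure the prediction error on the part it has not seen. The synthetic data, the 25% holdout fraction and the helper names are assumptions made for this example.

```python
import random


def train_holdout_split(rows, holdout_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and set aside a fraction as holdout data."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def fit_line(rows):
    """Least squares intercept and slope for (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def mean_squared_error(rows, b0, b1):
    """Average squared difference between observed and predicted y."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in rows) / len(rows)


# Synthetic data: y = 3 + 2x plus random noise (invented for illustration).
noise = random.Random(42)
data = [(x, 3 + 2 * x + noise.gauss(0, 1)) for x in range(40)]

train, holdout = train_holdout_split(data)
b0, b1 = fit_line(train)                      # estimate on training data only
holdout_mse = mean_squared_error(holdout, b0, b1)

print(f"slope = {b1:.2f}, holdout MSE = {holdout_mse:.2f}")
```

The error on the holdout rows is the quantity of interest in predictive modelling, because those rows played no part in estimating the coefficients.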

