Understanding statistical inference
In previous posts, I mentioned that statistical inference is different from machine learning inference. The key difference is that, in statistical inference, the underlying distribution of the data is assumed to be knowable.
If you are learning machine learning, you probably already have a reasonable idea of machine learning inference. Machine learning inference is the process of using a trained machine learning model to make predictions on new, unseen data. This is the stage where the model, which has already been trained on a dataset, is deployed to perform tasks such as classification, regression, or object detection in real-world scenarios.
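As a rough illustration of the training/inference split, here is a minimal sketch in plain Python. The toy data, the choice of a least-squares line as the "model", and the helper names `fit_line` and `predict` are all assumptions made for this example; a real project would normally use a library model instead.

```python
def fit_line(xs, ys):
    """Training step: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Inference step: apply the already-trained model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data (made up for illustration).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.0, 4.0, 6.0, 8.0]

# Training happens once, on the dataset we already have.
model = fit_line(train_x, train_y)

# Inference: the trained model is applied to inputs it has never seen.
new_x = [5.0, 6.5]
predictions = [predict(model, x) for x in new_x]
print(predictions)
```

Note that nothing in the inference step asks what distribution the data came from; the model is simply applied to new inputs.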
So, what are the steps involved in statistical inference?
Because of the need for the underlying distribution to be knowable, statistical inference involves two steps:
In this case, the Observed Data is the actual data collected from experiments or real-world observations. The Expected Data is the data that we would expect to see if the model or theoretical distribution we are testing is correct.
Because the data come from a sample, goodness of fit tests typically state the null hypothesis as follows: the observed data follow the expected distribution. The test then either rejects or fails to reject this hypothesis based on the test statistic. The test statistic is a value calculated from the observed and expected data. This value is compared to a critical value from a statistical distribution to determine whether to reject the null hypothesis. There is a wide range of goodness of fit tests, depending on the model. We shall cover these in subsequent sections.
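To make the observed-versus-expected comparison concrete, here is a small sketch using Pearson's chi-square test, which is one common goodness of fit test and is chosen here only as an example. The die-roll counts are hypothetical, the helper name `chi_square_statistic` is made up for this post, and the critical value 11.070 is the standard table value for 5 degrees of freedom at a 0.05 significance level.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed data: counts of each face from 60 rolls of a die.
observed = [8, 9, 12, 11, 6, 14]

# Expected data under the null hypothesis "the die is fair":
# every face is equally likely, so 60 / 6 = 10 rolls per face.
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square_statistic(observed, expected)

# Degrees of freedom = number of categories - 1 = 5.
# Critical value of the chi-square distribution for df = 5, alpha = 0.05.
CRITICAL_VALUE = 11.070

reject_null = statistic > CRITICAL_VALUE
print(f"test statistic = {statistic:.2f}, reject null hypothesis: {reject_null}")
```

With these made-up counts the statistic is well below the critical value, so we fail to reject the null hypothesis. That is not proof that the die is fair; it only means the sample gives no strong evidence against it.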
Image source
https://en.wikipedia.org/wiki/Cenote cave (I think it looked like something knowable/unknowable!)