Central Limit Theorem
RAHUL KUMAR
Data engineer skilled in Python, PySpark, SQL, Azure Data Factory, Azure Databricks, Azure Data Lake, and Azure Synapse Analytics. Has created pipelines to ingest data from heterogeneous sources and also builds Python tools.
It is rarely practical to observe an entire population in order to describe its characteristics. In most cases we have to work with samples instead. This is where the Central Limit Theorem comes in. Before jumping to the theorem, let's understand the difference between a population and a sample.
A population is the entire group that you want to draw conclusions about. A sample is the specific group that you will collect data from.
Population
- Hard to observe
- Expensive
- Time consuming
Sample
- Easy to contact
- Less costly
- Less time consuming
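To make the distinction concrete, here is a minimal sketch using only the Python standard library. The population is hypothetical (100,000 values drawn from a skewed exponential distribution), and the sample size of 500 is an arbitrary choice for illustration. It shows that the mean of a small random sample lands close to the mean of the full population.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 values from a skewed (exponential) distribution.
population = [random.expovariate(1.0) for _ in range(100_000)]

# A simple random sample of 500 members, drawn without replacement.
sample = random.sample(population, 500)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"population mean: {population_mean:.3f}")
print(f"sample mean:     {sample_mean:.3f}")
```

The sample touches only 0.5% of the population, yet its mean is usually within a few hundredths of the population mean.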
Central Limit Theorem
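The Central Limit Theorem states that, for a population with any distribution that has a finite mean and variance, the distribution of sample means approaches a normal distribution as the sample size increases. The sample means center on the population mean, and their spread shrinks in proportion to the square root of the sample size. As a rule of thumb, sample sizes of 30 or more are often considered sufficient for the normal approximation to work well, although strongly skewed populations may need larger samples. This is why a sufficiently large random sample can be used to estimate the characteristics of a population, such as its mean, with good accuracy.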
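The theorem can be checked with a small simulation. The sketch below is an illustration under assumed settings, not a proof: the population is taken to be exponential with mean 1 (a strongly right-skewed distribution), and the helper names `sample_means` and `skewness` are made up for this example. For each sample size it draws 5,000 samples, records their means, and reports the spread and skewness of those means. A skewness near 0 indicates a symmetric, bell-like shape.

```python
import random
import statistics


def sample_means(draw, sample_size, n_samples, rng):
    """Return n_samples means, each computed from sample_size independent draws."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def skewness(values):
    """Population skewness: the average cubed z-score (0 for a symmetric shape)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / std) ** 3 for v in values)


def draw_exponential(rng):
    # Assumed population: exponential with mean 1, std 1, skewness 2.
    return rng.expovariate(1.0)


rng = random.Random(42)
results = {}
for n in (1, 5, 30, 100):
    means = sample_means(draw_exponential, n, 5_000, rng)
    results[n] = (
        statistics.fmean(means),   # should stay near the population mean, 1
        statistics.pstdev(means),  # should shrink roughly like 1 / sqrt(n)
        skewness(means),           # should move toward 0 as n grows
    )
    mean, std, skew = results[n]
    print(f"n={n:>3}  mean={mean:.3f}  std={std:.3f}  skewness={skew:.3f}")
```

With sample size 1 the means are just the raw skewed data. As the sample size grows, the average of the sample means stays near 1, their standard deviation falls roughly as 1 / sqrt(n), and the skewness moves toward 0, which is the behaviour the theorem describes.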