"Application of Chebyshev's Theorem"
When we are working with a normal distribution, the empirical rule tells us that:
- About 68% of values fall within one standard deviation of the mean.
- About 95% of values fall within two standard deviations of the mean.
- About 99.7% of values (almost all of them) fall within three standard deviations of the mean.
Example:-
If we measure the IQ scores of individuals and the scores are normally distributed with a mean of 110 and a standard deviation of 10, then based on the empirical rule we can conclude the following.
- About 68% of individuals have IQ scores in the interval 110 ± 1 (10) = [100, 120], 100 being the lower end and 120 being the upper end of the range.
- About 95% of individuals have IQ scores in the interval 110 ± 2 (10) = [90, 130].
- About 99.7% of individuals have IQ scores in the interval 110 ± 3 (10) = [80, 140].
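The arithmetic behind these intervals is simply mean ± k × standard deviation. Here is a minimal Python sketch, assuming the mean of 110 and standard deviation of 10 from the example. The helper names `interval` and `coverage` are made up for illustration, and `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

mean, sd = 110, 10  # values taken from the IQ example
iq = NormalDist(mu=mean, sigma=sd)

def interval(k):
    """Endpoints of the interval mean ± k standard deviations."""
    return (mean - k * sd, mean + k * sd)

def coverage(k):
    """Share of a normal distribution that falls inside interval(k)."""
    lo, hi = interval(k)
    return iq.cdf(hi) - iq.cdf(lo)

for k in (1, 2, 3):
    lo, hi = interval(k)
    print(f"k={k}: [{lo}, {hi}] covers {coverage(k):.1%}")
```

The exact coverages come out close to the rounded 68%, 95% and 99.7% figures, which is why the rule is described as approximate.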
Three key takeaways from the empirical rule:-
- The data distribution should be normal (bell shaped).
- The percentages are approximate, not exact.
- The rule does not apply to non-symmetrical distributions.
The third point above brings us to what is known as Chebyshev's Theorem.
Suppose you have a distribution, symmetrical or asymmetrical, and want to know approximately what percentage of the data lies within a given number of standard deviations of the mean (mean ± 2 standard deviations, mean ± 3 standard deviations, and so on). Chebyshev's Theorem states:
"At least" 1 − 1/k² of the data lie within k standard deviations of the mean, that is, in the interval with endpoints x̄ ± ks for samples and with endpoints μ ± kσ for populations, where k is any number greater than 1 (k does not have to be a whole number).
The important words here are "at least" at the beginning. The theorem gives the minimum proportion of the data that must lie within a given number of standard deviations of the mean. The true proportion found within the indicated region can be greater than or equal to what the theorem states, but never less.
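To make the bound concrete, here is a small Python sketch that evaluates 1 − 1/k² for a few values of k. The function name `chebyshev_lower_bound` is made up for illustration.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the formula gives 0 or less, so it guarantees nothing.
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

for k in (2, 3, 4):
    print(f"k={k}: at least {chebyshev_lower_bound(k):.1%} of the data")
```

For k = 2 the guarantee is 75%, and for k = 3 it is about 88.9%. Compare that with roughly 95% and 99.7% under the empirical rule: Chebyshev's bound is weaker because it makes no assumption about the shape of the distribution.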
Advantages of Chebyshev's Theorem:-
Chebyshev’s Theorem applies to all possible data sets, symmetrical or asymmetrical. It describes the minimum proportion of the measurements that must lie within two, three, or more standard deviations of the mean.
Application of Chebyshev's Theorem:-
Whether your distribution is normal or not, you can use Chebyshev’s Theorem to find the minimum percentage of the data that is clustered around the mean. The actual percentage can be equal to or higher than this minimum.
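As a rough check of this idea, the sketch below compares the observed share of points near the mean with Chebyshev's guaranteed minimum. The data set is hypothetical, made up only to be clearly right-skewed, and the helper `proportion_within` is an illustrative name. `fmean` and `pstdev` are from Python's standard `statistics` module.

```python
from statistics import fmean, pstdev

# Hypothetical right-skewed data set, invented for illustration only
data = [1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20, 35]

def proportion_within(values, k):
    """Share of values lying within k population standard deviations of the mean."""
    mu = fmean(values)
    sigma = pstdev(values)
    inside = [v for v in values if abs(v - mu) <= k * sigma]
    return len(inside) / len(values)

for k in (2, 3):
    observed = proportion_within(data, k)
    guaranteed = 1 - 1 / k**2
    print(f"k={k}: observed {observed:.1%}, guaranteed at least {guaranteed:.1%}")
```

Whatever data you substitute, the observed share should never fall below the guaranteed minimum. For most real data sets it will be noticeably higher.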
Hope you found this article informative. Do let me know your thoughts on it.