Measures of Dispersion: Range, Variance & Standard Deviation

Measures of dispersion describe how spread out the values in a data set are.

Why do we need Measures of Dispersion?

Measures of central tendency (mean, median and mode) are not sufficient to reveal the shape of a data set. Two data sets can share the same mean and still be spread very differently, so to understand the variation among the values we need measures of dispersion.

We consider three major measures of dispersion: range, variance and standard deviation.

Range

The range describes the span between the lower and upper limits of the data set. It is the difference between the largest and the smallest observations.

The range is very sensitive to outliers, because it depends only on the two most extreme values.
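As a small illustration of the range and its sensitivity to outliers, here is a minimal sketch using the same sample data as the rest of this article. The extra value 5000 is a made-up outlier added only for demonstration.

```python
import numpy as np

data = [312, 464, 4, 32, 24, 43, 6]

# Range = largest observation - smallest observation
data_range = max(data) - min(data)  # 464 - 4 = 460

# NumPy's "peak to peak" function computes the same quantity
numpy_range = np.ptp(data)

# Adding a single extreme value (hypothetical outlier) changes the range drastically
with_outlier = data + [5000]
range_with_outlier = max(with_outlier) - min(with_outlier)  # 5000 - 4 = 4996

print(data_range, numpy_range, range_with_outlier)
```

One extra observation moved the range from 460 to 4996, even though the other seven values did not change.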

Variance

Variance is a measure of dispersion in a data set. It is computed by finding the deviation of each element from the mean and squaring it. The variance is the average of these squared deviations.

Note: In the sample variance formula, the denominator is n-1 instead of n, where n is the number of observations in the sample. This use of n-1 is known as Bessel's correction. It corrects the bias that arises when the population variance is estimated from a sample.
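The steps described above (deviation from the mean, squaring, averaging) can be written out by hand. This is only a sketch in plain Python to make each step visible. The variable names are illustrative, and it divides by n, which gives the population variance.

```python
data = [312, 464, 4, 32, 24, 43, 6]
n = len(data)

# Step 1: the mean of the data set
mean = sum(data) / n

# Step 2: deviation of each element from the mean
deviations = [x - mean for x in data]

# Step 3: square each deviation
squared_deviations = [d ** 2 for d in deviations]

# Step 4: average the squared deviations (dividing by n: population variance)
population_variance = sum(squared_deviations) / n

print(population_variance)
```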
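For reference, the two versions of the formula can be written as follows. The notation is an assumption chosen here, since the original formula image is not available: N and \mu are the population size and mean, n and \bar{x} are the sample size and mean.

```latex
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
```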

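import numpy as np

data = [312, 464, 4, 32, 24, 43, 6]

# By default np.var divides by n (population variance)
np.var(data)

# Pass ddof=1 to divide by n-1 (sample variance, Bessel's correction)
np.var(data, ddof=1)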


Standard Deviation

The standard deviation is a statistic that measures the dispersion of a data set relative to its mean. It is the square root of the variance and tells us how tightly the data is concentrated around the mean.

Unlike variance, which is expressed in squared units, the standard deviation has the advantage of being in the same units as the original variable.
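In formula form, the standard deviation is the square root of the variance. The symbols are an assumed notation, with \mu the population mean and \bar{x} the sample mean.

```latex
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}
```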

import numpy as np

data = [312, 464, 4, 32, 24, 43, 6]

# Population standard deviation (divides by n); pass ddof=1 for the sample version
np.std(data)
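To make the link between the two measures concrete, the sketch below checks that the standard deviation is the square root of the variance, and compares the population (n) and sample (n-1) versions. The variable names are illustrative.

```python
import numpy as np

data = [312, 464, 4, 32, 24, 43, 6]

population_std = np.std(data)       # divides by n
sample_std = np.std(data, ddof=1)   # divides by n-1 (Bessel's correction)

# The standard deviation is the square root of the variance
matches_variance = np.isclose(population_std, np.sqrt(np.var(data)))

print(population_std, sample_std, matches_variance)
```

Because the sample version divides by a smaller number, it is always a little larger than the population version for the same data.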

Facts about Standard Deviation:

  • If the standard deviation is small, the data has little spread (i.e., the majority of points fall very near the mean).
  • If standard deviation = 0, there is no spread. This only happens when all data items are the same value.
  • The standard deviation is significantly affected by outliers and skewed distributions.
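The three facts above can be checked with a few toy data sets. The values below are invented purely for illustration and are not taken from any real data.

```python
import numpy as np

tight = [50, 51, 49, 50, 50]     # values close to the mean
constant = [7, 7, 7, 7]          # all values identical
with_outlier = tight + [500]     # same data plus one extreme value

std_tight = np.std(tight)                # small: little spread
std_constant = np.std(constant)          # exactly zero: no spread
std_with_outlier = np.std(with_outlier)  # inflated by the single outlier

print(std_tight, std_constant, std_with_outlier)
```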

Here is a question for you: in the following line plot, arrange the red, green and blue lines in order of their standard deviation. Leave your answers in the comments.

[Image: line plot with red, green and blue lines]


