Measures of Variability


Introduction

So far, we have discussed measures such as the mean and the median, which describe the central tendency of a data set.

Other measures are needed to describe variability, in other words, how far the data points are scattered or spread out from the center.


Measures of Variability:

  • Range:

The range is simply the largest value minus the smallest value.

  • Interquartile Range (IQR):

Conceptually it is similar to the range, but it uses the difference between the 75th percentile and the 25th percentile instead of the largest and smallest values.

In case you are wondering about percentiles: the 25th percentile is the value below which 25% of the data points fall.

IQR = Q3 – Q1, where Q1 is the first quartile (the 25th percentile) and Q3 is the third quartile (the 75th percentile).

The following figure illustrates the calculation and the concept of the IQR:

[Figure: calculation of the IQR]
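As a rough illustration, the quartiles and the IQR can be computed with Python's standard library. The data values below are made up for this example, and the `method="inclusive"` setting is only one of several quartile conventions, so jamovi or SPSS may report slightly different values of Q1 and Q3 for the same data.

```python
import statistics

# Hypothetical data set, chosen only for illustration
data = [2, 4, 4, 5, 7, 8, 9, 11, 12]

# quantiles(n=4) returns the three cut points Q1, Q2 (the median) and Q3.
# method="inclusive" is an assumption; other conventions can give
# slightly different quartiles.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")  # Q1 = 4.0, Q3 = 9.0, IQR = 5.0
```

The middle half of the data lies between Q1 and Q3, so the IQR ignores the most extreme values at both ends.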

  • Limit:

Since we have already discussed the IQR, this is a good moment to introduce outliers and how they are detected.

Consider the data set below:

-80,4,6,8,12,14,16,18

This variable has a range of 98 (18 minus -80). If, for any reason, the -80 is excluded, the range becomes 14 (18 minus 4).
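The arithmetic can be checked with a few lines of Python. The helper name `value_range` is made up for this sketch.

```python
data = [-80, 4, 6, 8, 12, 14, 16, 18]

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

full_range = value_range(data)            # 18 - (-80) = 98

# Drop the suspicious value and recompute
trimmed = [x for x in data if x != -80]
trimmed_range = value_range(trimmed)      # 18 - 4 = 14

print(full_range, trimmed_range)          # 98 14
```

A single extreme value changes the range from 14 to 98, which is why the range on its own is a fragile measure of spread.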

The data point causing the large range is -80, which is called an outlier. Any data point that falls in the range shown below can be considered an outlier, and it may distort the analysis.

How are outliers detected?

[Figure: limits used to detect outliers]
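One widely used convention, assumed here because the original figure is not available, flags any point below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR. The sketch below applies that rule to the data set above. The 1.5 multiplier, the quartile method, and the function name `iqr_limits` are all assumptions made for this example.

```python
import statistics

data = [-80, 4, 6, 8, 12, 14, 16, 18]

def iqr_limits(values, k=1.5):
    """Return (lower, upper) limits under the k * IQR convention (assumed)."""
    # method="inclusive" is one of several quartile conventions
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

lower, upper = iqr_limits(data)

# Anything outside the two limits is flagged as a potential outlier
outliers = [x for x in data if x < lower or x > upper]

print(f"limits: [{lower}, {upper}], outliers: {outliers}")
```

With these assumptions the limits come out as -8.0 and 28.0, so -80 is the only value flagged.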


  • Standard Deviation & Variance:

Variance:

To calculate the variance, first compute the mean, then take the difference between each value of the variable and the mean, square each difference, and average the squared differences:

$$\mathrm{Var}(X) = \frac{1}{N} \sum_{i=1}^{N} \left(X_i - \bar{X}\right)^2$$

The standard deviation is the square root of the variance:

$$\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)}$$
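The steps described above can be sketched in Python using the data set from the outlier example. This version divides by N, matching the "average" in the description. Many tools divide by N − 1 instead (the sample variance), so their output can differ slightly.

```python
import math
import statistics

data = [-80, 4, 6, 8, 12, 14, 16, 18]

# Step 1: the mean
mean = sum(data) / len(data)

# Step 2: squared difference between each value and the mean
squared_devs = [(x - mean) ** 2 for x in data]

# Step 3: average of the squared differences (dividing by N)
variance = sum(squared_devs) / len(data)

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

print(f"mean = {mean}, variance = {variance}, SD = {std_dev:.3f}")

# Cross-check against the standard library's population versions
assert math.isclose(variance, statistics.pvariance(data))
assert math.isclose(std_dev, statistics.pstdev(data))
```

In the standard library, `statistics.variance` and `statistics.stdev` are the N − 1 counterparts of `pvariance` and `pstdev`.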

Skewness:

Skewness describes the shape of a data distribution.

It measures how asymmetric the distribution is, and it is usually illustrated with a density curve, which is a smoothed version of the histogram.

Do not expect a normal distribution all the time.

There are three major types of skewness:

[Figure: negatively skewed, symmetric, and positively skewed distributions]

If the data set has many extremely small values and few extremely large ones (the lower tail is longer than the upper tail), the distribution is negatively skewed. If, on the other hand, it has more extremely large values than extremely small ones (the upper tail is longer), the distribution is positively skewed (right figure).

A symmetric distribution, such as the normal distribution, has a skewness of 0. The skewness is positive for a positively skewed distribution and negative for a negatively skewed one.

The formula for skewness is not complicated, and tools such as jamovi or SPSS will calculate it for a given data set.
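For readers who want to see the calculation, here is a minimal sketch using one common definition: the average cubed deviation from the mean, divided by the cube of the standard deviation (both computed with N). Statistical packages often apply a sample-size correction, so the values they report can differ slightly. The function name `skewness` and the symmetric example data are made up for this sketch.

```python
def skewness(values):
    """Population skewness: mean cubed deviation / (population SD cubed)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance (divide by N)
    m3 = sum((x - mean) ** 3 for x in values) / n  # mean cubed deviation
    return m3 / m2 ** 1.5

data = [-80, 4, 6, 8, 12, 14, 16, 18]  # data set from the outlier example
symmetric = [1, 2, 3, 4, 5]            # hypothetical symmetric data

print(skewness(data))       # negative: the long tail is on the left
print(skewness(symmetric))  # 0.0
```

The sign matches the rule stated above: the extreme small value -80 pulls the skewness below zero, while the symmetric data give exactly 0.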

There are further measures, such as kurtosis, for describing the shape of a distribution, but skewness is more widely used.

#data #dataanalytics #statistics #statisticalanalysis #distributions #histogram #bellcurve #skewness #variance #SD #Deviation #leftskewed #rightskewed

Source: Learning statistics with jamovi: a tutorial for psychology students and other beginners, by Danielle Navarro, David Foxcroft
