Skewness

Skewness is among the first insights we get into a data set when we visualize it. It is a measure of the data set's asymmetry. More precisely, skewness refers to a distortion or asymmetry that deviates from the symmetrical bell curve, or normal distribution, in a set of data. If the bulk of the curve is shifted to the left or to the right, so that one tail is longer than the other, the distribution is said to be skewed.

A perfectly symmetrical data set has zero skew. If it also has a single peak, its mean, median, and mode coincide.
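As a quick check of the symmetric case, here is a minimal sketch using Python's standard `statistics` module. The sample values are made up for illustration and are not taken from any real data set.

```python
from statistics import mean, median, mode

# Hypothetical, perfectly symmetric sample with a single peak at 3.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# For symmetric, single-peaked data the three measures of centre agree.
centre = (mean(symmetric), median(symmetric), mode(symmetric))
print(centre)
```

All three values land on the same point, which is what "lying on the same line" means in a plot of the distribution.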

Skewness can also be defined as the degree of asymmetry observed in a probability distribution.

Measuring Skewness

One simple measure is Pearson's first coefficient of skewness (mode skewness):

Sk = (X̄ - Mo) / s

Where: X̄ = Mean value

Mo = Mode value

s = Standard deviation of the sample data
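The quantities defined above (mean, mode, sample standard deviation) are the ingredients of Pearson's first coefficient of skewness, (mean - mode) / s. Assuming that is the formula intended here, a minimal sketch looks like this. The function name `pearson_mode_skewness` and the sample data are hypothetical.

```python
from statistics import mean, mode, stdev

def pearson_mode_skewness(values):
    """Pearson's first coefficient: (mean - mode) / sample standard deviation."""
    return (mean(values) - mode(values)) / stdev(values)

# Hypothetical right-skewed sample: most values are small, one is large.
right_skewed = [1, 2, 2, 2, 3, 4, 10]
sk = pearson_mode_skewness(right_skewed)
print(sk)  # positive, because the mean is pulled above the mode
```

This measure only makes sense when the data have a clear single mode. `statistics.mode` returns the first mode it encounters if there are several.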

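# Skewness of each column, computed along the index axis (axis=0).
# Assumes df is an existing pandas DataFrame. DataFrame.skew returns the
# bias-corrected, moment-based skewness, not the mean-minus-mode coefficient.
df.skew(axis=0, skipna=True)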
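The pandas call above does not use the mode. To the best of our knowledge it reports the bias-corrected moment-based skewness (the adjusted Fisher-Pearson coefficient). The sketch below reproduces that arithmetic in plain Python so each step is visible. The function name `adjusted_skewness` and the sample column are illustrative assumptions, and the code does not handle missing values or constant data.

```python
from math import sqrt

def adjusted_skewness(values):
    """Bias-corrected moment skewness (adjusted Fisher-Pearson coefficient)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5                            # uncorrected skewness
    return sqrt(n * (n - 1)) / (n - 2) * g1        # small-sample correction

sample = [1, 2, 2, 2, 3, 4, 10]  # hypothetical right-skewed column
print(adjusted_skewness(sample))
```

A positive result means a longer right tail and a negative result a longer left tail. A symmetric sample such as [1, 2, 3, 4, 5] gives zero.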

Significance of Skewness

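  • Knowing the skewness of the data helps us build better linear models, since many of them work best when variables or residuals are roughly symmetric.
  • Its sign tells us the direction of the long tail, and so where outliers tend to lie: positive skew means a longer right tail, negative skew a longer left tail.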

How Do We Transform Skewed Data?

  • Power Transformation
  • Log Transformation
  • Exponential Transformation
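To illustrate one of the transformations listed above, here is a minimal sketch of a log transformation applied to a made-up, strictly positive, right-skewed sample. The helper `moment_skewness` and the `incomes` values are hypothetical. The natural log requires every value to be greater than zero.

```python
from math import log

def moment_skewness(values):
    """Uncorrected moment skewness: m3 / m2^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed data: a few very large values stretch the right tail.
incomes = [20, 22, 25, 27, 30, 35, 40, 60, 120, 400]
logged = [log(x) for x in incomes]  # compresses large values more than small ones

before = moment_skewness(incomes)
after = moment_skewness(logged)
print(f"skewness before: {before:.2f}, after log: {after:.2f}")
```

On this sample the log transformation reduces the skewness but does not remove it. How much it helps depends on the data, so it is worth re-checking the skewness after any transformation.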



Alok Singh Bhadauria

Analytics Manager at EXL | Credit Risk, Logistic Models, Data Science, Machine Learning

3 years ago

The difference between dispersion and skewness is very interesting: dispersion measures how widely a data set is spread over its range, whereas skewness measures how asymmetric a distribution is compared with a symmetric one such as the normal distribution. #dataanalyst #statistics #datascience
