IPL Fever ... Some stats gyan

How would you interpret the batting averages of 99.94 for Bradman and 53.7 for Tendulkar? Does it mean that in every innings Bradman played, he scored 99.94 runs, or that over every 10 innings Tendulkar scores 537 runs? Does it mean that if Bradman scored a zero, he would score about 200 in the next innings to get back to an average of 99.94? What can you say about the probability of Tendulkar scoring a 50? Is it 1, 0.75 or 0.5? What about for Bradman?

The mean is a misleading number in this situation. A batsman's individual scores have a distribution that is skewed to the right, which in simple words means that most of the scores cluster around a number, with a few very large scores. Tendulkar scored 51 centuries in 329 innings, while Bradman scored 29 centuries in 80 innings. For variables with a skewed distribution, the median, the score that divides all the scores into two equal halves, is a much more representative number. For example, the median score for Bradman is 56.5. This means that of the 80 innings Bradman played, 40 scores were below 56.5 and 40 above. For Tendulkar, the median is 32, meaning that of his 329 innings, 164 were below 32 and the rest above it. The median also lets you say that there is a 50% chance that Bradman will score above 56, while for Tendulkar there is a 50% chance that he will not cross 32.
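
To make the gap between the two numbers concrete, here is a small Python sketch. The scores in it are made up purely for illustration (they are not the Cricinfo data for either batsman), and `share_at_least` is a helper name invented here. It shows how a few big scores pull the mean well above the median, and how the chance of reaching 50 can be read off as a simple share of innings.

```python
from statistics import mean, median

# Hypothetical innings scores, invented for illustration only
# (NOT the real data for Bradman or Tendulkar).
scores = [0, 4, 8, 12, 15, 21, 27, 33, 41, 58, 96, 143, 201]


def share_at_least(values, threshold):
    """Fraction of innings in which the score reached the threshold."""
    return sum(v >= threshold for v in values) / len(values)


# A few large scores drag the mean far above the typical innings.
print("mean:", round(mean(scores), 1))
print("median:", median(scores))
print("share of innings with 50 or more:", round(share_at_least(scores, 50), 2))
```

With these invented numbers the mean is close to 51, yet half the innings end at 27 or fewer and only about 3 in 10 reach 50.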

I think I can relate to 32 much better than to the average of 53.7, because the memories of each of the 51 centuries are not as strong as the memories of the 164 scores below 32.

Also, leaving the not-out innings out of the count when calculating the mean is not a very good way of adjusting for censoring (a not-out innings ends before the batsman is dismissed). The median is much better, as individual scores do not affect it much.
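
A rough sketch of what that adjustment does, again in Python and again with invented innings rather than real data. It assumes the usual convention that the batting average is total runs divided by the number of dismissals; the function names are hypothetical and chosen only for this example.

```python
from statistics import median

# Hypothetical innings as (runs, not_out) pairs, invented for illustration.
innings = [
    (12, False), (45, True), (0, False), (103, True),
    (30, False), (7, False), (64, False),
]


def batting_average(records):
    """Conventional average: total runs divided by dismissals, not innings."""
    runs = sum(r for r, _ in records)
    dismissals = sum(1 for _, not_out in records if not not_out)
    if dismissals == 0:
        raise ValueError("average is undefined without a dismissal")
    return runs / dismissals


def runs_per_innings(records):
    """Plain mean over every innings, treating not outs like any other score."""
    return sum(r for r, _ in records) / len(records)


def median_score(records):
    """Median of the recorded scores, ignoring the not-out flag."""
    return median(r for r, _ in records)


print("batting average:", round(batting_average(innings), 1))
print("runs per innings:", round(runs_per_innings(innings), 1))
print("median score:", median_score(innings))
```

Here the two not-out innings add runs but no dismissals, so the conventional average (52.2) sits well above the plain runs per innings (about 37.3), while the median stays at 30.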

We should use medians rather than means to rank batsmen on what they do in an individual innings. Even with this measure, Bradman remains the king!

Raw data downloaded from Cricinfo Statsguru.

Shantanu Nagayech

Lead, Rewards People Insights

6 years ago

This calls out for a Mann-Whitney U test to really see who is better ... just a thought :-)

Dr. Aswin Kumar

Pharmaceutical Physician/Medical Writing

6 years ago

Really liked the simple analogy comparing the scores of Sachin and Bradman. Even in clinical trial data, when the sample is small or not normally distributed, the median (min-max) sometimes provides a better view of the results than the mean (SD). It is such small tricks that make report writing an art rather than a science.

Sai Kailash Uppalapati

SAS (Software), Statistical Programming |Open to Contract, Remote and Hybrid opportunities, | Statistical Programmer | Senior Statistical Programmer | San Francisco Bay Area |

6 years ago

This is called batting, bowling and fielding with data ... nice perspective

Anurag Bagaria, Ph.D.

Associate Director (Data Science), AI solutions at Scale for Pharma through Data and Tech

6 years ago

Makes complete sense.

