Do you understand what correlation is? If you don't know Simpson's paradox, then you don't understand correlation!
Picture credit: https://predictivehacks.com/simpsons-paradox-example/


A few days ago, I published my first LinkedIn article on "Fundamental Attribution Error, and Correlation vs. Causality". It was well received among some of my friends, and that encouraged me to think deeper on related subjects.

Today I am going to talk about correlation-based reasoning in the real world, which usually takes the form of "I like A because A is simple, and simplicity is usually better". While we all understand the first flaw (correlation vs. causality), there is actually a second flaw as well: the same data can support either a positive or a negative correlation.

Before we dive into it, I first want to say that while this kind of reasoning has its flaws, I use it frequently myself because it is a very efficient way of thinking. Reasoning with known flaws may not be a bad idea, because the goal is not just accurate reasoning but also efficient reasoning.

Now let's dive in with the example "simplicity is usually better", or more generally, "X and Y often happen together", or "The group with X=1 is more likely to have attribute Y than those with X=2". This is usually concluded from a set of data points from our earlier experience or investigation or knowledge. Ok. So we are data-driven already, right? The correlation must be true because it was from real data!

What's wrong? There is something called Simpson's paradox. It shows that from exactly the same dataset, we can use statistics to conclude both "The group with X=1 is more likely to have attribute Y than those with X=2" and "The group with X=2 is more likely to have attribute Y than those with X=1". Whether we get the former or the latter conclusion depends on which third variable(s) we fix!
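To make the reversal concrete, here is a minimal sketch in Python. The counts are hypothetical, invented purely for illustration, and the third variable Z with its values "easy" and "hard" is an assumption of this example, not something taken from a real dataset. It computes the share of each X group that has attribute Y, first with Z fixed and then with Z ignored.

```python
# Hypothetical counts, invented for illustration only:
# (x, z) -> (number of members with attribute Y, group size)
counts = {
    (1, "easy"): (9, 10),
    (1, "hard"): (30, 100),
    (2, "easy"): (80, 100),
    (2, "hard"): (2, 10),
}


def rate(x, z=None):
    """Share of group X=x that has attribute Y, optionally with Z fixed to z."""
    cells = [v for (xi, zi), v in counts.items()
             if xi == x and (z is None or zi == z)]
    with_y = sum(c[0] for c in cells)
    total = sum(c[1] for c in cells)
    return with_y / total


# Fix the third variable Z: X=1 looks better in every subgroup.
for z in ("easy", "hard"):
    print(f"Z={z}: X=1 -> {rate(1, z):.0%}, X=2 -> {rate(2, z):.0%}")

# Ignore Z: the comparison flips and X=2 looks better overall.
print(f"Z ignored: X=1 -> {rate(1):.0%}, X=2 -> {rate(2):.0%}")
```

With these made-up numbers, X=1 wins when Z is fixed (90% vs. 80% for "easy", 30% vs. 20% for "hard"), yet X=2 wins when Z is ignored (about 35% vs. 75%). The flip happens because most X=1 members fall into the "hard" subgroup while most X=2 members fall into the "easy" one.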

If you are interested in this topic, here are some further readings from the same Wikipedia page. I will conclude this article with this summary: being data-driven is easy, but being data-driven without flaws takes a lot of effort!

Carrie Graham, PhD

Contract Learning Solutions Architect | Strategic Workforce Development: Linking C-Suite Vision to Measurable Training ROI | Workplace Learning Strategist

2 years ago

Interesting post, Zheng Shao. Using #datadrivendecisionmaking in business is critical to overall #kpi success. Here's a conversation that complements your article: https://drcarriegraham.com/insights/mission-driven-data
