Stop Confusing Correlation with Causation
Have you heard the story about the lotto winner who decided to buy a beachfront resort?
Well, after the acquisition was finalized, he was given all the data that the resort had been collecting over the last several years. His sister was in business and was always talking about the importance of data and making data-driven decisions.
So, he started combing through all the data to see if anything popped up.
And it did. He noticed something that could expose him to additional liability and terrible PR and, most importantly, could put his guests at risk.
He found that the more scoops of ice cream the ice cream shop sold on a given day, the more people the lifeguards had to save from drowning. There was a very clear, direct relationship in the data: more ice cream sales went hand in hand with more people needing to be rescued.
Thankfully, he caught this right before the summer and made the decision to shut down the ice cream shop permanently. He felt good that night knowing that he had probably saved someone’s life this year.
As crazy as the story above sounds, it happens in business all the time. When presented with data:
· We are wired to find patterns, even when none exist
· We tend to focus on the data in front of us, even when more important information is missing
· We tend to look for data that supports our initial hypothesis rather than data that challenges it
In the story above, the important missing data was obviously the number of resort guests per day. That data would have shown that as the number of guests climbs, ice cream sales climb, and so does the number of guests who need to be saved from drowning. The more guests there are, the more likely it is that lifeguards will need to save someone.
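To make the missing-variable point concrete, here is a minimal sketch using made-up data. It assumes, purely for illustration, that daily guest count drives both ice cream sales and rescues, and that ice cream has no effect on rescues at all. Every number, name, and coefficient below is invented; none of it comes from a real resort.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic data: all values are assumptions made up for this sketch.
guests, scoops, rescues = [], [], []
for _ in range(1000):
    g = rng.uniform(50, 500)                      # guests at the resort that day
    guests.append(g)
    scoops.append(0.8 * g + rng.gauss(0, 20))     # sales depend only on guests
    rescues.append(0.01 * g + rng.gauss(0, 0.5))  # noisy stand-in for rescues,
                                                  # also driven only by guests

raw_r = pearson(scoops, rescues)
partial_r = partial_correlation(scoops, rescues, guests)

print(f"ice cream vs. rescues, raw:                  {raw_r:.2f}")
print(f"ice cream vs. rescues, holding guests fixed: {partial_r:.2f}")
```

In this made-up setup the raw correlation comes out strongly positive, while the correlation with guest count held fixed sits close to zero. That is the pattern you would expect when a third variable is doing all the work.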
Another great example is a story one of my college professors told on the very first day of a queueing theory class I took. It had to do with marijuana as a gateway drug.
He stated that there is some very frightening and damning evidence from studies showing that over 90% of heroin users first started with marijuana. 90% is a big, scary number. He also said that this topic had been debated and that these stats had been used in Congress when talking about the dangers of marijuana.
But, he said, this is looking at the data backwards. More than 90% of heroin users also drank milk before going on to use heroin, but we shouldn’t be banning milk. The problem is that when a behavior is common in the general population, most members of any small group, heroin users included, will have done it too, so the percentage is bound to be large. The more telling data point would be: of marijuana users, what percentage went on to use heroin? And how does that compare with those who never used marijuana?
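The difference between the two directions of the question is easy to see with arithmetic. The sketch below uses a hypothetical population whose counts are invented solely so that the 90% figure holds; they are not real statistics, and the function and field names are mine. It computes the share of heroin users who had the earlier behavior, and then the two rates the professor said we should actually compare.

```python
from typing import NamedTuple


class Rates(NamedTuple):
    p_exposed_given_outcome: float    # e.g. share of heroin users who used X first
    p_outcome_given_exposed: float    # share of X users who went on to use heroin
    p_outcome_given_unexposed: float  # share of non-X users who used heroin


def conditional_rates(population, exposed, outcome, both):
    """Compute both directions of the conditional from raw head counts."""
    unexposed = population - exposed
    outcome_unexposed = outcome - both
    return Rates(
        both / outcome,
        both / exposed,
        outcome_unexposed / unexposed,
    )


# Hypothetical counts, invented for illustration only.
marijuana = conditional_rates(
    population=1_000_000, exposed=400_000, outcome=1_000, both=900
)
milk = conditional_rates(
    population=1_000_000, exposed=990_000, outcome=1_000, both=990
)

for name, r in (("marijuana", marijuana), ("milk", milk)):
    print(name)
    print(f"  of heroin users, share who used it first: {r.p_exposed_given_outcome:.1%}")
    print(f"  of its users, share who later used heroin: {r.p_outcome_given_exposed:.3%}")
    print(f"  of non-users, share who later used heroin: {r.p_outcome_given_unexposed:.3%}")
```

With these invented counts, the headline figure is 90% or higher in both cases, yet only a small fraction of a percent of either group went on to use heroin. For milk, users and non-users have exactly the same rate. Even where the two rates differ, that gap is still only a correlation and says nothing on its own about cause.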
Ultimately, he said he had no opinion on how harmful marijuana is, or even on whether it is a gateway drug. His point was only that the argument taking place in Congress at the time was based on an illogical application of math.
Or how about a business-specific example:
As a company, we sent 1,000 emails in February. When we sent 3,000 emails in March, our revenue doubled. Thus, the conclusion is that we need to send more emails.
Although this is a very basic example, decisions based on correlations like this happen frequently. Questions about who sent the emails, the length of the sales cycle, historical data, which regions the emails were sent to, and so on are often never asked.
Overall, we have access to more data and information than we’ve ever had before. But that also means we’re all more susceptible to drawing incorrect conclusions and assuming causation from correlated data. This can have an enormous effect on the bottom line for businesses.
Knowing this, here are some ways to prevent making poor business decisions based on the above:
· Use common sense and think through what’s being presented
· Ask more questions (where the data came from, why this thesis was drawn from the data, what other factors might be at play, etc.)
· Think about the methodology used to generate the data and what data might be missing
· Check whether the data lines up with historical trends
· Get multiple individuals and departments involved in analyzing the data and the conclusions drawn, since biases can creep in otherwise
· Think through ways of diversifying the collection of data
· Get outside third parties to validate your thesis
Everywhere from board meetings to TikTok, we’re being inundated with more data and more causal claims than ever before. Before taking anything at face value, think about the source, ask questions, and seek out different and diverse perspectives.
Founder at RevdUp
1 year ago: So true! We often struggle to distinguish between causation and correlation, especially when looking at Gong data (I’m admittedly a Gong super fan). For instance, through Gong we discovered that our win rate jumps nearly 20% when pricing is discussed at the Discovery stage. What I struggled to determine was whether these deals close at a higher rate because the seller proactively discussed pricing early on, or because higher-intent buyers were asking about pricing earlier. By listening to the actual calls, we determined it was a combination of the two, and we have since trained our team to discuss pricing earlier in the sales process. However, it gets trickier when looking at interaction trackers such as talk ratio. Do our top sellers have a higher listen-to-talk ratio because they ask better questions, because they listen more actively, or because they target better prospects? And as a corollary, which trainings should we prioritize, and which traits should we hire for? If you or anyone reading this has any advice on how to unpack causation vs. correlation when looking at Gong or Chorus trackers, I’d love to hear how you think through it.
Realtor Associate @ Next Trend Realty LLC | HAR REALTOR, IRS Tax Preparer
1 year ago: Thank you for sharing.