A Guide to Accurate Analysis: Overcoming Common Mistakes and Biases


Have you ever found yourself convinced by an argument, only to realize later that it was based on flawed reasoning?

We all fall prey to analytical mistakes and biases, but the good news is that they can be identified and avoided.

In this article, we dive into the common pitfalls of analysis and provide tips on making more accurate and unbiased decisions.




Thinking Trap #1 — Survivorship bias

A selection error can lead to the wrong conclusions.

A classic example

People judge the “kindness” of dolphins from the accounts of swimmers whom dolphins pushed toward the shore.

Those whom dolphins pushed out to the open sea never make it into the analysis, although there may be many more of them.

Example from business

You surveyed your client base and noticed that none of your clients are on Instagram.

The conclusion that you have a “special” audience or that Instagram is generally unpopular is unfounded.

Most likely, you simply never used this channel to attract customers, so the Instagram audience had no way to learn about your company.

How to avoid the trap?

Check that the subjects in both your target and comparison groups are drawn from the same underlying base.
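To make the trap concrete, here is a toy simulation (all numbers and channel names are invented): 40% of the potential audience is on Instagram, but the company only ever advertised by email, so the surveyed client base contains no Instagram users at all.

```python
import random

random.seed(0)

CHANNELS = ["instagram", "email", "search"]

# Hypothetical population: each person discovers companies via one channel.
population = [random.choices(CHANNELS, weights=[40, 30, 30])[0]
              for _ in range(100_000)]

# By assumption, the company only ran email campaigns, so only people
# reachable by email ever became clients.
clients = [c for c in population if c == "email"]

share_pop = population.count("instagram") / len(population)
share_clients = clients.count("instagram") / len(clients)

print(f"Instagram users in the population: {share_pop:.0%}")   # roughly 40%
print(f"Instagram users among clients:     {share_clients:.0%}")  # 0%
```

Surveying only the clients tells you about your acquisition channels, not about the market.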




Follow my blog or follow me on LinkedIn for more information about Software Development, Machine Learning and The Modern Developer Lifestyle




Cognitive Trap #2 — Correlation Error

Correlation ≠ Causation.

The fact that two variables move together over time does not, by itself, indicate a causal relationship between them.

Classic example

The fact that happy people eat more sweets may indicate the following:

Sugar consumption leads to happiness;

Happy people take everything from life, including lots of sweets;

There is a variable that explains the propensity for sugar and joy (the sweet life gene!);

Sugar and happiness are NOT related; the available data is a random coincidence.

Random, like the 0.67 correlation between the number of Nicolas Cage movies and the number of deaths in swimming pools or the 0.99 correlation between the number of divorces and margarine consumption.

Other ridiculous correlations are collected on the spurious correlations website.

A business example

The closer the summer season gets, the more a company spends on remarketing and the more orders that company has.

How to avoid the trap?

The only reliable way to establish a causal relationship between two variables is to conduct a controlled experiment (an A/B test).
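A small sketch of how a shared driver produces a strong correlation with no causal link. The data below is entirely synthetic: a seasonal cycle drives both ad spend and orders, while spend has no effect on orders at all.

```python
import math
import random

random.seed(1)

def corr(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Two years of weekly data: "season" drives BOTH spend and orders.
weeks = range(104)
season = [math.sin(2 * math.pi * w / 52) for w in weeks]       # yearly cycle
spend  = [100 + 50 * s + random.gauss(0, 5) for s in season]    # follows season
orders = [1000 + 400 * s + random.gauss(0, 40) for s in season]  # follows season

print(f"corr(spend, orders) = {corr(spend, orders):.2f}")  # strongly positive
```

The correlation is close to 1 even though, in this toy model, cutting ad spend to zero would not change orders at all.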




Thought Trap #3 — Multicollinearity

This is a special case of the correlation error: the apparent relationship is explained by a third variable related to both traits being studied.

Classic example

The more churches there are in a city, the more crime there is. Does this mean that churches generate crime (or vice versa?) — NO!

This strange behaviour of a pair of attributes is explained by a third variable: the size of the city. The bigger it is, the more churches and crimes it will have.

An example from business

It has been observed that those who leave angry reviews on the app have a much higher LTV than the rest.

Hypotheses sprang up: perhaps these customers were emotionally invested in the product…

Or those who care about the product will criticize it because they use it often and sincerely want the service to change…

The actual explanation turned out to be like with the size of a city: the longer a customer “lives” with a company, the more likely they are to leave an angry review sooner or later.

How do you avoid the trap?

In simple terms, you must hold the time factor constant for both groups.

To do this, compare the LTV of customers who left a review in their first seven days with the LTV of those who didn’t leave a review but definitely used the product in their first seven days.
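A toy illustration of the fix (all data is simulated and the effect sizes are invented): LTV grows with tenure only, yet the naive reviewers-vs-silent comparison shows a large gap that disappears once both groups are restricted to the first seven days.

```python
import random

random.seed(2)

# Simulated customers: tenure in days, day of first angry review
# (None if they never reviewed), and an LTV that depends on tenure ONLY.
customers = []
for _ in range(10_000):
    tenure = random.randint(1, 720)
    reviewed = random.random() < tenure / 1000   # long-lived customers review more
    review_day = random.randint(1, tenure) if reviewed else None
    ltv = 0.5 * tenure + random.gauss(0, 10)
    customers.append((tenure, review_day, ltv))

def mean_ltv(group):
    return sum(c[2] for c in group) / len(group)

# Naive comparison: reviewers look far more valuable.
reviewers = [c for c in customers if c[1] is not None]
silent    = [c for c in customers if c[1] is None]

# Time factor fixed: customers active at least 7 days, split by whether
# they left a review within their first 7 days.
week_rev  = [c for c in customers if c[0] >= 7 and c[1] is not None and c[1] <= 7]
week_none = [c for c in customers if c[0] >= 7 and not (c[1] is not None and c[1] <= 7)]

print(f"naive:   {mean_ltv(reviewers):.0f} vs {mean_ltv(silent):.0f}")
print(f"matched: {mean_ltv(week_rev):.0f} vs {mean_ltv(week_none):.0f}")
```

In this model the naive gap is driven entirely by tenure; the matched comparison lands close to zero.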

Controlled experiments (A/B tests) are a universal way to establish true causality.

As part of the experiment, we expose the test group to an effect (an ad, a discount, a new product feature) and place the control group in precisely the same environment but NOT exposed.

Then we look at how the test and control group’s target metric (conversion, average check, LTV) differs. The difference in the metric (if any) is attributed to the effect of the only factor that distinguished the test and control group experience.

AB tests help to test hypotheses and assumptions about potential improvements, but there are also many pitfalls in the process.
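As a sketch of the mechanics, here is a two-proportion z-test on made-up conversion counts; this is one standard way to read out an A/B test on a conversion metric, not the only one.

```python
import math

def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: is the test group's conversion rate
    significantly different from the control's?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return p_a, p_b, z

# Hypothetical numbers: control converts 500/10,000, test 590/10,000.
p_a, p_b, z = ab_test(500, 10_000, 590, 10_000)
print(f"control {p_a:.1%}, test {p_b:.1%}, z = {z:.2f}")
# |z| > 1.96 corresponds to p < 0.05 (two-sided)
```

With these invented counts the lift clears the 1.96 threshold, so the difference would be called significant at the 95% level.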




Cognitive Illusion #4 — Heterogeneous groups


If the test group is filled with a more favourable audience from the start, its metrics will be higher not because of the factor being studied but because of the test group’s compositional advantage.

A classic example

The researchers believe that the new Nice cleats will help players play better.

To do this, the test group, the England soccer team, was given the new Nice cleats, while the control group, the East Timorese team, played in their usual shoes.

England won, but the cleats didn’t help.

A business example

A food delivery service decided to test how unexpected surprises on March 8 (International Women’s Day) would affect the LTV of female customers.

A logical control segment might seem to be male customers (since they don’t get surprises on March 8), but comparing LTV between such groups would be a mistake.

Men, on average, eat more and earn more (temporarily and unfairly, but in fact), which means they order more food and have a higher LTV regardless of any surprise.

How to avoid the trap?

When designing experiments, use random mixing and quotas common to the control and test groups. Verify that the test and control groups have a homogeneous composition.
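One way to sketch “random mixing with common quotas” is a stratified split: shuffle within each segment and halve it, so both groups end up with the same segment mix (the segment names below are invented).

```python
import random
from collections import defaultdict

random.seed(3)

# Hypothetical users, each with a segment that could bias group composition.
users = [{"id": i, "segment": random.choice(["new", "loyal", "vip"])}
         for i in range(9_000)]

# Stratified split: shuffle WITHIN each segment, then halve it, so test and
# control share the same segment quotas.
by_segment = defaultdict(list)
for u in users:
    by_segment[u["segment"]].append(u)

test, control = [], []
for seg_users in by_segment.values():
    random.shuffle(seg_users)
    half = len(seg_users) // 2
    test.extend(seg_users[:half])
    control.extend(seg_users[half:])

def share(group, seg):
    return sum(u["segment"] == seg for u in group) / len(group)

for seg in ["new", "loyal", "vip"]:
    print(f"{seg}: test {share(test, seg):.1%} vs control {share(control, seg):.1%}")
```

Both groups now have near-identical segment shares, so any metric gap can be attributed to the treatment rather than to composition.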




Cognitive Bias #5 — Small samples

In sampling studies (when we judge the entire population by a fraction of subjects), we often find a segment in which the metric is higher or lower than the average.

It may be tempting to draw far-reaching conclusions, but such findings would be erroneous without a confidence interval calculation.

Remember: if the proportion of a trait in a sample of 200 people is 10%, then the true proportion in the general population lies, with 95% probability, somewhere between 6% and 14%.

The smaller the sample, the wider that range.
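The 6–14% figure from the text can be checked with the normal-approximation interval p ± 1.96 * sqrt(p * (1 - p) / n):

```python
import math

def proportion_ci(p, n, z=1.96):
    """Normal-approximation 95% confidence interval for a sample proportion."""
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

lo, hi = proportion_ci(0.10, 200)
print(f"n=200: {lo:.0%} - {hi:.0%}")   # roughly 6% - 14%, as in the text
lo, hi = proportion_ci(0.10, 50)
print(f"n=50:  {lo:.0%} - {hi:.0%}")   # smaller sample, much wider interval
```

Shrinking the sample from 200 to 50 widens the interval roughly twofold, which is why small-sample "findings" so often evaporate.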

A classic example

You have two perfectly shaped coins: a green one and a blue one.

You flip the green one ten times and the blue one a thousand times.

The green one comes up tails 30% of the time, and the blue one comes up tails 49.4% of the time.

Does this mean that colour affects how often you get tails?

Business example

Yet another brand-health measurement caught a jump in spontaneous brand awareness among the older audience; the team celebrated at a corporate event and paid a bonus to the colleagues who run the newspaper campaigns.

On the next measurement, the metric among the older audience bounced back to its usual values, even though an even larger budget had been spent on newspapers.

How to avoid the trap?

Always look at the sample average with an eye on the confidence interval.


Cognitive Mistake #6 — The Peeking Error

If we constantly peek at the intermediate results of an experiment, then one day we will see the result we want, and the temptation to stop the experiment at that point will be too great.

A classic example

A new product manager joined your company and claimed he knew a way to get tails on 100% of coin flips: paint the coin red.

You offered to test this hypothesis with an experiment. Your new colleague agreed, but stopped the run as soon as tails came up twice in a row and presented this as proof that his idea worked.

Think of the perfectly shaped coins we flipped in the small-samples trap above.

Example from business

You’ve launched an A/B test, and you’re so curious that you check the results every day.

For three days in a row, the test group performed better than the control group, so you closed the experiment early and declared it a success.

How do you avoid the trap?

Calculate a sufficient sample size in advance, and don’t peek until it has accumulated.
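A minimal sketch of that pre-computed sample size, using the standard normal-approximation formula for comparing two proportions (the lift from 5% to 6% is an invented target):

```python
import math

def sample_size(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Per-group sample size to detect a shift from p1 to p2 with ~95%
    confidence and ~80% power (normal approximation)."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Hypothetical target: detect a conversion lift from 5% to 6%.
n = sample_size(0.05, 0.06)
print(f"wait for at least {n} users per group before looking at results")
```

Deciding the stopping point before the experiment starts is what removes the temptation that the peeking error feeds on.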




Understanding and avoiding analytical mistakes and biases is crucial for making informed decisions.

By being aware of our biases, critically evaluating information, and considering alternative perspectives, we can improve the accuracy and objectivity of our analysis.

Remember, no one is immune to errors in reasoning, but with effort and practice, we can all strive to become better and more trustworthy analysts.

So, let’s put these tips into practice and make our analyses as robust as possible.


