Cramer’s V
Detective Conan. Image source: pinterest

In the previous articles, we saw different forms of correlation. This time, we turn to another measure of association: Cramer’s V.

Cramer’s V is a statistical measure used to assess the strength of association between two categorical variables in a contingency table. It’s particularly useful for tables larger than 2x2, meaning it can be applied to variables with more than two categories.

What is the difference between Cramer’s V and the Phi coefficient?

  • The Phi coefficient is used specifically for 2x2 contingency tables, so it applies only when both variables are binary (two categories each).
  • Cramer’s V is a more general measure that works for contingency tables of any size, which makes it suitable for variables with more than two categories.
  • Like the Phi coefficient, Cramer’s V is based on the Chi-square statistic, but it is adjusted for the size of the contingency table.

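Note: When computed from the Chi-square statistic, both Cramer's V and the Phi coefficient range from 0 to 1, where 0 indicates no association and 1 indicates a perfect association. (Phi computed directly from the four cell counts of a 2x2 table can also carry a sign, in which case it ranges from -1 to 1.)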
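To make the link between the two measures concrete, here is a minimal Python sketch that computes both on the same 2x2 table. The function names and the counts are made up for illustration only. For a 2x2 table, Cramer's V should equal the absolute value of the signed Phi coefficient.

```python
import math

def phi_coefficient(a, b, c, d):
    """Signed Phi for the 2x2 table [[a, b], [c, d]]."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom

def cramers_v_2x2(a, b, c, d):
    """Cramer's V for the same table, computed via the Chi-square statistic."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # min(r - 1, c - 1) is 1 for a 2x2 table, so no further division is needed
    return math.sqrt(chi2 / n)

# Hypothetical counts, chosen only for illustration
phi = phi_coefficient(10, 20, 30, 40)
v = cramers_v_2x2(10, 20, 30, 40)
print(round(phi, 4), round(v, 4))  # -0.0891 0.0891
```

The sign of Phi tells you the direction of the association in a 2x2 table; Cramer's V only reports its strength.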

You can find more information about the Chi-square test in the article below

How do we compute it?

$$V = \sqrt{\frac{\chi^2 / n}{\min(r - 1,\; c - 1)}}$$

Cramer’s V formula. Image source: empirical-methods

  • V = Cramer’s V coefficient
  • χ² = Chi-square statistic from the chi-square test
  • n = Total number of observations
  • r = Number of rows in the contingency table
  • c = Number of columns in the contingency table
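As a rough sketch of how these pieces fit together, the formula can be written as a small Python function that takes a contingency table as a list of rows. The function name `cramers_v` and both example tables are hypothetical, and the sketch assumes that no row or column total is zero.

```python
import math

def cramers_v(table):
    """Cramer's V for a contingency table given as a list of rows of counts.

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Chi-square statistic: sum of (O - E)^2 / E over all cells
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, c = len(table), len(table[0])
    return math.sqrt((chi2 / n) / min(r - 1, c - 1))

# Hypothetical tables, for illustration only
perfect = [[20, 0], [0, 30], [10, 0]]          # each row falls entirely in one column
independent = [[10, 20], [20, 40], [30, 60]]   # same proportions in every row

print(cramers_v(perfect))      # 1.0
print(cramers_v(independent))  # 0.0
```

The two tables show the extremes of the range: a table where the row fully determines the column gives 1, and a table with identical proportions in every row gives 0.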

Solving the Cramer’s V Puzzle with Detective Conan

Detective Conan. Image source: anitrendz

Detective Conan, a curious detective, decided to explore whether a suspect’s job influenced the outcome of a case. To do this, he gathered data from various episodes, focusing on two main aspects: the occupation of the suspects and whether they were found guilty or innocent.

He created a table where he listed the jobs down one side: artists, businesspeople, doctors, and teachers. Along the top, he marked two outcomes: ‘Guilty’ and ‘Innocent.’ For each job, he carefully tallied how many suspects were found guilty and how many were innocent.

Detective Conan. Image source: pinterest

With his table complete, our detective had a new challenge: figuring out if the job really mattered in the outcome of the cases.

This is where Cramer’s V, a magical formula to measure the strength of connection between two things, came into play. He knew this formula would tell him if the suspect’s job and the case outcome were just coincidentally aligned or if there was a real connection.

First, he needed a special number called the Chi-square statistic, a clue to how much the outcomes differed from what one would expect by chance.

He remembered the formula from the previous article:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Chi-square statistic formula. Image source: analyticsvidhya

O is the observed frequency and E is the expected frequency under the null hypothesis (no association between the variables).

The expected frequency for each cell in a contingency table is calculated as:

$$E = \frac{RT \times CT}{N}$$

Expected frequency formula. Image source: Dr. Walid Soula

  • RT is Row Total
  • CT is Column Total
  • N is the Grand total

1/ Expected Frequencies

For the ‘Artist’ and ‘Guilty’ cell: E = (15 × 40) / 70 ≈ 8.57

Note : We calculate E for each cell in the table.

2/ Chi-square Calculation

Now that we have the expected values, we plug them into the formula and sum the results over all cells to get the Chi-square statistic, which is approximately 7.05.
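The two steps above can be sketched in Python as follows. The helper names are made up for this example. The detective's full table is not reproduced in this article, so the 2x2 table below is hypothetical and will not give 7.05. Only the single ‘Artist’ and ‘Guilty’ cell uses the article's numbers.

```python
def expected_frequency(row_total, col_total, grand_total):
    """E = (RT x CT) / N for a single cell."""
    return row_total * col_total / grand_total

def expected_frequencies(table):
    """Expected count for every cell of a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[expected_frequency(rt, ct, n) for ct in col_totals]
            for rt in row_totals]

def chi_square(table):
    """Sum of (O - E)^2 / E over all cells."""
    expected = expected_frequencies(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# The 'Artist' and 'Guilty' cell from the article
print(round(expected_frequency(15, 40, 70), 2))  # 8.57

# Hypothetical 2x2 counts, NOT the detective's data
table = [[10, 20], [30, 40]]
print(expected_frequencies(table))   # [[12.0, 18.0], [28.0, 42.0]]
print(round(chi_square(table), 4))   # 0.7937
```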

The detective now had the Chi-square statistic and the total number of cases, which added up to 70.

Using the Cramer’s V formula the detective worked his way through the calculation.

  • The top part of the fraction was the Chi-square number (7.05) divided by the total cases (70).
  • The bottom part was the smaller of two numbers: the job types minus one (4 − 1 = 3) and the outcomes minus one (2 − 1 = 1). In this case, it was 1.
  • So V = √((7.05 / 70) / 1), and Cramer’s V is approximately 0.317.
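The detective's arithmetic can be checked with a few lines of Python. The variable names are only illustrative; the numbers are the ones quoted above.

```python
import math

chi2 = 7.05        # Chi-square statistic reported above
n = 70             # total number of cases
rows, cols = 4, 2  # four job types, two outcomes

# Cramer's V: sqrt((chi2 / n) / min(r - 1, c - 1))
v = math.sqrt((chi2 / n) / min(rows - 1, cols - 1))
print(round(v, 3))  # 0.317
```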

Detective Conan. Image source: detectiveconanworld

This indicates a moderate association between a suspect’s occupation and the case outcome. But rest assured, in the real world, your occupation won’t lead you into a “Detective Conan” episode!

It’s always fun to explore fictional scenarios with a bit of humor and creativity, especially when they intersect with the world of statistics and data analysis. I really think it makes the concepts easier to understand.


If you found this helpful, consider resharing and follow me, Dr. Oualid Soula, for more content like this.

Join the journey of discovery and stay ahead in the world of data science and AI! Don't miss out on the latest insights and updates: subscribe to the newsletter for free at https://lnkd.in/eNBG5dWm and become part of our growing community!
