When Correlation is Causation: Understanding the Exceptions to the Rule

In statistics and scientific investigation, there is a well-known saying: "correlation does not necessarily mean causation." The phrase is a reminder that two variables changing in tandem does not automatically imply that one directly causes the other. For example, there is a high correlation between ice cream sales and drowning: when more ice cream is sold, more people drown. In the opposite direction, as the pirate population decreases, the average annual temperature of the whole planet increases at a very consistent rate. Should we bring pirates back and ban ice cream to save people and prevent global warming? Of course not! These are just a few examples, and they are well explored in many articles and videos.
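
The ice cream and drowning example can be reproduced with a small simulation. This is only a sketch under invented assumptions: the temperature range, the coefficients, and the noise levels are made up for illustration, and temperature stands in for the hidden common cause. Neither outcome influences the other, yet they correlate strongly until temperature is held roughly fixed.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures in degrees Celsius (invented numbers).
temperature = [rng.uniform(0, 35) for _ in range(5000)]

# Both outcomes depend on temperature, never on each other.
ice_cream_sales = [20 * t + rng.gauss(0, 100) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 2) for t in temperature]

# Correlation over all days: driven entirely by the shared cause.
raw_corr = pearson(ice_cream_sales, drownings)

# Hold the common cause roughly fixed: keep only days between 20 and 22 degrees.
band = [(s, d) for t, s, d in zip(temperature, ice_cream_sales, drownings)
        if 20 <= t < 22]
band_corr = pearson([s for s, _ in band], [d for _, d in band])

print(f"all days:         r = {raw_corr:.2f}")
print(f"20-22 degree days: r = {band_corr:.2f}")
```

In this toy setup the overall correlation is high, while the correlation within the narrow temperature band is close to zero, which is what a confounded relationship looks like.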

However, there are situations where correlation can offer hints or strong indications of causation. Correlation often is not causation, but sometimes it is. This article explores some of these scenarios, providing insights into when and how correlation can provide a glimpse into causal relationships.

Understanding Correlation and Causation

Before diving into the exceptions, it's crucial to define what correlation and causation mean. Correlation refers to a statistical relationship between two variables, where changes in one variable are mirrored by changes in another. Causation, on the other hand, implies that one event is the result of the occurrence of the other event; one is the cause, the other is the effect. Image 1 shows the sales of cigarettes per adult per day in the United States. Compare it to image 2, which shows the lung cancer death rates in the United States by gender.

Image 1: Avg daily sales of cigarettes
Image 2: Lung cancer deaths per 100k people by gender


Notice how the lines have a very similar shape in almost the same years? That indicates a high correlation: as cigarette sales increase, the lung cancer death rate also increases in a very similar proportion. Does that mean smoking causes lung cancer? The answer is not that simple. The correlation indicates there is something there to investigate but determining causation is a more complex task. You need to consider other factors (variables) such as changes in pollution, urban population rate, changes in demographics, and more.

The Ladder of Causation

To understand when correlation might imply causation, it's helpful to refer to Judea Pearl's "ladder of causation," which consists of three levels: association (seeing), intervention (doing), and counterfactuals (imagining). At the association level, we find correlations. Moving up the ladder, if interventions in one variable consistently lead to changes in another, we edge closer to establishing causation. The higher you climb the ladder, the more evidence of true causation you get.
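
The gap between the first two rungs can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not Pearl's own formulation: the variable names, the hidden common cause `z`, and the coefficients are all hypothetical. It compares the slope seen in passively observed data with the effect measured when `x` is set by hand.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

def draw(x_forced=None):
    """Draw one (x, y) pair from a toy model; x_forced mimics an intervention."""
    z = rng.gauss(0, 1)  # hidden common cause of x and y
    x = (z + rng.gauss(0, 1)) if x_forced is None else x_forced
    y = 2.0 * x + 3.0 * z + rng.gauss(0, 1)  # true effect of x on y is 2.0
    return x, y

def mean(values):
    return sum(values) / len(values)

n = 20000

# Rung 1 (seeing): slope of y on x in passively observed data.
observed = [draw() for _ in range(n)]
mx = mean([x for x, _ in observed])
my = mean([y for _, y in observed])
seen_slope = (sum((x - mx) * (y - my) for x, y in observed)
              / sum((x - mx) ** 2 for x, _ in observed))

# Rung 2 (doing): set x ourselves and compare the average outcomes.
y_do_1 = mean([draw(x_forced=1.0)[1] for _ in range(n)])
y_do_0 = mean([draw(x_forced=0.0)[1] for _ in range(n)])
done_effect = y_do_1 - y_do_0

print(f"observed slope (seeing): {seen_slope:.2f}")
print(f"effect of setting x (doing): {done_effect:.2f}")
```

In this model the observed slope overstates the effect because `z` pushes `x` and `y` in the same direction, while the intervention recovers a value close to the true coefficient of 2.0.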

Context and Controlled Conditions

The context in which a correlation is observed plays a significant role. In controlled experimental settings, where researchers manipulate one variable and observe the effect on another, correlations can more reliably suggest causation. Randomized controlled trials (RCTs) are a prime example where correlation between the intervention and outcome can indicate a causal link.
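
Why randomization helps can be sketched with simulated data. Everything here is hypothetical: a hidden `health` factor, a treatment whose true effect is 1.0, and two ways of deciding who gets treated. When healthier people select themselves into treatment, the difference in group means is inflated; when a coin flip decides, it lands near the true effect.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable

def difference_in_means(assign, n=20000):
    """Simulate n people and return mean(treated outcome) - mean(control outcome).

    `assign` is a function of the hidden health factor that returns True
    when the person receives the treatment.
    """
    treated_y, control_y = [], []
    for _ in range(n):
        health = rng.gauss(0, 1)  # hidden factor that also affects the outcome
        treated = assign(health)
        # Hypothetical outcome model: true treatment effect is 1.0.
        y = 1.0 * treated + 2.0 * health + rng.gauss(0, 1)
        (treated_y if treated else control_y).append(y)
    return sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)

# Observational setting: healthier people are more likely to take the treatment.
self_selected = difference_in_means(lambda health: health + rng.gauss(0, 1) > 0)

# Randomized setting: a coin flip decides, independent of health.
randomized = difference_in_means(lambda health: rng.random() < 0.5)

print(f"self-selected difference: {self_selected:.2f}")
print(f"randomized difference:    {randomized:.2f}")
```

Random assignment breaks the link between the hidden factor and the treatment, which is why the correlation between intervention and outcome in an RCT can be read causally.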

Longitudinal Studies and Time Sequence

Longitudinal studies, which observe subjects over a long period, can also provide clues about causation. If a consistent correlation is observed in a specific sequence (where the cause precedes the effect), it strengthens the argument for causation.
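
One simple way to look at time sequence is a lagged correlation. The sketch below uses invented series in which `y` responds to `x` three steps later; the lag, coefficients, and noise are assumptions made for illustration. Finding the strongest correlation at a positive lag shows that `x` moves first, which supports a causal reading but does not prove it on its own.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def lagged_corr(x, y, lag):
    """Correlation between x[t] and y[t + lag]; lag may be negative."""
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    return pearson(a, b)

rng = random.Random(3)  # fixed seed so the run is repeatable
n = 2000
true_lag = 3  # hypothetical delay between cause and effect

x = [rng.gauss(0, 1) for _ in range(n)]
# y follows x with a delay of true_lag steps, plus independent noise.
y = [(0.8 * x[t - true_lag] if t >= true_lag else 0.0) + rng.gauss(0, 0.5)
     for t in range(n)]

corr_by_lag = {lag: lagged_corr(x, y, lag) for lag in range(-5, 6)}
best_lag = max(corr_by_lag, key=corr_by_lag.get)

print(f"strongest correlation at lag {best_lag}")
```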

Strength, Consistency, and Specificity

When correlations are strong, consistent across different studies, and specific to particular conditions or populations, the likelihood of causation increases. These principles, often used in epidemiology to infer causal relationships, help differentiate between mere coincidences and more meaningful connections.

Causal Inference Techniques

Advancements in statistical methods have improved our ability to infer causation from correlations. Techniques such as Granger causality tests (which check whether past values of one series help predict another), instrumental variables, and propensity score matching attempt to untangle complex relationships and hint at causal links.
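
As an illustration of one of these techniques, here is a minimal instrumental variables sketch on simulated data. All names and numbers are hypothetical: `z` is an instrument that affects `x` but influences `y` only through `x`, and `u` is a confounder that the analyst cannot observe. With a single instrument, the estimate reduces to a ratio of two covariances.

```python
import random

def cov(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)

rng = random.Random(4)  # fixed seed so the run is repeatable
n = 20000

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: moves x, unrelated to u
u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
# Hypothetical outcome model: true effect of x on y is 1.5.
y = [1.5 * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Naive regression slope of y on x: biased because u drives both x and y.
naive_slope = cov(x, y) / cov(x, x)

# Instrumental variables estimate with one instrument: cov(z, y) / cov(z, x).
iv_estimate = cov(z, y) / cov(z, x)

print(f"naive slope: {naive_slope:.2f}")
print(f"IV estimate: {iv_estimate:.2f}")
```

The estimate is only as good as the assumption that the instrument is unrelated to the confounder, and that assumption cannot be checked from the data alone.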

Real-World Examples Where Correlation Implied Causation

- Smoking and Lung Cancer: The strong, consistent, and specific correlation between smoking and lung cancer, backed by biological plausibility and animal studies, led to the conclusion that smoking causes lung cancer.

- Thalidomide and Birth Defects: The temporal correlation between the use of Thalidomide and the spike in birth defects was a key factor in establishing the drug’s teratogenic effects.

- High Blood Pressure and Heart Disease: The correlation between high blood pressure and increased risk of heart disease is supported by extensive clinical data, showing a direct causal relationship.

- Alcohol Consumption and Liver Cirrhosis: Repeated observations have shown that excessive alcohol consumption is causally linked to the development of liver cirrhosis.

- Air Pollution and Respiratory Diseases: Epidemiological studies have demonstrated a causal relationship between air pollution and an increase in respiratory diseases like asthma and bronchitis.

- Vaccine Administration and Disease Prevention: The administration of vaccines and the subsequent reduction in specific infectious diseases provide a clear causal relationship through clinical trials and population studies.

- Lead Exposure and Cognitive Impairment in Children: Studies have consistently shown that exposure to lead, especially in children, causes cognitive impairment and developmental delays.

- Sun Exposure and Skin Cancer: The correlation between ultraviolet (UV) radiation from sun exposure and the increased incidence of skin cancers like melanoma is supported by both epidemiological and biological evidence.

- Sedentary Lifestyle and Cardiovascular Health Risks: There is a well-established causal link between a sedentary lifestyle and increased risks of cardiovascular diseases, supported by numerous studies.

Conclusion

While the maxim "correlation does not imply causation" holds true in many instances, understanding the context, methodology, and underlying mechanisms can reveal scenarios where correlation does, in fact, suggest causation. These instances, though requiring careful analysis and often supplementary evidence, remind us of the nuanced and complex relationship between correlated variables.


