When Correlation is Causation: Understanding the Exceptions to the Rule
Igor Alcantara
Qlik MVP | Qlik Partner Ambassador | AI | Data Science | Analytics | Podcasting
In the field of statistics and scientific investigation, the maxim "correlation does not imply causation" is a familiar warning.
However, there are situations where correlation can offer hints or strong indications of causation. Correlation often is not causation, but sometimes it is. This article explores some of these scenarios, providing insights into when and how correlation can provide a glimpse into causal relationships.
Understanding Correlation and Causation
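Before diving into the exceptions, it's crucial to define what correlation and causation mean. Correlation refers to a statistical relationship between two variables: as one changes, the other tends to change with it. Causation means that a change in one variable actually produces a change in the other.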
Consider a chart of cigarette sales and lung cancer death rates over time. Notice how the lines have a very similar shape in almost the same years? That indicates a high correlation: as cigarette sales increase, the lung cancer death rate also increases in a very similar proportion. Does that mean smoking causes lung cancer? The answer is not that simple. The correlation indicates there is something to investigate, but determining causation is a more complex task. You need to consider other factors (variables) such as changes in pollution, urban population rate, changes in demographics, and more.
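To make "high correlation" concrete, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The function name pearson_r and the two short series are illustrative assumptions; the numbers are made up and are not real cigarette sales or mortality data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))   # unnormalized covariance
    var_x = sum(a * a for a in dx)             # unnormalized variances
    var_y = sum(b * b for b in dy)
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative series (NOT real data): two lines with a similar shape.
series_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
series_b = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

r = pearson_r(series_a, series_b)
print(f"r = {r:.3f}")  # close to 1: the two series move together
```

A value near 1 only says the two series move together. It says nothing about why, which is the question the rest of this article deals with.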
The Ladder of Causation
To understand when correlation might imply causation, it's helpful to refer to Judea Pearl's "ladder of causation," which consists of three levels: association (seeing), intervention (doing), and counterfactuals (imagining). At the association level, we find correlations. Moving up the ladder, if interventions in one variable consistently lead to changes in another, we edge closer to establishing causation. The higher you climb the ladder, the more evidence of true causation you get.
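The gap between the first two rungs can be sketched with a small simulation. This is a toy model under stated assumptions, not Pearl's own formalism: a hidden variable z drives both x and y, and x has no effect on y at all. The names simulate_r, observed_r, and intervened_r are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate_r(n, intervene, seed=0):
    """Correlation of x and y in a toy world where x never affects y."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                # hidden common cause (assumed)
        if intervene:
            x = rng.gauss(0, 1)            # "doing": we set x ourselves, ignoring z
        else:
            x = z + rng.gauss(0, 0.5)      # "seeing": x follows z
        y = z + rng.gauss(0, 0.5)          # y depends on z only, never on x
        xs.append(x)
        ys.append(y)
    return pearson_r(xs, ys)

observed_r = simulate_r(5000, intervene=False)   # rung 1: association
intervened_r = simulate_r(5000, intervene=True)  # rung 2: intervention
print(f"observed r   = {observed_r:.2f}")
print(f"intervened r = {intervened_r:.2f}")
```

In this toy world the observed correlation is strong, yet it vanishes once x is set independently of the hidden cause. When a correlation survives intervention, that is evidence on the second rung.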
Context and Controlled Conditions
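The context in which a correlation is observed plays a significant role. In controlled experimental settings, where other variables are held constant or randomized, an observed correlation is much more likely to reflect causation.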
Longitudinal Studies
Longitudinal studies, which observe subjects over a long period, can also provide clues about causation. If a consistent correlation is observed in a specific sequence (where the cause precedes the effect), it strengthens the argument for causation.
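One simple way to look at sequence is a lagged correlation: compare one series with a shifted copy of the other, in both directions. The sketch below uses synthetic data in which y is built from x two steps earlier; the helper lagged_r, the lag of 2, and the coefficients are all assumptions chosen for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def lagged_r(first, second, lag):
    """Correlation between first[t] and second[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson_r(first, second)
    return pearson_r(first[:-lag], second[lag:])

rng = random.Random(1)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
# Synthetic effect: y at time t follows x at time t - 2, plus noise.
y = [
    (0.9 * x[t - 2] if t >= 2 else 0.0) + rng.gauss(0, 0.3)
    for t in range(n)
]

x_leads_y = lagged_r(x, y, 2)   # x[t] against y[t + 2]
y_leads_x = lagged_r(y, x, 2)   # y[t] against x[t + 2]
print(f"x leads y: {x_leads_y:.2f}")
print(f"y leads x: {y_leads_x:.2f}")
```

Here the correlation is strong only when x comes first. Temporal order is necessary for causation but not sufficient, since a hidden common cause can also produce a lagged pattern.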
Strength, Consistency, and Specificity
When correlations are strong, consistent across different studies, and specific to particular conditions or populations, the likelihood of causation increases. These principles, often used in epidemiology to infer causal relationships, help differentiate between mere coincidences and more meaningful connections.
Advancements in statistical methods have improved our ability to infer causation from correlations. Techniques like Granger causality tests, instrumental variables, and propensity score matching attempt to untangle complex relationships and hint at causal links.
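As one illustration, here is a minimal sketch of the instrumental variable idea using the simple ratio (Wald) estimator on synthetic data. Everything in it is an assumption made for the example: the true effect of x on y is set to 2.0, u is an unobserved confounder, and z is an instrument that affects y only through x. Real analyses would normally use a dedicated statistics library and test these assumptions.

```python
import random

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

rng = random.Random(42)
n = 20000
TRUE_EFFECT = 2.0                     # assumed causal effect of x on y

z_vals, x_vals, y_vals = [], [], []
for _ in range(n):
    u = rng.gauss(0, 1)               # unobserved confounder
    z = rng.gauss(0, 1)               # instrument: independent of u
    x = z + u + rng.gauss(0, 0.5)     # x is driven by both z and u
    y = TRUE_EFFECT * x + 3.0 * u + rng.gauss(0, 0.5)
    z_vals.append(z)
    x_vals.append(x)
    y_vals.append(y)

# Naive regression slope of y on x: biased upward by the confounder u.
naive_slope = covariance(x_vals, y_vals) / covariance(x_vals, x_vals)

# IV (Wald) estimate: uses only the variation in x that comes from z.
iv_slope = covariance(z_vals, y_vals) / covariance(z_vals, x_vals)

print(f"naive slope = {naive_slope:.2f}")
print(f"IV slope    = {iv_slope:.2f}")
```

In this setup the naive slope overstates the effect because u pushes x and y in the same direction, while the instrument-based estimate lands close to the assumed value of 2.0.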
Real-World Examples Where Correlation Implied Causation
- Smoking and Lung Cancer: The strong, consistent, and specific correlation between smoking and lung cancer, backed by biological plausibility and animal studies, led to the conclusion that smoking causes lung cancer.
- Thalidomide and Birth Defects: The temporal correlation between the use of Thalidomide and the spike in birth defects was a key factor in establishing the drug’s teratogenic effects.
- High Blood Pressure and Heart Disease: The correlation between high blood pressure and increased risk of heart disease is supported by extensive clinical data, showing a direct causal relationship.
- Alcohol Consumption and Liver Cirrhosis: Repeated observations have shown that excessive alcohol consumption is causally linked to the development of liver cirrhosis.
- Air Pollution and Respiratory Diseases: Epidemiological studies have demonstrated a causal relationship between air pollution and an increase in respiratory diseases like asthma and bronchitis.
- Vaccine Administration and Disease Prevention: The administration of vaccines and the subsequent reduction in specific infectious diseases provide a clear causal relationship through clinical trials and population studies.
- Lead Exposure and Cognitive Impairment in Children: Studies have consistently shown that exposure to lead, especially in children, causes cognitive impairment and developmental delays.
- Sun Exposure and Skin Cancer: The correlation between ultraviolet (UV) radiation from sun exposure and the increased incidence of skin cancers like melanoma is supported by both epidemiological and biological evidence.
- Sedentary Lifestyle and Cardiovascular Health Risks: There is a well-established causal link between a sedentary lifestyle and increased risks of cardiovascular diseases, supported by numerous studies.
Conclusion
While the maxim "correlation does not imply causation" holds true in many instances, understanding the context, methodology, and underlying mechanisms can reveal scenarios where correlation does, in fact, suggest causation. These instances, though requiring careful analysis and often supplementary evidence, remind us of the nuanced and complex relationship between correlated variables.