Causal Inference
{{BloomIntro}}
Causal Inference is the process of drawing a conclusion about a causal connection based on the conditions of the occurrence of an effect. While traditional statistics is often summarized by the phrase "Correlation is not Causation," Causal Inference is the science of determining ''when'' and ''how'' we can conclude that one thing actually causes another. This field is essential for policy-making, medicine, and AI, as we need to know not just that two things happen together (e.g., ice cream sales and shark attacks), but whether changing one will change the other (e.g., if we ban ice cream, will shark attacks decrease?).
== Remembering ==
* '''Causal Inference''' — The branch of statistics concerned with identifying cause-and-effect relationships.
* '''Counterfactual''' — The "What if?" scenario; what would have happened if a different action had been taken.
* '''Confounder''' — A variable that influences both the cause and the effect, creating a "spurious" correlation (e.g., 'Heat' causes both ice cream sales and shark attacks).
* '''Randomized Controlled Trial (RCT)''' — The "Gold Standard" of causal inference, where participants are randomly assigned to groups to eliminate confounders.
* '''Observational Study''' — A study where the researcher does not control the assignment of treatment (common in economics and sociology).
* '''Selection Bias''' — When the people who choose a treatment are different from those who don't (e.g., people who take vitamins are already more health-conscious).
* '''Instrumental Variable (IV)''' — A variable that affects the treatment but has no direct effect on the outcome, used to "isolate" a causal effect in observational data.
* '''Propensity Score Matching''' — A technique that attempts to estimate the effect of a treatment by accounting for the covariates that predict receiving the treatment.
* '''Directed Acyclic Graph (DAG)''' — A visual map of causal relationships (nodes and arrows).
* '''Do-calculus''' — A mathematical framework developed by Judea Pearl for reasoning about interventions in a causal system.
* '''Average Treatment Effect (ATE)''' — The average difference in outcomes between the treated and untreated groups (see the sketch just after this list).
* '''Natural Experiment''' — An empirical study where individuals are exposed to the experimental and control conditions as determined by nature or other factors outside the control of the investigators (e.g., a change in law in one state but not another).
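The ATE is easiest to see with simulated data. Below is a minimal sketch, assuming an invented baseline, noise level, and a true effect of 2.0: because treatment is assigned by a coin flip (as in an RCT), a simple difference of group means recovers the true effect.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Simulated RCT: treatment is assigned by a fair coin flip,
# so there are no confounders by construction.
treated = rng.integers(0, 2, n)

# Outcome = baseline + (assumed) true treatment effect of 2.0 + noise.
outcome = 5.0 + 2.0 * treated + rng.normal(0, 1, n)

# ATE estimate: mean outcome of the treated minus mean outcome of the untreated.
ate_hat = outcome[treated == 1].mean() - outcome[treated == 0].mean()
print(f"Estimated ATE: {ate_hat:.2f} (true effect: 2.00)")
</syntaxhighlight>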
== Understanding ==
Causal inference is the quest for the '''Counterfactual'''.
'''The Fundamental Problem of Causal Inference''': You can never observe the same person in two different states at the same time. You either took the pill or you didn't. We can never ''know'' for sure what would have happened if you hadn't taken it. Therefore, we have to find clever ways to "simulate" the counterfactual.
'''Judea Pearl's Ladder of Causation''':
# '''Association''' (Seeing): "If I see X, how likely is Y?" (Standard Machine Learning).
# '''Intervention''' (Doing): "If I ''do'' X, what will happen to Y?" (Causal Inference).
# '''Counterfactuals''' (Imagining): "If I had done X instead of Z, what ''would have'' happened?" (The highest form of human/AI reasoning).
'''The Back-Door Criterion''': If you want to know whether X causes Y, you must "close the back door"—meaning you must control for all the variables that might be causing both X and Y. If you don't, your result will be "biased" by the '''Confounder'''.
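The back-door idea can be made concrete with a toy simulation (all coefficients below are invented). Z is a confounder that opens a back-door path X ← Z → Y: the naive comparison of treated vs. untreated units is badly biased, while stratifying on Z and averaging the within-stratum differences recovers the true effect.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Confounder Z opens a back-door path: X <- Z -> Y.
z = rng.binomial(1, 0.5, n)
x = rng.binomial(1, 0.2 + 0.6 * z)            # Z makes treatment much more likely
y = 1.0 * x + 3.0 * z + rng.normal(0, 1, n)   # the true effect of X on Y is 1.0

# "Seeing": the naive association is inflated by the confounder.
naive = y[x == 1].mean() - y[x == 0].mean()

# "Doing": back-door adjustment. Stratify on Z, take the within-stratum
# difference of means, and weight each stratum by P(Z = z).
adjusted = sum(
    (y[(x == 1) & (z == v)].mean() - y[(x == 0) & (z == v)].mean()) * (z == v).mean()
    for v in (0, 1)
)

print(f"Naive difference:  {naive:.2f}")     # well above 1.0
print(f"Adjusted estimate: {adjusted:.2f}")  # close to 1.0
</syntaxhighlight>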
== Applying ==
'''Simulating a Confounder (Spurious Correlation):'''
<syntaxhighlight lang="python">
import numpy as np
def simulate_spurious_correlation(n_samples):
    """
    Shows how 'Heat' causes both Ice Cream and Shark Attacks.
    Without knowing about 'Heat', we might think Ice Cream
    causes Shark Attacks.
    """
    # The Confounder (The true cause)
    heat = np.random.normal(25, 5, n_samples)

    # Effects
    ice_cream = 2 * heat + np.random.normal(0, 2, n_samples)
    sharks = 0.5 * heat + np.random.normal(0, 1, n_samples)

    # Correlation between Ice Cream and Sharks
    correlation = np.corrcoef(ice_cream, sharks)[0, 1]
    return correlation

print(f"Correlation between Ice Cream and Sharks: {simulate_spurious_correlation(1000):.3f}")
# This is a 'Spurious' correlation. Controlling for 'Heat'
# would bring this correlation to near zero.
</syntaxhighlight>
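To back up that closing comment, here is a follow-up sketch (re-simulating the same three variables with the same assumed coefficients) in which 'Heat' is regressed out of both series before correlating them; the leftover "partial" correlation is close to zero.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(42)
n = 1_000
heat = rng.normal(25, 5, n)
ice_cream = 2 * heat + rng.normal(0, 2, n)
sharks = 0.5 * heat + rng.normal(0, 1, n)

def residualize(y, x):
    """Remove the part of y that is linearly explained by x (plus an intercept)."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

raw = np.corrcoef(ice_cream, sharks)[0, 1]
partial = np.corrcoef(residualize(ice_cream, heat), residualize(sharks, heat))[0, 1]
print(f"Raw correlation:        {raw:.3f}")      # strongly positive
print(f"Controlling for 'Heat': {partial:.3f}")  # near zero
</syntaxhighlight>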
; Causal Tools in Action
: '''A/B Testing''' → Using RCTs to see if a specific website change "causes" more sales.
: '''Difference-in-Differences (Diff-in-Diff)''' → Comparing a "Treatment" group (e.g., a state that raised the minimum wage) to a "Control" group (a neighbor state that didn't); a numerical sketch follows this list.
: '''Regression Discontinuity''' → Comparing people just above and just below a cutoff (e.g., students who just barely passed an exam vs. those who just barely failed).
: '''Mediation Analysis''' → Exploring the "mechanism"—does X cause Y directly, or does X cause M, which then causes Y?
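The Diff-in-Diff logic reduces to simple arithmetic. The sketch below uses made-up employment figures for the minimum-wage example; the key assumption (not shown in code) is that both states would have followed "parallel trends" without the policy change.
<syntaxhighlight lang="python">
# Hypothetical average outcomes (say, employment rate in %); all four numbers are invented.
treated_before, treated_after = 60.0, 63.0   # state that raised the minimum wage
control_before, control_after = 58.0, 60.0   # neighbouring state that did not

treated_change = treated_after - treated_before   # change in the treated state: +3.0
control_change = control_after - control_before   # change we would expect anyway: +2.0

# Diff-in-Diff: subtract the control group's change from the treated group's change.
did_estimate = treated_change - control_change
print(f"Diff-in-Diff estimate: {did_estimate:+.1f} percentage points")
</syntaxhighlight>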
== Analyzing ==
{| class="wikitable"
|+ Correlation vs. Causation
! Feature !! Correlation !! Causation
|-
| Symmetry || Symmetric (If A correlates with B, B correlates with A) || Asymmetric (A causes B, but B doesn't necessarily cause A)
|-
| Prediction || Good for "What usually happens?" || Good for "What happens if I change things?"
|-
| Math || Covariance, Pearson's r || Do-calculus, DAGs, Structural Equations
|-
| Requirement || Observation || Intervention or clever identification
|}
'''Collider Bias''': This is a tricky trap. If you control for a variable that is caused by ''both'' your treatment and your outcome, you can accidentally create a fake correlation where none existed. For example, if you only study "Famous Actors," you might find that "Acting Talent" and "Physical Beauty" are negatively correlated—not because they are in real life, but because you need ''one'' of them to be famous in the first place.
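A quick simulation makes the actor example visible (the "talent" and "beauty" scores and the fame cutoff are invented). Talent and beauty are generated independently, yet once we condition on the collider, fame, a negative correlation appears.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

# Talent and beauty are independent in the full population.
talent = rng.normal(0, 1, n)
beauty = rng.normal(0, 1, n)

# Fame is a collider: it is caused by both talent and beauty
# (the selection rule below is an invented illustration).
famous = (talent + beauty) > 2.0

overall = np.corrcoef(talent, beauty)[0, 1]
among_famous = np.corrcoef(talent[famous], beauty[famous])[0, 1]
print(f"Correlation in the whole population: {overall:+.3f}")       # about zero
print(f"Correlation among 'famous' actors:   {among_famous:+.3f}")  # clearly negative
</syntaxhighlight>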
== Evaluating ==
Evaluating a causal claim:
# '''Exogeneity''': Was the treatment really assigned randomly (or "as-if" randomly)?
# '''SUTVA (Stable Unit Treatment Value Assumption)''': Does one person's treatment affect another person's outcome (spillover)? If so, the assumption is violated.
# '''Internal Validity''': Is the causal effect true for the group studied?
# '''External Validity (Transportability)''': Will this causal effect work in a different city or a different decade?
== Creating ==
Future Frontiers:
# '''Causal AI''': Moving beyond "Pattern Recognition" (Large Language Models) to "Causal Reasoning" (systems that can answer 'Why?' and 'What if?').
# '''Synthetic Controls''': Using AI to create a "perfect" simulated control group for situations where no real control exists.
# '''Causal Discovery''': Algorithms that can look at a dataset and "infer" the DAG (the map of arrows) automatically.
# '''Precision Policy''': Using causal models to predict which specific individual will benefit from a specific intervention (Heterogeneous Treatment Effects).
[[Category:Statistics]]
[[Category:Science]]
[[Category:Economics]]