{{BloomIntro}}

Causal inference is the study and application of methods for reasoning about cause-and-effect relationships, not merely statistical correlations. Traditional machine learning excels at finding patterns — "X correlates with Y" — but causal inference asks a deeper question: "Does X cause Y, and if I change X, what will happen to Y?" While traditional statistics is often summarized by the warning "correlation is not causation," causal inference is the science of determining ''when'' and ''how'' we can conclude that one thing actually causes another. This distinction is crucial for policy-making, medicine, fairness analysis, and AI: we need to know not just that two things happen together (e.g., ice cream sales and shark attacks), but whether changing one will change the other (if we ban ice cream, will shark attacks decrease?). The field bridges statistics, computer science, economics, and philosophy.


__TOC__

== Remembering ==
* '''Causation''' — A relationship where one event (the cause) brings about another event (the effect). Distinct from correlation.
* '''Correlation''' — A statistical association between two variables that does not imply causation ("correlation is not causation").
* '''Confounding variable''' — A hidden variable that influences both the apparent cause and effect, creating a spurious association (e.g., heat drives both ice cream sales and shark attacks).
* '''Intervention''' — Actively changing the value of a variable (rather than just observing it), denoted do(X=x) in do-calculus.
* '''Observational data''' — Data collected without intervening; correlations in observational data may not reflect causal relationships.
* '''Randomized Controlled Trial (RCT)''' — The gold standard for establishing causation: randomly assign units to treatment or control, then measure outcomes.
* '''Counterfactual''' — A hypothetical: "What would have happened if X had been different?" e.g., "Would this patient have survived if they had received the drug?"
* '''Potential outcomes framework''' — A formalization of causal inference using Y(1) (outcome if treated) and Y(0) (outcome if not treated) for each unit.
* '''Average Treatment Effect (ATE)''' — The average causal effect of a treatment across a population: E[Y(1) - Y(0)].
* '''DAG (Directed Acyclic Graph)''' — A graphical model where nodes are variables and directed edges represent causal relationships; no cycles.
* '''Backdoor criterion''' — A graphical criterion for identifying which variables to condition on to block spurious correlations (confounding paths) in a causal DAG.
* '''Do-calculus''' — A set of rules (developed by Judea Pearl) for computing the effect of interventions from observational data and a causal DAG.
* '''Instrumental variable (IV)''' — A variable that affects the treatment but has no direct effect on the outcome except through the treatment; used to estimate causal effects when confounding is present (see the sketch after this list).
* '''Causal discovery''' — Algorithms for inferring causal structure (the DAG) from observational data.
* '''Selection bias''' — Bias arising when the sample used for analysis is not representative of the population of interest (e.g., people who choose to take vitamins are already more health-conscious).
* '''Propensity score matching''' — A technique that estimates the effect of a treatment by matching treated and untreated units on the covariates that predict receiving the treatment.
* '''Natural experiment''' — An empirical study where exposure to the treatment and control conditions is determined by nature or by factors outside the investigators' control (e.g., a change in law in one state but not another).
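A minimal sketch of the instrumental-variable idea on synthetic data (all variable names are hypothetical; the simple Wald estimator cov(Z,Y)/cov(Z,T) stands in for full two-stage least squares):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(5)
n = 100_000

u = rng.normal(0, 1, n)                        # unobserved confounder
z = rng.normal(0, 1, n)                        # instrument: affects T, not Y directly
t = 0.8 * z + u + rng.normal(0, 1, n)          # treatment driven by Z and U
y = 1.0 * t + 2.0 * u + rng.normal(0, 1, n)    # true effect of T on Y is 1.0

naive = np.polyfit(t, y, 1)[0]                 # OLS slope, biased by U
iv = np.cov(z, y)[0, 1] / np.cov(z, t)[0, 1]   # Wald/IV estimator
print(f"Naive OLS: {naive:.2f}, IV estimate: {iv:.2f} (truth: 1.00)")
</syntaxhighlight>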


== Understanding ==
At its heart, causal inference is the quest for the counterfactual. Statistician Donald Rubin called this the '''Fundamental Problem of Causal Inference''': we can never observe both potential outcomes for the same unit at the same time. Either a patient received the drug (Y(1) observed, Y(0) unobserved) or they didn't (Y(0) observed, Y(1) unobserved). You either took the pill or you didn't; we can never ''know'' what would have happened to the same person under the alternative treatment. Causal inference is therefore the search for credible ways to estimate the counterfactual we cannot observe.
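A minimal simulation of the potential outcomes framework (hypothetical variable names; a constant treatment effect of 0.5 is assumed). Both Y(0) and Y(1) exist for every unit inside the simulation, but the analyst observes only one of them; randomization is what makes the simple group contrast an unbiased estimate of the ATE:
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

y0 = rng.normal(0, 1, n)            # potential outcome without treatment
y1 = y0 + 0.5                       # potential outcome with treatment (true effect 0.5)
true_ate = np.mean(y1 - y0)         # knowable only because this is a simulation

t = rng.integers(0, 2, n)           # randomized treatment assignment
y_obs = np.where(t == 1, y1, y0)    # only one potential outcome is ever observed

est_ate = y_obs[t == 1].mean() - y_obs[t == 0].mean()
print(f"True ATE: {true_ate:.3f}, randomized estimate: {est_ate:.3f}")
</syntaxhighlight>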


Judea Pearl's '''Ladder of Causation''' describes three levels of causal reasoning:
1. '''Association''' (rung 1, "seeing"): "If I see X, how likely is Y?" — Observing and predicting correlations. Standard ML lives here.
2. '''Intervention''' (rung 2, "doing"): "What will happen to Y if I ''do'' X?" — Reasoning about the effect of deliberate actions. Requires a causal model.
3. '''Counterfactual''' (rung 3, "imagining"): "What ''would have'' happened if I had done X instead?" — Imagining alternate histories. Requires a complete structural causal model.


Most ML systems operate only on rung 1. To make reliable decisions and avoid discrimination, AI systems often need rung 2 or 3.


'''Why this matters for AI''':
* '''Spurious correlations''': A risk model may learn that certain pneumonia patients (e.g., those routinely sent to the ICU) have lower observed mortality and wrongly score them as low-risk — confusing the effect of aggressive treatment with baseline risk.
* '''Fairness''': Is a model discriminating based on race, or is it using variables that are correlated with race but causally related to the outcome? Causal fairness criteria give precise answers.
* '''Policy decisions''': If we deploy an AI to recommend interventions, we must understand the causal effect of those interventions — not just their correlation with past outcomes.
* '''Robustness''': Models that learn causal relationships rather than spurious correlations generalize better when the environment changes (distribution shift).


'''The Back-Door Criterion''': to learn whether X causes Y from observational data, you must "close the back door" — control for all the variables that might be causing both X and Y. If you don't, your estimate will be biased by the confounder.
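A minimal sketch of backdoor adjustment on synthetic data (hypothetical binary confounder Z). The adjustment formula E[Y|do(T=t)] = Σ_z E[Y | T=t, Z=z] P(Z=z) averages the within-stratum contrasts over the distribution of the confounder:
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

z = rng.integers(0, 2, n)                         # binary confounder
t = rng.binomial(1, np.where(z == 1, 0.8, 0.2))   # Z pushes units into treatment
y = 1.0 * t + 2.0 * z + rng.normal(0, 1, n)       # true effect of T on Y is 1.0

naive = y[t == 1].mean() - y[t == 0].mean()       # confounded comparison

# Backdoor adjustment: average within-stratum contrasts, weighted by P(Z=z)
adjusted = sum(
    (y[(t == 1) & (z == v)].mean() - y[(t == 0) & (z == v)].mean()) * (z == v).mean()
    for v in (0, 1)
)
print(f"Naive: {naive:.2f}, backdoor-adjusted: {adjusted:.2f} (truth: 1.00)")
</syntaxhighlight>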

== Applying ==
'''Simulating a confounder (spurious correlation):'''


<syntaxhighlight lang="python">
import numpy as np

def simulate_spurious_correlation(n_samples):
    """
    Shows how 'Heat' causes both ice cream sales and shark attacks.
    Without knowing about 'Heat', we might think ice cream
    causes shark attacks.
    """
    np.random.seed(42)

    # The confounder (the true common cause)
    heat = np.random.normal(25, 5, n_samples)

    # Effects of the confounder
    ice_cream = 2 * heat + np.random.normal(0, 2, n_samples)
    sharks = 0.5 * heat + np.random.normal(0, 1, n_samples)

    # Correlation between ice cream sales and shark attacks
    return np.corrcoef(ice_cream, sharks)[0, 1]

print(f"Correlation between ice cream and sharks: {simulate_spurious_correlation(1000):.3f}")
# This is a spurious correlation: controlling for 'heat'
# would bring it to near zero.
</syntaxhighlight>

'''Estimating causal treatment effects with DoWhy:'''
<syntaxhighlight lang="python">
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Generate synthetic data: drug → recovery, age → both drug and recovery (confounder)
np.random.seed(42)
n = 1000
age = np.random.normal(50, 15, n)
drug = (0.3 * age + np.random.normal(0, 10, n) > 30).astype(int)  # older → more likely to receive drug
recovery = 0.5 * drug - 0.02 * age + np.random.normal(0, 1, n)    # drug helps, but age hurts

df = pd.DataFrame({'age': age, 'drug': drug, 'recovery': recovery})

# Naive comparison ignoring confounding:
naive = df[df.drug == 1]['recovery'].mean() - df[df.drug == 0]['recovery'].mean()
print(f"Naive (biased) difference in means: {naive:.3f}")
# Biased because older people receive the drug more often but recover less

# Step 1: Define the causal model as a DAG
model = CausalModel(
    data=df,
    treatment="drug",
    outcome="recovery",
    common_causes=["age"],  # age is a confounder
)

# Step 2: Identify the causal effect
identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)
print(identified_estimand)

# Step 3: Estimate the causal effect (controlling for age via backdoor adjustment)
estimate = model.estimate_effect(
    identified_estimand,
    method_name="backdoor.linear_regression",
    control_value=0,
    treatment_value=1,
)
print(f"Estimated ATE: {estimate.value:.3f}")
# Should recover ~0.5 (the true causal effect)

# Step 4: Refute the estimate (robustness checks)
refutation = model.refute_estimate(identified_estimand, estimate,
                                   method_name="random_common_cause")
print(refutation)  # A good estimate barely moves when a random confounder is added
</syntaxhighlight>


; Causal inference methods by scenario
: '''RCT available (A/B testing)''' → Compute the difference in means — randomization handles confounding (e.g., testing whether a website change "causes" more sales)
: '''Observational, confounders known''' → Propensity score matching, inverse probability weighting (IPW), doubly-robust estimators
: '''Observational, confounders unknown''' → Instrumental variables (IV); regression discontinuity, comparing units just above and just below a treatment cutoff (e.g., students who barely passed vs. barely failed an exam)
: '''Natural experiment with panel data''' → Difference-in-differences: compare a treated group (e.g., a state that raised the minimum wage) to a control group (a neighboring state that didn't), before and after the change
: '''Time series, sequential treatments''' → G-computation, marginal structural models
: '''Mechanism questions''' → Mediation analysis: does X cause Y directly, or does X cause M, which then causes Y?
: '''Causal structure unknown''' → Causal discovery: PC algorithm, FCI, LiNGAM, NOTEARS
: '''Heterogeneous effects''' → Causal forests, meta-learners (S-, T-, X-learner)
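A minimal difference-in-differences sketch matching the scenario above (hypothetical panel data; both groups share a common time trend, so the parallel-trends assumption holds by construction):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Hypothetical panel: the groups start at different levels, share a common
# time trend (+1.0), and the policy adds +0.5 for the treated group only.
base_treated, base_control, trend, effect = 10.0, 8.0, 1.0, 0.5

treated_pre   = base_treated + rng.normal(0, 1, n)
treated_post  = base_treated + trend + effect + rng.normal(0, 1, n)
control_pre   = base_control + rng.normal(0, 1, n)
control_post  = base_control + trend + rng.normal(0, 1, n)

did = (treated_post.mean() - treated_pre.mean()) \
    - (control_post.mean() - control_pre.mean())
print(f"Difference-in-differences estimate: {did:.3f} (truth: {effect})")
</syntaxhighlight>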


== Analyzing ==
{| class="wikitable"
|+ Correlation vs. Causation
! Feature !! Correlation !! Causation
|-
| Symmetry || Symmetric (if A correlates with B, B correlates with A) || Asymmetric (A causes B, but B doesn't necessarily cause A)
|-
| Prediction || Good for "What usually happens?" || Good for "What happens if I change things?"
|-
| Mathematics || Covariance, Pearson's r || Do-calculus, DAGs, structural equations
|-
| Requirement || Observation || Intervention or clever identification
|}

{| class="wikitable"
|+ Causal Inference Methods Comparison
! Method !! Assumptions !! When to Use !! Library
|-
| Regression adjustment || No unmeasured confounding, correct functional form || Known confounders, sufficient data || statsmodels, DoWhy
|-
| Propensity score matching || No unmeasured confounding || Binary treatment, observational data || DoWhy, CausalML
|-
| Instrumental variables || Valid instrument exists || Hidden confounders, instrument available || DoWhy, linearmodels
|-
| Difference-in-differences || Parallel trends assumption || Panel data, natural experiment || CausalPy, statsmodels
|-
| Causal forest || No unmeasured confounding || Heterogeneous treatment effects || EconML, grf (R)
|-
| Regression discontinuity || Local continuity at threshold || Sharp threshold in treatment assignment || rdd (R), DoWhy
|}


'''Key pitfalls and failure modes:'''
* '''Conditioning on colliders''' — Conditioning on a variable that is a common ''effect'' (not cause) of treatment and outcome opens spurious paths rather than blocking them. For example, if you only study famous actors, acting talent and physical beauty may appear negatively correlated — not because they are in the general population, but because you need at least one of them to become famous (simulated in the sketch after this list). Using a DAG is essential to identify what to condition on.
* '''Positivity violation''' — If some subgroups never receive (or always receive) treatment, causal effects for those subgroups cannot be estimated from data. Check overlap in propensity score distributions.
* '''Model misspecification''' — Parametric methods (regression adjustment) assume a specific functional form. Use doubly-robust or non-parametric methods (causal forests) to reduce this risk.
* '''Weak instruments''' — IV estimation with a weak instrument (low correlation with treatment) produces large, unreliable estimates. Test for instrument strength (F-statistic > 10 rule of thumb).
* '''Extrapolation beyond support''' — Causal effect estimates are only reliable within the range of the observed data. Be cautious about extrapolating to new populations or intervention levels.
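A minimal simulation of the collider trap described above (hypothetical talent, beauty, and fame variables; fame is a common effect of both). Conditioning on the collider manufactures a negative correlation between two independent traits:
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

talent = rng.normal(0, 1, n)
beauty = rng.normal(0, 1, n)                   # independent of talent by construction
famous = (talent + beauty) > 1.5               # fame is a collider: caused by both

r_all = np.corrcoef(talent, beauty)[0, 1]
r_famous = np.corrcoef(talent[famous], beauty[famous])[0, 1]
print(f"Correlation overall: {r_all:.3f}")             # ≈ 0
print(f"Correlation among the famous: {r_famous:.3f}") # clearly negative
</syntaxhighlight>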
 
== Evaluating ==
Causal inference evaluation is uniquely challenging because we can never observe the true counterfactual.

'''Evaluating a causal claim''':
# '''Exogeneity''': Was the treatment really assigned randomly (or "as-if" randomly)?
# '''SUTVA''' (Stable Unit Treatment Value Assumption): Does one person's treatment affect another person's outcome (spillover)?
# '''Internal validity''': Is the causal effect true for the group studied?
# '''External validity (transportability)''': Will this causal effect hold in a different city or a different decade?

'''Simulation studies (synthetic data)''': Generate data from a known causal model where the true ATE is known, then check whether each estimator recovers it. This is the standard way to compare methods.

'''Semi-synthetic benchmarks''': Use real covariate data but simulate outcomes from a known causal model. The ACIC (Atlantic Causal Inference Conference) benchmarks provide standardized challenges.

'''Sensitivity analysis''': Test how robust the estimate is to violations of key assumptions (e.g., unmeasured confounding). E-values quantify the minimum strength of unmeasured confounding that would overturn the result.

'''Refutation tests (DoWhy)''': Specific tests designed to detect estimation problems:
* Adding a random confounder shouldn't change a good estimate
* Replacing the treatment with a random variable should produce ATE ≈ 0
* A placebo treatment (observed but causally irrelevant) should produce ATE ≈ 0
 
Expert practitioners present causal estimates with explicit assumption documentation, sensitivity analyses, and refutation test results — not just a point estimate. An ATE that fails refutation tests or is sensitive to unmeasured confounding should be reported with appropriate uncertainty.
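A sketch of running two further refutation tests with DoWhy, assuming the model, identified_estimand, and estimate objects from the Applying example above (the method names are DoWhy's built-in refuters):
<syntaxhighlight lang="python">
# Placebo test: replace the real treatment with a random one;
# the re-estimated effect should be close to zero.
placebo = model.refute_estimate(
    identified_estimand, estimate,
    method_name="placebo_treatment_refuter",
)
print(placebo)

# Subset test: re-estimate on random subsets of the data;
# a sound estimate should remain stable.
subset = model.refute_estimate(
    identified_estimand, estimate,
    method_name="data_subset_refuter",
)
print(subset)
</syntaxhighlight>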
 
== Creating ==
Designing a causal inference analysis pipeline for business decision-making:
 
'''1. Problem formulation'''
<syntaxhighlight lang="text">
Define: What is the treatment? What is the outcome? What is the population?
    ↓
Draw the causal DAG: variables, causal paths, potential confounders
    ↓
Apply backdoor criterion: which variables block all confounding paths?
    ↓
Check identifiability: can the causal effect be estimated from available data?
    ↓
Assess data availability: are the required conditioning variables measured?
</syntaxhighlight>
 
'''2. Estimation pipeline'''
<syntaxhighlight lang="text">
Data collection: ensure confounders are measured, check positivity
    ↓
Covariate balance check: compare treatment/control distributions
    ↓
Propensity score modeling (if observational): logistic regression or GBM
    ↓
Effect estimation: doubly-robust AIPW or causal forest for heterogeneous effects
    ↓
Sensitivity analysis: E-value, Rosenbaum bounds
    ↓
Refutation tests: DoWhy refute suite
    ↓
Report: ATE ± CI, assumptions, sensitivity, subgroup effects
</syntaxhighlight>
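A minimal sketch of the propensity-score and IPW steps from the pipeline above, on synthetic data (hypothetical confounders; a plain logistic propensity model and a Horvitz-Thompson-style IPW estimator stand in for the doubly-robust AIPW step):
<syntaxhighlight lang="python">
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n = 20_000

x = rng.normal(0, 1, (n, 3))                              # measured confounders
p_true = 1 / (1 + np.exp(-(x @ np.array([1.0, -0.5, 0.25]))))
t = rng.binomial(1, p_true)                               # confounded treatment
y = 0.5 * t + x @ np.array([1.0, 1.0, -1.0]) + rng.normal(0, 1, n)

# Propensity score model, clipped to guard against positivity violations
e = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]
e = np.clip(e, 0.01, 0.99)

# Inverse probability weighting estimate of the ATE
ate_ipw = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))
print(f"IPW ATE: {ate_ipw:.2f} (truth: 0.50)")
</syntaxhighlight>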


'''3. Causal ML in production (uplift modeling)'''
* Estimate heterogeneous treatment effects: which users benefit most from the intervention?
* Use an X-learner or causal forest to estimate the CATE (Conditional Average Treatment Effect) per user (a minimal sketch follows this list)
* Target the intervention to users with the highest predicted CATE
* Monitor actual vs. predicted uplift in A/B tests post-deployment
* Continuously retrain the causal model as new experimental data arrives
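A minimal T-learner sketch for per-user CATE estimation on synthetic data (hypothetical features and outcomes; gradient boosting stands in for whichever base learner you prefer, and a production system would more likely use a library such as EconML or CausalML):
<syntaxhighlight lang="python">
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(4)
n = 5_000

x = rng.normal(0, 1, (n, 2))             # user features
t = rng.integers(0, 2, n)                # randomized treatment (e.g., from an A/B test)
tau = np.maximum(x[:, 0], 0)             # true effect: only some users benefit
y = x[:, 1] + tau * t + rng.normal(0, 0.5, n)

# T-learner: fit separate outcome models for treated and control units
m1 = GradientBoostingRegressor().fit(x[t == 1], y[t == 1])
m0 = GradientBoostingRegressor().fit(x[t == 0], y[t == 0])

cate = m1.predict(x) - m0.predict(x)     # per-user predicted uplift
print(f"Mean predicted CATE: {cate.mean():.2f} (true ATE: {tau.mean():.2f})")

# Target the intervention at the users with the highest predicted uplift
top_users = np.argsort(cate)[-100:]
</syntaxhighlight>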

'''Future frontiers''':
# '''Causal AI''': Moving beyond pattern recognition (large language models) to causal reasoning (systems that can answer "Why?" and "What if?").
# '''Synthetic controls''': Constructing a simulated control group for situations where no real control group exists.
# '''Causal discovery''': Algorithms that infer the DAG (the map of causal arrows) from a dataset automatically.
# '''Precision policy''': Using causal models to predict which specific individual will benefit from a specific intervention (heterogeneous treatment effects).


[[Category:Artificial Intelligence]]
[[Category:Machine Learning]]
[[Category:Causal Inference]]
[[Category:Statistics]]
[[Category:Science]]
[[Category:Economics]]
