External Validity: From Do-Calculus to Transportability Across Populations
Abstract
- External Validity
- Transportability
- Introducing selection diagrams
INTRODUCTION: THREATS VS. ASSUMPTIONS
- Why generalization?
- Arbitrary or drastically different environments
- Sufficiently similar environments
- Prior methods:
- Meta-analysis
- Hierarchical models
- Rarely make an explicit distinction between experimental and observational regimes
- This paper:
- Limits on what can be achieved in practice
- Problems that are likely to be encountered when populations differ significantly
- What population differences can be circumvented
- What differences constitute theoretical impediments
- Standard literature:
- Studying threats over licensing assumptions. Why?
- Threats are safer to cite: naming a threat carries little risk of appearing to endorse an assumption
- Assumptions are self-destructive in their honesty.
- Threats can be communicated in plain English
- Create licenses to transport using:
- Causal diagrams
- Models of interventions
- Counterfactuals
- Using do-calculus to:
- Test the feasibility of transport
- Estimate causal effects in the target population
PRELIMINARIES: THE LOGICAL FOUNDATIONS OF CAUSAL INFERENCE
- (Nonparametric) Structural Equation Models (SEM)

Causal Models as Inference Engines
- Causal assumptions
- An inference engine

Assumptions in Nonparametric Models (SEM)
- A set U of background or exogenous variables, representing factors outside the model.
- A set V = {V1, . . . , Vn} of endogenous variables, assumed to be observable.
- A set F of functions {f1,...,fn} such that each fi determines the value of Vi ∈ V.
- A joint probability distribution P(u) over U.
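The four ingredients above (U, V, F, P(u)) can be sketched in code. This is a minimal illustration on a hypothetical three-variable chain Z -> X -> Y; the variable names, the mechanisms and the probabilities are all assumptions made for the example and are not taken from the slides.

```python
import random

# Hypothetical model Z -> X -> Y (illustrative assumption, not from the slides).

def sample_u(rng):
    """Draw the exogenous variables U from P(u); here they are independent uniforms."""
    return {"u_z": rng.random(), "u_x": rng.random(), "u_y": rng.random()}

# F: one function f_i per endogenous variable V_i, listed in causal order.
# Each f_i sees the endogenous values computed so far (v) and the background u.
F = {
    "Z": lambda v, u: int(u["u_z"] < 0.5),
    "X": lambda v, u: int(u["u_x"] < (0.8 if v["Z"] else 0.2)),
    "Y": lambda v, u: int(u["u_y"] < (0.7 if v["X"] else 0.1)),
}

def solve(functions, u):
    """Evaluate each f_i in order to obtain the endogenous values V for a given u."""
    v = {}
    for name, f in functions.items():
        v[name] = f(v, u)
    return v

def sample(functions, n, seed=0):
    """Draw n joint samples of V by sampling U and solving the equations."""
    rng = random.Random(seed)
    return [solve(functions, sample_u(rng)) for _ in range(n)]

data = sample(F, 10_000)
p_y = sum(row["Y"] for row in data) / len(data)
print(f"Estimated P(Y=1): {p_y:.3f}")
```

Under these assumed mechanisms the distribution over V is fully induced by P(u) and F, which is the sense in which the model acts as an inference engine.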


Representing Interventions, Counterfactuals and Causal Effects
- Interventions through a mathematical operator called do(x)
- For example, applying do(x0) to the previous model replaces the equation for X with the constant x0
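A minimal sketch of the do operator, assuming a hypothetical binary model with Z -> X, Z -> Y and X -> Y (not the model shown on the slides). The helper names `do`, `distribution` and `prob_y1` are illustrative. The intervention swaps the function for X with a constant and leaves every other equation untouched; probabilities are computed exactly by enumerating U.

```python
from itertools import product

# P(u_i = 1) for independent binary background variables (assumed values).
P_U = {"u_z": 0.5, "u_x": 0.2, "u_y": 0.1}

def f_z(v, u): return u["u_z"]
def f_x(v, u): return v["Z"] ^ u["u_x"]            # X follows Z, flipped with prob. 0.2
def f_y(v, u): return (v["X"] | v["Z"]) ^ u["u_y"]  # Y depends on both X and Z

MODEL = {"Z": f_z, "X": f_x, "Y": f_y}

def do(model, **fixed):
    """Return the submodel in which each named variable is set to a constant."""
    new = dict(model)
    for name, value in fixed.items():
        new[name] = lambda v, u, value=value: value
    return new

def distribution(model):
    """Enumerate U and return the exact joint over V as {(z, x, y): probability}."""
    joint = {}
    for bits in product([0, 1], repeat=len(P_U)):
        u = dict(zip(P_U, bits))
        weight = 1.0
        for name, bit in u.items():
            weight *= P_U[name] if bit else 1 - P_U[name]
        v = {}
        for name, f in model.items():
            v[name] = f(v, u)
        key = (v["Z"], v["X"], v["Y"])
        joint[key] = joint.get(key, 0.0) + weight
    return joint

def prob_y1(joint, x):
    """P(Y=1 | X=x) computed from a joint over (z, x, y)."""
    num = sum(p for (z, xv, y), p in joint.items() if xv == x and y == 1)
    den = sum(p for (z, xv, y), p in joint.items() if xv == x)
    return num / den

observed = prob_y1(distribution(MODEL), 0)              # P(Y=1 | X=0)
intervened = prob_y1(distribution(do(MODEL, X=0)), 0)   # P(Y=1 | do(X=0))
print(observed, intervened)
```

In this assumed model the two quantities differ because Z influences both X and Y, so conditioning on X=0 changes the distribution of Z while setting X=0 does not.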



Identification, d-Separation and Causal Calculus
- Identification in linear parametric settings
- Identification in nonparametric formulation
- Identifiability:
- A causal query Q(M) is identifiable, given a set of assumptions A, if for any two (fully specified) models M1 and M2 that satisfy A, we have: P(M1) = P(M2) ⇒ Q(M1) = Q(M2)
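A small counterexample can make the definition concrete. The sketch below assumes that A only says X may affect Y and that an unobserved common cause U is allowed; the two models M1 and M2 are hypothetical. They induce the same observational distribution over (X, Y) but disagree on the query P(Y=1 | do(X=1)), so that query is not identifiable under this A.

```python
P_U1 = 0.5  # P(U = 1), shared by both hypothetical models

# M1: X and Y both copy the hidden U; X has no effect on Y.
M1 = {"X": lambda v, u: u, "Y": lambda v, u: u}
# M2: X copies U and Y copies X; the dependence is entirely causal.
M2 = {"X": lambda v, u: u, "Y": lambda v, u: v["X"]}

def joint(model, do_x=None):
    """Exact distribution over (x, y); if do_x is given, X is forced to that value."""
    dist = {}
    for u, w in ((0, 1 - P_U1), (1, P_U1)):
        v = {}
        for name, f in model.items():
            v[name] = do_x if (name == "X" and do_x is not None) else f(v, u)
        key = (v["X"], v["Y"])
        dist[key] = dist.get(key, 0.0) + w
    return dist

def p_y1(dist):
    """Marginal probability that Y = 1."""
    return sum(p for (x, y), p in dist.items() if y == 1)

same_observational = joint(M1) == joint(M2)  # do the models agree on P(x, y)?
q1 = p_y1(joint(M1, do_x=1))                 # P(Y=1 | do(X=1)) in M1
q2 = p_y1(joint(M2, do_x=1))                 # P(Y=1 | do(X=1)) in M2
print(same_observational, q1, q2)
```

If A additionally ruled out the hidden common cause, M1 would no longer satisfy A and the counterexample would disappear, which illustrates how identifiability depends on the assumptions.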

The Rules of do-Calculus
- G_{\bar{X}}: the graph obtained by deleting from G all arrows pointing to nodes in X
- G_{\underline{X}}: the graph obtained by deleting from G all arrows emerging from nodes in X
- Z(W): the set of Z-nodes that are not ancestors of any W-node in G_{\bar{X}}
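For reference, the three rules in their standard form (Pearl), stated for disjoint sets of nodes X, Y, Z, W in a causal diagram G. A bar over a set means its incoming arrows are deleted, an underline means its outgoing arrows are deleted, and Z(W) is the set of Z-nodes that are not ancestors of any W-node in G_{\bar{X}}.

```latex
% Rule 1: insertion/deletion of observations
P(y \mid do(x), z, w) = P(y \mid do(x), w)
  \quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\bar{X}}

% Rule 2: action/observation exchange
P(y \mid do(x), do(z), w) = P(y \mid do(x), z, w)
  \quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\bar{X}\underline{Z}}

% Rule 3: insertion/deletion of actions
P(y \mid do(x), do(z), w) = P(y \mid do(x), w)
  \quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\bar{X}\,\overline{Z(W)}}
```

Each rule licenses a rewrite of an interventional expression whenever the stated d-separation condition holds in the corresponding mutilated graph.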
INFERENCE ACROSS POPULATIONS: MOTIVATING EXAMPLES
Example 1


Example 2


Example 3

External Validity - From Do Calculus to Transportability
By Amin Mohamadi