Inference of Nonlinear Causal Effects with Application to TWAS with GWAS Summary Data

(Joint work with C. Li, H. Xue, X. Shen and W. Pan)
Ben Dai (CUHK)
Frontiers of Data Science, HZ (2024)

Causal diagram for IV
Source: Howell et al. (2018)
Goal. Infer the causal effect of the exposure on the outcome.
Issues. Simple regression?
- yields a biased estimator of β (unobserved confounders)
- The promise of instrumental variables (IVs):
  - unbiased estimation of the causal effect is possible without explicitly enumerating all confounders.
- Random allocation of alleles suggests that SNPs can serve as IVs for gene testing
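The bias of simple regression and the IV fix can be illustrated with a small simulation. This is a minimal sketch under assumed settings (a single instrument, Gaussian noise, a hidden confounder `u`); the function names and parameter values are hypothetical and not taken from the talk.

```python
import numpy as np

def simulate_iv(n=20000, beta=0.5, seed=0):
    """Simulate x = z + w, y = beta * x + eps, where w and eps share a hidden confounder u."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)            # instrument (e.g., a SNP-based score)
    u = rng.normal(size=n)            # unobserved confounder
    w = u + rng.normal(size=n)        # exposure noise, contains u
    eps = u + rng.normal(size=n)      # outcome noise, contains u
    x = z + w
    y = beta * x + eps
    return z, x, y

def ols_slope(x, y):
    """Slope from regressing y on x (ignores confounding)."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))

def iv_slope(z, x, y):
    """Ratio (Wald) IV estimator: cov(z, y) / cov(z, x)."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (x - x.mean())))

z, x, y = simulate_iv()
beta_ols = ols_slope(x, y)   # biased upward by the shared confounder
beta_iv = iv_slope(z, x, y)  # close to the true beta = 0.5
```

In this toy setup the regression slope converges to beta + cov(w, eps) / var(x) = 0.5 + 1/3, while the IV ratio targets beta itself because z is independent of both noise terms.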

TWAS data types
TWAS accepts various forms of input data types:
Individual-level gene expression data + GWAS
- SNPs -> Gene expression
  - Controlled access to individual-level data, e.g., the GTEx dataset
  - The sample size is much smaller than that of GWAS
- SNPs -> Outcome
  - GWAS boasts a large sample size, e.g., ukb-b (~400K)
Recall: 2SLS (with invalid IVs)
$x = z^\top\theta + w, \quad y = \beta x + z^\top\alpha + \varepsilon. \quad (1)$
- $\beta \in \mathbb{R}$, $\alpha \in \mathbb{R}^p$, $\theta \in \mathbb{R}^p$ are unknown parameters
- (w, ε) are correlated (confounder), and (w, ε) ⊥ z (IVs)
- α ≠ 0 indicates the violation of IV assumptions
Goal: estimation and statistical inference on β
- application: potential causal genes for AD

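Recall: 2SLS
$x = z^\top\theta + w, \quad y = \beta x + z^\top\alpha + \varepsilon. \quad (1)$
2SLS solves θ and (β, α) separately, based on two independent datasets:
$D_1 = (Z_1, x_1)$ with sample size $n_1$, and $D_2 = (Z_2^\top Z_2, Z_2^\top y_2)$ with sample size $n_2$.
S1 (Stage 1). $\hat\theta = (Z_1^\top Z_1)^{-1} Z_1^\top x_1$, and impute $\hat x = z^\top\hat\theta$.
Obs. By plugging Stage 1 into Stage 2, we obtain
$y = z^\top\theta\beta + z^\top\alpha + e, \quad e = w\beta + \varepsilon, \quad E(e) = 0, \quad E(e^2) = \sigma_e^2$
(now, z is uncorrelated with e).
S2 (Stage 2). $\min_{\beta,\alpha} \; (\hat\theta\beta + \alpha)^\top Z_2^\top Z_2 (\hat\theta\beta + \alpha) - 2 y_2^\top Z_2 (\hat\theta\beta + \alpha), \quad \text{subject to } \|\alpha\|_0 \le K$
- The $\|\cdot\|_0$ constraint can be replaced by a SCAD or MCP penalty
- 2SLS only requires summary statistics: $(Z_1^\top Z_1, Z_1^\top x_1)$ and $(Z_2^\top Z_2, Z_2^\top y_2)$
- Identifiability conditions: majority / plurality rules
Ref. Haavelmo (1943), Theil (1953), Kang et al. (2016) and Guo et al. (2018)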
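The two stages above can be sketched from summary statistics alone. This is a simplified illustration that assumes all IVs are valid (α = 0), so the sparse step is dropped and Stage 2 has a closed form; the function name `two_sample_2sls` and the simulated data are hypothetical.

```python
import numpy as np

def two_sample_2sls(ZtZ1, Ztx1, ZtZ2, Zty2):
    """Two-sample 2SLS from summary statistics, assuming alpha = 0 (all IVs valid)."""
    theta_hat = np.linalg.solve(ZtZ1, Ztx1)     # Stage 1: (Z1'Z1)^{-1} Z1'x1
    num = theta_hat @ Zty2                      # theta_hat' Z2'y2
    den = theta_hat @ ZtZ2 @ theta_hat          # theta_hat' Z2'Z2 theta_hat
    return theta_hat, float(num / den)          # minimizer of the Stage 2 objective in beta

# Toy data: two independent samples sharing the same structural model.
rng = np.random.default_rng(1)
p, n1, n2, beta = 5, 2000, 50000, 0.3
theta = np.array([1.0, -0.5, 0.5, 0.8, -1.0])

def draw(n):
    Z = rng.normal(size=(n, p))
    u = rng.normal(size=n)                      # hidden confounder
    x = Z @ theta + u + rng.normal(size=n)
    y = beta * x + u + rng.normal(size=n)
    return Z, x, y

Z1, x1, _ = draw(n1)                            # expression sample: only (Z1, x1) used
Z2, _, y2 = draw(n2)                            # GWAS sample: only (Z2'Z2, Z2'y2) used
theta_hat, beta_hat = two_sample_2sls(Z1.T @ Z1, Z1.T @ x1, Z2.T @ Z2, Z2.T @ y2)
```

With α = 0 the Stage 2 objective is a quadratic in β, so setting its derivative to zero gives the ratio returned above. Handling invalid IVs would require the constrained or penalized fit described on the slide.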
Nonlinear effect?
Common Lab Tests Normal Ranges. (Source: Healthline)
Component | Normal range |
---|---|
White blood cells | 3,500 to 10,500 cells/mcL |
Platelets | 150,000 to 450,000/mcL |
Glucose | 70-99 mg/dL |
CO2 | 23-29 mEq/L |
Calcium | 8.6-10.2 mg/dL |
- U-shaped (nonlinear) causal effect
- Other examples:
  - body weight / cholesterol levels -> longevity
  - alcohol consumption -> CAD
  - exercise -> immune responsive disease resistance
Difficulty
- The sample size of individual-level data is relatively small
- GWAS summary data cannot be used to learn the nonlinear pattern, yet we still want to use them
- It may be too "expensive" to accurately estimate the nonparametric nonlinear causal effect
Nonlinear causal model
Suppose (z, x, y) satisfy a nonlinear causal model:
$\phi(x) = z^\top\theta + w, \quad y = \beta\phi(x) + z^\top\alpha + \varepsilon.$
- β and ϕ are only identifiable up to a multiplicative scalar. Thus, we fix $\|\theta\|_2 = 1$ and β ≥ 0
- ϕ(⋅) is an arbitrary nonlinear transformation
- Incorporates the classical 2SLS and PT-2SLS as special cases
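The scale ambiguity mentioned above can be made explicit with a short worked step. This is a sketch of the standard rescaling argument, not a statement taken from the talk; the constant $c$ is introduced here for illustration only.

```latex
% For any constant c > 0, rescale the first equation and compensate in the second:
\begin{aligned}
  c\,\phi(x) &= z^\top (c\,\theta) + c\,w, \\
  y &= \frac{\beta}{c}\,\bigl(c\,\phi(x)\bigr) + z^\top\alpha + \varepsilon .
\end{aligned}
% Hence (\phi, \theta, w, \beta) and (c\phi, c\theta, cw, \beta/c) describe the same
% distribution of (z, x, y). Fixing \|\theta\|_2 = 1 pins down |c|, and requiring
% \beta \ge 0 fixes the remaining sign.
```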

Interpretation
- β is called the marginal causal effect
- ϕ(⋅) is called the nonlinear causal transformation
- βϕ(⋅) is called the nonlinear causal effect
- β>0 indicates the presence of a causal relation; hypothesis tests and CIs for β are developed
- ϕ(⋅) can also be estimated
- If the model is well-specified, βϕ(⋅) → ATE
Suppose (z, x, y) satisfy a nonlinear causal model:
$\phi(x) = z^\top\theta + w, \quad y = \beta\phi(x) + z^\top\alpha + \varepsilon.$
Method
Observation
- Very similar to the single index model: $x \perp z \mid z^\top\theta$
- θ can be estimated via sliced inverse regression (SIR; Li (1991)) or sufficient dimension reduction (SDR; Cook (2009)), WITHOUT estimating ϕ(⋅)!
Once $\hat\theta$ is obtained ...
- Impute ϕ(x) as $z^\top\hat\theta$
- Plug into Stage 2 and solve for β via a sparse regression, as in 2SLS
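The Stage 1 idea, estimating the direction θ without ever fitting ϕ, can be sketched with a textbook version of sliced inverse regression. This is a generic SIR implementation written for illustration, not the code of the nl-causal package; the function name `sir_direction`, the number of slices, and the toy data are assumptions.

```python
import numpy as np

def sir_direction(Z, x, n_slices=10):
    """Leading SIR direction for the single-index relation between Z and x (unit norm)."""
    n, p = Z.shape
    Zc = Z - Z.mean(axis=0)
    Sigma = Zc.T @ Zc / n
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_inv_half = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Zw = Zc @ Sigma_inv_half                     # whitened instruments
    order = np.argsort(x)                        # slice observations by the exposure
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Zw[idx].mean(axis=0)                 # slice mean of whitened Z
        M += (len(idx) / n) * np.outer(m, m)     # weighted covariance of slice means
    _, vecs = np.linalg.eigh(M)                  # eigenvalues in ascending order
    theta = Sigma_inv_half @ vecs[:, -1]         # map leading eigenvector back to Z scale
    return theta / np.linalg.norm(theta)

# Toy check with phi(x) = cbrt(x): the transformation is never passed to SIR.
rng = np.random.default_rng(0)
n, p = 5000, 5
theta_true = np.array([1.0, -1.0, 0.5, 0.0, 0.5])
theta_true /= np.linalg.norm(theta_true)
Z = rng.normal(size=(n, p))
w = 0.5 * rng.normal(size=n)
x = (Z @ theta_true + w) ** 3
theta_hat = sir_direction(Z, x)
alignment = abs(float(theta_hat @ theta_true))   # |cosine| between estimate and truth
```

An eigenvector is only defined up to sign, so the sketch compares directions through the absolute cosine. How the sign is fixed in the actual method is not shown on the slides.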
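Inference
Consider the hypotheses:
$H_0: \beta = 0, \quad H_1: \beta > 0,$
where rejecting the null hypothesis $H_0$ provides evidence for a causal influence of the exposure x on the outcome y.
Define the pivotal test statistic:
$T = \frac{n_2^{1/2}\,\beta\,\bigl(\theta^\top\Sigma\theta - \theta^\top\Sigma_{*A}(\Sigma_{AA})^{-1}\Sigma_{A*}\theta\bigr)^{1/2}}{\sigma_e},$
where $A = \{j : \alpha_j \neq 0\}$, and $\Sigma_{*A}$, $\Sigma_{A*}$ denote the columns and rows of Σ indexed by A, respectively.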

Does NOT require an estimation of ϕ(⋅)
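Evaluating the statistic is a few lines of linear algebra once estimates are plugged in. The sketch below assumes that Σ is the covariance matrix of z, that T is compared against a standard normal under the null (one-sided, matching the alternative β > 0), and that all inputs are supplied by the user; the function name `causal_test` is hypothetical.

```python
import math
import numpy as np

def causal_test(beta_hat, sigma_e, theta, Sigma, A, n2):
    """Plug-in test statistic T and a one-sided normal p-value (normal reference assumed)."""
    theta = np.asarray(theta, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    quad = float(theta @ Sigma @ theta)              # theta' Sigma theta
    A = list(A)
    if A:
        v = Sigma[:, A].T @ theta                    # Sigma_{A*} theta (Sigma symmetric)
        S_AA = Sigma[np.ix_(A, A)]
        quad -= float(v @ np.linalg.solve(S_AA, v))  # subtract the part explained by z_A
    T = math.sqrt(n2) * beta_hat * math.sqrt(quad) / sigma_e
    p_value = 0.5 * math.erfc(T / math.sqrt(2.0))    # P(N(0,1) > T)
    return T, p_value

# Toy usage with an identity covariance and no invalid IVs.
Sigma = np.eye(3)
theta = np.array([1.0, 0.0, 0.0])
T, p_value = causal_test(beta_hat=0.2, sigma_e=1.0, theta=theta, Sigma=Sigma, A=[], n2=100)
```

The subtraction inside the square root removes the variation in zᵀθ that is already explained by the instruments in A, so the statistic shrinks when the imputed exposure is strongly correlated with those instruments.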
Misspecified nonlinearity
The nonlinear transformation ϕ(⋅) may be misspecified in practice, especially when the two structural equations do not share the same transformation of the exposure:
$\phi(x) = z^\top\theta + w, \quad y = \beta\psi(x) + z^\top\alpha + \varepsilon,$
where ϕ ≠ ψ are two different nonlinear functions. Hypothesis testing remains valid in this case.
Corollary 1. In the above model, under the same conditions and with the same test, the Type-I error is controlled at the nominal significance level α under the null hypothesis.
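One way to see why the null behaviour is unaffected is the following informal step. It is a heuristic reading of the corollary, not its proof, and it uses only the two structural equations stated above.

```latex
% Under H_0 : \beta = 0 the outcome equation no longer involves the exposure:
y = \beta\,\psi(x) + z^\top\alpha + \varepsilon
  \;\overset{\beta = 0}{=}\; z^\top\alpha + \varepsilon .
% The first equation, \phi(x) = z^\top\theta + w, is unchanged, so Stage 1 still
% targets the same \theta. The data distribution under the null is therefore the
% same whether the second equation is written with \psi or with \phi.
```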
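Estimation of the nonlinear transformation
$\phi(x) = z^\top\theta + w, \quad y = \beta\phi(x) + z^\top\alpha + \varepsilon.$
- ϕ can be estimated by a two-stage procedure:
  - Estimate $E(z^\top\theta \mid x)$ by a nonparametric regression
  - ρ is estimated via the uncorrelatedness between $z^\top\theta$ and w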
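The first step, a nonparametric regression of zᵀθ on x, can be sketched with a Nadaraya-Watson smoother. The choice of smoother, the Gaussian kernel, the bandwidth, and the toy data are all assumptions made for illustration. The slides do not define ρ or say which regression method is used, so the sketch stops before the ρ step.

```python
import numpy as np

def nw_regression(x_train, s_train, x_eval, bandwidth):
    """Nadaraya-Watson estimate of E(s | x) at the points x_eval (Gaussian kernel)."""
    x_train = np.asarray(x_train, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    d = (x_eval[:, None] - x_train[None, :]) / bandwidth
    K = np.exp(-0.5 * d ** 2)                    # kernel weights, one row per eval point
    return (K @ np.asarray(s_train, dtype=float)) / K.sum(axis=1)

# Toy data: s stands in for z' theta_hat, and phi(x) = cbrt(x) = s + w.
rng = np.random.default_rng(0)
n = 20000
s = rng.normal(size=n)
w = rng.normal(size=n)
x = (s + w) ** 3
x_grid = np.array([-8.0, -1.0, 1.0, 8.0])
m_hat = nw_regression(x, s, x_grid, bandwidth=0.3)
```

In this Gaussian toy example E(s | x) equals 0.5 * cbrt(x), that is, ϕ(x) times the factor Var(s) / Var(s + w). The regression therefore recovers the shape of ϕ up to a scale, which is consistent with the need for a separate scaling step.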
Simulation
- Performance is evaluated for both β and ϕ(⋅).
- For the proposed method (2SIR), we also combine the tests based on different slices using the Cauchy combination method (Liu et al. 2020); the combined test is denoted Comb-2SIR.
- The results are compared against 2SLS and against 2SLS with the Yeo-Johnson power transformation (a generalization of the Box-Cox transformation; Yeo 2000), denoted 2SLS and PT-2SLS, respectively.
- Six transformations are considered in the simulation:
  - linear: $\phi(x) = x$
  - logarithm: $\phi(x) = \log(x)$
  - inverse: $\phi(x) = 1/x$
  - piecewise linear: $\phi(x) = x\,I(x \le 0) + 0.5\,x\,I(x > 0)$
  - cube root: $\phi(x) = x^{1/3}$
  - quadratic: $\phi(x) = x^2$
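The Cauchy combination step used for Comb-2SIR can be sketched as follows. The formula is the standard Cauchy combination rule with equal weights by default; how the per-slice p-values are produced and weighted in the actual method is not shown on the slides, so the function name `cauchy_combine` and the example inputs are hypothetical.

```python
import math

def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (weights must sum to 1)."""
    k = len(p_values)
    if weights is None:
        weights = [1.0 / k] * k                  # equal weights by default
    # Map each p-value to a standard Cauchy variate and take the weighted sum.
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values))
    # Convert the combined statistic back to a p-value via the Cauchy tail.
    return 0.5 - math.atan(t) / math.pi

# Example: p-values from tests run with different slice settings (made-up numbers).
p_comb = cauchy_combine([0.04, 0.01, 0.20])
```

A single p-value is returned unchanged by this rule, and small individual p-values dominate the combination because the tangent transform is heavy-tailed.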
Simulation

Empirical Type I error (β0=0) and power (β0=0.05,0.10,0.15) of the proposed nonlinear causal test for the simulated example (marginal effect inference).


Application
The bar-plot of p-values of significant genes for AD by at least one method, where the y-axis represents −log10(p). The results are based on ADNI + IGAP GWAS datasets.

- 12 genes are significant by 2SLS and/or PT-2SLS; 18 are significant by 2SIR and/or Comb-2SIR.
- 7 genes, including TOMM40, are identified only by Comb-2SIR. We searched for these genes in GWAS results and found that ALL of them have been reported to be significantly associated with AD.
More results
APOC1: a significant gene over all methods.
BCL3: a significant gene identified only by 2SIR/Comb-2SIR.
Negative control derived from ADNI, where outcomes are permuted.
More simulated examples based on different sample sizes and dimensions with:
- Standard setting
- Invalid IVs
- Categorical IVs
- Weak IVs
- Non-additive and epistatic effects
- Misspecified models
Software
Compared with 2SLS
Strength
- 2SIR relaxes the linearity assumption underlying the relationships among (z, x, y).
- Compatibility: the method exhibits minimal power loss when the underlying true model is linear and the same datasets are used.
- Easy to use, well-documented software, more power
Weakness
- Additional assumptions on instrumental variables (IVs): z should follow an elliptically symmetric distribution; however, this issue appears to be relatively minor in TWAS, see Example 3.
- Cannot use summary statistics (such as $Z^\top x$) in Stage 1
Thank you!


If you like nl-causal, please star 🌟 our GitHub repository. Thank you!


nl-causal, by statmlben
[CLeaR2024] Inference of Nonlinear Causal Effects with Application to TWAS with GWAS Summary Data