Meta-analysis using Stata
Houssein Assaad
Senior Statistician and Software Developer
StataCorp LLC
Stata Conference
July 30, 2020
Philadelphia Inlaws' basement
Outline
 What is meta-analysis? Why and when should you use it?
 Data setup: Effect sizes and meta-analysis models
 The meta suite: Exploring the syntax
 Examples: Two case studies
   Subgroup analysis: BCG vaccine efficacy against tuberculosis
   Publication bias: NSAIDS data
 The meta control panel
 Summary
What is meta-analysis (MA)?
 MA is the science of combining results from multiple studies addressing a similar scientific question
 MA has been used mostly in medicine, but also in econometrics, ecology, psychology, and education, to name a few
 The goal of MA is to explore consistencies and discrepancies among the studies and, if sensible, provide a unified conclusion
 Potential problem: publication bias, which occurs when the results of the published literature in a domain differ systematically from the results of all the relevant research
Why would you want to use meta-analysis?
Potential advantages of MA include:
 An increase in power and an improvement in precision
 The ability to answer questions not posed by individual studies and to settle controversies arising from conflicting claims
Data setup
 \(K\) studies, each comparing a treatment group with a control group
 Study \(j\) reports an estimate \(\hat{\theta}_j\) of its effect size \(\theta_j\) and the standard error \(\hat{\sigma}_j\) of that estimate
 Effect size (ES): a value that reflects the magnitude of group differences or the strength of a relationship between two variables
   Group differences (treatment group vs control group): e.g. OR, RR, RD, Hedges's \(g\), Cohen's \(d\), etc.
   Relationship between variable 1 and variable 2: e.g. correlation coefficient \(r\), regression coefficient \(\beta\), etc.
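MA models
\(K\) independent studies, each reports:
 An estimate, \(\hat{\theta}_j\), of the true (unknown) effect size \(\theta_j\)
 An estimate, \(\hat{\sigma}_j\), of its standard error

Model                 Assumption                                  Target of inference
common effect (CE)    \(\theta_1 = \cdots = \theta_K = \theta\)   common value \(\theta\)
fixed effects (FE)    \(\theta_1, \ldots, \theta_K\) fixed        weighted average of the \(\theta_j\)
random effects (RE)   \(\theta_j \sim N(\theta, \tau^2)\)         mean \(\theta\) of the distribution of the \(\theta_j\)

 Estimating \(\theta\) (and \(\tau^2\) with the RE model) is one of the main goals of MA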
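The three models can also be written out explicitly. The block below is the standard textbook formulation rather than something taken from the slides; the error terms \(\epsilon_j\) and \(u_j\) and the weights \(w_j\) are notation introduced here for illustration.

```latex
% Within-study (sampling) model, shared by the CE, FE, and RE models
\hat{\theta}_j = \theta_j + \epsilon_j, \qquad
\epsilon_j \sim N(0, \hat{\sigma}_j^2), \qquad j = 1, \ldots, K

% The RE model adds a between-study layer
\theta_j = \theta + u_j, \qquad u_j \sim N(0, \tau^2)

% Inverse-variance pooled estimate of the overall effect size
\hat{\theta} = \frac{\sum_{j=1}^{K} w_j \hat{\theta}_j}{\sum_{j=1}^{K} w_j}, \qquad
w_j =
\begin{cases}
  1/\hat{\sigma}_j^2 & \text{(CE, FE)} \\
  1/(\hat{\sigma}_j^2 + \hat{\tau}^2) & \text{(RE)}
\end{cases}
```

Under the RE model the extra \(\hat{\tau}^2\) in the denominator flattens the weights, so small studies count relatively more than under the CE model.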
Forest Plot
The meta suite
Exploring the syntax
Data setup and MA declaration
 meta set: precomputed (generic) effect sizes
 meta esize: effect sizes for binary data and effect sizes for continuous data
 Both create variables starting with _meta_ (e.g. _meta_es, _meta_se) to be used with all other commands
All other commands
 meta summarize
 meta forestplot
 meta funnelplot
 meta regress
 etc.
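Binary summary data:
  +-----------------------------------+
  | study   nt1     nt0   nc1     nc0 |
  |-----------------------------------|
  |     1     4     119    11     128 |
  |     2     6     300    29     274 |
  |     3     3     228    11     209 |
  |     4    62   13536   248   12619 |
  |     5    33    5036    47    5761 |
  +-----------------------------------+
. meta esize nt1 nt0 nc1 nc0

Continuous summary data:
  +-------------------------------------------------+
  | study   n1      m1     sd1   n2      m2     sd2 |
  |-------------------------------------------------|
  |     1   13   0.096   0.020   14   0.920   0.047 |
  |     2   18   0.000   0.066   11   1.110   0.094 |
  |     3   10   0.054   0.088   11   0.956   0.040 |
  |     4   15   0.000   0.019   20   0.899   0.098 |
  |     5   15   0.036   0.020   10   1.102   0.014 |
  +-------------------------------------------------+
. meta esize n1 m1 sd1 n2 m2 sd2

Precomputed effect size data:
  +----------------------+
  | study     ES   ES_se |
  |----------------------|
  |     1    .03    .125 |
  |     2    .12    .147 |
  |     3   -.14    .167 |
  |     4   1.18    .373 |
  |     5    .26    .369 |
  +----------------------+
. meta set ES ES_se

Precomputed effect size data (ES and CI):
  +---------------------------------------+
  | study     ES          cil         ciu |
  |---------------------------------------|
  |     1    .03    -.2149955    .2749955 |
  |     2    .12   -.16811471   .40811471 |
  |     3   -.14   -.46731399   .18731399 |
  |     4   1.18    .44893343   1.9110666 |
  |     5    .26   -.46322671   .98322671 |
  +---------------------------------------+
. meta set ES cil ciu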
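The two meta set forms can be tried on a small typed-in dataset. This is a minimal sketch: the values are copied from the precomputed table above, the negative sign for study 3 is an assumption inferred from its CI limits, and the lowercase variable names (es, se, cil, ciu) are chosen here for illustration.

```stata
clear
input study es se
1  0.03 0.125
2  0.12 0.147
3 -0.14 0.167
4  1.18 0.373
5  0.26 0.369
end

* Form 1: effect sizes and their standard errors
meta set es se

* Form 2: effect sizes and 95% CI limits (rebuilt here from es and se)
generate double cil = es - invnormal(0.975)*se
generate double ciu = es + invnormal(0.975)*se
meta set es cil ciu

* Either declaration is enough for the rest of the suite
meta summarize
```

With the CI form, meta set recovers the standard errors from the limits, so the two declarations should lead to the same analysis.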
Scenario I
Precomputed effect sizes
 Correlation \(r\), \(\log\)(HR), \(\text{logit}\)(\(p\)), etc.
 meta set ES ES_se
Scenario II
Effect sizes computed from summary data
 Binary summary data (\(2\times 2\) tables)
   log odds-ratio \(\log\)(OR), \(\log\)(ORpeto), log risk-ratio \(\log\)(RR), and risk difference RD
   meta esize nt1 nt0 nc1 nc0
 Continuous summary data (sample size, mean, and standard deviation for each group)
   Hedges's \(g\), Cohen's \(d\), Glass's \(\Delta_1\) and \(\Delta_2\), and (raw) mean difference \(D\)
   meta esize n1 m1 sd1 n2 m2 sd2
Effect sizes for binary data
. webuse bcg, clear
(Efficacy of BCG vaccine against tuberculosis)
. keep studylbl npost-nnegc
. describe

              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
studylbl        str27   %27s                  Study label
npost           int     %9.0g                 Number of TB positive cases in treated group
nnegt           long    %9.0g                 Number of TB negative cases in treated group
nposc           int     %9.0g                 Number of TB positive cases in control group
nnegc           long    %9.0g                 Number of TB negative cases in control group

. list in 1/3
     +----------------------------------------------------------+
     | studylbl                    npost   nnegt   nposc   nnegc |
     |----------------------------------------------------------|
  1. | Aronson, 1948                   4     119      11     128 |
  2. | Ferguson & Simes, 1949          6     300      29     274 |
  3. | Rosenthal et al., 1960          3     228      11     209 |
     +----------------------------------------------------------+

Each study \(j\) yields a \(2\times 2\) table, e.g. for study 1:

group        TB+           TB-
vaccinated   npost = 4     nnegt = 119
control      nposc = 11    nnegc = 128

meta esize computes one of \(\log\)(OR), \(\log\)(ORpeto), \(\log\)(RR), or RD, together with their SEs and CIs
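To see what such an effect size looks like for one \(2\times 2\) table, the log odds-ratio and its standard error can be computed by hand. This sketch uses the usual textbook formulas, which are not spelled out on the slides; the variable names lnor and se_lnor are hypothetical.

```stata
webuse bcg, clear

* log odds-ratio: ln( (a*d) / (b*c) ) for the table  a=npost b=nnegt
*                                                     c=nposc d=nnegc
generate double lnor = ln((npost*nnegc)/(nnegt*nposc))

* its large-sample standard error: sqrt(1/a + 1/b + 1/c + 1/d)
generate double se_lnor = sqrt(1/npost + 1/nnegt + 1/nposc + 1/nnegc)

list studylbl lnor se_lnor in 1/3
```

For study 1 this gives ln(512/1309), about -0.939, which is the kind of per-study value that meta esize stores in _meta_es when no zero-cell adjustment is needed.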
Effect sizes for binary data
. meta esize npost nnegt nposc nnegc

Meta-analysis setting information

 Study information
    No. of studies:  13
       Study label:  Generic                   <- controlled by studylabel()
        Study size:  _meta_studysize
      Summary data:  npost nnegt nposc nnegc

       Effect size
              Type:  lnoratio                  <- controlled by esize()
             Label:  Log Odds-Ratio            <- controlled by eslabel()
          Variable:  _meta_es
   Zero-cells adj.:  None; no zero cells       <- controlled by zerocells()

         Precision
         Std. Err.:  _meta_se
                CI:  [_meta_cil, _meta_ciu]
          CI level:  95%                       <- controlled by level()

  Model and method                             <- controlled by random[()], fixed[()], and common[()]
             Model:  Random-effects
            Method:  REML
 We can now use, for example, meta summarize to compute the overall effect size (the mean log odds-ratio in this example)
. meta summarize

  Effect-size label:  Log Odds-Ratio
        Effect size:  _meta_es
          Std. Err.:  _meta_se

Meta-analysis summary                     Number of studies =     13
Random-effects model                      Heterogeneity:
Method: REML                                          tau2 =  0.3378
                                                    I2 (%) =   92.07
                                                        H2 =   12.61

---------------------------------------------------------------------
            Study |  Log Odds-Ratio   [95% Conf. Interval]   % Weight
------------------+--------------------------------------------------
          Study 1 |          -0.939     -2.110       0.233       4.98
          Study 2 |          -1.666     -2.560      -0.772       6.34
          Study 3 |          -1.386     -2.677      -0.096       4.49
 (Output omitted)
         Study 11 |          -0.341     -0.560      -0.121       9.88
         Study 12 |           0.447     -0.986       1.879       3.97
         Study 13 |          -0.017     -0.542       0.507       8.45
------------------+--------------------------------------------------
            theta |          -0.745     -1.110      -0.381
---------------------------------------------------------------------
Test of theta = 0: z = -4.01                    Prob > |z| = 0.0001
Test of homogeneity: Q = chi2(12) = 163.16        Prob > Q = 0.0000
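The pooled estimate is on the log scale, so it is often easier to read after exponentiating. A minimal sketch, assuming that meta summarize leaves the overall effect size in r(theta), in the same way the deck later reads r(theta) after meta funnelplot:

```stata
webuse bcg, clear
quietly meta esize npost nnegt nposc nnegc
quietly meta summarize

* r(theta) is assumed to hold the pooled log odds-ratio
display "Pooled log(OR) = " %6.3f r(theta)
display "Pooled OR      = " %6.3f exp(r(theta))
```

A pooled log odds-ratio near -0.745 corresponds to an odds ratio of roughly 0.47, that is, lower odds of TB in the vaccinated groups.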
 Compute \(\log\)(RR) (esize(lnrratio)) and use an RE model based on the DerSimonian-Laird method (random(dlaird))
. meta esize npost nnegt nposc nnegc, esize(lnrratio) random(dlaird)
Or equivalently,
. meta update, esize(lnrratio) random(dlaird)

Meta-analysis setting information

 Study information
    No. of studies:  13
 (omitted output)

       Effect size
              Type:  lnrratio
             Label:  Log Risk-Ratio
          Variable:  _meta_es
   Zero-cells adj.:  None; no zero cells
 (omitted output)

  Model and method
             Model:  Random-effects
            Method:  DerSimonian-Laird

You may change the default MA model using one of the options random[()], common[()], or fixed[()], and the default effect size via the option esize()
You may provide more descriptive labels for the studies and the effect size using the options studylabel() and eslabel()
. meta update, studylabel(studylbl) eslabel("My label")

Meta-analysis setting information from meta esize

 Study information
    No. of studies:  13
       Study label:  studylbl
        Study size:  _meta_studysize
      Summary data:  npost nnegt nposc nnegc

       Effect size
              Type:  lnrratio
             Label:  My label
          Variable:  _meta_es
   Zero-cells adj.:  None; no zero cells

         Precision
         Std. Err.:  _meta_se
                CI:  [_meta_cil, _meta_ciu]
          CI level:  95%

  Model and method
             Model:  Random-effects
            Method:  DerSimonian-Laird
 If there are zero cells, you may specify how to handle them via the zerocells() option
. meta update, zerocells(.2)
// or
. meta update, zerocells(tacc)
We will construct a forest plot for the first 4 studies to see the effect of adding study labels and an effect-size label
. meta update, studylabel(studylbl) eslabel("Log(RR)")
. meta forestplot in 1/4
[Figure: forest plot with options studylabel(studylbl) and eslabel("Log(RR)")]
[Figure: forest plot without options studylabel() and eslabel()]
 At any point in your analysis, you may use meta query to remind yourself of your current MA settings
. meta query
-> meta esize npost nnegt nposc nnegc , esize(lnrratio) studylabel(studylbl) eslabel(My
> label) random(dlaird)

Meta-analysis setting information from meta esize

 Study information
    No. of studies:  13
       Study label:  studylbl
        Study size:  _meta_studysize
      Summary data:  npost nnegt nposc nnegc

       Effect size
              Type:  lnrratio
             Label:  My label
          Variable:  _meta_es
   Zero-cells adj.:  None; no zero cells

         Precision
         Std. Err.:  _meta_se
                CI:  [_meta_cil, _meta_ciu]
          CI level:  95%

  Model and method
             Model:  Random-effects
            Method:  DerSimonian-Laird
Summary I: syntax
 If you have access to summary data, use meta esize to compute and declare effect sizes such as an odds ratio or Hedges's \(g\).
 Alternatively, if you have only precomputed (generic) effect sizes, use meta set.
 meta set and meta esize create system variables with names starting with _meta_ to be used by all subsequent meta commands.
 To update some of your meta-analysis settings after the declaration, use meta update.
 To check whether your data are already meta set or to see the current meta settings, use meta query.
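The declare, query, update cycle summarized here can be sketched end to end on the BCG data. Every command and option below appears earlier in the deck; only their combination into one short script is new.

```stata
webuse bcg, clear

* declare: compute log risk-ratios and attach study labels
meta esize npost nnegt nposc nnegc, esize(lnrratio) studylabel(studylbl)

* inspect the current settings
meta query

* change the RE estimation method without redeclaring
meta update, random(dlaird)

* the _meta_ system variables are ordinary variables and can be listed
list studylbl _meta_es _meta_se in 1/3
```

Because the settings live with the data, later commands such as meta summarize need no further options to pick them up.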
Datasets used
Two datasets (bcg.dta and nsaids.dta) will be used throughout this webinar; you may further explore them below
Exploring heterogeneity
Subgroup analysis
Case study: Efficacy of BCG vaccine against tuberculosis
 Heterogeneity: variability among the effect sizes beyond what is expected due to random sampling (chance).
 Exploring the possible reasons for heterogeneity between studies is an important aspect of an MA
. webuse bcgset, clear
(Efficacy of BCG vaccine against tuberculosis; set with meta esize)
. describe npost-studylbl

              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
npost           int     %9.0g                 Number of TB positive cases in treated group
nnegt           long    %9.0g                 Number of TB negative cases in treated group
nposc           int     %9.0g                 Number of TB positive cases in control group
nnegc           long    %9.0g                 Number of TB negative cases in control group
latitude        byte    %9.0g                 Absolute latitude of the study location (in
                                                degrees)
studylbl        str27   %27s                  Study label

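 The MA consists of 13 studies (Colditz et al. [1994]) evaluating the efficacy of the BCG vaccine against tuberculosis (TB)
 Vaccine efficacy has been controversial
. list author npost-nnegc latitude in 1/3
     +-------------------------------------------------------------+
     | author               npost   nnegt   nposc   nnegc   latitude |
     |-------------------------------------------------------------|
  1. | Aronson                  4     119      11     128         44 |
  2. | Ferguson & Simes         6     300      29     274         55 |
  3. | Rosenthal et al.         3     228      11     209         42 |
     +-------------------------------------------------------------+
. meta esize npost-nnegc, esize(lnrratio) studylabel(studylbl)
. meta forestplot
. meta forest, eform nullrefline
[Figure: forest plot of the 13 studies; annotations point out nonsignificant RRs and nonoverlapping CIs]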
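Quantifying heterogeneity
 Total observed heterogeneity has two components: sampling error (within-study heterogeneity) and between-study heterogeneity
 Subgroup analysis focuses on explaining the between-study heterogeneity
 meta summarize and meta forestplot report heterogeneity statistics (tau2, I2, H2) and the homogeneity test (Q)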
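The heterogeneity statistics in the meta summarize output are linked by simple formulas. The block below is a sketch of the usual definitions for a random-effects model (following Higgins and Thompson); the symbol \(s^2\), a "typical" within-study variance, and the weights \(w_j = 1/\hat{\sigma}_j^2\) are notation introduced here, not taken from the slides.

```latex
% Share of total variability attributable to between-study heterogeneity
I^2 = \frac{\hat{\tau}^2}{\hat{\tau}^2 + s^2} \times 100\%, \qquad
H^2 = \frac{\hat{\tau}^2 + s^2}{s^2} = \frac{1}{1 - I^2/100}

% "Typical" within-study variance
s^2 = \frac{(K-1)\sum_{j} w_j}{\left(\sum_{j} w_j\right)^2 - \sum_{j} w_j^2},
\qquad w_j = 1/\hat{\sigma}_j^2

% Check against the REML output for the BCG log odds-ratios (I^2 = 92.07):
% H^2 = 1 / (1 - 0.9207) = 1 / 0.0793 \approx 12.61
```

The check reproduces the reported H2 of 12.61, which suggests these are the definitions at work in that output.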
Subgroup analysis
 Subgroup analysis involves dividing the data into subgroups in order to make comparisons between them.
 The studies are grouped based on study or participants' characteristics, and an overall effect-size estimate is computed for each group
 The goal of subgroup analysis is to compare these overall estimates across groups and determine whether the considered grouping helps explain some of the observed between-study heterogeneity.
Compare the BCG vaccine efficacy in cold vs hot climates
 Berkey et al. (1995) and Borenstein et al. (2009) suggested that latitude (as a surrogate for climate) could explain some of the variation in the efficacy of the BCG vaccine
 We will dichotomize latitude into two categories: hotter climate vs colder climate
. generate byte latitude_01 = latitude_c > 0
. label define latval 0 "hot climate" 1 "cold climate"
. label values latitude_01 latval
. list studylbl latitude latitude_01 in 1/5
     +-------------------------------------------------------+
     | studylbl                       latitude   latitude_01 |
     |-------------------------------------------------------|
  1. | Aronson, 1948                        44   cold climate |
  2. | Ferguson & Simes, 1949               55   cold climate |
  3. | Rosenthal et al., 1960               42   cold climate |
  4. | Hart & Sutherland, 1977              52   cold climate |
  5. | Frimodt-Moller et al., 1973          13    hot climate |
     +-------------------------------------------------------+
. meta forestplot, subgroup(latitude_01) nullrefline rr
[Figure: subgroup forest plot showing the summary for each group and the test of \(H_0: \theta_{grp1} = \theta_{grp2}\)]
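The same subgroup comparison is available numerically. A minimal sketch that repeats the dichotomization above and then uses meta summarize with the subgroup() option, which the closing summary of the deck also mentions:

```stata
webuse bcgset, clear

* same grouping as in the forest plot: centered latitude above 0 = cold
generate byte latitude_01 = latitude_c > 0
label define latval 0 "hot climate" 1 "cold climate"
label values latitude_01 latval

* overall effect size within each climate group, plus between-group test
meta summarize, subgroup(latitude_01)
```

The table reports one pooled estimate per group, which is what the group-level summaries in the forest plot display.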
 You may report your results as vaccine efficacies via the transform() option
. meta forest, subgroup(latitude_01) ///
transform("Vaccine efficacy": efficacy)
Other supported transformations within the transform() option are corr, exp, invlogit, and tanh.
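Vaccine efficacy is conventionally one minus the risk ratio, so on the log(RR) scale the efficacy transformation amounts to 1 - exp(theta). A small numeric sketch, using a hypothetical log risk-ratio chosen only for illustration:

```stata
* hypothetical pooled log risk-ratio (RR = 0.5), for illustration only
scalar lnrr = ln(0.5)

* efficacy = 1 - RR = 1 - exp(log RR)
scalar efficacy = 1 - exp(lnrr)

display "RR = " exp(lnrr) ",  vaccine efficacy = " efficacy
```

A risk ratio of 0.5 therefore reads as a vaccine efficacy of 0.5: the vaccine halves the risk.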
Summary II
 Heterogeneity is the variability among the ES beyond what is expected due to random sampling.
 \(I^2\) and \(H^2\) are statistics used to quantify heterogeneity among the ES
 Whenever possible, the reasons behind heterogeneity should be explored via subgroup analysis or meta-regression.
 Large unexplained heterogeneity could mean that:
   the overall ES has no meaningful interpretation in practice
   it does not make sense to conduct a meta-analysis.
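When the moderator is continuous, meta-regression avoids the need to dichotomize. A minimal sketch using meta regress and estat bubbleplot, both listed among the suite's commands later in the deck, with the centered latitude variable as the moderator:

```stata
webuse bcgset, clear

* regress the study effect sizes on centered latitude
meta regress latitude_c

* bubble plot of effect sizes against the moderator, with the fitted line
estat bubbleplot
```

The coefficient on latitude_c estimates how the log risk-ratio changes per degree of latitude, and the residual heterogeneity shows how much remains unexplained.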
Small-study effects (publication bias)
 The term small-study effects (Sterne, Gavaghan, and Egger 2000) is used in MA to describe cases in which the results of smaller studies differ systematically from the results of larger studies.
 One of the reasons behind small-study effects is publication bias (or, more generally, reporting bias)
 Publication bias arises when the decision to publish a study depends on the statistical significance of its results.
 Suppose that we are missing some of the studies in our MA.
   If the studies not included in the MA (missing studies) are a random subset of all studies: valid conclusions, albeit wider CIs and less powerful tests (less information)
   If the missing studies are systematically different from the observed studies (e.g. when smaller studies with nonsignificant findings are suppressed from publication): our meta-analytic results will be biased and decisions based on them are invalid
Tools for small-study effects analysis
 The funnel plot: meta funnelplot
   Simple funnel plot
   Contour-enhanced funnel plot (one-sided and two-sided significance contours)
   Several precision metrics
 Tests for small-study effects: meta bias
   Egger's, Peters's, and Harbord's regression-based tests, with the possibility to include moderators to account for heterogeneity
   Begg and Mazumdar's test
 The trim-and-fill analysis: meta trimfill
[Figure: two funnel plots, each marking the large studies and the small studies]
 Little evidence of small-study effects: large and small studies tell the same story about \(\theta\), which means the individual ES should be distributed randomly around the overall ES
 Small-study effects (potentially due to publication bias): large and small studies tell different stories about \(\theta\)
. webuse nsaidsset, clear
(Effectiveness of nonsteroidal anti-inflammatory drugs; set with meta esize)
. meta funnelplot

  Effect-size label:  Log Odds-Ratio
        Effect size:  _meta_es
          Std. Err.:  _meta_se
              Model:  Common-effect
             Method:  Inverse-variance
[Figure: funnel plot with a gap (missing studies?)]
 You may enhance the contour-enhanced funnel plot via the addplot() option
. scalar theta = r(theta) // obtained from the r() results of the previous meta funnel command
// position legend at 10 o'clock inside the graph region
. local legopts ring(0) position(10) cols(1) size(small) symxsize(*0.6)
. local opts horizontal range(0 1.6) lpattern(dash) lcolor("red") ///
legend(order(1 2 3 4 5 6) label(6 "95% pseudo CI") `legopts')
. meta funnel, contours(1 5 10) ///
addplot(function theta-1.96*x, `opts' || function theta+1.96*x, `opts')
 We will test for funnel-plot asymmetry, using Harbord's test instead of Egger's test because we are working with \(\log\)(OR)
. meta bias, harbord

  Effect-size label:  Log Odds-Ratio
        Effect size:  _meta_es
          Std. Err.:  _meta_se

Regression-based Harbord test for small-study effects
Random-effects model
Method: REML

H0: beta1 = 0; no small-study effects
            beta1 =      3.03
      SE of beta1 =     0.741
                z =      4.09
       Prob > |z| =    0.0000
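The regression-based tests share one idea: regress a measure of effect on a measure of precision and test the slope. The block below sketches the Egger-type form only; Harbord's variant for binary outcomes replaces the response and the regressor with score-based quantities, which are not reproduced here. The symbols \(\beta_0\), \(\beta_1\), and \(\epsilon_j\) are notation introduced for illustration.

```latex
% Egger-type regression of the effect sizes on their standard errors
\hat{\theta}_j = \beta_0 + \beta_1 \hat{\sigma}_j + \epsilon_j,
\qquad H_0\colon \beta_1 = 0 \quad \text{(no small-study effects)}

% Test statistic, checked against the Harbord output above
z = \frac{\hat{\beta}_1}{\mathrm{SE}(\hat{\beta}_1)}
  = \frac{3.03}{0.741} \approx 4.09
```

A slope far from zero means the effect sizes drift with study precision, which is what funnel-plot asymmetry looks like in regression form.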
 We can perform a trim-and-fill analysis to assess the effect of missing studies on the overall ES and request a contour-enhanced funnel plot based on the complete (observed + filled) set of studies
. meta trimfill, funnel(contours(1 5 10) legend(`legopts'))

Nonparametric trim-and-fill analysis of publication bias
Linear estimator, imputing on the left

Iteration                            Number of studies =     47
  Model: Random-effects                       observed =     37
 Method: REML                                  imputed =     10

Pooling
  Model: Random-effects
 Method: REML

---------------------------------------------------------------
             Studies |  Log Odds-Ratio    [95% Conf. Interval]
---------------------+-----------------------------------------
            Observed |           1.322       1.031       1.613
  Observed + Imputed |           1.035       0.726       1.343
---------------------------------------------------------------
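To judge how much the 10 imputed studies matter, it helps to move the two pooled estimates from the log scale to the odds-ratio scale. A small sketch; the scalar names are hypothetical and the two values are copied from the trim-and-fill table:

```stata
* pooled log odds-ratios copied from the trim-and-fill output
scalar lnor_obs = 1.322    // observed studies only
scalar lnor_all = 1.035    // observed + imputed studies

display "OR, observed studies:   " %5.2f exp(lnor_obs)
display "OR, observed + imputed: " %5.2f exp(lnor_all)
display "Ratio of the two ORs:   " %5.2f exp(lnor_all - lnor_obs)
```

The pooled odds ratio drops from roughly 3.75 to roughly 2.82 once the imputed studies are added, a reduction of about a quarter, while remaining above 1.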
Summary III
 Publication bias occurs if studies with favourable results are more likely to be published than studies with unfavourable results.
 Small-study effects are manifested graphically by funnel-plot asymmetry
 Publication bias is only one of the reasons behind funnel-plot asymmetry.
 Publication bias should be assessed after you have accounted for heterogeneity in your MA (see ex8 of meta funnelplot and ex1 of meta bias)
 You can investigate small-study effects visually via meta funnelplot, test for them via meta bias, and assess their impact on the overall ES via meta trimfill
Other features
 Cumulative (and stratified-cumulative) MA forest plots
 L'Abbé plots
 Multiple subgroup analyses forest plots
 Meta-regression
 Bubble plots after meta-regression
 Knapp-Hartung (aka Sidik-Jonkman) adjustments
 Effect sizes for continuous data (Hedges's \(g\), Cohen's \(d\), etc.)
 Precomputed effect sizes (correlation \(r\), \(\log\)(HR), etc.)
 Stratified funnel plots with various precision metrics
The meta control panel
Prefer to avoid typing commands? Everything I have shown you can be done in the meta control panel with a few mouse clicks
 meta set
 meta esize
 meta summarize
 meta forestplot
 meta labbeplot
 meta regress
 estat bubbleplot
 meta funnelplot
 meta bias
 meta trimfill
A tour of the meta control panel
Summary
 Effect sizes for binary and continuous data may be computed via meta esize, and generic (precomputed) ES may be specified via meta set
 Use meta update and meta query to update and describe your current MA settings, respectively
 Results of an MA are best summarized numerically using meta summarize or graphically using meta forestplot. This includes subgroup-analysis forest plots and cumulative MA (CMA) forest plots.
 When substantial heterogeneity is present among the studies, the reasons behind it should be explored via subgroup analysis (meta summarize, subgroup()) or meta-regression (meta regress)
 It is important to include an assessment of publication bias to ensure the integrity of the MA. This may be done using the meta funnelplot, meta bias, and meta trimfill commands
Thank You!