A unified model for calling variants in tumors while controlling the false discovery rate
Johannes Köster
University of Duisburg-Essen https://koesterlab.github.io
A unified model capturing all sources of uncertainty
Calling variants from tumor/normal pairs
Results
https://prosic.github.io
tumor:     CATCATTGAAATA----GGCACATGCTGCTCGAA
healthy:   CAGCATTGAAATATATAGGCACATGCTGCTCGAA
reference: CAGCATTGAAATATATAGGCACAT------CGAA
somatic SNV, somatic deletion, germline insertion
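The figure can be read column by column. The following is a minimal, deterministic sketch of that reading, not the probabilistic model of the poster: it assumes the three sequences are already aligned, that the first row is the tumor, the second the healthy sample, and the third the reference, and it calls a difference "germline" when the healthy sample shares it and "somatic" otherwise. The function names are hypothetical.

```python
# Aligned sequences copied from the figure ('-' marks a gap).
tumor     = "CATCATTGAAATA----GGCACATGCTGCTCGAA"
healthy   = "CAGCATTGAAATATATAGGCACATGCTGCTCGAA"
reference = "CAGCATTGAAATATATAGGCACAT------CGAA"


def classify_column(t, h, r):
    """Label one alignment column, or return None if tumor equals reference."""
    if t == r:
        return None
    # Assumption: a tumor difference shared by the healthy sample is germline.
    origin = "germline" if h == t else "somatic"
    if t == "-":
        kind = "deletion"
    elif r == "-":
        kind = "insertion"
    else:
        kind = "SNV"
    return f"{origin} {kind}"


def call_variants(tumor, healthy, reference):
    """Merge adjacent columns with the same label into (start, end, label)."""
    events = []
    columns = zip(tumor, healthy, reference)
    for pos, (t, h, r) in enumerate(columns, start=1):
        label = classify_column(t, h, r)
        if label is None:
            continue
        if events and events[-1][2] == label and events[-1][1] == pos - 1:
            # Extend the previous event by one column.
            events[-1] = (events[-1][0], pos, label)
        else:
            events.append((pos, pos, label))
    return events


events = call_variants(tumor, healthy, reference)
for start, end, label in events:
    print(f"columns {start}-{end}: {label}")
```

Positions are 1-based alignment columns, not reference coordinates. With real sequencing data, none of these comparisons is certain, which is the motivation for the model that follows.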
\[P(Z_i^t\mid\theta_h,\theta_c) = \overbrace{\pi_i^t}^{\text{correctly mapped}} \Big( \overbrace{\alpha \big(\overbrace{\theta_c \tau_t p_i}^{\text{variant}} + \overbrace{(1 - \theta_c \tau_t) a_i}^{\text{reference}} \big)}^{\text{from cancer cell}} + \overbrace{(1 - \alpha) \big(\overbrace{\theta_h \tau_h p_i}^{\text{variant}} + \overbrace{(1 - \theta_h \tau_h) a_i}^{\text{reference}} \big)}^{\text{from healthy cell}} \Big) + \overbrace{(1 - \pi_i^t) o_i}^{\text{wrongly mapped}}\]
Mapping uncertainty
Typing uncertainty
Purity
Sampling bias
Allele frequencies
\(p_i\) / \(a_i\) = \(\Pr(\text{left read}) \cdot \Pr(\text{right read}) \cdot \Pr(\text{insert size})\)
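To make the per-read likelihood concrete, here is a minimal sketch that evaluates it for given parameters. It assumes that a single mapping probability \(\pi_i^t\) appears in both the correctly mapped and the wrongly mapped term, that reads are independent (so log-likelihoods add), and that the sampling-bias factors default to 1. The function names, the tuple layout of a read, and the grid search are illustrative assumptions, not the inference procedure of the poster.

```python
from math import log


def read_likelihood(pi, p, a, o, theta_h, theta_c, alpha, tau_t=1.0, tau_h=1.0):
    """Likelihood of one tumor-sample read under the mixture model.

    pi: probability that the read is correctly mapped
    p, a: probability of the read given the variant / reference allele
    o: probability of the read given that it is wrongly mapped
    theta_h, theta_c: allele frequencies in healthy / cancer cells
    alpha: tumor purity; tau_t, tau_h: sampling-bias factors
    """
    from_cancer = theta_c * tau_t * p + (1 - theta_c * tau_t) * a
    from_healthy = theta_h * tau_h * p + (1 - theta_h * tau_h) * a
    mapped = alpha * from_cancer + (1 - alpha) * from_healthy
    return pi * mapped + (1 - pi) * o


def log_likelihood(reads, theta_h, theta_c, alpha):
    """Sum of per-read log-likelihoods, assuming independent reads."""
    return sum(
        log(read_likelihood(pi, p, a, o, theta_h, theta_c, alpha))
        for pi, p, a, o in reads
    )


# Hypothetical reads as (pi, p, a, o): five support the variant, five the reference.
reads = [(0.99, 0.9, 0.01, 0.001)] * 5 + [(0.99, 0.01, 0.9, 0.001)] * 5

# Coarse grid search over the cancer-cell allele frequency (illustration only).
grid = [i / 20 for i in range(21)]
best_theta_c = max(grid, key=lambda t: log_likelihood(reads, 0.0, t, alpha=1.0))

single = read_likelihood(0.9, 0.8, 0.1, 0.05, theta_h=0.0, theta_c=0.5, alpha=0.75)
```

With purity 1 and an even split of supporting reads, the grid maximum lies at an allele frequency of 0.5. A lower purity shifts the estimate upward, because fewer reads originate from cancer cells.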
Allele frequency estimation
Precision/Recall
False-discovery rate (FDR) control
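The poster does not spell out how the FDR is controlled. One standard Bayesian approach, assumed here purely for illustration, uses the posterior probability that each call is a true variant: the expected FDR of a set of calls is the mean of their posterior error probabilities, so the largest set whose mean stays below the threshold is reported. The function name and the example values are hypothetical.

```python
def control_fdr(posteriors, alpha):
    """Return indices of the largest call set with expected FDR <= alpha.

    posteriors: posterior probability that each call is a true variant.
    The expected FDR of a set is the mean posterior error probability (PEP).
    """
    # Sort calls from most to least certain (ascending PEP).
    peps = sorted((1.0 - p, i) for i, p in enumerate(posteriors))
    total = 0.0
    n_selected = 0
    for k, (pep, _) in enumerate(peps, start=1):
        total += pep
        if total / k > alpha:
            break  # the running mean only grows from here on
        n_selected = k
    return sorted(i for _, i in peps[:n_selected])


# Hypothetical posteriors for five candidate variants.
posteriors = [0.99, 0.6, 0.95, 0.2, 0.9]
selected = control_fdr(posteriors, alpha=0.05)
```

Because the calls are sorted by ascending error probability, the running mean never decreases, so the loop can stop at the first violation.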
Candidate variants + mapped NGS reads
simulated data
Poster PROSIC
By Johannes Köster
Poster at ETOS 2018