Bernhard Konrad
Institute of Applied Mathematics
The University of British Columbia
March 30th, 2015
Canada
BC
Survey Data from epidemiologists at the BCCDC in Vancouver:
Data mining, visualization, and modelling to gain actionable insight and predict the outcome of health interventions in the Vancouver HIV MSM epidemic.
Simplest model
Simplest version: Fit linear regression, extrapolate
Major shortcomings of linear regression:
Only 1 in 100 to 1 in 1000 exposures leads to infection (depending on the type of exposure)
Simplest stochastic analogue: Birth-death process
Birth rate: b (per virion)
Clearance rate: c (per virion)
Leads to exponential growth of the expected viral load with rate b-c
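The growth rate can be read off from the expected viral load. A short derivation, assuming N(t) denotes the number of virions at time t and that each virion replicates at rate b and is cleared at rate c independently of the others:

```latex
\frac{d}{dt}\,\mathbb{E}[N(t)] = b\,\mathbb{E}[N(t)] - c\,\mathbb{E}[N(t)]
\quad\Longrightarrow\quad
\mathbb{E}[N(t)] = N(0)\,e^{(b-c)\,t}
```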
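import random

def time_to_detection(b, c, inoculum_size, detection_limit):
    """Gillespie simulation: return the detection time, or None if the virus is cleared."""
    t = 0.0
    n = inoculum_size
    # Stop at detection, or at n == 0 (virus cleared, no further events possible)
    while 0 < n < detection_limit:
        # Time to next event is exponentially distributed with total rate (b + c) * n
        t += random.expovariate((b + c) * n)
        # Likelihood of each event is proportional to its reaction rate
        if random.random() < b / (b + c):
            n += 1  # birth event
        else:
            n -= 1  # clearance event
    return t if n >= detection_limit else None

Instead of millions of Gillespie simulations, my research group found a way to calculate pdfs semi-analytically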
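Set $P_v(t) = \mathrm{Prob}(N(t) = v)$
Then we can easily derive the Master equation (balance equation)
$\frac{dP_v}{dt} = b\,(v-1)\,P_{v-1} + c\,(v+1)\,P_{v+1} - (b+c)\,v\,P_v$
$b\,(v-1)\,P_{v-1}$: enter state v from v-1 by a single birth
$c\,(v+1)\,P_{v+1}$: enter state v from v+1 by a single clearance
$(b+c)\,v\,P_v$: leave state v by a birth or clearance event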
Knowing P means understanding everything!
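A little helper: The probability generating function
Set $G(z,t) = \sum_{v=0}^{\infty} P_v(t)\,z^v$, where $P_v(t)$ is the probability of v virions at time t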
Master equation: infinitely many ODEs!
Use branching property to derive single ODE
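Key observations:
Lineages founded by different virions evolve independently (branching property), so the generating function for $N_0$ initial virions is the $N_0$-th power of the generating function for a single virion: $G_{N_0}(z,t) = G_1(z,t)^{N_0}$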
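For the linear birth-death process, the single ODE can be written down explicitly. The slide's own equation is not preserved, so the following is the textbook backward equation, with $G_1(z,t)$ (notation assumed here) denoting the generating function of the viral load started from one virion. Conditioning on the first event, a birth leaves two independent lineages (contributing $G_1^2$) and a clearance leaves none (contributing 1):

```latex
\frac{\partial G_1}{\partial t}
  = b\,G_1^2 - (b+c)\,G_1 + c
  = (b\,G_1 - c)(G_1 - 1),
\qquad G_1(z,0) = z
```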
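The probabilities are the Taylor coefficients of the generating function $G(z,t) = \sum_v P_v(t)\,z^v$:
$P_v(t) = \frac{1}{v!}\,\frac{\partial^v G}{\partial z^v}(0,t)$
Instead of taking millions of derivatives (numerically unstable)...
...use Cauchy's integral formula!
$P_v(t) = \frac{1}{2\pi i}\oint_{|z|=r} \frac{G(z,t)}{z^{v+1}}\,dz$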
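As a rough illustration of the Cauchy-integral idea, the sketch below recovers probabilities from a generating function by sampling it on a circle and applying the trapezoidal rule. It is not the research group's implementation: it assumes the textbook closed-form generating function of a linear birth-death process started from one virion, and the rates, time point, function names, and number of sample points are all hypothetical choices.

```python
import cmath
import math

def pgf_single_virion(z, t, b, c):
    """Closed-form PGF of a linear birth-death process started from one virion (b != c)."""
    e = math.exp(-(b - c) * t)
    u = z - 1.0
    w = b * z - c
    return (c * u - w * e) / (b * u - w * e)

def probability(pgf, v, num_points=256, radius=1.0):
    """Approximate P_v = (1 / (2 pi i)) * contour integral of G(z) / z^(v+1) dz.

    With z = radius * exp(i * theta) the integral becomes the average of
    G(z) / z^v over the circle, discretized here with the trapezoidal rule.
    """
    total = 0.0
    for k in range(num_points):
        z = radius * cmath.exp(2j * math.pi * k / num_points)
        total += (pgf(z) / z**v).real
    return total / num_points

# Hypothetical parameters, chosen only for illustration.
b, c, t = 1.0, 0.5, 2.0
probs = [probability(lambda z: pgf_single_virion(z, t, b, c), v) for v in range(60)]

print("P(cleared by time t) =", round(probs[0], 4))
print("sum of first 60 probabilities =", round(sum(probs), 6))
```

No derivatives are taken: each probability costs only a few hundred evaluations of the generating function.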
Regardless of the method for the computation (Gillespie or semi-analytic), one major flaw remains:
HIV infection is very unlikely (0.1%-1%), but all observed patients became infected!
Extremely biased training set!
Maybe the virus was initially just "lucky" to grow faster than average!?
Heads: +1.1 points
Tails: -1 point
A) Start with 20 points
B) Start with 1 point
Lose when <= 0 points
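The effect of conditioning on survival in this coin game can be explored with a small simulation. This is a sketch under assumptions: a fair coin, and an arbitrary horizon of 200 flips and 2000 games per scenario (none of which are stated on the slide). The function names are illustrative.

```python
import random

def play(start, flips, rng):
    """Play one game; return the final score, or None if the player went bust."""
    score = start
    for _ in range(flips):
        score += 1.1 if rng.random() < 0.5 else -1.0
        if score <= 0:
            return None
    return score

def summarize(start, flips=200, trials=2000, seed=0):
    """Return (fraction of surviving games, mean gain per flip among survivors)."""
    rng = random.Random(seed)
    finals = [play(start, flips, rng) for _ in range(trials)]
    survivors = [s for s in finals if s is not None]
    survival = len(survivors) / trials
    if not survivors:
        return survival, float("nan")
    mean_gain = sum(s - start for s in survivors) / len(survivors) / flips
    return survival, mean_gain

surv_a, gain_a = summarize(start=20)  # scenario A
surv_b, gain_b = summarize(start=1)   # scenario B
print("A: survival", surv_a, "gain per flip among survivors", round(gain_a, 3))
print("B: survival", surv_b, "gain per flip among survivors", round(gain_b, 3))
```

The unconditioned expected gain is 0.05 points per flip. Games that survive from a single starting point tend to show a larger average gain, which is the same bias that appears when only infected patients are observed.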
Fit conditioned process to match the bias in the data
Virus not spontaneously cleared
Using Bayes rule and Markov property
where q is the likelihood that exposure to a single virion leads to infection
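The slide's formula is not preserved, so the following is one standard way to write the conditioning step, not necessarily the one presented. It assumes that lineages are independent, so that n virions all fail to establish infection with probability $(1-q)^n$, and that $q = 1 - c/b$ as in the simple birth-death model with $b > c$. The symbols $\tilde b_n$ and $\tilde c_n$ for the conditioned rates in state n are introduced here for illustration.

```latex
\tilde b_n = b\,n\,\frac{1-(1-q)^{\,n+1}}{1-(1-q)^{\,n}},
\qquad
\tilde c_n = c\,n\,\frac{1-(1-q)^{\,n-1}}{1-(1-q)^{\,n}}
```

Each rate is the unconditioned rate multiplied by the ratio of survival probabilities after and before the event (Bayes' rule plus the Markov property). Note that $\tilde c_1 = 0$, so the conditioned process is never cleared.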
For actual risk calculations, replace b with more detailed model of viral replication
Fit unconditioned and conditioned model to patient data, assuming different risk scenarios
Distribution of the length of the window period
Due to hook: Shorter than previously expected!
If infected: Averaged likelihood of detectable viral load at time t
Likelihood of infection given a negative test at time t
Day of negative test --- remaining risk of infection
5.2 --- 0.475%
5.9 --- 0.450%
7.2 --- 0.375%
9.2 --- 0.250%
12.8 --- 0.125%
18.8 --- 0.050%
22.2 --- 0.025%
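The remaining-risk figures follow the pattern of a Bayes' rule update. A minimal sketch, assuming a test with no false positives; the prior risk of 0.5% and the detection probabilities below are hypothetical placeholders, not values from the fitted model:

```python
def remaining_risk(prior_risk, p_detectable):
    """Posterior probability of infection after a negative test (Bayes' rule).

    prior_risk:   probability of infection from the exposure, before testing
    p_detectable: probability that an infected person has a detectable
                  viral load at the time of the test
    """
    p_negative_if_infected = 1.0 - p_detectable
    numerator = prior_risk * p_negative_if_infected
    # An uninfected person always tests negative (assumes no false positives).
    return numerator / (numerator + (1.0 - prior_risk))

prior = 0.005  # hypothetical 0.5% risk from the exposure
for p_detectable in (0.0, 0.25, 0.5, 0.9):
    risk = remaining_risk(prior, p_detectable)
    print(f"P(detectable) = {p_detectable:.2f} -> remaining risk = {100 * risk:.3f}%")
```

The later the negative test, the more likely an infection would already be detectable, so the remaining risk falls toward zero.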
Thank you!
Built on Nature paper:
Engineering anti-malaria agent that stops onwards transmission in mosquitoes
Trade-off between virulence and environmental effect
Led volunteers to build a free web app with 1500 solutions to previous UBC Math exam questions