Cam Davidson-Pilon
Follow these slides at bit.ly/nwcamdp
1. We hold different beliefs (read: assign different probabilities) about events such as political outcomes, based on our information about the world.
2. We should have similar beliefs that a rolled die will show a 6 one-sixth of the time.
P(A) is called the prior probability of event A occurring
P(A|X) is called the posterior probability of event A occurring, given information X
Coin Flip Example
P(A): The coin has a 50 percent chance of being Heads.
P(A|X): You look at the coin, observe that Heads has landed, denote this information X, and trivially assign probability 1.0 to Heads and 0.0 to Tails.
Buggy Code Example
P(A): This big, complex code likely has a bug in it.
P(A|X): The code passed all X tests; there still might be a bug, but its presence is less likely now.
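The update from P(A) to P(A|X) follows Bayes' rule, P(A|X) = P(X|A) P(A) / P(X). A minimal sketch of the buggy-code example is given here; the helper name `posterior` and all three input probabilities are hypothetical values chosen for illustration, not figures from the slides.

```python
def posterior(prior, p_x_given_a, p_x_given_not_a):
    """Bayes' rule: P(A|X) = P(X|A) P(A) / P(X)."""
    # P(X) by the law of total probability over A and not-A.
    evidence = p_x_given_a * prior + p_x_given_not_a * (1.0 - prior)
    return p_x_given_a * prior / evidence


# A = "the code has a bug", X = "the code passed all tests".
prior_bug = 0.8            # hypothetical prior belief that a bug exists
p_pass_given_bug = 0.5     # hypothetical: buggy code still passes half the time
p_pass_given_no_bug = 1.0  # assumed: bug-free code always passes

posterior_bug = posterior(prior_bug, p_pass_given_bug, p_pass_given_no_bug)
print(round(posterior_bug, 3))
```

Under these assumed numbers the belief in a bug drops from 0.8 to about 0.67 after the tests pass: less likely, but far from ruled out.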
Medical Patient Example
P(A): The patient could have any number of diseases.
P(A|X): Performing a blood test generated evidence X, ruling out some of the possible diseases from consideration.
¯\_(ツ)_/¯
import numpy as np
import pymc as pm  # PyMC 2.x API

# count_data (the observed counts) and n_count_data = len(count_data)
# are assumed to be defined before this point.
alpha = 0.5
lambda_1 = pm.Exponential("lambda_1", alpha)
lambda_2 = pm.Exponential("lambda_2", alpha)
tau = pm.DiscreteUniform("tau", 0, 75)

@pm.deterministic
def lambda_(tau=tau, lambda_1=lambda_1, lambda_2=lambda_2):
    dynamic_lambdas = np.zeros(n_count_data)
    dynamic_lambdas[:tau] = lambda_1  # lambda before tau is lambda_1
    dynamic_lambdas[tau:] = lambda_2  # lambda after (and including) tau is lambda_2
    return dynamic_lambdas

observations = pm.Poisson("obs", lambda_, value=count_data, observed=True)
model = pm.Model([observations, lambda_1, lambda_2, tau])
mcmc = pm.MCMC(model)
mcmc.sample(40000, 10000, 1)  # 40000 iterations, 10000 burn-in, thinning of 1
Group | Visitors | Conversions |
---|---|---|
Control | ? | ? |
Experiment | ? | ? |
Group | Visitors | Conversions |
---|---|---|
Control | 2000 | 100 |
Experiment | 2000 | 150 |
What the business units really want to know is:
What is the probability that the Experiment group converts better than Control?
Control | | Experiment | | Experiment better? |
---|---|---|---|---|
0.041 | < | 0.072 | == | 1 |
0.054 | < | 0.076 | == | 1 |
0.046 | < | 0.090 | == | 1 |
0.060 | < | 0.058 | == | 0 |
0.052 | < | 0.075 | == | 1 |
The estimate of p is 4/5 = 0.8, the fraction of sampled pairs in which Experiment beats Control.
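The same counting can be done with many more sampled pairs. The following is a minimal sketch using only the Python standard library. It assumes a uniform Beta(1, 1) prior on each conversion rate, which the slides do not state, so that each posterior is a Beta distribution. The function name `prob_experiment_beats_control` is hypothetical.

```python
import random


def prob_experiment_beats_control(control, experiment, n_samples=20000, seed=0):
    """Estimate P(p_experiment > p_control) by comparing posterior samples.

    control and experiment are (visitors, conversions) tuples.
    Assumes a uniform Beta(1, 1) prior on each rate (an assumption,
    not taken from the slides), giving Beta posteriors.
    """
    rng = random.Random(seed)
    c_visitors, c_conversions = control
    e_visitors, e_conversions = experiment
    wins = 0
    for _ in range(n_samples):
        # One posterior draw of each group's conversion rate.
        p_control = rng.betavariate(1 + c_conversions,
                                    1 + c_visitors - c_conversions)
        p_experiment = rng.betavariate(1 + e_conversions,
                                       1 + e_visitors - e_conversions)
        wins += p_control < p_experiment  # counts 1 when Experiment is better
    return wins / n_samples


estimate = prob_experiment_beats_control(control=(2000, 100),
                                         experiment=(2000, 150))
print(estimate)
```

With the table's counts (100 and 150 conversions out of 2000 visitors each), the estimate should come out very close to 1, since the two posteriors barely overlap.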
- No, both have preferable use cases
- Bayesian is preferable for small data or complex models
- Frequentist is preferable for large data
Survival Analysis in A/B testing
Pseudo-algorithm