Boomers Quantified Uncertainty. We Simulate It
[Video Credit: N-body simulation by Francisco Villaescusa-Navarro]
IAIFI Fellow
Carolina Cuesta-Lazaro
HPCD in Astrophysics
Decision making
Decision making in science
Is the current Standard Model ruled out by data?
Mass density
Vacuum Energy Density
CMB
Supernovae
Observation
Ground truth
Prediction
Uncertainty
Is it safe to drive there?
Interpretable Simulators
Noise in features
+ correlations
Noise in finite data realization
Uncertain parameters
Limited model architecture
Imperfect optimization
Ensembling / Bayesian NNs
Forward Model
Observable
Dark matter
Dark energy
Inflation
Predict
Infer
Parameters
Inverse mapping
Fault line stress
Plate velocity
Likelihood
Posterior
Prior
Evidence
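These four ingredients are tied together by Bayes' theorem (standard notation, with θ the parameters and x the data):

```latex
\underbrace{p(\theta \mid x)}_{\text{posterior}}
  = \frac{\overbrace{p(x \mid \theta)}^{\text{likelihood}} \;\, \overbrace{p(\theta)}^{\text{prior}}}
         {\underbrace{p(x)}_{\text{evidence}}},
\qquad
p(x) = \int p(x \mid \theta)\, p(\theta)\, \mathrm{d}\theta
```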
Markov Chain Monte Carlo (MCMC)
Hamiltonian Monte Carlo (HMC)
Variational Inference (VI)
Applicable if we can evaluate the posterior (up to normalization) but cannot sample from it
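As a minimal illustration of this "evaluate but can't sample" regime, here is a random-walk Metropolis sketch (a toy example, not from the slides; the standard-normal target and step size are arbitrary choices):

```python
import numpy as np

def metropolis(log_post, theta0, n_steps=5000, step=0.5, seed=0):
    """Random-walk Metropolis: needs only the unnormalized log-posterior."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    samples = []
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.shape)
        lp_prop = log_post(prop)
        # Accept with prob min(1, p(prop)/p(theta)); the normalization cancels
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        samples.append(theta.copy())
    return np.array(samples)

# Toy target: standard normal, known only up to a constant
chain = metropolis(lambda t: -0.5 * np.sum(t**2), np.zeros(1))
```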
Intractable
Unknown likelihoods
Amortized inference
Scaling to high-dimensional data
Marginalization over nuisance parameters
["Polychord: nested sampling for cosmology" Handley et al]
["Fluctuation without dissipation: Microcanonical Langevin Monte Carlo" Robnik and Seljak]
Higher Effective Sample Size (ESS) = less correlated samples
Number of Simulator Calls
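A common way to estimate ESS divides the chain length by its integrated autocorrelation time, truncating the autocorrelation sum at the first negative lag (one of several conventions); a rough numpy sketch:

```python
import numpy as np

def effective_sample_size(chain):
    """ESS = N / (1 + 2 * sum of autocorrelations up to the first negative lag)."""
    x = np.asarray(chain, dtype=float)
    n = len(x)
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[n - 1:] / (x @ x)
    tau = 1.0
    for rho in acf[1:]:
        if rho < 0:          # truncate: later lags are mostly noise
            break
        tau += 2.0 * rho
    return n / tau

rng = np.random.default_rng(0)
ess_iid = effective_sample_size(rng.standard_normal(2000))              # ~N: uncorrelated
ess_walk = effective_sample_size(np.cumsum(rng.standard_normal(2000)))  # tiny: very correlated
```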
Known likelihood
Differentiable simulators
z: All possible trajectories
Maximize the likelihood of the training samples
Model
Training Samples
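For intuition, "maximize the likelihood of the training samples" in the simplest possible model, a 1-D Gaussian, where the maximum-likelihood estimates have a closed form (toy data with assumed values):

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=0.5, size=10_000)  # the "training samples"

# For a 1-D Gaussian model, maximizing the likelihood has a closed-form answer:
mu_hat = data.mean()    # argmax over mu of sum_i log N(x_i; mu, sigma)
sigma_hat = data.std()  # MLE of sigma (1/N normalization)
```

Neural density estimators optimize exactly the same objective, just with a far more flexible model.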
No implicit prior
Not amortized
Goodness-of-fit
Scaling with dimensionality of x
Implicit marginalization
Loss: match the approximate variational posterior, q, to the true posterior, p
Image Credit: "Bayesian inference; How we are able to chase the Posterior" Ritchie Vink
KL Divergence
Need samples from true posterior
Run simulator
Minimize KL
Amortized Inference!
Run simulator
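Written out, the amortized objective follows because the forward KL, averaged over data, reduces to a term we can estimate from simulator draws alone:

```latex
\mathbb{E}_{p(x)}\!\left[ D_{\mathrm{KL}}\!\left( p(\theta \mid x) \,\middle\|\, q_\phi(\theta \mid x) \right) \right]
  = -\,\mathbb{E}_{(\theta, x) \sim p(\theta)\, p(x \mid \theta)}\!\left[ \log q_\phi(\theta \mid x) \right]
    + \text{const}
```

So maximizing log q on simulated (θ, x) pairs trains a posterior estimator valid for any x, with no true-posterior samples needed.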
High-Dimensional
Low-Dimensional
s is sufficient iff
Maximize
Mutual Information
Need true posterior!
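One way to state the compression criterion above: a low-dimensional summary s(x) is sufficient exactly when it preserves all posterior information, equivalently all mutual information with the parameters:

```latex
s \text{ is sufficient} \;\iff\; p(\theta \mid x) = p(\theta \mid s(x)) \;\; \forall x
\;\iff\; I(\theta; s(x)) = I(\theta; x)
```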
No implicit prior
Not amortized
Goodness-of-fit
Scaling with dimensionality of x
Amortized
Scales well to high dimensional x
Goodness-of-fit
Fixed prior
Implicit marginalization
Implicit marginalization
Just use binary classifiers!
Binary cross-entropy
Sample from simulator
Mix-up
Likelihood-to-evidence ratio
Likelihood-to-evidence ratio
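A toy sketch of the likelihood-to-evidence-ratio trick, with a hypothetical Gaussian simulator and logistic regression standing in for the neural classifier: train with binary cross-entropy to separate matched (θ, x) pairs from shuffled ("mixed-up") pairs; the resulting logit estimates log p(x|θ)/p(x).

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 5000, 0.5

# Hypothetical simulator: theta ~ N(0,1), x | theta ~ N(theta, sigma^2)
theta = rng.standard_normal(n)
x = theta + sigma * rng.standard_normal(n)

# Class 1: matched pairs from the joint; class 0: shuffled pairs ~ p(theta) p(x)
theta_all = np.concatenate([theta, theta])
x_all = np.concatenate([x, rng.permutation(x)])
y = np.concatenate([np.ones(n), np.zeros(n)])

def feats(t, u):
    # Quadratic features: for this Gaussian toy the true log-ratio is quadratic
    return np.stack([np.ones_like(t), t, u, t * u, t**2, u**2], axis=1)

F = feats(theta_all, x_all)
w = np.zeros(F.shape[1])
for _ in range(3000):  # plain gradient descent on binary cross-entropy
    p = 1.0 / (1.0 + np.exp(-F @ w))
    w -= 0.2 * F.T @ (p - y) / len(y)

acc = ((F @ w > 0) == (y == 1)).mean()   # classifier accuracy
log_ratio = feats(theta, x) @ w          # logit ~ log p(x|theta) / p(x)
```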
No implicit prior
Not amortized
Goodness-of-fit
Scaling with dimensionality of x
Amortized
Scales well to high dimensional x
Goodness-of-fit
Fixed prior
Implicit marginalization
No need for a variational distribution
No implicit prior
Implicit marginalization
Approximately normalized
Not amortized
Implicit marginalization
Maximize the likelihood of the training samples
Model
Training Samples
Trained Model
Evaluate probabilities
Low Probability
High Probability
Generate Novel Samples
Simulator
Simulator
[Image Credit: "Understanding Deep Learning" Simon J.D. Prince]
Bijective
Sample
Evaluate probabilities
Probability mass conserved locally
Image Credit: "Understanding Deep Learning" Simon J.D. Prince
Neural Network
Sample
Evaluate probabilities
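The bookkeeping behind "sample" and "evaluate probabilities" for a bijective map, shown for a single hand-set affine transform (real flows stack many learned bijections, but the change-of-variables logic is identical):

```python
import numpy as np

# One affine bijection: x = mu + s * z, with base distribution z ~ N(0, 1).
mu, s = 2.0, 0.5  # hand-set; a real flow would learn these with a neural network

def sample(n, rng):
    z = rng.standard_normal(n)   # draw from the base distribution
    return mu + s * z            # push forward through the bijection

def log_prob(x):
    z = (x - mu) / s             # invert the bijection
    log_base = -0.5 * (z**2 + np.log(2 * np.pi))
    log_det = -np.log(s)         # |dz/dx| term: probability mass conserved locally
    return log_base + log_det

rng = np.random.default_rng(0)
xs = sample(10_000, rng)
```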
No implicit prior
Not amortized
Goodness-of-fit
Scaling with dimensionality of x
Amortized
Scales well to high dimensional x
Goodness-of-fit
Fixed prior
Implicit marginalization
No need variational distribution
No implicit prior
Implicit marginalization
Approximately normalized
Not amortized
Implicit marginalization
Test log likelihood
["Benchmarking simulation-based inference"
Lueckmann et al
arXiv:2101.04653]
Posterior predictive checks
Observed
Re-simulated posterior samples
Real or Fake?
["A Trust Crisis In Simulation-Based Inference? Your Posterior Approximations Can Be Unfaithful" Hermans et al
arXiv:2110.06581]
Underconfident is much better than overconfident!
Credible region (CR)
Not unique
High Posterior Density region (HPD)
Smallest "volume"
True value falls in the CR with the stated probability
Empirical Coverage Probability (ECP)
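A sketch of how ECP is estimated: repeatedly draw a true parameter from the prior, simulate an observation, build a credible interval from posterior samples, and count how often the truth falls inside. In this toy Gaussian model the posterior is exact by construction, so the ECP should match the nominal level:

```python
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_post, level = 1000, 2000, 0.9
sigma = 0.5  # observation noise; prior on theta is N(0, 1)

covered = 0
for _ in range(n_cases):
    theta_true = rng.standard_normal()               # draw parameter from the prior
    obs = theta_true + sigma * rng.standard_normal() # simulate an observation
    # Exact Gaussian posterior for this toy model (a perfectly calibrated estimator)
    post_mean = obs / (1 + sigma**2)
    post_std = np.sqrt(sigma**2 / (1 + sigma**2))
    samples = post_mean + post_std * rng.standard_normal(n_post)
    lo, hi = np.quantile(samples, [(1 - level) / 2, (1 + level) / 2])
    covered += lo <= theta_true <= hi

ecp = covered / n_cases  # should sit near the nominal 0.9 for a calibrated posterior
```

An overconfident posterior would give ecp well below 0.9; an underconfident one, above it.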
["Investigating the Impact of Model Misspecification in Neural Simulation-based Inference" Cannon et al arXiv:2209.01845 ]
Underconfident
Overconfident
Always look at information gain too
["Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability" Falkiewicz et al
arXiv:2310.13402]
[Image credit: https://www.mackelab.org/delfi/]
["Sequential Neural Likelihood: Fast Likelihood-free Inference with Autoregressive Flows" Papamakarios et al
arXiv:1805.07226]
Proposal (different from prior)
["Fast ε-free Inference of Simulation Models with Bayesian Conditional Density Estimation" Papamakarios et al
arXiv:1605.06376]
["Flexible statistical inference for mechanistic models of neural dynamics." Lueckmann et al
arXiv:1711.01861]
Sequential methods can't be amortized!
Proposal (different from prior)
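Schematically, the sequential loop alternates proposing, simulating, and refitting. The sketch below swaps the neural density estimator for an ABC-flavored Gaussian refit (all numbers are illustrative, not real SNPE), and shows why the result is tied to a single observation x_o and hence not amortized:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta):
    # Hypothetical simulator: observation = parameter + small noise
    return theta + 0.1 * rng.standard_normal(theta.shape)

x_o = 1.5  # the single observation we care about: sequential runs target only x_o

prop_mean, prop_std = 0.0, 5.0  # round-0 proposal = a broad prior
for round_ in range(3):
    theta = prop_mean + prop_std * rng.standard_normal(2000)  # propose
    x = simulate(theta)                                       # simulate
    # Stand-in for refitting a neural density estimator: keep draws whose x is
    # close to x_o and fit a Gaussian (an ABC-flavored shortcut, not real SNPE)
    keep = theta[np.abs(x - x_o) < 0.5 * prop_std]
    prop_mean, prop_std = keep.mean(), keep.std() + 1e-3

# The proposal has zoomed in on x_o; it is useless for any other observation.
```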
["A Strong Gravitational Lens Is Worth a Thousand Dark Matter Halos: Inference on Small-Scale Structure Using Sequential Methods" Wagner-Carena et al arXiv:2404.14487]
["Investigating the Impact of Model Misspecification in Neural Simulation-based Inference" Cannon et al arXiv:2209.01845]
More misspecified
"The frontier of simulation-based inference" Kyle Cranmer, Johann Brehmer, and Gilles Louppe
Github repos
Review
cuestalz@mit.edu
Book
"Probabilistic Machine Learning: Advanced Topics" Kevin P. Murphy