["DESI 2024 VI: Cosmological Constraints from the Measurements of Baryon Acoustic Oscillations" arXiv:2404.03002]
What role did Machine Learning play?
Dark Energy is constant over time
1-Dimensional
Machine Learning
Secondary anisotropies
Galaxy formation
Intrinsic alignments
DESI, DESI-II, Spec-S5
Euclid / LSST
Simons Observatory
CMB-S4
LIGO
Einstein
xAstrophysics
5-Dimensional
Dataset Size = 1
Can't poke it in the lab
Simulations
Bayesian statistics
Unicorn land: The promise of ML for Cosmology
Reality check: Roadblocks & Bottlenecks
Mapping dark matter
Reverting gravitational evolution
Field Level Inference
Learning to represent baryonic feedback
Data-driven hybrid simulators
Unsupervised problems
Base Distribution
Target Distribution
Bridging two distributions
Make the data as likely as possible
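A minimal sketch of "make the data as likely as possible" for a flow that bridges a base and a target distribution. It assumes a toy 1D affine map from a standard normal base; the function name, the toy data, and the parameter names are illustrative and not taken from the talk.

```python
import numpy as np

def affine_flow_log_likelihood(x, shift, log_scale):
    """Mean log-likelihood of x under x = shift + exp(log_scale) * z, z ~ N(0, 1)."""
    z = (x - shift) * np.exp(-log_scale)             # inverse map: target -> base
    log_base = -0.5 * (z**2 + np.log(2.0 * np.pi))   # log N(z; 0, 1)
    log_det = -log_scale                             # log |dz/dx| (change of variables)
    return np.mean(log_base + log_det)

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=2.0, size=5000)     # toy "target distribution"

# For an affine flow the maximum-likelihood parameters are known in closed form.
shift_mle = data.mean()
log_scale_mle = np.log(data.std())

ll_untrained = affine_flow_log_likelihood(data, 0.0, 0.0)
ll_mle = affine_flow_log_likelihood(data, shift_mle, log_scale_mle)
print(f"untrained: {ll_untrained:.2f}  max-likelihood: {ll_mle:.2f}")
```

A real normalizing flow replaces the affine map with a deep invertible network and the closed-form solution with gradient ascent on the same objective.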
Prompt
A person half Yoda, half Gandalf
[Image Credit: Claire Lamman (CfA/Harvard) / DESI Collaboration]
["A point cloud approach to generative modeling for galaxy surveys at the field level" Cuesta-Lazaro and Mishra-Sharma arXiv:2311.17141]
Long range correlations
Huge pointclouds (20M)
Homogeneity and isotropy
Siddharth Mishra-Sharma
Diffusion model
CNN
Diffusion
Increasing Noise
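A minimal sketch of the "increasing noise" direction of a diffusion model. It assumes a variance-preserving forward process with a linear beta schedule, a common choice that the talk does not confirm; the schedule values and the function name are illustrative.

```python
import numpy as np

def forward_noise(x0, alpha_bar_t, rng):
    """Sample x_t given x_0: sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise
    return x_t, noise

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)        # assumed linear noise schedule
alpha_bars = np.cumprod(1.0 - betas)         # fraction of signal variance kept at step t

x0 = rng.normal(5.0, 0.1, size=10_000)       # toy "data" far from the base distribution
x_early, _ = forward_noise(x0, alpha_bars[0], rng)    # almost pure signal
x_late, _ = forward_noise(x0, alpha_bars[-1], rng)    # almost pure Gaussian noise
print(x_early.mean(), x_late.mean(), x_late.std())
```

The network is trained to predict `noise` from `x_t`, and sampling runs the process in the opposite direction, from noise back to data.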
["Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo" Mudur, Cuesta-Lazaro and Finkbeiner]
Nayantara Mudur
["Your diffusion model is secretly a certifiably robust classifier" Chen et al arXiv:2402.02316]
1 to Many:
Distribution of Galaxies
Underlying Dark Matter
["Debiasing with Diffusion: Probabilistic reconstruction of Dark Matter fields from galaxies" Ono et al (including Cuesta-Lazaro) arXiv:2403.10648]
Victoria Ono
Core Park
[Figure: truth, sampled, and observed fields. Panels: power spectrum and cross correlation versus scale (k), from small to large. TNG-300: true DM versus sampled DM.]
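The two summary statistics named in the figure, power spectrum and cross correlation versus scale, can be sketched as follows. This assumes periodic 2D fields on a square grid with arbitrary normalization, and uses Gaussian random fields as stand-ins for the true and sampled dark matter; none of the names come from the cited papers.

```python
import numpy as np

def binned_cross_power(field_a, field_b, n_bins=16):
    """Isotropically binned cross power of two square 2D fields (arbitrary units)."""
    n = field_a.shape[0]
    fa = np.fft.fftn(field_a)
    fb = np.fft.fftn(field_b)
    k1d = np.fft.fftfreq(n) * n                     # integer wavenumbers per axis
    kx, ky = np.meshgrid(k1d, k1d, indexing="ij")
    k_mag = np.sqrt(kx**2 + ky**2).ravel()
    power = (fa * np.conj(fb)).real.ravel() / n**2
    edges = np.linspace(0.5, n / 2, n_bins + 1)     # skips the k = 0 mean mode
    counts, _ = np.histogram(k_mag, bins=edges)
    sums, _ = np.histogram(k_mag, bins=edges, weights=power)
    return 0.5 * (edges[1:] + edges[:-1]), sums / counts

def cross_correlation(field_a, field_b, n_bins=16):
    """r(k) = P_ab / sqrt(P_aa * P_bb); r = 1 means the phases agree at that scale."""
    k, p_ab = binned_cross_power(field_a, field_b, n_bins)
    _, p_aa = binned_cross_power(field_a, field_a, n_bins)
    _, p_bb = binned_cross_power(field_b, field_b, n_bins)
    return k, p_ab / np.sqrt(p_aa * p_bb)

rng = np.random.default_rng(0)
truth = rng.standard_normal((64, 64))                    # stand-in for the true field
sample = truth + 0.5 * rng.standard_normal((64, 64))     # stand-in for a noisy sample
k, r_self = cross_correlation(truth, truth)
_, r_sample = cross_correlation(truth, sample)
```

The power spectrum checks that a sample has the right amount of structure at each scale; the cross correlation checks that the structure is in the right place. Large k corresponds to small scales.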
["3D Reconstruction of Dark Matter Fields with Diffusion Models: Towards Application to Galaxy Surveys" Park, Mudur, Cuesta-Lazaro et al (in-prep)]
Posterior Sample
Posterior Mean
Stochastic Interpolants
NF
?
["Probabilistic Forecasting with Stochastic Interpolants and Foellmer Processes" Chen et al arXiv:2403.13724 (Figure adapted from arXiv:2407.21097)]
Generative SDE
Stochastic Interpolant
Boundary Conditions
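A minimal sketch of a stochastic interpolant and its boundary conditions. The functional form below (linear interpolation plus a noise term that vanishes at both ends) is one common choice in the stochastic interpolant literature and is an assumption here; the cited paper may use a different form.

```python
import numpy as np

def stochastic_interpolant(x0, x1, t, noise, sigma=1.0):
    """x_t = (1 - t) * x0 + t * x1 + sigma * sqrt(t * (1 - t)) * noise, for t in [0, 1]."""
    return (1.0 - t) * x0 + t * x1 + sigma * np.sqrt(t * (1.0 - t)) * noise

rng = np.random.default_rng(0)
x0 = rng.standard_normal(8)              # sample from the starting distribution
x1 = 3.0 + rng.standard_normal(8)        # paired sample from the target distribution
noise = rng.standard_normal(8)

x_start = stochastic_interpolant(x0, x1, 0.0, noise)   # boundary condition: equals x0
x_mid = stochastic_interpolant(x0, x1, 0.5, noise)     # noisy mixture of both ends
x_end = stochastic_interpolant(x0, x1, 1.0, noise)     # boundary condition: equals x1
```

Because the noise term is zero at t = 0 and t = 1, the path is pinned to both endpoints, which is what allows the starting distribution to be something other than a Gaussian.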
Guided simulations with fuzzy constraints
How do we learn which information is robust?
Simulating dark matter is easy!
"Atoms" are hard :(
N-body Simulations
Hydrodynamics
Can we improve our simulators in a data-driven way?
(if cold!)
~ Gpc
pc
kpc
Mpc
Gpc
[Video credit: Francisco Villaescusa-Navarro]
[Figure: gas density and gas temperature maps, axis from small to large; panels labelled In-Distribution or Out-of-Distribution]
["Multifield Cosmology with Artificial Intelligence" Villaescusa-Navarro et al arXiv:2109.09747]
Out-of-Distribution
In-Distribution
Mikhail Ivanov
Robust galaxy bias model: Effective Field Theories
+ Simulations as priors
Field-level EFT
["Full-shape analysis with simulation-based priors: constraints on single field inflation from BOSS" Ivanov, Cuesta-Lazaro et al arXiv:2402.13310]
Andrej Obuljen
Michael Toomey
["Full-shape analysis with simulation-based priors: cosmological parameters and the structure growth anomaly" Ivanov, Obuljen, Cuesta-Lazaro, Toomey arXiv:2409.10609]
Informative abstractions of the data
Transfer learning beyond LCDM
Cosmic web Anomaly Detection
Representing baryonic feedback
General
Predictive
Low dimensional?
Should generalize across scales, systems...
Transfer to unseen conditions
p(x|z)
Simple : Occam's razor
Causal?
Contrastive
Generative
inductive biases
from scratch or from partial observations
Students at MIT are
OVER-CAFFEINATED
NERDS
SMART
ATHLETIC
Simulator 1
Simulator 2
Dark Matter
Feedback
i) Contrastive
Baryonic fields
ii) Generative
Baryonic fields
Dark Matter
Generative model
Total matter, gas temperature, gas metallicity
Encoder
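A minimal sketch of the contrastive option, assuming the two views of each example are embeddings of the same dark matter field run through two different simulators. This reading of the diagram, the InfoNCE loss itself, and the random embeddings are all assumptions for illustration.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE: row i of z_a should match row i of z_b better than any other row."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature                  # cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True) # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                  # positives sit on the diagonal

rng = np.random.default_rng(0)
view_sim1 = rng.standard_normal((32, 8))                     # hypothetical encoder outputs
view_sim2 = view_sim1 + 0.05 * rng.standard_normal((32, 8))  # same fields, other simulator

loss_matched = info_nce_loss(view_sim1, view_sim2)
loss_mismatched = info_nce_loss(view_sim1, np.roll(view_sim2, 1, axis=0))
print(loss_matched, loss_mismatched)
```

Minimizing this loss pushes the encoder to keep what the two simulators agree on and discard what differs between them.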
What is the space of plausible solutions and how do we search it?
Differentiable Galaxies ODEs
Our best bet
Neural Network corrections
Data-driven hybrid simulators
Are these models predictive?
Parity violation cannot originate from gravity
["Measurements of parity-odd modes in the large-scale 4-point function of SDSS..." Hou, Slepian, Chan arXiv:2206.03625]
["Could sample variance be responsible for the parity-violating signal seen in the BOSS galaxy survey?" Philcox, Ereza arXiv:2401.09523]
Real or Fake?
x or Mirror x?
Train
Test
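A small illustration of why "x or mirror x?" is a hard classification problem and why the cited measurements use the 4-point function. In 3D, mirroring leaves all pair distances unchanged, so 2-point statistics cannot see it; the simplest scalar that flips sign is the signed volume of a tetrahedron of four points. The helper names are illustrative.

```python
import numpy as np
from itertools import combinations

def mirror(points):
    """Parity transformation: reflect the x coordinate."""
    return points * np.array([-1.0, 1.0, 1.0])

def pair_distances(points):
    """Sorted distances between all pairs of points (parity-even)."""
    return np.sort([np.linalg.norm(a - b) for a, b in combinations(points, 2)])

def signed_volume(tetrahedron):
    """Signed volume from the triple product (parity-odd)."""
    a, b, c, d = tetrahedron
    return np.dot(b - a, np.cross(c - a, d - a)) / 6.0

rng = np.random.default_rng(0)
tetrahedron = rng.standard_normal((4, 3))
mirrored = mirror(tetrahedron)

print(np.allclose(pair_distances(tetrahedron), pair_distances(mirrored)))
print(signed_volume(tetrahedron), signed_volume(mirrored))
```

A classifier trained to tell x from mirror x can only succeed on held-out data if the catalogue contains parity-odd information of this kind.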
Me: I can't wait to work with observations
Me working with observations:
Very subtle effect -> Hard to find data-efficient architectures
1. There is a lot of information in galaxy surveys that ML methods can access
2. We can tackle high-dimensional inference problems that were so far unattainable
3. Our ability to simulate limits the amount of information we can robustly extract
Hybrid simulators, forward models, robustness
Unsupervised problems
Mapping dark matter, constrained simulations... Let's get creative!
Field level inference