Big Data Cosmology meets AI
IAIFI Fellow
Carol Cuesta-Lazaro
MIT - 29th April 2024
Video Credit: N-body simulation Francisco Villaescusa-Navarro
The era of Big Data Cosmology
1-Dimensional
Machine Learning
Secondary anisotropies
Galaxy formation
Intrinsic alignments
Dust
x Astrophysics
DESI, DESI-II, Spec-S5
Euclid
LSST
Simons Observatory
CMB-S4
LIGO
Einstein
LSST
Early Universe Inflation
Late Universe
Energy and matter content
Evolution
Dark matter
Dark energy
Hubble Constant
Baryons
Neutrino masses
Non-Gaussianity
Tilt power spectrum
Hubble tension
Beyond the Standard Model
Multifield Inflation
Hybrid ML - Physics Simulators
Unsupervised searches
Cosmological (field level) Inference for Galaxy Surveys
DESI
DESI: Dark Energy Spectroscopic Instrument
~40 Million spectra!
(Image Credit: Jinyi Yang, Steward Observatory/University of Arizona)
(Image Credit: D. Schlegel/Berkeley Lab using data from DESI)
High dimensional data
Unknown
Simple summary statistic
estimated with Perturbation Theory
Probability pair of galaxy
Pair separation
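The summary statistic on this slide is the two-point correlation function: the excess probability, relative to a uniform random distribution, of finding a pair of galaxies at a given separation. A minimal sketch follows, assuming a periodic box and the natural estimator DD/RR - 1 with analytic random counts. The function name `xi_natural` and all parameter values are illustrative and not taken from the talk.

```python
import numpy as np

def xi_natural(pos, box, edges):
    """Natural estimator xi(r) = DD/RR - 1 in a periodic box.

    pos: (N, 3) positions, box: side length, edges: separation bin edges
    (the largest edge must stay below box / 2).
    """
    n = len(pos)
    # All pairwise separations with the minimum-image convention (O(N^2), toy sizes only).
    d = pos[:, None, :] - pos[None, :, :]
    d -= box * np.round(d / box)
    r = np.sqrt((d ** 2).sum(-1))
    iu = np.triu_indices(n, k=1)  # count each pair once
    dd, _ = np.histogram(r[iu], bins=edges)
    # Expected pair counts for a uniform field: N(N-1)/2 * shell volume / box volume.
    shell = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    rr = 0.5 * n * (n - 1) * shell / box ** 3
    return dd / rr - 1.0

rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 1.0, size=(1000, 3))  # unclustered points, so xi should scatter around 0
edges = np.linspace(0.05, 0.4, 8)
xi = xi_natural(pos, 1.0, edges)
```

Survey analyses typically use estimators that account for the survey geometry through random catalogues; the analytic random counts here only hold for a periodic box.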
Forward Model
Parameters
Observable
Likelihood
Simulator
+ MCMC hammer
Dark matter
Dark energy
Inflation
Perturbation Theory
Pen and paper
+ Density Estimation
+ Sampler
A forward model samples the likelihood
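As a toy illustration of the two routes above: running a simulator at fixed parameters draws one sample from the likelihood, while an explicit likelihood can be explored directly with MCMC. The sketch below assumes a one-parameter Gaussian model with known noise and a flat prior; `simulator`, `log_likelihood`, `metropolis` and every numerical value are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
SIGMA = 0.5  # assumed known noise level of the toy model

def simulator(theta, n=50):
    """Toy forward model: each call draws one sample x ~ p(x | theta)."""
    return theta + SIGMA * rng.normal(size=n)

def log_likelihood(theta, x):
    """Explicit Gaussian log-likelihood, available only because the model is a toy."""
    return -0.5 * np.sum((x - theta) ** 2) / SIGMA ** 2

def metropolis(x, n_steps=5000, step=0.2, theta0=0.0):
    """Random-walk Metropolis sampler with a flat prior on theta."""
    chain = np.empty(n_steps)
    theta, logp = theta0, log_likelihood(theta0, x)
    for i in range(n_steps):
        prop = theta + step * rng.normal()
        logp_prop = log_likelihood(prop, x)
        # Accept with probability min(1, p(prop) / p(theta)).
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = prop, logp_prop
        chain[i] = theta
    return chain

x_obs = simulator(theta=1.0)            # "observation" generated by the forward model
chain = metropolis(x_obs)
posterior_mean = chain[1000:].mean()    # discard burn-in; close to x_obs.mean() for this model
```

When the likelihood cannot be written down, only `simulator` is available, which is where density estimation on simulated pairs of parameters and observables comes in.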
Parameters
Observable
Observed galaxy pointcloud
DESI
Forward Model
Dark matter
Dark energy
Inflation
A 2D animation of a folk music band composed of anthropomorphic autumn leaves, each playing traditional bluegrass instruments, amidst a rustic forest setting dappled with the soft light of a harvest moon
Image credit: DALL·E 3
1024x1024
"A point cloud approach to generative modeling for galaxy surveys at the field level"
Cuesta-Lazaro and Mishra-Sharma
arXiv:2311.17141
Base Distribution
Target Distribution
- Sample
- Evaluate
Siddharth Mishra-Sharma
Fixed Initial Conditions
Varying Cosmology
Mean pairwise velocity
k Nearest neighbours
Pair separation
Pair separation
Trained on only 5000 positions!
Learning in 5000 dimensions with only 2000 simulations
Nayantara Mudur
"Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo"
Mudur, Cuesta-Lazaro and Finkbeiner
in prep
CNN
Diffusion
Increasing Noise
"Your diffusion model is secretly a certifiably robust classifier"
Chen et al
arXiv:2402.02316
1 to Many:
https://arxiv.org/abs/2210.02747
https://arxiv.org/abs/2302.00482
Stochastic Interpolants: Bridging arbitrary densities
"Stochastic Interpolants: A Unifying Framework for Flows and Diffusions"
Albergo, Boffi, Vanden-Eijnden
arXiv:2303.08797
Flow ODE
Continuity Equation
Regress the velocity field
Unknown!
Boundary Conditions
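The labels on this slide can be written out as follows. This is a sketch under one assumption that is not stated on the slide: the simplest linear interpolant between a base sample x_0 and a target sample x_1, without the additional latent noise term that the general stochastic interpolant framework allows.

```latex
% Assumed linear interpolant between a base sample x_0 and a target sample x_1
x_t = (1 - t)\, x_0 + t\, x_1, \qquad x_0 \sim \rho_0,\; x_1 \sim \rho_1,\; t \in [0, 1]

% Flow ODE transporting samples
\frac{\mathrm{d}x_t}{\mathrm{d}t} = v(x_t, t)

% Continuity equation for the density, with its boundary conditions
\partial_t \rho_t + \nabla \cdot (\rho_t\, v) = 0, \qquad \rho_{t=0} = \rho_0,\; \rho_{t=1} = \rho_1

% The velocity field is unknown, so it is regressed with a network v_\theta
\mathcal{L}(\theta) = \mathbb{E}_{t,\, x_0,\, x_1} \left\| v_\theta(x_t, t) - (x_1 - x_0) \right\|^2
```

Under this choice the minimiser of the loss is the conditional expectation v(x, t) = E[x_1 - x_0 | x_t = x], which is the velocity that satisfies the continuity equation for the interpolated density.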
[Figure: Power Spectrum and Cross correlation panels; x-axis: Scale (k), labelled from Small to Large]
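The two diagnostics named in this figure can be computed on gridded fields as sketched below. The power spectrum compares amplitudes scale by scale, and the cross-correlation coefficient r(k) = P_ab / sqrt(P_aa P_bb) compares phases. The function name `cross_power`, the grid size, the box size and the noise level are all illustrative assumptions, and the binning ignores the mode double counting of the real FFT.

```python
import numpy as np

def cross_power(a, b, box, n_bins=8):
    """Spherically averaged cross power spectrum of two real 3D fields on the same grid."""
    n = a.shape[0]
    fa, fb = np.fft.rfftn(a), np.fft.rfftn(b)
    kf = 2 * np.pi / box                          # fundamental frequency of the box
    kx = np.fft.fftfreq(n, d=1.0 / n) * kf        # integer wavenumbers times kf
    kz = np.fft.rfftfreq(n, d=1.0 / n) * kf
    kk = np.sqrt(kx[:, None, None] ** 2 + kx[None, :, None] ** 2 + kz[None, None, :] ** 2)
    power = (fa * np.conj(fb)).real * box ** 3 / n ** 6
    edges = np.linspace(kf, kf * n / 2, n_bins + 1)   # from the fundamental to the Nyquist frequency
    idx = np.digitize(kk.ravel(), edges)
    pk = np.array([power.ravel()[idx == i].mean() for i in range(1, n_bins + 1)])
    return 0.5 * (edges[1:] + edges[:-1]), pk

rng = np.random.default_rng(2)
truth = rng.normal(size=(32, 32, 32))
emulated = truth + 0.1 * rng.normal(size=truth.shape)   # stand-in for a generated field
k, p_tt = cross_power(truth, truth, box=1000.0)
_, p_ee = cross_power(emulated, emulated, box=1000.0)
_, p_te = cross_power(truth, emulated, box=1000.0)
r = p_te / np.sqrt(p_tt * p_ee)   # r = 1 means the two fields agree in phase at that scale
```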
Can we run larger simulations? (DESI volumes)
At high resolution?
Faster?
All this work depends on simulations, but...
Thousands of them?
Hybrid Physical / ML simulators
Gravitational evolution ODE
Particle-mesh
"Nbodyify: Adaptive mesh corrections for PM simulations" Cuesta-Lazaro, Modi in prep
Particle-mesh
Full N-body
Hybrid Simulator - on the fly
Gravitational evolution ODE
Trained to match particle velocities and positions: DIFFERENTIABLE
Density
Gravitational Potential
1. CNN
2. Read features at position using attention
3. Compute force correction
4. Run corrected simulation
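The structure of the steps above can be sketched in one dimension. This toy is not the method of the talk: the learned part (a CNN over density and potential, read out at particle positions with attention) is replaced by an arbitrary callable named `correction`, units are dropped, and NumPy is used for brevity, so the sketch is not differentiable. An autodiff framework would be needed to train through the simulation. All function names and values are hypothetical.

```python
import numpy as np

def pm_force(x, n_grid, box):
    """Particle-mesh force in 1D: deposit density, solve Poisson with an FFT, read force at particles."""
    cell = box / n_grid
    idx = np.floor(x / cell).astype(int) % n_grid     # nearest-grid-point assignment
    counts = np.bincount(idx, minlength=n_grid)
    delta = counts / counts.mean() - 1.0              # density contrast on the mesh
    k = 2 * np.pi * np.fft.fftfreq(n_grid, d=cell)
    delta_k = np.fft.fft(delta)
    phi_k = np.zeros_like(delta_k)
    phi_k[1:] = -delta_k[1:] / k[1:] ** 2             # solves d^2 phi / dx^2 = delta (constants set to 1)
    force_grid = np.fft.ifft(-1j * k * phi_k).real    # F = -d phi / dx
    return force_grid[idx]

def hybrid_step(x, v, dt, n_grid, box, correction=None):
    """One kick-drift step; `correction` is a hypothetical stand-in for the learned force correction."""
    f = pm_force(x, n_grid, box)
    if correction is not None:
        f = f + correction(x, f)
    v = v + dt * f
    x = (x + dt * v) % box
    return x, v

box, n_grid = 1.0, 64
lattice = (np.arange(n_grid) + 0.5) * box / n_grid
x = np.concatenate([lattice, np.full(20, lattice[32])])   # overdensity in cell 32
f = pm_force(x, n_grid, box)                               # neighbours are pulled toward cell 32
x_new, v_new = hybrid_step(x, np.zeros_like(x), 0.01, n_grid, box,
                           correction=lambda pos, force: 0.1 * force)
```

Keeping the correction as an additive term on the mesh force means the physical solver still does most of the work, and the learned part only has to supply what the coarse mesh misses.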
Learn features
Particle-mesh
Full N-body
Hybrid ML-Simulator
"Nbodyify: Adaptive mesh corrections for PM simulations" Cuesta-Lazaro, Modi in prep
Gravitational potential
Particle velocities
~ Gpc
pc
kpc
Mpc
Gpc
Video credit: Francisco Villaescusa-Navarro
Gas density
Gas temperature
Are there problems in cosmology that bypass a forward model?
Parity violation cannot originate from gravity
"Measurements of parity-odd modes in the large-scale 4-point function of SDSS..." Hou, Slepian, Cahn arXiv:2206.03625
"Could sample variance be responsible for the parity-violating signal seen in the BOSS galaxy survey?" Philcox, Ereza arXiv:2401.09523
Matthew Craigie
Peter Taylor
Yuan-Sen Ting
Pre-defined filters
No symmetries
Learned filters + symmetries
Reduce the problem to the space of odd-parity functions with equivariant graph networks?
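To make "odd-parity function" concrete: in three dimensions the simplest parity-odd scalar built from points is the signed volume of a tetrahedron, which is why the cited measurements use the 4-point function. The sketch below checks that this quantity is unchanged by a rotation and flips sign under a mirror reflection. It illustrates the symmetry only and is not the graph network from the talk; `signed_volume` is an illustrative name.

```python
import numpy as np

def signed_volume(p):
    """Signed volume of the tetrahedron with vertices p[0..3]: a parity-odd scalar."""
    return np.linalg.det(p[1:] - p[0]) / 6.0

rng = np.random.default_rng(3)
tetra = rng.normal(size=(4, 3))

# Random proper rotation from a QR decomposition (determinant forced to +1).
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(q) < 0:
    q[:, 0] *= -1

rotated = tetra @ q.T
mirrored = tetra * np.array([-1.0, 1.0, 1.0])   # reflect x -> -x

vol = signed_volume(tetra)
vol_rotated = signed_volume(rotated)     # equal to vol: rotation invariant
vol_mirrored = signed_volume(mirrored)   # equal to -vol: odd under parity
```

Any function that is rotation invariant and changes sign under reflection averages to zero over a parity-symmetric distribution, so a non-zero mean on test data would signal parity violation.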
Train
Test
Me: I can't wait to work with observations
Me working with observations:
Conclusions
1. There is a lot of information in galaxy surveys that ML methods can access
2. We can tackle high-dimensional inference problems that were so far unattainable
3. Our ability to simulate will limit the amount of information we can extract
Hybrid simulators, forward models, robustness
Unsupervised problems: parity violation
Dark matter density reconstruction, Initial Conditions, let's get creative!
Field level inference
CTPSeminar2024
By carol cuesta