Big Data Cosmology meets AI
IAIFI Fellow
Carol Cuesta-Lazaro

LBL - 3rd May 2024
Video credit: N-body simulation, Francisco Villaescusa-Navarro




The era of Big Data Cosmology



1-Dimensional





Machine Learning
Secondary anisotropies
Galaxy formation
Intrinsic alignments
Dust
x Astrophysics



DESI, DESI-II, Spec-S5
Euclid
LSST
Simons Observatory
CMB-S4
LIGO
Einstein
LSST
Early Universe Inflation

Late Universe

Energy and matter content
Evolution
Dark matter
Dark energy
Hubble Constant
Baryons
Neutrino masses
Non-Gaussianity
Tilt power spectrum
Hubble tension
Beyond the Standard Model
Multifield Inflation
Hybrid ML - Physics Simulators
Unsupervised searches
Cosmological (field level) Inference for Galaxy Surveys
DESI


High dimensional data
Unknown
Simple summary statistic
estimated with Perturbation Theory
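The simple summary statistic most often used for galaxy clustering is the two-point correlation function: the excess probability of finding a galaxy pair at a given separation. The block below is a minimal sketch of how it can be measured, assuming a periodic cubic box and the natural estimator DD/RR - 1 with an analytic random-pair count. The function name, box size, and binning are illustrative choices, not taken from the talk.

```python
import numpy as np

def two_point_correlation(pos, edges, box=1.0):
    """Natural estimator xi(r) = DD/RR - 1 in a periodic cubic box."""
    n = len(pos)
    diff = pos[:, None, :] - pos[None, :, :]
    diff -= box * np.round(diff / box)           # minimum-image convention
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    iu = np.triu_indices(n, k=1)                 # count each pair once
    dd, _ = np.histogram(dist[iu], bins=edges)
    shell = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    rr = 0.5 * n * (n - 1) * shell / box ** 3    # expected pairs for uniform points
    return dd / rr - 1.0

# Unclustered (uniform random) points should give xi close to zero.
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 1.0, size=(1000, 3))
edges = np.linspace(0.05, 0.3, 6)                # separations below half the box
xi = two_point_correlation(pos, edges)
```

The brute-force pair count scales as N^2 and is only meant for small toy catalogues; survey analyses rely on tree- or grid-based pair counters.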


Probability of a galaxy pair
Pair separation
Forward Model
Parameters
Observable
Likelihood
Simulator
+ MCMC hammer
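A minimal sketch of the classical pipeline listed above (forward model, parameters, observable, likelihood, then MCMC), assuming a hypothetical one-parameter forward model and a Gaussian likelihood with known noise. Plain Metropolis-Hastings is swapped in for whichever sampler is used in practice; all names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def forward_model(theta, r):
    """Hypothetical toy observable: a power law in separation with amplitude theta."""
    return theta * r ** -1.5

# Mock observation generated from the toy model.
r = np.linspace(1.0, 10.0, 20)
theta_true, sigma = 2.0, 0.05
data = forward_model(theta_true, r) + sigma * rng.normal(size=r.size)

def log_likelihood(theta):
    """Gaussian log-likelihood up to a constant (flat prior assumed)."""
    resid = data - forward_model(theta, r)
    return -0.5 * np.sum(resid ** 2) / sigma ** 2

# Metropolis-Hastings with a symmetric Gaussian proposal.
chain = np.empty(20000)
theta = 1.0
logp = log_likelihood(theta)
for i in range(chain.size):
    proposal = theta + 0.05 * rng.normal()
    logp_proposal = log_likelihood(proposal)
    if np.log(rng.uniform()) < logp_proposal - logp:
        theta, logp = proposal, logp_proposal
    chain[i] = theta

samples = chain[2000:]   # discard burn-in
```

This only works because the likelihood can be written down explicitly, which is the limitation the rest of the talk addresses.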

Dark matter
Dark energy
Inflation
Perturbation Theory
Pen and paper



+ Density Estimation
+ Sampler
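When no likelihood can be written down, the simulator, density estimation, and sampler steps can be combined roughly as sketched below. This is a toy under loud assumptions: a hypothetical linear simulator, a Gaussian prior, and a single Gaussian fitted to the joint (parameter, data) samples standing in for a neural density estimator.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulator(theta, rng):
    """Hypothetical toy simulator: two noisy summaries per parameter value."""
    response = np.array([1.0, 0.5])
    return theta[:, None] * response + 0.1 * rng.normal(size=(theta.size, 2))

# 1. Simulator: draw parameters from the prior and run the forward model.
theta = rng.normal(2.0, 1.0, size=20000)
x = simulator(theta, rng)

# 2. Density estimation: fit a Gaussian to the joint (theta, x) samples.
joint = np.column_stack([theta, x])
mean = joint.mean(axis=0)
cov = np.cov(joint, rowvar=False)

def posterior(x_obs):
    """Condition the fitted joint Gaussian on an observation."""
    gain = np.linalg.solve(cov[1:, 1:], cov[1:, 0])
    post_mean = mean[0] + gain @ (x_obs - mean[1:])
    post_var = cov[0, 0] - cov[0, 1:] @ gain
    return post_mean, np.sqrt(post_var)

# 3. Sampler: draw from the estimated posterior for one mock observation.
x_obs = simulator(np.array([2.5]), rng)[0]
post_mean, post_std = posterior(x_obs)
samples = rng.normal(post_mean, post_std, size=5000)
```

The Gaussian fit is exact here only because the toy problem is linear and Gaussian; the point of flexible density estimators is to drop that restriction.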






Density Split
Wavelet Scattering Transform
Moment generating functionals
Void-Galaxy cross correlation
N-point functions
Minkowski functionals
Credit: Alternative Clustering Methods (Enrique Paillas, Wei Liu, Mathilde Pinon, Gillian Beltz-Mohrmann, Georgios Valogiannis)
A forward model samples the likelihood
Parameters
Observable
Observed galaxy pointcloud
DESI

Forward Model
Dark matter
Dark energy
Inflation


A 2D animation of a folk music band composed of anthropomorphic autumn leaves, each playing traditional bluegrass instruments, amidst a rustic forest setting dappled with the soft light of a harvest moon
Image credit: DALL·E 3
1024x1024


"A point cloud approach to generative modeling for galaxy surveys at the field level"
Cuesta-Lazaro and Mishra-Sharma
arXiv:2311.17141
Base Distribution
Target Distribution
- Sample
- Evaluate
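The two operations listed here, sampling from the target distribution and evaluating its density, can be illustrated with the simplest possible invertible map from a base to a target distribution. The sketch assumes a 1D affine transformation of a standard normal and uses the change-of-variables formula; the class and parameter names are hypothetical, and this is not the model used in the cited paper.

```python
import numpy as np

class AffineFlow:
    """x = shift + scale * z with z ~ N(0, 1)."""

    def __init__(self, shift, log_scale):
        self.shift = shift
        self.log_scale = log_scale

    def sample(self, n, rng):
        """Sample: push base draws through the map."""
        z = rng.normal(size=n)
        return self.shift + np.exp(self.log_scale) * z

    def log_prob(self, x):
        """Evaluate: invert the map and apply the change of variables."""
        z = (x - self.shift) * np.exp(-self.log_scale)
        log_base = -0.5 * (z ** 2 + np.log(2.0 * np.pi))
        return log_base - self.log_scale   # minus log |dx/dz|

rng = np.random.default_rng(3)
flow = AffineFlow(shift=1.0, log_scale=np.log(2.0))
x = flow.sample(10000, rng)
logp = flow.log_prob(x)
```

Expressive generative models replace the affine map with a learned transformation, but the same two operations are what make them useful for inference.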

Siddharth Mishra-Sharma

Fixed Initial Conditions
Varying Cosmology






Mean pairwise velocity
k Nearest neighbours

Pair separation
Pair separation

Trained on only 5000 positions!

Learning in 5000 dimensions with only 2000 simulations


Nayantara Mudur
"Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo"
Mudur, Cuesta-Lazaro and Finkbeiner
in prep


CNN
Diffusion
Increasing Noise
"Your diffusion model is secretly a certifiably robust classifier"
Chen et al
arXiv:2402.02316

1 to Many:

arXiv:2312.09271
arXiv:2307.09504
https://arxiv.org/abs/2210.02747
https://arxiv.org/abs/2302.00482
Stochastic Interpolants: Bridging arbitrary densities


"Stochastic Interpolants: A Unifying Framework for Flows and Diffusions"
Albergo, Boffi, Vanden-Eijnden
arXiv:2303.08797
Flow ODE

Continuity Equation
Regress the velocity field
Unknown!

Boundary Conditions
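A minimal sketch of the "regress the velocity field" idea, under strong simplifying assumptions: 1D Gaussian base and target densities, the linear interpolant x_t = (1 - t) x_0 + t x_1 (one choice that satisfies the boundary conditions at t = 0 and t = 1), and a separate linear least-squares fit at each time step standing in for a neural network. The unknown velocity is the conditional expectation of dx_t/dt = x_1 - x_0 given x_t, which is exactly linear in x_t for this Gaussian toy case.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_steps = 20000, 50

x0 = rng.normal(0.0, 1.0, n)    # base distribution
x1 = rng.normal(3.0, 0.5, n)    # target distribution (independent pairing)

# Regress the velocity at the midpoint of each time step.
times = (np.arange(n_steps) + 0.5) / n_steps
fits = []
for t in times:
    xt = (1.0 - t) * x0 + t * x1          # interpolant sample at time t
    target_velocity = x1 - x0             # time derivative of the interpolant
    slope, intercept = np.polyfit(xt, target_velocity, 1)
    fits.append((slope, intercept))

# Flow ODE: integrate dx/dt = b_t(x) from fresh base samples with Euler steps.
x = rng.normal(0.0, 1.0, n)
dt = 1.0 / n_steps
for slope, intercept in fits:
    x = x + dt * (slope * x + intercept)

# x should now be distributed approximately like the target.
```

For non-Gaussian densities the linear fit is no longer adequate, which is where a neural network regressor comes in.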



[Figure panels: power spectrum and cross correlation as a function of scale (k), from small to large]


All this work depends on simulations, but...
Can we run larger simulations (DESI volumes)?
At high resolution?
Faster?
Thousands of them?

Hybrid Physical / ML simulators
Gravitational evolution ODE
Particle-mesh
"Nbodyify: Adaptive mesh corrections for PM simulations" Cuesta-Lazaro, Modi, in prep


Particle-mesh
Full Nbody
Hybrid Simulator - on the fly
Gravitational evolution ODE
Trained to match particle velocities and positions: DIFFERENTIABLE


Density
Gravitational Potential

1. CNN
2. Read features at position using attention
3. Compute force correction
4. Run corrected simulation
Learn features
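To make the particle-mesh part of the hybrid simulator concrete, here is a minimal sketch of a PM force evaluation with a hook where a learned correction could enter. It assumes a periodic 2D box, nearest-grid-point mass assignment, and arbitrary units. The `correction` argument is a hypothetical placeholder (an isotropic filter on the potential in Fourier space); it does not implement the CNN features or attention read-out listed in the steps above.

```python
import numpy as np

def pm_forces(pos, n_grid, box=1.0, correction=None):
    """Particle-mesh forces in a periodic 2D box (nearest-grid-point scheme)."""
    cell = box / n_grid
    idx = np.floor(pos / cell).astype(int) % n_grid
    # Deposit particles on the mesh and form the density contrast.
    counts = np.zeros((n_grid, n_grid))
    np.add.at(counts, (idx[:, 0], idx[:, 1]), 1.0)
    delta = counts / counts.mean() - 1.0
    # Solve the Poisson equation nabla^2 phi = delta in Fourier space.
    k1d = 2.0 * np.pi * np.fft.fftfreq(n_grid, d=cell)
    kx, ky = np.meshgrid(k1d, k1d, indexing="ij")
    k2 = kx ** 2 + ky ** 2
    k2[0, 0] = 1.0                        # avoid dividing by zero
    phi_k = -np.fft.fft2(delta) / k2
    phi_k[0, 0] = 0.0                     # remove the mean of the potential
    if correction is not None:
        phi_k = phi_k * correction(np.sqrt(k2))   # hypothetical correction hook
    # Force = -grad(phi), then read it off at each particle's cell.
    fx = np.fft.ifft2(-1j * kx * phi_k).real
    fy = np.fft.ifft2(-1j * ky * phi_k).real
    return np.stack([fx[idx[:, 0], idx[:, 1]], fy[idx[:, 0], idx[:, 1]]], axis=1)

rng = np.random.default_rng(4)
pos = rng.uniform(0.0, 1.0, size=(2000, 2))
forces = pm_forces(pos, n_grid=32)
forces_corr = pm_forces(
    pos, n_grid=32, correction=lambda k: 1.0 + 0.1 * np.exp(-(k / 50.0) ** 2)
)
```

Every operation here is differentiable with respect to the particle positions except the nearest-grid-point binning; smoother assignment schemes such as cloud-in-cell are the usual choice when gradients through the simulator are needed.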



Particle-mesh
Full Nbody
Hybrid ML-Simulator
"Nbodyify: Adaptive mesh corrections for PM simulations" Cuesta-Lazaro, Modi, in prep




Gravitational potential
Particle velocities




~ Gpc
pc
kpc
Mpc
Gpc
Video credit: Francisco Villaescusa-Navarro
Gas density
Gas temperature
Are there problems in cosmology that bypass a forward model?

Parity violation cannot originate from gravity

"Measurements of parity-odd modes in the large-scale 4-point function of SDSS..." Hou, Slepian, Chan arXiv:2206.03625


"Could sample variance be responsible for the parity-violating signal seen in the BOSS galaxy survey?" Philcox, Ereza arXiv:2401.09523



Matthew Craigie
Peter Taylor
Yuan-Sen Ting



Pre-defined filters
No symmetries
Learned filters + symmetries

Train
Test

Me: I can't wait to work with observations
Me working with observations:
Conclusions

1. There is a lot of information in galaxy surveys that ML methods can access
2. We can tackle high-dimensional inference problems that were so far unattainable
3. Our ability to simulate will limit the amount of information we can extract
Hybrid simulators, forward models, robustness
Unsupervised problems: parity violation
Dark matter density reconstruction, Initial Conditions, let's get creative!
Field level inference


BerkeleySeminar2024
By carol cuesta