Generative Solutions for Cosmic Problems

[Video Credit: N-body simulation Francisco Villaescusa-Navarro]

Carolina Cuesta-Lazaro

IAIFI Fellow, MIT / Center for Astrophysics

1-Dimensional

Machine Learning

Secondary anisotropies

Galaxy formation

Intrinsic alignments

DESI, DESI-II, Spec-S5

Euclid / LSST

Simons Observatory

CMB-S4

LIGO

Einstein

The era of Big Data Cosmology

xAstrophysics

5-Dimensional

w_0, w_a, f\sigma_8, \Omega_m, \sum m_\nu

Carolina Cuesta-Lazaro IAIFI/MIT @ Princeton 2024

The cost of cosmological simulations

AbacusSummit

330 billion particles in 2 Gpc/h volume

\mathcal{O}(100)

60 trillion particles

~8 TB per simulation

15M CPU hours

(TNG50: ~100M CPU hours)

\mathcal{O}(10^4)

ML Requirements


Simulation efficient methods

Can we store a simulation inside a neural network?

Prompting simulators

1. Fast Emulators + Likelihood Models

2. Continuous Fields and Compression

3. Controllable Simulators

[Figure axes: x_1, x_2]

Model

p_\theta(x)

Training Samples

x_\mathrm{train}

Generative Models 101

Evaluate probabilities

Low p(x)

High p(x)

Generate Novel Samples


z \sim p(z)
x \sim p(x)
x = f(z)
p(x) = p(z = f^{-1}(x)) \left| \det J_{f^{-1}}(x) \right|

Probability mass conserved locally

Inference a la gradient descent

z = f^{-1}(x)

1) Tractable

2) f maximally expressive

Loss = Maximize the likelihood of the training data
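The change-of-variables formula and the maximum-likelihood loss above can be sketched with the simplest possible flow. This is a minimal illustration, assuming a one-dimensional affine map x = f(z) = mu + sigma z and a toy Gaussian training set; the names `log_prob`, `fit`, `mu`, and `log_sigma` are hypothetical and not from the talk.

```python
import numpy as np

def log_prob(x, mu, log_sigma):
    """log p(x) for the toy affine flow x = f(z) = mu + exp(log_sigma) * z, z ~ N(0, 1)."""
    z = (x - mu) * np.exp(-log_sigma)             # z = f^{-1}(x)
    log_base = -0.5 * (z**2 + np.log(2 * np.pi))  # log p(z) under the Gaussian base
    log_det = -log_sigma                          # log |det J_{f^{-1}}(x)|
    return log_base + log_det

def fit(x_train, steps=2000, lr=0.05):
    """Maximize the training likelihood by gradient descent on the mean negative log p(x)."""
    mu, log_sigma = 0.0, 0.0
    for _ in range(steps):
        z = (x_train - mu) * np.exp(-log_sigma)
        grad_mu = -np.mean(z) * np.exp(-log_sigma)  # d(NLL)/d(mu)
        grad_log_sigma = 1.0 - np.mean(z**2)        # d(NLL)/d(log_sigma)
        mu -= lr * grad_mu
        log_sigma -= lr * grad_log_sigma
    return mu, log_sigma

rng = np.random.default_rng(0)
x_train = rng.normal(3.0, 2.0, size=5000)  # toy training samples (assumption)
mu_fit, log_sigma_fit = fit(x_train)
sigma_fit = np.exp(log_sigma_fit)
```

For this affine map the optimum is the sample mean and standard deviation of the training set; a more expressive f changes the Jacobian term but not the structure of the loss.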


\frac{dx_t}{dt} = v^\phi_t(x_t)
x_1 = x_0 + \int_0^1 v^\phi_t(x_t) dt
\frac{\partial p_t(x)}{\partial t} = - \nabla \cdot \left( v^\phi_t(x) p_t(x) \right)

In continuous time

Continuity Equation

Loss requires solving an ODE!

Diffusion, Flow matching, Interpolants... All ways to avoid this at training time
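To see why the likelihood loss requires solving an ODE, here is a rough sketch that integrates a sample and its log-density together. Along a trajectory the continuity equation gives d log p_t(x_t)/dt = -div v. The linear velocity field v(x) = a x is a toy assumption chosen because the answer is known analytically; in practice v would be a neural network.

```python
import numpy as np

A = 0.7  # toy expansion rate (assumption, not from the talk)

def velocity(t, x):
    return A * x

def divergence(t, x):
    # Divergence of the 1D field v(x) = A * x is simply A.
    return A * np.ones_like(x)

def integrate(x0, logp0, n_steps=1000):
    """Euler-integrate dx/dt = v and d(log p)/dt = -div v from t=0 to t=1."""
    x, logp = x0.copy(), logp0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        logp = logp - dt * divergence(t, x)
        x = x + dt * velocity(t, x)
    return x, logp

x0 = np.linspace(-2.0, 2.0, 5)
logp0 = -0.5 * (x0**2 + np.log(2 * np.pi))  # base density N(0, 1)
x1, logp1 = integrate(x0, logp0)

# Analytic check: N(0, 1) pushed through x -> x * exp(A) is N(0, exp(2A)).
logp1_exact = -0.5 * ((x1 / np.exp(A))**2 + np.log(2 * np.pi)) - A
```

Every likelihood evaluation costs a full ODE solve, which is the expense that the simulation-free objectives listed above avoid during training.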

[Image Credit: "Understanding Deep Learning" Simon J.D. Prince]


\frac{dx_t}{dt} = v_\theta(t,x_t)
x_t = (1-t)x_0 + t x_1
v(t,x) = \mathbb{E}\left[\partial_t I | x_t=x \right]

Can we regress the velocity field directly?

Turned maximum likelihood into a regression problem!

Interpolant

I(t,x_0,x_1) = x_t = \alpha_t x_0 + \beta_t x_1

Stochastic Interpolant

x_t = \alpha_t x_0 + \beta_t x_1 + \gamma_t z

\mathcal{L} = \int_0^1 \mathbb{E}_{x_0,x_1} \left[ \left\| v_\theta(t, x_t) - \partial_t I(t,x_0,x_1) \right\|^2 \right] dt

Expectation over all possible paths that pass through x_t
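The regression view can be tried end to end on a toy problem. The sketch below assumes the linear interpolant x_t = (1-t) x_0 + t x_1, one-dimensional Gaussian base and target distributions, and a deliberately simple model class: a separate linear fit v_t(x) = a_t x + b_t at each time step. That model is exact for Gaussians but stands in for the neural network used in practice. All names and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_samples = 100, 20_000
times = np.arange(n_steps) / n_steps

def sample_pair(n):
    x0 = rng.normal(0.0, 1.0, n)  # base distribution
    x1 = rng.normal(2.0, 0.5, n)  # toy target distribution (assumption)
    return x0, x1

# Training: least-squares regression of the path velocity dI/dt = x1 - x0 onto x_t.
coeffs = []
for t in times:
    x0, x1 = sample_pair(n_samples)
    xt = (1 - t) * x0 + t * x1
    target = x1 - x0
    design = np.stack([xt, np.ones_like(xt)], axis=1)
    (a, b), *_ = np.linalg.lstsq(design, target, rcond=None)
    coeffs.append((a, b))

# Sampling: Euler-integrate dx/dt = v_t(x) starting from fresh base samples.
x = rng.normal(0.0, 1.0, n_samples)
dt = 1.0 / n_steps
for a, b in coeffs:
    x = x + dt * (a * x + b)

generated_mean, generated_std = x.mean(), x.std()
```

No ODE is solved during training; the regression averages over all (x_0, x_1) pairs whose path passes through a given x_t, which is the conditional expectation that defines the velocity field. The generated mean and standard deviation should land close to the target values of 2 and 0.5, up to Euler and sampling error.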

["Stochastic Interpolants: A Unifying framework for flows and diffusion" 
Albergo et al arXiv:2303.08797]


Diffusion Models

Reverse diffusion: Denoise previous step

Forward diffusion: Add Gaussian noise (fixed)

Prompt

A person half Yoda half Gandalf

Denoising = Regression

Fixed base distribution:

Gaussian
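A minimal sketch of the fixed forward process and the regression target used for denoising follows. The slides do not specify a noise schedule, so this assumes a DDPM-style variance-preserving schedule with linearly spaced betas; `forward_diffuse`, `denoising_loss`, and the schedule values are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps = 1000
betas = np.linspace(1e-4, 0.02, n_steps)  # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)       # cumulative signal retention

def forward_diffuse(x0, t, rng):
    """Jump straight to step t of the forward process: scaled data plus Gaussian noise."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def denoising_loss(predict_noise, x0, t, rng):
    """Mean squared error between the predicted noise and the noise actually added."""
    xt, eps = forward_diffuse(x0, t, rng)
    return np.mean((predict_noise(xt, t) - eps) ** 2)

x0 = rng.normal(3.0, 0.1, size=10_000)  # toy data (assumption)
x_T, _ = forward_diffuse(x0, n_steps - 1, rng)

# A predictor that always outputs zero gives a loss near 1; a trained model must beat this.
baseline_loss = denoising_loss(lambda xt, t: np.zeros_like(xt), x0, 500, rng)
```

At the last step almost no signal remains, so x_T is close to a standard Gaussian, which is the fixed base distribution the reverse process starts from.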


["A point cloud approach to generative modeling for galaxy surveys at the field level" 
Cuesta-Lazaro and Mishra-Sharma 
ICML AI4Astro 2023, arXiv:2311.17141]

Base Distribution

Target Distribution

  • Sample
  • Evaluate

Long range correlations

Huge point clouds (20M)

Homogeneity and isotropy

Siddharth Mishra-Sharma


Fixed Initial Conditions / Varying Cosmology


p(\theta|x) = \frac{p(x|\theta)p(\theta)}{p(x)}
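Bayes' theorem above can be evaluated by brute force once a likelihood p(x|\theta) is available. On the slide that likelihood comes from the diffusion model; the sketch below substitutes a toy Gaussian likelihood with a single parameter and a flat prior, purely as an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
theta_true, noise = 0.8, 0.05  # toy parameter and noise level (assumptions)
x_obs = theta_true + noise * rng.standard_normal(20)

theta_grid = np.linspace(0.5, 1.1, 601)
dtheta = theta_grid[1] - theta_grid[0]

log_prior = np.zeros_like(theta_grid)  # flat prior p(theta) on the grid
# Toy Gaussian log-likelihood log p(x|theta), summed over independent observations.
log_like = -0.5 * np.sum((x_obs[None, :] - theta_grid[:, None]) ** 2, axis=1) / noise**2

log_post = log_prior + log_like
post = np.exp(log_post - log_post.max())  # subtract the max for numerical stability
post /= post.sum() * dtheta               # dividing by the evidence p(x) normalizes the posterior
theta_map = theta_grid[np.argmax(post)]
```

A grid only works for a handful of parameters; in higher dimensions the same unnormalized log-posterior would be handed to a sampler instead.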

Diffusion model


CNN

Diffusion

Increasing Noise

p(\sigma_8|\delta_m)
p(\sigma_8|\delta_m + 0.01 \epsilon)
p(\sigma_8|\delta_m + 0.02 \epsilon)
["Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo" 
Mudur, Cuesta-Lazaro and Finkbeiner
NeurIPS 2023 ML for the Physical Sciences, arXiv:2405.05255]

 

Nayantara Mudur

CNN

Diffusion


["A Cosmic-Scale Benchmark for Symmetry-Preserving Data Processing" 
Balla, Mishra-Sharma, Cuesta-Lazaro et al
NeurIPS 2024 NeurReps, arXiv:2410.20516]

 

E(3) Equivariant architectures

Benchmark models

["Geometric and Physical Quantities Improve E(3) Equivariant Message Passing" 
Brandstetter et al
arXiv:2110.02905]

 

Symmetry-preserving ML


The bitter lesson by Rich Sutton

The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. [...]

 

methods that continue to scale with increased computation even as the available computation becomes very great. [...]

 

We want AI agents that can discover like we can, not which contain what we have discovered.


["A Cosmic-Scale Benchmark for Symmetry-Preserving Data Processing" 
Balla, Mishra-Sharma, Cuesta-Lazaro et al
NeurIPS 2024 NeurReps, arXiv:2410.20516]

 


Memory scaling point clouds and voxels

Graph: nodes h_0, ..., h_6 connected by edges e_{01}, e_{12}, ...

Nodes: \mathcal{O}(10^6)

Edges: \mathcal{O}(10^8)

3D Mesh

Voxels: \mathcal{O}(10^{10})

Both data representations scale badly with increasing resolution
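The orders of magnitude above translate into memory roughly as follows. This is back-of-the-envelope arithmetic only: the number of features per node, the integer edge-index layout, and the byte widths are assumptions chosen for illustration, not figures from the talk.

```python
def memory_gb(n_elements, values_per_element, bytes_per_value=4):
    """Raw storage in GB for n_elements entries, ignoring activations and gradients."""
    return n_elements * values_per_element * bytes_per_value / 1e9

# Graph: 1e6 nodes with 16 float32 features each (assumed),
# plus 1e8 edges stored as pairs of int64 node indices (assumed).
graph_gb = memory_gb(1e6, 16) + memory_gb(1e8, 2, bytes_per_value=8)

# 3D mesh: 1e10 voxels with a single float32 value each.
voxel_gb = memory_gb(1e10, 1)

def voxel_growth(resolution_factor):
    """Voxel count grows with the cube of the linear resolution."""
    return resolution_factor ** 3
```

Under these assumptions the edge list dominates the graph cost and the single-channel mesh already needs tens of GB before any network layers are applied; doubling the linear resolution multiplies the voxel count by eight.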


Representing Continuous Fields

(x,y,z,t)
\Psi
\dot \Psi

Continuous in space and time

500x compression!
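One way to picture a continuous field representation is a small coordinate network that maps (x,y,z,t) to a field value, so that the weights, rather than a grid, store the field. The sketch below is untrained and uses a hypothetical architecture (sine activations, three hidden layers of width 256) and a hypothetical 256^3 reference grid; it does not reproduce the compression figure quoted on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes, rng):
    """Random weights and zero biases for a fully connected network."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def field(params, coords):
    """Evaluate the field at arbitrary (x, y, z, t) coordinates."""
    h = coords
    for W, b in params[:-1]:
        h = np.sin(h @ W + b)  # smooth activation, so the output is differentiable in space and time
    W, b = params[-1]
    return (h @ W + b)[..., 0]

params = init_mlp([4, 256, 256, 256, 1], rng)
n_params = sum(W.size + b.size for W, b in params)

coords = rng.uniform(0.0, 1.0, size=(5, 4))  # five query points, not tied to any grid
psi = field(params, coords)

n_grid_values = 256**3  # values in one hypothetical gridded snapshot
compression = n_grid_values / n_params
```

The ratio compares a single snapshot against the network size; since the same weights also cover the time axis, the effective saving over a stack of snapshots would be larger, provided the trained network reaches the required accuracy.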


Reconstructing dark matter back in time

Stochastic Interpolants

NF

p(\delta_\mathrm{ICs}, \theta|\delta_\mathrm{Obs}) = p(\delta_\mathrm{ICs}|\delta_\mathrm{Obs}) \, p(\theta|\delta_\mathrm{ICs},\delta_\mathrm{Obs})
["Joint cosmological parameter inference and initial condition reconstruction with Stochastic Interpolants" 
Cuesta-Lazaro, Bayer, Albergo et al 
NeurIPS 2024 ML for the Physical Sciences]

 

Adrian Bayer

Mount Fuji?

p(x_1|x_0)
x_0

?

x_1
s
["Probabilistic Forecasting with Stochastic Interpolants and Foellmer Processes" 
Chen et al arXiv:2403.10648 (Figure adapted from arXiv:2407.21097)]

 

Generative SDE
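Sampling x_1 from p(x_1|x_0) with a generative SDE amounts to integrating a stochastic differential equation forward in the pseudo-time s, starting at x_0. The sketch below uses the Euler-Maruyama scheme with a toy Ornstein-Uhlenbeck drift as a placeholder for the learned drift; it is not the Föllmer-process construction of Chen et al., and the drift, noise level, and function names are assumptions.

```python
import numpy as np

def drift(s, x):
    # Toy placeholder for a learned drift b(s, x); pulls samples toward zero.
    return -x

def euler_maruyama(x0, n_samples=20_000, n_steps=200, sigma=np.sqrt(2.0), seed=0):
    """Draw samples of x_1 given x_0 by integrating dx = b ds + sigma dW from s=0 to s=1."""
    rng = np.random.default_rng(seed)
    ds = 1.0 / n_steps
    x = np.full(n_samples, float(x0))
    for k in range(n_steps):
        s = k * ds
        noise = rng.standard_normal(n_samples)
        x = x + drift(s, x) * ds + sigma * np.sqrt(ds) * noise
    return x

x1_samples = euler_maruyama(x0=2.0)

# For this toy drift the conditional distribution is known in closed form.
exact_mean = 2.0 * np.exp(-1.0)
exact_var = 1.0 - np.exp(-2.0)
```

Every run from the same x_0 gives a different x_1, so repeated integration yields an ensemble of forecasts rather than a single deterministic prediction.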


Guided simulations with fuzzy constraints

Simulate what you need

(and sometimes what you want)


1. Generative models are more than fast emulators: robust field-level likelihood models

2. Continuous fields can be used to represent cosmological fields

3. Cosmological field-level inference can be made efficient with generative models

Conclusions

Can we make them more simulation efficient?

Compression + efficient data format


Can generally make simulators more controllable!