Carol Cuesta-Lazaro (IAIFI Fellow)

and Siddharth Mishra-Sharma (IAIFI Fellow)

Diffusion generative modelling for galaxy surveys

 

Initial Conditions of the Universe

Gaussian random field

Laws of gravity

3-D distribution of galaxies

Which are the ICs of OUR Universe?

Primordial non-Gaussianity?

3-D distribution of dark matter

Is GR modified on large scales?

How do galaxies form?

Neutrino mass hierarchy?

ML for the Large Scale Structure of the Universe:

Carol's wish list

Generative models

Learn p(x)

1. Evaluate the likelihood of a 3D map as a function of the parameters of interest

2. Combine different galaxy properties (such as velocities and positions)

3. Sample 3D maps from the posterior distribution

p(\mathrm{3D\ map} \,|\, \mathrm{Interesting\ parameters})
z_0 \rightarrow z_1 \rightarrow z_2 \rightarrow \dots \rightarrow z_T
p(z_t|z_{t-1})

Reverse diffusion: Denoise previous step

Forward diffusion: Add Gaussian noise (fixed)
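
The fixed forward process can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a variance-preserving step z_t = sqrt(1 - beta_t) z_{t-1} + sqrt(beta_t) eps with a linear beta schedule, and the names `forward_diffusion`, `betas` and the toy point cloud are hypothetical.

```python
import numpy as np

def forward_diffusion(x0, betas, rng):
    """Apply the fixed forward chain step by step and return every z_t.

    Assumed step: z_t = sqrt(1 - beta_t) * z_{t-1} + sqrt(beta_t) * eps.
    """
    z = np.array(x0, dtype=float)
    trajectory = [z.copy()]
    for beta in betas:
        eps = rng.standard_normal(z.shape)          # fresh Gaussian noise
        z = np.sqrt(1.0 - beta) * z + np.sqrt(beta) * eps
        trajectory.append(z.copy())
    return np.stack(trajectory)                     # shape (T + 1, N, 3)

rng = np.random.default_rng(0)
# Toy stand-in for halo positions, rescaled to [-1, 1] (assumption).
x0 = rng.uniform(-1.0, 1.0, size=(1000, 3))
betas = np.linspace(1e-4, 0.02, 1000)               # assumed linear schedule
traj = forward_diffusion(x0, betas, rng)
```

With this schedule the last entry `traj[-1]` is close to a unit Gaussian, which is what lets sampling start from pure noise.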

Diffusion models

A person half Yoda half Gandalf

q_\theta(z_{t-1}|z_t) = \mathcal{N}(z_{t-1}|\mu_\theta(z_t), \sigma_t)

Diffusion on point clouds

z_0 \rightarrow z_1 \rightarrow z_2 \rightarrow \dots \rightarrow z_T
q_\theta(z_{t-1}|z_t)
p(z_t|z_{t-1})

Reverse diffusion: Denoise previous step

Forward diffusion: Add Gaussian noise (fixed)
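
The reverse step q_\theta(z_{t-1}|z_t) = \mathcal{N}(z_{t-1}|\mu_\theta(z_t), \sigma_t) suggests a simple sampling loop, sketched here under assumptions: \sigma_t is treated as a standard deviation, and `toy_mu` is a hypothetical stand-in for the trained network, not a real model.

```python
import numpy as np

def reverse_step(z_t, t, mu_theta, sigma_t, rng):
    """Draw z_{t-1} from a Gaussian centred on mu_theta(z_t, t)."""
    return mu_theta(z_t, t) + sigma_t * rng.standard_normal(z_t.shape)

def sample(mu_theta, sigmas, shape, rng):
    """Start from z_T ~ N(0, I) and denoise step by step down to z_0."""
    z = rng.standard_normal(shape)
    for t in range(len(sigmas), 0, -1):             # t = T, ..., 1
        z = reverse_step(z, t, mu_theta, sigmas[t - 1], rng)
    return z

def toy_mu(z_t, t):
    # Hypothetical placeholder for the learned mean: shrink points towards the origin.
    return 0.5 * z_t

rng = np.random.default_rng(1)
z0 = sample(toy_mu, sigmas=np.full(10, 0.1), shape=(5000, 3), rng=rng)
```

In practice `mu_theta` would be a network conditioned on cosmology; only the loop structure is meant to carry over.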

Cosmology

Graph nodes: h_0, h_1, h_2, h_3, h_4, h_5, h_6

Node features: coordinates (+ mass, velocities)

Input

Noisy halo properties

Output

Noise prediction

Graph Neural Networks as score models

kNN graph (k ~ 20)
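
A k-nearest-neighbour graph over the halos can be built as follows. This is a brute-force sketch for illustration: `knn_edges` is a hypothetical helper, distances are plain Euclidean (a periodic simulation box would need minimum-image distances), and k = 20 is taken from the slide.

```python
import numpy as np

def knn_edges(positions, k=20):
    """Return (senders, receivers): each node receives edges from its k nearest neighbours."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist2 = np.sum(diff**2, axis=-1)                 # (N, N) squared distances
    np.fill_diagonal(dist2, np.inf)                  # exclude self-loops
    neighbours = np.argsort(dist2, axis=1)[:, :k]    # (N, k) neighbour indices
    receivers = np.repeat(np.arange(len(positions)), k)
    senders = neighbours.reshape(-1)
    return senders, receivers

rng = np.random.default_rng(0)
positions = rng.uniform(0.0, 1.0, size=(500, 3))     # toy point cloud
senders, receivers = knn_edges(positions, k=20)
```

The O(N^2) distance matrix is fine for a few thousand points; a tree-based neighbour search would be the usual choice beyond that.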

p(x,y,z, v_x, v_y, v_z, M_h|\Omega_m, \sigma_8)

Halo Mass Function

Velocity PDF

Mean pairwise velocity
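
For reference, the mean pairwise velocity can be estimated from a catalogue roughly as below. This is a sketch under assumptions: `mean_pairwise_velocity` is a hypothetical helper, the box is non-periodic, and the sign convention is positive for pairs moving apart.

```python
import numpy as np

def mean_pairwise_velocity(pos, vel, r_edges):
    """Average relative velocity projected on the pair separation, in bins of separation."""
    i, j = np.triu_indices(len(pos), k=1)            # all unique pairs
    dr = pos[j] - pos[i]
    r = np.linalg.norm(dr, axis=1)
    v_proj = np.sum((vel[j] - vel[i]) * dr, axis=1) / r
    which = np.digitize(r, r_edges) - 1              # bin index per pair
    n_bins = len(r_edges) - 1
    valid = (which >= 0) & (which < n_bins)
    sums = np.bincount(which[valid], weights=v_proj[valid], minlength=n_bins)
    counts = np.bincount(which[valid], minlength=n_bins)
    return sums / np.maximum(counts, 1), counts

rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 1.0, size=(300, 3))
vel = 2.0 * pos                                      # toy pure-expansion flow, v = 2 x
r_edges = np.linspace(0.1, 0.5, 5)
v12, counts = mean_pairwise_velocity(pos, vel, r_edges)
```

For the toy flow v = 2 x every pair has projected velocity 2 r, so each entry of `v12` falls between twice the lower and twice the upper edge of its bin.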

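-\log p(x) \leq -\mathrm{VLB}(x)
-\mathrm{VLB}(x) = D_{KL}(q(z_T|x) || p(z_T)) + \mathbb{E}_{q(z_0|x)} \left[-\log p(x|z_0) \right] + \mathcal{L}_T(x)
\mathcal{L}_T(x) = \sum_{i=1}^T \mathbb{E}_{q(z_{i}|x)} D_{KL} \left[q(z_{i-1} | z_{i}, x) || p_\theta(z_{i-1} | z_{i}) \right]

Here q is the fixed forward process and p_\theta the learned denoising model.

Prior loss: D_{KL}(q(z_T|x) || p(z_T))

Reconstruction loss: \mathbb{E}_{q(z_0|x)} \left[-\log p(x|z_0) \right]

Diffusion loss: \mathcal{L}_T(x)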

Be a true Bayesian: Always maximise the likelihood

arXiv:2107.00630

arXiv:2303.00848

Maximum Likelihood = Denoising
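
One way to see the denoising side of this statement is the noise-prediction loss sketched below. It is the simple unweighted form, under assumptions: the timestep weighting that would make it match the bound exactly is not reproduced, and `noise_prediction_loss`, the schedule and the all-zero toy data are hypothetical choices made so that an exact noise predictor exists.

```python
import numpy as np

def noise_prediction_loss(x, eps_model, alphas, sigmas, rng):
    """Single-timestep Monte Carlo estimate of || eps - eps_model(z_t, t) ||^2."""
    t = rng.integers(0, len(alphas))                 # random timestep index
    eps = rng.standard_normal(x.shape)
    z_t = alphas[t] * x + sigmas[t] * eps            # noised sample z_t given x
    eps_hat = eps_model(z_t, t)                      # network's guess of the noise
    return np.mean((eps - eps_hat) ** 2)

rng = np.random.default_rng(0)
sigmas = np.linspace(0.05, 0.999, 100)               # assumed noise schedule
alphas = np.sqrt(1.0 - sigmas**2)                    # variance-preserving choice
x = np.zeros((1000, 3))                              # contrived data: all zeros

# With x = 0 the noise is recoverable exactly as z_t / sigma_t.
oracle = lambda z_t, t: z_t / sigmas[t]
loss_oracle = noise_prediction_loss(x, oracle, alphas, sigmas, rng)
loss_zero = noise_prediction_loss(x, lambda z_t, t: np.zeros_like(z_t), alphas, sigmas, rng)
```

A perfect denoiser drives the loss to zero, while a model that always predicts zero noise sits near one, the variance of the injected noise.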

Setting tight constraints with only 5000 halo positions

 

+ Galaxy formation

+ Observational systematics (Cut-sky, Fiber collisions)

+ Lightcone, Redshift Space Distortions, ...

Forward Model

N-body simulations

Observations

p(\mathrm{Observations} \,|\, \mathrm{Cosmology})

Optimise information on cosmological parameters

(robust) surprises

The challenge for field level inference