Justine Zeghal
justine.zeghal@umontreal.ca
Learning the Universe meeting
October 2025
Université de Montréal
Bayes' theorem: p(θ | x) = p(x | θ) p(θ) / p(x)
We can build a simulator to map the cosmological parameters to the data.
Prediction
Inference
Simulator
Depending on the simulator's nature, we can perform either explicit or implicit inference.
Simulator
Initial conditions of the Universe
Large Scale Structure
Explicit joint likelihood:
Needs an explicit simulator to sample the joint posterior through MCMC.
We need to sample in high dimension → gradient-based sampling schemes.
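As a hedged illustration (not the exact sampler from the talk), one gradient-based scheme is the Metropolis-adjusted Langevin algorithm (MALA), whose proposal drifts along the score of the joint log-density; with a differentiable (JAX) simulator this score comes for free via autodiff. The `log_joint` below is a hypothetical stand-in so the sketch is self-contained:

```python
import jax
import jax.numpy as jnp

def log_joint(theta):
    # Hypothetical stand-in for log p(x | theta) + log p(theta):
    # a standard Gaussian, so the example runs on its own.
    return -0.5 * jnp.sum(theta ** 2)

score = jax.grad(log_joint)  # gradient of the log density, via autodiff

def mala_step(key, theta, eps=0.1):
    """One Metropolis-adjusted Langevin step targeting exp(log_joint)."""
    key_prop, key_acc = jax.random.split(key)
    noise = jax.random.normal(key_prop, theta.shape)
    # Langevin proposal: drift along the score, plus Gaussian noise.
    prop = theta + 0.5 * eps**2 * score(theta) + eps * noise

    def log_q(b, a):
        # Log proposal density of b given a (constants cancel in the ratio).
        mean = a + 0.5 * eps**2 * score(a)
        return -0.5 * jnp.sum((b - mean) ** 2) / eps**2

    log_alpha = (log_joint(prop) + log_q(theta, prop)
                 - log_joint(theta) - log_q(prop, theta))
    accept = jnp.log(jax.random.uniform(key_acc)) < log_alpha
    return jnp.where(accept, prop, theta)

key = jax.random.PRNGKey(0)
theta = jnp.ones(5)
for _ in range(100):
    key, sub = jax.random.split(key)
    theta = mala_step(sub, theta)
```

The point is that the only model-specific ingredient is `log_joint`; everything else is generic, which is why an explicit, differentiable likelihood is the entry ticket to this family of samplers.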
This approach typically involves two steps:
1) Compression of the high-dimensional data into summary statistics, without losing cosmological information!
2) Implicit inference on these summary statistics to approximate the posterior.
Summary statistics
Simulator
It does not matter if the simulator is explicit or implicit, because all we need are simulations.
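To make the "all we need are simulations" point concrete, here is an illustrative sketch (not from the talk) of the simplest simulation-only scheme, rejection ABC: draw parameters from the prior, run the black-box simulator, and keep the parameters whose simulations land near the observed data. The toy simulator below is a hypothetical stand-in:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulator(theta, rng):
    # Hypothetical black-box simulator: we only need samples from it,
    # never an explicit likelihood.
    return theta + rng.normal(scale=0.5, size=theta.shape)

theta_prior = rng.uniform(-3.0, 3.0, size=100_000)  # draws from the prior
x_sim = simulator(theta_prior, rng)                 # one simulation each
x_obs = 1.0                                         # "observed" data

# Keep the parameters whose simulation is close to the observation;
# the kept draws approximate samples from p(theta | x_obs).
kept = theta_prior[np.abs(x_sim - x_obs) < 0.1]
posterior_mean = kept.mean()
```

Modern implicit-inference methods (normalizing flows, as below) replace this wasteful rejection step with a learned density, but the input is the same: parameter–simulation pairs.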
Which full-field inference methods require the fewest simulations?
How to build sufficient statistics?
Can we perform implicit inference with fewer simulations?
How to generate more realistic simulations?
Denise Lanzieri*, Justine Zeghal*, T. Lucas Makinen, François Lanusse, Alexandre Boucaud and Jean-Luc Starck
It is only a matter of the loss function used to train the compressor.
Sufficient statistic:
A statistic t is said to be sufficient for the parameters θ if and only if p(θ | t(x)) = p(θ | x).
Mean Squared Error (MSE) loss: learns a moment of the posterior distribution → approximates the mean of the posterior.
Mean Absolute Error (MAE) loss: learns a moment of the posterior distribution → approximates the median of the posterior.
The mean is not guaranteed to be a sufficient statistic.
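A quick sanity check of these two claims (a self-contained sketch, not code from the talk): over a fixed set of samples, the constant minimizing the MSE is the sample mean, while the constant minimizing the MAE is the sample median. A skewed distribution makes the difference visible:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.exponential(size=2_001)  # skewed, so mean != median

# Scan candidate constants and evaluate both losses at each.
grid = np.linspace(0.0, 5.0, 1_001)
mse = ((samples[None, :] - grid[:, None]) ** 2).mean(axis=1)
mae = np.abs(samples[None, :] - grid[:, None]).mean(axis=1)

best_mse = grid[np.argmin(mse)]  # tracks the sample mean
best_mae = grid[np.argmin(mae)]  # tracks the sample median

print(best_mse, samples.mean())
print(best_mae, np.median(samples))
```

The same logic carries over to a neural compressor: training it with MSE (resp. MAE) against the true parameters drives its output toward the posterior mean (resp. median), a single moment, not the full information content.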
By definition: p(θ | t(x)) = p(θ | x).
→ build sufficient statistics according to the definition.
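One way to build such statistics, assuming the variational mutual-information-maximization formulation (the paper's exact objective may differ), is to maximize a lower bound on the mutual information between parameters and summaries:

```latex
I\big(\theta;\, t(x)\big) \;\geq\; \mathbb{E}_{p(\theta, x)}\!\left[\log q_\varphi\big(\theta \mid t(x)\big)\right] \;+\; H(\theta)
```

Maximizing the expected log q_φ(θ | t(x)) jointly over the compressor t and the variational distribution q_φ therefore pushes t toward sufficiency, since the bound is tight when q_φ(θ | t(x)) = p(θ | x).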
We developed a fast and differentiable (JAX) log-normal mass maps simulator.
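A minimal sketch of how a log-normal field can be generated (illustrative only; the actual simulator is written in JAX and uses cosmology-dependent power spectra and shift parameters): draw a Gaussian random field with a chosen spectrum in Fourier space, then exponentiate it into a positive, skewed field.

```python
import numpy as np

def lognormal_map(n=64, slope=-2.0, shift=1.0, seed=0):
    """Gaussian random field with a power-law spectrum, pushed through
    a log-normal transform. `slope` and `shift` are illustrative knobs."""
    rng = np.random.default_rng(seed)
    kx = np.fft.fftfreq(n)[:, None]
    ky = np.fft.fftfreq(n)[None, :]
    k = np.sqrt(kx**2 + ky**2)
    k[0, 0] = np.inf                       # no power in the mean mode
    amplitude = k ** (slope / 2.0)         # sqrt of a power-law P(k)
    phases = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    gaussian = np.fft.ifft2(amplitude * phases).real
    gaussian -= gaussian.mean()
    # Log-normal transform: skewed, bounded below by -shift.
    return shift * (np.exp(gaussian - gaussian.var() / 2.0) - 1.0)

kappa = lognormal_map()
```

Writing this in JAX instead of NumPy makes the whole map differentiable with respect to the input parameters, which is what enables the gradient-based experiments below.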
Benchmark procedure:
1. We compress using one of the losses.
2. We compare their extraction power by comparing their posteriors.
For this, we use implicit inference, which is fixed for all the compression strategies.
Which full-field inference methods require the fewest simulations?
How to build sufficient statistics?
Can we perform implicit inference with fewer simulations?
How to generate more realistic simulations?
ICML 2022 Workshop on Machine Learning for Astrophysics
Justine Zeghal, François Lanusse, Alexandre Boucaud,
Benjamin Remy and Eric Aubourg
There exist several ways to do implicit inference → Normalizing Flows.
From simulations only! But it takes a lot of simulations…
How can gradients help reduce the number of simulations?
Normalizing flows are trained by minimizing the negative log-likelihood: L = −E[log p_φ(θ | x)].
But to train the NF, we want to use both the simulations and the gradients from the simulator.
→ On a toy Lotka-Volterra model, the gradients help constrain the shape of the distribution.
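One hedged way to write such a gradient-augmented objective (the exact loss in the paper may differ) adds a score-matching term to the NLL, using that ∇_θ log p(θ | x) = ∇_θ log p(x | θ) + ∇_θ log p(θ), where the first term comes from the differentiable simulator and the second from the prior:

```latex
\mathcal{L} \;=\; -\,\mathbb{E}\big[\log p_\varphi(\theta \mid x)\big]
\;+\; \lambda\, \mathbb{E}\Big[\big\| \nabla_\theta \log p_\varphi(\theta \mid x)
\;-\; \nabla_\theta \log p(x \mid \theta) \;-\; \nabla_\theta \log p(\theta) \big\|_2^2\Big]
```

The second term supplies per-sample shape information about the posterior, which is why it can tighten the estimate from fewer simulations.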
[Figure: posterior estimates without gradients vs. with gradients]
Which full-field inference methods require the fewest simulations?
How to build sufficient statistics?
Can we perform implicit inference with fewer simulations?
How to generate more realistic simulations?
Justine Zeghal, Denise Lanzieri, François Lanusse, Alexandre Boucaud, Gilles Louppe, Eric Aubourg, Adrian E. Bayer
and The LSST Dark Energy Science Collaboration (LSST DESC)
We developed a fast and differentiable (JAX) log-normal mass maps simulator.
Both explicit and implicit inference yield the same posterior.
Explicit inference needs 10^6 simulations; implicit inference needs 10^3.
The gradients are too noisy to help reduce the number of simulations in implicit inference.
Which full-field inference methods require the fewest simulations?
How to build sufficient statistics?
Can we perform implicit inference with fewer simulations?
How to generate more realistic simulations?
Justine Zeghal, Benjamin Remy, Yashar Hezaveh, François Lanusse,
Laurence Perreault-Levasseur
Workshop on Machine Learning for Astrophysics, co-located with ICML 2024
A way to emulate is to learn the correction of a cheap simulation:
→ OT Flow Matching enables learning an Optimal Transport mapping between two random distributions.
Easier than learning the entire simulation evolution.
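For reference, the conditional flow-matching objective in its standard form (the paper's exact conditioning may differ) regresses a velocity field onto straight-line displacements between paired samples:

```latex
\mathcal{L}_{\mathrm{FM}} \;=\; \mathbb{E}_{t \sim \mathcal{U}[0,1],\; (x_0, x_1) \sim \pi}
\Big[\big\| v_\varphi(x_t, t) - (x_1 - x_0) \big\|^2\Big],
\qquad x_t = (1 - t)\, x_0 + t\, x_1
```

Here π is an (approximately) optimal-transport coupling between the cheap-simulation and expensive-simulation distributions, which is what makes learning the correction easier than learning the full simulation evolution.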
We want:
With full-field inference, we are now only relying on simulations.
→ We need very realistic simulations.
[Diagram: OT mapping between LPT and PM simulations]
Experiment:
→ Good emulation at the pixel level.
→ We can perform both implicit and explicit inference.
Which full-field inference methods require the fewest simulations?
→ Explicit inference requires far more simulations than implicit inference (10^6 vs. 10^3).
How to build sufficient statistics?
→ Mutual Information Maximization.
Can we perform implicit inference with fewer simulations?
→ Gradients can be beneficial, depending on your simulation model.
How to build emulators?
→ We can learn an optimal transport mapping.
Summary statistics
Simulator