Field-Level Inference
Justine Zeghal
justine.zeghal@umontreal.ca


Learning the Universe meeting
October 2025
Université de Montréal
Bayes' theorem: $p(\theta \mid x) \propto p(x \mid \theta)\, p(\theta)$
We can build a simulator to map the cosmological parameters to the data.
[Diagram: the simulator maps parameters to data (prediction); we want the reverse mapping, from data to parameters (inference).]
Full-field inference: extracting all cosmological information
Simulator
Depending on the simulator's nature, we can perform either:
- Explicit inference
- Implicit inference

Full-field inference: extracting all cosmological information
Simulator: from the initial conditions of the Universe to the Large Scale Structure.

Explicit inference
Needs an explicit simulator (an explicit joint likelihood) to sample the joint posterior through MCMC.
We need to sample in high dimension → gradient-based sampling schemes (see the toy sketch below).

Full-field inference: extracting all cosmological information
Implicit inference
It does not matter whether the simulator is explicit or implicit, because all we need are simulations.
This approach typically involves 2 steps:
1) Compression of the high-dimensional data into summary statistics. Without losing cosmological information!
2) Implicit inference on these summary statistics to approximate the posterior.
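To make the gradient-based sampling idea for the explicit path concrete, here is a minimal sketch (not the talk's actual pipeline) of an unadjusted Langevin sampler on a toy Gaussian joint posterior, written in JAX; the `log_joint` model, the step size, and the observation are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

# Toy explicit joint log-density: Gaussian prior on theta, Gaussian likelihood for x.
# In full-field inference, theta would include the high-dimensional initial conditions.
def log_joint(theta, x_obs):
    log_prior = -0.5 * jnp.sum(theta ** 2)
    log_like = -0.5 * jnp.sum((x_obs - theta) ** 2) / 0.1 ** 2
    return log_prior + log_like

# One step of the unadjusted Langevin algorithm: follow the gradient of the
# log posterior plus Gaussian noise. Gradients come for free with jax.grad.
def langevin_step(theta, key, x_obs, step=1e-3):
    grad = jax.grad(log_joint)(theta, x_obs)
    noise = jax.random.normal(key, theta.shape)
    return theta + step * grad + jnp.sqrt(2.0 * step) * noise

key = jax.random.PRNGKey(0)
x_obs = jnp.array([0.3, -0.7])
theta = jnp.zeros(2)
samples = []
for _ in range(1000):
    key, subkey = jax.random.split(key)
    theta = langevin_step(theta, subkey, x_obs)
    samples.append(theta)
samples = jnp.stack(samples)  # approximate posterior samples
```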

Outline
Which full-field inference methods require the fewest simulations?
How to build sufficient statistics?
Can we perform implicit inference with fewer simulations?
How to generate more realistic simulations?
Optimal Neural Summarisation for Full-Field Weak Lensing Cosmological Implicit Inference
Denise Lanzieri*, Justine Zeghal*, T. Lucas Makinen, François Lanusse, Alexandre Boucaud and Jean-Luc Starck

How to extract all the information?
Sufficient Statistic
A statistic $t$ is said to be sufficient for the parameters $\theta$ if and only if $p(\theta \mid t(x)) = p(\theta \mid x)$.
It is only a matter of the loss function used to train the compressor.

Two main compression schemes
Regression Losses
Which learn a moment of the posterior distribution.
Mean Squared Error (MSE) loss: $\mathbb{E}\left[\lVert \theta - t(x) \rVert_2^2\right]$
→ Approximates the mean of the posterior.
Mean Absolute Error (MAE) loss: $\mathbb{E}\left[\lVert \theta - t(x) \rVert_1\right]$
→ Approximates the median of the posterior.
The mean is not guaranteed to be a sufficient statistic.
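As a minimal illustration (not the paper's actual architecture), the two regression losses could be written as follows in JAX; the neural compressor `compress` and its parameters are placeholders.

```python
import jax.numpy as jnp

# t = compress(params, x) is a neural compressor mapping a mass map x
# to a low-dimensional summary; here it is left abstract.

def mse_loss(params, x_batch, theta_batch, compress):
    # Minimizing the L2 error drives t towards the posterior mean E[theta | x].
    t = compress(params, x_batch)
    return jnp.mean(jnp.sum((t - theta_batch) ** 2, axis=-1))

def mae_loss(params, x_batch, theta_batch, compress):
    # Minimizing the L1 error drives t towards the posterior median.
    t = compress(params, x_batch)
    return jnp.mean(jnp.sum(jnp.abs(t - theta_batch), axis=-1))
```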
Two main compression schemes
Mutual information maximization
By definition: $I(\theta; t) = \mathbb{E}_{p(\theta, t)}\left[\log \frac{p(\theta, t)}{p(\theta)\, p(t)}\right]$
Maximizing the mutual information between the parameters and the summary statistic
→ builds sufficient statistics according to the definition.
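One common way to maximize this mutual information is a variational lower bound: train the compressor jointly with a conditional density estimator $q(\theta \mid t)$ and maximize $\mathbb{E}[\log q(\theta \mid t(x))]$. Below is a sketch; the diagonal-Gaussian head is only illustrative (in practice a normalizing flow is typically used), and `compress` and `density_head` are placeholders.

```python
import jax.numpy as jnp

# Variational lower bound on I(theta; t): up to a constant (the prior entropy),
# I(theta; t) >= E[log q(theta | t)], so we maximize E[log q(theta | t(x))]
# jointly over the compressor and the conditional density estimator q.

def gaussian_log_prob(theta, mean, log_std):
    return jnp.sum(
        -0.5 * ((theta - mean) / jnp.exp(log_std)) ** 2
        - log_std
        - 0.5 * jnp.log(2.0 * jnp.pi),
        axis=-1,
    )

def vmim_loss(params, x_batch, theta_batch, compress, density_head):
    # compress: x -> summary t ; density_head: t -> (mean, log_std) of q(theta | t)
    t = compress(params["compressor"], x_batch)
    mean, log_std = density_head(params["density"], t)
    return -jnp.mean(gaussian_log_prob(theta_batch, mean, log_std))
```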

For our benchmark: a Differentiable Mass Maps Simulator
We developed a fast and differentiable (JAX) log-normal mass maps simulator.
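To illustrate the log-normal idea only (this is not the benchmark's simulator: the white-noise field, `sigma_g`, and `shift` are placeholder assumptions), one can generate a Gaussian random field and exponentiate it:

```python
import jax
import jax.numpy as jnp

def lognormal_map(key, n=64, sigma_g=0.5, shift=1.0):
    # 1. Draw a Gaussian random field (here white noise; a real simulator would
    #    colour it with a cosmology-dependent power spectrum).
    g = sigma_g * jax.random.normal(key, (n, n))
    # 2. Log-normal transform: exp(g) is positive and skewed, mimicking the
    #    non-Gaussian statistics of the late-time convergence field.
    kappa = shift * (jnp.exp(g - 0.5 * sigma_g ** 2) - 1.0)
    return kappa

key = jax.random.PRNGKey(0)
kappa = lognormal_map(key)
# Because everything is written in JAX, gradients of any statistic of the map
# with respect to the inputs are available via jax.grad.
```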
Benchmark procedure:
1. We compress using one of the losses.
2. We compare their extraction power by comparing their posteriors.
For this, we use implicit inference, which is fixed for all the compression strategies.


Numerical results
[Figure: posteriors obtained with each compression strategy.]
Outline
Which full-field inference methods require the fewest simulations?
How to build sufficient statistics?
Can we perform implicit inference with fewer simulations?
How to generate more realistic simulations?
Neural Posterior Estimation with Differentiable Simulators
ICML 2022 Workshop on Machine Learning for Astrophysics
Justine Zeghal, François Lanusse, Alexandre Boucaud,
Benjamin Remy and Eric Aubourg
Neural Posterior Estimation
There exist several ways to do implicit inference:
- Learning the likelihood
- Learning the likelihood ratio
- Learning the posterior → Normalizing Flows
From simulations only! But it requires a lot of simulations…

Neural Posterior Estimation with Gradients
How can gradients help reduce the number of simulations?
Normalizing Flows training with gradients
Normalizing flows are trained by minimizing the negative log-likelihood: $-\mathbb{E}_{p(\theta, x)}\left[\log q_\phi(\theta \mid x)\right]$
But to train the NF, we want to use both the simulations and the gradients from the simulator.
→ On a toy Lotka-Volterra model, the gradients help to constrain the distribution shape.
[Figure: posteriors on a toy model, without gradients vs. with gradients.]
Outline
Which full-field inference methods require the fewest simulations?
How to build sufficient statistics?
Can we perform implicit inference with fewer simulations?
How to generate more realistic simulations?
Simulation-Based Inference Benchmark for Weak Lensing Cosmology
Justine Zeghal, Denise Lanzieri, François Lanusse, Alexandre Boucaud, Gilles Louppe, Eric Aubourg, Adrian E. Bayer
and The LSST Dark Energy Science Collaboration (LSST DESC)

For our benchmark: a Differentiable Mass Maps Simulator
We developed a fast and differentiable (JAX) log-normal mass maps simulator.

Benchmark Results
Both explicit and implicit inference yield the same posterior.
Explicit inference needs 10^6 simulations.
Implicit inference needs 10^3 simulations.
The gradients are too noisy to help reduce the number of simulations in implicit inference.
Do gradients help implicit inference methods?
Training the NF with simulations and gradients:
$\mathcal{L} = -\mathbb{E}_{p(\theta, x)}\left[\log q_\phi(\theta \mid x)\right] + \lambda\, \mathbb{E}_{p(\theta, x)}\left[\left\lVert \nabla_\theta \log q_\phi(\theta \mid x) - \nabla_\theta \log p(\theta, x) \right\rVert^2\right]$
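A minimal sketch of such a combined loss in JAX (the flow's `log_prob` function, the simulator scores `score_batch`, and the weight `lam` are placeholder assumptions, not the benchmark's actual implementation):

```python
import jax
import jax.numpy as jnp

def combined_loss(params, theta_batch, x_batch, score_batch, log_prob, lam=1.0):
    """NLL of the normalizing flow plus a score-matching term.

    log_prob(params, theta, x) -> log q_phi(theta | x)
    score_batch: d/dtheta log p(theta, x), returned by the differentiable simulator.
    """
    # Standard NPE term: negative log-likelihood of the flow on (theta, x) pairs.
    nll = -jnp.mean(
        jax.vmap(log_prob, in_axes=(None, 0, 0))(params, theta_batch, x_batch)
    )

    # Gradient term: match the flow's score in theta to the simulator's score.
    flow_score = jax.vmap(jax.grad(log_prob, argnums=1), in_axes=(None, 0, 0))(
        params, theta_batch, x_batch
    )
    score_term = jnp.mean(jnp.sum((flow_score - score_batch) ** 2, axis=-1))

    return nll + lam * score_term
```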
Outline
Which full-field inference methods require the fewest simulations?
How to build sufficient statistics?
Can we perform implicit inference with fewer simulations?
How to generate more realistic simulations?
Bridging Simulators with Conditional Optimal Transport
Justine Zeghal, Benjamin Remy, Yashar Hezaveh, François Lanusse,
Laurence Perreault-Levasseur
Workshop on Machine Learning for Astrophysics, co-located with ICML 2024
Learning emulators to generate more simulations
With full-field inference, we now rely only on simulations.
→ We need very realistic simulations.

A way to emulate is to learn a correction of a cheap simulation: easier than learning the entire simulation evolution.
We want:
- the transformation to minimally alter the simulation
- to learn a conditional transformation
- to work with unpaired datasets
→ OT Flow Matching enables learning an Optimal Transport mapping between two distributions.
Flow Matching
[Figure: illustration of flow matching.]

Optimal Transport Flow Matching
[Figure: illustration of optimal transport flow matching.]
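As a rough sketch of the flow matching objective (with an independent coupling for simplicity; the OT variant additionally pairs the minibatch samples with an optimal transport plan, and conditioning on cosmological parameters would add an extra input to the velocity network `v`, which is a placeholder here):

```python
import jax
import jax.numpy as jnp

def flow_matching_loss(params, key, x0_batch, x1_batch, v):
    """Flow matching with straight-line interpolation paths.

    x0_batch: samples from the cheap simulation (source distribution)
    x1_batch: samples from the expensive simulation (target distribution)
    v(params, t, x) -> predicted velocity field
    """
    # Random interpolation times, one per pair, broadcastable over the map shape.
    t = jax.random.uniform(key, (x0_batch.shape[0],) + (1,) * (x0_batch.ndim - 1))

    # Straight-line interpolation between source and target samples.
    x_t = (1.0 - t) * x0_batch + t * x1_batch

    # The target velocity along a straight path is simply x1 - x0.
    target = x1_batch - x0_batch
    pred = v(params, t, x_t)
    return jnp.mean((pred - target) ** 2)
```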






Results
[Figure panels: Experiment, LPT, OT, PM maps.]
→ Good emulation at the pixel level.
→ We can perform both implicit and explicit inference.
Conclusion
Which full-field inference methods require the fewest simulations?
→ Explicit inference requires 100 times more simulations than implicit inference.
How to build sufficient statistics?
→ Mutual Information Maximization.
Can we perform implicit inference with fewer simulations?
→ Gradients can be beneficial, depending on your simulation model.
How to build emulators?
→ We can learn an optimal transport mapping.
Thank you for your attention!