Field Level Inference
A Biased Perspective
[Video Credit: N-body simulation Francisco Villaescusa-Navarro]
Carolina Cuesta-Lazaro
Flatiron Institute
Institute for Advanced Study
What is field-level inference?
A digital twin of our Universe

Observed Galaxy Distribution
Simulated Galaxy Distribution

Field Level Inference
Forward Model
(= no Cosmic Variance)




Carolina Cuesta-Lazaro Flatiron/IAS - FLI
Why field-level inference?
Optimal constraints
Counts-in-cell
Do we really need to infer 10^9 parameters to constrain 10?

["Genie 2: A large-scale foundation model" Parker-Holder et al (2024)]
Probabilistic ML has made high dimensional inference tractable
1024x1024xTime
["Genie 3: A new frontier for world models" Parker-Holder et al (2025)]
What field level inference isn't: Marginalisation


Compression
Marginal Likelihood
Explicit Likelihood
Implicit Likelihood
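
One way to write out the distinction between these approaches, as a sketch. The symbols x (observed field), z (latent variables such as the initial conditions), θ (cosmological parameters) and s (a summary of the field) are notation assumed here, not taken from the slides.

```latex
% Explicit-likelihood field-level inference: sample parameters and latents jointly
p(\theta, z \mid x) \;\propto\; p(x \mid z, \theta)\, p(z \mid \theta)\, p(\theta)

% Marginal likelihood: the latent variables are integrated out
p(x \mid \theta) \;=\; \int p(x \mid z, \theta)\, p(z \mid \theta)\, \mathrm{d}z

% Compression: condition only on a low-dimensional summary of the field
p(\theta \mid s(x)), \qquad \dim s \ll \dim x
```

In the implicit (simulation-based) setting, the last two are learned from simulated pairs (θ, x) rather than evaluated analytically.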
Bridging two distributions

Base
Data
"Creating noise from data is easy;
creating data from noise is generative modeling."
Yang Song
Neural Network
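
A minimal sketch of the "easy" direction in the quote above: turning data into noise. It assumes a cosine variance-preserving schedule and toy Gaussian arrays standing in for fields; `forward_noise` and the schedule are illustrative choices, not the models used in the cited work. The hard direction (noise to data) is what the neural network has to learn.

```python
import numpy as np

def forward_noise(x0, t, rng):
    """Interpolate between data (t=0) and pure Gaussian noise (t=1).

    Assumed schedule: alpha = cos(pi t / 2), sigma = sin(pi t / 2),
    so alpha**2 + sigma**2 = 1 at every t (variance preserving).
    """
    alpha = np.cos(0.5 * np.pi * t)  # weight on the data
    sigma = np.sin(0.5 * np.pi * t)  # weight on the noise
    eps = rng.standard_normal(x0.shape)
    return alpha * x0 + sigma * eps, eps

rng = np.random.default_rng(0)
# Toy stand-in for a batch of simulated 2D fields (hypothetical data).
x0 = 2.0 + 0.5 * rng.standard_normal((4, 32, 32))

x_start, _ = forward_noise(x0, 0.0, rng)    # t=0: data is untouched
x_mid, _ = forward_noise(x0, 0.5, rng)      # t=0.5: partially noised
x_end, eps_end = forward_noise(x0, 1.0, rng)  # t=1: data is erased
```

A generative model is then trained to undo this corruption, for example by predicting `eps` from the noised field and `t`.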
Learning likelihoods at the field-level

["A point cloud approach to generative modeling for galaxy surveys at the field level"
Cuesta-Lazaro and Mishra-Sharma
International Conference on Machine Learning ICML AI4Astro 2023, Spotlight talk, arXiv:2311.17141]
Target Distribution
Simulated Galaxy 3d Map
Base Distribution
Prompt:



High-Dimensional
Low-Dimensional
s is sufficient iff
Neural Compression

Maximise
Mutual Information

Neural Posterior Estimation -> Optimal Summaries
["Optimal Neural Summarisation for Full-Field Weak Lensing Cosmological Implicit Inference" Lanzieri et al]
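
The link between sufficiency, mutual information and neural posterior estimation can be sketched with standard information-theoretic identities. The notation (x for the field, θ for the parameters, s_φ for the learned summary, q_φ for the neural posterior) is assumed here and may differ from the cited paper.

```latex
% Sufficiency: the summary preserves the posterior
s \text{ is sufficient} \iff p(\theta \mid x) = p(\theta \mid s(x))

% Data processing inequality, with equality iff s is sufficient
I(\theta; s(x)) \;\le\; I(\theta; x)

% The NPE objective is a variational lower bound on the mutual information
I(\theta; s_\phi(x)) \;=\; H(\theta) - H(\theta \mid s_\phi(x))
  \;\ge\; H(\theta) + \mathbb{E}_{p(\theta, x)}\!\left[\log q_\phi(\theta \mid s_\phi(x))\right]
```

Since H(θ) is fixed by the prior, maximising the expected log-posterior over both the summary network and the density estimator pushes the summary towards sufficiency.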

CNN
Diffusion
Increasing Noise
["Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo" Mudur, Cuesta-Lazaro and Finkbeiner, NeurIPS 2023 ML for the physical sciences, arXiv:2405.05255]

Nayantara Mudur


NPE-Compression
Diffusion
Learning the marginal likelihood is more robust
Diffusion model
Robustness?
Is Field-Level Inference worth it?
Optimal Summaries
FLI
Same pixel-level fidelity required
Number of simulations needed?
Training simulations are IID
Very high dimensional inference!
Low dimensional inference
Marginal Likelihood
Amortized






Reconstructing ALL latent variables:
Dark Matter distribution
Entire formation history
Peculiar velocities
Predictive Cross Validation:
Cross-Correlation with other probes without Cosmic Variance

[Image Credit: Yuuki Omori]
Constraining Inflation:
Inferring primordial non-gaussianity
Why field-level inference?
Data-driven Subgrid models

True
Reconstructed

"Joint cosmological parameter inference and initial condition reconstruction with Stochastic Interpolants"
Cuesta-Lazaro, Bayer, Albergo et al
NeurIPS ML4PS 2024 Spotlight talk


"Detecting model misspecification in cosmology with scale-dependent normalizing flows"
Akhmetzhanova, Cuesta-Lazaro, Mishra-Sharma 2025
arXiv:2508.05744


Aizhan Akhmetzhanova
Use optimal summaries instead of field
How well does the model fit the data?
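
A minimal sketch of the idea of scoring how well a model fits the data in summary space. It assumes a multivariate Gaussian as a stand-in for the normalizing flow, toy 4-dimensional summaries, and no scale dependence, so it illustrates the principle only and is not the method of the cited paper. All names (`fit_gaussian`, `log_density`, the mock arrays) are hypothetical.

```python
import numpy as np

def fit_gaussian(summaries):
    """Fit a mean and covariance to training summaries (rows = simulations)."""
    return summaries.mean(axis=0), np.cov(summaries, rowvar=False)

def log_density(s, mean, cov):
    """Gaussian log-density of each row of s; low values suggest OOD data."""
    d = s - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("...i,ij,...j->...", d, np.linalg.inv(cov), d)
    return -0.5 * (maha + logdet + mean.size * np.log(2.0 * np.pi))

rng = np.random.default_rng(0)
train = rng.standard_normal((5000, 4))        # summaries of training simulations
in_dist = rng.standard_normal((200, 4))       # mocks drawn from the same model
ood = rng.standard_normal((200, 4)) + 3.0     # mocks with shifted summaries

mean, cov = fit_gaussian(train)
# Flag anything less likely than 99% of the training set.
threshold = np.quantile(log_density(train, mean, cov), 0.01)
frac_flagged_in = np.mean(log_density(in_dist, mean, cov) < threshold)
frac_flagged_ood = np.mean(log_density(ood, mean, cov) < threshold)
```

The score is unsupervised: it needs no knowledge of the true parameters of the test mocks, only a density model of the training summaries.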

[Figure: Base, OOD Mock 1 and OOD Mock 2 compared on Large Scales and Small Scales. Panels: Parameter Inference Bias (Supervised) and OOD Metric (Unsupervised).]
Galaxy Bias
Self consistent predictions
Directly? linked to physical processes
Large Volumes
Large Volumes
MTNG ~ 500 Mpc/h
Robust
Clear assumptions
Large Scales
Galaxy formation?
["Full-shape analysis with simulation-based priors: Constraints on single field inflation from BOSS" Ivanov, Cuesta-Lazaro, Mishra-Sharma, Obuljen, Toomey arXiv:2402.13310]
Effective Field Theories
Empirical
HOD/SHAM
Fast
Accurate?
Hydrodynamics
Fast
Clear assumptions
Galaxy formation?
["BaryonBridge: Interpolants models for fast hydrodynamical simulations" Horowitz, Cuesta-Lazaro, Yehia ML4Astro workshop 2025]

Particle Mesh for Gravity
CAMELS Volumes
1000 boxes with varying cosmology and feedback models

Gas Properties

Current model optimised for the Lyman-alpha forest
7 GPU minutes for a 50 Mpc simulation
130 million CPU core hours for TNG50

Density
Temperature
Galaxy Distribution
Hydro Simulations at scale

Learn a continuous representation for feedback


Dark Matter
Baryonic fields
Mingshau Liu
(Ming)

The Roadmap
1) Develop better validation metrics (requires better validation suites)
2) Assess the robustness of field-level inference via parameter-masked mock challenges in realistic OOD scenarios (for example, Beyond2pt)
3) Develop open-source ecosystems for more plug-and-play models
Field-level analysis is too complex for one group to develop a robust framework!

Looking for PhD students and Postdocs on AIxAstro
carolina.clzr@gmail.com

FLI-ESI-VEINNA-2025
By carol cuesta