Generative Solutions for Cosmic Problems
Flatiron Institute
Institute for Advanced Studies
Carol(ina) Cuesta-Lazaro


Carolina Cuesta-Lazaro Flatiron/IAS @ FORTH 2026
"Tiny human catches a few photons in a bucket, declares dark energy is dynamic"
Can we confidently break LCDM?

Late Universe
Early Universe
Tension

Early vs Late
Parametric Extensions
[Image Credit: Prof. Wendy Freedman]
Breaking LCDM
Systematics?
-> Shrink error bars
-> Build methods for attribution
What is field-level inference?
A digital twin of our Universe

Observed Galaxy Distribution
Simulated Galaxy Distribution

Field Level Inference
Forward Model
(= no Cosmic Variance)




Why field-level inference?
Optimal constraints
N-point functions
Counts-in-cell
Wavelets
Marked two-point correlation functions (2PCFs)
Voids
Do we really need to infer 10^9 parameters to constrain 5?

Marginal Inference - SBI


Neural Compression
Initial Conditions
Marginal Likelihood
["Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo" Mudur, Cuesta-Lazaro and Finkbeiner NeurIPS 2023 ML for the Physical Sciences, arXiv:2405.05255]





z: All possible trajectories
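The marginalisation behind this slide can be written out explicitly. As a sketch, take θ for the cosmological parameters, x for the observed field, and z for the latent variables (the initial conditions, or "all possible trajectories"). The symbols θ and x are labels assumed here; only z appears on the slide.

```latex
p(x \mid \theta) = \int p(x \mid \theta, z)\, p(z \mid \theta)\, \mathrm{d}z,
\qquad
p(\theta \mid x) \propto p(x \mid \theta)\, p(\theta).
```

Running the simulator with a random z draws a sample from p(x | θ) without ever evaluating the integral, so the marginal density has to be learned from samples rather than computed.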

Simulators sample the Likelihood

CNN
Diffusion
Increasing Noise
["Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo" Mudur, Cuesta-Lazaro and Finkbeiner NeurIPS 2023 ML for the Physical Sciences, arXiv:2405.05255]

Nayantara Mudur


Posterior (NPE)
Likelihood (NLE)
Learning the marginal likelihood is more robust
Learned Likelihood
Initial Conditions

Do we need neural summaries?
kmax ~ 0.5
DESI LRG-like HOD galaxies
(x 10 HODs / cosmology)
L = 1 Gpc/h




["Detecting Model Misspecification in Cosmology with Scale-Dependent Normalizing Flows" Akhmetzhanova, Cuesta-Lazaro, Mishra-Sharma]

Aizhan Akhmetzhanova (Harvard)
Scale Dependent Compression

Base
OOD Mock 1
OOD Mock 2
Large Scales
Small Scales
Small Scales

OOD Mock 1
OOD Mock 2
Parameter Inference Bias (Supervised)
OOD Metric (Unsupervised)
Large Scales
Small Scales






Reconstructing ALL latent variables:
Dark Matter distribution
Entire formation history
Peculiar velocities
Predictive Cross Validation:
Cross-Correlation with other probes without Cosmic Variance

[Image Credit: Yuuki Omori]
Constraining Inflation:
Inferring primordial non-gaussianity
Why field-level inference?
Data-driven Subgrid models / Data-driven Systematics
The forward model
Scaling up to survey volumes
Modelling small scale clustering
Survey realism
Model misspecification
The FLI Challenges
Sampling high-dimensional posteriors
The Forward Model
Stop worrying about the Likelihood so that you can worry about the forward model
Galaxy Formation

Adapted from arXiv:1804.03097
Symmetries
Connected to Underlying Physics
Hydro sims
Empirical
Halo Occupation Distribution (HOD)
EFT bias expansion
Matter Density
Galaxy Distribution
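To make the empirical end of this spectrum concrete, here is a minimal sketch of a Halo Occupation Distribution using a widely used five-parameter form (an error-function step for centrals, a power law for satellites). The function names and all parameter values are illustrative assumptions, not the settings used in the talk.

```python
import math

def mean_centrals(mass, log_m_min=13.0, sigma_log_m=0.3):
    """Mean number of central galaxies in a halo of the given mass (Msun/h).

    Smooth step from 0 to 1 centred on log10(mass) = log_m_min.
    """
    return 0.5 * (1.0 + math.erf((math.log10(mass) - log_m_min) / sigma_log_m))

def mean_satellites(mass, log_m_min=13.0, sigma_log_m=0.3,
                    log_m0=13.2, log_m1=14.0, alpha=1.0):
    """Mean number of satellites: central occupation times a power law in mass."""
    m0, m1 = 10.0 ** log_m0, 10.0 ** log_m1
    if mass <= m0:
        return 0.0  # no satellites below the cut-off mass
    n_cen = mean_centrals(mass, log_m_min, sigma_log_m)
    return n_cen * ((mass - m0) / m1) ** alpha
```

Populating a simulation then amounts to drawing a Bernoulli central and a Poisson number of satellites per halo from these means.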
["Full-shape analysis with simulation-based priors: Constraints on single field inflation from BOSS" Ivanov, Cuesta-Lazaro et al 2025]
Simulation Based Priors

Simulated Galaxies
EFT Field Level Fit
Fit:
?
["Full-shape analysis with simulation-based priors: Constraints on single field inflation from BOSS" Ivanov, Cuesta-Lazaro et al 2025]

40% Improvement!
x2 survey volume
BOSS + Conservative Priors
BOSS + Simulation Based Priors
Simulation Based Priors

(Slide credit: Matthew Ho)
Scaling Up to Survey Volumes

Learning The Universe
Simon Ding
Xiaosheng Zhao
Lucas Makinen
Axel Lapel
Adrian Bayer
Guilhem Lavaux
Benjamin Wandelt
Ce Sui
Matthew Ho
Leander Thiele
Rosa Malandrino
Greg Bryan
Nicolas Chartier
Lucia Perez
Chirag Modi
Deaglan Bartlett
Shivam Pandey
Sammy Sharief
Ana Maria Delgado
Anirban Bairagi
Christopher Lovell
Carolina Cuesta-Lazaro
Shy Genel
Francisco Villaescusa-Navarro
Laurence Perreault Levasseur
...

Particle Mesh for Gravity

Gas Properties
Density
Temperature

Galaxy Distribution
["BaryonBridge: Interpolants models for fast hydrodynamical simulations" Horowitz, Cuesta-Lazaro, Yehia ML4Astro workshop 2025]
Scalable Field Level Emulators
Probabilistic
Local
["CHARM: Creating Halos with Auto-Regressive Multi-stage networks" Pandey et al 2024]
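A particle-mesh gravity solver starts by depositing particles onto a grid. The sketch below shows cloud-in-cell (CIC) mass assignment in one dimension on a periodic box. It is an illustrative toy with assumed names (`cic_deposit`, unit-mass particles, grid points at multiples of the cell size), not the solver used in the emulators above.

```python
def cic_deposit(positions, n_cells, box_size):
    """Cloud-in-cell deposit of unit-mass particles on a periodic 1D grid.

    Each particle shares its mass linearly between its two nearest grid points.
    """
    density = [0.0] * n_cells
    cell = box_size / n_cells
    for x in positions:
        s = (x % box_size) / cell      # position in units of the cell size
        left = int(s)                  # index of the grid point to the left
        w = s - left                   # fractional distance from that point
        density[left % n_cells] += 1.0 - w
        density[(left + 1) % n_cells] += w   # periodic wrap-around
    return density
```

For example, `cic_deposit([0.375], 4, 1.0)` splits the particle evenly between grid points 1 and 2. The total deposited mass always equals the number of particles.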



Amanda Lue (Columbia)
Trained on CAMELS 25 Mpc/h -> Inference over 50 Mpc/h
Supernovae Feedback
N-body
Galaxies

(Slide credit: Matthew Ho)

(Slide credit: Matthew Ho)


Posterior resimulations in minutes!
- Gravity Solver (Gadget-4)
- Halo finder (SUBFIND)
- Semi-analytic galaxy formation model (L-Galaxies)
OOD Tests



Sampling High Dimensional Posteriors

GANs

Deep Belief Networks
2006

VAEs

Normalising Flows

BigGAN

Diffusion Models

2014
2017
2019
2022
A folk music band of anthropomorphic autumn leaves playing bluegrass instruments
Contrastive Learning
2023
Meanwhile, on Earth...
2026
"Write a C compiler"
AGI?
Goal: Estimate unknown p(x1) from samples
Base
Target
Transport Map



Base
Data
"Creating noise from data is easy; creating data from noise is generative modeling."
(Yang Song)
Neural Network





Transport Map
Continuity Equation
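For reference, the transport map and continuity equation named on this slide take the following standard form. The notation is assumed here: ρ_t is the density interpolating between base (t = 0) and data (t = 1), and b_t is the velocity field the neural network approximates.

```latex
\partial_t \rho_t(x) + \nabla \cdot \big( \rho_t(x)\, b_t(x) \big) = 0,
\qquad
\frac{\mathrm{d}x_t}{\mathrm{d}t} = b_t(x_t), \quad x_0 \sim \rho_0 .
```

Integrating the ordinary differential equation from t = 0 to t = 1 pushes base samples onto the target distribution.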
Interpolant
Base
Data
Neural Network
1) Training
2) Inference
Estimated from samples
(Implicit Likelihood)
["Stochastic Interpolants: A Unifying framework for flows and diffusion" Albergo et al arXiv:2303.08797]
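The two stages on this slide can be sketched in a toy one-dimensional setting. The example below assumes the linear interpolant x_t = (1 - t) x0 + t x1 and replaces the neural network with a per-time linear regression, which is only adequate because both base and data are Gaussian here. All names and numbers are illustrative assumptions.

```python
import random

def fit_velocity(x0, x1, n_steps=50):
    """1) Training: regress the target x1 - x0 on x_t at each time step.

    Returns (intercept, slope) pairs so that b_t(x) ~ intercept + slope * x.
    """
    coeffs = []
    n = len(x0)
    for k in range(n_steps):
        t = (k + 0.5) / n_steps
        xt = [(1 - t) * u + t * v for u, v in zip(x0, x1)]  # interpolant
        target = [v - u for u, v in zip(x0, x1)]            # its time derivative
        mx, my = sum(xt) / n, sum(target) / n
        var = sum((x - mx) ** 2 for x in xt)
        cov = sum((x - mx) * (y - my) for x, y in zip(xt, target))
        slope = cov / var
        coeffs.append((my - slope * mx, slope))
    return coeffs

def sample(coeffs, n_samples, rng):
    """2) Inference: Euler-integrate dx/dt = b_t(x) starting from base samples."""
    dt = 1.0 / len(coeffs)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for intercept, slope in coeffs:
        xs = [x + dt * (intercept + slope * x) for x in xs]
    return xs

rng = random.Random(0)
base = [rng.gauss(0.0, 1.0) for _ in range(5000)]   # x0 ~ N(0, 1)
data = [rng.gauss(3.0, 0.5) for _ in range(5000)]   # x1 ~ N(3, 0.5^2)
coeffs = fit_velocity(base, data)
samples = sample(coeffs, 5000, rng)
```

The generated `samples` should have mean and spread close to those of `data`, even though the target density was never written down, only sampled.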

Particle Mesh
Dark Matter Only
Gaussian Likelihood
Explicit Sampling vs SBI
["Joint cosmological parameter inference and initial condition reconstruction with Stochastic Interpolants" Cuesta-Lazaro, Bayer, Albergo et al NeurIPS 2024 ML for the Physical Sciences]

Adrian Bayer
Mount Fuji?
Chirag Modi

1) Likelihood not necessarily Gaussian
2) Forward model need not be differentiable
3) Amortized
Generative Model: Marginalizing over ICs
Generative Model: Fixing ICs
HMC: Marginalizing over ICs

True
Reconstructed


SBI
HMC
Cross Correlation Coefficient
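The cross-correlation coefficient is conventionally defined from the auto- and cross-power spectra of the true and reconstructed fields. The definition below is the standard one and is assumed to be what the figure shows; the subscripts are labels chosen here.

```latex
r(k) = \frac{P_{\mathrm{true} \times \mathrm{rec}}(k)}
            {\sqrt{P_{\mathrm{true}}(k)\, P_{\mathrm{rec}}(k)}}
```

A value of r(k) close to 1 means the reconstruction recovers the phases of the true field at that scale, independently of any amplitude mismatch.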
Scaling up in volume
Implicit FLI for DESI
DESI Y1 LRG effective volumes are already larger than our sims!
Small Scale Galaxy Bias

Selection
Fibre collisions
Forward Modelling the Survey Systematics



PT
Scaling up in Volume
Large Scale Reconstruction

True

Reconstructed


["Reconstructing Cosmological Initial Conditions from Late-Time Structure with Convolutional Neural Networks" Shallue, Eisenstein 2022]
["Initial conditions from galaxies: machine-learning subgrid correction to standard reconstruction" Parker, Bayer, Seljak 2025]


Power Spectrum
Cross Correlation

Peculiar Velocities
True
Reconstructed
BEFORE
Artificial General Intelligence?
AFTER


DiscoverPhysics: Benchmarking LLMs for
Out-of-the-Box Scientific Thinking

Matt Wiemann

Lindsay Smith
Hypothesis
Simulate World

Invisible particles
Extra dimensions
Multi Species ...
Propose Experiment


Simulate
Text: Conceptual Understanding
Trajectories (.csv)

Science Agent (LLM)

Science Agent (LLM)

Outputs
Python Code: Trajectory MSE
["DiscoverPhysics: Benchmarking LLMs for Out-of-the-Box Scientific Thinking" Wiemann, Smith et al (including CCL)]
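One plausible form of the trajectory score is sketched below: the mean over time steps of the squared Euclidean distance between predicted and observed positions. The slides do not specify the benchmark's exact normalisation, so the function name, the 2D (x, y) layout, and the averaging convention are assumptions.

```python
def trajectory_mse(predicted, observed):
    """Mean squared error between two trajectories.

    Each trajectory is a sequence of (x, y) positions at matching time steps.
    """
    if len(predicted) != len(observed):
        raise ValueError("trajectories must have the same number of time steps")
    if not predicted:
        raise ValueError("trajectories must not be empty")
    total = 0.0
    for (px, py), (ox, oy) in zip(predicted, observed):
        total += (px - ox) ** 2 + (py - oy) ** 2   # squared distance at this step
    return total / len(predicted)
```

A score like this checks the agent's law quantitatively, which complements the text-based conceptual evaluation.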
You are an expert physicist tasked with discovering scientific laws in a simulated universe. Your goal is to propose experiments, analyse the data they return, and ultimately deduce the underlying scientific law. Note that the laws of physics in this universe may differ from those in our own. You can perform experiments to gather data but must follow the protocol strictly.

If F = G/r²: G = 80. If F = G/r: G = 8. Linear in 1/r!
m=1: a = 0.664 · m=2: a = 0.614 · m=4: a = 0.664 · m=8: a = 0.714 … That's strange.
Consistent pattern: small +y 'extra' force (~0.05) regardless of position … maybe the anchor itself moves.
a_y_extra ≈ +0.05 everywhere on probes. Spot on! So anchor too just experiences g_y = 0.05 … g = (0, 0.05) is a uniform external acceleration on every particle.
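The two candidate laws in the first line of this transcript (G = 80 for 1/r², G = 8 for 1/r) agree on a single measurement, F = 0.8 at r = 10, so the exponent can only be pinned down from several radii. A minimal sketch of that check is a straight-line fit in log-log space. The function name and the radii used below are hypothetical, not taken from the benchmark.

```python
import math

def fit_power_law(radii, forces):
    """Least-squares fit of log F = log G - n * log r. Returns (G, n)."""
    lx = [math.log(r) for r in radii]
    ly = [math.log(f) for f in forces]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    slope = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
             / sum((x - mx) ** 2 for x in lx))
    return math.exp(my - slope * mx), -slope

radii = [2.0, 5.0, 10.0, 20.0]                 # hypothetical probe distances
G, n = fit_power_law(radii, [8.0 / r for r in radii])
```

With noiseless 1/r data the fit returns an exponent of 1 and an amplitude of 8 up to rounding, and feeding it 80/r² data returns an exponent of 2.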

Confirmed: a ∝ 1/r.
observed −5.18. WAY too small. Something is very wrong with my understanding!
artifact of integration accuracy, NOT the true law. Probably because of close encounters causing chaos.
I'll just submit my law. The functional form is solid.







Conceptual Understanding
Trajectory Prediction
Random Seeds


a × r seems roughly constant … not quite 1/r.
Let me try a × r²: … Not consistent.
So at large r, a ~ 1/r. At small r, a ~ 1/r².
Actually note the file path in the error message: 'extra_dimensions.csv'! This is a hint! The world might have an extra compactified dimension…


[The model starts fitting Yukawa potentials, Bessel functions... At some point the fitting tool errors out.]
Carolina Cuesta-Lazaro Flatiron/IAS @ Perimeter 2026










Astrophysics proliferates in Simulation-based Inference
on Simulations
Simulated Data
Observed Data
Alignment Loss
Reconstruction
Statistical Alignment
(OT / Adversarial)


Encoder
Obs
Encoder
Sims
Private Domain Information
Shared Information


Observed Reconstructed
Simulated Reconstructed
Shared Decoder
Shared Decoder
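As one possible reading of the "OT" option for statistical alignment, the sketch below computes the one-dimensional Wasserstein-1 distance between equal-size batches of simulated and observed latent values. In 1D the optimal transport plan simply matches sorted samples. Applying it per latent dimension of the shared encoder output is an assumption made here; the slides do not specify the loss.

```python
def wasserstein_1d(sim_latents, obs_latents):
    """W1 distance between two equal-size 1D samples.

    Sorting both samples gives the optimal one-to-one matching in 1D.
    """
    if len(sim_latents) != len(obs_latents):
        raise ValueError("both samples must have the same size")
    if not sim_latents:
        raise ValueError("samples must not be empty")
    pairs = zip(sorted(sim_latents), sorted(obs_latents))
    return sum(abs(s - o) for s, o in pairs) / len(sim_latents)
```

The loss is zero when the two batches have the same empirical distribution, regardless of ordering, which is the property an alignment term needs for unpaired simulated and observed data.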

A Toy Model Example


Idealized Simulations
Observations
+ Scale Dependent Noise
+ Bump

Amplitude
Tilt
Tilt
Robust SBI from Shared

Visualizing Information Split
Crete-2026
By carol cuesta