Generative Solutions for Cosmic Problems
Flatiron Institute
Carol(ina) Cuesta-Lazaro




1-Dimensional


Machine Learning
Secondary anisotropies
Galaxy formation
Intrinsic alignments



DESI / SPHEREx
Euclid / LSST
SO / CMB-S4
LIGO / Einstein Telescope


The era of Big Data Cosmology
x Astrophysics
HERA / CHIME
SAGA / MaNGA




Galaxy formation
Hosts
Reionization


Cosmic Microwave Background
Galaxies / Dwarfs
21 cm
Galaxy Surveys
Gravitational Lensing
Gravitational Waves
AGN Feedback/Supernovae


Carolina Cuesta-Lazaro - IAS

"Better inference methods = Better Data"

"Cosmology needs Astrophysics
Astrophysics needs Cosmology"

GANs

Deep Belief Networks
2006

VAEs

Normalising Flows

BigGAN

Diffusion Models

2014
2017
2019
2022
A folk music band of anthropomorphic autumn leaves playing bluegrass instruments
Contrastive Learning
2023
Meanwhile, on Earth...
2026
"Write a C compiler"
AGI?
["Genie 2: A large-scale foundation model" Parker-Holder et al (2024)]
Probabilistic ML has made high dimensional inference tractable
1024x1024xTime
["Genie 3: A new frontier for world models" Parker-Holder et al (2025)]
Goal: Estimate unknown p(x1) from samples
Base
Target
Transport Map



Base
Data
"Creating noise from data is easy; creating data from noise is generative modeling."
(Yang Song)
Neural Network





Transport Map
Continuity Equation
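For reference, the continuity equation that ties a velocity field to the evolving density is usually written as below. The symbols p_t, v_t and x_t are notation assumed here, not taken from the slide.

```latex
\frac{\partial p_t(x)}{\partial t} + \nabla \cdot \big( p_t(x)\, v_t(x) \big) = 0,
\qquad
\frac{\mathrm{d}x_t}{\mathrm{d}t} = v_t(x_t), \quad x_0 \sim p_0 .
```

Samples drawn from the base and pushed along dx/dt = v_t(x) then follow p_t at every time, which is what makes v_t a transport map.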
Interpolant
Base
Data
Neural Network
1) Training
2) Inference
Estimated from samples
(Implicit Likelihood)
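The two stages above can be sketched on a 1D toy problem. This is a minimal illustration, not the model used in the talk: it assumes a linear interpolant x_t = (1 - t) x0 + t x1, and it replaces the neural network with a linear least-squares fit per time step (which happens to be exact for a Gaussian base and Gaussian data). All names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumption): base is N(0, 1), "data" is N(3, 0.5^2).
n_samples = 20_000
x0 = rng.normal(0.0, 1.0, n_samples)   # base samples
x1 = rng.normal(3.0, 0.5, n_samples)   # data samples

# 1) Training: regress the interpolant velocity x1 - x0 onto x_t.
# A neural network would normally play this role; a linear fit per
# time step stands in for it here.
n_steps = 50
t_grid = (np.arange(n_steps) + 0.5) / n_steps   # midpoint of each time step
coeffs = []
for t in t_grid:
    x_t = (1.0 - t) * x0 + t * x1               # linear interpolant
    target = x1 - x0                            # time derivative of x_t
    slope, intercept = np.polyfit(x_t, target, deg=1)
    coeffs.append((slope, intercept))

def velocity(x, step):
    """Fitted velocity field at the given time step."""
    slope, intercept = coeffs[step]
    return slope * x + intercept

# 2) Inference: push fresh base samples through dx/dt = v_t(x) with Euler steps.
x = rng.normal(0.0, 1.0, n_samples)
dt = 1.0 / n_steps
for step in range(n_steps):
    x = x + dt * velocity(x, step)

generated_mean, generated_std = x.mean(), x.std()
print(f"generated mean={generated_mean:.2f}, std={generated_std:.2f}")
```

The generated samples should land close to the data distribution (mean near 3, standard deviation near 0.5), even though the density p(x1) was never written down: it is only estimated from samples.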

What is field-level inference?
A digital twin of our Universe

Observed Galaxy Distribution
Simulated Galaxy Distribution

Field Level Inference
Forward Model
(= no Cosmic Variance)




Why field-level inference?
Optimal constraints
Counts-in-cell
Do we really need to infer 10^9 parameters to constrain ~10?

Compression
Marginal Likelihood
Initial Conditions
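One way to write the marginalization over initial conditions, with notation assumed here (theta for the cosmological parameters, delta_IC for the initial conditions, x_obs for the observed field):

```latex
p(\theta \mid x_{\rm obs}) \;\propto\; p(\theta) \int
p(x_{\rm obs} \mid \theta, \delta_{\rm IC})\,
p(\delta_{\rm IC} \mid \theta)\, \mathrm{d}\delta_{\rm IC}
```

Explicit field-level sampling explores the integrand jointly in (theta, delta_IC); a marginal approach targets the left-hand side directly.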
["Simulation-Based Emulators for Galaxy Clustering in the Era of Stage-IV Surveys: I. Two-Point Statistics and Beyond" Paillas et al (incl. Cuesta-Lazaro) 2026]







Reconstructing ALL latent variables:
Dark Matter distribution
Entire formation history
Peculiar velocities
Predictive Cross Validation:
Cross-Correlation with other probes without Cosmic Variance

[Image Credit: Yuuki Omori]
Constraining Inflation:
Inferring primordial non-gaussianity
Why field-level inference?
Data-driven Subgrid models / Data-driven Systematics
"Joint cosmological parameter inference and initial condition reconstruction with Stochastic Interpolants"
Cuesta-Lazaro, Bayer, Albergo et al
NeurIPS ML4PS 2024
Particle Mesh
Dark Matter Only
Gaussian Likelihood
Explicit Sampling vs SBI

1) Likelihood not necessarily Gaussian
2) Forward model need not be differentiable
3) Amortized
Generative Model: Marginalizing over ICs
Generative Model: Fixing ICs
HMC: Marginalizing over ICs

True
Reconstructed


SBI
HMC
Cross Correlation Coefficient
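A minimal sketch of how a cross-correlation coefficient r(k) = P_ab / sqrt(P_aa P_bb) between a true and a reconstructed field can be measured on a periodic grid. The function name, binning, and the random test fields are assumptions for illustration; k is in units of the fundamental frequency of the box.

```python
import numpy as np

def cross_correlation_coefficient(field_a, field_b, n_bins=8):
    """Binned r(k) = P_ab / sqrt(P_aa * P_bb) for two periodic cubic 3D fields."""
    n = field_a.shape[0]
    fa = np.fft.rfftn(field_a)
    fb = np.fft.rfftn(field_b)

    # |k| on the rfftn grid, in units of the fundamental frequency.
    kx = np.fft.fftfreq(n) * n
    kz = np.fft.rfftfreq(n) * n
    k_mag = np.sqrt(kx[:, None, None] ** 2 + kx[None, :, None] ** 2 + kz[None, None, :] ** 2)

    # Spherical shells between the fundamental mode and the Nyquist frequency.
    edges = np.linspace(1.0, n // 2, n_bins + 1)
    bin_index = np.digitize(k_mag.ravel(), edges) - 1
    valid = (bin_index >= 0) & (bin_index < n_bins)

    def binned(spectrum):
        return np.bincount(bin_index[valid], weights=spectrum.ravel()[valid], minlength=n_bins)

    # Normalization factors cancel in the ratio, so plain sums are enough.
    p_ab = binned((fa * np.conj(fb)).real)
    p_aa = binned(np.abs(fa) ** 2)
    p_bb = binned(np.abs(fb) ** 2)
    k_centers = 0.5 * (edges[:-1] + edges[1:])
    return k_centers, p_ab / np.sqrt(p_aa * p_bb)

rng = np.random.default_rng(1)
truth = rng.normal(size=(32, 32, 32))
noisy = truth + 0.5 * rng.normal(size=truth.shape)   # stand-in for a reconstruction
k, r_self = cross_correlation_coefficient(truth, truth)
_, r_noisy = cross_correlation_coefficient(truth, noisy)
```

A field correlated with itself gives r(k) = 1 on all scales; added noise pulls r(k) below 1, so the scale at which r(k) drops indicates where the reconstruction loses phase information.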

Scaling up in volume
Implicit FLI for DESI
DESI Y1 LRG Effective volumes already larger than our sims!
Small Scale Galaxy Bias

Selection
Fibre collisions
Forward Modelling the Survey Systematics



EFT
Galaxy Formation

Adapted from arXiv:1804.03097
Symmetries
Connected to Underlying Physics
Hydro sims
Empirical
Halo Occupation Distribution (HOD)
EFT bias expansion
Matter Density
Galaxy Distribution
Scaling up in Volume
Large Scale

True

Reconstructed




Power Spectrum
Cross Correlation

Peculiar Velocities
True
Reconstructed

Matter Density
Galaxy Distribution
Effective Field Theory
Dimensions + Symmetries
Rotational invariance
(+ Galilean inv)
Equivalence Principle
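For orientation, a commonly quoted form of the bias expansion allowed by these symmetries, to second order in the matter density. Conventions and the operator basis vary between papers; the coefficient names below are assumptions, not taken from the slide.

```latex
\delta_g(\mathbf{x}) = b_1\, \delta
+ \frac{b_2}{2} \left( \delta^2 - \langle \delta^2 \rangle \right)
+ b_{\mathcal{G}_2}\, \mathcal{G}_2
+ b_{\nabla^2}\, \nabla^2 \delta
+ \epsilon ,
\qquad
\mathcal{G}_2 = \left( \partial_i \partial_j \Phi \right)^2 - \left( \nabla^2 \Phi \right)^2 ,
\quad \nabla^2 \Phi = \delta
```

Here epsilon is a stochastic term, and the free coefficients b_i are the parameters on which priors have to be placed.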
"Large Scale Galaxy Bias"
Desjacques, Jeong, Schmidt
Simulation Based Priors

Simulated Galaxies
EFT Field Level Fit
Fit:
?
["Full-shape analysis with simulation-based priors: Constraints on single field inflation from BOSS" Ivanov, Cuesta-Lazaro et al 2025]

40% Improvement!
x2 survey volume
BOSS + Conservative Priors
BOSS + Simulation Based Priors
Simulation Based Priors
Galaxy Formation

Adapted from arXiv:1804.03097
Symmetries
Connected to Underlying Physics
Hydro sims
Empirical
Halo Occupation Distribution (HOD)
EFT bias expansion
Matter Density
Galaxy Distribution
Self-Consistent Predictions across observables

X-Ray
Cluster gas mass fractions
Cluster gas density profiles
Sunyaev-Zeldovich
Galaxy Properties
Thermal: integrated electron pressure (hot electrons / big objects)
Star formation + histories
Stellar mass / halo mass relation
FRBs
Integrated electron density

Kinetic: integrated electron density x peculiar velocity
Multi-wavelength Observables
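The line-of-sight integrals behind these observables, in their standard textbook forms. Notation is assumed here: n_e and T_e are the electron density and temperature, v_r the line-of-sight peculiar velocity, and the sign of the kinetic term depends on the velocity convention.

```latex
y = \frac{\sigma_T}{m_e c^2} \int n_e\, k_B T_e \,\mathrm{d}l
\quad \text{(thermal SZ)},
\qquad
\frac{\Delta T}{T_{\rm CMB}} = -\frac{\sigma_T}{c} \int n_e\, v_r \,\mathrm{d}l
\quad \text{(kinetic SZ)},
\qquad
\mathrm{DM} = \int \frac{n_e}{1+z} \,\mathrm{d}l
\quad \text{(FRB dispersion measure)}
```

All three probe the same electron density field with different weightings, which is why a single gas model has to predict them consistently.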
["BaryonBridge: Interpolants models for fast hydrodynamical simulations" Horowitz, Cuesta-Lazaro, Yehia ML4Astro workshop 2025]
Particle Mesh for Gravity
CAMELS Volumes
1000 boxes with varying cosmology and feedback models

Gas Properties

Current model optimised for Lyman Alpha forest
7 GPU minutes for a 50 Mpc simulation
130 million CPU core hours for TNG50
Density
Temperature
Galaxy Distribution

Field-Level Emulators: Hydro at Scale


["BaryonBridge: Interpolants models for fast hydrodynamical simulations" Horowitz, Cuesta-Lazaro, Yehia ML4Astro workshop 2025]
Variations in Subgrid Physics
Volume Upscaling

[Video credit: Francisco Villaescusa-Navarro]
Gas density
Gas temperature
Subgrid model 1
Subgrid model 2
Subgrid model 3
Subgrid model 4
Can we learn a general and continuous representation of Baryonic feedback?

Gas
Galaxies




Dark Matter
Baryonic fields
Marginalize over a broader set of subgrid physics
Interpolate between simulators
Mingshau Liu
(Ming)

Constrain z via multi-wavelength observations

Trained on:
TNG, SIMBA, Astrid, EAGLE
Encoder



1) Encoder

Gas
Galaxies




Dark Matter
Baryonic fields
2) Probabilistic Decoder



Dark Matter
Baryonic fields
(Test suite)

Gas Density
Temperature
Astrid
EAGLE
Interpolating over Simulations
Generalizing to unseen simulations: Magneticum



BEFORE
Artificial General Intelligence?
AFTER



https://parti.research.google
A portrait photo of a kangaroo wearing an orange hoodie and blue sunglasses standing on the grass in front of the Sydney Opera House holding a sign on the chest that says Welcome Friends!



Reinforcement Learning
Bag of verifiable tasks
Policy (LLM)
Verifiable Reward
Expected Returns
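In symbols, the quantity being maximized can be written as below. This is the generic policy-gradient objective, with notation assumed here: q is a task drawn from the bag D, o an output sampled from the policy pi_theta, and r(q, o) the verifiable reward.

```latex
J(\theta) = \mathbb{E}_{q \sim \mathcal{D},\; o \sim \pi_\theta(\cdot \mid q)}
\big[ r(q, o) \big],
\qquad
\nabla_\theta J(\theta) = \mathbb{E}
\big[ r(q, o)\, \nabla_\theta \log \pi_\theta(o \mid q) \big]
```

Specific algorithms add baselines, clipping, or regularization on top of this basic form.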
["DeepSeek-R1" Guo et al 2025 arXiv:2501.12948]
Coding competitions
What should we be thinking about?
Should Academia give up on training LLMs?
Should we design our own RL environments?
Should we think about the most ambitious projects we could tackle with a "country of geniuses in a data center"?
A radical change to how we work or just highlighting what was obviously wrong?
["DESI 2024 VI: Cosmological Constraints from the Measurements of Baryon Acoustic Oscillations" arXiv:2404.03002]

Dark Energy is constant over time
["An LLM-driven framework for cosmological model-building and exploration" Mudur, Cuesta-Lazaro, Toomey]

Can LLMs help us explore the space of hypotheses?
Propose a model for Dark Energy
Implement it in a Cosmology simulation code: CLASS
Test fit to DESI Observations
Iterate to improve fit
Quintessence, DE/DM interactions....
Must pass a set of general tests for "reasonable" models
Ideally, compare evidence to LCDM.
For now, Bayesian Information Criterion (BIC)
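A minimal sketch of the BIC comparison, assuming a Gaussian likelihood so that -2 ln L_max equals the best-fit chi-squared up to a model-independent constant. The function name and all numbers are hypothetical and are not results from the talk.

```python
import math

def bic(chi2_min, n_params, n_data):
    """BIC = k ln(n) - 2 ln(L_max), with -2 ln(L_max) taken as chi2_min (Gaussian assumption)."""
    return n_params * math.log(n_data) + chi2_min

# Hypothetical numbers, for illustration only.
bic_lcdm = bic(chi2_min=20.0, n_params=2, n_data=12)
bic_candidate = bic(chi2_min=14.0, n_params=4, n_data=12)

# Negative values favour the candidate model over LCDM.
delta_bic = bic_candidate - bic_lcdm
print(f"Delta BIC = {delta_bic:.2f}")
```

The k ln(n) term penalizes extra parameters, so a proposed dark energy model has to improve the fit by more than its added complexity costs.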
1
2

Nayantara Mudur (Harvard)
Can LLMs implement new physics models?
Thawing Quintessence
Axion-like Early Dark Energy
Ultra-light scalar field that temporarily acts as dark energy in the early universe
Implementation Challenge:
Dynamic dark energy model: scalar field transitions from "frozen" (cosmological constant-like) to evolving as the universe expands.
Oscillatory behaviour
Can take advantage of existing scalar field implementations in CLASS
+ 43,000 lines of C code
+ 10,000 lines of numerical files
CLASS Challenge:

1) Code compiles + passes unit tests (reasonable observables, numerical convergence...)
2) Implementation agrees with target repository
3) Goodness of fit for DESI + Supernovae
4) H0 tension metrics
Curated
One-page description of the model to be implemented, CLASS tips + very explicit units
Paper
Directly from a full paper
If it fails, get feedback from another LLM

IAS-2026
By carol cuesta