["Genie 2: A large-scale foundation model" Parker-Holder et al (2024)]
["Generative AI for designing and validating easily synthesizable and structurally novel antibiotics" Swanson et al]
Probabilistic ML has made high-dimensional inference tractable
1024x1024xTime
["Genie 3: A new frontier for world models" Parker-Holder et al (2025)]
1-Dimensional
Machine Learning
Secondary anisotropies
Galaxy formation
Intrinsic alignments
DESI / SPHEREx / HETDEX
Euclid / LSST
SO / CMB-S4
LIGO / Einstein Telescope
xAstrophysics
HERA / CHIME
SAGA / MaNGA
Galaxy formation
Emitters Census
Reionization
Cosmic Microwave Background
Galaxies / Dwarfs
21 cm
Galaxy Surveys
Gravitational Lensing
Gravitational Waves
AGN Feedback/Supernovae
Field-Level Inference and Emulators
Robust Simulation-based inference
Generating Fields
Generating Representations
Disentangling systematics from physics latent spaces
A digital twin of our Universe
Observed Galaxy Distribution
Simulated Galaxy Distribution
Field Level Inference
Forward Model
(= no Cosmic Variance)
Optimal constraints
Counts-in-cells
Do we really need to infer 10^9 parameters to constrain ~10?
Compression
Marginal Likelihood
Explicit Likelihood
Implicit Likelihood
Initial Conditions
Maximize the likelihood of the training samples
Parametric Model
Training Samples
Trained Model
Evaluate probabilities
Low Probability
High Probability
Generate Novel Samples
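The training recipe sketched on this slide (maximize the likelihood of training samples, then evaluate probabilities and generate novel samples) can be caricatured in a few lines of numpy. The 1D Gaussian model is an illustrative stand-in, not any model from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training samples from an unknown data distribution (here a hidden 1D Gaussian).
samples = rng.normal(loc=2.0, scale=0.5, size=10_000)

# "Training" = maximum likelihood; for a Gaussian model the MLE is closed-form.
mu_hat = samples.mean()
sigma_hat = samples.std()

def log_prob(x):
    """Evaluate the trained model's log-probability at a point."""
    return -0.5 * ((x - mu_hat) / sigma_hat) ** 2 - np.log(sigma_hat * np.sqrt(2.0 * np.pi))

# Points near the data are high probability, outliers low:
# log_prob(2.0) >> log_prob(10.0).

# Generate novel samples from the trained model.
novel = rng.normal(mu_hat, sigma_hat, size=5)
```

Deep generative models replace the closed-form MLE with gradient descent on the same objective, but the three capabilities (train, evaluate, sample) are identical.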
Simulator
Generative Model
Fast emulators
Inference
Generative Model
Simulator
Base
Data
"Creating noise from data is easy;
creating data from noise is generative modeling."
Yang Song
Neural Network
6 seconds / sim vs 40 million CPU hours
Fast Emulation
Density Fields
arXiv:2405.05255
Point Clouds
arXiv:2311.17141
1) Sampling the Neural Likelihood (NLE) with HMC
2) Directly learning an optimal compression: Neural Posterior Estimation (NPE)
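The NPE route can be caricatured in a linear-Gaussian toy problem, where the amortized posterior has a closed form. Everything here is an illustrative assumption; linear regression stands in for the neural density estimator:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy simulator: x = theta + noise, with prior theta ~ N(0, 1).
n_sims = 50_000
theta = rng.normal(0.0, 1.0, n_sims)
x = theta + rng.normal(0.0, 0.5, n_sims)

# NPE: learn an amortized map from data to posterior, reusable for any x_obs.
# Since this toy model is linear-Gaussian, regressing theta on x is the exact
# analogue of training a neural posterior on simulated pairs (theta, x).
slope = np.cov(theta, x)[0, 1] / np.var(x)

def posterior_mean(x_obs):
    return slope * x_obs

# Analytic check: the true posterior mean is x_obs / (1 + 0.5**2) = 0.8 * x_obs.
```

NLE would instead fit p(x | theta) from the same pairs and then run HMC (or another sampler) over theta at inference time.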
Learned Likelihood
CNN
Diffusion
Increasing Noise
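The "increasing noise" direction of a diffusion model is just a fixed corruption schedule; a minimal sketch of the variance-preserving forward process (toy 1D data, assumed schedule values):

```python
import numpy as np

rng = np.random.default_rng(2)

# Forward diffusion: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,
# with the cumulative signal fraction alpha_bar_t decreasing toward zero.
x0 = rng.normal(3.0, 0.1, 10_000)          # toy "data" distribution
alpha_bar = np.linspace(0.999, 1e-4, 10)   # illustrative noise schedule

noised = [np.sqrt(a) * x0 + np.sqrt(1.0 - a) * rng.normal(size=x0.size)
          for a in alpha_bar]

# Early steps stay close to the data; the final step is nearly pure N(0, 1)
# noise, which is exactly what the learned reverse process starts from.
```

The network is then trained to invert each small noising step, which is the "creating data from noise" half of Song's quote.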
["Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo" Mudur, Cuesta-Lazaro and Finkbeiner NeurIPs 2023 ML for the physical sciences, arXiv:2405.05255]
Nayantara Mudur
Posterior (NPE)
Likelihood (NLE)
Learning the marginal likelihood is more robust
Learned Likelihood
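The HMC half of Diffusion-HMC can be sketched with a stand-in target; here an analytic unit Gaussian replaces the learned diffusion likelihood and its gradient, which is the only non-self-contained piece:

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in target. In Diffusion-HMC, log_p and grad_log_p would instead be
# evaluated through the trained diffusion model.
def log_p(q):
    return -0.5 * q**2

def grad_log_p(q):
    return -q

def leapfrog(q, p, step, n):
    """Symplectic integrator for the Hamiltonian dynamics."""
    p = p + 0.5 * step * grad_log_p(q)
    for i in range(n):
        q = q + step * p
        if i < n - 1:
            p = p + step * grad_log_p(q)
    p = p + 0.5 * step * grad_log_p(q)
    return q, p

def hmc_step(q, step=0.2, n=10):
    p0 = rng.normal()                      # resample momentum
    q1, p1 = leapfrog(q, p0, step, n)
    h0 = -log_p(q) + 0.5 * p0**2           # initial Hamiltonian
    h1 = -log_p(q1) + 0.5 * p1**2          # proposed Hamiltonian
    return q1 if rng.random() < np.exp(h0 - h1) else q  # Metropolis correction

chain = [0.0]
for _ in range(5000):
    chain.append(hmc_step(chain[-1]))
chain = np.array(chain[1000:])             # discard burn-in
```

With a well-mixed chain, the samples recover the target's mean and variance, which is the consistency check the real pipeline also relies on.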
Reconstructing ALL latent variables:
Dark Matter distribution
Entire formation history
Peculiar velocities
Predictive Cross Validation:
Cross-Correlation with other probes without Cosmic Variance
[Image Credit: Yuuki Omori]
Constraining Inflation:
Inferring primordial non-gaussianity
Data-driven Subgrid models / Data-driven Systematics
"Joint cosmological parameter inference and initial condition reconstruction with Stochastic Interpolants"
Cuesta-Lazaro, Bayer, Albergo et al
NeurIPS ML4PS 2024 Spotlight talk
Particle Mesh
Dark Matter Only
Gaussian Likelihood
1) Likelihood need not be Gaussian
2) Forward model need not be differentiable
3) Amortized
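The interpolant construction behind this approach fits in a few lines. A toy 1D version (Gaussian base and target, least squares standing in for the neural velocity model; all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# Stochastic interpolant between base ("noise") x0 and target ("data") x1:
#   x_t = (1 - t) * x0 + t * x1
# A velocity model b(t, x) is trained to regress (x1 - x0) given (t, x_t);
# integrating dx/dt = b(t, x) from t = 0 to 1 then maps base samples into data.
n = 100_000
x0 = rng.normal(0.0, 1.0, n)   # base samples
x1 = rng.normal(5.0, 0.5, n)   # stand-in "data" samples

t = 0.5
xt = (1.0 - t) * x0 + t * x1
velocity_target = x1 - x0

# For Gaussian endpoints the optimal b is linear in x_t, so plain least
# squares recovers the exact conditional expectation E[x1 - x0 | x_t].
A = np.vstack([xt, np.ones(n)]).T
coef, intercept = np.linalg.lstsq(A, velocity_target, rcond=None)[0]
```

In the cosmological setting x0 and x1 are full fields and b is a neural network, but the training target is the same regression.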
Generative Model: Marginalizing over ICs
Generative Model: Fixing ICs
HMC: Marginalizing over ICs
True
Reconstructed
Initial Conditions
Finals
HMC (ICs)
SBI (ICs)
SBI (Finals)
HMC (Finals)
HMC (ICs)
SBI (ICs)
SBI (Finals)
HMC (Finals)
HMC (ICs)
SBI (ICs)
SBI (Finals)
HMC (Finals)
SBI
HMC
Scaling up in volume
DESI Y1 LRG Effective volumes already larger than our sims!
Small Scale Galaxy Bias
How galaxies are selected
Fibre collisions
Forward Modelling the Survey Systematics
EFT
Self-Consistent Predictions across observables
arXiv:1804.03097
X-Ray
Cluster gas mass fractions
Cluster gas density profiles
Sunyaev-Zeldovich
Galaxy Properties
Thermal: integrated electron pressure (hot electrons / massive objects)
Star formation + histories
Stellar mass / halo mass relation
FRBs
Integrated electron density
Kinetic: integrated electron density × peculiar velocity
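In standard notation, the thermal and kinetic SZ observables listed above are line-of-sight integrals over the free-electron gas:

```latex
y_{\mathrm{tSZ}} = \frac{\sigma_T}{m_e c^2} \int P_e \,\mathrm{d}l ,
\qquad
\frac{\Delta T_{\mathrm{kSZ}}}{T_{\mathrm{CMB}}} = -\frac{\sigma_T}{c} \int n_e \, v_{\mathrm{los}} \,\mathrm{d}l
```

Here $P_e$ is the electron pressure, $n_e$ the electron density, and $v_{\mathrm{los}}$ the line-of-sight peculiar velocity, which is why the two effects probe complementary baryonic fields.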
["BaryonBridge: Interpolants models for fast hydrodynamical simulations" Horowitz, Cuesta-Lazaro, Yehia ML4Astro workshop 2025]
Particle Mesh for Gravity
CAMELS Volumes
1000 boxes with varying cosmology and feedback models
Gas Properties
Current model optimised for Lyman Alpha forest
7 GPU minutes for a 50 Mpc simulation
130 million CPU core hours for TNG50
Density
Temperature
Galaxy Distribution
[Video credit: Francisco Villaescusa-Navarro]
Gas density
Gas temperature
Subgrid model 1
Subgrid model 2
Subgrid model 3
Subgrid model 4
Gas
Galaxies
Dark Matter
Baryonic fields
Marginalize over a broader set of subgrid physics
Interpolate between simulators
Mingshau Liu
(Ming)
Constrain z via multi-wavelength observations
Trained on:
TNG, SIMBA, Astrid, EAGLE
Encoder
1) Encoder
Gas
Galaxies
Dark Matter
Baryonic fields
2) Probabilistic Decoder
Dark Matter
Baryonic fields
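The encoder / probabilistic-decoder split can be sketched with linear maps standing in for the networks; because the decoder returns a distribution per pixel rather than a point estimate, sampling it marginalizes over unresolved subgrid physics. All shapes and names here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(5)

latent_dim, field_dim = 4, 64
W_enc = rng.normal(size=(latent_dim, field_dim)) / np.sqrt(field_dim)
W_mu = rng.normal(size=(field_dim, latent_dim)) / np.sqrt(latent_dim)
W_sig = rng.normal(size=(field_dim, latent_dim)) / np.sqrt(latent_dim)

def encode(dm_field):
    """1) Encoder: compress the dark-matter field into a latent code."""
    return W_enc @ dm_field

def decode(z):
    """2) Probabilistic decoder: per-pixel mean and (positive) scatter."""
    return W_mu @ z, np.exp(0.1 * (W_sig @ z))

dm_field = rng.normal(size=field_dim)
z = encode(dm_field)
mu, sigma = decode(z)

# Each draw is one plausible baryonic field consistent with the dark matter.
baryon_sample = mu + sigma * rng.normal(size=field_dim)
```

Training on TNG, SIMBA, Astrid and EAGLE simultaneously is what lets the scatter absorb the spread between subgrid models.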
(Test suite)
Gas Density
Temperature
Astrid
EAGLE
Simulated Data
Observed Data
Alignment Loss
Reconstruction
Statistical Alignment
(OT / Adversarial)
Encoder
Obs
Encoder
Sims
Private Domain Information
Shared Information
Observed Reconstructed
Simulated Reconstructed
Shared Decoder
Shared Decoder
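The combined objective (per-domain reconstruction plus statistical alignment of the shared latents) can be sketched with an RBF-kernel MMD standing in for the OT / adversarial term named on the slide; all distributions here are toy stand-ins:

```python
import numpy as np

rng = np.random.default_rng(6)

def mmd(a, b, bw=1.0):
    """RBF-kernel maximum mean discrepancy between two 1D samples."""
    def k(x, y):
        return np.exp(-(x[:, None] - y[None, :]) ** 2 / (2.0 * bw**2)).mean()
    return k(a, a) + k(b, b) - 2.0 * k(a, b)

z_sim = rng.normal(0.0, 1.0, 500)  # shared latents from the simulation encoder
z_obs = rng.normal(0.8, 1.0, 500)  # shared latents from the observation encoder

recon_loss = 0.0                   # placeholder for the per-domain reconstruction terms
total_loss = recon_loss + mmd(z_sim, z_obs)
```

Driving the MMD term to zero forces the two encoders to map simulations and observations onto the same shared latent distribution, while the private latents keep each domain's instrument-specific information.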
Idealized Simulations
Observations
+ Scale Dependent Noise
+ Bump
Amplitude
Tilt
Tilt
arXiv:2503.15312
Pablo Mercader
Daniel Muthukrishna
Jeroen Audenaert
Legacy Survey
HSC
DESI
SDSS
Same Object / Different Instrument
Different Object / Same Instrument
Object 1
Object 2
Object 1
Orientation + Scale
Number
Instrument 1
Instrument 1
Instrument 2
Instrument Encoder
Object Encoder
Instrument Pair
Object Pair
Instrument Pair
Object Pair
Ground Truth
Instrument Pair
Object Pair
Recon
1. Cosmological field level inference can be made scalable with generative models
Can EFT help us scale in volume?
2. We can scale hydrodynamical simulations in volume for the analysis of LSS surveys
Can we leverage multi-wavelength observations?
Is resolution too low?
Can generally make simulators more controllable!
3. Playing with the latent space will help us learn robustly
Private-Shared Information Split
Disentangling systematics
Observation
Question
Hypothesis
Testable Predictions
Gather data
Alter, Expand, Reject Hypothesis
Develop General Theories
[Figure adapted from ArchonMagnus]
Simulators as theory models
High-dimensional data
["An LLM-driven framework for cosmological
model-building and exploration" Mudur, Cuesta-Lazaro, Toomey (in prep)]
Propose a model for Dark Energy
Implement it in a Cosmology simulation code: CLASS
Test fit to DESI Observations
Iterate to improve fit
Quintessence, DE/DM interactions....
Must pass a set of general tests for "reasonable" models
Ideally, compare evidence to LCDM.
For now, the Bayesian Information Criterion (BIC)
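For reference, the criterion used here penalizes the number of model parameters $k$ at sample size $n$:

```latex
\mathrm{BIC} = k \ln n - 2 \ln \hat{L}
```

where $\hat{L}$ is the maximized likelihood; lower BIC is preferred, so $\Delta\mathrm{BIC}$ against $\Lambda$CDM serves as a rough stand-in for the full Bayesian evidence comparison.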
1
2
Nayantara Mudur (Harvard)
Thawing Quintessence
Axion-like Early Dark Energy
Ultra-light scalar field that temporarily acts as dark energy in the early universe
Implementation Challenge:
Dynamic dark energy model: scalar field transitions from "frozen" (cosmological constant-like) to evolving as the universe expands.
Oscillatory behaviour
Can take advantage of existing scalar field implementations in CLASS
+ 43,000 lines of C code
+ 10,000 lines of numerical files
CLASS Challenge:
1) Code compiles + obtains reasonable observables
2) Implementation agrees with target repository
3) Goodness of fit for DESI + Supernovae
4) H0 tension metrics
Curated
One-page description of the model to be implemented, CLASS tips + very explicit units
Paper
Directly from a full paper
If fails, get feedback from another LLM
Shortcut: field that produces this?
Asked for physical motivation. It tried :(
Not true, preferred scale
Reinforcement Learning
Update the base model weights to optimize a scalar reward (s)
DeepSeek R1
Base LLM
(being updated)
What rewards are more advantageous?
Base LLM
(frozen)
Develop basic skills: numerics, theoretical physics, UNIT CONVERSION
Community Effort!
Evolutionary algorithms
Learning in natural language, reflect on traces and results
Examples: EvoPrompt, FunSearch, AlphaEvolve
["GEPA: Reflective prompt evolution can outperform reinforcement learning" Agrawal et al]
GEPA: Evolutionary
GRPO: RL
+10% improvement over RL with 35× fewer rollouts
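The evolutionary-search loop shared by EvoPrompt, FunSearch and GEPA reduces to select-mutate-rescore; a minimal sketch where bit-strings stand in for prompts and a counting reward stands in for the fit quality (for GEPA the mutation operator is an LLM reflecting on execution traces):

```python
import random

random.seed(0)

def score(candidate):
    # Hypothetical reward: number of 1-bits ("fit quality" stand-in).
    return sum(candidate)

def mutate(candidate):
    # Flip one random bit; in GEPA this is an LLM-proposed prompt edit.
    i = random.randrange(len(candidate))
    child = list(candidate)
    child[i] ^= 1
    return child

# Population of candidate "prompts".
population = [[0] * 16 for _ in range(8)]

for _ in range(200):
    population.sort(key=score, reverse=True)
    parents = population[:4]                                    # selection (elitist)
    population = parents + [mutate(random.choice(parents)) for _ in range(4)]

best = max(population, key=score)
```

Elitism guarantees the best candidate never regresses, which is why such loops can outperform noisy policy-gradient updates at a fraction of the rollout budget.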
Scientific reasoning with LLMs still in its infancy!