Bridging Baryons and Dark Matter
A Machine Learning perspective

Carolina Cuesta-Lazaro
IAIFI Fellow, MIT / Center for Astrophysics





1-Dimensional



Machine Learning
Secondary anisotropies
Galaxy formation
Intrinsic alignments



DESI, DESI-II, Spec-S5
Euclid / LSST
Simons Observatory
CMB-S4
LIGO
Einstein


The era of Big Data Cosmology
xAstrophysics
5-Dimensional
Carolina Cuesta-Lazaro IAIFI/MIT @ Princeton 2024










Astrophysics dominates Simulation-based Inference




~ Gpc
pc
kpc
Mpc
Gpc
[Video credit: Francisco Villaescusa-Navarro]
Gas density
Gas temperature
Effective Field Theories: Robust. Conservative assumptions about galaxy formation. Large Scales.
Probabilistic Debiasing: Robust? Hydro sims assumptions. All Scales.
Learning to represent feedback: Robust? Generalizable. All Scales.

Mikhail Ivanov
Robust galaxy bias model: Effective Field Theories
+ Simulation as priors


Field-level EFT
["Full-shape analysis with simulation-based priors: constraints on single field inflation from BOSS" Ivanov, Cuesta-Lazaro et al arXiv:2402.13310]

Andrej Obuljen
Michael Toomey


["The Millennium and Astrid galaxies in effective field theory: comparison with galaxy-halo connection models at the field level" Ivanov, Cuesta-Lazaro et al arXiv:2412.01888]

["Full-shape analysis with simulation-based priors: cosmological parameters and the structure growth anomaly" Ivanov, Obuljen, Cuesta-Lazaro, Toomey arXiv:2409.10609]

1 to Many:
Galaxies
Dark Matter

["Debiasing with Diffusion: Probabilistic reconstruction of Dark Matter fields from galaxies" Ono et al (including Cuesta-Lazaro), NeurIPS 2024 ML for the Physical Sciences, arXiv:2403.10648]

Victoria Ono
Core F. Park
Probabilistic Debiasing



[Figure: Truth, Sampled, and Observed fields, with the Power Spectrum and the Cross correlation shown as functions of Scale (k), labelled from Small to Large]
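The two summary statistics named on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the pipeline used in the cited work: the function names, the binning choice, and the toy Gaussian fields are all assumptions introduced here. The cross-correlation coefficient is taken to be r(k) = P_ab(k) / sqrt(P_aa(k) P_bb(k)), which is the common convention.

```python
import numpy as np

def binned_cross_power(field_a, field_b, box_size, n_bins=8):
    """Spherically averaged cross power spectrum of two cubic 3D fields.

    Passing the same field twice gives the auto power spectrum.
    Modes are taken from the real-FFT half-space, so this is a sketch
    rather than a carefully weighted estimator.
    """
    n = field_a.shape[0]
    fa = np.fft.rfftn(field_a)
    fb = np.fft.rfftn(field_b)
    # Per-mode cross power, P(k) = V <a_k b_k^*>, with the FFT normalised by N^3.
    power = (fa * np.conj(fb)).real * box_size**3 / n**6

    # Wavenumber magnitude of every mode on the rfftn grid.
    k_axis = 2 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    k_half = 2 * np.pi * np.fft.rfftfreq(n, d=box_size / n)
    k_mag = np.sqrt(
        k_axis[:, None, None] ** 2
        + k_axis[None, :, None] ** 2
        + k_half[None, None, :] ** 2
    )

    # Linear bins from half the fundamental mode up to the Nyquist frequency.
    k_fund = 2 * np.pi / box_size
    k_nyq = np.pi * n / box_size
    edges = np.linspace(0.5 * k_fund, k_nyq, n_bins + 1)
    counts, _ = np.histogram(k_mag, bins=edges)
    sums, _ = np.histogram(k_mag, bins=edges, weights=power)
    k_centres = 0.5 * (edges[1:] + edges[:-1])
    return k_centres, sums / np.maximum(counts, 1)

def cross_correlation(field_a, field_b, box_size, n_bins=8):
    """Cross-correlation coefficient r(k) = P_ab / sqrt(P_aa * P_bb)."""
    k, p_ab = binned_cross_power(field_a, field_b, box_size, n_bins)
    _, p_aa = binned_cross_power(field_a, field_a, box_size, n_bins)
    _, p_bb = binned_cross_power(field_b, field_b, box_size, n_bins)
    return k, p_ab / np.sqrt(p_aa * p_bb)

# Toy usage: a hypothetical "truth" field and a noisy "sampled" version of it.
rng = np.random.default_rng(0)
truth = rng.normal(size=(16, 16, 16))
sampled = truth + 0.5 * rng.normal(size=truth.shape)
k, r = cross_correlation(truth, sampled, box_size=100.0)
```

A value of r(k) close to 1 means the two fields agree in phase at that scale, independently of their amplitudes, which is why it complements the power spectrum when comparing a sampled field with the truth.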

[Figure: In-Distribution and Out-of-Distribution panels, labelled from Small to Large]
TNG-300
True DM
Sample DM




Size of training simulation
2) Generalising to larger volumes
Model trained on Astrid subgrid model
1) Generalising across subgrid models

["3D Reconstruction of Dark Matter Fields with Diffusion Models: Towards Application to Galaxy Surveys" Park, Mudur, Cuesta-Lazaro et al ICML 2024 AI for Science]
Posterior Sample
Posterior Mean
Debiasing Cosmic Flows
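The distinction between a posterior sample and the posterior mean can be illustrated with a toy example. The sketch below assumes that draws from a trained generative model are available as an array with the sample index on the first axis; here they are faked as truth plus independent noise, and the function name is hypothetical.

```python
import numpy as np

def summarise_posterior(samples):
    """Voxel-wise mean and standard deviation over samples (axis 0)."""
    samples = np.asarray(samples)
    return samples.mean(axis=0), samples.std(axis=0, ddof=1)

rng = np.random.default_rng(1)
truth = rng.normal(size=(8, 8, 8))
# Stand-in for posterior draws: truth plus independent unit-variance noise.
samples = truth + rng.normal(size=(16, 8, 8, 8))
post_mean, post_std = summarise_posterior(samples)
```

In this toy setup the mean lies closer to the truth than any single draw because the noise averages out. The same averaging smooths away small-scale structure that differs between draws, so individual samples, rather than the mean, are the objects expected to carry realistic small-scale power.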
Representation Learning
Informative abstractions of the data



Transfer learning beyond LCDM
Cosmic web Anomaly Detection
Representing baryonic feedback

Representation Learning à la gradient descent
Contrastive
Generative
inductive biases
from scratch or from partial observations



Students at MIT are
OVER-CAFFEINATED
NERDS
SMART
ATHLETIC


Simulator 1
Simulator 2


Dark Matter
Feedback
i) Contrastive
Learning the feedback manifold
Baryonic fields
ii) Generative
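As a rough illustration of the contrastive option, the sketch below implements the widely used InfoNCE objective in NumPy. The slide does not specify the loss or how positive pairs are built, so both are assumptions here: row i of `z_a` and row i of `z_b` are taken to be embeddings of two fields that share the same feedback (for example, produced by two different simulators), and all other rows in the batch act as negatives.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE loss for paired embeddings; row i of z_a matches row i of z_b."""
    # Normalise so that the dot product is a cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive pair for row i sits on the diagonal.
    return -np.mean(np.diag(log_prob))

# Toy usage with random embeddings (hypothetical batch of 8, dimension 4).
rng = np.random.default_rng(2)
z = rng.normal(size=(8, 4))
loss_aligned = info_nce_loss(z, z)
loss_shuffled = info_nce_loss(z, np.roll(z, 1, axis=0))
```

Minimising this loss pulls embeddings of matched pairs together and pushes mismatched ones apart, so the quantity shared within each pair is what the representation is encouraged to encode.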


Baryonic fields
Dark Matter
Generative model
Total matter, gas temperature, gas metallicity





Encoder



Multi-wavelength observations

X-Ray
Gas mass fractions
Gas density profiles

Sunyaev-Zeldovich
Thermal: Integrated electron pressure (hot electrons)
Kinetic: Integrated electron density x peculiar velocity

FRBs
Integrated electron density

Galaxy Properties
Star formation + histories
Stellar mass / halo mass relation
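For reference, the line-of-sight integrals behind the Sunyaev-Zeldovich and FRB probes can be written in their standard textbook form. The symbols are not defined on the slide and follow common convention: n_e is the free electron density, T_e the electron temperature, P_e the electron pressure, v the peculiar velocity, n-hat the line-of-sight direction, tau the optical depth, and the kinetic SZ sign corresponds to v · n-hat being positive for gas moving away from the observer.

```latex
\begin{align}
  y &= \frac{\sigma_T}{m_e c^2} \int P_e \,\mathrm{d}l,
      \qquad P_e = n_e k_B T_e
      && \text{(thermal SZ, Compton-$y$)} \\
  \frac{\Delta T_{\mathrm{kSZ}}}{T_{\mathrm{CMB}}}
    &= -\frac{\sigma_T}{c} \int n_e \, e^{-\tau} \,
       (\mathbf{v} \cdot \hat{\mathbf{n}}) \,\mathrm{d}l
      && \text{(kinetic SZ)} \\
  \mathrm{DM} &= \int \frac{n_e}{1+z} \,\mathrm{d}l
      && \text{(FRB dispersion measure)}
\end{align}
```

The thermal effect weights electrons by temperature, whereas the kinetic effect and the dispersion measure are sensitive to all free electrons along the line of sight.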
Conclusions

1. Effective Field Theories of galaxy clustering benefit from simulation-based priors without compromising robustness
2. Probabilistic debiasing can robustly map the dark matter distribution
3. A general representation for baryonic feedback may inform galaxy formation modelling
Carolina Cuesta-Lazaro IAIFI/MIT @ IPMU 2024

Can we use hydro sims directly?
Requires forward modelling survey systematics at the field level

Can we constrain it through multi-wavelength observations?