Annotated guide to "Beyond the Observable: A Machine Learning perspective on modern Cosmology"
Carolina Cuesta-Lazaro
IAIFI Fellow, MIT
Art: "Drawing Hands" by M.C. Escher

Start with the crime scene!
Think about other people in the field working on similar things.
How is your voice different?
Does your talk reflect this in every part, even introduction?




1-Dimensional



Machine Learning
Secondary anisotropies
Galaxy formation
Intrinsic alignments



DESI / SPHEREx / HETDEX
Euclid / LSST
SO / CMB-S4
LIGO / Einstein


The era of Big Data Cosmology
xAstrophysics
HERA / CHIME
SAGA / MaNGA




Galaxy formation
Emitters Census
Reionization


Cosmic Microwave Background
Galaxies / Dwarfs
21 cm
Galaxy Surveys
Gravitational Lensing
Gravitational Waves
Carolina Cuesta-Lazaro IAIFI/MIT @ NYU 2025
AGN Feedback/Supernovae
It's not a review talk, bring it back to who you are
How is your voice different?
Does your talk reflect this in every part, even introduction?
The talk should be focused topically, but you can find a sneaky way to let them know that you have broad interests without actually going into them
Why Now?
Beyond tools
Optimisation
Neural representations
Baryonification

Inflation

Symmetry-preserving ML

Early Universe - JWST

Simulation Based Inference
Epidemiological simulations


Medical Imaging
Natural Language Processing

Exoplanets
Compute
Simulations
Data
ML
Statistics
Physics
What is dark matter made of?
What is driving the accelerated expansion?
How did the Universe begin?
A new way of thinking
about
physical systems

Good Visualizations of Why Now help

GANs

Deep Belief Networks
2006

VAEs

Normalising Flows

BigGAN

Diffusion Models

2014
2017
2019
2022
A folk music band of anthropomorphic autumn leaves playing bluegrass instruments
Contrastive Learning
2023
Meanwhile, on Earth...
["Genie 2: A large-scale foundation model" Parker-Holder et al (2024)]

["Generative AI for designing and validating easily synthesizable and structurally novel antibiotics" Swanson et al]
Probabilistic ML has made high dimensional inference tractable
1024x1024xTime
["Genie 3: A new frontier for world models" Parker-Holder et al (2025)]










Astrophysics proliferates Simulation-based Inference
on Simulations
You'll have to be a bit cringe...

["A point cloud approach to generative modeling for galaxy surveys at the field level"
Cuesta-Lazaro and Mishra-Sharma
International Conference on Machine Learning ICML AI4Astro 2023, Spotlight talk, arXiv:2311.17141]
Base Distribution
Target Distribution
Simulated Galaxy 3D Map
Prompt:




Prompt: A person half Yoda half Gandalf
Make sure people learn something interesting today; they're looking for signals of whether you will be a good teacher.
Some people say you have to say something that sounds overly complicated and technical so that people know you are smart. That sounds very silly to me. It's more impressive if you can take something complicated and make it seem easy. That's the kind of person I want to work with!
Generative Models 101
Maximize the likelihood of the training samples
Parametric Model


Training Samples
Trained Model

Evaluate probabilities


Low Probability
High Probability

Generate Novel Samples
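The three steps on this slide (maximize the likelihood of the training samples, evaluate probabilities, generate novel samples) can be sketched with the simplest parametric model, a one-dimensional Gaussian. This is a toy illustration rather than the models used in the talk: the function names, the toy data, and the choice of a Gaussian are all assumptions made for the example.

```python
import numpy as np


def fit_gaussian_mle(samples):
    """Closed-form maximum-likelihood fit of a 1D Gaussian (hypothetical helper)."""
    mu = samples.mean()
    sigma = samples.std()  # ddof=0 is the maximum-likelihood estimate
    return mu, sigma


def log_prob(x, mu, sigma):
    """Log-density of the fitted Gaussian at x."""
    return -0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2


def sample(mu, sigma, n, rng):
    """Draw n novel samples from the fitted model."""
    return mu + sigma * rng.standard_normal(n)


rng = np.random.default_rng(0)

# Toy "training samples": assumed data, standing in for simulations.
training_samples = rng.normal(loc=2.0, scale=0.5, size=5000)

# 1. Maximize the likelihood of the training samples.
mu, sigma = fit_gaussian_mle(training_samples)

# 2. Evaluate probabilities: one point near the bulk, one far away.
print("log p(x=2.0):", log_prob(2.0, mu, sigma))
print("log p(x=5.0):", log_prob(5.0, mu, sigma))

# 3. Generate novel samples.
novel_samples = sample(mu, sigma, 10, rng)
print("novel samples:", novel_samples)
```

A point far from the training data receives a lower log-probability than one near the bulk, and sampling from the fitted model produces points that were never in the training set. Deep generative models follow the same pattern, with a far more flexible parametric family in place of the Gaussian.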


Simulator
Generative Model
Fast emulators
Testing Theories
Generative Model
Simulator
Generative Models: Simulate and Analyze




Reverse diffusion: Denoise previous step
Forward diffusion: Add Gaussian noise (fixed)
Prompt: A person half Yoda half Gandalf
Diffusion model
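The forward (fixed) half of a diffusion model can be sketched in a few lines. The example below assumes a DDPM-style linear noise schedule, which the slides do not specify; `make_alpha_bar`, `forward_diffuse`, and `denoise_estimate` are hypothetical names. In a trained model the noise prediction comes from a neural network; here the true noise stands in for it, so the last step only shows how a noise prediction maps back to a clean sample. It is a one-shot estimate, not the step-by-step reverse sampler.

```python
import numpy as np


def make_alpha_bar(num_steps, beta_min=1e-4, beta_max=0.02):
    """Cumulative signal fraction for an assumed linear noise schedule."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    return np.cumprod(1.0 - betas)


def forward_diffuse(x0, t, alpha_bar, rng):
    """Forward diffusion: jump from clean x0 to noisy x_t in closed form."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise


def denoise_estimate(xt, t, alpha_bar, predicted_noise):
    """Estimate the clean sample from x_t given a prediction of the added noise."""
    return (xt - np.sqrt(1.0 - alpha_bar[t]) * predicted_noise) / np.sqrt(alpha_bar[t])


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar(num_steps=1000)

# Toy "data": 1000 points in 3D, standing in for a galaxy point cloud.
x0 = rng.uniform(-1.0, 1.0, size=(1000, 3))

t = 500
xt, noise = forward_diffuse(x0, t, alpha_bar, rng)

# With a perfect noise prediction the clean sample is recovered exactly.
x0_recovered = denoise_estimate(xt, t, alpha_bar, predicted_noise=noise)
print("max reconstruction error:", np.abs(x0_recovered - x0).max())
print("signal fraction left at final step:", alpha_bar[-1])
```

By the final step almost no signal is left, so the noised data is close to the Gaussian base distribution; generation runs the process in reverse, starting from pure noise.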
Summarize the context of why you think what you've done is cool. It might not be obvious for people outside of your subfield
6 seconds / sim vs 40 million CPU hours
Fast Emulation:
Parameter constraints:
Generative Models: Simulate and Analyze





Diffusion
Pair Counting

Carol's optimistic forecast
High dimensional inference

Alternative Clustering Methods
Sampling jointly with theory parameters






Constrained Simulations for Galaxy Surveys
100M dimensions
Reconstructing ALL latent variables:
Dark Matter distribution
Entire formation history
Peculiar velocities
Interpretability:
Cross-Correlation with other probes

[Image Credit: Yuuki Omori]
Constraining Inflation:
Inferring primordial non-Gaussianity
Make sure you leave space for all the cool stuff you want to do in the future, and that the connection to who you are and what you've done in the past is obvious

Simulations
Observations
Guided by observational constraints
Robust Inference
Generative Models:
Beyond Simulation Emulation
Part 1
What is driving the accelerated expansion?
Reconstructing latent features:
Dark matter, ICs...
Part 2
How did the Universe begin?
What is dark matter made of?
Anomaly Detection for new physics searches
Baryonic feedback
Hybrid simulators
Part 3
Future Directions
Breaking LCDM
Predictive hydro sims
If it makes sense, tell them why you think you'll love it there
Beyond tools
Compute
Simulations
Data
ML
Statistics
Physics
Use-inspired AI developments
The future of Astrophysics
A new way of thinking
about
physical systems

Eric Vanden-Eijnden
Kyunghyun Cho
Mehryar Mohri

Yann LeCun
Rob Fergus
CCPP
Denis Zorin
Roman Scoccimarro
Jeremy Tinker
Mike O'Neil
Anthony Pullen
Leslie Greengard
David W. Hogg
Georg Stadler
Ken Van Tilburg
Neal Weiner
Olivier Pauluis
Glennys R. Farrar
Edwin P. Berger
Yacine Ali-Haïmoud

Practice with people who have a critical mindset but whom you trust, ideally also with people who are not very familiar with your research or field.
Advice-NYU-2025
By Carol Cuesta