Build Big Meets Build Smart To Explore The Universe

Flatiron Institute

Institute for Advanced Study

Carolina Cuesta-Lazaro

Carolina Cuesta-Lazaro Flatiron/IAS - Build Big or Build Smart

Observation

Question

Hypothesis

Testable Predictions

Gather data

Alter, Expand, Reject Hypothesis

Develop General Theories

[Figure adapted from ArchonMagnus] 

High-dimensional data

Simulators as theory models

The Scientific Method in 2025

The Universe accelerates!

The Universe expands, it should decelerate

What is the ultimate fate of the Universe?

Need a repulsive dark energy component

Measure supernovae redshifts

Matter domination -> the Universe decelerates, but at what rate?

Distance-redshift relation via standard candles
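A minimal sketch of the distance-redshift test, assuming a flat universe and an illustrative H0 of 70 km/s/Mpc (both assumptions, not values from the talk). It compares the distance modulus of a standard candle in a matter-only universe with one that includes a dark energy component; the function names are hypothetical.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def luminosity_distance(z, omega_m, omega_lambda, h0=70.0, n_steps=2001):
    """Luminosity distance in Mpc, assuming a flat universe (omega_m + omega_lambda = 1)."""
    zs = np.linspace(0.0, z, n_steps)
    # Dimensionless expansion rate E(z) = H(z) / H0 for matter plus a constant dark energy term.
    e_of_z = np.sqrt(omega_m * (1.0 + zs) ** 3 + omega_lambda)
    integrand = 1.0 / e_of_z
    # Trapezoid rule on the uniform grid, written out explicitly.
    dz = zs[1] - zs[0]
    integral = dz * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return (1.0 + z) * (C_KM_S / h0) * integral


def distance_modulus(d_l_mpc):
    """Distance modulus mu = 5 log10(d_L / 10 pc), with d_L given in Mpc."""
    return 5.0 * np.log10(d_l_mpc) + 25.0


z = 0.5
mu_matter = distance_modulus(luminosity_distance(z, omega_m=1.0, omega_lambda=0.0))
mu_lambda = distance_modulus(luminosity_distance(z, omega_m=0.3, omega_lambda=0.7))
delta_mu = mu_lambda - mu_matter  # positive: candles look fainter with dark energy
print(f"mu(matter only) = {mu_matter:.3f}, mu(with dark energy) = {mu_lambda:.3f}")
```

A positive `delta_mu` means the same supernova appears fainter at a given redshift than a decelerating, matter-only universe predicts, which is the sense of the observation on this slide.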

Simulations?

"Semantic" lower dimensional representation

Foundation Models for Science

["On the Opportunities and Risks of Foundation Models" Bommasani et al]

Simulations in Foundation Models for Science

x^\mathcal{O}
x^\mathcal{S}

Simulated Data

Observed Data

z^\mathcal{O}_p
z^\mathcal{O}_s
z^\mathcal{S}_s
z^\mathcal{S}_p

Alignment Loss

\mathcal{L} = -\sum_{\mathcal{D} \in \{\mathcal{S}, \mathcal{O}\}} \log p(x^\mathcal{D} \mid z^\mathcal{D}_s, z^\mathcal{D}_p) + \lambda \, d(z^\mathcal{O}_s, z^\mathcal{S}_s)
E^\mathcal{O}
E^\mathcal{S}

Reconstruction

Alignment

50\%

(OT / Adversarial)

\hat{x}^\mathcal{O}
\hat{x}^\mathcal{S}
D

Shared Decoder

D

Observed Reconstructed

Simulated Reconstructed
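A rough sketch of how the loss on this slide could be assembled, reading the reconstruction term as a negative log-likelihood. Everything here is an assumption for illustration: the encoders and shared decoder are random linear maps, the likelihood is a unit-variance Gaussian, and the distance d is a simple mean-matching penalty standing in for the OT or adversarial alignment named on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim_x, dim_s, dim_p = 64, 16, 3, 2  # batch size, data dim, shared dim, private dim

# Hypothetical linear encoders E^S, E^O and a single shared decoder D.
enc_sim = rng.normal(size=(dim_x, dim_s + dim_p)) * 0.1
enc_obs = rng.normal(size=(dim_x, dim_s + dim_p)) * 0.1
dec = rng.normal(size=(dim_s + dim_p, dim_x)) * 0.1


def encode(x, weights):
    """Encode x and split the latent into (shared z_s, private z_p)."""
    z = x @ weights
    return z[:, :dim_s], z[:, dim_s:]


def decode(z_s, z_p):
    """Shared decoder acting on the concatenated shared and private latents."""
    return np.concatenate([z_s, z_p], axis=1) @ dec


def reconstruction_nll(x, x_hat):
    """Unit-variance Gaussian negative log-likelihood, up to an additive constant."""
    return 0.5 * np.mean(np.sum((x - x_hat) ** 2, axis=1))


def mean_matching_distance(z_a, z_b):
    """Stand-in for d(.,.): squared distance between the batch means of two latents."""
    return float(np.sum((z_a.mean(axis=0) - z_b.mean(axis=0)) ** 2))


def total_loss(x_sim, x_obs, lam=1.0):
    zs_sim, zp_sim = encode(x_sim, enc_sim)
    zs_obs, zp_obs = encode(x_obs, enc_obs)
    recon = (reconstruction_nll(x_sim, decode(zs_sim, zp_sim))
             + reconstruction_nll(x_obs, decode(zs_obs, zp_obs)))
    # Only the shared latents are aligned; private latents are left free.
    align = mean_matching_distance(zs_obs, zs_sim)
    return recon + lam * align, recon, align


x_sim = rng.normal(size=(n, dim_x))
x_obs = x_sim + 0.5  # toy "observations": simulations with a constant offset
loss, recon, align = total_loss(x_sim, x_obs, lam=10.0)
print(f"total = {loss:.3f}, reconstruction = {recon:.3f}, alignment = {align:.3f}")
```

The design point the sketch tries to capture is that alignment is applied only to the shared latents, so features present only in observations can be absorbed by the private latents rather than distorting the shared ones.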

A Toy Model Example

Idealized Simulations

Observations

+ Scale Dependent Noise

+ Bump

x^\mathcal{O}
x^\mathcal{S}
["Disentangling Foundation Models for Science: Robust Integration of Simulated and Observed Data" Cuesta-Lazaro, Alvarez-Melis (in-prep)]
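One way such a toy setup could be generated, as a sketch only: idealized simulations are taken to be power laws with an amplitude and a tilt, and observations add a localized bump plus noise that grows toward small scales. The functional forms, parameter values, and function names are all assumptions, not the setup used in the cited work.

```python
import numpy as np


def idealized_spectrum(k, amplitude, tilt):
    """Idealized simulation: a pure power law in wavenumber k."""
    return amplitude * k ** tilt


def observed_spectrum(k, amplitude, tilt, rng, noise_level=0.05,
                      bump_height=0.3, bump_center=1.0, bump_width=0.2):
    """Toy observation: power law times a bump, with scale-dependent noise."""
    clean = idealized_spectrum(k, amplitude, tilt)
    # Multiplicative Gaussian bump in log k (a hypothetical unmodelled feature).
    bump = 1.0 + bump_height * np.exp(
        -0.5 * ((np.log(k) - np.log(bump_center)) / bump_width) ** 2
    )
    # Fractional noise whose standard deviation grows linearly with k.
    sigma = noise_level * (k / k.max())
    noise = rng.normal(size=k.shape) * sigma
    return clean * bump * (1.0 + noise)


rng = np.random.default_rng(1)
k = np.logspace(-2, 1, 128)
x_sim = idealized_spectrum(k, amplitude=2.0, tilt=-1.5)
x_obs = observed_spectrum(k, amplitude=2.0, tilt=-1.5, rng=rng)
ratio = x_obs / x_sim  # deviates from 1 near the bump and at high k
```

In this sketch the amplitude and tilt are the information shared between simulations and observations, while the bump and the noise are observation-only features.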

Amplitude

Tilt

Tilt

p(\theta|z^\mathcal{O}_s)
p(\theta|z^\mathcal{O}_p)
p(\theta|z^\mathcal{O}_p,z^\mathcal{O}_s)
p(\theta|z^\mathcal{O}_p)

Robust SBI from Shared

p(x^\mathcal{O}|z^\mathcal{O}_p,z^\mathcal{O}_s)
p(x^\mathcal{O}|z^\mathcal{O}_s)

Visualizing Information Split

["Disentangling Foundation Models for Science: Robust Integration of Simulated and Observed Data" Cuesta-Lazaro, Alvarez-Melis (in-prep)]

Late Universe

Early Universe

Tension

From Tensions to Discoveries:  Anomalies in Cosmology

Early vs Late

Parametric Extensions

[Image Credit: Prof. Wendy Freedman]

 

Looking for what we don't know to look for

The missing pieces: Beyond parametric searches

Axion Dark Matter

Dark Matter - Baryon Interactions

Primordial Non-Gaussianity

Early Dark Energy

Dark Radiation

[Credit: Sandbox Studio]

 


Learning Structured Representations with Flow Matching

["CosmoFlow: Scale-Aware Representation Learning for Cosmology with Flow Matching" Kannan, Qiu, Cuesta-Lazaro, Jeong]

Sid Kannan (UCSB)

Autoregressive in Frequency

["CosmoFlow: Scale-Aware Representation Learning for Cosmology with Flow Matching" Kannan, Qiu, Cuesta-Lazaro, Jeong]

["Detecting Model Misspecification in Cosmology with Scale-Dependent Normalizing Flows" Akhmetzhanova, Cuesta-Lazaro, Mishra-Sharma]

[Figure: Base, OOD Mock 1, and OOD Mock 2 compared at large and small scales. Panels: Parameter Inference Bias (Supervised) and OOD Metric (Unsupervised).]

Aizhan Akhmetzhanova (Harvard)

Observation

Question

Hypothesis

Testable Predictions

Gather data

Alter, Expand, Reject Hypothesis

Develop General Theories

[Figure adapted from ArchonMagnus] 

Simulators as theory models

The Scientific Method in 2025

High-dimensional data

["An LLM-driven framework for cosmological model-building and exploration" Mudur, Cuesta-Lazaro, Toomey (in prep)]

Can LLMs turn these anomalies into new hypotheses?

Propose a model for Dark Energy

Implement it in a Cosmology simulation code: CLASS

Test fit to DESI Observations

Iterate to improve fit

Quintessence, DE/DM interactions, ...

Must pass a set of general tests for "reasonable" models

Ideally, compare evidence to LCDM.

For now, Bayesian Information Criterion (BIC)
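For reference, a small sketch of a BIC comparison, assuming a Gaussian likelihood so that -2 ln L_max equals the minimum chi-squared up to a constant shared by both models. The chi-squared values and parameter counts below are illustrative placeholders, not DESI results, and the function names are hypothetical.

```python
import numpy as np


def bic(chi2_min, n_params, n_data):
    """BIC = k ln(n) - 2 ln(L_max), with -2 ln(L_max) replaced by chi2_min (Gaussian case)."""
    return n_params * np.log(n_data) + chi2_min


def delta_bic(chi2_model, k_model, chi2_ref, k_ref, n_data):
    """BIC of a candidate model minus BIC of a reference model (e.g. LCDM)."""
    return bic(chi2_model, k_model, n_data) - bic(chi2_ref, k_ref, n_data)


# Illustrative placeholder numbers: the candidate fits better but has two more parameters.
d = delta_bic(chi2_model=95.0, k_model=8, chi2_ref=100.0, k_ref=6, n_data=100)
print(f"Delta BIC = {d:.2f} (negative would favour the candidate model)")
```

A lower BIC is preferred, so a candidate model has to improve the fit by more than ln(n) per extra parameter to come out ahead of the reference.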


Nayantara Mudur (Harvard)

Can LLMs implement new physics models?

Thawing Quintessence

Axion-like Early Dark Energy

Ultra-light scalar field that temporarily acts as dark energy in the early universe 

Implementation Challenge:

Dynamic dark energy model: the scalar field transitions from "frozen" (cosmological constant-like) to evolving as the universe expands.

Oscillatory behaviour

Can take advantage of existing scalar field implementations in CLASS

+ 43,000 lines of C code

+ 10,000 lines of numerical files

CLASS Challenge:

1) Code compiles + obtains reasonable observables

2) Implementation agrees with target repository

3) Goodness of fit for DESI + Supernovae

4) H0 tension metrics
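The slide does not say which tension metrics are used. One simple and common choice, sketched here as an assumption, is the Gaussian tension between two independent H0 measurements, expressed in units of sigma. The input values are illustrative placeholders of roughly the size quoted for early- and late-universe measurements.

```python
import numpy as np


def gaussian_tension(mean_a, sigma_a, mean_b, sigma_b):
    """Difference between two independent Gaussian measurements in units of sigma."""
    return abs(mean_a - mean_b) / np.sqrt(sigma_a ** 2 + sigma_b ** 2)


# Illustrative placeholder values in km/s/Mpc (assumed, not taken from the talk).
h0_early, sigma_early = 67.4, 0.5
h0_late, sigma_late = 73.0, 1.0

tension = gaussian_tension(h0_early, sigma_early, h0_late, sigma_late)
print(f"H0 tension: {tension:.1f} sigma")
```

This metric assumes Gaussian, uncorrelated posteriors; a model that shifts or broadens the inferred H0 would lower it.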

Curated

One-page description of the model to be implemented, CLASS tips, and very explicit units

Paper

Directly from a full paper

If it fails, get feedback from another LLM

Propose a Dark Energy Model

Shortcut: field that produces this?

Propose a Dark Energy Model

Asked for physical motivation. It tried :( 

Not true, preferred scale

Reinforcement Learning

How to iterate

Update the base model weights to optimize scalar reward(s)

DeepSeek R1

Base LLM

(being updated)

What rewards are more advantageous?

Base LLM

(frozen)

Develop basic skills: numerics, theoretical physics, UNIT CONVERSION

Community Effort!

Evolutionary algorithms

Learning in natural language, reflect on traces and results

Examples: EvoPrompt, FunSearch, AlphaEvolve
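The general shape of an evolutionary loop can be sketched as follows. This is a generic elitist search on a toy problem, not the algorithm of any of the systems named above: in an LLM setting the hypothetical `mutate` step would be a model proposing a revised prompt or program after reflecting on traces, and `score` would be the downstream evaluation.

```python
import random


def evolve(score, mutate, seed_candidates, n_generations=30,
           population_size=8, n_children=4, rng=None):
    """Generic elitist evolutionary search: keep the best candidates, mutate, repeat."""
    rng = rng or random.Random(0)
    population = list(seed_candidates)
    for _ in range(n_generations):
        # Selection: keep the highest-scoring candidates as parents.
        parents = sorted(population, key=score, reverse=True)[:population_size]
        # Variation: each child is a mutated copy of a randomly chosen parent.
        children = [mutate(rng.choice(parents), rng) for _ in range(n_children)]
        # Parents survive alongside children, so the best score never decreases.
        population = parents + children
    return max(population, key=score)


# Toy stand-in problem: a candidate is a single number and the optimum is at 3.
def toy_score(w):
    return -(w - 3.0) ** 2


def toy_mutate(w, rng):
    return w + rng.gauss(0.0, 0.5)


best = evolve(toy_score, toy_mutate, seed_candidates=[0.0])
print(f"best candidate: {best:.3f}, score: {toy_score(best):.4f}")
```

Unlike reinforcement learning, nothing in this loop updates model weights; all of the learning is carried by the surviving candidates.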

How to iterate

["GEPA: Reflective prompt evolution can outperform reinforcement learning" Agrawal et al]

GEPA: Evolutionary

GRPO: RL

+10% improvement over RL with 35x fewer rollouts

Scientific reasoning with LLMs still in its infancy!

Observation

Question

Hypothesis

Testable Predictions

Gather data

Alter, Expand, Reject Hypothesis

Develop General Theories

[Figure adapted from ArchonMagnus] 

The Scientific Method in > 2025

MIA-Big or Smart-2025

By carol cuesta
