Generative Solutions for Cosmic Problems

Flatiron Institute

Institute for Advanced Study

Carol(ina) Cuesta-Lazaro

p(\mathrm{World}|\mathrm{Prompt})
["Genie 2: A large-scale foundation model" Parker-Holder et al (2024)]
p(\mathrm{Drug}|\mathrm{Properties})
["Generative AI for designing and validating easily synthesizable and structurally novel antibiotics" Swanson et al]

Probabilistic ML has made high dimensional inference tractable

1024x1024xTime

["Genie 3: A new frontier for world models" Parker-Holder et al (2025)]

Carolina Cuesta-Lazaro Flatiron/IAS - TriState

Model Misspecification

Shared Information

Private Information

- Shared + Private

Simulation-Based Inference in Cosmology

\delta_\mathrm{Obs}
\delta_\mathrm{ICs}

True

Reconstructed

p(\delta_\mathrm{ICs}, \theta|\delta_\mathrm{Obs})

Idealized Simulations

Observations

+ Scale Dependent Noise

+ Bump

x^\mathcal{O}
x^\mathcal{S}

Representation Learning

Physics

Systematics

Amplitude

Tilt

Tilt

p(\theta|z^\mathcal{O}_s)
p(\theta|z^\mathcal{O}_p,z^\mathcal{O}_s)
p(\theta|z^\mathcal{O}_p)

- Shared

- Private

[arXiv:2503.15312]

BEFORE

Artificial General Intelligence?

AFTER

[https://metr.org/blog/2025-07-14-how-does-time-horizon-vary-across-domains/]

Observation

Question

Hypothesis

Testable Predictions

Gather data

Alter, Expand, Reject Hypothesis

Develop General Theories

[Figure adapted from ArchonMagnus] 

High-dimensional data

Simulators as theory models

The Scientific Method in 2025

The Universe accelerates!

The Universe expands, it should decelerate

What is the ultimate fate of the Universe?

Need a repulsive dark energy component

Measure supernovae redshifts

Matter domination -> the Universe decelerates: rate?

Distance-redshift relation via standard candles
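The distance-redshift test above can be sketched numerically. This is a minimal illustration, assuming a flat universe with only matter and a constant-w dark energy component; the function names, the default H0 = 70 km/s/Mpc, and the matter densities are hypothetical choices for illustration, not values from the talk.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def hubble_rate_ratio(z, omega_m, w=-1.0):
    """E(z) = H(z)/H0 for a flat universe: matter plus constant-w dark energy."""
    omega_de = 1.0 - omega_m  # flatness assumption
    return math.sqrt(omega_m * (1.0 + z) ** 3
                     + omega_de * (1.0 + z) ** (3.0 * (1.0 + w)))


def luminosity_distance_mpc(z, omega_m=0.3, h0=70.0, w=-1.0, n_steps=2000):
    """D_L = (1+z) * (c/H0) * integral_0^z dz'/E(z'), via the trapezoid rule."""
    dz = z / n_steps
    total = 0.5 * (1.0 / hubble_rate_ratio(0.0, omega_m, w)
                   + 1.0 / hubble_rate_ratio(z, omega_m, w))
    for i in range(1, n_steps):
        total += 1.0 / hubble_rate_ratio(i * dz, omega_m, w)
    comoving = C_KM_S / h0 * total * dz
    return (1.0 + z) * comoving


def distance_modulus(d_l_mpc):
    """mu = 5 log10(D_L / 10 pc), with D_L given in Mpc."""
    return 5.0 * math.log10(d_l_mpc) + 25.0
```

At fixed redshift, a universe with a dark energy component gives a larger luminosity distance than a matter-only one, so standard candles appear fainter than the decelerating prediction.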

\Lambda \mathrm{CDM}
["DESI 2024 VI: Cosmological Constraints from the Measurements of Baryon Acoustic Oscillations" arXiv:2404.03002]

Dark Energy is constant over time

w(z) = \frac{p_\mathrm{DE}}{\rho_\mathrm{DE}} = w_0 + \frac{z}{1+z}w_a
2 - 4 \sigma
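A short sketch of the w0-wa parametrization above may help make it concrete. The density evolution follows from integrating the continuity equation for this w(z); the function names and the example values (w0 = -0.7, wa = -1.0) are illustrative assumptions, not DESI results.

```python
import math


def w_cpl(z, w0=-1.0, wa=0.0):
    """Equation of state w(z) = w0 + wa * z / (1 + z)."""
    return w0 + wa * z / (1.0 + z)


def de_density_ratio(z, w0=-1.0, wa=0.0):
    """rho_DE(z) / rho_DE(0) obtained from d ln(rho) / d ln(1+z) = 3 (1 + w)."""
    return ((1.0 + z) ** (3.0 * (1.0 + w0 + wa))
            * math.exp(-3.0 * wa * z / (1.0 + z)))


def phantom_crossing_redshift(w0, wa):
    """Redshift where w(z) = -1, or None if there is no crossing at z > 0."""
    if wa == 0.0:
        return None
    x = (-1.0 - w0) / wa  # x = z / (1 + z), must lie in (0, 1)
    if not 0.0 < x < 1.0:
        return None
    return x / (1.0 - x)


# Illustrative values only: w crosses -1 near z = 0.43 for this choice.
z_cross = phantom_crossing_redshift(-0.7, -1.0)
```

Setting w0 = -1 and wa = 0 recovers a cosmological constant, for which the density ratio is 1 at every redshift.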

DM-DE Interactions

[arXiv:2503.14743]

Phantom Crossing

\rho + p \geq 0

Violates Null Energy Condition

[arXiv:2503.16415]
w_\mathrm{eff} = \frac{w_\phi}{1 + \left[ \frac{A(\phi)}{A(\phi_0)} -1 \right] \frac{\rho^0_\mathrm{DM}}{a^3 \rho_\phi}}
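The effective equation of state above is simple arithmetic, sketched here to show how an apparent phantom crossing can arise even when the field itself has w_phi > -1. All argument names and numerical values are hypothetical illustrations, with densities in arbitrary units.

```python
def w_effective(w_phi, mass_ratio, rho_dm0, rho_phi, a):
    """w_eff = w_phi / (1 + [A(phi)/A(phi_0) - 1] * rho_dm0 / (a^3 * rho_phi)).

    mass_ratio stands for A(phi)/A(phi_0), the dark matter mass relative to today.
    """
    denominator = 1.0 + (mass_ratio - 1.0) * rho_dm0 / (a ** 3 * rho_phi)
    return w_phi / denominator


# Illustrative numbers only: a 5% change in the dark matter mass at a = 0.5.
example = w_effective(w_phi=-0.9, mass_ratio=0.95, rho_dm0=0.3, rho_phi=0.7, a=0.5)
```

When the bracket is negative the denominator drops below one, so w_eff is more negative than w_phi and can fall below -1 without the field itself violating the null energy condition.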

Change in dark matter mass

["An LLM-driven framework for cosmological model-building and exploration" Mudur, Cuesta-Lazaro, Toomey (in prep)]

Can LLMs help us explore the space of hypotheses?

Propose a model for Dark Energy

Implement it in a Cosmology simulation code: CLASS

Test fit to DESI Observations

Iterate to improve fit

Quintessence, DE/DM interactions...

Must pass a set of general tests for "reasonable" models

Ideally, compare Bayesian evidence to ΛCDM.

For now, Bayesian Information Criterion (BIC)
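As a rough sketch of the BIC comparison, assuming a Gaussian likelihood so that the minimum chi-squared equals -2 ln L_max up to an additive constant. The function names and the numbers in the example are hypothetical, not results from the talk.

```python
import math


def bic(chi2_min, n_params, n_data):
    """BIC = k * ln(n) - 2 * ln(L_max), with -2 ln(L_max) taken as chi2_min."""
    return chi2_min + n_params * math.log(n_data)


def delta_bic(chi2_model, k_model, chi2_ref, k_ref, n_data):
    """BIC(model) - BIC(reference); negative values favour the model."""
    return bic(chi2_model, k_model, n_data) - bic(chi2_ref, k_ref, n_data)


# Hypothetical example: two extra parameters buy a chi-squared improvement of 5.
example_delta = delta_bic(chi2_model=95.0, k_model=8,
                          chi2_ref=100.0, k_ref=6, n_data=50)
```

In this made-up case the improvement in fit does not offset the penalty for the two extra parameters, so the difference comes out positive and the reference model is preferred.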

Nayantara Mudur (Harvard)

Can LLMs implement new physics models?

Thawing Quintessence

Axion-like Early Dark Energy

Ultra-light scalar field that temporarily acts as dark energy in the early universe 

Implementation Challenge:

Dynamic dark energy model: scalar field transitions from "frozen" (cosmological constant-like) to evolving as the universe expands.

Oscillatory behaviour

Can take advantage of existing scalar field implementations in CLASS

CLASS Challenge:

+ 43,000 lines of C code

+ 10,000 lines of numerical files

1) Code compiles + passes unit tests (reasonable observables, numerical convergence...)

2) Implementation agrees with target repository

3) Goodness of fit for DESI + Supernovae

4) H0 tension metrics

Curated

One-page description of the model to be implemented, CLASS tips + very explicit units

Paper

Directly from a full paper

If it fails, get feedback from another LLM

Propose a Dark Energy Model

Shortcut: field that produces this?

Propose a Dark Energy Model

Asked for physical motivation. It tried :( 

Not true, preferred scale

Reinforcement Learning

How to iterate

Update the base model weights to optimize scalar reward(s)

DeepSeek R1

Base LLM (being updated)

Base LLM (frozen)
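One concrete way a scalar reward turns into a weight update is the group-relative advantage used in GRPO, the algorithm behind DeepSeek R1. The sketch below covers only that normalization step; the function name and the reward values are hypothetical, and the population standard deviation is an assumption since implementations differ.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the rewards of rollouts sampled for the same prompt.

    Each rollout's advantage is its reward minus the group mean, divided by
    the group standard deviation (population std here, as an assumption).
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four rollouts of one prompt (1 = passed, 0 = failed).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Rollouts that beat the group average get positive advantages and are reinforced. In the full algorithm the frozen copy of the model typically serves as the reference for a KL penalty that keeps the updated model close to where it started.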

Develop basic skills: numerics, theoretical physics, experimentation...

Community Effort!

Learning to play "Scientist"

\mathcal{L}(q_\mathrm{obs},q_\mathrm{latent},p_\mathrm{obs},p_\mathrm{latent})

1. Design next Experiment

2. Hypothesize Equation of motion

3. Simulate and Compare

p(\mathrm{World})
p(\mathrm{Prompt}|\mathrm{World})

Evolutionary algorithms

Learning in natural language, reflect on traces and results

Examples: EvoPrompt, FunSearch, AlphaEvolve

How to iterate

["GEPA: Reflective prompt evolution can outperform reinforcement learning" Agrawal et al]

GEPA: Evolutionary

GRPO: RL

+10% improvement over RL with 35x fewer rollouts

Scientific reasoning with LLMs is still in its infancy!

Conclusions

1. LLMs are improving on most subjects at an insane rate, including maths

What problems in physics can we tackle with automated code generation?

Can generally make simulators more controllable!

Artificial Muses

2. How do we improve their physics reasoning skills?

RL over simulated worlds

Science is not so amenable to a "scalar reward" setup

"Play" is important

3. Science is ultimately a human endeavor: which questions are interesting to answer and may be solvable is up to us. What role can LLMs play in science?

CCA-TriState-2025

By carol cuesta
