Understanding Generative Models
via Interactions
Claudia Merger, Alexandre Rene, Kirsten Fischer, Peter Bouss, Sandra Nestler, David Dahmen, Carsten Honerkamp, Moritz Helias and Sebastian Goldt
13.03.2026

Generative models learn data statistics
Example use cases:
- image/video/audio/text generators
- physical observables (replace costly scientific simulations)
- foundation models for drug discovery
Task: Given some data \( \mathcal{D} \) from an unknown distribution \( p \)
Generate \( x \sim p \)
Task is solved by learning \( \, p_{\theta} \approx p\)
e.g. by minimizing the negative log-likelihood \( \mathcal{L}\left(\mathcal{D}\right) =-\sum_{x \in \mathcal{D}} \ln p_{\theta}(x) \)
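A minimal sketch of this objective, assuming a one-dimensional Gaussian model \( p_{\theta} \) with \( \theta = (\mu, \sigma) \) and synthetic data standing in for \( \mathcal{D} \). The function name and all numerical values are illustrative and not taken from the talk.

```python
import numpy as np

def negative_log_likelihood(data, mu, sigma):
    """L(D) = -sum_x ln p_theta(x) for a 1D Gaussian with theta = (mu, sigma)."""
    log_p = -0.5 * np.log(2 * np.pi * sigma**2) - (data - mu) ** 2 / (2 * sigma**2)
    return -np.sum(log_p)

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=2.0, size=1000)  # synthetic stand-in for D

# For a Gaussian, the minimizer is known in closed form:
# the sample mean and the (biased, ddof=0) sample standard deviation.
mu_hat, sigma_hat = data.mean(), data.std()

nll_fit = negative_log_likelihood(data, mu_hat, sigma_hat)
nll_shifted = negative_log_likelihood(data, mu_hat + 0.5, sigma_hat)
nll_wider = negative_log_likelihood(data, mu_hat, 1.1 * sigma_hat)

print(nll_fit < nll_shifted, nll_fit < nll_wider)
```

Any parameter setting other than the closed-form fit gives a larger value of \( \mathcal{L}(\mathcal{D}) \), which is what "learning \( p_{\theta} \approx p \)" means for this objective.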
Understanding Generative models
Two questions:
- What can we learn from \( p_{\theta} \) about the data?
- How close are \( p \) and \( p_{\theta} \)?
Span model space with interactions.
[Figure: \( p \) and \( p_{\theta} \) in model space]
Write an interacting theory using a polynomial action \( S_{\theta} (x) = \ln p_{\theta} (x)\) (repeated indices are summed over):
\( S_{\theta} (x)= A^{(0)} + A^{(1)}_{i} x_i + A^{(2)}_{ij} x_i x_j +A^{(3)}_{ijk} x_i x_j x_k + \dots \)
Interactions are effective descriptions of complex systems
Merger, C., et al. ‘Learning Interacting Theories from Data’. PRX, 2023
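As an illustration of the polynomial action, the sketch below evaluates \( S_{\theta}(x) \) truncated at third order with `np.einsum`. It checks the result against a zero-mean Gaussian, for which only \( A^{(0)} \) and \( A^{(2)} = -\tfrac{1}{2}\Sigma^{-1} \) are nonzero. The function name, the dimension and the covariance are assumptions made for this example. This is not the method used in the cited paper to extract the coefficients from a trained model.

```python
import numpy as np

def action(x, A0, A1, A2, A3):
    """S(x) = A0 + A1_i x_i + A2_ij x_i x_j + A3_ijk x_i x_j x_k (third-order truncation)."""
    return (
        A0
        + np.einsum("i,i->", A1, x)
        + np.einsum("ij,i,j->", A2, x, x)
        + np.einsum("ijk,i,j,k->", A3, x, x, x)
    )

d = 3
rng = np.random.default_rng(0)
Sigma = np.diag([1.0, 2.0, 4.0])  # illustrative covariance

# Zero-mean Gaussian: ln p(x) = -1/2 x^T Sigma^{-1} x - 1/2 ln det(2 pi Sigma)
A0 = -0.5 * np.linalg.slogdet(2 * np.pi * Sigma)[1]
A1 = np.zeros(d)
A2 = -0.5 * np.linalg.inv(Sigma)
A3 = np.zeros((d, d, d))

x = rng.normal(size=d)
S = action(x, A0, A1, A2, A3)

# Direct evaluation of the Gaussian log-density for comparison
log_p = -0.5 * x @ np.linalg.inv(Sigma) @ x - 0.5 * np.log(np.linalg.det(2 * np.pi * Sigma))
print(np.isclose(S, log_p))
```

Non-Gaussian structure in the data would show up as nonzero coefficients \( A^{(k)} \) with \( k \geq 3 \).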
Why Interactions?
\( S_{\theta} (x)= A^{(0)} + A^{(1)}_{i} x_i + A^{(2)}_{ij} x_i x_j +A^{(3)}_{ijk} x_i x_j x_k + \dots \)
Why use interactions to study deep learning?
Observation: neural networks learn "easy" statistics first, then more complex statistics
\( \rightarrow \) see also: Ingrosso & Goldt, 2022; Refinetti et al., 2023; Belrose et al., 2024, ...
\( \rightarrow \) interactions give a principled way to study how statistics are learned from data, from easy to hard


Predict performance of diffusion models as a function of \( \# \text{training examples} \)
Merger & Goldt, 2025, arXiv:2505.24769
Good performance requires at least \( N = \# \text{training examples} \asymp d\)
Stronger decay in the spectra \(\rightarrow \) better performance at fixed \(N\)
\( A^{(2)} \propto \underbrace{\left( \Sigma_{\text{emp.}} +\gamma \, \text{Id} \right)}_{\text{estimate of } \Sigma_{\text{true}}}{}^{-1} \neq \Sigma_{\text{true}}^{-1} \)
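The gap between the regularized estimate and the true inverse covariance can be illustrated numerically. The sketch below assumes \( d \) is the data dimension and compares \( (\Sigma_{\text{emp.}} + \gamma\,\text{Id})^{-1} \) with \( \Sigma_{\text{true}}^{-1} \) for few and many samples. The spectrum, the value of \( \gamma \) and the sample sizes are hypothetical choices for illustration, not the settings from the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50        # assumed data dimension
gamma = 1e-2  # hypothetical regularization strength

# Hypothetical ground-truth covariance (diagonal, eigenvalues between 0.5 and 2)
eigvals = np.linspace(0.5, 2.0, d)
Sigma_true = np.diag(eigvals)
precision_true = np.linalg.inv(Sigma_true)

def relative_error(N):
    """Relative Frobenius error of the regularized precision estimate from N samples."""
    X = rng.normal(size=(N, d)) * np.sqrt(eigvals)  # samples with covariance Sigma_true
    Sigma_emp = X.T @ X / N
    precision_est = np.linalg.inv(Sigma_emp + gamma * np.eye(d))
    return np.linalg.norm(precision_est - precision_true) / np.linalg.norm(precision_true)

err_few = relative_error(N=d // 2)     # N < d: Sigma_emp is rank deficient
err_many = relative_error(N=100 * d)   # N >> d

print(err_few, err_many)
```

With fewer samples than dimensions, the empirical covariance has directions with zero variance, and the estimate in those directions is set entirely by \( \gamma \). With many samples the error shrinks but does not vanish, since both the finite sample and the regularizer contribute.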
Predict performance of diffusion models as a function of \( \# \text{training examples} \)
Bardone, Merger, Goldt, 2026
on arXiv soon!
Plant one direction with higher-order statistics:
diffusion models need \( \mathcal{O} \left(d^{k^*-1} \right) \) samples to find it
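To get a feel for this scaling, the short sketch below tabulates \( d^{k^*-1} \) for a few values of \( k^* \). The values of \( d \) and \( k^* \) are illustrative only, and constants and logarithmic factors hidden in the \( \mathcal{O}(\cdot) \) are ignored.

```python
def samples_needed(d, k_star):
    """Leading-order scaling d^(k* - 1); prefactors are ignored."""
    return d ** (k_star - 1)

d = 100  # illustrative dimension
scaling = {k_star: samples_needed(d, k_star) for k_star in (2, 3, 4)}
print(scaling)  # {2: 100, 3: 10000, 4: 1000000}
```

Each unit increase in \( k^* \) multiplies the required number of samples by a factor of \( d \).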
Understanding Generative models via Interactions
Using interactions, we can
- map the inferred statistics to an interpretable form central to physics
- predict the performance of generative models
Thanks to
Lorenzo Bardone
Alexandre Rene
Kirsten Fischer
Peter Bouss
Sandra Nestler
David Dahmen
Carsten Honerkamp
Moritz Helias
Sebastian Goldt



DPG Talk 2026