Modeling and simulation with AI

 

 

Vedant Puri
DEC 04, 2025

Committee:

Levent Burak Kara, Yongjie Jessica Zhang, Amir Barati Farimani, Krishna Garikipati

Motivation: speed up engineering design workflow


Motivation: understanding turbulence is critical for energy systems

(Figure: turbulence spans length scales from the mesosphere at \(1000\,\mathrm{km}\), to a wind farm at \(10\,\mathrm{km}\), a turbine at \(100\,\mathrm{m}\), and a blade at \(10\,\mathrm{m}\).)

Phase 1: High-order finite elements on large meshes

\partial_t \vec{v} + (\vec{v}\cdot\nabla)\vec{v} = -\nabla p + \frac{1}{\text{Re}}\Delta \vec{v} + f\\ \nabla\cdot\vec{v} = 0

Navier-Stokes Equations

(Flow past bluff body \( Re = 3900 \))

Need high-quality function representation over (complex) geometry

Main operations: \(\nabla, \, \int_\Omega\)

High-order interpolation is the underlying technology

(Figure panels: differentiation, interpolation, integration.)

  • Prohibitively expensive
  • Challenges with meshing
  • Requires tailoring the solution to the problem

Newsflash: ML beats the curse of dimensionality-ish


Orthogonal Functions vs. Deep Neural Networks

Both approximate a target function with error \( f = \tilde{f} + \mathcal{O}(h) \).

Orthogonal functions: \( \tilde{f}(x) = \sum_{i=1}^N f_i \phi_i(x) \)

  • \( N \) points
  • \( N \sim h^{-d/c} \): model size scales exponentially with dimension
  • \( \dfrac{d}{dx} \tilde{f} \sim \mathcal{O}(N^2) \) (exact)
  • \( \int_\Omega \tilde{f} \, dx \sim \mathcal{O}(N) \) (exact)

Deep neural networks: \( \tilde{f}(x) = Z_L \circ (\dotsc (\sigma \circ Z_0(x))) \)

  • \( N \) parameters, \( M \) points
  • \( h \sim 1/N \) for shallow networks (Weinan, 2020): model size scales with signal complexity
  • \( \dfrac{d}{dx} \tilde{f} \sim \mathcal{O}(N) \) (exact, AD)
  • \( \int_\Omega \tilde{f} \, dx \sim \mathcal{O}(M) \) (approx.)
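To ground the right-hand column, here is a minimal sketch (a generic MLP and sample grid, assumed for illustration only) of obtaining exact derivatives of a neural ansatz via automatic differentiation:

import torch

# A generic MLP standing in for the neural ansatz (illustrative, not from the slides).
f = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

x = torch.linspace(0.0, 1.0, 100).reshape(-1, 1).requires_grad_(True)
u = f(x)                                      # \tilde{f}(x) at 100 sample points
du_dx, = torch.autograd.grad(u.sum(), x)      # exact d\tilde{f}/dx via automatic differentiation
print(du_dx.shape)                            # torch.Size([100, 1])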

Phase 2: Hybrid ML + FEM systems for closure modeling

Phase 3: Enhancing PDE solvers with ML and vice versa

Landscape of ML for PDEs (adapted from Núñez, CEMRACS 2023)

(Diagram: methods arranged along two axes, mesh ansatz vs. neural ansatz and PDE-based vs. data-driven: FEM, FVM, IGA, and spectral methods; Fourier Neural Operator; convolutional NNs; graph NNs; DeepONet; neural fields; physics-informed NNs; neural ODEs; universal differential equations.)

\begin{cases} \dfrac{d u}{dt} = \mathcal{L}_p(u) + \mathcal{N}_p(u), & x\in\Omega\\ u|_{\partial\Omega} = g(t) \end{cases}

A reduced or learned model replaces part of the right-hand side: \( \dfrac{d\tilde{u}}{dt} = \tilde{\mathcal{L}}_p(\tilde{u}) + \dotsb \)
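As a concrete illustration of the universal differential equation entry, a hypothetical sketch (the linear operator, network sizes, and rollout are assumptions, not the proposal's implementation): a known linear physics term plus a learned closure.

import torch

# Hypothetical UDE: known linear physics L_p plus a learned closure NN_theta.
class UDE(torch.nn.Module):
    def __init__(self, n):
        super().__init__()
        self.register_buffer("L", -0.1 * torch.eye(n))    # assumed known linear operator L_p
        self.closure = torch.nn.Sequential(
            torch.nn.Linear(n, 64), torch.nn.Tanh(), torch.nn.Linear(64, n))

    def forward(self, t, u):
        return u @ self.L.T + self.closure(u)             # du/dt = L_p(u) + NN_theta(u)

# forward-Euler rollout of the hybrid dynamics
model, u, dt = UDE(8), torch.rand(1, 8), 0.01
for _ in range(10):
    u = u + dt * model(0.0, u)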

Reduced Order Modeling

Overview

  • Data + physics: reduced order modeling
  • Data-driven surrogate model
  • Proposed work 1
  • Proposed work 2

Reduced order modeling

 

How to do physics with neural representations?

Primer on model order reduction

\frac{\partial \boldsymbol{u}}{\partial t} = \mathcal{L}(\boldsymbol{x}, t, \boldsymbol{u}; \boldsymbol{\mu})


Full order model (FOM)

\boldsymbol{u}(\boldsymbol{x}, t; \boldsymbol{\mu}) \approx g_\text{FOM}(\boldsymbol{x}, \textcolor{red}{\bar{u}(t; \boldsymbol{\mu})}) = \mathbf{\Phi} \cdot \textcolor{red}{\bar{u}(t; \boldsymbol{\mu})}

Linear POD-ROM

\textcolor{red}{\bar{u}(t; \boldsymbol{\mu})} \approx g'_\text{ROM}(\textcolor{orange}{\tilde{u}(t; \boldsymbol{\mu})}) = \bar{u}_0 + \mathbf{P} \cdot \textcolor{orange}{\tilde{u}(t; \boldsymbol{\mu})}

Nonlinear ROM

\boldsymbol{u}(\boldsymbol{x}, t; \boldsymbol{\mu}) \approx g_\text{ROM}(\boldsymbol{x}, \textcolor{blue}{\tilde{u}(t; \boldsymbol{\mu})}) = \mathrm{NN}_\theta\left(\boldsymbol{x}, \textcolor{blue}{\tilde{u}(t; \boldsymbol{\mu})}\right)

Learn low-order spatial representations

Time-evolution of reduced representation with Galerkin projection

(Diagram: manifold projection \( h_\text{ROM} \) maps the FOM state \( \bar{u} \in \mathbb{R}^{N_\text{FOM}} \) to the reduced state \( \tilde{u} \) on the manifold \( \mathcal{M} \), and \( g_\text{ROM} \) maps back; model inference evolves \( \tilde{u}(t=0) \to \tilde{u}(t=T) \) instead of \( \bar{u}(t=0) \to \bar{u}(t=T) \).)

FOM dynamics:

\frac{\mathrm{d} \bar{u}}{\mathrm{d} t} = \mathcal{L}(\bar{u}, t)

ROM dynamics (Galerkin projection):

\mathbf{J}_g\frac{\mathrm{d} \tilde{u}}{\mathrm{d} t} = \mathcal{L}(g_\text{ROM}(\tilde{u}), t)

\textcolor{blue}{N_\text{Nl-ROM}} < \textcolor{orange}{N_\text{Lin-ROM}} \ll \textcolor{red}{N_\text{FOM}}
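A minimal sketch of the Galerkin-projected ROM dynamics above, assuming generic stand-ins g_rom (decoder) and fom_rhs (FOM right-hand side) rather than the thesis code:

import torch
from torch.func import jacrev

def rom_rhs(u_tilde, t, g_rom, fom_rhs):
    # J_g d(u_tilde)/dt = L(g_ROM(u_tilde), t)  =>  d(u_tilde)/dt = J_g^+ L(...)
    J = jacrev(g_rom)(u_tilde)                      # decoder Jacobian, shape [N_FOM, N_ROM]
    f = fom_rhs(g_rom(u_tilde), t)                  # FOM right-hand side at the decoded state
    return torch.linalg.lstsq(J, f.unsqueeze(-1)).solution.squeeze(-1)

def euler_step(u_tilde, t, dt, g_rom, fom_rhs):
    # one explicit time step of the reduced dynamics
    return u_tilde + dt * rom_rhs(u_tilde, t, g_rom, fom_rhs)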

Motivation: accelerating PDE solvers


2D Viscous Burgers problem \( (\mathit{Re} = 1\text{k})\)

Smooth neural field ROM (SNF-ROM)

  • Fast inference time - \(199\times\) speed-up
  • Consistently high accuracy
  • Robust & non-intrusive online evaluation

\(\text{Relative error: }0.37\%\)

\(\text{DoFs: }524~k \to 2\)

\(\text{Wall-time: }13.4~\text{s} \to 0.068~\text{s}\)

Accurate capture of dynamics with smooth neural fields

Standard neural fields suffer from high-frequency noise and are effectively non-differentiable: \( \mathrm{NN}(x) \approx u(x) \) does not imply \( \dfrac{\mathrm{d}^k}{\mathrm{d}x^k} \mathrm{NN}(x) \approx u^{(k)}(x) \), leading to large deviations in the derivatives.

(Figure: \( \mathrm{NN}(x) \), \( \frac{\mathrm{d}}{\mathrm{d}x} \mathrm{NN}(x) \), \( \frac{\mathrm{d}^2}{\mathrm{d}x^2} \mathrm{NN}(x) \).)

Learning smooth latent space trajectories

(Figure: evolution of ROM states for the autoencoder ROM vs. SNF-ROM, comparing the projection with the online solve; legend: FOM, POD-ROM, CAE-ROM, SNFL-ROM, SNFW-ROM. For SNF-ROM there is no deviation between the two.)

SNF-ROM ensures accurate online dynamics evaluation.

Accurate capture of dynamics

Unsupervised learning has little control over latent space trajectories

(Figure: FOM vs. CAE-ROM.)

Autoencoder ROMs see a sharp rise in error due to deviation of the reduced states from the learned manifold.

Encoder-free ROMs have disjoint latent space representations which inhibit online evaluations.

(Diagram: autoencoder ROMs compute the reduced state as \( \tilde{u}(t) = \text{Encoder}(\bar{u}(t)) \) and reconstruct \( \bar{u}(t) \) through the decoder; auto-decoder ROMs have no encoder, instead treating \( \tilde{u}(t_1), \dotsc, \tilde{u}(t_n) \) as free variables fit jointly with the decoder against \( \bar{u}(t_1), \dotsc, \bar{u}(t_n) \) by minimizing the reconstruction loss, with inference-time states recovered through gradients \( \nabla_{\tilde{u}} L \). The figure also contrasts the online solve with the distribution of reduced states \( (\tilde{u}) \).)

SNF-ROM: smooth latent space traversal


The learned mapping \( \Xi_\varrho : (t, \boldsymbol{\mu}) \mapsto \tilde{u}(t; \boldsymbol{\mu}) \) parameterizes the reduced states and is trained jointly with the neural field decoder \( g_\theta \):

\varrho, \, \theta = \argmin_{\varrho, \, \theta}\left\{ \sum_{\boldsymbol{x}, \, t, \, \boldsymbol{\mu}} || \boldsymbol{u}(\boldsymbol{x}, t; \boldsymbol{\mu}) - g_\theta(\boldsymbol{x}, \Xi_\varrho(t; \boldsymbol{\mu})) ||_2^2 \right\}
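A minimal joint-training sketch of this objective (network sizes, tensor shapes, and the names Xi and g are illustrative assumptions, not the thesis architecture):

import torch

Xi = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 2))      # (t, mu) -> u_tilde
g  = torch.nn.Sequential(torch.nn.Linear(1 + 2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))  # (x, u_tilde) -> u
opt = torch.optim.Adam(list(Xi.parameters()) + list(g.parameters()), lr=1e-3)

def data_loss(x, t_mu, u):
    u_tilde = Xi(t_mu)                              # reduced coordinates
    u_hat = g(torch.cat([x, u_tilde], dim=-1))      # decoded field at points x
    return ((u_hat - u) ** 2).mean()

# one optimization step on an illustrative batch
x, t_mu, u = torch.rand(256, 1), torch.rand(256, 2), torch.rand(256, 1)
opt.zero_grad(); data_loss(x, t_mu, u).backward(); opt.step()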

Q. What prior to place on the latent space to ensure smooth/accurate traversal?

\frac{\mathrm{d}\textcolor{blue}{\tilde{u}}}{\mathrm{d} t} = \mathbf{J}_g^+ \cdot \textcolor{red}{\bar{f}_\text{RHS}}

Control the complexity of latent trajectories.

(Figure: the latent trajectory in \( \mathbb{R}^{N_\text{ROM}} \) from \( \tilde{u}(0; \boldsymbol{\mu}) \) to \( \tilde{u}(T; \boldsymbol{\mu}) \), driven by \( \mathbf{J}_g^+ \cdot \bar{f}_\text{RHS} \).)

Supervised learning problem: \((\boldsymbol{x}, t; \boldsymbol{\mu}) \to \boldsymbol{u}(\boldsymbol{x}, t; \boldsymbol{\mu})\).

(Architecture diagram: the PDE problem supplies parameters and time \( (\boldsymbol{x}, t, \boldsymbol{\mu}) \); the intrinsic ROM manifold \( \tilde{\mathcal{U}} = \{ \tilde{u}(t; \boldsymbol{\mu}) \mid t, \boldsymbol{\mu} \} \) is learned directly as \( (t; \boldsymbol{\mu}) \to \tilde{u}(t; \boldsymbol{\mu}) \); the smooth neural field MLP \( g_\theta \) takes the coordinates \( \boldsymbol{x} \) and \( \tilde{u} \) and returns \( \boldsymbol{u}(\boldsymbol{x}, t; \boldsymbol{\mu}) \); the loss \( L \) is backpropagated to obtain \( \nabla_\theta L \) and \( \nabla_\varrho L \).)

SNF-ROM: neural field regularization


\mathrm{NN}(x) \approx u(x) \;\; \textcolor{red}{\nRightarrow} \;\; \dfrac{\mathrm{d}^k}{\mathrm{d}x^k} \mathrm{NN}(x) \approx u^{(k)}(x)

Derivative calculation is carried out with automatic differentiation, making the dynamics evaluation non-intrusive.

SNF-ROM with Lipschitz regularization (SNFL-ROM)

\(\text{Penalize the \textcolor{blue}{Lipschitz constant} of the MLP [arXiv:2202.08345]}\)

||f(x_2) - f(x_1)||_p \leq \textcolor{blue}{c}\, ||x_2 - x_1||_p \qquad \text{(change in output} \leq c \times \text{change in input)}

\text{For a single layer: } \textcolor{blue}{c_l} = ||W_l||_p \qquad \text{For an MLP: } c_\theta \leq \textcolor{blue}{\bar{c}(\theta)} = \prod_{l=1}^L ||W_l||_p

\varrho, \, \theta = \argmin_{\varrho, \, \theta}\left\{ L_\text{data}(\varrho, \theta) + \textcolor{blue}{\alpha \bar{c}(\theta)} \right\}

\(\text{[enwiki:1230354413]}\)

SNF-ROM with Weight regularization (SNFW-ROM)

\(\text{Directly penalize \textcolor{red}{high-frequency components} in }\dfrac{\text{d}}{\text{d} x}\text{NN}_\theta(x)\)

\frac{\mathrm{d}}{\mathrm{d} x} \mathrm{NN}_\theta(x) = \left( \prod_{l=2}^L W_l \cdot \mathrm{diag}(\textcolor{red}{\sigma'(z_{l-1})}) \right) \cdot W_1, \qquad \text{with } \textcolor{red}{\sigma'(z_{l-1}) = \cos\left( W_l z_{l-1} + b_l \right)} \text{ for sinusoidal activations}
\varrho, \, \theta = \argmin_{\varrho, \, \theta}\left\{ L_\text{data}(\varrho, \theta) + \textcolor{red}{ \frac{\gamma}{2} \sum_{l=1}^L \sum_{i,j} ||W_l^{ij}||_2^2 } \right\}

We present two approaches to learn inherently smooth and accurately differentiable neural field MLPs.
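A small sketch of both penalties, assuming a plain stack of torch.nn.Linear layers; the helper names and hyperparameters are illustrative, not the thesis implementation:

import torch

def lipschitz_bound(mlp, p=2):
    # SNFL: bound the MLP Lipschitz constant by the product of per-layer matrix norms
    c = torch.tensor(1.0)
    for layer in mlp:
        if isinstance(layer, torch.nn.Linear):
            c = c * torch.linalg.matrix_norm(layer.weight, ord=p)
    return c

def weight_penalty(mlp):
    # SNFW: sum of squared weight entries, damping high-frequency content of d/dx NN
    return sum((layer.weight ** 2).sum() for layer in mlp if isinstance(layer, torch.nn.Linear))

# total loss (alpha, gamma illustrative): L_data + alpha * c_bar(theta) + gamma/2 * sum ||W||^2
# loss = data_loss + alpha * lipschitz_bound(mlp) + 0.5 * gamma * weight_penalty(mlp)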

(Figure: \( u(x) \) and \( \mathrm{NN}(x) \) with derivatives \( \frac{\mathrm{d}}{\mathrm{d}x} \mathrm{NN}(x) \) and \( \frac{\mathrm{d}^2}{\mathrm{d}x^2} \mathrm{NN}(x) \) for a standard neural field and for SNFL and SNFW; standard neural field MLPs are effectively non-differentiable due to high-frequency noise.)

Experiment: 1D Kuramoto-Sivashinsky problem


\frac{\partial {u}}{\partial t} + u\frac{\partial {u}}{\partial x} + \frac{\partial^2 {u}}{\partial x^2} + \nu\frac{\partial^4 {u}}{\partial x^4} = 0

Both Lipschitz regularization (SNFL) and weight regularization (SNFW) capture the fourth-order derivative accurately.

(Figures: relative error at \( \Delta t = \Delta t_0 \) and \( \Delta t = 10\Delta t_0 \). Annotations: oscillations due to variation in projection error; highly diffusive, even POD with 2 modes.)

Experiment: 2D Viscous Burgers problem \((\mathit{Re} = 1\text{k})\)

\frac{\partial \boldsymbol{u}}{\partial t} + \boldsymbol{u} \cdot \boldsymbol{\nabla}\boldsymbol{u} = \nu \Delta \boldsymbol{u}


(Figure panels: CAE-ROM, SNFL-ROM, SNFW-ROM.)

SNFL-ROM, SNFW-ROM effectively capture the traveling shock.


Experiment: 1D Viscous Burgers problem \((\mathit{Re} = 10\text{k})\)

\frac{\partial {u}}{\partial t} + {u} \frac{\partial {u}}{\partial x} = \nu \frac{\partial^2 u}{\partial x^2}

(Figure: latent-space trajectories for CAE-ROM, SNFL-ROM, and SNFW-ROM.)

CAE-ROM has complex, diverging trajectories, whereas SNF-ROM has near-linear, easy-to-follow ones.

For SNF-ROM, the online dynamics solve matches the learned trajectories; for CAE-ROM, the online evaluation deviates from the distribution of reduced states \((\tilde{u})\).

Takeaways

  • Neural fields are a promising new direction for ROMs
  • We have developed solutions to key challenges in ensuring stable, accurate, and fast dynamics evaluation
  • In the future, we plan to attack larger problems with multi-resolution neural field architectures that promise high accuracy and faster training

FLARE

 

Make transformers go BRRR!!!

Motivation: Transformer models are slow


Want AI-driven rapid design

Dataset of additive manufacturing parts

Efficient transformer models

Scale up to 1 million points on a single GPU!

State-of-the-art results on several PDE benchmarks!

Relative \(L_2\) error

Low-rank attention mechanism

Message-passing is fundamentally low-rank

import torch.nn.functional as F

def flare_multihead_mixer(q, k, v):
    # Args - q: learned latent queries [H, M, D]; k, v: [B, H, N, D]
    # Ret  - y: [B, H, N, D]
    q = q.unsqueeze(0).expand(k.size(0), -1, -1, -1)        # broadcast queries over the batch: [B, H, M, D]
    z = F.scaled_dot_product_attention(q, k, v, scale=1.0)  # gather: N tokens -> M latent tokens
    y = F.scaled_dot_product_attention(k, q, z, scale=1.0)  # scatter: M latent tokens -> N tokens
    return y
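A minimal usage sketch (the shapes and the learned latent queries q are illustrative assumptions):

import torch

B, H, N, M, D = 2, 8, 4096, 64, 32            # batch, heads, points, latent rank, head dim
q = torch.nn.Parameter(torch.randn(H, M, D))  # learned latent queries shared across the batch
k, v = torch.randn(B, H, N, D), torch.randn(B, H, N, D)

y = flare_multihead_mixer(q, k, v)            # cost scales as O(N * M) rather than O(N^2)
print(y.shape)                                # torch.Size([2, 8, 4096, 32])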


Low-rank attention mechanism

Continued improvement with depth and rank.

(Figures: Elasticity problem and Darcy problem.)

Proposed work 1

 

Expressive transformers

Proposed work 2

 

Efficient conditioning for transformers

Conditioning mechanism


Proposed timeline


Publications

Thank you

 

Questions?
