Weekly Meeting

MAY 24, 2024

Vedant Puri

https://github.com/vpuri3

Mechanical Engineering, Carnegie Mellon University

Advisors: Prof. Burak Kara, Prof. Jessica Zhang

Updates 05/24/24 - Literature review

Motivation

  • Previous work (SNF-ROM) is a proof of concept that we can do physics with neural network spatial discretizations.
  • Aim: scale this method to larger and more complex problems.
  • Challenge: slow training becomes a bottleneck
    • Deep neural networks converge slowly on high-frequency features

Potential solutions

  • Computer vision community has been developing fast training architectures

  • Multi-resolution and sparse architectures that capture high frequency features

Applications

  • PINNs: solve PDEs by optimizing NN parameters (see the sketch after this list)
  • ROM: project PDE onto learned spatial representation
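To make the PINN bullet concrete, here is a minimal PyTorch sketch (our own toy example, not from the slides): a small MLP is trained to satisfy a 1D Poisson equation \( u_{xx} = f \) by minimizing the PDE residual at random collocation points plus a boundary penalty.

import torch
import torch.nn as nn

# Toy PINN: solve u_xx(x) = f(x) on (0, 1) with u(0) = u(1) = 0,
# where f is manufactured so the exact solution is u(x) = sin(pi x).
net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
f = lambda x: -torch.pi**2 * torch.sin(torch.pi * x)

for step in range(2000):
    x = torch.rand(256, 1, requires_grad=True)                       # collocation points
    u = net(x)
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    xb = torch.tensor([[0.0], [1.0]])                                 # boundary points
    loss = ((u_xx - f(x))**2).mean() + (net(xb)**2).mean()            # residual + boundary penalty
    opt.zero_grad(); loss.backward(); opt.step()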

Next steps

  • Finish lit review
  • Implement neural architecture
  • Test on 1D/2D regression problems
  • Experiment with PINNs, ROMs

Research Plan

  • Literature review [in prog]
  • Formulate a research plan
    • Define research problem
    • Choose possible solution
    • What are the new contributions?
  • Model implementation
  • Architecture & hyperparameter tuning
  • Experimental setup
  • Writing

Our current architecture

Advantages

  • Neural field architecture, \( u(x) = NN(x) \), can be queried anywhere
  • Independent of grid representation

Disadvantages

  • Slow to train: each query requires a full DNN evaluation.
  • Many hours/days to capture details in signal
  • Often fail to capture high frequency details
[Figure: coordinate input \( \vec{x} \) → MLP → signal]

Example: Image regression with deep neural network \( (r, g, b) = NN(x, y) \)

https://www.it-jim.com/blog/nerf-in-2023-theory-and-practice/

https://arxiv.org/pdf/2309.15426
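A minimal sketch of the image-regression example above, assuming a PyTorch setup and a placeholder `image` tensor: the MLP maps a pixel coordinate to an RGB value, so the fitted field can be queried at arbitrary (x, y), but every query costs a full forward pass.

import torch
import torch.nn as nn

# Neural field image regression: (x, y) -> (r, g, b)
H, W = 128, 128
image = torch.rand(H, W, 3)                                   # placeholder target image
ys, xs = torch.meshgrid(torch.linspace(0, 1, H), torch.linspace(0, 1, W), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)         # pixel coordinates in [0, 1]^2
rgb = image.reshape(-1, 3)

net = nn.Sequential(nn.Linear(2, 256), nn.ReLU(), nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 3))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(1000):
    loss = ((net(coords) - rgb) ** 2).mean()                  # each query is a full MLP evaluation
    opt.zero_grad(); loss.backward(); opt.step()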

Fast training architectures

Key idea

  • Develop sparse architectures by decomposing domain with local overlapping grids
  • Capture high-frequency local detail with adaptive grids
  • Reduce training time from several days to minutes

Applications

  • Computer vision: tested on regression problems (wildly successful)
  • PINNs: some success. The resulting optimization problem is still very hard to solve

Disadvantages

  • Would have to be modified for physics problems

Instant neural graphics primitives with multiresolution hash encoding, 2022

NeuRBF: A Neural Fields Representation with Adaptive RBF 2023

Application to PINNs

  • Suffer from noisy gradients and a challenging optimization landscape.
  • Not tested extensively on complex problems

The Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24)

Current state of fast training architectures

Method | Approach | Notes
Instant NGP 2022 | Learn multiple overlapping feature grids at different resolutions | Leads to sharp gradients at grid boundaries
3D Gaussian splatting 2023 | No neural network. Parameterize solution as a sum of Gaussians with learnable position, covariance. Adaptively add/remove Gaussians. | Easier PINNs optimization problem since there's no NN?
NeuRBF 2023 | Adaptive RBF feature grid | RBF features are smoothly interpolated
Dictionary fields 2023 | |
Plenoxels | |
TensoRF | |

Many important papers in this field

Weekly Meeting

MAY 24, 2024

Vedant Puri

https://github.com/vpuri3

Mechanical Engineering, Carnegie Mellon University

Advisors: Prof. Burak Kara, Prof. Jessica Zhang

Updates 05/30/24

Motivation

  • Scale SNF-ROM to complex problems that are out-of-reach
  • Complementary to PMFI Task 2.1
  • Challenge: long training time becomes a bottleneck

Potential solutions

  • Multi-resolution and sparse architectures developed by CV community train in minutes rather than days

  • Would need to be modified to fit physics problems

Research problem

  • Develop fast training architecture and apply to a large ROM problem (e.g., 2D/3D turbulence)
  • Develop fast training PINNs architectures and apply to large problems

PMFI project

  • Andrew, Kevin working on data generation (NetFabb, Prof. Zhang's code; PMFI Task 1)
  • Discuss differentiation from Kevin's project (residual deformation prediction)

Current state of ML-ROM

(SNF-ROM, 2024)

PMFI proposal

Current state of PINNs (NSFnet 2020)

  • Nonconvex optimization landscape
  • Need data for fast optimization
  • Long training times (120k epochs)

Multiresolution ML architectures

Neural field MLP

  • MLPs have a low-frequency bias
  • Often fail to capture fine details in signal

Multiresolution architectures

  • Capture fine details with adaptive feature grids
  • Comparable number of parameters
  • Smaller memory footprint (small MLP)
  • 100x faster training (faster convergence) and fast inference

Instant neural graphics primitives with multiresolution hash encoding, 2022

Method | Approach | Notes
Instant NGP 2022 | Learn multiple overlapping feature grids at different resolutions. Hash encoding for fast querying. Applied to PINNs for simple problems. | Leads to sharp gradients at grid boundaries
3D Gaussian splatting 2023 | No neural network. Parameterize solution as a sum of Gaussians with learnable position, covariance. Adaptively add/remove Gaussians. | Explicit representation can lead to an easier PINNs optimization problem since there's no NN?
NeuRBF 2023 | Combination of adaptive RBF and grid RBF features. Sinusoidal composition for multi-frequency. Similar to 3DGS but no adaptive control. | RBF features are smoothly interpolated. Very desirable. Possibly good for PINNs.
Dictionary fields 2023 | Factorize a signal into a coefficient field and a basis field, which is a dictionary of known functions. Use coordinate transformations to apply the same basis functions across multiple locations and scales, typically in a grid pattern. | Combines deterministic functions with learned coefficient fields. Can we control smoothness? Grid structure can lead to a large number of parameters.

Updates 05/24/24

Motivation

  • Scale the SNF-ROM method to larger and more complex problems
  • Complementary to PMFI grant (Task 2.1)
  • Challenge: long training time becomes a bottleneck

Potential solutions

  • Develop fast training multi-resolution architectures

  • Would need to be modified to fit physics problems

New contribution

  • Develop fast training architecture and apply to a large ROM problem (e.g., 2D/3D turbulence)
  • Develop fast training PINNs architectures and apply to large problems

Research plan

  • Literature review [this week]
    • Find models with favorable properties for our application
  • Code implementation [2 weeks]
  • Architecture & hyperparameter tuning
    • ​Tune architecture for our application
  • Experimental setup
  • Writing

Updates 05/30/24

PMFI Project

  • Task 1: Andrew, Kevin
  • Task 2: ML surrogate model
    • Task 2.1: learn geometry embedding
    • Task 2.2: surrogate model
  • Task 3: Uncertainty quantification

This week

  • Set up learning pipeline for Task 2.1 with toy dataset for a small neural network

  • Finished literature review

  • Start implementing state of the art models next week

PMFI proposal

Updates 06/03/24 - PMFI Project

PMFI Project

  • Goal: Given CAD geometry, predict residual deformation
  • Task 1: Dataset generation
  • Task 2: ML Model
    • Task 2.1: Learn low dimensional representation
    • Task 2.2: Surrogate model on low-dim representation
  • Task 3: Uncertainty quantification

This week

  • Finished setting up learning pipeline for Task 2.1 with toy dataset for a small neural network

  • Finished literature review on ML architectures for task 2.1

  • Start implementing state of the art models next week

Breakdown of task 2

Goal

  • Given \(G_\text{CAD}\), predict \( G_\text{RD}\)

Task 1 - Data generation

  • Generate dataset \(\{ G_\text{CAD}, \, G_\text{RD}\}\)

Task 2.1 - Dimensionality reduction

  • Learn  low-dimensional representation \( g_i\) for each geometry \(G_i\)
  • Done by solving regression problem with backpropagation


     
  • We have \( g_\text{CAD}, \, g_\text{RD}\)

Task 2.2 - Surrogate model

  • Learn neural network surrogate model \(g_\text{CAD} \to g_\text{RD}\)
  • \(G_\text{CAD}\) can then be recovered with the network in Task 2.1
  • New contrib: Implicit neural network for dynamically evolving geometry??
\( g_i, \theta = \underset{g_i, \, \theta}{\mathrm{argmin}} \; \| \mathrm{SDF}(x; G_i) - \mathrm{NN}_\theta(g_i, x) \|_2^2 \)
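A minimal sketch of the Task 2.1 regression problem above, in the auto-decoder style implied by the equation: per-shape latent codes \(g_i\) and the shared network weights \(\theta\) are optimized jointly by backpropagation (names, shapes, and sizes here are illustrative assumptions).

import torch
import torch.nn as nn

# Auto-decoder sketch: jointly fit per-shape latents g_i and a shared SDF network NN_theta
num_shapes, latent_dim = 100, 64
latents = nn.Embedding(num_shapes, latent_dim)                    # one g_i per geometry G_i
sdf_net = nn.Sequential(nn.Linear(latent_dim + 3, 256), nn.ReLU(),
                        nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.Adam(list(latents.parameters()) + list(sdf_net.parameters()), lr=1e-4)

def train_step(shape_ids, points, sdf_true):
    """shape_ids: [B], points: [B, 3], sdf_true: [B, 1] sampled from each geometry G_i."""
    g = latents(shape_ids)                                         # [B, latent_dim]
    sdf_pred = sdf_net(torch.cat([g, points], dim=-1))
    loss = ((sdf_pred - sdf_true) ** 2).mean()                     # || SDF(x; G_i) - NN_theta(g_i, x) ||^2
    opt.zero_grad(); loss.backward(); opt.step()
    return loss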

Update - 06/07/24

PMFI Project Goal

  • Given CAD geometry, predict residual deformation (RD)

Task 1 - Data generation

  • Generate dataset of CAD and RD geometries with NetFabb, APDL code

Task 2.1 - Dimensionality reduction

  • Project geometries to low-dimensional space

Task 2.2 - Predict residual deformation

  • Task 2.2.1 - Learn surrogate model to predict RD geometry
    • Conditioned on LPBF process parameters (diffusion architecture)
  • Task 2.2.2 - Capture the dynamics for RD simulation
    • Modify the diffusion architecture from Task 2.2.1 for time-series forecasting

New Contributions

  • Task 2.1 - Novel dimensionality reduction architecture
    • SOTA focuses on single shape compression; we simultaneously compress an entire dataset of shapes
    • Future work: apply this to ROMs and tackle large 3D problems
  • Task 2.2.1 - Novel latent diffusion method for predicting 3D shapes
  • Task 2.2.2 - Novel latent diffusion method for predicting 3D shape evolution

Update - 06/11/24 PMFI Project

Task 1 - Generate dataset of residual deformation calculations

  • APDL code
    • Ran a test case provided by Xuan Liang
    • Setting up cases with complex geometry
  • Autodesk NetFabb
    • Looks like data-preparation was incomplete
    • Several parts are printed with wrong orientation
    • Need to fix it and rerun simulations

This week

  • Run APDL code on mesh geometries
  • Fix orientation issue and rerun a subset of test cases

PMFI Project Update - 06/13/24

Task 1 - Generate dataset of residual deformation calculations

  • APDL code: Ran simple test case
  • Autodesk NetFabb: first set of simulations done
    • Generate visualizations to understand dataset
    • Prune dataset to get a small subset for experiments

Task 2 - ML Surrogate model

  • Plan: Latent diffusion architecture
    • Dimensionality reduction
    • Conditional diffusion model
  • Progress: 
    • Continue with literature review
    • Implement baseline shape reconstruction model
    • Try out multi-resolution architectures

New contributions

  • Novel dimensionality reduction architecture
    • State of the art methods focus on single shape compression and cannot produce latent space of shapes
  • Novel latent diffusion method for predicting 3D shapes

APDL Code

NetFabb Simulations

Shape reconstruction with baseline method (MLP)

PMFI Project full plan

PMFI Project Goal

  • Given CAD geometry, predict residual deformation (RD)

Task 1 - Data generation

  • Generate dataset of CAD and RD geometries with NetFabb, APDL code

Task 2.1 - Dimensionality reduction

  • Project geometries to low-dimensional space

Task 2.2 - Predict residual deformation

  • Task 2.2.1 - Learn surrogate model to predict RD geometry
    • Conditioned on LPBF process parameters (diffusion architecture)
  • Task 2.2.2 - Capture the dynamics for RD simulation
    • Modify the diffusion architecture from Task 2.2.1 for time-series forecasting

New Contributions

  • Task 2.1 - Novel dimensionality reduction architecture
    • SOTA focuses on single shape compression; we simultaneously compress an entire dataset of shapes
    • Future work: apply this to ROMs and tackle large 3D problems
  • Task 2.2.1 - Novel latent diffusion method for predicting 3D shapes
  • Task 2.2.2 - Novel latent diffusion method for predicting 3D shape evolution

SNF-ROM Paper review - 06/20/24

Reviewer 1

  • [UPDATE] Implement hyper-reduction and report speed-up
    • 18x speedup on 2D Burgers without loss in accuracy


       
    • Up to 98x speedup with larger \(\Delta t\)

       
    • [12] gets 2x and 11x speed up on 1D and 2D Burgers respectively

Reviewer 2

  • Conduct more experiments:
    • Create a table of \(R^2\) for each experiment
    • [DONE] Compare accuracy vs \(N_\text{ROM}\)
    • Compare SNFW/SNFL regularization against \(L_2\) regularization
    • Compare Galerkin projection vs LSPG
    • Extrapolation in time
FOM: 15.530123 seconds (GPU allocations: 163.760 GiB)
ROM:  0.862563 seconds (GPU allocations:   8.925 GiB)

Burgers 2D - Error vs time plots

[Figure panels: no hyper-reduction; with hyper-reduction; hyper-reduction + large \(\Delta t\). Each stays at ~\(1\%\) error.]

ROM w. large \(\Delta t\): 0.157669 seconds (GPU allocations: 1.790 GiB)

SNF-ROM Paper review - 06/21/24

Reviewer 1

  • [UPDATE] Implement hyper-reduction and report speed-up

Reviewer 2

  • Conduct more experiments:
    • Create a table of \(R^2\) for each experiment
    • [DONE] Compare accuracy vs \(N_\text{ROM}\)
    • Compare SNFW/SNFL regularization against \(L_2\) regularization
    • Compare Galerkin projection vs LSPG
    • Extrapolation in time

Plan

  • Address remaining comments and update the manuscript

Weekly Meeting

AUG 05, 2024

Vedant Puri

https://github.com/vpuri3

Mechanical Engineering, Carnegie Mellon University

Advisors: Prof. Burak Kara, Prof. Jessica Zhang

Updates 08/05/24

WCCM Conference Takeaways

  • SNF-ROM well received. Much interest in methodology and speed-up results
  • Had long discussions on new directions in this field:
    • data-driven vs equation-based approaches
    • architectural choices

Offline stage

Online stage

Plan for upcoming projects

  • Fundamental ROM project

    • Based on discussions at WCCM, I have devised a plan to address current limitations of SNF-ROM

  • PMFI project

    • Earlier consensus was to use Kevin's work to satisfy grant requirements

    • In more recent discussions, Prof. Zhang indicated she wants at least a paper to report back to PMFI

    • Prof. Zhang has also pointed out some other opportunities in geometry modeling

Question

  • How to split time between PMFI and fundamental ROM project?

Fundamental ROM project

Offline stage

Online stage

  • Motivation

    • Address current limitations: long training times, limited accuracy

  • Method
    • Remove/reduce offline training stage
    • Evolve all the weights of the model in the online stage
  • Current state of literature
    • They have not convincingly demonstrated a speed-up thus far
    • This method is called "Neural Galerkin" or "evolutionary networks"
  • Goal
    • Demonstrate large speed-ups and high accuracy on the SNF-ROM experiments
    • Preliminary experiments indicate we can achieve both

NYU group led by Benjamin Peherstorfer.

First paper (Mar 2022). 43 citations thus far

William Anderson (PhD NC State, Post-doc at LANL)

Our unique edge

  • Already have most of the machinery in place - fast experimentation
  • Develop network architectures that
    • admit a fast custom Jacobian implementation
    • have fewer parameters than MLPs

Neural Galerkin approach

Our unique edge

  • Already have most of the machinery in place - fast experimentation
  • Develop network architectures that
    • admit a fast custom Jacobian implementation
    • have fewer parameters than MLPs
  • Speed-up is dominated by two factors:
    • number of model evaluation points
    • cost of the Jacobian system solve

Update 8/9/24

ROM project (Neural Galerkin)

  • Ran "proof-of-concept" experiments with a simple model on advection-diffusion problem
  • We are getting good results
  • Studied literature to assess novelty of our idea

PMFI Project / Geometry modeling

  • Looking into Ray's work on additive/subtractive manufacturing
  • Discussion with Prof. Kara this morning

Neural Galerkin method

Neural Galerkin - Advection problem

Deep Neural Network (BASELINE): ~150 parameters, 256 collocation points

Multiplicative filter network (MFN): ~210 parameters, 256 collocation points

Machine precision accuracy with exact solution

Parameterized Gaussian (OURS): 3 parameters, 8 collocation points

Fast evaluation!

Neural Galerkin - Diffusion problem

Deep Neural Network (BASELINE): ~150 parameters, 256 collocation points

Multiplicative filter network (MFN): ~210 parameters, 256 collocation points

Error due to limited expressivity of this simple model

Parameterized Gaussian (OURS): 3 parameters, 8 collocation points

Neural Galerkin - Advection-Diffusion problem

Parameterized Gaussian (OURS): 3 parameters, 8 collocation points

Deep Neural Network (BASELINE): ~150 parameters, 256 collocation points

Multiplicative filter network (MFN): ~210 parameters, 256 collocation points

Error due to limited expressivity of this simple model

FAILED TO CONVERGE

Neural Galerkin conclusions

Conclusions from "proof of concept" experiment

  • It is possible to construct small NN architectures
  • Accuracy increases with higher-order time-integrators and adaptive time-stepping

Next steps

  • Make this model more expressive and experiment with Burgers problem
  • Start using off-the-shelf time-integrators

Sources of error in experiment

  • DNN/MFN initialization was done with gradient descent.
  • Could have been made better with second-order optimizers (L-BFGS)

Potential New Contributions

  • Develop novel architectures that have sparsity
  • Few parameters + few collocation points ==> fast evaluation

Time-integration of Parameterized Gaussian (OURS)

u(x, t) = \textcolor{blue}{c(t)}\exp(\textcolor{blue}{\sigma(t)} (x-\textcolor{blue}{\bar{x}(t)}))

Parameterized Gaussian discretization

 Update 8/15/24 - ROM project (Neural Galerkin) 

Updates

  • Tested more expressive model parameterizations
    • \(\textsf{\textcolor{green}{Projection}}\) step (gradient descent, L-BFGS) not robust
    • Parameterizations are too expressive for the simple tests
  • Tested a complex 1D problem (advection/diffusion of a square wave)
  • Updated code to use high-order, off-the-shelf time-integrators
  • Implemented periodic boundary conditions

Next steps

  • Test on more complex problems: 1D Burgers, 2D problems

Potential new contributions

  • Smaller, lighter parameterizations lead to fast time-integration
  • Fast hyper-reduction with sparse parameterizations
  • Develop metrics for adaptively adding/removing complexity (similar to adaptive mesh refinement)
u(x, t) = \textcolor{blue}{c(t)}\exp(\textcolor{blue}{\sigma(t)} (x-\textcolor{blue}{\bar{x}(t)}))
u(x, t) = \sum_{i=1}^{\textcolor{magenta}{N}} \textcolor{blue}{c_i} \exp(\textcolor{blue}{\sigma_i} (x-\textcolor{blue}{\bar{x}_i}))
u(x, t) = \sum_{i=1}^{\textcolor{magenta}{N}} \textcolor{blue}{c_i} \exp(\textcolor{blue}{\sigma_i} (x-\textcolor{blue}{\bar{x}_i})) \sin\left(\textcolor{blue}{\omega_i}x + \textcolor{blue}{\phi_i} \right)

Parameterized Gaussian kernels

Multiple parameterized Gaussians kernels

Multiple parameterized Gabor kernels

Neural Galerkin - Advection Diffusion of square wave

Parameterized Gaussian (OURS)

8 parameters, 512 collocation points

Compute time: \(0.17~\text{s}\)

Deep Neural Network (BASELINE)

~150 parameters, 512 collocation points

Compute time: \(6~\text{s}\)

Accuracy limited by initial projection step

u(x, t) = \sum_{i=1}^{\textcolor{magenta}{2}} \textcolor{blue}{c_i} \exp(\textcolor{blue}{\sigma_i} (x-\textcolor{blue}{\bar{x}_i}))

Parameterization

[Figure: parameterization sketch with labels \(\sigma_i\), \(\sigma_i^\text{left}\), \(\sigma_i^\text{right}\), \(x - \bar{x}_i\), \(u(x)\)]

FOM compute time: \(0.80~\text{s}\)

Update 08/15/24 - PMFI Project

PMFI Project Goal

  • Given CAD geometry, simulate the 3D printing process
  • Learn the evolution of the 3D-printed geometry

Task 1 - Data generation

  • Simulate AM process in Autodesk Netfabb

Task 2 - Formulate the dynamics problem

  • Multiple options:
    • next step prediction problem
    • conditioning on CAD model
  • Reviewing literature on similar topics
  • Discussion ongoing with Prof. Kara.

Task 1: AM process simulation with NetFabb

 Update 8/22/24 - ROM project

  • Prev paper: SNF-ROM: Projection-based nonlinear ROM with smooth neural fields
  • New paper: Localized nonlinear kernel parameterizations for fast neural galerkin

Updates

  • Tested 1D Burgers problem (Re 10k)
  • Tested a new parameterization that seems to work well for shocks
  • Problem: \(\textsf{\textcolor{green}{Projection step}}\) (gradient descent, L-BFGS) not robust
  • Fixed a critical bug that was causing time-instability

Next steps

  • Add more Gaussians to increase accuracy during \(\textsf{\textcolor{green}{projection}}\) step

Potential new contributions

  • Smaller, lighter parameterizations lead to fast time-integration
  • Fast hyper-reduction with sparse parameterizations
  • Develop metrics for adaptively adding/removing complexity (similar to adaptive mesh refinement)
u(x, t) = \sum_{i=1}^{\textcolor{magenta}{N}} \textcolor{blue}{c_i} \exp(\textcolor{blue}{\sigma_i} (x-\textcolor{blue}{\bar{x}_i}))
u(x, t) = \sum_{i=1}^{\textcolor{magenta}{N}} \frac{\textcolor{blue}{c_i}}{2} \left( \tanh(\textcolor{blue}{\omega_0} (x - \textcolor{blue}{x_0})) - \tanh(\textcolor{blue}{\omega_1} (x - \textcolor{blue}{x_1})) \right)

Parameterized Gaussians kernels

Parameterized Tanh kernels

Gaussian kernels

Tanh kernels

Time evolution bug

Governing PDE: \( \frac{\partial}{\partial t} u(x, t) = \mathcal{L}(x, t, u(x, t)) \)

Ansatz: \( u(x, t) = g(x, \tilde{u}(t)) \)

Galerkin projection: \( \mathbf{J}_g \frac{\partial}{\partial t} \tilde{u}(t) = f \), where \( \mathbf{J}_g = \left[ \frac{\partial g(x_i, \tilde{u})}{\partial \tilde{u}_j} \right]_{ij} \), so \( \tilde{f} = \mathbf{J}_g^\dagger f = (\mathbf{J}_g^T \mathbf{J}_g)^{-1} \mathbf{J}_g^T f \) and \( \tilde{u}(t + \Delta t) = \tilde{u}(t) + \Delta t \cdot \tilde{f} \)

Problem

  • \(\mathbf{J}_g\) can be rank deficient \(\implies\) ill-behaved system solve
  • QR factorization would silently fail and return either NaNs or \(\begin{bmatrix}0 & \cdots & 0 \end{bmatrix}\)
  • This has been causing instability in the online solve for complicated parameterizations

Solution

  • Iterative solvers can invert rank-deficient systems and return a non-unique solution
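A minimal sketch of one Neural Galerkin step under our reading of the equations above (PyTorch, forward Euler). The parameterized Gaussian is written here with the standard squared exponent, and `rhs` stands in for the PDE right-hand side \(\mathcal{L}\); both are illustrative assumptions, not the exact implementation.

import torch

# Ansatz u(x; p) with p = (c, sigma, xbar): a single parameterized Gaussian
def ansatz(p, x):
    c, sigma, xbar = p
    return c * torch.exp(-sigma * (x - xbar) ** 2)

def neural_galerkin_step(p, x, rhs, dt):
    """One forward-Euler step: solve J_g dp/dt ~= f in the least-squares sense."""
    J = torch.autograd.functional.jacobian(lambda q: ansatz(q, x), p)   # [num_points, num_params]
    f = rhs(ansatz(p, x), x)                                            # PDE right-hand side at collocation points
    # least-squares solve; more graceful on (near-)rank-deficient J_g than a plain QR-based solve
    dpdt = torch.linalg.lstsq(J, f.unsqueeze(-1)).solution.squeeze(-1)
    return p + dt * dpdt

In practice the slides use higher-order, off-the-shelf time-integrators; the Euler update here only illustrates where the Jacobian system solve enters.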

Neural Galerkin - Burgers Eqn Re 10k

Parameterized Tanh (OURS): 6 parameters, 8192 collocation points

Deep Neural Network (BASELINE): ~150 parameters, 8192 collocation points

FOM compute time: \(7~\text{s}\)

DNN compute time: \(10~\text{s}\)

Our compute time: \(0.07~\text{s}\)

 Update 8/22/24 - PMFI project

PMFI Project Goal

  • Given CAD geometry, simulate the 3D printing process
  • Learn the evolution of the 3D-printed geometry

Task 1 - Data generation

  • Simulate AM process in Autodesk Netfabb

Task 2 - Formulate the dynamics problem

  • Directly predict CAD geometry at each layer
    • Predict layer by layer: next step prediction problem
    • given previous N layers, predict the displacement of the next layer
    • conditioning on CAD model
  • Latent space dynamics
    • Embed geometries in latent space
    • Learn latent space dynamics to evolve low-dim representation
  • Reading the following papers for more ideas
    • Transolver, latent dynamics network, universal physics transformer

Task 1: AM process simulation with NetFabb

Update 8/29/24 - PMFI Project

PMFI Project Goal

  • Predict residual deformation in Laser Powder Bed Fusion process

Our approach

  • Given CAD geometry, simulate the 3D printing process
  • Predict residual deformation at each layer as it is being deposited

Application

  • Preemptively catch part interference with re-coater blade

LPBF process simulation with Autodesk NetFabb

CAD = get_CAD_geometry()
displacement = []

for l in range(num_layers(CAD)):  # loop over layers

    # slice geometries
    next_cad_layer = CAD[l]
    prev_layer_disp = displacement[:l]

    # our model
    next_layer_disp = predict_layer_disp(next_cad_layer, prev_layer_disp)

    # update geometry
    displacement.append(next_layer_disp)

asbuilt_geometry = CAD + displacement
asbuilt_geometry.visualize()

[Diagram: \(\texttt{disp[1:l-1]}\), \(\texttt{CAD[l]}\) → \(\mathrm{NN}\) → \(\texttt{disp[l]}\)]

Capturing geometry evolution

Modeling task

  • Input 1: Displacement of nodes from previous layers
  • Input 2: Position of nodes to be deposited
  • Output: Displacement of nodes on incoming layer
  • If predicting temperature, we'd have to update all nodes at all times
    • Temperature prediction won't be an auto-regressive problem.
    • It would be a space-time prediction problem

[Diagram: \(\texttt{disp[1:l-1]}\), \(\texttt{CAD[l]}\) → \(\mathrm{NN}\) → \(\texttt{disp[l]}\)]

Open questions

  • How to represent geometry slices? Point clouds or 2D slices of a 3D graph? Data preprocessing
  • How to formulate neural network?
    • Input/ output are graphs/point clouds with different structure / number of points
    • Tokenization similar to Transolver can be beneficial here.
    • Aditya pointed to Latent Neural Operator paper that might be useful here. 
  • What manual processing can we do to simplify the learning problem?
  • Can we utilize the G-code in any way? How to account for support structure?

[Figures: Layer L / Layer L+1 schematic; Transolver physics attention (slide from Kevin); latent neural operator]

Update 9/06/24 - PMFI Project

  • Goal: Predict temperature, deformation in Laser Powder Bed Fusion process
  • Application: Preemptively catch part interference with re-coater blade
  • Approaches:
    • Autoregressive build: Predict (T, d) at each layer as it is being deposited
      • Challenge: input/ output point clouds are different (varying sizes, non-overlapping). Makes for a complicated training problem
    • Space-time field prediction: Predict (T, d) as a function of (x, t) (baseline)
      • Challenge: The mesh is evolving with time. Field values at grid points is not available until that area is built. Need to interpolate to a common background mesh
      • Challenge: Extremely large training problem: Training time for Kevin/Aditya was ~1/2 day for static field prediction. If we interpolate and train over 100 time-steps, it would take us ~100x as long
  • This week:
    • Discussed data preprocessing pipeline with Andrew
    • Create a synthetic dataset with Prof. Kara
      • Fixed uniform background grid so no data preprocessing
      • Can quickly run experiments and compare models

LPBF process simulation with Autodesk NetFabb

Synthetic dataset

Update 9/12/24 - PMFI Project

  • Goal: Predict temperature, deformation in LPBF
  • This week:
    • Migrated codebase to PyTorch
    • Training MLPs on the sandbox dataset
      • (x, z, t) --> Temperature
      • The temperature distribution depends only on Z and T, so it is fundamentally an easy problem.
      • But MLP is having trouble because of the part boundary
      • To test this, train model (x, z, t) --> SDF
  • Plan
    • Train CNNs on rectangular grid
      • (XZ ??, SDF ??, Time) --> Temperature
    • Consider only the voxels inside the part and try a GNN
      • (XZ ??, Time) --> Temperature

PMFI Project Approaches

  • Goal: Predict temperature, deformation in Laser Powder Bed Fusion process
  • Application: Preemptively catch part interference with re-coater blade
  • Approaches:
    • Autoregressive build: Predict (T, d) at each layer as it is being deposited
      • Challenge: input/ output point clouds are different (varying sizes, non-overlapping). Makes for a complicated training problem
    • Space-time field prediction: Predict (T, d) as a function of (x, t)​
      • Approach 1: MLP (BASELINE):  Interpolate all data to a common background grid and train MLP (XYZ, Time) --> (Temp, Disp)
      • Approach 2: GNN: Interpolate all data time-steps to the final mesh. Then train a GNN (XYZ, Time) --> (Temp, Disp)
      • Challenge: Potentially time-consuming data pre-processing step
      • Challenge: Extremely large training problem: Training time for Kevin/Aditya was ~1/2 day for static field prediction. If we interpolate and train over 100 time-steps, it would take us ~100x as long.
      • Would have to engage in data-pruning, multi-GPU training

LPBF process simulation with Autodesk NetFabb

Update 9/19/24 - PMFI Project

  • Goal: Predict temperature, deformation in LPBF
  • This week:
    • Trained MLPs, CNNs on sandbox dataset
    • Results are mixed: need to discuss analysis
    • Setting up GNNs
  • Plan
    • Show that GNNs work well on the sandbox dataset
    • Compare against any transformer based approaches?
    • Scale up to the actual 3D AM dataset

Results: Training CNNs on the next-step prediction problem

True

Prediction

Abs Error

  • Input: temperature and time on the grid at time-step "n"
  • Output: temperature at the next time-step "n+1"

Training (next-step prediction):

T^{n+1} = \mathrm{CNN}(T^{n}, t^n, x, z)

Inference (auto-regressive rollout):

temperature_data = get_data()

# seed the rollout with the true initial temperature field
temperatures = [temperature_data[0]]

for time in range(T):  # loop over time-steps
    curr_temp = temperatures[-1]
    next_temp = CNN(curr_temp, time, X, Z)
    temperatures.append(next_temp)
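For completeness, a minimal sketch of the corresponding next-step training update (illustrative only; `model` is any CNN mapping the stacked input channels to the next temperature field).

import torch
import torch.nn as nn

def train_step(model, optimizer, T_curr, T_next, t_curr, X, Z):
    """One next-step update for T^{n+1} ~= CNN(T^n, t^n, x, z).
    T_curr, T_next: [B, H, W] temperature fields; X, Z: [H, W] grid coordinates."""
    t_chan = torch.full_like(T_curr, float(t_curr))                    # broadcast time as a channel
    inp = torch.stack([T_curr, t_chan, X.expand_as(T_curr), Z.expand_as(T_curr)], dim=1)  # [B, 4, H, W]
    pred = model(inp).squeeze(1)
    loss = nn.functional.mse_loss(pred, T_next)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss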

Large errors during training lead to complete deviation during inference

Large errors localized to the interface

Reason: discontinuities in the temperature distribution, which cannot be captured by the CNN/GNN.

The sandbox was designed to mimic the AM dataset's temperature history at a point.

Hypothesis testing: Easing the discontinuity improves learning

True

Prediction

Error

Training

Inference (auto-regressive rollout)

Introduce a blending function to smooth the discontinuity
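A minimal sketch of one possible blending function (our own illustrative choice, not necessarily the one used in these experiments): each node's temperature is ramped from ambient to the simulated value over a few time-steps after its deposition, instead of appearing instantaneously.

import torch

def blend_temperature(T_sim, T_ambient, deposit_time, t, ramp=3.0):
    """Illustrative blend: interpolate from ambient to the simulated temperature
    over ~`ramp` time-steps after each node's deposition time."""
    w = ((t - deposit_time) / ramp).clamp(0.0, 1.0)   # 0 at deposition, 1 after `ramp` steps
    return (1.0 - w) * T_ambient + w * T_sim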

Question: Is this representative of the NetFabb dataset? 


What is the NetFabb dataset truly like? Discontinuous vs. continuous temperature history at a point.

Implication (first case):

  1. The layer that is being built appears, but is cold.
  2. Then, it heats up.
  3. Then it cools down as the heat source moves away.

This is not representative of the dataset.

Implication (second case):

  1. The layer that is being built appears, and is hot.
  2. Then it cools down as the heat source moves away.

This is representative of the dataset.

Update 9/26/24 - PMFI Project

  • Goal: Predict temperature, deformation in LPBF
  • Method: Track the evolution of fields on the part mesh with GNNs
  • This week:
    • Ran experiments with the diffused interface idea + CNNs
    • Getting large errors in time-evolution (auto-regressive rollout)
    • Setting up GNNs with PyTorch Geometric
  • Plan
    • Show that GNNs work well on the sandbox dataset
    • Scale up to the actual 3D AM dataset
  • Logistics
    • Talk about funding for next year
    • Paper review for Prof. Kara

LPBF process simulation with Autodesk NetFabb

CNNs with diffused interface

True

Prediction

Abs Error

  • Training error is ~4 %
  • Errors accumulate during inference
  • Getting ~20 % error during rollout
  • Discussion with Prof. Zhang on improving rollout accuracy: SimVP
  • Discussion with Aditya on improving rollout accuracy

Training on next-step prediction problem

Inference with auto-regressive rollout

Still seeing significant errors during rollout

Errors localized to the interface but much smaller

temperature_data = get_data()

# seed the rollout with the true initial temperature field
temperatures = [temperature_data[0]]

for time in range(T):  # loop over time-steps
    curr_temp = temperatures[-1]
    next_temp = CNN(curr_temp, time, X, Z)
    temperatures.append(next_temp)

Graph Neural Networks

Meeting 9/30/24 - PMFI proposal ideas

  • Amount: $25,000 -- $70,000
  • Period: Aug 1 2025 -- Jul 31 2026
  • Requirements
    • A Pennsylvania industry partner is required for this program
    • Partner companies may provide in-kind contributions
  • Potential ideas
    • Build on our ongoing 2024-2025 proposal
      • ​2024-2025: Predict deformation field on part
      • 2025-2026: Incorporate dynamics
    • Apply data-driven reduced order modeling to AM

Previously funded projects (2023)

Update 10/04/24 - PMFI Project

  • Goal: Predict temperature, deformation in LPBF
  • Method: Track the evolution of fields on the part mesh with GNNs
  • This week: Tested Graph Neural Network on synthetic 2D dataset
  • Next week: Move to 3D NetFabb dataset
  • Key contribution
    • Diffused interface method to capture evolving interface on static graph

GNN with diffused interface

True

Prediction

Abs Error

  • < 1 % error everywhere
  • Mean Square Error: 1e-5
  • Errors accumulate during inference
  • Getting ~20 % error during rollout
  • Discussion with Prof. Zhang on improving rollout accuracy: SimVP
  • Discussion with Aditya on improving rollout accuracy

Training on next-step prediction problem

Inference with auto-regressive rollout

Still seeing significant errors during rollout

Errors localized to the interface but much smaller

temperature_data = get_data()

# seed the rollout with the true initial temperature field
temperatures = [temperature_data[0]]

for time in range(T):  # loop over time-steps
    curr_temp = temperatures[-1]
    next_temp = CNN(curr_temp, time, X, Z)
    temperatures.append(next_temp)

Update 10/18/24 - PMFI Project

  • Goal: Predict temperature, deformation in LPBF
  • Method: Track the evolution of fields on the part mesh with GNNs
  • Prev: Tested GNN on sandbox dataset
  • This week: Got to work on Netfabb dataset (Ti64 hi-res)
    • ​Monday: Got set up. Tested MeshGraphNet on final-time dataset. Too many edges (700k in some cases). GPU running out of memory.
    • Tuesday: Visualization. Ensured mesh connectivity is correct.
    • Wednesday: Extracted, visualized time-series dataset
  • Next week: Move to 3D NetFabb dataset
  • Key contribution
    • Diffused interface method to capture evolving interface on static graph

NetFabb 3D Dataset (final time)

NetFabb 3D Dataset (time series)

Update 10/24/24 - PMFI Project

  • Goal: Predict temperature, deformation in LPBF
  • Method: Track the evolution of fields on the part mesh with GNNs
  • Prev: Extracted time-series data
  • This week: interpolated time-series data to a common fine mesh
  • Next week: Apply diffused interface and train MeshGNN. Thoughts? Start working on proposal
  • Key contribution
    • Diffused interface method to capture evolving interface on static graph

PMFI Proposal 2025

  • IDEA 1: Time-series analysis of additive manufacturing simulations
    • CHALLENGE: Evolving geometry. No constant graph/mesh
    • Approach 1: Interpolate all data to a static graph. Apply auto-regressive GNN to predict time-series
    • Approach 2: Transformer based auto-regressive approach
  • IDEA 2: Reduced Order Modeling for AM simulations
    • CHALLENGE: Complex physics - thermal transport, solidification
    • APPROACH: Learn governing ODE for latent space dynamics
    • ADVANTAGE: interpretable

Update 10/31/24 - PMFI Project

  • Goal: Predict temperature, deformation in LPBF
  • Method: Track the evolution of fields on the part mesh with GNNs
  • Prev: Extracted time-series data
  • This week:
    • Applying GNN to time-series data. Some impediments
      • GNN is very memory intensive. Cannot run GNN on large meshes with Eagle. 
      • Isolating smaller cases and running GNN only on them.
      • Bridges has been down this week.
    • PMFI proposal
  • Next week: Evaluate results. Compare impact with/without diffused interface
  • Key contribution
    • Diffused interface method to capture evolving interface on static graph

PMFI 2025-2026 Project

Modeling dynamical deformation in LPBF with neural network surrogates

  • Motivation:
    • Prevent recoater blade collision in LPBF during printing.
    • Highlight the complex dynamics of the physics of the AM process.
    • Quantities of interest: displacement, warping. Thermal gradients. Porosity, manufacturing-induced flaws.
    • Challenging multiphysics problems.
  • Gap:
    • Simulation of the governing equations is prohibitively expensive
    • Previous works don't look at dynamically evolving geometries.
    • Specifically, simulations are carried out on different graphs, which is what makes the problem challenging.
  • Method: Develop neural network surrogate model for time-series prediction with LPBF
    • Task 1: Time-series data generation (6 months)
      • Approach: Run Autodesk NetFabb on a large dataset of parts
    • Task 2: Develop surrogate model (6 months)
      • Approach: Transformer based auto-regressive approach
      • Challenge: Training ML models on evolving geometry - no constant graph/mesh
  • Related works:
    • Highlight Kevin TagUNet, PMFI 2024-2025. Explain how this builds on top of your competency.
    • Thermo-mechanical governing equations.

PMFI project -11/15/24

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Approach: Track evolution of fields on part mesh with GNN
  • This week:
    • PMFI proposal
  • Next week:
    • Test mesh GNN on AM dataset. Start implementing transformer method
  • Key contributions:
    • GNN: Diffused interface method to capture evolving interface on static mesh

PROJECT STATUS

  • [X] Data preprocessing
  • [X] Test GNN method on 2D toy problem
  • [  ] Test GNN method on 3D dataset
  • [  ] Test transformer method on 2D toy problem
  • [  ] Test transformer method on 3D dataset

NetFabb 3D Dataset (time series)

PMFI project -11/21/24

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Approach: Track evolution of fields on part mesh with GNN
  • This week:
    • GNN on 3D dataset. single shape
  • Next week:
    • Scale up.
  • Key contributions:
    • GNN: Diffused interface method to capture evolving interface on static mesh

PROJECT STATUS

  • [X] Data preprocessing
  • [X] Test GNN method on 2D toy problem
  • [  ] Test GNN method on 3D dataset
  • [  ] Test transformer method on 2D toy problem
  • [  ] Test transformer method on 3D dataset

NetFabb 3D Dataset (time series)

PMFI project -12/05/24

Modeling dynamical deformation in LPBF with neural surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Approach: Use GNN for next step prediction: \( y_{n+1} = y_{n} + NN(y_n, \ldots) \)
  • Baseline method: Directly apply GNN to graph
  • Our method:
    • Extract the sub-graph corresponding to the deposited material (0 -- n) and the incoming layer (n -- n + 1)
    • Interpolate \(y_n\) to fill the region (n -- n + 1) and resolve the discontinuity
    • Next-step prediction with bulk masking to dissuade large changes away from the interface; this stabilizes the auto-regressive rollout (see the sketch below)
      \( y_{n+1} = \tilde{y}_{n} + NN(\tilde{y}_n, \ldots) \cdot I_\text{bulk} \)
    • Embed back into the full graph
  • Key contributions
    • The baseline approach is not tuned to the particularities of LPBF simulations
    • We present a multi-step pipeline for accurate evaluation of LPBF
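A minimal sketch of the masked next-step update referenced in the list above (PyTorch-style; tensor names and shapes are illustrative assumptions): the network predicts a per-node correction to the interpolated state, and the \(I_\text{bulk}\) mask controls where that correction is applied.

import torch

def masked_next_step(model, y_tilde, features, bulk_mask):
    """y_{n+1} = y~_n + NN(y~_n, ...) * I_bulk
    y_tilde:   interpolated state on the subgraph, [num_nodes, d]
    bulk_mask: the I_bulk indicator from the formula above, [num_nodes, 1]"""
    delta = model(torch.cat([y_tilde, features], dim=-1))   # predicted per-node change
    return y_tilde + delta * bulk_mask                      # mask restricts where the update acts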
[Pipeline diagram: \(y_n\) → subgraph extraction → interpolation → neural network prediction → \(y_{n+1}\)]

Results: Displacement field predictions

Model is trained on 30 shapes. These are results from the test set (\(R^2\) per test case: test01, test09, test06, test00).

Auto-regressive predictions at the final time-step:

Baseline      -0.05 | 0.28 | 0.56 | -3.55
Our approach   0.43 | 0.69 | 0.80 | -0.20

Predictions given ground truth at the final time-step:

Baseline       0.97 | 0.99 | 0.97 | 0.96
Our approach   0.99 | 0.99 | 0.99 | 0.99

Discussion

Observations

  • We do improve prediction quality across the board
  • But the improvement is not enough: our model does not produce reliable predictions for all cases.
  • GNN architecture limitations: MeshGNN produces large errors in bottleneck regions.

Blockers: Limited GPU compute available

  • Limited GPU compute available on PSC Bridges (typically only 1 GPU is available).
  • Limited to experimenting with ~5 shapes at a time.
  • This makes experimentation very hard
    • Modifications (to pipeline, architecture, training hyper-params) improve some cases
    • But the improvement is not uniform
    • The only way forward is to run larger experiments with 100+ shapes.
      • It is not that experiments with fewer GPUs take longer.
      • Experiments without sufficient GPUs are not possible.

GPU Compute Requirements for Graph Networks

  • AM Dataset
    • ~50k nodes, ~500k edges, ~25 time-steps per shape. ~27k shapes
  • GNN architecture
    • Node convolutions, edge convolutions (message passing)
    • Memory demand dominated by edge convolutions
  • GPU Compute
    • A single V-100 GPU (32 GB RAM) can process 4 shapes in a batch
    • Takes 4 hours to train on 30 shapes with 1 GPU
  • Stochastic Gradient Descent
    • \( \text{batch size} = \text{number of GPUs} \times \text{batch size per GPU} \sim 32 \)
  • Minimum GPU Requirement
    • 1 Nvidia V-100 node (8 GPUs, 32 GB RAM per GPU)
    • ~1.5 node-hours to train on 100 shapes
    • Anticipated need:
      • 200 node-hours for experimentation (with 100 shapes)
      • 5,000 node-hours for training with 5,000 shapes

Same configuration as PSC Bridges

Project Plan

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Status
    • Our expectation was that GNN + next-step prediction would lead to good results
    • That has not been the case so far.
  • Next steps
    • Continue pressing forward with GNNs
      • Pursue architectural modifications to alleviate MeshGNN limitations
    • OR Test transformer methods (discussed earlier and proposed for 2025)

PROJECT STATUS

  • [X] Data preprocessing
  • [X] Test GNN method on 2D toy problem
  • [  ] Test GNN method on 3D dataset
  • [  ] Test transformer method on 2D toy problem
  • [  ] Test transformer method on 3D dataset

Spatial Discretization options: Graph Neural Networks; Transformer embedding representation; Neural implicits

Temporal Discretization options: Next-step prediction; LSTM or related architecture; Transformer next-step prediction

Update - 12/12/2024

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Approach: Use GNNs for next step prediction
  • This week
    • Compute needs
      • Bridges staff recommends reserving nodes for blocks of time
      • Perlmutter - need to request allocation
        • Exploratory grant - 100 GPU hrs, 250 CPU hrs
        • Proposal needs to fall under one of DOE's program offices
      • Argonne Polaris also has a director's discretionary allocation
      • Long term solution: Apply for INCITE next year.
  • Next week
    • Now that we know the limits of the MeshGNN architecture, the next step is to experiment with a Graph-UNet architecture.
    • PMFI report - Due January? 1-2 page report. Check the call.
  • Logistics
    • Travel to India - Dec 20 - Jan 12. Working remotely the weeks of Dec 23 and Jan 1-12.

Update - 12/23/2024

SNF-ROM paper revisions for Journal of Computational Physics

  • Both R1 and R2 are overall positive about the paper and the approach
  • Introduction and Methods section
    • Minor clarifications
  • Experiments section
    • R1: Demonstrate grid-independent hyper-reduction
    • R1: Demonstrate time-extrapolation
    • R2: Compare online wall clock times for each ROM method for each test case
    • R2: Plot singular value decay of each test case. Choose POD dimension based on that

Update - 01/03/2025

SNF-ROM paper revisions for Journal of Computational Physics

  • This week
    • Run experiments for reviewers:
      • Demonstrate grid-independent hyper-reduction
      • Principled choice for POD modes
    • In progress
      • R1: Demonstrate time-extrapolation
      • R2: Compare online wall clock times for each ROM method for each test case
      • R2: Plot singular value decay of each test case. Choose POD dimension based on that

Hyper reduction is grid independent

Choose POD to have same accuracy as ML methods

1D Advection

Slow energy decay in POD modes

AM Project Plan

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Status: Graph networks + next-step prediction has not led to good result
  • Plan:
    • Pursue architectural modifications to alleviate MeshGNN limitations
      • Time needed: 3-4 weeks
      • Publishable contributions: Novel GNN architecture for AM time-series
    • Test transformer methods (discussed earlier and proposed for 2025)
      • Time needed: 1-2 months
      • Publishable contributions: Novel transformer method for AM time-series

PROJECT STATUS

  • [X] Data preprocessing
  • [X] Test GNN method on 2D toy problem
  • [X] Test GNN method on 3D dataset
  • [  ] Test transformer method on 2D toy problem
  • [  ] Test transformer method on 3D dataset

Spatial Discretization options: Graph Neural Networks; Transformer embedding representation; Neural implicits

Temporal Discretization options: Next-step prediction; LSTM or related architecture; Transformer next-step prediction

Update 1/31/25 - AM time-series modeling

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Method: GNN/ Transformer + next-step-prediction
    • Pursue architecture/pipeline modification to alleviate MeshGNN limitations
    • Contributions: Novel architecture and pipeline for AM time-series
  • This week:
    • Compared transformer model and GNN on AM dataset
    • Analyze cases where our model is failing
    • Problem: we are looking at individual cases, not the entire dataset
  • Next week:
    • Write metrics to identify out-of-distribution shapes
    • Write evaluation metrics for the entire dataset, not individual cases (see the sketch after this list)
    • Data filtering - remove out-of-distribution shapes from training dataset
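A minimal sketch of a dataset-wide metric of the kind mentioned above (our own illustrative helper): per-shape \(R^2\), aggregated with the median so a few failure cases do not dominate the summary.

import torch

def r_squared(pred, true):
    """Coefficient of determination for one shape's predicted field."""
    ss_res = ((true - pred) ** 2).sum()
    ss_tot = ((true - true.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def dataset_r2(preds, trues):
    """preds/trues: lists of per-shape tensors; returns per-shape scores and their median."""
    scores = torch.stack([r_squared(p, t) for p, t in zip(preds, trues)])
    return scores, scores.median()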

GNN based simulation pipeline so far

Modeling dynamical deformation in LPBF with neural surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Approach: Use GNN for next step prediction: \( y_{n+1} = y_{n} + NN(y_n, \ldots) \)
  • Baseline method: Directly apply GNN to graph
  • Our method:
    • Extract the sub-graph corresponding to the deposited material (0 -- n) and the incoming layer (n -- n + 1)
    • Interpolate \(y_n\) to fill the region (n -- n + 1) and resolve the discontinuity
    • Next-step prediction with bulk masking to dissuade large changes away from the interface; this stabilizes the auto-regressive rollout
      \( y_{n+1} = \tilde{y}_{n} + NN(\tilde{y}_n, \ldots) \cdot I_\text{bulk} \)
    • Embed back into the full graph
  • Key contributions
    • The baseline approach is not tuned to the particularities of LPBF simulations
    • We present a multi-step pipeline for accurate evaluation of LPBF
[Pipeline diagram: \(y_n\) → subgraph extraction → interpolation → neural network prediction → \(y_{n+1}\)]

GNN based simulation pipeline so far

CAD = get_CAD_geometry()
displacement = []

for l in range(num_layers(CAD)):  # loop over layers

    # slice geometries
    next_cad_layer = CAD[l]
    prev_layer_disp = displacement[:l]

    # our model
    next_layer_disp = NN(next_cad_layer, prev_layer_disp)

    # update geometry
    displacement.append(next_layer_disp)

asbuilt_geometry = CAD + displacement
asbuilt_geometry.visualize()

Results: \(R^2\) values for displacement field predictions with GNN

Model trained on 30 shapes. These are results from the test set.

Auto-regressive predictions at the final time-step:

Baseline      -0.05 | 0.28 | 0.56 | -3.55
Our approach   0.43 | 0.69 | 0.80 | -0.20

Predictions given ground truth:

Baseline       0.97 | 0.99 | 0.97 | 0.96
Our approach   0.99 | 0.99 | 0.99 | 0.99

Training is successful! Auto-regressive evaluation yields mixed results: overall improvement over the baseline, with both success and failure cases.

Update 2/6/25 - AM time-series modeling

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Method: GNN/ Transformer + next-step-prediction
    • Pursue architecture/pipeline modification to alleviate MeshGNN limitations
    • Contributions: Novel architecture, pipeline for AM time-series
  • Updates:
    • Analyzed 3500 shapes for data filtering, sped up data-extraction process
    • Identified and removed out-of-distribution cases
    • Trained GNN/ Transformer on steady-state data - good results
  • Next steps:
    • Test with filtered dataset on time-series problem
    • Compute statistics for time-series problem
    • Refine time-series model pipeline based on displacement results
    • Feature engineering: calculate SDF for the shapes
  • Challenges
    • Slow filesystem, limited compute availability on Bridges - should be better now
    • Transolver model does not support batching - need to fix!

Data filtering

[Figures: examples comparing the original and filtered datasets; shapes removed include large meshes, thin shapes/features, parts with too few layers, and bad simulator output.]

Results: Final-time displacement prediction on filtered dataset

Transolver

Mesh Graph Net

Update 2/13/25 - AM time-series modeling

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Method: GNN/ Transformer + next-step-prediction
    • Pursue architecture/pipeline modification to alleviate MeshGNN limitations
    • Contributions: Novel architecture, pipeline for AM time-series
  • Updates:
    • Testing models on time-series dataset
    • Distance field featurization
  • Next steps:
    • Pipeline tuning - modify pipeline to improve results
    • Parameter tuning to improve generalization
    • Architecture changes to transolver - see recent Transolver++ paper
    • Feature engineering: spherical histogram
  • Misc
    • Dataset size: conduct study on generalization as a function of training dataset size

Timeseries results - Z-displacement

[Plots of R-square accuracy vs. time-step during auto-regressive rollout; Training (N=191) and Test (N=47) sets; panels: Vanilla MeshGNN, Vanilla Transolver, Transolver + Pipeline. Annotations: large variance; improved stability; poor generalization so far.]

R-square accuracy at the final time-step:

                          Vanilla MeshGNN | Vanilla Transolver | Transolver + Pipeline
Training set (191 shapes)      0.27       |       0.45         |        0.65
Test set (47 shapes)           0.24       |       0.48         |        0.49

FROM YESTERDAY

Timeseries results - Z-displacement

[Plots of R-square accuracy vs. time-step during auto-regressive rollout; Training (N=191) and Test (N=47) sets; panels: Vanilla MeshGNN, Vanilla Transolver, Transolver + Pipeline. Annotations: large variance; improved stability; better generalization.]

Median R-square accuracy at the final time-step:

                          Vanilla MeshGNN | Vanilla Transolver | Transolver + Pipeline
Training set (191 shapes)      0.27       |       0.45         |        0.68
Test set (47 shapes)           0.24       |       0.48         |        0.65

Final-time results: Z-displacement

Vanilla Transolver

Vanilla MeshGraphNet

Transolver + SDF Feature

Plots of R-square accuracy

Update 2/20/25 - AM time-series modeling

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Method: GNN/ Transformer + next-step-prediction
    • Pursue architecture/pipeline modification to alleviate MeshGNN limitations
    • Contributions: Novel architecture, pipeline for AM time-series
  • Updates:
    • Feature engineering
    • Parameter tuning, running larger models
    • Neither larger models nor feature engineering are leading to faster convergence so far
    • Models continue to get better with more epochs
  • Next steps:
    • Pipeline tuning - modify pipeline to improve results
    • Architecture changes to transolver - see recent Transolver++ paper

Final-time results: Z-displacement

Transolver

MeshGraphNet

Transolver + SDF Feature + longer training

Plots of R-square accuracy

Update 2/28/25 - AM time-series modeling

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Method: Transformer + next-step-prediction
  • Contribution:
    • Novel pipeline for AM time-series models
    • Novel transformer architecture
  • Updates:
    • Set up on Amir's cluster
    • Parameter sweep baseline architecture
      • Neither larger models nor feature engineering are leading to faster convergence so far
      • Models continue to get better with more epochs
      • Likely due to a problem in the pipeline
    • Architecture modifications: training in progress
  • Next steps:
    • Pipeline tuning
    • Modify architecture

Update 3/05/25 - AM time-series modeling

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • Method: Transformer + next-step-prediction
  • Contribution:
    • Novel transformer architecture
  • Updates:
    • Parameter sweeps running on Aamir's clusters
    • Architecture modifications yield positive results
  • Next steps:
    • Continue experimenting with architecture
    • Train time-series model on larger datasets

Timeseries results - Z-displacement, trained on 50 shapes

[Plots: Vanilla Transolver + Pipeline; Cluster Attention + AdaLN + Pipeline (OURS); Cluster Attention + Q-Conditioning + Pipeline (OURS)]

Training statistics (MSE loss):

Transolver + Pipeline                                 7.38e-5
Cluster Attention + AdaLN + Pipeline (OURS)           6.15e-5
Cluster Attention + Q-Conditioning + Pipeline (OURS)  4.99e-5

Enforcing Sparsity in slice weights

ARM Institute - Call for Proposals

Advanced Robotics for Manufacturing (ARM) Institute

  • AIM: develop high-quality datasets and AI-based solutions that can enhance robotic manufacturing
    • Trustworthy & High-Quality Datasets (HQD): Developing manufacturing datasets that are FAIR (Findable, Accessible, Interoperable, Reusable).

    • Robotic AI Skills, Models, and Frameworks: Creating adaptable robotic skills that can improve manufacturing processes.
    • Transfer Learning: Developing AI models that can be applied across different domains to reduce training costs.
  • PROJECT SCOPE:
    • Total Funding: Up to $2M for multiple projects.
    • Max Individual Award: $1M per project.
    • Cost-Share Requirement: 1:1 matching funds from project teams.
    • Duration: Maximum 18 months.
  • ELIGIBILITY:
    • Teams must include at least one U.S. manufacturer.
    • Multi-organization collaborations are required; single-entity proposals are not eligible.

Trustworthy & High-Quality Datasets

  • The ARM Institute is seeking trustworthy, high-quality contextualized datasets to support the development of AI models and build a library of robotic composable skills

Transfer Learning

  • Proposals should explore methods to navigate these challenges, ensuring secure, efficient, and effective adaptation of pre-trained models to diverse manufacturing contexts.

Update 3/13/25

Modeling dynamical deformation in LPBF with neural network surrogates

  • Contributions:
    • Novel transformer architecture for spatial slicing
    • Novel transformer architecture for time-series modeling
    • Time-series dataset for LPBF
  • Updates:
    • Architecture modifications yielding positive results
  • Next steps:
    • Continue experimenting with architecture
    • Train time-series model on benchmark datasets
  • Logistics
    • Publication venue
      • So far, we have been thinking of an AM journal publication
      • Now, we would like to consider publishing in an AI conference
    • Requesting time off March 26-28
    • Funding opportunity: ARM Institute

Update 3/20/25

Modeling dynamical deformation in LPBF with neural network surrogates

  • Contributions:
    • Novel transformer architecture for spatial slicing
    • Novel transformer architecture for time-series modeling
    • Time-series dataset for LPBF
  • Updates:
    • Testing architecture modifications on the Cylinder Flow benchmark dataset
  • Next steps:
    • Continue experimenting with architecture for steady state
    • Continue benchmarking for rollout

Cylinder flow dataset - MeshGraphNet

1 case

100 cases

Cylinder flow dataset -1 test case

Transolver

Transolver + AdaLN Conditioning

Transolver + Q-Conditioning

America Makes - Call for Applied Research Projects

Allied Additive Manufacturing Interoperability (AAMI) Program

  • Scope: Up to $550,000 for 12 months (9 months of technical effort, 3 months of final report development)
  • Aim: Establish AM interoperability between U.S. and U.K. defense supply chains by demonstrating qualification methods for LPBF



     
  • Eligibility

America Makes - Call for Applied Research Projects

  • Phase 1: Develop common OQ framework to validate material properties and process consistency across U.S. and U.K. suppliers.
  • Phase 2: Apply OQ framework to manufacture end-use parts, demonstrating PQ and interoperability
  • Focus Materials: Ti-6Al-4V (Grade 5), Nickel Alloy 718, and 316L Stainless Steel
  • Types of qualifications:

Role of numerical simulation / ML surrogates

  • Phase 1: Operational Qualification (OQ) Framework Development
    • Role for Simulation: Use simulation to model key process parameters (e.g., laser power, scan speed, powder layer thickness) and their impact on material performance, repeatability, and mechanical properties. This could help identify critical variables and reduce the need for extensive physical testing, streamlining the qualification process.
  • Phase 2: Performance Qualification (PQ) and End-Use Parts
    • Role for Simulation: Simulations could validate part-specific performance by modeling build conditions (e.g., machine build height consistency, laser stitch zones) and predicting defects or deviations before physical production. This aligns with the requirement to validate “manufacturing conditions deemed part application specific” (Page 10).
  • Alignment with AM Technology Roadmap and design requirements (REQ)

America Makes - Call for Applied Research Projects

Update 4/02/25

Modeling dynamical deformation in LPBF with neural network surrogates

  • Contributions:
    • Novel transformer architecture for spatial slicing
    • Novel transformer architecture for time-series modeling
    • Time-series dataset for LPBF
  • Updates:
    • Testing architecture modifications on Cylinder Flow benchmark dataset
  • Next steps:
    • Continue experimenting with architecture for steady state
    • Continue benchmarking for rollout

Architecture

Adaptive LayerNorm Block

Q-Conditioning Block

Methods for time conditioning
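As a reference point for the adaptive LayerNorm block named above, here is a minimal, hedged sketch of DiT-style time conditioning (layer names and shapes are assumptions, not our exact block): a time embedding predicts a per-channel scale and shift that are applied after a parameter-free LayerNorm.

    import torch
    import torch.nn as nn

    class AdaLNBlock(nn.Module):
        # Illustrative adaptive LayerNorm: the time embedding modulates features.
        def __init__(self, dim: int, t_dim: int):
            super().__init__()
            self.norm = nn.LayerNorm(dim, elementwise_affine=False)
            # Time embedding -> (scale, shift); hypothetical layer names.
            self.to_scale_shift = nn.Sequential(nn.SiLU(), nn.Linear(t_dim, 2 * dim))

        def forward(self, x, t_emb):
            # x: [B, N, dim] point features; t_emb: [B, t_dim] time embedding
            scale, shift = self.to_scale_shift(t_emb).chunk(2, dim=-1)
            return self.norm(x) * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)

Q-conditioning, by contrast, presumably injects the time signal through the learned slice queries rather than through the normalization; the sketch above covers only the AdaLN path.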

Cylinder flow dataset - 10 test cases

MeshGraphNet

Transolver

Transolver + AdaLN Conditioning

Cylinder flow dataset - 10 cases

Transolver + AdaLN Conditioning

Transolver + AdaLN + Slice Attn

Transolver + Qcond + Slice Attn

Update 17/02/25

Cluster attention: stable point-cloud attention for learning physical fields

  • Contributions:
    • Novel transformer architecture for spatial slicing
    • Novel transformer architecture for time-series modeling
    • Time-series dataset for LPBF
  • Updates:
    • Struggling with cylinder flow dataset
      • Some distributional shift between train and test datasets
      • Selecting a homogeneous subset of the dataset helps; scaling helps too
    • Steady state problem: elasticity
      • Figured out why slice attention wasn't training well earlier.
      • Have a mathematical data-flow framework for our architecture
  • Next steps:
    • Time-series: abandon cylinder flow for now, work on airfoil, AM
    • Steady-state: test model on other datasets - darcy, NS, steady AM

Cylinder flow dataset - distributional shift between train/test datasets

Train dataset stats (100 cases)

Test dataset stats (100 cases)

Loss curves

  • Distributional shift between train and test datasets causes poor generalization (a small diagnostic sketch follows this list)
  • Discussed with Anthony Zu (Amir's student). He saw similar issues.
    • He scaled the model up to 70M parameters; we went up to 35M and saw no improvement.
    • He recommends using the validation dataset in place of the test dataset; we have not attempted this yet.
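To make the distributional-shift claim concrete, a minimal diagnostic sketch (the data layout and function names are assumptions): compare per-field statistics of the train and test cases and flag fields whose means differ by a large fraction of the train standard deviation.

    import numpy as np

    def field_stats(cases):
        # cases: list of [N_points, N_fields] arrays (e.g. velocity components, pressure)
        flat = np.concatenate([c.reshape(-1, c.shape[-1]) for c in cases], axis=0)
        return flat.mean(axis=0), flat.std(axis=0)

    def shift_report(train_cases, test_cases):
        # Normalized mean shift per field; large entries flag a train/test mismatch.
        mu_tr, sd_tr = field_stats(train_cases)
        mu_te, _ = field_stats(test_cases)
        return np.abs(mu_tr - mu_te) / (sd_tr + 1e-8)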

Our approach - create train/test split from subset of train dataset

1000 train / 100 test

  • Splitting train dataset into smaller train/test splits helps... but not enough.

200 train / 50 test

140 train / 35 test

Steady state elasticity dataset: 1000 train / 100 test cases

Transolver (physics attention)

  • Slicing
    • learned query embedding [M, D]
  • Self-attention
    • QKV projection [D, D]
  • Deslicing

Cluster Attention (OURS)

  • Slicing
    • learned query [H, M, D]
    • QK normalization
    • Query-head mixing
  • Head-wise normalization
  • Self-attention
    • Permute & QKV projection [H*D, H*D]
  • Deslicing

Data-flow arguments

  • Slicing
    • Learned query [H, M, D]
      • Transolver applies a linear projection to the permuted x. From the Perceiver point of view, the weights of that linear layer act as a latent query embedding. Transolver thus uses the same query vector for each head; we give each head a unique one.
    • QK normalization
      • Normalize query and key embeddings for stable training (ref. NGPT paper)
    • Query - head mixing
      • Allow slice weights in different heads to communicate with each other (ref. Multi-Token attention, Talking-Head attention papers)
      • Cannot do key mixing because the key vector has size N (point cloud); we also cannot apply a permutation-dependent convolution.
  • Head-wise normalization - stability, break symmetry
  • Self-attention
    • Permute & QKV projection [H*D, H*D]
      • Transolver applies the same projection to each head in parallel, with no head-mixing; we allow for more head-mixing here.
      • Query-head mixing above acts only on the attention weights; here, we mix token values across heads.
      • Think of Transolver as applying a block-diagonal matrix with identical blocks, and of ours as applying a full matrix (a minimal sketch follows this list).
  • Deslicing
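A minimal PyTorch sketch of the cluster-attention path described above; shapes, layer names, and the use of nn.MultiheadAttention for the slice-token attention are illustrative assumptions, not the exact implementation. It shows per-head learned queries [H, M, D], QK normalization, query-head mixing of the slice weights, head-wise normalization, attention over slice tokens with a full [H*D, H*D] projection, and deslicing.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ClusterAttention(nn.Module):
        def __init__(self, dim, heads, n_slices):
            super().__init__()
            assert dim % heads == 0
            self.H, self.M, self.Dh = heads, n_slices, dim // heads
            # Learned query: M slice queries PER head ([H, M, Dh]), not one shared [M, Dh].
            self.query = nn.Parameter(torch.randn(self.H, self.M, self.Dh))
            self.to_key = nn.Linear(dim, dim)
            # Query-head mixing: lets slice weights in different heads communicate.
            self.head_mix = nn.Linear(self.H, self.H, bias=False)
            # Attention among slice tokens; its full [H*Dh, H*Dh] in-projection mixes heads,
            # unlike the same per-head [Dh, Dh] block applied in parallel.
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, x):                       # x: [B, N, dim] point cloud
            B, N, _ = x.shape
            k = self.to_key(x).view(B, N, self.H, self.Dh)
            q = F.normalize(self.query, dim=-1)     # QK normalization for stable training
            k = F.normalize(k, dim=-1)
            logits = torch.einsum("hmd,bnhd->bhnm", q, k)           # [B, H, N, M]
            logits = self.head_mix(logits.transpose(1, -1)).transpose(1, -1)
            w = logits.softmax(dim=-1)              # slice weights over the M slices
            v = x.view(B, N, self.H, self.Dh)
            z = torch.einsum("bhnm,bnhd->bhmd", w, v)               # slicing: N points -> M tokens
            z = F.layer_norm(z, z.shape[-1:])       # head-wise normalization
            z = z.permute(0, 2, 1, 3).reshape(B, self.M, self.H * self.Dh)
            z, _ = self.attn(z, z, z)               # self-attention over slice tokens
            z = z.view(B, self.M, self.H, self.Dh)
            out = torch.einsum("bhnm,bmhd->bnhd", w, z)             # deslicing back to the N points
            return out.reshape(B, N, self.H * self.Dh)

The full in-projection of nn.MultiheadAttention is the "full matrix vs. block-diagonal" distinction made in the last bullet above.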

Experiments

Dataset CA + Concat CA + AdaLN CA + Q-Cond TS + Concat TS + AdaLN UPT GINO
Cylinder Flow
Airfoil
...
AM Dynamic
Dataset Cluster Attention (CA) Transolver (TS) UPT GINO
Elasticity
Darcy
...
AM steady

Dynamic rollout comparison

Steady state comparisons

Maybe we should forego adaptive layer norm for the sake of time

Experiments - standard PDE benchmarks

LNO (from their paper) 0.0052 0.0029 0.0049 0.0026 0.0845 0.0049
Cluster Attention (ours) 0.0040 - - - - -
Our impl of transolver 0.0064 - - - - -
unified_pos \ deriv_loss True False
True 0.005969 0.00638390
False - 0.00679797

Darcy - Target (0.0058 / 0.0049)

Darcy experiments with transolver (conv2D) on transolver repo

unified_pos \ deriv_loss True False
True 0.00612938 0.0070930
False 0.0075420 0.0076147

Darcy experiments with transolver (conv2D) on transolver repo with max_grad_norm=None

Airfoil (standard) - target (0.0053/0.0048)

Setup Train MSE Test MSE Train Rel Error Test Rel Error
(Setup abbreviations: BS = batch size, LR/OCLR = (one-cycle) learning rate, E = epochs, CGN = gradient clip norm, WD = weight decay)
TS conv BS 4 OCLR 1e-3 E 500 CGN 1e-1 - - - 0.0057543
CA conv BS 4 OCLR 1e-3 E 500 CGN 1e-0 9.48e-4 2.52e-3 0.0048791 0.0072914
CA conv BS 4 OCLR 5e-4 E 500 CGN 1e-0 7.03e-4 2.68e-3 0.0042424 0.0073877
CA conv BS 2 OCLR 5e-4 E 500 CGN 1e-0 - - -
CA conv BS 2 OCLR 1e-4 E 500 CGN 1e-0 4.45e-4 2.07e-4 0.0032965 0.005795

Airfoil (standard) - target (0.0053/0.0048)

Cluster Attention w/o conv

Transolver w/o conv

Pipe (standard) - target (0.0027/ 0.0031)

Setup Train MSE Test MSE Train Rel Error Test Rel Error
TS conv BS 2 LR 1e-3 E 500 CGN 1e-1 WD 1e-5 0.0045
TS conv BS 4 LR 1e-3 E 500 CGN 1e-1 WD 1e-5 6.39e-6 7.12e-4 0.000998 0.00462
CA conv BS 4 LR 1e-3 E 500 CGN 1e-0 WD 1e-5 3.62e-4 1.82e-3 0.007495 0.011562
CA conv BS 4 LR 5e-4 E 500 CGN 1e-0 WD 1e-5 7.38e-6 8.94e-4 0.0010101 0.00565
CA conv BS 4 LR 5e-4 E 500 CGN 1e-1 WD 1e-5 4.87e-6 9.69e-4 0.000840 0.00576
CA conv BS 2 LR 5e-4 E 500 CGN 1e-1 WD 1e-5
CA conv BS 2 LR 1e-4 E 500 CGN 1e-1 WD 1e-5
CA conv BS 2 LR 1e-4 E 500 CGN 1e-1 WD 1e-4

Darcy

Transolver H = 128, Conv2D = False

Cluster Attention H = 128, Conv2D = False

Transolver H = 256, Conv2D = False

Cluster Attention H = 256, Conv2D = False

Elasticity

Elasticity

Elasticity

Elasticity - residual connection

Elasticity - without residual connection

Elasticity - L/H study

Attention block

AM time-series modeling project plan

Modeling dynamical deformation in LPBF with neural network surrogates

  • Goal: Predict deformation stress time series in LPBF
  • This week:
    • Compared transformer model and GNN on AM dataset
    • Analyzed cases where our model is failing
    • Problem: we are looking at individual cases, not the entire dataset
  • Next week:
    • Write metrics to identify out-of-distribution shapes
    • Write evaluation metrics for the entire dataset, not individual cases
    • Data filtering - remove out-of-distribution shapes from training dataset
  • Plan:
    • GNN/ Transformer + next-step-prediction (~1-2 months)
      • Pursue architectural modification to alleviate MeshGNN limitations
      • Contributions: Novel architecture and pipeline for AM time-series
    • Transformer next-step prediction (proposed in PMFI 2025) (~3 months)
      • Contributions: Novel transformer method for AM time-series

Spatial Discretization

Temporal Discretization

GNN / Transformer

Transformer embedding

Neural implicits

Naive next step prediction

LSTM

Transformer next step prediction

ROM Project

Nonlinear kernel parameterizations for Neural Galerkin

  • Equation-based, data-free numerical methods for solving PDEs
  • Fast PDE solve in comparison to FEM thanks to compact representation
  • Smaller representation, faster solve in comparison to ML-based ROMs

Status and plan

  • Promising preliminary results on a host of 1D problems
  • Need at least 1 semester of time to flesh out these ideas

Potential new contributions and timeline

  • Develop and finalize proposed parameterizations (1-2 months)
    • Test different parameterization ideas
    • Test on 1D, 2D test cases
  • Develop adaptive refinement / coarsening techniques (1 month)
u(x, t) = \sum_{i=1}^{\textcolor{magenta}{N}} \frac{\textcolor{blue}{c_i(t)}}{2} \left( \tanh(\textcolor{blue}{\omega_0(t)} (x - \textcolor{blue}{x_0(t)})) - \tanh(\textcolor{blue}{\omega_1(t)} (x - \textcolor{blue}{x_1(t)})) \right)

Parameterized Tanh kernels
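For concreteness, a small NumPy sketch of evaluating the tanh-kernel expansion above at one time instant. It reads the edge parameters (omega_0, x_0, omega_1, x_1) as per-kernel quantities, which is an assumption since the slide omits the kernel index on them.

    import numpy as np

    def tanh_kernels(x, c, w0, x0, w1, x1):
        # x: [Nx] evaluation grid; c, w0, x0, w1, x1: [N] per-kernel parameters at time t
        x = x[:, None]                              # broadcast grid against the N kernels
        bumps = 0.5 * c * (np.tanh(w0 * (x - x0)) - np.tanh(w1 * (x - x1)))
        return bumps.sum(axis=-1)                   # u(x, t) = sum of N tanh "bump" kernels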

ROM Project - Nonlinear parameterizations

Nonlinear kernel parameterizations for Neural Galerkin

  • Equation-based, data-free numerical methods for solving PDEs

Status and plan

  • Promising preliminary results on a host of 1D problems
  • Need at least 1 semester of time to flesh out these ideas

Potential new contributions and timeline

  • Develop novel parameterizations that have several benefits
    • Very expressive (handles shocks) with few parameters (speedup)
    • Fast hyper-reduction as parameterization is naturally sparse
    • Accurate integration as parameterization is sparse
    • In comparison, DNN parameterizations are large and do not result in a speedup; other kernelized parameterizations (e.g., Gaussian kernels) are not as expressive
  • Develop adaptive refinement techniques
  • Develop adaptive coarsening techniques
u(x, t) = \sum_{i=1}^{\textcolor{magenta}{N}} \frac{\textcolor{blue}{c_i(t)}}{2} \left( \tanh(\textcolor{blue}{\omega_0(t)} (x - \textcolor{blue}{x_0(t)})) - \tanh(\textcolor{blue}{\omega_1(t)} (x - \textcolor{blue}{x_1(t)})) \right)

Parameterized Tanh kernels

Neural Galerkin - Advection Diffusion problem

Parameterized Gaussian (OURS): 3 parameters, 8 collocation points

Deep Neural Network (BASELINE): ~150 parameters, 256 collocation points

Multiplicative Filter Network (MFN): ~210 parameters, 256 collocation points

Error due to limited expressivity of this simple model

FAILED TO CONVERGE

Improve model fit by splitting kernels

  • At time t=0, we are fitting the initial condition given to us with our nonlinear model. This is the projection step.
  • To improve the fit, we are going to make the model more expressive with boosting.
  • We do this by repeatedly dividing each kernel in two and optimizing both.
  • This is akin to adaptive refinement (see the sketch below).

1 Kernel (6 params)

4 Kernels (21 params)
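A minimal sketch of the split-and-refit (boosting) loop described above, under an assumed parameter layout and hypothetical helper names; model is any callable evaluating the kernel expansion (e.g., the tanh-kernel evaluator sketched earlier), and least_squares stands in for whatever optimizer the projection step actually uses.

    import numpy as np
    from scipy.optimize import least_squares

    def split_kernels(params):
        # params: [N, 5] rows of (c, w0, x0, w1, x1); return [2N, 5] by halving each kernel at its midpoint
        c, w0, x0, w1, x1 = params.T
        xm = 0.5 * (x0 + x1)
        left  = np.stack([c, w0, x0, w1, xm], axis=1)
        right = np.stack([c, w0, xm, w1, x1], axis=1)
        return np.concatenate([left, right], axis=0)

    def refit(params, x, u0, model):
        # Re-optimize all kernel parameters so model(x, params) matches the initial condition u0
        residual = lambda p: model(x, p.reshape(params.shape)) - u0
        return least_squares(residual, params.ravel()).x.reshape(params.shape)

    # Boosting loop: project once, then repeatedly split every kernel and refit.
    # params = refit(params, x, u0, model)
    # for _ in range(n_levels):
    #     params = refit(split_kernels(params), x, u0, model)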


Biweekly co-advisor meeting
