Differentiable Particle Accelerator Simulations
D. Vilsmeier
DiPAS
Model-based optimization
Physics
Model
Optimizer
Example: Particle Tracking
Example: Linear optics
Transfer Matrix
- Tracking through one element is analogue to a neural network layer (b = 0, σ = id)
- Weights are given and constrained by the transfer matrix
- Whole beamline is analogue to a feed-forward network
@ Entrance
@ Exit
Example: Particle Tracking
Example: Quadrupole
Tracking as computation graph
Express as combination / chain of elementary operations
Output / prediction
Input
(Elementary) operation
Parameter of element
Layout as computation graph (keeping track of operations and values at each stage)
Tracking as computation graph
Tracking as computation graph
0.1
0.32
3.2
0.98
1.0
0.32
0.32
0.32
0.95
0.31
0.0295
0.0475
0.077
0.03
0.05
Tracking as computation graph
0.1
0.32
3.2
0.98
1.0
0.32
0.32
0.32
0.95
0.31
0.0295
0.0475
0.077
1.0
1.0
1.0
0.03
0.05
0.05
0.03
0.096
0.009
-0.016
0.091
0.075
-0.091
-0.025
Why not derive manually?
For long lattices this type of derivatives can become very quickly very tedious
Beamline consists of many elements
Tracking in circular accelerator requires multiple turns
Dependence on lattice parameters typically non-linear
Symbolic differentiation can help to "shortcut"
Optimization of network parameters
"Supervised Learning" using samples of input and expected output
Sample 🡒 Forward pass 🡒 Prediction
Quality of prediction by comparing with (expected) sample output, e.g. MSE
Minimization of loss by tuning network parameters 🡒 Optimization problem
Gradient based parameter updates using (analytically) exact gradients via Backpropagation algorithm
Dataflow &
Differentiable
Programming
Example: Quadrupole tuning
- Beamline, 160 m, 21 quadrupoles, gradients are to be adjusted
- Fulfill multiple goals:
- Minimize losses along beamline
- Minimize beam spot size at target
- Keep beam spot size below threshold at beam dump location
21 Quadrupoles
Target
Beam dump
Envelope
Optimize gradients
Example: Inference of model errors
- Same beamline as before, infer quadrupole gradient errors w.r.t model
- Quality metric: mean squared error of simulated vs. measured Trajectory Response Matrix (TRM)
- TRM = orbit response @ BPMs in dependence on kicker strengths
beam position @ BPM
kicker strength
What about other optimizers?
Gradient-Free
Gradient-Approx.
Gradient-Exact
E.g. Nelder-Mead, Evolutionary algorithms, Particle Swarm, etc.
Any derivative-based method, e.g. via finite-difference approx.
Balance exploration vs. exploitation of the loss landscape (parameter space)
Exploration: Try out yet unknown locations
Exploitation: Use information about landscape to perform an optimal parameter update
No "built-in" exploration, but can be included via varying starting points (global) and learning rate schedules (local)
Symbolic differentiation or differentiable programming
Additional advantages
- Good convergence properties, also for (very) large parameter spaces
- No need to rely on plain vanilla gradient descent, all sort of advanced optimizers are available, e.g. RMSprop or Adam
- Two function evaluations per gradient computation (where e.g. finite-difference approx. uses N function evaluations)
- Running on GPU / TPU? Yes!
Deep learning packages provide straightforward ways to switch devices
→ "Batteries included"
https://commons.wikimedia.org/wiki/File:Geforce_fx5200gpu.jpg
Not limited to particle tracking nor linear optics
High versatility w.r.t. data and metrics
Differentiable Particle Accelerator Simulations
By Dominik Vilsmeier
Differentiable Particle Accelerator Simulations
Introducing a novel framework for differentiable simulations of particle accelerators (DIPAS).
- 825