D. Vilsmeier
Example: "Fully connected, feed-forward network"
Input Layer
"Hidden" Layers
Output Layer
Transformation per layer
Bias term
Weight matrix
(Non-)linear function ("activation")
Input data
Prediction
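The per-layer transformation above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation behind the slides: the layer sizes (3, 4, 4, 2), the tanh activation, the random weights, and the helper names `dense_layer`, `forward` and `random_layer` are all assumptions chosen for the example.

```python
import math
import random

def identity(z):
    """Linear 'activation' (assumed here for the output layer)."""
    return z

def dense_layer(x, weights, bias, activation):
    """Apply one layer: activation(W x + b), written with plain lists."""
    return [
        activation(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(weights, bias)
    ]

def forward(x, layers):
    """Chain the per-layer transformation from input data to prediction."""
    for weights, bias, activation in layers:
        x = dense_layer(x, weights, bias, activation)
    return x

def random_layer(n_in, n_out, activation, rng):
    """Hypothetical helper: small random weights, zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out, activation

rng = random.Random(0)
layers = [
    random_layer(3, 4, math.tanh, rng),  # input layer (3) -> hidden layer (4)
    random_layer(4, 4, math.tanh, rng),  # hidden layer (4) -> hidden layer (4)
    random_layer(4, 2, identity, rng),   # hidden layer (4) -> output layer (2)
]
prediction = forward([0.1, -0.2, 0.3], layers)
print(prediction)
```

Every layer is the same operation with different parameters, which is why the whole network reduces to a loop over (weight matrix, bias term, activation) triples.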
"Supervised Learning" using samples of input and expected output
Sample 🡒 Forward pass 🡒 Prediction
Quality of prediction (loss) measured by comparing with the (expected) sample output, e.g. mean squared error (MSE)
Minimization of loss by tuning network parameters 🡒 Optimization problem
Gradient-based parameter updates using (analytically) exact gradients via the backpropagation algorithm
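The training loop described above can be sketched for the smallest possible case, a single weight and bias, where backpropagation reduces to one application of the chain rule. The synthetic samples (generated from y = 2x + 1), the learning rate and the number of steps are assumptions chosen for illustration.

```python
# Samples of input and expected output (synthetic, generated from y = 2x + 1).
samples = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(-10, 11)]

w, b = 0.0, 0.0        # network parameters: one weight, one bias
learning_rate = 0.1    # assumed value

def mse(w, b):
    """Mean squared error between predictions and expected outputs."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)

initial_loss = mse(w, b)
for step in range(500):
    grad_w = grad_b = 0.0
    for x, y in samples:
        prediction = w * x + b                                # forward pass
        d_loss_d_pred = 2.0 * (prediction - y) / len(samples)
        grad_w += d_loss_d_pred * x                           # d(pred)/dw = x
        grad_b += d_loss_d_pred                               # d(pred)/db = 1
    w -= learning_rate * grad_w                               # parameter update
    b -= learning_rate * grad_b
final_loss = mse(w, b)
print(w, b, final_loss)
```

The gradients here are exact rather than approximated, so each update uses the true slope of the loss at the current parameters.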
Dataflow & Differentiable Programming
Example: Linear optics
Transfer Matrix
@ Entrance
@ Exit
Example: Quadrupole
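As a sketch of the quadrupole example, the standard thick-lens transfer matrix of a focusing quadrupole maps the coordinates (x, x') at the entrance to those at the exit. The function names `quadrupole_matrix` and `transfer` and the numerical values (k = 0.1, L = 1.0, x = 0.05, x' = 0.03) are assumptions for illustration.

```python
import math

def quadrupole_matrix(k, length):
    """2x2 transfer matrix of a focusing quadrupole (k > 0) in its focusing plane."""
    s = math.sqrt(k)
    phi = s * length
    return [
        [math.cos(phi), math.sin(phi) / s],
        [-s * math.sin(phi), math.cos(phi)],
    ]

def transfer(matrix, state):
    """Map (x, x') at the entrance to (x, x') at the exit."""
    x, xp = state
    return (
        matrix[0][0] * x + matrix[0][1] * xp,
        matrix[1][0] * x + matrix[1][1] * xp,
    )

M = quadrupole_matrix(k=0.1, length=1.0)     # illustrative values
x_exit, xp_exit = transfer(M, (0.05, 0.03))
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]  # equals 1 for a linear optics element
print(x_exit, xp_exit, det)
```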
Express as combination / chain of elementary operations
Output / prediction
Input
(Elementary) operation
Parameter of element
Layout as computation graph (keeping track of operations and values at each stage)
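A hand-written version of such a computation graph might look as follows. The forward pass stores the value of every elementary operation, and the reverse pass walks the graph backwards to obtain the derivative of the exit position with respect to the quadrupole strength k. The node names, the decomposition into operations and the input values (k = 0.1, L = 1.0, x = 0.05, x' = 0.03) are assumptions for this sketch; a finite-difference check is included only to confirm the reverse pass.

```python
import math

def forward(k, length, x, xp):
    """Forward pass: evaluate each elementary operation and store its value."""
    v = {}
    v["s"] = math.sqrt(k)          # sqrt(k)
    v["phi"] = v["s"] * length     # sqrt(k) * L
    v["c"] = math.cos(v["phi"])
    v["sn"] = math.sin(v["phi"])
    v["r"] = 1.0 / v["s"]          # 1 / sqrt(k)
    v["q"] = v["sn"] * v["r"]      # sin(phi) / sqrt(k)
    v["a"] = v["c"] * x
    v["b"] = v["q"] * xp
    v["out"] = v["a"] + v["b"]     # position x at the exit
    return v

def backward(v, length, x, xp):
    """Reverse pass: propagate d(out)/d(node) from the output back to k."""
    d_a = d_b = 1.0                             # out = a + b
    d_c = d_a * x                               # a = c * x
    d_q = d_b * xp                              # b = q * xp
    d_sn = d_q * v["r"]                         # q = sn * r
    d_r = d_q * v["sn"]
    d_phi = -d_c * v["sn"] + d_sn * v["c"]      # c = cos(phi), sn = sin(phi)
    d_s = d_phi * length - d_r / v["s"] ** 2    # phi = s * L, r = 1 / s
    return d_s * 0.5 / v["s"]                   # s = sqrt(k)

k, length, x, xp = 0.1, 1.0, 0.05, 0.03
values = forward(k, length, x, xp)
grad_k = backward(values, length, x, xp)

# Central finite difference as an independent check of the reverse pass.
eps = 1e-6
fd = (forward(k + eps, length, x, xp)["out"]
      - forward(k - eps, length, x, xp)["out"]) / (2 * eps)
print(values["out"], grad_k, fd)
```

Differentiable programming frameworks automate exactly this bookkeeping, so the reverse pass never has to be written by hand.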
[Figure: computation graph of the quadrupole example with numerical values at each stage]
For long lattices, working out this type of derivative by hand quickly becomes very tedious
Beamline consists of many elements
Tracking in circular accelerator requires multiple turns
Dependence on lattice parameters typically non-linear
Symbolic differentiation can help to "shortcut"
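To give a sense of the effort involved, the exit position behind a single focusing quadrupole and its derivative with respect to the strength k already read as follows. The symbols k, L, x_0 and x_0' are assumed notation for this sketch, not taken from the slides.

```latex
\begin{align}
x_1 &= \cos(\sqrt{k}\,L)\, x_0 + \frac{\sin(\sqrt{k}\,L)}{\sqrt{k}}\, x_0' \\
\frac{\partial x_1}{\partial k} &= -\frac{L \sin(\sqrt{k}\,L)}{2\sqrt{k}}\, x_0
  + \left[ \frac{L \cos(\sqrt{k}\,L)}{2k} - \frac{\sin(\sqrt{k}\,L)}{2 k^{3/2}} \right] x_0'
\end{align}
```

Each additional element multiplies another such matrix onto the chain, so the number of terms grows quickly with the length of the lattice.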
21 Quadrupoles
Target
Beam dump
Envelope
Optimize gradients
beam position @ BPM
kicker strength
Gradient-Free: E.g. Nelder-Mead, Evolutionary algorithms, Particle Swarm, etc.
Gradient-Approx.: Any derivative-based method, e.g. via finite-difference approx.
Gradient-Exact: Symbolic differentiation or differentiable programming
Balance exploration vs. exploitation of the loss landscape (parameter space)
Exploration: Try out yet unknown locations
Exploitation: Use information about landscape to perform an optimal parameter update
No "built-in" exploration, but can be included via varying starting points (global) and learning rate schedules (local)
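The difference between approximate and exact gradients can be sketched with plain gradient descent on a toy problem. The quadratic loss, the step size h, the learning rate and all function names are hypothetical and chosen only for this illustration.

```python
def loss(p):
    """Hypothetical toy loss with its minimum at p = (1, -2)."""
    return (p[0] - 1.0) ** 2 + 10.0 * (p[1] + 2.0) ** 2

def exact_gradient(p):
    """Analytically exact gradient, as symbolic or automatic differentiation would yield."""
    return [2.0 * (p[0] - 1.0), 20.0 * (p[1] + 2.0)]

def finite_difference_gradient(p, h=1e-3):
    """Forward-difference approximation: one extra loss evaluation per parameter."""
    base = loss(p)
    grad = []
    for i in range(len(p)):
        shifted = list(p)
        shifted[i] += h
        grad.append((loss(shifted) - base) / h)
    return grad

def gradient_descent(gradient, start, learning_rate=0.04, steps=200):
    """Plain gradient descent: pure exploitation, no built-in exploration."""
    p = list(start)
    for _ in range(steps):
        g = gradient(p)
        p = [p_i - learning_rate * g_i for p_i, g_i in zip(p, g)]
    return p

start = [0.0, 0.0]
p_exact = gradient_descent(exact_gradient, start)
p_approx = gradient_descent(finite_difference_gradient, start)
print(p_exact, p_approx)
```

With the forward-difference gradient the iteration settles at a point offset from the true minimum by roughly h/2 in each parameter, and the number of loss evaluations per step grows with the number of parameters. The exact gradient has neither limitation.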
https://commons.wikimedia.org/wiki/File:Geforce_fx5200gpu.jpg
Limited neither to particle tracking nor to linear optics
High versatility w.r.t. data and metrics