Machine Learning in Lattice QCD
Sam Foreman
02/10/2020
Introduction
Lattice QCD:
- Non-perturbative approach to solving the QCD theory of the strong interaction between quarks and gluons.
Calculations in Lattice QCD proceed in 3 steps:
- Gauge field generation: Use Markov Chain Monte Carlo (MCMC) methods to sample independent gauge field (gluon) configurations.
- Propagator calculations: Compute how quarks propagate in these fields ("quark propagators").
- Contractions: Combine quark propagators into correlation functions and observables.
Motivation: Lattice QCD
- Generating independent gauge configurations is a MAJOR bottleneck for Lattice QCD.
- As the lattice spacing \(a \rightarrow 0\) (the continuum limit), the MCMC updates tend to get stuck in sectors of fixed gauge topology.
- This causes the number of steps needed to adequately sample different topological sectors to increase exponentially: critical slowing down!
Markov Chain Monte Carlo (MCMC)
- Goal: Generate an ensemble of independent samples drawn from the desired target distribution \(p(x)\).
- This is done using the Metropolis-Hastings accept/reject algorithm:
Given:
- Initial distribution, \(\pi_{0}\)
- Proposal distribution, \(q(x^{\prime}|x)\)
Update:
- Sample \(x^{\prime} \sim q(\cdot | x)\)
- Accept \(x^{\prime}\) with probability \(A(x^{\prime}|x) = \min\left(1, \frac{p(x^{\prime})\,q(x|x^{\prime})}{p(x)\,q(x^{\prime}|x)}\right)\), which reduces to \(\min\left(1, \frac{p(x^{\prime})}{p(x)}\right)\) if \(q(x^{\prime}|x) = q(x|x^{\prime})\)
Metropolis-Hastings: Accept/Reject
import numpy as np
def metropolis_hastings(p, steps=1000):
    """Sample from the (unnormalized) density p using a symmetric Gaussian proposal."""
    x = 0.  # initialize config
    samples = np.zeros(steps)
    for i in range(steps):
        x_prime = x + np.random.randn()  # proposed config
        if np.random.rand() < p(x_prime) / p(x):  # compute A(x'|x)
            x = x_prime  # accept proposed config
        samples[i] = x  # accumulate configs (the old config is repeated on rejection)
    return samples
As the number of steps grows, the distribution of the samples converges to \(p(x)\).
Issues with MCMC
- Need to wait for the chain to "burn in" (become thermalized)
- Nearby configurations on the chain are correlated with each other.
- Multiple steps needed to produce independent samples ("mixing time")
- Measurable via integrated autocorrelation time, \(\tau^{\mathrm{int}}_{\mathcal{O}}\)
Smaller \(\tau^{\mathrm{int}}_{\mathcal{O}}\longrightarrow\) less computational cost!
[Figure: Markov chain trace, with the "burn-in" period and "correlated!" nearby samples marked]
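As a rough illustration of how the integrated autocorrelation time might be estimated from a chain, the sketch below sums the normalized autocorrelation function. The convention \(\tau^{\mathrm{int}} = \frac{1}{2} + \sum_{t \geq 1} \rho(t)\) and the cutoff at the first non-positive \(\rho(t)\) are assumptions made here for simplicity; the function name `integrated_autocorr_time` and the AR(1) test chain are hypothetical and not taken from the talk.

```python
import numpy as np

def integrated_autocorr_time(series, max_lag=None):
    """Estimate tau_int = 1/2 + sum_{t>=1} rho(t) for a 1D series.

    The sum is cut off at the first lag where the estimated
    autocorrelation rho(t) is non-positive (a simple heuristic window).
    """
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = len(x)
    if max_lag is None:
        max_lag = n // 4
    var = np.dot(x, x) / n
    tau = 0.5
    for t in range(1, max_lag):
        rho = np.dot(x[:-t], x[t:]) / ((n - t) * var)
        if rho <= 0.0:
            break
        tau += rho
    return tau

rng = np.random.default_rng(0)
n_steps = 20000

# Independent samples: tau_int should stay close to 1/2.
white = rng.normal(size=n_steps)

# Strongly correlated AR(1) chain: tau_int should be much larger.
ar1 = np.zeros(n_steps)
for i in range(1, n_steps):
    ar1[i] = 0.9 * ar1[i - 1] + rng.normal()

tau_white = integrated_autocorr_time(white)
tau_ar1 = integrated_autocorr_time(ar1)
print(tau_white, tau_ar1)
```

The correlated chain needs many more steps per effectively independent sample, which is the cost that a smaller \(\tau^{\mathrm{int}}_{\mathcal{O}}\) avoids.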
Hamiltonian Monte Carlo (HMC)
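- We can improve on the "guess and check" approach of MCMC by using Hamiltonian Monte Carlo (HMC).
- Target distribution \(p(x)\) defined by an energy function \(U(x)\) such that \(p(x) \propto \exp{\left(-U(x)\right)}\)
- Introduce a (fictitious) momentum variable \(v\), normally distributed and independent of \(x\): \(p(v)\propto \exp{\left(-\frac{1}{2}v^{T}v\right)}\)
- HMC samples from the canonical distribution: \(p(x, v) \propto \exp{\left(-U(x) - \frac{1}{2}v^{T}v\right)} = \exp{\left(-H(x, v)\right)}\), with Hamiltonian \(H(x, v) = U(x) + \frac{1}{2}v^{T}v\)
We know how this "evolves" in time, via Hamilton's equations: \(\dot{x} = \partial H/\partial v = v\), \(\dot{v} = -\partial H/\partial x = -\partial_{x}U(x)\).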
HMC: Leapfrog Integrator
- Integrate Hamilton's equations numerically using the leapfrog integrator.
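- The leapfrog integrator proceeds in three steps, with step size \(\varepsilon\):
Update momenta (half step): \(v^{\frac{1}{2}} = v - \frac{\varepsilon}{2}\,\partial_{x}U(x)\)
Update position (full step): \(x^{\prime} = x + \varepsilon\, v^{\frac{1}{2}}\)
Update momenta (half step): \(v^{\prime} = v^{\frac{1}{2}} - \frac{\varepsilon}{2}\,\partial_{x}U(x^{\prime})\)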
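The three leapfrog updates can be sketched in a few lines of NumPy. This is a minimal illustration, assuming unit mass and a toy standard-normal target with \(U(x) = \frac{1}{2}x^{T}x\); the names `leapfrog` and `hamiltonian` are hypothetical. It also checks two properties that make the integrator useful for HMC: the energy drift stays small, and the trajectory is reversible under a momentum flip.

```python
import numpy as np

def leapfrog(x, v, grad_U, eps):
    """One leapfrog step: half-step momentum, full-step position, half-step momentum."""
    v_half = v - 0.5 * eps * grad_U(x)
    x_new = x + eps * v_half
    v_new = v_half - 0.5 * eps * grad_U(x_new)
    return x_new, v_new

# Toy target (an assumption for this sketch): U(x) = x.x / 2, so grad U(x) = x.
def grad_U(x):
    return x

def hamiltonian(x, v):
    return 0.5 * np.dot(x, x) + 0.5 * np.dot(v, v)

eps, n_steps = 0.1, 100
x0 = np.array([1.0, -0.5])
v0 = np.array([0.3, 0.8])

# Integrate forward for n_steps.
x, v = x0, v0
for _ in range(n_steps):
    x, v = leapfrog(x, v, grad_U, eps)
energy_drift = abs(hamiltonian(x, v) - hamiltonian(x0, v0))

# Flip the momentum and integrate again: we should return to the start.
x_back, v_back = x, -v
for _ in range(n_steps):
    x_back, v_back = leapfrog(x_back, v_back, grad_U, eps)

print(energy_drift, x_back, v_back)
```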
HMC: Leapfrog Integrator
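- Write the action of the leapfrog integrator in terms of an operator \(\mathbf{L}\), acting on the state \(\xi \equiv (x, v)\): \(\mathbf{L}\xi \equiv \mathbf{L}(x, v) = (x^{\prime}, v^{\prime})\)
- Introduce a "momentum-flip" operator, \(\mathbf{F}\): \(\mathbf{F}(x, v) = (x, -v)\)
- The acceptance probability is then given by: \(A(\mathbf{FL}\xi|\xi) = \min{\left(1, \frac{p(\mathbf{FL}\xi)}{p(\xi)}\left|\frac{\partial\left[\mathbf{FL}\xi\right]}{\partial\xi^{T}}\right|\right)}\)
Jacobian determinant, \(|\mathcal{J}| = 1\) (for HMC)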
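Putting the pieces together, a single HMC update might look like the sketch below: resample the momentum, apply the leapfrog operator several times, flip the momentum, then accept or reject using the change in \(H\) (the Jacobian factor is 1 for plain HMC, so it drops out). The function name `hmc_step`, the toy Gaussian target, and the values of `eps` and `n_steps` are assumptions for illustration only.

```python
import numpy as np

def hmc_step(x, U, grad_U, eps, n_steps, rng):
    """One HMC update: resample v, integrate, flip momentum, accept/reject."""
    v = rng.normal(size=x.shape)          # v ~ p(v)
    x_new, v_new = x.copy(), v.copy()
    for _ in range(n_steps):              # leapfrog operator L, applied n_steps times
        v_new = v_new - 0.5 * eps * grad_U(x_new)
        x_new = x_new + eps * v_new
        v_new = v_new - 0.5 * eps * grad_U(x_new)
    v_new = -v_new                        # momentum flip F
    h_old = U(x) + 0.5 * (v @ v)
    h_new = U(x_new) + 0.5 * (v_new @ v_new)
    # A = min(1, p(FL xi) / p(xi)), with p proportional to exp(-H) and |J| = 1.
    accept_prob = np.exp(min(0.0, h_old - h_new))
    if rng.random() < accept_prob:
        return x_new, True
    return x, False

# Toy target (an assumption): 2D standard normal, U(x) = x.x / 2.
def U(x):
    return 0.5 * np.dot(x, x)

def grad_U(x):
    return x

rng = np.random.default_rng(1)
x = np.zeros(2)
samples, n_accept = [], 0
for _ in range(2000):
    x, accepted = hmc_step(x, U, grad_U, eps=0.2, n_steps=10, rng=rng)
    samples.append(x)
    n_accept += accepted

samples = np.array(samples)
accept_rate = n_accept / len(samples)
print(samples.mean(), samples.var(), accept_rate)
```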
Hamiltonian Monte Carlo (HMC)
- Integrating Hamilton's equations allows us to move far in state space while staying (roughly) on iso-probability contours of \(p(x, v)\)
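[Figure: HMC trajectory in phase space, annotated with the following stages]
- Resample the momentum: \(v \sim p(v)\)
- Integrate \(H(x, v)\): \(t \longrightarrow t + \varepsilon\)
- Project onto the target parameter space: \(p(x, v) \longrightarrow p(x)\)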
HMC: Issues
- Cannot easily traverse low-density zones.
- Energy levels selected randomly \(\longrightarrow\) slow mixing! (especially for Lattice QCD)
What do we want in a good sampler?
- Fast mixing
- Fast burn-in
- Mix across energy levels
- Mix between modes
L2HMC: Learning to HMC
- L2HMC generalizes HMC by introducing 6 new functions, \(S_{\ell}, T_{\ell}, Q_{\ell}\), for \(\ell = x, v\) into the leapfrog integrator.
- Given an analytically described distribution, L2HMC provides a statistically exact sampler, with highly desirable properties:
- Fast burn-in.
- Fast mixing.
Ideal for lattice QCD due to critical slowing down!
- Idea: MINIMIZE the autocorrelation time (time needed for samples to be independent).
- Can be done by MAXIMIZING the "distance" traveled by the integrator.
L2HMC: Augmented Leapfrog
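- Idea: Generalize HMC by introducing six new functions:
- \(S_{x}(\theta_{x}),\,T_{x}(\theta_{x}),\,Q_{x}(\theta_{x})\); \(\quad\) \(S_{v}(\theta_{v}),\,T_{v}(\theta_{v}),\,Q_{v}(\theta_{v})\)
- Augmented momentum update, with inputs \(\zeta_{1} = (x, \partial_{x}U(x), t)\): \(v^{\prime} = v\odot\exp\left(\frac{\varepsilon}{2}S_{v}(\zeta_{1})\right) - \frac{\varepsilon}{2}\left[\partial_{x}U(x)\odot\exp\left(\varepsilon Q_{v}(\zeta_{1})\right) + T_{v}(\zeta_{1})\right]\)
- Momentum scaling: \(\exp\left(\frac{\varepsilon}{2}S_{v}(\zeta_{1})\right)\)
- Gradient scaling: \(\exp\left(\varepsilon Q_{v}(\zeta_{1})\right)\)
- Translation: \(T_{v}(\zeta_{1})\)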
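To make the roles of momentum scaling, gradient scaling, and translation concrete, here is a minimal sketch of one augmented momentum half-step. It assumes the update has the form \(v^{\prime} = v\odot e^{\frac{\varepsilon}{2}S_{v}} - \frac{\varepsilon}{2}\left[\partial_{x}U(x)\odot e^{\varepsilon Q_{v}} + T_{v}\right]\), and it replaces the trained networks with hypothetical fixed functions (`zeros`, `const_scale`). The step index that the inputs may also carry is omitted here. Setting all three functions to zero should recover the ordinary leapfrog half-step.

```python
import numpy as np

def augmented_momentum_update(x, v, grad_U, eps, S_v, Q_v, T_v):
    """Augmented momentum half-step; returns the new momentum and log|det J|."""
    grad = grad_U(x)
    zeta = np.concatenate([x, grad])      # inputs to S_v, Q_v, T_v (no step index in this sketch)
    scale, grad_scale, translation = S_v(zeta), Q_v(zeta), T_v(zeta)
    v_new = (v * np.exp(0.5 * eps * scale)
             - 0.5 * eps * (grad * np.exp(eps * grad_scale) + translation))
    # The functions do not depend on v, so dv'/dv is diagonal with
    # entries exp(eps/2 * S_v): the log-determinant is a simple sum.
    log_det = 0.5 * eps * np.sum(scale)
    return v_new, log_det

# Toy target (an assumption): U(x) = x.x / 2, so grad U(x) = x.
def grad_U(x):
    return x

# Hypothetical stand-ins for the learned functions.
def zeros(zeta):
    return np.zeros(len(zeta) // 2)

def const_scale(zeta):
    return np.full(len(zeta) // 2, 0.2)

x = np.array([1.0, -2.0])
v = np.array([0.5, 0.25])
eps = 0.1

# S = Q = T = 0 reduces to the standard HMC half-step with |J| = 1.
v_hmc, log_det_hmc = augmented_momentum_update(x, v, grad_U, eps, zeros, zeros, zeros)

# A nonzero S_v rescales the momentum and gives a nontrivial Jacobian.
v_scaled, log_det_scaled = augmented_momentum_update(x, v, grad_U, eps, const_scale, zeros, zeros)
print(v_hmc, log_det_hmc, v_scaled, log_det_scaled)
```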
L2HMC: Modified Leapfrog
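- Write the action of the new leapfrog integrator as an operator \(\mathbf{L}_{\theta}\), parameterized by \(\theta\), acting on the augmented state \(\xi \equiv (x, v, d)\), where \(d \in \{-1, +1\}\) is the direction of integration.
- Applying this operator \(M\) times successively to \(\xi\) (\(M\) sets the trajectory length): \(\mathbf{L}_{\theta}\xi = \mathbf{L}_{\theta}(x, v, d) = (x^{\prime}, v^{\prime}, d)\)
- The "flip" operator \(\mathbf{F}\) reverses \(d\): \(\mathbf{F}\xi = (x, v, -d)\).
- Write the complete dynamics step as: \(\mathbf{FL}_{\theta}\xi = \xi^{\prime}\)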
L2HMC: Accept/Reject
- Accept the proposed configuration, \(\xi^{\prime}\), with probability:
\(A(\xi^{\prime}|\xi) = \min{\left(1, \frac{p(\mathbf{FL}_{\theta}\xi)}{p(\xi)}\left|\frac{\partial\left[\mathbf{FL}_{\theta}\xi\right]}{\partial\xi^{T}}\right|\right)}\)
- Unlike HMC, the Jacobian determinant \(|\mathcal{J}| \neq 1\).
- \(\mathcal{J}\) can be computed efficiently!
- Only depends on: \(S_{x}, S_{v}, \zeta_{i}\).
- This has the effect of deforming the energy landscape.
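In practice the acceptance probability is usually evaluated in log space, with the log of the Jacobian determinant added to the log-density difference. The helper below is a small sketch of that bookkeeping; the name `accept_prob` and its arguments are hypothetical, and the log-densities may be unnormalized since only their difference enters.

```python
import numpy as np

def accept_prob(log_p_old, log_p_new, log_det_jac=0.0):
    """A = min(1, p(new)/p(old) * |J|), evaluated in log space for stability."""
    log_ratio = log_p_new - log_p_old + log_det_jac
    return float(np.exp(min(0.0, log_ratio)))

# Plain HMC: |J| = 1, so equal densities are always accepted.
a_hmc = accept_prob(-1.0, -1.0)

# With |J| = 1/2 and equal densities, the move is accepted half the time.
a_half = accept_prob(-1.0, -1.0, log_det_jac=np.log(0.5))

# A density gain can be offset (here only partly) by a contracting Jacobian.
a_mixed = accept_prob(-2.0, -1.0, log_det_jac=-0.5)
print(a_hmc, a_half, a_mixed)
```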
L2HMC: Loss function
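- Choose a loss designed to reduce mixing (autocorrelation) time \(\tau_{\mathcal{O}}^{\mathrm{int}}\):
\(\ell_{\lambda}\left(\xi, \xi^{\prime}, A(\xi^{\prime}|\xi)\right) = \frac{\lambda^{2}}{\delta(\xi, \xi^{\prime})\,A(\xi^{\prime}|\xi)} - \frac{\delta(\xi, \xi^{\prime})\,A(\xi^{\prime}|\xi)}{\lambda^{2}}\)
- First term: penalizes the sampler if unable to move effectively.
- Second term: encourages typical moves to be large.
- \(\lambda\): scale parameter; \(A(\xi^{\prime}|\xi)\): acceptance probability.
- "Distance" between \(\xi, \xi^{\prime}\): \(\delta(\xi, \xi^{\prime}) = \|x - x^{\prime}\|^{2}_{2}\)
- Idea: minimize the autocorrelation time by maximizing the "distance" traveled during integration.
- Note: \(\delta \times A = \) "expected" distance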
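A loss of this kind can be sketched directly in NumPy, assuming the form \(\lambda^{2}/(\delta A) - (\delta A)/\lambda^{2}\) averaged over a batch of proposals. The function name `l2hmc_loss`, the batch averaging, and the small `tiny` constant guarding against division by zero are assumptions for this sketch. A real training loop would compute the same quantity in an autodiff framework so that it can be backpropagated.

```python
import numpy as np

def l2hmc_loss(x, x_prime, accept_prob, scale=1.0, tiny=1e-8):
    """Mean over the batch of scale^2 / (delta * A) - (delta * A) / scale^2."""
    delta = np.sum((x - x_prime) ** 2, axis=-1)   # squared distance ||x - x'||^2
    expected_dist = delta * accept_prob + tiny    # delta * A; tiny avoids dividing by zero
    return np.mean(scale**2 / expected_dist - expected_dist / scale**2)

# Batch of two 3-dimensional configurations.
x = np.zeros((2, 3))
accept = np.array([1.0, 0.5])

# Large accepted moves give a low (negative) loss ...
loss_far = l2hmc_loss(x, np.ones((2, 3)), accept)

# ... while tiny moves are heavily penalized by the first term.
loss_near = l2hmc_loss(x, 0.1 * np.ones((2, 3)), accept)
print(loss_far, loss_near)
```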
Network Architecture
[Flowchart: training and inference workflow]
- Build model, initialize network
- Train step: run dynamics, accept/reject; calculate the loss; backpropagate
- Finished training? If not, repeat the train step
- Save trained model
- Run inference on saved model
GMM: Autocorrelation
[Figure: autocorrelation for a Gaussian mixture model (GMM); legend: L2HMC, HMC]
L2HMC: \(U(1)\) Lattice Gauge Theory
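- Wilson action: \(S_{\beta}(\phi) = \beta\sum_{P}\left(1 - \cos\phi_{P}\right)\)
where \(\phi_{P} \equiv \phi_{\mu}(n) + \phi_{\nu}(n + \hat{\mu}) - \phi_{\mu}(n + \hat{\nu}) - \phi_{\nu}(n)\) is the sum of \(\phi\) around the plaquette \(P\).
- Link variables: \(U_{\mu}(n) = e^{i\phi_{\mu}(n)} \in U(1)\), with \(\phi_{\mu}(n) \in [0, 2\pi)\)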
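For a two-dimensional periodic lattice, the plaquette sums and the Wilson action can be sketched with `np.roll`. The array layout `phi[x, y, mu]`, the function names, and the form \(\beta\sum_{P}(1 - \cos\phi_{P})\) are assumptions made for this illustration. As a sanity check, the action should vanish for a "cold" configuration and be unchanged by a random gauge transformation \(\phi_{\mu}(n) \rightarrow \phi_{\mu}(n) + \alpha(n) - \alpha(n + \hat{\mu})\).

```python
import numpy as np

def plaquette_angles(phi):
    """Sum of link angles around each plaquette.

    phi[x, y, mu] is the angle of the link leaving site (x, y) in
    direction mu (0 or 1); periodic boundaries are handled by np.roll.
    """
    phi0, phi1 = phi[..., 0], phi[..., 1]
    return (phi0 + np.roll(phi1, -1, axis=0)
            - np.roll(phi0, -1, axis=1) - phi1)

def wilson_action(phi, beta):
    """S = beta * sum_P (1 - cos(phi_P))."""
    return beta * np.sum(1.0 - np.cos(plaquette_angles(phi)))

rng = np.random.default_rng(2)
L, beta = 8, 2.0

# Cold start: all links are zero, so every plaquette is trivial.
s_cold = wilson_action(np.zeros((L, L, 2)), beta)

# Random ("hot") configuration.
phi = rng.uniform(0.0, 2.0 * np.pi, size=(L, L, 2))
s_orig = wilson_action(phi, beta)

# Random gauge transformation: the action should not change.
alpha = rng.uniform(0.0, 2.0 * np.pi, size=(L, L))
phi_g = phi.copy()
phi_g[..., 0] += alpha - np.roll(alpha, -1, axis=0)
phi_g[..., 1] += alpha - np.roll(alpha, -1, axis=1)
s_gauge = wilson_action(phi_g, beta)
print(s_cold, s_orig, s_gauge)
```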
\(U(1)\) Lattice Gauge Theory
[Figure: panels labeled "Good sampling" and "Poor sampling"; legend: L2HMC, HMC]
Thanks for listening!
- Code is available at: http://github.com/saforem2/l2hmc-qcd
- Slides are available at: http://slides.com/samforeman/l2hmc-qcd