Sam Foreman
May, 2021
This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.
Collaborators:
Huge thank you to:
Critical slowing down!
Inefficient!
saved
dropped
Goal: Generate an ensemble of independent configurations
1. Construct chain (random walk): \(x^{\prime} = x + \delta\), where \(\delta \sim \mathcal{N}(0, \mathbb{1})\)
\(x_{0}\rightarrow x_{1}\rightarrow x_{2}\rightarrow\cdots\rightarrow x_{m-1}\rightarrow x_{m}\rightarrow x_{m+1}\rightarrow\cdots\rightarrow x_{n-2}\rightarrow x_{n-1}\rightarrow x_{n}\)
2. Thermalize ("burn-in"):
\(x_{0}\rightarrow x_{1}\rightarrow x_{2}\rightarrow\cdots\rightarrow x_{m-1}\rightarrow x_{m}\rightarrow x_{m+1}\rightarrow\cdots\rightarrow x_{n-2}\rightarrow x_{n-1}\rightarrow x_{n}\)
3. Drop correlated samples ("thinning"):
\(x_{0}\rightarrow x_{1}\rightarrow x_{2}\rightarrow\cdots\rightarrow x_{m-1}\rightarrow x_{m}\rightarrow x_{m+1}\rightarrow\cdots\rightarrow x_{n-2}\rightarrow x_{n-1}\rightarrow x_{n}\)
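The three steps (construct a chain, discard the burn-in, thin the remainder) can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the talk: the one-dimensional Gaussian target `log_prob`, the Metropolis accept/reject rule, and the values of `burn_in` and `thin` are all assumptions chosen for the example.

```python
import math
import random


def random_walk_metropolis(log_prob, x0, n_steps, seed=0):
    """Build a chain x_0 -> x_1 -> ... -> x_n from random-walk proposals."""
    rng = random.Random(seed)
    x = x0
    chain = [x]
    for _ in range(n_steps):
        x_new = x + rng.gauss(0.0, 1.0)  # x' = x + delta, delta ~ N(0, 1)
        log_ratio = log_prob(x_new) - log_prob(x)
        # Metropolis rule for a symmetric proposal (assumed here)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = x_new
        chain.append(x)
    return chain


def log_prob(x):
    """Toy target (assumption): standard normal, up to a constant."""
    return -0.5 * x * x


chain = random_walk_metropolis(log_prob, x0=0.0, n_steps=5000)

burn_in, thin = 1000, 10          # illustrative values only
samples = chain[burn_in::thin]    # drop x_0..x_{m-1}, then keep every 10th state
```

Thinning throws away most of the chain, which is one way to read the "Inefficient!" label: the more correlated successive states are, the larger `thin` has to be.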
Introduce fictitious momentum:
\(v\sim\mathcal{N}(0, 1)\)
Target distribution:
\(p(x)\propto e^{-S(x)}\)
Joint target distribution:
lift to phase space
Hamilton's Equations
(trajectory)
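For reference, the standard HMC construction behind these labels can be written out as follows. This assumes a unit mass matrix, consistent with \(v\sim\mathcal{N}(0, 1)\); the symbol \(H\) for the Hamiltonian is introduced here and does not appear in the surviving slide text.

\[
p(x, v) = p(x)\,p(v) \propto e^{-S(x)}\,e^{-\frac{1}{2}v^{T}v} = e^{-H(x, v)}, \qquad H(x, v) = S(x) + \tfrac{1}{2}v^{T}v
\]

\[
\dot{x} = \frac{\partial H}{\partial v} = v, \qquad \dot{v} = -\frac{\partial H}{\partial x} = -\partial_{x}S(x)
\]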
1. Half-step \(v\)-update:
2. Full-step \(x\)-update:
3. Half-step \(v\)-update:
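The three updates listed above make up one leapfrog step. A minimal sketch for a scalar \(x\), assuming the standard leapfrog integrator with step size `eps` and a toy action \(S(x) = x^{2}/2\) (both assumptions for illustration; `hmc_step` and its arguments are hypothetical names):

```python
import math
import random


def leapfrog(x, v, grad_S, eps):
    """One leapfrog step: half-step v, full-step x, half-step v."""
    v = v - 0.5 * eps * grad_S(x)   # half-step v-update
    x = x + eps * v                 # full-step x-update
    v = v - 0.5 * eps * grad_S(x)   # half-step v-update
    return x, v


def hmc_step(x, S, grad_S, eps, n_lf, rng):
    """Resample v, integrate n_lf leapfrog steps, then accept or reject."""
    v = rng.gauss(0.0, 1.0)                      # v ~ N(0, 1)
    h_init = S(x) + 0.5 * v * v
    x_new, v_new = x, v
    for _ in range(n_lf):
        x_new, v_new = leapfrog(x_new, v_new, grad_S, eps)
    h_prop = S(x_new) + 0.5 * v_new * v_new
    # Metropolis-Hastings test on the change in H
    accept = rng.random() < math.exp(min(0.0, h_init - h_prop))
    return (x_new if accept else x), accept


def S(x):
    """Toy action (assumption): S(x) = x^2 / 2."""
    return 0.5 * x * x


def grad_S(x):
    return x


rng = random.Random(0)
x, n_accepted = 0.0, 0
for _ in range(1000):
    x, accepted = hmc_step(x, S, grad_S, eps=0.1, n_lf=10, rng=rng)
    n_accepted += accepted
```

The integrator is reversible: applying `leapfrog`, flipping the sign of `v`, and applying it again returns to the starting point, which is what makes the simple accept/reject rule valid.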
Stuck!
(\(m_{t}\odot x\))-independent
masks:
Momentum (\(v_{k}\)) scaling
Gradient \(\partial_{x}S(x_{k})\) scaling
Translation
(\(v\)-independent)
where \((s_{v}^{k}, q^{k}_{v}, t^{k}_{v})\) and \((s_{x}^{k}, q^{k}_{x}, t^{k}_{x})\) are parameterized by neural networks
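To make the roles of the scaling and translation terms concrete, here is a sketch of one generalized leapfrog step. It follows the update form of the original L2HMC paper (Levy, Hoffman, Sohl-Dickstein), which may differ in detail from the version on the slides. The callables `nets_v` and `nets_x` are hypothetical stand-ins for the neural networks, and `m` is a binary mask selecting the components of \(x\) updated in the first sub-step.

```python
import math


def zero_nets(a, b):
    """Hypothetical stand-in for the networks: s = q = t = 0 everywhere."""
    z = [0.0] * len(a)
    return z, z, z


def v_update(x, v, grad_S, eps, nets_v):
    """Half-step v-update; (s, q, t) do not depend on v."""
    g = grad_S(x)
    s, q, t = nets_v(x, g)
    v_new = [
        vi * math.exp(0.5 * eps * si)                   # momentum scaling
        - 0.5 * eps * (gi * math.exp(eps * qi) + ti)    # gradient scaling + translation
        for vi, gi, si, qi, ti in zip(v, g, s, q, t)
    ]
    return v_new, sum(0.5 * eps * si for si in s)       # log|det Jacobian|


def x_update(x, v, eps, m, nets_x):
    """Update the components selected by m; (s, q, t) do not depend on m * x."""
    frozen = [(1 - mi) * xi for mi, xi in zip(m, x)]
    s, q, t = nets_x(frozen, v)
    x_new = [
        (1 - mi) * xi
        + mi * (xi * math.exp(eps * si) + eps * (vi * math.exp(eps * qi) + ti))
        for mi, xi, vi, si, qi, ti in zip(m, x, v, s, q, t)
    ]
    return x_new, sum(mi * eps * si for mi, si in zip(m, s))


def generalized_leapfrog(x, v, grad_S, eps, m, nets_v=zero_nets, nets_x=zero_nets):
    """Half-step v, two masked x sub-updates (m, then 1 - m), half-step v."""
    m_bar = [1 - mi for mi in m]
    v, ld1 = v_update(x, v, grad_S, eps, nets_v)
    x, ld2 = x_update(x, v, eps, m, nets_x)
    x, ld3 = x_update(x, v, eps, m_bar, nets_x)
    v, ld4 = v_update(x, v, grad_S, eps, nets_v)
    return x, v, ld1 + ld2 + ld3 + ld4


def grad_S(x):
    """Toy action (assumption): S(x) = sum(x_i^2) / 2."""
    return list(x)


x0, v0, m = [1.0, -0.5, 0.25, 2.0], [0.3, 0.2, -0.1, 0.0], [1, 0, 1, 0]
x1, v1, logdet = generalized_leapfrog(x0, v0, grad_S, eps=0.1, m=m)
```

With `zero_nets` the step reduces to the ordinary leapfrog update and the log-determinant is zero. Because each sub-update is not volume-preserving in general, the accumulated `logdet` has to enter the accept/reject probability.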
masks:
Stack of fully-connected layers
\(x_{k} \in U(1) \longrightarrow x_{k} = \left[\cos\theta, \sin\theta\right]\)
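A small sketch of this mapping, assuming \(\theta\) is the angle that parameterizes the \(U(1)\) variable (the function name `embed_u1` is hypothetical):

```python
import math


def embed_u1(theta):
    """Map an angle to the point [cos(theta), sin(theta)] on the unit circle."""
    return [math.cos(theta), math.sin(theta)]


a = embed_u1(0.3)
b = embed_u1(0.3 + 2.0 * math.pi)   # same group element, shifted by 2*pi
```

The two results agree up to floating-point error, so anything that consumes this representation sees the same input for \(\theta\) and \(\theta + 2\pi\).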
construct trajectory
Compute loss + backprop
Metropolis-Hastings accept/reject
re-sample momentum + direction
(varied slowly)
e.g. \( \{0.1, 0.2, \ldots, 0.9, 1.0\}\)
(increasing)
Note:
\(A(\xi'|\xi)\) = acceptance probability
\(A(\xi'|\xi)\cdot\delta(\xi',\xi)\) = avg. distance
\(\xi\) = initial state
\(\xi'\) = proposed state
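The quantities in the note can be combined into a batch estimate of the acceptance-weighted distance. This is a sketch under assumptions: \(\delta\) is taken to be the squared Euclidean distance between configurations, the acceptance probability includes a log-Jacobian term `logdet` as in the original L2HMC paper, and the function names are hypothetical.

```python
import math


def acceptance_prob(h_init, h_prop, logdet=0.0):
    """A(xi'|xi) = min(1, exp(H(xi) - H(xi') + log|det J|))."""
    return math.exp(min(0.0, h_init - h_prop + logdet))


def expected_squared_jump(x_init, x_prop, acc):
    """Batch average of A(xi'|xi) * delta(xi', xi), delta = squared distance."""
    total = 0.0
    for xi, xp, a in zip(x_init, x_prop, acc):
        delta = sum((p - q) ** 2 for p, q in zip(xp, xi))
        total += a * delta
    return total / len(acc)


x_init = [[0.0, 0.0]] * 4     # toy batch of initial configurations
x_prop = [[1.0, 1.0]] * 4     # toy batch of proposals
acc = [0.5] * 4               # toy acceptance probabilities
esjd = expected_squared_jump(x_init, x_prop, acc)
```

A training objective could then maximize this quantity, for instance by minimizing its negative; the exact loss used in the talk is not recoverable from the slide text.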
HMC
L2HMC
expected squared jump distance:
continuous, differentiable
discrete, hard to work with
\(\beta = 5\)
\(\beta = 6\)
\(\beta = 7\)
Rescale: \(N_{\mathrm{LF}}\cdot\tau^{\mathcal{Q}_{\mathbb{Z}}}_{\mathrm{int}}\) to account for different trajectory lengths
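A sketch of how such a rescaled autocorrelation time might be computed from a time series of the topological charge. The estimator below, \(\tau_{\mathrm{int}} = \tfrac{1}{2} + \sum_{t}\rho(t)\) truncated at the first non-positive \(\rho(t)\), is one simple convention chosen for illustration and is not necessarily the one used for the plots; the series and the value of `n_lf` are placeholders.

```python
import random


def integrated_autocorr_time(series, max_lag=None):
    """Estimate tau_int = 1/2 + sum_t rho(t), stopping at the first rho(t) <= 0."""
    n = len(series)
    mean = sum(series) / n
    x = [s - mean for s in series]
    var = sum(xi * xi for xi in x) / n
    max_lag = n // 2 if max_lag is None else max_lag
    tau = 0.5
    for lag in range(1, max_lag):
        cov = sum(x[i] * x[i + lag] for i in range(n - lag)) / (n - lag)
        rho = cov / var
        if rho <= 0.0:       # simple truncation heuristic (assumption)
            break
        tau += rho
    return tau


rng = random.Random(0)
series = [rng.gauss(0.0, 1.0) for _ in range(5000)]   # placeholder data
tau_int = integrated_autocorr_time(series)

n_lf = 10                        # leapfrog steps per trajectory (illustrative)
rescaled = n_lf * tau_int        # cost-adjusted autocorrelation time
```

Multiplying by `n_lf` puts samplers with different trajectory lengths on a common footing, since each trajectory costs roughly `n_lf` gradient evaluations.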
Leapfrog step
variation in the avg plaquette
continuous topological charge
shifted energy
\(\beta = 7\)
\(\simeq \beta = 3\)
Momentum (\(v_{k}\)) scaling
Gradient \(\partial_{x}S(x_{k})\) scaling
Translation
where \((s_{v}^{k}, q^{k}_{v}, t^{k}_{v})\) and \((s_{x}^{k}, q^{k}_{x}, t^{k}_{x})\) are parameterized by neural networks
\(\alpha_{s^{k}_{v}}\cdot s_{v}^{k}(\zeta_{v_{k}})\)
\(\alpha_{q^{k}_{v}}\cdot q_{v}^{k}(\zeta_{v_{k}})\)
\(\alpha_{t^{k}_{v}}\cdot t_{v}^{k}(\zeta_{v_{k}})\)
\(\alpha_{s^{k}_{x}}\cdot s_{x}^{k}(\zeta_{x_{k}})\)
\(\alpha_{q^{k}_{x}}\cdot q_{x}^{k}(\zeta_{x_{k}})\)
\(\alpha_{t^{k}_{x}}\cdot t_{x}^{k}(\zeta_{x_{k}})\)
\(\alpha \in (0, 1)\)
\(512\sim 1\times\)
\(1024 \sim 1.04\times\)
\(2048 \sim 1.29\times\)
\(4096 \sim 1.73\times\)
\(8192 \sim 2.19\times\)
512
1024
2048
4096
\(8192\sim \times\)