Calculating properties in AFQMC using automatic differentiation

Properties in AFQMC

  • Mixed estimators often require very accurate trials
\langle \hat{O}\rangle_m = \dfrac{\langle\psi_T|\hat{O}\hat{P}|\psi_I\rangle}{\langle\psi_T|\hat{P}|\psi_I\rangle}
\langle \hat{O}\rangle_{\text{corrected}} = 2\dfrac{\langle\psi_T|\hat{O}\hat{P}|\psi_I\rangle}{\langle\psi_T|\hat{P}|\psi_I\rangle} - \dfrac{\langle\psi_T|\hat{O}|\psi_T\rangle}{\langle\psi_T|\psi_T\rangle}
  • Backpropagation correlates bra and ket to reduce noise
\langle \hat{O}\rangle = \dfrac{\langle\psi_T|\hat{P}'\hat{O}\hat{P}|\psi_I\rangle}{\langle\psi_T|\hat{P}'\hat{P}|\psi_I\rangle}
  • Variational estimators are tricky to evaluate in AFQMC
\langle \hat{O}\rangle_v = \dfrac{\langle\psi_T|\hat{P}\hat{O}\hat{P}|\psi_I\rangle}{\langle\psi_T|\hat{P}^2|\psi_I\rangle}

Response formulation

\langle\hat{O}\rangle = \dfrac{d E(\hat{H}+\lambda \hat{O})}{d\lambda}\bigg\vert_{\lambda=0}

  • Analytical derivatives (?)
  • Finite difference: small energy differences between large noisy energies \(\rightarrow\) correlated sampling
  • Lots of calculations required if multiple observables or forces are desired
  • Automatic differentiation is a viable alternative
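As a sanity check of the response formula, the sketch below compares a central finite difference of \(E(\hat{H}+\lambda\hat{O})\) with the directly evaluated ground-state expectation value for a two-level model. The 2x2 matrices and the step size are hypothetical numbers chosen for illustration, and the energies here are exact rather than stochastic, so the noise problem mentioned above does not appear.

```python
import math

def ground_state(a, b, d):
    """Lowest eigenvalue and unnormalised eigenvector of [[a, b], [b, d]] (b != 0)."""
    energy = 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return energy, (b, energy - a)

# Hypothetical model: symmetric 2x2 matrices stored as (M00, M01, M11).
H = (0.3, 0.4, -0.3)
O = (1.0, 0.2, -1.0)

def energy(lam):
    """Ground-state energy of H + lam * O."""
    a, b, d = (h + lam * o for h, o in zip(H, O))
    return ground_state(a, b, d)[0]

# Response formulation: <O> = dE/dlambda at lambda = 0, by central difference.
step = 1e-4
fd_value = (energy(step) - energy(-step)) / (2 * step)

# Direct evaluation <v|O|v> / <v|v> with the unperturbed ground state.
_, (v0, v1) = ground_state(*H)
direct_value = (O[0] * v0 * v0 + 2 * O[1] * v0 * v1 + O[2] * v1 * v1) / (v0 * v0 + v1 * v1)

print(fd_value, direct_value)
```

With stochastic energies the same difference quotient divides the statistical error by the step size, which is why each observable needs its own carefully correlated pair of runs.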

Outline

  • Forward and reverse mode automatic differentiation:
    putting the chain rule to good use
  • Analysis of stochastic and systematic errors:
    scaling of error and cost with system and basis size
  • Results:
    comparison to experimental dipole moments

Automatic differentiation (AD)

Consider a program that evaluates a function \(f\)

f:\mathbb{R}^m\rightarrow\mathbb{R}^n

Intermediates formed during execution

f = f_l\circ f_{l-1}\circ\dots\circ f_1
f' = f_l'\cdot f_{l-1}'\cdots f_1'

The Jacobian-vector products are performed numerically

The order in which these evaluations are performed depends on the problem and dictates efficiency
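To illustrate how the multiplication order matters, the sketch below builds the gradient of a function \(\mathbb{R}^3\rightarrow\mathbb{R}\) from three small, made-up Jacobians in two ways: right to left (one pass per input) and left to right (one pass per output). The matrices are hypothetical and stand in for the Jacobians of the intermediates \(f_1, f_2, f_3\).

```python
def matvec(jac, vec):
    """Jacobian-vector product J v."""
    return [sum(a * v for a, v in zip(row, vec)) for row in jac]

def vecmat(vec, jac):
    """Vector-Jacobian product u^T J."""
    return [sum(vec[i] * jac[i][j] for i in range(len(jac)))
            for j in range(len(jac[0]))]

# Hypothetical Jacobians of f1: R^3 -> R^2, f2: R^2 -> R^2, f3: R^2 -> R^1.
J1 = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]
J2 = [[2.0, 0.0], [1.0, 1.0]]
J3 = [[1.0, -1.0]]
chain = [J1, J2, J3]

# Right to left: one pass through the chain per input direction (3 passes).
n_inputs = len(J1[0])
forward_gradient = []
for k in range(n_inputs):
    tangent = [1.0 if i == k else 0.0 for i in range(n_inputs)]
    for jac in chain:
        tangent = matvec(jac, tangent)
    forward_gradient.append(tangent[0])

# Left to right: a single pass seeded with the one output.
cotangent = [1.0]
for jac in reversed(chain):
    cotangent = vecmat(cotangent, jac)
reverse_gradient = cotangent

print(forward_gradient, reverse_gradient)
```

Both orders give the same gradient; the number of passes scales with the number of inputs in the first case and with the number of outputs in the second.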

Forward mode

Suppose we only want derivatives with respect to a small number of inputs

\dot{y} = f_l'\cdot f_{l-1}'\cdots f_1'\cdot\dot{x}

derivative time cost ~ 2-3 \(\times\) cost of \(f\)

Good choice if only a small number of observables are required

\langle\hat{O}\rangle = \dfrac{d E(\hat{H}+\lambda \hat{O})}{d\lambda}\bigg\vert_{\lambda=0}
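A minimal sketch of forward mode using dual numbers, applied to \(dE/d\lambda\) for a two-level model. The `Dual` class, the helper names, and the 2x2 matrix elements are assumptions made for illustration; this is not the AFQMC implementation, only the same idea of carrying a tangent alongside every intermediate.

```python
import math

class Dual:
    """A value together with its derivative along one input direction."""
    def __init__(self, value, tangent=0.0):
        self.value, self.tangent = value, tangent

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.tangent + other.tangent)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.tangent - other.tangent)

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.value * other.value,
                    self.tangent * other.value + self.value * other.tangent)
    __rmul__ = __mul__

def sqrt(x):
    root = math.sqrt(x.value)
    return Dual(root, 0.5 * x.tangent / root)

# Hypothetical symmetric 2x2 matrices stored as (M00, M01, M11).
H = (0.3, 0.4, -0.3)
O = (1.0, 0.2, -1.0)

def ground_energy(lam):
    """Lowest eigenvalue of H + lam * O, written with overloaded arithmetic."""
    a, b, d = (h + lam * o for h, o in zip(H, O))
    half_gap = 0.5 * (a - d)
    return 0.5 * (a + d) - sqrt(half_gap * half_gap + b * b)

# Seed the tangent of lambda with 1 and evaluate at lambda = 0.
result = ground_energy(Dual(0.0, 1.0))
print(result.value, result.tangent)  # energy and <O> = dE/dlambda
```

One evaluation returns both the energy and one derivative, so each extra observable costs one more tangent.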

Reverse mode

To calculate derivatives of a few outputs with respect to many inputs

\bar{x}^{T} = f_1'^{T}\cdot f_{2}'^{T}\cdots f_l'^{T}\cdot\bar{y}

gradient time cost ~ 2-4 \(\times\) cost of \(f\)

Checkpointing can be used to reduce the memory cost

Suitable for, e.g., RDMs:

\frac{dE(\hat{H}+\lambda_{ij} a_{i}^{\dagger}a_j)}{d\lambda_{ij}}
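A minimal sketch of reverse mode on a tape of intermediates, used to get all elements of a one-particle RDM from a single backward pass. The `Var` class, the `backward` helper, and the two-orbital, one-particle model with its matrix elements are assumptions made for illustration; no checkpointing is done here, so every intermediate is kept in memory.

```python
import math

class Var:
    """Node of the computation graph: value, inputs with local derivatives, adjoint."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (input Var, d(self)/d(input))
        self.grad = 0.0

    @staticmethod
    def lift(x):
        return x if isinstance(x, Var) else Var(float(x))

    def __add__(self, other):
        other = Var.lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))
    __radd__ = __add__

    def __sub__(self, other):
        other = Var.lift(other)
        return Var(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        other = Var.lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))
    __rmul__ = __mul__

def sqrt(x):
    root = math.sqrt(x.value)
    return Var(root, ((x, 0.5 / root),))

def backward(output):
    """Propagate adjoints from the output back to every input."""
    order, seen = [], set()
    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)
    visit(output)
    output.grad = 1.0
    for node in reversed(order):  # consumers are processed before their inputs
        for parent, local in node.parents:
            parent.grad += local * node.grad

# Hypothetical one-particle Hamiltonian in two orbitals; each element is an input.
h = [[Var(0.3), Var(0.4)], [Var(0.4), Var(-0.3)]]
half_gap = 0.5 * (h[0][0] - h[1][1])
energy = 0.5 * (h[0][0] + h[1][1]) - sqrt(half_gap * half_gap + h[0][1] * h[1][0])

backward(energy)
rdm = [[h[i][j].grad for j in range(2)] for i in range(2)]  # dE/dh_ij
print(energy.value, rdm)
```

A single backward pass fills in all four derivatives, and the trace of the resulting matrix equals the particle number (one in this model).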

Toy example

VMC for a harmonic oscillator in an electric field:

H(F) = (p^2 + x^2)/2 - Fx
\psi_{\sigma}(x; \mu) = e^{-(x-\mu)^2/2\sigma^2}

Correlated sampling 1:

E(\delta)-E(0) = \langle E_L^{\delta}\dfrac{\psi_{\sigma}(\delta)^2}{\psi_{\sigma}(0)^2}-E_L^0\rangle

Correlated sampling 2:

Use the same random numbers in the \(E(\delta)\) and \(E(0)\) calculations: the finite-difference (FD) equivalent of AD
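The second scheme can be sketched for the toy model as follows. Assumptions not stated on the slide: the Gaussian centre is set to \(\mu = F\), the width \(\sigma = 1.2\) is kept fixed at a non-optimal value so that the local energy fluctuates, samples of \(\psi^2\) are drawn directly from a Gaussian rather than by a Metropolis walk, and a central difference is used.

```python
import math
import random

SIGMA = 1.2  # assumed fixed, deliberately non-optimal width

def local_energy(x, field, mu):
    """E_L = (H psi) / psi for psi = exp(-(x - mu)^2 / (2 sigma^2))."""
    kinetic = 0.5 / SIGMA**2 - 0.5 * (x - mu) ** 2 / SIGMA**4
    potential = 0.5 * x * x - field * x
    return kinetic + potential

def vmc_energy(field, normals):
    """Average E_L over samples of psi^2 generated from standard normal draws."""
    mu = field                        # assumption: centre follows the field
    spread = SIGMA / math.sqrt(2.0)   # psi^2 is Gaussian with this std dev
    samples = [mu + spread * z for z in normals]
    return sum(local_energy(x, field, mu) for x in samples) / len(samples)

def draw(rng, n):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

rng = random.Random(0)
n_samples, field, delta = 10000, 0.5, 1e-3

# Same random numbers in both energy evaluations.
shared = draw(rng, n_samples)
correlated = (vmc_energy(field + delta, shared)
              - vmc_energy(field - delta, shared)) / (2 * delta)

# Independent random numbers in the two evaluations.
uncorrelated = (vmc_energy(field + delta, draw(rng, n_samples))
                - vmc_energy(field - delta, draw(rng, n_samples))) / (2 * delta)

exact = -field  # dE/dF = -<x> = -F for this model
print(correlated, uncorrelated, exact)
```

In this particular model the sample-dependent terms of the local energy cancel exactly when the random numbers are shared, so the correlated estimate has essentially no statistical error; with independent random numbers the noise of each energy is divided by \(2\delta\) and swamps the signal.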

Possibility of divergences in correlated sampling \(\rightarrow\) AD?

Assaraf, Caffarel, Kollias '11

Scaling of stochastic error in H chains (minimal basis)

Relative error

Scaling of error with basis size in water 

Systematic errors in ammonia dipole moment (DZ basis)

Dipole moments in the continuum limit

Self-consistent AFQMC to improve properties

1RDM from reverse-mode AD in AFQMC \(\rightarrow\) natural orbitals for the trial

Test for CO (in DZ basis):

CCSD(T) dipole: 0.087

AFQMC dipole without self-consistency: 0.073(4)

AFQMC dipole with self-consistency: 0.093(3)

Energy improved by about 2 mH as well!

Summary

AD with AFQMC brings observable calculations on par with energy calculations

Forward and reverse AD can be employed depending on the problem

Higher-order properties, nuclear forces, ...