Next Generation Reservoir Computing for quantum systems

By Ninnat Dangniam

For the 2024 KMUTT-IF Winter School, 19 Dec 2024

Quantum Machine Intelligence 6, 57 (2024) & manuscript in preparation

the (local) NG-RC team

Krai Cheamsawat

Apimuk Sornsaeng

Thiparat Chotibut

Paramott Bunnjaweht

and more...

A new way to predict properties of quantum states

[Figure: classification by type of algorithm (heuristic, training-based) vs type of data]

quantum state tomography

  • Quantum states of \(N\) qubits are complex vectors in \(2^N\) dimensions, an exponentially large object
  • Measuring the state gives one of \(2^N\) outcomes, and the state is destroyed after the measurement
  • Therefore, learning a quantum state always requires multiple identical copies: \(\Theta(4^N)\) of them, in fact
\ket{\psi} = \sum_{\vec{x}} \alpha_{\vec x} \ket{\vec{x}=x_1 x_2 \dots x_N}

\(x_j = 0\) or \(1\), \(\quad \alpha_{\vec{x}} \in \mathbb{C}\)

[Figure: \(n\) identical copies \(\ket{\psi}_1, \dots, \ket{\psi}_n\) are fed into a reconstruction algorithm, which outputs a classical description of \(\ket{\psi}\)]

quantum state tomography

  • Learning quantum states also requires measuring several independent observables
  • This is best illustrated in the qubit case
\sigma^x = \begin{pmatrix}0&1\\1&0 \end{pmatrix}
\sigma^y = \begin{pmatrix}0&\!\!\!-i\\i&0 \end{pmatrix}
\sigma^z = \begin{pmatrix}1&0\\0&\!\!\!-1 \end{pmatrix}
\ket{\psi} = \alpha \ket{0} + \beta\ket{1}
|\alpha|^2 + |\beta|^2=1
\langle O \rangle \equiv \bra{\psi}O\ket{\psi}
\begin{align*} \langle \sigma^x \rangle &= 2\textrm{Re}(\alpha^*\beta) \\ \langle \sigma^y \rangle &= 2\textrm{Im}(\alpha^*\beta) \\ \langle \sigma^z \rangle &= |\alpha|^2 - |\beta|^2 \end{align*}
[Figure: Bloch sphere, with \(\ket{0}\) and \(\ket{1}\) at the poles]

\begin{bmatrix} \langle \sigma^x \rangle \\ \langle \sigma^y \rangle \\ \langle \sigma^z \rangle \end{bmatrix} = \begin{bmatrix} \sin\theta\cos\varphi \\ \sin\theta\sin\varphi \\ \cos\theta \end{bmatrix}
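As a sanity check, these formulas are easy to verify numerically. A minimal sketch (ours, not from the slides; assumes only numpy, and the Bloch angles are arbitrary):

```python
# Minimal numpy check of the Pauli expectation formulas above.
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

theta, phi = 0.7, 1.9                                    # arbitrary Bloch angles
psi = np.array([np.cos(theta / 2),                       # alpha
                np.exp(1j * phi) * np.sin(theta / 2)])   # beta

for op, bloch in [(sx, np.sin(theta) * np.cos(phi)),
                  (sy, np.sin(theta) * np.sin(phi)),
                  (sz, np.cos(theta))]:
    expval = np.real(psi.conj() @ op @ psi)   # <psi|O|psi>
    assert np.isclose(expval, bloch)
```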

quantum state tomography

  • Quantum tomography reconstructs a quantum state from its properties or "shadows" \(\langle O_j \rangle\)
  • But if the state is merely a tool for calculating shadows, why not estimate new shadows directly from known shadows?

Allegory of Plato's cave

  • This shadow tomography task (Aaronson 2018; the name was coined by Steve Flammia) turns out to be much more manageable

[Figure: exponential vs polynomial number of copies; image: 4edges, Wikimedia]

shadow tomography

[Diagram: a prediction algorithm maps known shadows \(\{\langle O_j \rangle \}\) to new shadows \(\{ \langle \widetilde{O}_j \rangle \}\)]

deterministic chaos

\begin{align*} \dot{x} &= \sigma(y-x) \\ \dot{y} &= x(\rho-z)-y \\ \dot{z} &= xy-\beta z \end{align*}

Model of atmospheric convection governed by nonlinear DEs

Exhibits hypersensitivity to initial conditions (the "butterfly effect")

Lorenz attractor (image: Wikimol, Wikimedia)

\rho=28, \sigma=10, \beta=\frac{8}{3}
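For concreteness, a minimal integration sketch (ours, assuming scipy is available; the initial condition and sampling step are arbitrary choices):

```python
# Integrate the Lorenz system at the classic chaotic parameters
# (rho=28, sigma=10, beta=8/3).
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = u
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

sol = solve_ivp(lorenz, (0.0, 50.0), [1.0, 1.0, 1.0],
                t_eval=np.arange(0.0, 50.0, 0.025), rtol=1e-9, atol=1e-9)
s = sol.y.T   # time series of shape (T, 3), usable as NG-RC input later
```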

"it may happen that small differences in the initial conditions produce very great ones in the final phenomena [...] Prediction becomes impossible, and we have the fortuitous phenomenon."

Henri Poincaré

deterministic chaos

The Lyapunov time is the characteristic timescale over which nearby trajectories diverge (by a factor of \(e\))

next-generation reservoir computing

NG-RC accurately predicts the Lorenz attractor up to ~5 Lyapunov times (1 Lyapunov time ≈ 1.1 time units in the figure). How?

next-generation reservoir computing

Reservoir computing exploits nonlinear transformations performed by a dynamical system

Stirring a liquid surface = computation

next-generation reservoir computing

  • NG-RC injects generic (polynomial) nonlinearity by constructing \(m\)-time-delay vectors out of the observed time series \(\{s_0, s_1, \dots, s_{T-1}\}\)

s_k = \begin{bmatrix} a_1 \\ \vdots \\ a_M \end{bmatrix}, \quad a_j \in \mathbb{R}

o_k = s_k \oplus s_{k-{\color{yellow}\Delta}} \oplus s_{k-2 {\color{yellow}\Delta}} \oplus \cdots \oplus s_{k-({\color{pink}m}-1){\color{yellow}\Delta}}

x_k = \underbrace{o_k \oplus (o_k)^{\otimes {\color{lightgreen}p}}}_{\textrm{dim } mM+(mM)^p}

3 hyperparameters:

  • Step size \(\Delta\)
  • Time delay \(m\)
  • Nonlinearity degree \(p\)

  • \(x_k\) is then "trained" with a linear transformation that minimizes a (regularized) least-squares loss w.r.t. the target, as sketched below
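A minimal end-to-end sketch of the two steps above, under our own assumptions (numpy only; one-step-ahead targets; the full \((mM)^p\) tensor power, matching the dimension count on this slide). This is an illustration, not the implementation from the paper:

```python
# Minimal NG-RC sketch; hyperparameter names mirror the slides:
# step size `delta`, time delay `m`, nonlinearity degree `p`.
import numpy as np

def ngrc_features(s, m, delta, p):
    """s: time series of shape (T, M). Returns x_k = o_k (+) o_k^{tensor p}
    for each valid time k, where o_k stacks m delayed snapshots."""
    T = s.shape[0]
    ks = np.arange((m - 1) * delta, T)                    # valid time indices
    o = np.hstack([s[ks - j * delta] for j in range(m)])  # shape (len(ks), m*M)
    # Full degree-p tensor power, dimension (mM)^p as on the slide
    tensor_p = o.copy()
    for _ in range(p - 1):
        tensor_p = np.einsum('ki,kj->kij', tensor_p, o).reshape(len(ks), -1)
    return np.hstack([o, tensor_p]), ks

def ngrc_train(s, m, delta, p, ridge=1e-6):
    """Linear readout W minimizing the ridge-regularized least-squares
    loss for one-step-ahead prediction: W x_k ~ s_{k+1}."""
    x, ks = ngrc_features(s, m, delta, p)
    x, target = x[:-1], s[ks[:-1] + 1]
    W = np.linalg.solve(x.T @ x + ridge * np.eye(x.shape[1]),
                        x.T @ target)
    return W   # predict with x_new @ W
```

For the Lorenz series `s` integrated earlier, `W = ngrc_train(s, m=2, delta=1, p=2)` fits a readout that can then be iterated to forecast the attractor.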

NG-RC gives a heuristic for "time-translated shadow tomography"*

[Diagram: a prediction algorithm maps past shadows \(\{\langle O_j \rangle \}\) to future shadows \(\{ \langle \widetilde{O}_j \rangle \}\)]

*We don't claim to solve the shadow tomography task in its rigorous formulation

What is necessary for accurate prediction?

Some key messages:

  • Time delay \(m\) and the number of observables \(M\) should obey Takens' condition (Sauer et al. 1991), related to attractors in the late-time dynamics
  • Nonlinearity degree \(p\) should be related to the nonlinearity of the DEs that govern the observables' dynamics (Zhang et al. 2023)

For quantum systems, the input vector collects expectation values of observables:

s_k = \begin{bmatrix} \langle O_1\rangle \\ \vdots \\ \langle O_M\rangle \end{bmatrix}, \quad \langle O_j \rangle \in \mathbb{R}

NG-RC for quantum

model systems

1D spin chains (image: AG-FKT, TU Braunschweig), whose Pauli observables obey linear EOMs:

Heisenberg XXZ model (quantum integrable)

H = -J \sum_{i=0}^{N-1} (\sigma_i^x \sigma_{i+1}^x + \sigma_i^y \sigma_{i+1}^y + g\sigma_i^z \sigma_{i+1}^z)

Tilted-field Ising model (TIM, quantum chaotic)

H = -J \sum_{i=0}^{N-1} [ \sigma_i^z \sigma_{i+1}^z + g(\sigma_i^x \sin\theta + \sigma_i^z \cos\theta)]

\frac{d}{dt}\langle \sigma_n \rangle = f_{\textrm{linear}} \left( \left\{ \langle \sigma_{n-1} \rangle ,\langle \sigma_{n} \rangle ,\langle \sigma_{n+1} \rangle \right\} \right)

Bose-Hubbard model (image: Jpagett, Wikimedia), with nonlinear EOMs in the semiclassical limit:

H = -J \sum_{\langle i,j \rangle} a^{\dagger}_i a_j + \sum_{i=0}^{N-1} \left( \epsilon\, a^{\dagger}_i a_i + \frac{U}{2}a^{\dagger 2}_i a_i^2 +F(a^{\dagger}_i + a_i) \right)

\frac{d\alpha_j}{dt} = -(\gamma+i\Delta) \alpha_j - i\sum_{k=0}^{N-1} J_{jk} \alpha_k - iU\alpha_j |\alpha_j|^2 - iF
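To produce training series in the semiclassical regime, the EOM above can be integrated directly. A minimal RK4 sketch (ours; the nearest-neighbor coupling matrix, drive, and all parameter values are arbitrary placeholders):

```python
# Integrate d(alpha_j)/dt = -(gamma + i*Delta)*alpha_j - i*sum_k J_jk alpha_k
#                           - i*U*alpha_j*|alpha_j|^2 - i*F
import numpy as np

def step(alpha, J, gamma, Delta, U, F, dt):
    def rhs(a):
        return (-(gamma + 1j * Delta) * a - 1j * (J @ a)
                - 1j * U * a * np.abs(a) ** 2 - 1j * F)
    # classical 4th-order Runge-Kutta step
    k1 = rhs(alpha)
    k2 = rhs(alpha + 0.5 * dt * k1)
    k3 = rhs(alpha + 0.5 * dt * k2)
    k4 = rhs(alpha + dt * k3)
    return alpha + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

N = 8
J = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)  # NN coupling
alpha = np.full(N, 0.1 + 0.0j)
traj = [alpha]
for _ in range(2000):
    alpha = step(alpha, J, gamma=0.1, Delta=0.5, U=1.0, F=0.2, dt=0.01)
    traj.append(alpha)
# Real observables for NG-RC: x_j = Re(alpha_j), p_j = Im(alpha_j)
s = np.hstack([np.real(traj), np.imag(traj)])
```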

takens' embedding theorem

\(\Omega\) = tomographically complete set of observables

\(S_M \subset \Omega\) = accessible observables

[Diagram: the attractor \(\mathcal A \subset \mathbb{R}^{|\Omega|}\) of the late-time dynamics is projected onto the accessible coordinates \(\mathbb{R}^{M}\) and delay-embedded by a map \(F\) into \(\mathbb{R}^{mM}\)]

A generic \(F\) is a one-to-one embedding of the attractor provided

mM > 2\dim\mathcal{A}

Trade time dimension for number of observables!

takens' embedding theorem

[Plot: NRMS error vs time delay for the XXZ and tilted Ising models (\(N=8, M=15, p=1\)); accurate prediction sets in at \(m\ge69\)]

  • Thus, for a fixed number of observables \(M\), there is a minimal embedding dimension \(m_{\textrm{Takens}} > 2\dim\mathcal{A}/M \)
  • For the spin chain models, the EOMs of the accessible observables in \(S_M\) (one-site & nearest-neighbor two-site Pauli observables) are not closed; nevertheless, \(\dim \mathcal{A}\) is at most the number of (real) parameters of (pure) quantum states: \(2\cdot 2^N-2\)

\frac{d}{dt}\langle \sigma_n \rangle = f_{\textrm{linear}} \left( \left\{ \langle \sigma_{n-1} \rangle ,\langle \sigma_{n} \rangle ,\langle \sigma_{n+1} \rangle \right\} \right)
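Plugging in the plot's parameters, with \(\dim\mathcal{A}\) taken at the pure-state bound, reproduces the observed threshold:

m_{\textrm{Takens}} > \frac{2\,(2\cdot 2^{8}-2)}{15} = \frac{1020}{15} = 68 \implies m \ge 69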

takens' embedding theorem

In the semiclassical limit, observables are c-numbers \(\alpha_j = x_j + ip_j\), the EOMs are nonlinear, and nonlinearity degree \(p>1\) is required

|\Omega|=20

M_{\textrm{one-site}} = 2 \implies m_{\textrm{Takens}}>20

M_{\textrm{two-site}} = 4 \implies m_{\textrm{Takens}}>10

conclusion

NG-RC is a heuristic, training-based "time-translated shadow tomography"

[Diagram: a prediction algorithm maps past shadows \(\{\langle O_j \rangle \}\) to future shadows \(\{ \langle \widetilde{O}_j \rangle \}\); hyperparameter optimization requires prior knowledge about the dynamics!]
  • We demonstrate the capability of NG-RC to predict quantum properties in quantum many-body systems
  • Its performance can be understood via Takens' embedding theorem
  • However, hyperparameter optimization would require prior knowledge about the very dynamics we want to predict, a catch-22 situation! (Zhang et al. 2023)
  • There are many more questions to think about, e.g.
    • What do attractors of quantum observables look like?
    • How to choose informative observables?

quantum NG-RC

  • The NG-RC algorithm can be quantized and works in the same way (but coherently takes quantum input and output instead)
  • Block encoding exponentially reduces the space needed to store \(x_k = o_k \oplus (o_k)^{\otimes {\color{lightgreen}p}}\), whose dimension is \(mM+(mM)^p\)
  • Iterative prediction is costly, so we invented a skip-ahead method
  • However, inputting quantum data can be inefficient

Please see the paper if you're interested!


Thank you

fractal dimensions

The generalized Takens' theorem in Sauer et al. allows fractal attractors, in which case the relevant dimension is the box-counting dimension

\dim_B = \lim_{\epsilon\to0} \frac{\log \mathcal N(\epsilon)}{\log \epsilon^{-1}}

\( \mathcal N(\epsilon) = \) the number of boxes of size \( \epsilon \) required to cover the attractor

\mathcal N(\epsilon) \sim \epsilon^{-\dim_B}

For dynamical systems, \(\dim_B\) is estimated by generating a trail of points in the attractor, counting the number of boxes they visit, and fitting the slope of a log-log plot of \(\mathcal N(\epsilon)\) vs \( \epsilon \), as sketched below

The estimate is sensitive to statistical noise at small \(\epsilon\) 
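A minimal box-counting sketch (ours, assuming numpy; `points` is a `(T, d)` array of attractor samples, and the \(\epsilon\) values must sit inside the scaling region, away from the noisy small-\(\epsilon\) regime just mentioned):

```python
import numpy as np

def box_counting_dim(points, epsilons):
    """Estimate dim_B as the slope of log N(eps) vs log(1/eps)."""
    lo = points.min(axis=0)
    counts = []
    for eps in epsilons:
        # Assign each point to a box of side eps and count distinct boxes
        boxes = np.floor((points - lo) / eps).astype(np.int64)
        counts.append(len(np.unique(boxes, axis=0)))
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(epsilons)), np.log(counts), 1)
    return slope
```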

fractal dimensions

More robust to statistical noise is the correlation dimension, computed from the correlation sum over \(N\) sampled points:

C(\epsilon) = \frac{2}{N(N-1)} \sum_{i< j}^N \Theta(\epsilon-|x_i-x_j|)

C(\epsilon) \sim \epsilon^{\dim_C} \implies \dim_C = \lim_{\epsilon\to0} \frac{\log C(\epsilon)}{\log \epsilon}

It satisfies \(\dim_C \le \dim_B\)

[Plots: correlation-dimension estimates for the tilted Ising and Bose-Hubbard models]
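A matching sketch for the correlation dimension (ours; same `points` assumption as the box-counting example; the pairwise distances make it \(O(T^2)\) in memory, and \(\epsilon\) must be large enough that \(C(\epsilon)>0\)):

```python
import numpy as np

def correlation_dim(points, epsilons):
    """Estimate dim_C as the slope of log C(eps) vs log eps."""
    n = len(points)
    # Pairwise distances |x_i - x_j| for i < j (Theta counts pairs within eps)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    dists = d[np.triu_indices(n, k=1)]
    c = [2.0 * np.count_nonzero(dists < eps) / (n * (n - 1)) for eps in epsilons]
    slope, _ = np.polyfit(np.log(np.asarray(epsilons)), np.log(c), 1)
    return slope
```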

quantum NG-RC

\(U_A\) is said to be an \((\alpha,a,\epsilon)\)-approximate block-encoding of \(A\) if

U_A = \begin{bmatrix} A/\alpha & \cdot \\ \cdot & \cdot \end{bmatrix}

(blocks indexed by the ancilla state \(|0\rangle^{\otimes a}\) and its orthogonal complement \(|0\rangle^{\otimes a}_{\perp}\)), such that

\lVert A - \alpha\langle 0|^{\otimes a}U_A|0\rangle^{\otimes a} \rVert \le \epsilon
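A minimal worked example (ours, not from the slides): since \(\lVert \sigma^x/2 \rVert \le 1\), one ancilla qubit suffices for a \((1,1,0)\)-block-encoding of \(A = \sigma^x/2\):

U_A = \begin{bmatrix} \sigma^x/2 & \frac{\sqrt{3}}{2} I \\ \frac{\sqrt{3}}{2} I & -\sigma^x/2 \end{bmatrix}, \qquad \langle 0|U_A|0\rangle = \sigma^x/2, \qquad U_A^{\dagger}U_A = I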