Challenges and Opportunities of Quantum Machine Learning

Min-Hsiu Hsieh (謝明修)

Hon Hai Quantum Computing Center

Unknown Function: \(f: X\to Y\)

Training Data: \(\{(x_i,y_i)\}_{i=1}^N\)

Hypothesis Set: \(\mathcal{H}\)

Learning Algorithm → \(\hat{f}\)

Computational Complexity

Sample Complexity

Quantum Computation 

Classical Bit \(x\in\mathbb{Z}_2=\{0,1\}\)

Qubit \(\rho\in\mathbb{C}^{2\times 2}\), with \(\rho\geq 0\) and \(\text{Tr}[\rho]=1\)

Random Bit \(\left(\begin{array}{cc} p(0) & 0\\ 0 & p(1) \end{array}\right)\) is a special case.
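The three conditions on \(\rho\) (Hermitian, positive semidefinite, unit trace) can be checked numerically. The sketch below is illustrative only: the helper name `is_density_matrix` and the example matrices are assumptions, not part of the talk.

```python
import numpy as np

def is_density_matrix(rho, tol=1e-9):
    """Return True if rho is Hermitian, positive semidefinite, and has trace 1."""
    rho = np.asarray(rho, dtype=complex)
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    # eigvalsh expects a Hermitian input, so symmetrize before calling it.
    eigenvalues = np.linalg.eigvalsh((rho + rho.conj().T) / 2)
    positive = np.all(eigenvalues >= -tol)
    unit_trace = abs(np.trace(rho) - 1) < tol
    return bool(hermitian and positive and unit_trace)

# Random bit with p(0) = 0.3, p(1) = 0.7: a diagonal density matrix.
random_bit = np.diag([0.3, 0.7])

# Pure state |+> = (|0> + |1>)/sqrt(2): off-diagonal terms appear.
plus = np.array([1, 1]) / np.sqrt(2)
plus_state = np.outer(plus, plus.conj())

# Trace 1 but one negative eigenvalue, so not a valid state.
not_a_state = np.array([[1.2, 0.0], [0.0, -0.2]])

for name, rho in [("random bit", random_bit), ("|+><+|", plus_state), ("invalid", not_a_state)]:
    print(name, is_density_matrix(rho))
```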

Quantum Computation 

Quantum Operation: \(\rho\mapsto\sigma\) 

Unitary is a special case. 

Quantum Measurement: \(\rho\mapsto\mathbb{R}\) 

Quantum Challenge #1

Noncommutative: \(AB\neq BA\) 

Moment Generating Function: \(\mathbb{E}e^{\theta (A+B)}\neq\mathbb{E}e^{\theta A}e^{\theta B}\) 

\frac{a}{b} \mapsto A B^{-1}?
e^{a+b} \mapsto e^A e^B?
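A small numerical check of both statements, using the Pauli matrices \(X\) and \(Z\) as an assumed example pair. The helper `expm_hermitian` is a hypothetical name; it exponentiates a Hermitian matrix through its eigendecomposition.

```python
import numpy as np

def expm_hermitian(H):
    """Matrix exponential of a Hermitian matrix via H = V diag(w) V^dagger."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(w)) @ V.conj().T

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# AB != BA: the commutator is nonzero.
commutator = X @ Z - Z @ X

# e^{A+B} != e^A e^B when A and B do not commute.
lhs = expm_hermitian(X + Z)
rhs = expm_hermitian(X) @ expm_hermitian(Z)
gap = np.linalg.norm(lhs - rhs)

print("||[X, Z]|| =", np.linalg.norm(commutator))
print("||e^(X+Z) - e^X e^Z|| =", gap)
```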

Quantum Challenge #2

Entanglement: \(\rho_{AB}\neq \rho_{A}\otimes\rho_B\) 
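As an illustration, take the two-qubit singlet state \((|01\rangle-|10\rangle)/\sqrt{2}\) as an assumed example: both marginals are maximally mixed, yet their tensor product is far from the joint state. The helper `partial_trace` is a hypothetical name written for this sketch.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# Singlet state on systems A and B.
psi = (np.kron(ket0, ket1) - np.kron(ket1, ket0)) / np.sqrt(2)
rho_ab = np.outer(psi, psi.conj())

def partial_trace(rho, keep):
    """Reduce a two-qubit density matrix to qubit `keep` (0 for A, 1 for B)."""
    r = rho.reshape(2, 2, 2, 2)  # index order: a, b, a', b'
    if keep == 0:
        return np.einsum("abcb->ac", r)  # trace out B
    return np.einsum("abad->bd", r)      # trace out A

rho_a = partial_trace(rho_ab, 0)
rho_b = partial_trace(rho_ab, 1)

# Distance between the joint state and the product of its marginals.
distance = np.linalg.norm(rho_ab - np.kron(rho_a, rho_b))
print("rho_A =\n", rho_a.real)
print("||rho_AB - rho_A (x) rho_B|| =", distance)
```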

Problem Setup

Alice: \(Q, R\in\{\pm 1\}\)

Bob: \(S, T\in\{\pm 1\}\)

Compute \(QS+RS+RT-QT\)

Classical Mechanics

\(\theta=(Q+R)S+(R-Q)T\leq 2\)

Let \(\text{p}(qrst) := \text{Pr}\{Q=q,R=r,S=s,T=t\}\).

Probabilistically,

\mathbb{E}[\theta]= \sum_{qrst}\text{p} (qrst)(qs+rs+rt-qt) \leq 2
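The classical bound can be confirmed by brute force: enumerate all 16 assignments of \(\pm 1\) to \(Q, R, S, T\). Any probability distribution over these assignments is a convex mixture, so its expectation cannot exceed the largest single value. The function name `chsh_value` is an illustrative choice.

```python
from itertools import product

def chsh_value(q, r, s, t):
    """The quantity QS + RS + RT - QT for one deterministic assignment."""
    return q * s + r * s + r * t - q * t

# All 16 assignments of +1/-1 to (Q, R, S, T).
values = [chsh_value(q, r, s, t) for q, r, s, t in product((-1, 1), repeat=4)]

max_value = max(values)
min_value = min(values)
print("possible values:", sorted(set(values)))
print("classical maximum:", max_value)
```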

Quantum Mechanics

|\Psi_{AB}\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle_A|1\rangle_B -|1\rangle_A |0\rangle_B\right)

Alice: \(Q=Z\), \(R=X\)

Bob: \(S=\frac{-Z-X}{\sqrt{2}}\), \(T=\frac{Z-X}{\sqrt{2}}\)

Each observable has outcomes in \(\{\pm 1\}\).

Quantum Mechanics

\mathbb{E}[\theta] = \langle QS\rangle + \langle RS\rangle + \langle RT\rangle - \langle QT\rangle= 2\sqrt{2}
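The value \(2\sqrt{2}\) can be reproduced directly from the singlet state and the four observables. This is a numerical sketch; the helper `correlation` is a hypothetical name, and Alice's observable is taken to act on the first qubit.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
psi = (np.kron(ket0, ket1) - np.kron(ket1, ket0)) / np.sqrt(2)

# Alice measures Q or R, Bob measures S or T.
Q, R = Z, X
S = (-Z - X) / np.sqrt(2)
T = (Z - X) / np.sqrt(2)

def correlation(a, b):
    """Expectation <psi| a (x) b |psi> of a product observable."""
    return float(np.real(psi.conj() @ np.kron(a, b) @ psi))

chsh = correlation(Q, S) + correlation(R, S) + correlation(R, T) - correlation(Q, T)
print("quantum CHSH value:", chsh)
print("2*sqrt(2)         :", 2 * np.sqrt(2))
```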

Why Does Quantum Computation Matter?

Many More!

Type of Input × Type of Algorithms (first letter: input, second letter: algorithm; C = classical, Q = quantum): CC, CQ, QC, QQ

CQ
  • Linear Equation Solvers

  • Perceptron

  • Recommendation Systems

  • Semidefinite Programming

  • Many Others (such as non-Convex Optimization)

QC
  • State Tomography

  • Entanglement Structure

  • Quantum Control

CC
  • Linear Equation Solvers

  • Recommendation Systems

  • Semidefinite Programming

  • Minimum Conical Hull  

Quantum-Inspired Classical Algorithms 

Figure: Read-in → Q.C. → Readout, labeled CQ and QC.

Input Models

[1] V. Giovannetti, S. Lloyd, and L. Maccone, Phys. Rev. Lett. 100, 160501 (2008).

Readout

In general, readout requires \(O\left(\frac{rd}{\epsilon^2}\right)\) copies of \(\rho\).

Readout

Our readout improvement

Given: input \(A\in\mathbb{R}^{m\times n}\) of rank \(r\) and \(|v\rangle \in\text{row}(A)\).

Thm: \(|v\rangle\) can be read out with poly(\(r,\epsilon^{-1}\)) queries to QRAM and poly(\(r,\epsilon^{-1}\)) copies of \(|v\rangle\).

[1] Efficient State Read-out for Quantum Machine Learning Algorithms. Kaining Zhang, Min-Hsiu Hsieh, Liu Liu, Dacheng Tao. arXiv:2004.06421 

High Level Proof

1. \(|v\rangle = \sum_{i=1}^r x_i |A_{g(i)}\rangle\in\text{row}(A)\)

2. Use a quantum Gram-Schmidt process to construct \(\{A_{g(i)}\}\).

3. Obtain \(\{x_i\}\).
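The three steps have a simple classical analogue, sketched below with NumPy. This is not the quantum algorithm of [1]: the greedy, in-order row selection, the least-squares solve, and the names `select_basis_rows` and `readout_coefficients` are all assumptions made for illustration. It only shows why \(r\) coefficients \(x_i\) are enough to describe \(|v\rangle\in\text{row}(A)\).

```python
import numpy as np

def select_basis_rows(A, tol=1e-9):
    """Greedy Gram-Schmidt: indices g(1..r) of rows that span row(A)."""
    indices, ortho = [], []
    for i, row in enumerate(A):
        residual = np.array(row, dtype=float)
        for q in ortho:
            residual -= (q @ residual) * q  # remove the component along q
        norm = np.linalg.norm(residual)
        if norm > tol * max(1.0, np.linalg.norm(row)):
            indices.append(i)
            ortho.append(residual / norm)
    return indices

def readout_coefficients(A, v):
    """Express v as sum_i x_i |A_g(i)> using normalized basis rows."""
    g = select_basis_rows(A)
    basis = A[g] / np.linalg.norm(A[g], axis=1, keepdims=True)
    x, *_ = np.linalg.lstsq(basis.T, v, rcond=None)
    return g, x, basis

rng = np.random.default_rng(0)
m, n, r = 6, 8, 3
A = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))  # rank-r matrix

v = rng.normal(size=m) @ A   # a vector in row(A)
v /= np.linalg.norm(v)       # normalized, like a quantum state

g, x, basis = readout_coefficients(A, v)
reconstruction = x @ basis
print("selected rows g(i):", g)
print("coefficients x_i  :", x)
```

The description of \(|v\rangle\) shrinks from \(n\) amplitudes to \(r\) coefficients plus the indices \(g(i)\), which is consistent with a cost that depends on \(r\) rather than on the dimension.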

Neural Networks

Expressive Power


[1] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Dacheng Tao. The Expressive Power of Parameterized Quantum Circuits. Physical Review Research 2, 033125 (2020) [arXiv:1810.11922].

Trainability of QNN

Barren plateau problem: for randomly initialized circuits, gradients vanish exponentially in the number of qubits.

[1] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven. Barren plateaus in quantum neural network training landscapes. Nature Communications, 9(1):1–6, 2018.
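A minimal statevector sketch of how the effect can be probed numerically. It assumes layers of randomly chosen Pauli rotations followed by a chain of CZ gates, the cost \(\langle Z_0 Z_1\rangle\), and a parameter-shift gradient with respect to the first angle, loosely in the spirit of [1]. The depth, sample count, and all function names are illustrative choices, and the estimates are noisy at this sample size; one would expect the variance to shrink as the number of qubits grows.

```python
import numpy as np

PAULIS = {
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def rotation(axis, theta):
    """Single-qubit rotation exp(-i * theta/2 * P) for a Pauli matrix P."""
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * PAULIS[axis]

def apply_gate(state, gate, qubit, n):
    """Apply a 2x2 gate to one qubit of an n-qubit statevector."""
    psi = np.tensordot(gate, state.reshape([2] * n), axes=([1], [qubit]))
    return np.moveaxis(psi, 0, qubit).reshape(-1)

def apply_cz_chain(state, n):
    """Apply CZ on neighbouring pairs (0,1), (1,2), ..., (n-2,n-1)."""
    psi = state.reshape([2] * n).copy()
    for q in range(n - 1):
        index = [slice(None)] * n
        index[q], index[q + 1] = 1, 1
        psi[tuple(index)] *= -1  # phase flip where both qubits are 1
    return psi.reshape(-1)

def cost(thetas, axes, n):
    """<Z_0 Z_1> after the layered circuit acts on |0...0>."""
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0
    for layer_thetas, layer_axes in zip(thetas, axes):
        for q in range(n):
            state = apply_gate(state, rotation(layer_axes[q], layer_thetas[q]), q, n)
        state = apply_cz_chain(state, n)
    p = (np.abs(state) ** 2).reshape(2, 2, -1).sum(axis=2)  # joint law of qubits 0, 1
    return p[0, 0] - p[0, 1] - p[1, 0] + p[1, 1]

def gradient_variance(n, layers=15, samples=100, seed=0):
    """Sample variance of d(cost)/d(theta[0, 0]) over random circuits."""
    rng = np.random.default_rng(seed)
    grads = []
    for _ in range(samples):
        thetas = rng.uniform(0, 2 * np.pi, size=(layers, n))
        axes = rng.choice(list("XYZ"), size=(layers, n))
        shift = np.zeros_like(thetas)
        shift[0, 0] = np.pi / 2  # parameter-shift rule
        grads.append((cost(thetas + shift, axes, n) - cost(thetas - shift, axes, n)) / 2)
    return float(np.var(grads))

variances = {n: gradient_variance(n) for n in (2, 4, 6)}
for n, var in variances.items():
    print(f"n = {n}: Var[gradient] ~ {var:.4f}")
```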

Trainability of QNN

[1] Kaining Zhang, Min-Hsiu Hsieh, Liu Liu, Dacheng Tao. Toward Trainability of Quantum Neural Networks. arXiv:2011.06258 (2020).

Thm: \(\mathbb{E}_{\bm{\theta}} \|\nabla_{\bm{\theta}} f_{\text{TT}} \|\geq O\left(\frac{2^{-2L}}{n}\right)\)

Learnability of QNN

Learnability = trainability + generalization

[1] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Shan You, Dacheng Tao. On the learnability of quantum neural networks. arXiv:2007.12369 (2020).

Trainability of QNN: ERM

\bm{\theta}^*= \arg \min_{\bm{\theta}\in\mathcal{C}} \mathcal{L}(\bm{\theta},\bm{z})
\mathcal{L}(\bm{\theta},\bm{z}) := \frac{1}{n}\sum_{i=1}^n \ell(y_i, \hat{y}_i) + r(\bm{\theta})

Trainability of QNN: ERM

R_1\left(\bm{\theta}^{(T)}\right) := \mathbb{E}\left[\left\|\nabla \mathcal{L}(\bm{\theta}^{(T)})\right\|^2\right]
R_1 \leq \tilde{O}\left(\text{poly}\left(\frac{d}{T(1-p)^{L_Q}}, \frac{d}{BK(1-p)^{L_Q}}, \frac{d}{(1-p)^{L_Q}} \right) \right)

Trainability of QNN: ERM

R_2\left(\bm{\theta}^{(T)}\right) := \mathbb{E}[\mathcal{L}(\bm{\theta}^{(T)})] - \mathcal{L}(\bm{\theta}^*)
R_2\leq \tilde{O}\left( \text{poly}\left(\frac{d}{K^2B (1-p)^{L_Q}} + \frac{d}{(1-p)^{L_Q}}\right) \right)

Generalization of QNN

Thm: Quantum statistical query algorithms can be efficiently simulated by QNNs.

QQ
  • Entanglement Test

with Jian-Wei Pan's group (submitted)

Quantum Generative Adversarial Networks (QGAN)

\mathcal{L}(\sigma_G,\mathcal{D}) = P(\text{True}|\sigma_G)P(G) + P(\text{False}|\rho)P(R),
\min_{\sigma_G}\max_{\mathcal{D}}\mathcal{L}(\sigma_G,\mathcal{D})
[1] Lloyd, S. & Weedbrook, C. Quantum generative adversarial learning. Physical review letters 121, 040502 (2018). 

Results

Error Mitigation

Yuxuan Du, Tao Huang, Shan You, Min-Hsiu Hsieh, Dacheng Tao. Quantum circuit architecture search: error mitigation and trainability enhancement for variational quantum solvers. arXiv:2010.10217 (2020).

Hydrogen Simulation

Sample Complexity

Training Data: \(\{(x_i,y_i)\}_{i=1}^n\)

Hypothesis Set: \(h\in\mathcal{H}\)

Unknown Function: \(f: X\to Y\)

Empirical Risk Minimization

Given a loss function \(\ell:Y\times Y \to \mathbb{R}\), find

$$ f_n = \arg \min_{h\in \mathcal{H}} R_n (h) $$

where

$$ R_n(h) = \frac{1}{n}\sum_{i=1}^n \ell (h(x_i), y_i) $$

Probably Approximately Correct (PAC) Learnable

\(\mathcal{H}\) is PAC learnable if for any \(\epsilon>0\)

$$ \lim_{n\to\infty}\sup_{\mu} \Pr\{\sup_{h\in\mathcal{H}}|R(h) - R_n(h)| >\epsilon\} = 0$$

where \(R(h)\) is the expected loss of \(h\) under the data distribution \(\mu\).
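Empirical risk minimization over a finite hypothesis set can be sketched in a few lines. The threshold classifiers, the 0-1 loss, the sample size, and all names below are assumptions chosen for illustration, not the setting of the talk.

```python
import random

def empirical_risk(h, data, loss):
    """Average loss of hypothesis h over the training data."""
    return sum(loss(h(x), y) for x, y in data) / len(data)

def erm(hypotheses, data, loss):
    """Return a hypothesis that minimizes the empirical risk."""
    return min(hypotheses, key=lambda h: empirical_risk(h, data, loss))

def zero_one(prediction, label):
    """0-1 loss."""
    return 0.0 if prediction == label else 1.0

def make_threshold(t):
    """Classifier h(x) = +1 if x >= t, else -1."""
    def h(x):
        return 1 if x >= t else -1
    h.threshold = t
    return h

rng = random.Random(0)
target = make_threshold(0.35)  # plays the role of the unknown function f
data = [(x, target(x)) for x in (rng.random() for _ in range(200))]

# Hypothesis set: thresholds on a grid 0, 0.05, ..., 1.
hypotheses = [make_threshold(k / 20) for k in range(21)]
f_n = erm(hypotheses, data, zero_one)

print("selected threshold:", f_n.threshold)
print("empirical risk    :", empirical_risk(f_n, data, zero_one))
```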

Sample Complexity

Sample complexity \(m_\mathcal{H}(\epsilon,\delta)\) is the smallest integer such that, for every \(n\geq m_\mathcal{H}(\epsilon,\delta)\),

\(\sup_{\mu} \Pr \left\{ \sup_{h\in\mathcal{H}} \big|R(h)-R_n(h)\big|\geq \epsilon \right\}\leq \delta\)

For Boolean function classes \(\mathcal{H}\) [1, 2], with \(C\) an absolute constant,

 \(m_{\mathcal{H}}(\epsilon,\delta)= \frac{C}{\epsilon^2}\left(\text{VCdim}(\mathcal{H})\log\left(\frac{2}{\epsilon}\right)+\log\left(\frac{2}{\delta}\right)\right)\)
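To get a feel for the scaling of the VC bound, the formula can be evaluated directly. The constant \(C\) is not specified on the slide, so `C=1.0` below is a placeholder assumption, as is the function name; only the dependence on \(\epsilon\), \(\delta\), and the VC dimension is meaningful.

```python
import math

def vc_sample_complexity(vc_dim, epsilon, delta, C=1.0):
    """Evaluate C/eps^2 * (VCdim * log(2/eps) + log(2/delta)), rounded up.

    C is an unspecified absolute constant; C=1.0 is only a placeholder.
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie in (0, 1)")
    bound = (C / epsilon ** 2) * (
        vc_dim * math.log(2 / epsilon) + math.log(2 / delta)
    )
    return math.ceil(bound)

for eps in (0.2, 0.1, 0.05):
    print(f"VCdim=10, eps={eps}, delta=0.05 ->", vc_sample_complexity(10, eps, 0.05))
```

Halving \(\epsilon\) roughly quadruples the bound, while \(\delta\) enters only logarithmically.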

[1] Vapnik, Springer-Verlag, New York/Berlin, 1982.

[2] Blumer, Ehrenfeucht, Haussler, and Warmuth, Assoc. Comput. Machine, vol. 36, no. 4, pp. 151--160, 1989.

\(Z=\sup_{f\in\mathcal{F}}\big| \sum_{i=1}^n f(x_i)\big|\)

\({Z}=\sup_{\bm{f}\in\mathcal{F}}\left\| \sum_{i=1}^n \bm{f}(\bm{X}_i)\right\|_p.\)

Only very limited matrix concentration results are available!

[1] Joel Tropp. User-friendly tail bounds for sums of random matrices. arXiv:1004.4389.

Sample Complexity for Learning Quantum Objects

Q. State → Measurement

Unknown Function: \(f_\rho : \mathcal{E}(\mathcal{H}) \to \mathbb{R}\), \(f_\rho(E) = \text{Tr}[E\rho]\)

Hypothesis Set: \(\{f_{\rho}:\rho\in \mathcal{D}(\mathcal{H})\}\)

Training Data: \(\{(E_i,f_\rho(E_i))\}_{i=1}^N\)

fat\(_{\mathcal{D}(\mathcal{H})}(\epsilon,\mathcal{E}(\mathcal{H})) = O(\log d/\epsilon^2)\)
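A sketch of what the training data looks like for this problem: pairs \((E_i, \text{Tr}[E_i\rho])\) for an unknown state \(\rho\). It assumes \(\mathcal{D}(\mathcal{H})\) is the set of density matrices and \(\mathcal{E}(\mathcal{H})\) the set of effects \(0\leq E\leq I\); the random sampling schemes and function names are illustrative assumptions.

```python
import numpy as np

def random_density_matrix(d, rng):
    """Random d x d density matrix: G G^dagger normalized to unit trace."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def random_effect(d, rng):
    """Random Hermitian E with spectrum rescaled into [0, 1]."""
    H = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    H = (H + H.conj().T) / 2
    w, V = np.linalg.eigh(H)
    w = (w - w.min()) / (w.max() - w.min())
    return (V * w) @ V.conj().T

def f_rho(rho, E):
    """The function to be learned: f_rho(E) = Tr[E rho]."""
    return float(np.real(np.trace(E @ rho)))

rng = np.random.default_rng(1)
d = 4
rho = random_density_matrix(d, rng)  # the unknown state

# Training data {(E_i, f_rho(E_i))}: measurements and their acceptance probabilities.
training_data = [(E, f_rho(rho, E)) for E in (random_effect(d, rng) for _ in range(5))]
for i, (_, y) in enumerate(training_data, start=1):
    print(f"f_rho(E_{i}) = {y:.4f}")
```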

Sample Complexity for Learning Quantum States

Learning Unknown Measurement

Unknown Function: \(f_E : \mathcal{D}(\mathcal{H}) \to \mathbb{R}\), \(f_E(\rho) = \text{Tr}[E\rho]\)

Hypothesis Set: \(\{f_{E}:E\in \mathcal{E}(\mathcal{H})\}\)

Training Data: \(\{(\rho_i,f_E(\rho_i))\}_{i=1}^N\)

Learning States: fat\(_{\mathcal{D}(\mathcal{H})}(\epsilon,\mathcal{E}(\mathcal{H})) = O(\log d/\epsilon^2)\)

Learning Measurements: fat\(_{\mathcal{E}(\mathcal{H})}(\epsilon,\mathcal{D}(\mathcal{H})) = O( d/\epsilon^2)\)
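The gap between the two fat-shattering bounds is easiest to see in terms of the number of qubits, since \(d=2^k\) for \(k\) qubits. The sketch below sets the hidden constants in \(O(\cdot)\) to 1, which is an assumption made purely to compare growth rates; the function names are illustrative.

```python
import math

def fat_states(d, eps):
    """log(d) / eps^2, the scaling for learning states (constant set to 1)."""
    return math.log(d) / eps ** 2

def fat_measurements(d, eps):
    """d / eps^2, the scaling for learning measurements (constant set to 1)."""
    return d / eps ** 2

eps = 0.1
for qubits in (2, 5, 10, 20):
    d = 2 ** qubits
    print(f"{qubits:2d} qubits: states ~ {fat_states(d, eps):10.1f}, "
          f"measurements ~ {fat_measurements(d, eps):14.1f}")
```

In terms of qubits, the first quantity grows linearly and the second exponentially.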

Hao-Chung Cheng, Min-Hsiu Hsieh, Ping-Cheng Yeh. The learnability of unknown quantum measurements. QIC 16(7&8):615–656 (2016).

Thank you for your attention!

By Lawrence Min-Hsiu Hsieh

CSIE, NTU. 16 July 2019
