The Power of Quantum Neural Networks

Min-Hsiu Hsieh

Hon Hai Quantum Computing Center, Taiwan

Why Quantum Computing?

f: X\to Y

Unknown Function

\{(x_i,y_i)\}_{i=1}^N

Training Data

\mathcal{H}

Hypothesis Set

Learning

Algorithm

\hat{f}

Comp. Complexity

Sample Complexity

Type of Input

Type of Algorithms

[Figure: grid of learning settings CC, CQ, QC, QQ, organized by type of input and type of algorithm]

Linear Equation Solvers

Perceptron

Recommendation Systems

Semidefinite Programming

Many Others (such as non-Convex Optimization)

State Tomography

Entanglement Structure

Quantum Control

[Figure: CQ and QC settings, with read-in, Q.C., and readout stages]

Readout

In general, requires \(O\left(\frac{rd}{\epsilon^2}\right)\) copies of \(\rho\).

Our readout improvement

State Tomography:

Given: Input \(A\in\mathbb{R}^{m\times n}\) of rank \(r\) and \(|v\rangle \in\text{row}(A)\)

Thm: poly(\(r,\epsilon^{-1}\)) copies of \(|v\rangle\) suffice.

[1] Efficient State Read-out for Quantum Machine Learning Algorithms. Kaining Zhang, Min-Hsiu Hsieh, Liu Liu, Dacheng Tao. arXiv:2004.06421

High Level Proof

1. \(|v\rangle = \sum_{i=1}^r x_i |A_{g(i)}\rangle\in\text{row}(A)\)

2. Use a quantum Gram-Schmidt process to construct \(\{A_{g(i)}\}\).

3. Obtain \(\{x_i\}\).
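A classical analogue of the three steps above can be sketched in NumPy. This is only an illustration of the linear algebra, not the quantum algorithm of [1], which works with copies of \(|v\rangle\); the function names `select_basis_rows` and `readout_coefficients` are hypothetical, and the rows are normalized here to play the role of \(|A_{g(i)}\rangle\).

```python
import numpy as np

def select_basis_rows(A, tol=1e-10):
    """Greedy Gram-Schmidt: indices g of linearly independent rows of A."""
    basis = []   # orthonormal vectors spanning the rows picked so far
    picked = []  # the index map g(1), ..., g(r)
    for idx, row in enumerate(A):
        residual = row.astype(float)
        for q in basis:
            residual = residual - np.dot(q, residual) * q
        norm = np.linalg.norm(residual)
        if norm > tol:  # row adds a new direction, keep it
            basis.append(residual / norm)
            picked.append(idx)
    return picked

def readout_coefficients(A, v, tol=1e-10):
    """Return (g, x, rows) with v = sum_i x_i * rows[i], rows normalized."""
    g = select_basis_rows(A, tol)
    rows = A[g].astype(float)
    rows = rows / np.linalg.norm(rows, axis=1, keepdims=True)
    x, *_ = np.linalg.lstsq(rows.T, v, rcond=None)
    return g, x, rows

# Rank-2 example: the second row is a multiple of the first.
A = np.array([[1.0, 0.0, 0.0],
              [2.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
v = np.array([0.6, 0.8, 0.0])  # unit vector in row(A)
g, x, rows = readout_coefficients(A, v)
print(g, x)
```

Only \(r\) coefficients are recovered, which is why the cost can depend on the rank \(r\) rather than on the full dimension.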

Neural Networks

Expressivity

Trainability

Generalization

Learning

Model

Neural Network Expressivity

"how the architectural properties of a neural network (depth, width, layer type) affect the resulting functions it can compute"

[1] On the Expressive Power of Deep Neural Networks. (ICML2017) arXiv:1606.05336

Expressive Power


[1] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Dacheng Tao. The Expressive Power of Parameterized Quantum Circuits. Physical Review Research 2, 033125 (2020) [arXiv:1810.11922].

Learnability of QNN

Learnability = trainability + generalization

[1] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Shan You, Dacheng Tao. On the learnability of quantum neural networks. arXiv:2007.12369 (2020)

Trainability of QNN

"How easy is it to find the appropriate weights of the neural networks that fit the given data?"

Gradients vanish exponentially in the number of qubits.

[1] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven. Barren plateaus in quantum neural network training landscapes. Nature Communications, 9(1):1–6, 2018.

f(\bm{\theta},\rho) =\frac{1}{2}\left(1+ \text{Tr}[O U(\bm{\theta})\rho U(\bm{\theta})^\dagger]\right)

\mathbb{E}_{\bm{\theta}}\left(\frac{\partial f}{\partial \theta_j}\right)^2 =\epsilon \leq 2^{-\text{poly}(n)}

Barren Plateau problem
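The exponential decay can be seen numerically in a toy model. The sketch below is not the random-circuit setting of [1]; it assumes a product circuit \(U(\bm{\theta})=\bigotimes_i R_Y(\theta_i)\) acting on \(|0\rangle^{\otimes n}\) with the global observable \(O=Z^{\otimes n}\), for which \(\mathbb{E}_{\bm{\theta}}(\partial f/\partial\theta_1)^2 = 2^{-(n+2)}\) when each \(\theta_i\) is uniform on \([0,2\pi)\). The gradient is obtained with the parameter-shift rule; all function names are hypothetical.

```python
import numpy as np

def cost(theta):
    """f = (1 + <Z...Z>) / 2 for the product state of RY(theta_i)|0>."""
    amp0 = np.cos(theta / 2.0)       # amplitude of |0> on each qubit
    amp1 = np.sin(theta / 2.0)       # amplitude of |1> on each qubit
    z_expect = amp0**2 - amp1**2     # per-qubit <Z>
    return 0.5 * (1.0 + np.prod(z_expect, axis=-1))

def grad_first_param(theta):
    """Parameter-shift gradient of f with respect to theta_1."""
    shift = np.zeros(theta.shape[-1])
    shift[0] = np.pi / 2.0
    return 0.5 * (cost(theta + shift) - cost(theta - shift))

def mean_squared_gradient(n_qubits, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[(df/dtheta_1)^2] over uniform angles."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=(n_samples, n_qubits))
    return float(np.mean(grad_first_param(theta) ** 2))

for n in (2, 4, 6):
    print(n, mean_squared_gradient(n), 2.0 ** -(n + 2))
```

Each printed row compares the sampled value with the closed form \(2^{-(n+2)}\), so the estimate shrinks by roughly a factor of four for every two added qubits.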

Trainability of QNN

[1] Kaining Zhang, Min-Hsiu Hsieh, Liu Liu, Dacheng Tao. Toward Trainability of Quantum Neural Networks. arXiv:2011.06258 (2020).

Thm:

\mathbb{E}_{\bm{\theta}} \|\nabla_{\bm{\theta}} f_{\text{TT}} \|^2\geq \Omega\left(\frac{1+\log n}{n}\right)

Trainability of QNN in ERM

[1] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Shan You, Dacheng Tao. On the learnability of quantum neural networks. arXiv:2007.12369 (2020)

\bm{\theta}^*= \arg \min_{\bm{\theta}\in\mathcal{C}} \mathcal{L}(\bm{\theta},\bm{z})

\mathcal{L}(\bm{\theta}):= \frac{1}{n}\sum_{i=1}^n \ell(y_i, \hat{y}_i) + r(\bm{\theta})

R_1\left(\bm{\theta}^{(T)}\right) := \mathbb{E} \left\|\nabla \mathcal{L}(\bm{\theta}^{(T)})\right\|^2

R_2\left(\bm{\theta}^{(T)}\right) := \mathbb{E}[\mathcal{L}(\bm{\theta}^{(T)})] - \mathcal{L}(\bm{\theta}^*)
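To make the two measures concrete, the sketch below evaluates them for a classical stand-in, not a QNN: a linear model \(\hat{y}_i=\bm{\theta}\cdot x_i\) with squared loss and \(r(\bm{\theta})=\lambda\|\bm{\theta}\|^2\), trained by plain gradient descent. These modeling choices and all names are assumptions made for illustration. Because this optimizer is deterministic, the expectations in \(R_1\) and \(R_2\) reduce to single values.

```python
import numpy as np

def loss(theta, X, y, lam):
    """Regularized empirical risk: mean squared error plus lam * |theta|^2."""
    residual = X @ theta - y
    return float(np.mean(residual**2) + lam * np.dot(theta, theta))

def grad(theta, X, y, lam):
    """Gradient of the regularized empirical risk."""
    n = X.shape[0]
    return 2.0 * X.T @ (X @ theta - y) / n + 2.0 * lam * theta

def risks_after(T, X, y, lam=0.1, step=0.1):
    """Run T gradient-descent steps from zero and return (R1, R2)."""
    n, d = X.shape
    # Closed-form minimizer theta* of this convex objective.
    theta_star = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)
    theta = np.zeros(d)
    for _ in range(T):
        theta = theta - step * grad(theta, X, y, lam)
    r1 = float(np.sum(grad(theta, X, y, lam) ** 2))  # squared gradient norm
    r2 = loss(theta, X, y, lam) - loss(theta_star, X, y, lam)  # optimality gap
    return r1, r2

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
r1_short, r2_short = risks_after(5, X, y)
r1_long, r2_long = risks_after(200, X, y)
print(r1_short, r2_short, r1_long, r2_long)
```

\(R_1\) measures how close the iterate is to a stationary point, while \(R_2\) measures how far its loss is from the optimum; for a non-convex QNN landscape the two need not shrink together.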

Trainability of QNN in ERM

[1] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Shan You, Dacheng Tao. On the learnability of quantum neural networks. arXiv:2007.12369 (2020)

R_1\left(\bm{\theta}^{(T)}\right) := \mathbb{E} \left\|\nabla \mathcal{L}(\bm{\theta}^{(T)})\right\|^2

R_1 \leq \tilde{O}\left(\text{poly}\left(\frac{d}{T(1-p)^{L_Q}}, \frac{d}{BK(1-p)^{L_Q}} \right) \right)

\(d\) = \(|\bm{\theta}|\)

\(T\) = # of iterations

\(L_Q\) = circuit depth

\(p\) = error rate

\(K\) = # of measurements
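The effect of noise and depth on the bound can be read off from its two arguments. The helper below only evaluates those arguments; the polynomial itself is not specified on the slide, so no value of the bound is computed. The function name is hypothetical, and \(B\) is taken as it appears in the bound, since the legend does not define it.

```python
def r1_bound_arguments(d, T, B, K, p, L_Q):
    """Return the two quantities inside poly(., .) in the R_1 bound."""
    survival = (1.0 - p) ** L_Q  # factor (1 - p)^{L_Q}, shrinks with depth
    return d / (T * survival), d / (B * K * survival)

# Same training budget, two circuit depths at error rate p = 0.01.
shallow = r1_bound_arguments(d=10, T=100, B=5, K=20, p=0.01, L_Q=5)
deep = r1_bound_arguments(d=10, T=100, B=5, K=20, p=0.01, L_Q=50)
print(shallow, deep)
```

Both arguments grow as \((1-p)^{-L_Q}\), so at a fixed error rate a deeper circuit needs more iterations and more measurements to keep them at the same level.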

Trainability of QNN in ERM

[1] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Shan You, Dacheng Tao. On the learnability of quantum neural networks. arXiv:2007.12369 (2020)

R_2\left(\bm{\theta}^{(T)}\right) := \mathbb{E}[\mathcal{L}(\bm{\theta}^{(T)})] - \mathcal{L}(\bm{\theta}^*)

R_2\leq \tilde{O}\left( \text{poly}\left(\frac{d}{K^2B (1-p)^{L_Q}} ,\frac{d}{(1-p)^{L_Q}}\right) \right)

\(d\) = \(|\bm{\theta}|\)

\(T\) = # of iterations

\(L_Q\) = circuit depth

\(p\) = error rate

\(K\) = # of measurements

Generalization of QNN

[1] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Shan You, Dacheng Tao. On the learnability of quantum neural networks. arXiv:2007.12369 (2020)

Thm:

Quantum Statistical Query algorithms can be efficiently simulated by QNN.

Two Applications of QNN

QQ

Entanglement Test

with Jian-Wei Pan's group (submitted)

Quantum Generative Adversarial Networks (QGAN)

\mathcal{L}(\sigma_G,\mathcal{D}) = P(\text{True}|\sigma_G)P(G) + P(\text{False}|\rho)P(R),

\min_{\sigma_G}\max_{\mathcal{D}}\mathcal{L}(\sigma_G,\mathcal{D})

[1] Lloyd, S. & Weedbrook, C. Quantum generative adversarial learning. Physical review letters 121, 040502 (2018).

Results

Error Mitigation

[1] Yuxuan Du, Tao Huang, Shan You, Min-Hsiu Hsieh, Dacheng Tao. Quantum circuit architecture search: error mitigation and trainability enhancement for variational quantum solvers. arXiv:2010.10217 (2020).

(\bm{\theta}^*,\bm{a}^*)= \arg \min_{\bm{\theta}\in\mathcal{C},\bm{a}\in\mathcal{A}} \mathcal{L}(\bm{\theta},\bm{a}, \mathcal{E}_{\bm{a}})

\(\mathcal{C}\): The collection of all parameters

\(\mathcal{A}\): The collection of all possible circuits

\(\mathcal{E}_{\bm{a}}\): The error for the architecture \(\bm{a}\)
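The joint minimization over parameters and architectures can be illustrated with a brute-force toy search. This is not the search procedure of [1]. It assumes a single qubit starting in \(|0\rangle\), an architecture \(\bm{a}\) given by the number of \(R_Y\) layers, a depolarizing channel \(\rho\mapsto(1-p)\rho+p\,I/2\) after every layer as the error \(\mathcal{E}_{\bm{a}}\), and the loss \(\langle Z\rangle\); all names are hypothetical.

```python
import itertools
import numpy as np

def noisy_energy(thetas, p):
    """<Z> after len(thetas) RY layers, each followed by depolarizing noise.

    Every layer shrinks the Bloch vector by (1 - p), so the noiseless
    value cos(sum of angles) is damped by (1 - p) ** depth.
    """
    depth = len(thetas)
    return (1.0 - p) ** depth * np.cos(np.sum(thetas))

def search(architectures, p, grid_points=21):
    """Exhaustive search over depths and a parameter grid; returns the best."""
    grid = np.linspace(0.0, 2.0 * np.pi, grid_points)
    best = None
    for depth in architectures:                       # a in A
        for thetas in itertools.product(grid, repeat=depth):  # theta in C
            value = noisy_energy(thetas, p)
            if best is None or value < best[0]:
                best = (value, depth, thetas)
    return best

best_value, best_depth, best_thetas = search(architectures=(1, 2, 3), p=0.1)
print(best_value, best_depth, best_thetas)
```

In this toy every depth can reach the noiseless optimum, so the search settles on the shallowest circuit: extra layers only add noise. Real problems trade expressivity against accumulated error, which is what makes the architecture worth optimizing.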

Error Mitigation

[1] Yuxuan Du, Tao Huang, Shan You, Min-Hsiu Hsieh, Dacheng Tao. Quantum circuit architecture search: error mitigation and trainability enhancement for variational quantum solvers. arXiv:2010.10217 (2020).

Hydrogen Simulation

Gradient

CQ

QQ

CC

Thank you for your attention!
