Min-Hsiu Hsieh

Hon Hai (Foxconn) Quantum Computing Research Center

AQIS 2021

Challenges and Opportunities of Quantum Machine Learning

"Quantum Computing" is a very young field.

Feynman, Richard (June 1982). "Simulating Physics with Computers"

"Let the computer itself be built of quantum mechanical elements which obey quantum mechanical laws"

Why Machine Learning?

f: X\to Y

Unknown Function

\{(x_i,y_i)\}_{i=1}^N

Training Data

\mathcal{H}

Hypothesis Set

Learning

Algorithm

\hat{f}

Comp. Complexity

Sample Complexity

Many More!
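The setup above (an unknown target \(f\), training data \(\{(x_i,y_i)\}\), a hypothesis set \(\mathcal{H}\), and a learning algorithm that outputs \(\hat{f}\)) can be sketched in a purely classical toy instance. Everything below (the linear hypothesis set, the least-squares learner, the noise level) is an illustrative assumption, not from the talk:

```python
import numpy as np

# Toy instance of the supervised-learning setup: an unknown target f,
# N training pairs (x_i, y_i), a hypothesis set H (here: linear
# functions a*x + b), and a learning algorithm (least squares) that
# returns an estimate f_hat. All names are illustrative.

rng = np.random.default_rng(0)

def f(x):                # unknown target function f: X -> Y
    return 3.0 * x - 1.0

N = 50
x = rng.uniform(-1, 1, N)             # training inputs
y = f(x) + 0.01 * rng.normal(size=N)  # noisy labels

# Learning algorithm: pick the best hypothesis h(x) = a*x + b in H
design = np.stack([x, np.ones(N)], axis=1)
(a, b), *_ = np.linalg.lstsq(design, y, rcond=None)

def f_hat(x):            # learned hypothesis
    return a * x + b

print(abs(a - 3.0) < 0.1, abs(b - (-1.0)) < 0.1)
```

Computational complexity asks how long the learner runs; sample complexity asks how large \(N\) must be before \(\hat{f}\) generalizes.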

Type of Input × Type of Algorithm

The input data and the algorithm can each be classical (C) or quantum (Q), giving four combinations: CC, CQ, QC, QQ.
  • Linear Equation Solvers

  • Perceptron

  • Recommendation Systems

  • Semidefinite Programming

  • Many Others (such as non-Convex Optimization)

  • State Tomography

  • Entanglement Structure

  • Quantum Control

QML = Machine Learning + Quantum Computation

Holy Grail: Better End-to-End Runtime than Classical ML!

The Challenges!

1. Read-in

2. Read-out

3. Learning Machines

4. Noise

1. Input Oracles

[1] Aleksandrs Belovs, Quantum Algorithms for Classical Probability Distributions, 27th Annual European Symposium on Algorithms (ESA 2019), 2019, pp. 16:1–16:11.
O_p |0\rangle =\sum_{x\in\mathcal{X}} \sqrt{p_x} |x\rangle_A\otimes|\phi_x\rangle_B
O_p |0\rangle =\sum_{x\in\mathcal{X}} \sqrt{p_x} |x\rangle
O_{s} |0\rangle =\sum_{x\in\mathcal{X}} n^{-1/2} |x\rangle \otimes |\#_s(x)\rangle
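The oracle \(O_p\) is an amplitude encoding of the distribution \(p\): measuring its output in the computational basis returns \(x\) with probability \(p_x\). A classical simulation of the target state (the state vector itself, not a circuit implementing the oracle) is a few lines:

```python
import numpy as np

# Classical simulation of the probability-distribution oracle
# O_p|0> = sum_x sqrt(p_x)|x>, i.e. amplitude encoding of p.
# This builds the target state vector directly; an actual O_p circuit
# is exactly what a read-in protocol would have to construct.

p = np.array([0.1, 0.2, 0.3, 0.4])  # a distribution over X = {0,1,2,3}
psi = np.sqrt(p)                    # amplitudes sqrt(p_x)

assert np.isclose(np.linalg.norm(psi), 1.0)   # valid quantum state
# Born rule: measurement outcome x occurs with probability p_x
print(np.allclose(psi**2, p))  # True
```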

QRAM

V. Giovannetti, S. Lloyd, L. Maccone, Phys. Rev. Lett. 100, 160501 (2008).

Take Home: There is no general read-in protocol (with runtime guarantee) for arbitrary datasets.

2. Read-out

State tomography: O\left(\frac{rd}{\epsilon^2}\right) \text{ copies of } \rho.

Contribution:

Given: Input \(A\in\mathbb{R}^{m\times n}\) of rank \(r\) & \(|v\rangle \in\text{row}(A)\).

poly(\(r,\epsilon^{-1}\)) queries to QRAM;

poly(\(r,\epsilon^{-1}\)) copies of \(|v\rangle\).

[1] Efficient State Read-out for Quantum Machine Learning Algorithms. Kaining Zhang, Min-Hsiu Hsieh, Liu Liu, Dacheng Tao. arXiv:2004.06421 

Theorem:

Proof Idea


1. Write \(|v\rangle = \sum_{i=1}^r x_i |A_{g(i)}\rangle\in\text{row}(A)\).

2. Run a quantum Gram–Schmidt process to construct \(\{A_{g(i)}\}\).

3. Obtain \(\{x_i\}\).
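The three steps have a direct classical analogue, which makes the structure of the argument easy to check. The sketch below is exact and classical (the quantum algorithm achieves the same with poly(\(r,\epsilon^{-1}\)) resources); the specific matrix sizes and the choice of the first \(r\) rows as the independent set are illustrative assumptions:

```python
import numpy as np

# Classical sketch of the read-out idea: v lies in row(A) with rank r,
# so (1) pick r independent rows A_{g(i)}, (2) Gram-Schmidt them to
# certify v is in their span, and (3) recover the coefficients x_i with
# v = sum_i x_i A_{g(i)}. Sizes and indices here are illustrative.

rng = np.random.default_rng(1)
m, n, r = 6, 8, 3
A = rng.normal(size=(r, n))                          # r independent rows...
A = np.vstack([A, rng.normal(size=(m - r, r)) @ A])  # ...extended to rank-r A

x_true = rng.normal(size=r)
v = x_true @ A[:r]                                   # v in row(A)

# Step 2: Gram-Schmidt on the chosen rows (assumed independent here)
Q = []
for row in A[:r]:
    w = row - sum((row @ q) * q for q in Q)
    Q.append(w / np.linalg.norm(w))

# v has no component outside span{A_{g(i)}}
resid = v - sum((v @ q) * q for q in Q)
assert np.linalg.norm(resid) < 1e-8

# Step 3: coefficients via the normal equations in the row basis
coeffs = np.linalg.solve(A[:r] @ A[:r].T, A[:r] @ v)
print(np.allclose(coeffs, x_true))  # True
```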


3. Learning Machines

Three aspects of the learning model:

  • Expressivity

  • Trainability

  • Generalization

Neural Network Expressivity

"how the architectural properties of a neural network (depth, width, layer type) affect the resulting functions it can compute"

[1] On the Expressive Power of Deep Neural  Networks. (ICML2017) arXiv:1606.05336

Expressive Power


[1] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Dacheng Tao. The Expressive Power of Parameterized Quantum Circuits. Physical Review Research 2, 033125 (2020) [arXiv:1810.11922].

Contribution:

Trainability of QNN

"How easy is it to find the appropriate weights of the neural networks that fit the given data?"

Gradients vanish exponentially in the number of qubits.

[1] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven. Barren plateaus in quantum neural network training landscapes. Nature communications, 9(1):1– 6, 2018.

f(\bm{\theta},\rho) =\frac{1}{2}\left(1+ \text{Tr}[O U(\bm{\theta})\rho U(\bm{\theta})^\dagger]\right)
\mathbb{E}_{\bm{\theta}}\left(\frac{\partial f}{\partial \theta_j}\right)^2 =\epsilon \leq 2^{-\text{poly}(n)}
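The phenomenon behind this bound is concentration of measure: for circuits deep enough to look Haar-random, \(\text{Tr}[O U\rho U^\dagger]\) concentrates around its mean with variance \(\sim 2^{-n}\), and the gradients inherit that scaling. The toy below checks only this concentration (not the McClean et al. gradient computation itself), sampling Haar-random pure states directly via normalized complex Gaussians:

```python
import numpy as np

# Toy illustration of the scaling behind barren plateaus: for a
# Haar-random state |psi>, the expectation <psi|Z_1|psi> concentrates
# around 0 with variance 1/(2^n + 1), i.e. exponentially small in n.
# A normalized complex Gaussian vector is a Haar-random pure state.

rng = np.random.default_rng(7)

def haar_state(d):
    z = rng.normal(size=d) + 1j * rng.normal(size=d)
    return z / np.linalg.norm(z)

def var_of_expectation(n, samples=300):
    d = 2 ** n
    # Z on the first qubit: eigenvalue +1 / -1 by the leading bit
    z1 = np.where(np.arange(d) < d // 2, 1.0, -1.0)
    vals = [float(z1 @ np.abs(haar_state(d)) ** 2) for _ in range(samples)]
    return np.var(vals)

v2, v8 = var_of_expectation(2), var_of_expectation(8)
print(v8 < v2)  # variance shrinks exponentially with n
```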

Barren Plateau problem

[1] Kaining Zhang, Min-Hsiu Hsieh, Liu Liu, Dacheng Tao. Toward Trainability of Quantum Neural Networks. arXiv:2011.06258 (2020).

\mathbb{E}_{\bm{\theta}} \|\nabla_{\bm{\theta}} f_{\text{TT}} \|^2\geq \Omega\left(\frac{1+\log n}{n}\right)

Theorem:

Contribution:

Trainability of QNN in ERM

[1] ​Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Shan You, Dacheng Tao. On the learnability of quantum neural networks. arXiv:2007.12369 (2020)

\bm{\theta}^*= \arg \min_{\bm{\theta}\in\mathcal{C}} \mathcal{L}(\bm{\theta},\bm{z})
\mathcal{L}(\bm{\theta}):= \frac{1}{n}\sum_{j=1}^n \ell(y_j, \hat{y}_j) + r(\bm{\theta})
R_1\left(\bm{\theta}^{(T)}\right) := \mathbb{E} \left\|\nabla \mathcal{L}(\bm{\theta}^{(T)})\right\|^2
R_2\left(\bm{\theta}^{(T)}\right) := \mathbb{E}[\mathcal{L}(\bm{\theta}^{(T)})] - \mathcal{L}(\bm{\theta}^*)

Trainability of QNN in ERM


R_1\left(\bm{\theta}^{(T)}\right) := \mathbb{E} \left\|\nabla \mathcal{L}(\bm{\theta}^{(T)})\right\|^2
R_1 \leq \tilde{O}\left(\text{poly}\left(\frac{d}{T(1-p)^{L_Q}}, \frac{d}{BK(1-p)^{L_Q}} \right) \right)

\(d\)= \(|\bm{\theta}|\)

\(T\)= # of iterations

\(L_Q\)= circuit depth

\(p\)= error rate

\(K\)= # of measurements

\(B\)= batch size
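The two quantities that degrade the bound, the depolarizing attenuation \((1-p)^{L_Q}\) and the finite measurement budget \(K\), are easy to see in a one-parameter toy. The model below (a single expectation \(\langle Z\rangle = \cos\theta\) estimated from \(K\) shots, with the signal damped by \((1-p)^{L_Q}\)) is an illustrative assumption, not the paper's algorithm:

```python
import numpy as np

# Toy model of the factors in the R_1 bound: depolarizing noise damps
# the signal by (1-p)^{L_Q}, and each expectation is estimated from K
# single-shot +/-1 measurements. Illustrative only.

rng = np.random.default_rng(3)

def noisy_expectation(theta, p, L_Q, K):
    """Estimate <Z> = (1-p)^{L_Q} * cos(theta) from K shots."""
    mean = (1 - p) ** L_Q * np.cos(theta)
    shots = rng.choice([1.0, -1.0], size=K,
                       p=[(1 + mean) / 2, (1 - mean) / 2])
    return shots.mean()

def grad_estimate(theta, p, L_Q, K):
    """Parameter-shift rule on the noisy expectation."""
    return 0.5 * (noisy_expectation(theta + np.pi / 2, p, L_Q, K)
                  - noisy_expectation(theta - np.pi / 2, p, L_Q, K))

theta = 0.3
ideal = -np.sin(theta)  # noiseless gradient of cos(theta)
g = np.mean([grad_estimate(theta, p=0.01, L_Q=10, K=1000)
             for _ in range(200)])
# The estimator is unbiased for the *damped* gradient (1-p)^{L_Q} * ideal
print(abs(g - (1 - 0.01) ** 10 * ideal) < 0.02)
```

Larger \(p\) or \(L_Q\) shrinks the signal toward zero while the shot noise stays fixed, which is exactly why both appear in the denominator of the bound.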

Trainability of QNN in ERM


R_2\left(\bm{\theta}^{(T)}\right) := \mathbb{E}[\mathcal{L}(\bm{\theta}^{(T)})] - \mathcal{L}(\bm{\theta}^*)
R_2\leq \tilde{O}\left( \text{poly}\left(\frac{d}{K^2B (1-p)^{L_Q}} ,\frac{d}{(1-p)^{L_Q}}\right) \right)

\(d\)= \(|\bm{\theta}|\)

\(T\)= # of iterations

\(L_Q\)= circuit depth

\(p\)= error rate

\(K\)= # of measurements

\(B\)= batch size

Classical: 1 error per 6 months in a 128MB PC100 SDRAM (2009)
Quantum: 1 error per second per qubit (2021)

4. Noise 

4.1 Error Mitigation 

(\bm{\theta}^*,\bm{a}^*)= \arg \min_{\bm{\theta}\in\mathcal{C},\bm{a}\in\mathcal{A}} \mathcal{L}(\bm{\theta},\bm{a}, \mathcal{E}_{\bm{a}})

\(\mathcal{C}\): The collection of all parameters

\(\mathcal{A}\): The collection of all possible circuits

\(\mathcal{E}_{\bm{a}}\): The error for the architecture \(\bm{a}\)
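The objective above searches jointly over circuit architectures \(\bm{a}\) and parameters \(\bm{\theta}\), with the noise \(\mathcal{E}_{\bm{a}}\) depending on the chosen architecture. A minimal random-search sketch of that joint optimization, under a toy noise model where depth \(L\) costs a factor \((1-p)^L\), is below; it is purely illustrative and is not the QAS algorithm of arXiv:2010.10217:

```python
import numpy as np

# Minimal sketch of the joint search over architectures a (here: circuit
# depth L) and parameters theta. Deeper circuits are more expressive but
# suffer more depolarizing attenuation (1-p)^L, so the search trades the
# two off. Toy loss model; illustrative only.

rng = np.random.default_rng(5)
p = 0.05       # per-layer error rate
target = -1.0  # e.g. a ground-state energy to match

def loss(L, theta):
    # Toy model: an L-layer ansatz reaches cos-like energies, damped by noise
    energy = (1 - p) ** L * (-np.cos(theta.sum()))
    return (energy - target) ** 2

best = (None, None, np.inf)
for L in [1, 2, 4, 8]:          # candidate architectures a
    for _ in range(200):        # random parameter trials for this a
        theta = rng.uniform(0, 2 * np.pi, size=L)
        val = loss(L, theta)
        if val < best[2]:
            best = (L, theta, val)

print("best depth:", best[0])   # noise favors the shallow circuit here
```

In this toy the shallowest architecture wins because the noise penalty dominates; with lower \(p\) or a harder target, deeper candidates would be selected.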

[1] Yuxuan Du, Tao Huang, Shan You, Min-Hsiu Hsieh, Dacheng Tao. Quantum circuit architecture search: error mitigation and trainability enhancement for variational quantum solvers. arXiv:2010.10217 (2020).

Error Mitigation


Hydrogen Simulation

Could noise become useful in QML?

YES!

4.2 Harnessing Noise

4.2.1 Noise could preserve privacy.

Classical DP is well studied; however, quantum DP is not.

Quantum ML and DP learning have different aims!

​[1] Li Zhou and Mingsheng Ying. Differential privacy in quantum computation. In 2017 IEEE 30th Computer Security Foundations Symposium (CSF), pages 249–262. IEEE, 2017. 
[2] Scott Aaronson and Guy N Rothblum. Gentle measurement of quantum states and differential privacy. Proceedings of ACM STOC 2019.

Dilemma:

Contribution:

1. The first quantum DP algorithm.

2. The same privacy guarantee as the best classical DP algorithm.

3. Huge runtime improvement.

[1] ​​Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Shan You, Dacheng Tao. Quantum differentially private sparse regression learning. arXiv:2007.11921 (2020)
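The classical baseline the quantum algorithm is measured against is the textbook Laplace mechanism: a query with sensitivity \(s\) released with Laplace(\(s/\epsilon\)) noise is \(\epsilon\)-differentially private. A sketch of that classical mechanism (the counting query and parameter values are illustrative):

```python
import numpy as np

# The classical Laplace mechanism, the standard DP baseline: adding
# Laplace(sensitivity/eps) noise to a query makes its release
# eps-differentially private. Illustrative sketch, not the quantum
# algorithm of arXiv:2007.11921.

rng = np.random.default_rng(11)

def laplace_mechanism(query_value, sensitivity, eps):
    """Release query_value with eps-DP via Laplace noise."""
    return query_value + rng.laplace(scale=sensitivity / eps)

# Counting query: adding/removing one record changes the count by at
# most 1, so the sensitivity is 1.
data = np.array([1, 0, 1, 1, 0, 1])
true_count = int(data.sum())
private_count = laplace_mechanism(true_count, sensitivity=1.0, eps=1.0)
print(abs(private_count - true_count) < 20)  # noise scale is 1/eps
```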

4.2.2 Noise could resist adversarial attack.

Lu et al., "Quantum Adversarial Machine Learning," arXiv:2001.00030.

Adversarial Robustness

Contribution:

1. Explicit relation between \(p\) and \(\tau\).

2. Depolarizing noise suffices.
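The mechanism behind the robustness claim is a contraction property: the depolarizing channel \(\mathcal{D}_p(\rho) = (1-p)\rho + p\,I/d\) shrinks the trace distance between any two states by exactly \((1-p)\), so an adversarial perturbation of the input becomes strictly less distinguishable at the output. The check below verifies only this contraction, not the talk's quantitative \(p\)–\(\tau\) relation:

```python
import numpy as np

# Depolarizing channel D_p(rho) = (1-p) rho + p I/d. Since
# D_p(rho) - D_p(sigma) = (1-p)(rho - sigma), the trace distance
# between any two states contracts by exactly (1-p).

def depolarize(rho, p):
    d = rho.shape[0]
    return (1 - p) * rho + p * np.eye(d) / d

def trace_distance(a, b):
    eigs = np.linalg.eigvalsh(a - b)
    return 0.5 * np.abs(eigs).sum()

rho = np.diag([1.0, 0.0])                    # |0><0|
sigma = np.array([[0.5, 0.5], [0.5, 0.5]])   # |+><+|
p = 0.3
before = trace_distance(rho, sigma)
after = trace_distance(depolarize(rho, p), depolarize(sigma, p))
print(np.isclose(after, (1 - p) * before))  # True
```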

Thank you for your attention!
