Machine Learning Meets Quantum Computation

Min-Hsiu Hsieh (謝明修)

University of Technology Sydney

f: X\to Y

Unknown Function

\{(x_i,y_i)\}_{i=1}^N

Training Data

\mathcal{H}

Hypothesis Set

Learning Algorithm

\hat{f}

Computational Complexity

Sample Complexity

Quantum Computation

Classical Bit $x\in\mathbb{Z}_2=\{0,1\}$

Qubit $\rho\in\mathbb{C}^{2\times 2}$, $\rho\geq 0$ and $\text{Tr}[\rho]=1$

Random Bit $\left(\begin{array}{cc} p(0) & 0\\ 0 & p(1) \end{array}\right)$ is a special case.
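As a small numerical illustration, the two conditions on $\rho$ (positive semidefinite, unit trace) can be checked directly. This is only a sketch: NumPy is assumed, and the helper name `is_density_matrix` and the example matrices are made up here, not taken from the slides.

```python
import numpy as np

def is_density_matrix(rho, tol=1e-9):
    """Hypothetical helper: True if rho is Hermitian, PSD and has unit trace."""
    rho = np.asarray(rho, dtype=complex)
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    # eigvalsh expects a Hermitian input, so symmetrize before calling it
    eigenvalues = np.linalg.eigvalsh((rho + rho.conj().T) / 2)
    psd = bool(np.all(eigenvalues >= -tol))
    unit_trace = bool(abs(np.trace(rho) - 1) < tol)
    return hermitian and psd and unit_trace

# A random bit is the diagonal special case diag(p(0), p(1))
random_bit = np.diag([0.3, 0.7])
# A state with off-diagonal terms, which no random bit can represent
plus_state = 0.5 * np.array([[1, 1], [1, 1]])
# Unit trace but a negative eigenvalue, so not a valid state
not_a_state = np.array([[1.5, 0], [0, -0.5]])

checks = [is_density_matrix(m) for m in (random_bit, plus_state, not_a_state)]
```

The first two matrices pass the check and the third fails it.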

Quantum Computation

Quantum Operation: $\rho\mapsto\sigma$

Unitary is a special case.

Quantum Measurement: $\rho\mapsto\mathbb{R}$

Quantum Challenge #1

Noncommutative: $AB\neq BA$

Moment Generating Function: $\mathbb{E}e^{\theta (A+B)}\neq\mathbb{E}e^{\theta A}e^{\theta B}$

\frac{a}{b} \mapsto A B^{-1}?

e^{a+b} \mapsto e^A e^B?
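The failure of $e^{A+B}=e^Ae^B$ can already be seen for fixed matrices, before any expectation is taken. The sketch below uses the Pauli matrices $X$ and $Z$ as an illustrative (assumed) choice of $A$ and $B$; `expm_hermitian` is a helper written for this example.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def expm_hermitian(H):
    """exp(H) for a Hermitian matrix H, via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(w)) @ V.conj().T

# X and Z do not commute
commutator = X @ Z - Z @ X

# ... so exp(X + Z) differs from exp(X) exp(Z)
gap = np.linalg.norm(expm_hermitian(X + Z) - expm_hermitian(X) @ expm_hermitian(Z))

# For commuting matrices (here X with itself) the scalar identity survives
commuting_gap = np.linalg.norm(
    expm_hermitian(2 * X) - expm_hermitian(X) @ expm_hermitian(X)
)
```

Here `gap` is clearly nonzero, while `commuting_gap` vanishes up to rounding.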

Quantum Challenge #2

Entanglement: $\rho_{AB}\neq \rho_{A}\otimes\rho_B$
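A quick numerical check of $\rho_{AB}\neq\rho_A\otimes\rho_B$, sketched with NumPy. The two-qubit state $(|01\rangle-|10\rangle)/\sqrt{2}$ is an assumed example; the partial traces are computed with `einsum` after reshaping $\rho_{AB}$ to four indices.

```python
import numpy as np

# (|01> - |10>) / sqrt(2) in the basis |00>, |01>, |10>, |11>
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
rho_ab = np.outer(psi, psi.conj())

# Reshape so the indices read (a, b, a', b')
t = rho_ab.reshape(2, 2, 2, 2)
rho_a = np.einsum("ajbj->ab", t)  # trace out system B
rho_b = np.einsum("iaib->ab", t)  # trace out system A

# The product of the marginals is the maximally mixed state, not rho_ab
product = np.kron(rho_a, rho_b)
distance = np.linalg.norm(rho_ab - product)
```

Both marginals come out as $I/2$, yet `distance` is far from zero, so the joint state carries correlations that the marginals alone do not determine.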

Problem Setup

Q, R, S, T\in\{\pm 1\}

Alice

Bob

Compute $(QS+RS+RT-QT)$

Classical Mechanics

$\theta=(Q+R)S+(R-Q)T\leq 2$

Probabilistically, let $\text{p}(qrst) := \text{Pr}\{Q=q,R=r,S=s,T=t\}$.

\mathbb{E}[\theta]= \sum_{qrst}\text{p} (qrst)(qs+rs+rt-qt) \leq 2
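The classical bound can be confirmed by brute force: enumerate all 16 assignments of $\pm 1$ to $(Q,R,S,T)$ and then average over an arbitrary joint distribution. This is a sketch only; the Dirichlet-sampled distribution is an illustrative assumption.

```python
from itertools import product

import numpy as np

# theta = qs + rs + rt - qt for every deterministic assignment
values = np.array(
    [q * s + r * s + r * t - q * t for q, r, s, t in product([-1, 1], repeat=4)]
)
classical_max = int(values.max())

# Any joint distribution p(qrst) is a convex mixture of these assignments
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(len(values)))
expectation = float(p @ values)
```

Every assignment gives $\theta=\pm 2$, so no probabilistic mixture can push the expectation above 2.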

Quantum Mechanics

|\Psi_{AB}\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle_A|1\rangle_B -|1\rangle_A |0\rangle_B\right)

Observables with outcomes in $\{\pm 1\}$:

Q=Z

R=X

S=\frac{-Z-X}{\sqrt{2}}

T=\frac{Z-X}{\sqrt{2}}

Quantum Mechanics

\mathbb{E}[\theta] = \langle QS\rangle + \langle RS\rangle + \langle RT\rangle - \langle QT\rangle= 2\sqrt{2}
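The value $2\sqrt{2}$ can be reproduced numerically from the state and observables given on the slides. The sketch assumes NumPy and the standard matrix representations of $X$ and $Z$; `correlation` is a helper introduced for this example.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# |Psi_AB> = (|01> - |10>) / sqrt(2)
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

# Alice's observables Q, R and Bob's observables S, T
Q, R = Z, X
S = (-Z - X) / np.sqrt(2)
T = (Z - X) / np.sqrt(2)

def correlation(a, b):
    """<Psi| a (x) b |Psi> for Alice's observable a and Bob's observable b."""
    return float(np.real(psi.conj() @ np.kron(a, b) @ psi))

chsh = correlation(Q, S) + correlation(R, S) + correlation(R, T) - correlation(Q, T)
```

Each of the four terms contributes $1/\sqrt{2}$ in magnitude, and `chsh` exceeds the classical bound of 2.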

Why Does Quantum Computation Matter?

Many More!

Type of Input vs. Type of Algorithms (C = classical, Q = quantum)

CC, CQ, QC, QQ

Linear Equation Solvers

Perceptron

Recommendation Systems

Semidefinite Programming

Many Others (such as non-Convex Optimization)

State Tomography

Entanglement Structure

Quantum Control

CC

Linear Equation Solvers

Recommendation Systems

Semidefinite Programming

Minimum Conical Hull

Quantum-Inspired Classical Algorithms

$A\mathbf{x} = \mathbf{b}$

$A =\sum \sigma_\ell |u_\ell\rangle\langle v_\ell|$

$\mathbf{x} =\sum \lambda_\ell |v_\ell\rangle$
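For an invertible $A$, the coefficients in the expansion of $\mathbf{x}$ are $\lambda_\ell=\langle u_\ell|\mathbf{b}\rangle/\sigma_\ell$; this is the standard pseudo-inverse formula, which the slide does not spell out. A minimal classical sketch with NumPy follows, where the matrix and right-hand side are made-up examples.

```python
import numpy as np

# Illustrative, well-conditioned system (values are arbitrary)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])

# A = sum_l sigma_l |u_l><v_l|: columns of U are u_l, rows of Vh are <v_l|
U, sigma, Vh = np.linalg.svd(A)

# lambda_l = <u_l|b> / sigma_l
lam = (U.conj().T @ b) / sigma

# x = sum_l lambda_l |v_l>
x = Vh.conj().T @ lam

residual = np.linalg.norm(A @ x - b)
```

The result agrees with `np.linalg.solve(A, b)` up to rounding.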

Discrete Fourier Transform

y_k = \frac{1}{\sqrt{N}}\sum_{\ell=0}^{N-1} e^{i\frac{2\pi}{N} \ell k} x_\ell

Classical Fast Fourier Transform requires $\Theta (n2^n)$ operations if $N=2^n$.

Quantum Fourier Transform requires $\Theta (n^2)$ operations if $N=2^n$.
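The transform above is a unitary $N\times N$ matrix, and the quantum Fourier transform applies this same matrix to the $2^n$ amplitudes of an $n$-qubit state. The sketch below builds it explicitly for a small, arbitrarily chosen $n$ and compares it with NumPy's FFT. With the positive-exponent convention used here, the matching call is `np.fft.ifft` with `norm="ortho"`.

```python
import numpy as np

n = 3          # number of qubits (illustrative choice)
N = 2 ** n

# F[k, l] = exp(2*pi*i*l*k/N) / sqrt(N)
idx = np.arange(N)
F = np.exp(2j * np.pi * np.outer(idx, idx) / N) / np.sqrt(N)

x = np.arange(N, dtype=complex)

# Direct matrix-vector product: Theta(N^2) operations
y_direct = F @ x

# Fast Fourier transform: Theta(N log N) = Theta(n 2^n) operations
y_fft = np.fft.ifft(x, norm="ortho")

unitarity_error = np.linalg.norm(F.conj().T @ F - np.eye(N))
```

Both routes give the same vector, and `unitarity_error` is zero up to rounding.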

Sample Complexity

f: X\to Y

Unknown Function

\{(x_i,y_i)\}_{i=1}^N

Training Data

h\in\mathcal{H}

Hypothesis Set

Empirical Risk Minimization

Given a loss function

\ell:Y\times Y \to \mathbb{R}

find

{f}_n = \arg \min_{h\in \mathcal{H}} R_n (h)

where (with $n=N$ the number of training samples)

R_n(h) = \frac{1}{N}\sum_{i=1}^N \ell (h(x_i), y_i)
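A toy instance of empirical risk minimization, as a sketch only. The unknown function (a threshold at 0.6), the finite hypothesis set of threshold classifiers, and the 0-1 loss are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data {(x_i, y_i)} drawn from a hypothetical unknown function f
N = 200
x = rng.uniform(0, 1, size=N)
y = np.where(x > 0.6, 1, -1)

# Hypothesis set H: h_t(x) = +1 if x > t, else -1, for t on a grid
thresholds = np.linspace(0, 1, 101)

def empirical_risk(t):
    """R_n(h_t): average 0-1 loss of h_t on the training data."""
    predictions = np.where(x > t, 1, -1)
    return float(np.mean(predictions != y))

risks = np.array([empirical_risk(t) for t in thresholds])

# f_n = argmin over H of the empirical risk
best_threshold = float(thresholds[np.argmin(risks)])
```

The minimizer lands close to the true threshold, but it need not equal it: any grid point with no training samples between it and 0.6 achieves the same empirical risk.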

Probably Approximately Correct (PAC) Learnable

$\mathcal{H}$ is PAC learnable if for any $\epsilon>0$

$$ \lim_{n\to\infty}\sup_{\mu} \Pr\{\sup_{h\in\mathcal{H}}|R(h) - R_n(h)| >\epsilon\} = 0$$

Sample Complexity

Sample complexity $m_\mathcal{H}(\epsilon,\delta)$ is the smallest quantity such that, for every $n\geq m_\mathcal{H}(\epsilon,\delta),$

$\sup_{\mu} \Pr \left\{ \sup_{h\in\mathcal{H}} \big|R(h)-R_n(h)\big|\geq \epsilon \right\}\leq \delta$

For Boolean functions $\mathcal{H}$ and some constant $C$,

$m_{\mathcal{H}}(\epsilon,\delta)= \frac{C}{\epsilon^2}\left(\text{VCdim}(\mathcal{H})\log\left(\frac{2}{\epsilon}\right)+\log\left(\frac{2}{\delta}\right)\right)$

[1] Vapnik, Springer-Verlag, New York/Berlin, 1982.

[2] Blumer, Ehrenfeucht, Haussler, and Warmuth, Assoc. Comput. Machine, vol. 36, no. 4, pp. 151--160, 1989.
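To get a feel for how the VC-dimension expression for $m_\mathcal{H}(\epsilon,\delta)$ scales, it can be evaluated for a few parameter values. The constant $C$ is not specified on the slide, so `C=1.0` below is a placeholder assumption, and the function name and inputs are illustrative. The absolute numbers therefore mean little; only the scaling does.

```python
import math

def sample_complexity_bound(vc_dim, epsilon, delta, C=1.0):
    """Evaluate (C / eps^2) * (VCdim * log(2/eps) + log(2/delta)).

    C is an unspecified constant in the source; 1.0 is a placeholder.
    """
    return (C / epsilon**2) * (
        vc_dim * math.log(2 / epsilon) + math.log(2 / delta)
    )

m_coarse = sample_complexity_bound(vc_dim=10, epsilon=0.1, delta=0.05)
m_fine = sample_complexity_bound(vc_dim=10, epsilon=0.05, delta=0.05)
m_confident = sample_complexity_bound(vc_dim=10, epsilon=0.1, delta=0.0005)
```

Halving $\epsilon$ more than quadruples the bound, while shrinking $\delta$ by a factor of 100 adds only a small logarithmic term.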

$Z=\sup_{f\in\mathcal{F}}\big| \sum_{i=1}^n f(x_i)\big|$

${Z}=\sup_{\bm{f}\in\mathcal{F}}\left\| \sum_{i=1}^n \bm{f}(\bm{X}_i)\right\|_p.$

There are only very limited matrix concentration results!!

[1] Joel Tropp. User-friendly tail bounds for sums of random matrices. arXiv:1004.4389.

Sample Complexity for Learning Quantum Objects

Q. State

Measurement

Sample Complexity for Learning Quantum States

Unknown Function

f_\rho : \mathcal{E}(\mathcal{H}) \to \mathbb{R}

f_\rho(E) = \text{Tr} E\rho

Hypothesis Set

\{f_{\rho}:\rho\in \mathcal{D}(\mathcal{H})\}

Training Data

\{(E_i,f_\rho(E_i))\}_{i=1}^N

fat$_{\mathcal{D}(\mathcal{H})}(\epsilon,\mathcal{E}(\mathcal{H})) = O(\log d/\epsilon^2)$

Learning Unknown Measurement

Unknown Function

f_E: \mathcal{D}(\mathcal{H}) \to \mathbb{R}

f_E(\rho) = \text{Tr} E\rho

Hypothesis Set

\{f_{E}:E\in \mathcal{E}(\mathcal{H})\}

Training Data

\{(\rho_i,f_E(\rho_i))\}_{i=1}^N

Learning States

fat$_{\mathcal{D}(\mathcal{H})}(\epsilon,\mathcal{E}(\mathcal{H})) = O(\log d/\epsilon^2)$

Learning Measurements

fat$_{\mathcal{E}(\mathcal{H})}(\epsilon,\mathcal{D}(\mathcal{H})) = O( d/\epsilon^2)$

Hao-Chung Cheng, Min-Hsiu Hsieh, Ping-Cheng Yeh. The learnability of unknown quantum measurements. QIC 16(7&8):615–656 (2016).
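A sketch of what training data for learning an unknown state looks like, followed by a rough comparison of the two fat-shattering bounds. The random sampling of states and effects, the dimension, and the helper names are illustrative assumptions. The constants hidden in the $O(\cdot)$ bounds are dropped, so only the growth in $d=2^n$ is meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # Hilbert space dimension (illustrative)

def random_state(dim):
    """A random density matrix: PSD with unit trace."""
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def random_effect(dim):
    """A random effect E with 0 <= E <= I."""
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    M = G @ G.conj().T
    return M / np.linalg.eigvalsh(M).max()

# Training data {(E_i, f_rho(E_i))} with f_rho(E) = Tr(E rho)
rho = random_state(d)
training_data = []
for _ in range(5):
    E = random_effect(d)
    training_data.append((E, float(np.real(np.trace(E @ rho)))))
labels = [label for _, label in training_data]

# Growth of log(d)/eps^2 (states) versus d/eps^2 (measurements), d = 2^n
epsilon = 0.1
scaling = {
    n: (n * np.log(2) / epsilon**2, 2**n / epsilon**2) for n in (2, 10, 20)
}
```

Each label is a probability in $[0,1]$. The first entry of each `scaling` pair grows linearly in the number of qubits $n$, the second exponentially.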

Thank you for your attention!

PhD Wanted!