Thesis defense of
Alessandro Luongo
20 November 2020
Supervisor: Iordanis Kerenidis
Co-supervisor: Frédéric Magniez.
aluongo@irif.fr
Input: X ∈ ℝ^{n×d} with n ≫ d

| Algorithm | Runtime |
|---|---|
| Worst-case classical algorithms | O(nd²) |
| Randomized classical algorithms | O(∥X∥₀ × poly(κ(X), ε, …)) |
| Quantum algorithms* | O(∥X∥₀ + poly(κ(X), ε, μ(X), …)) |

* to be run on fault-tolerant quantum computers with quantum access to classical data
Quantum classification of the MNIST dataset via slow feature analysis. I. Kerenidis, AL - PRA. [QSFA] (supervised ML, dimensionality reduction, classification, experiments on real data)
q-means: A quantum algorithm for unsupervised machine learning.
I. Kerenidis, J. Landman, AL, A. Prakash - NeurIPS2019 [QMEANS] (unsupervised ML, clustering, experiments on real data)
Quantum Expectation-Maximization for Gaussian mixture models.
I. Kerenidis, AL, A. Prakash - ICML2020 [QEM] (unsupervised ML, clustering, experiments on real data)
Quantum algorithms for spectral sums. C. Shao, AL - arXiv:2011.06475 [QSS] (quantum algorithms numerical linear algebra)
Application of quantum algorithms for spectral sums. AL - (to appear) [AQSS] (statistics, applications.)
Query access to matrices
Quantum linear algebra
Distance estimations
Singular Value Estimation
Tomography (of pure states)
Hamiltonian simulation
Amplitude estimation and amplification
Singular value transformations
Polynomial approximations
...
Classical preprocessing time: O(nd log(nd))
Classical space: O(nd log(nd))
Query time: O(log(nd))
Quantum space: O(log(nd))
Iordanis Kerenidis, Anupam Prakash - 8th Innovations in Theoretical Computer Science Conference - ITCS 2017.
Anupam Prakash - Quantum algorithms for linear algebra and machine learning. Diss. UC Berkeley - 2014.
Quantum query:
∣i⟩∣0⟩↦∣i⟩∣xi⟩
where ∣x_i⟩ = (1/∥x_i∥₂) x_i = (1/∥x_i∥₂) ∑_j (x_i)_j ∣j⟩
X∈Rn×d=[x1,…,xn]T xi∈Rd
Quantum query (in superposition over rows):
(1/√n) ∑_i ∣i⟩∣0⟩ ↦ (1/√n) ∑_i ∣i⟩∣x_i⟩
Given:
quantum sparse access A∈Rn×n,
and a vector x∈Rn
The HHL algorithm produces a state ∣z⟩ such that ∥∣z⟩ − ∣A⁻¹x⟩∥ ≤ ε,
where κ(A) = σ₁(A)/σₙ(A) and s is the row sparsity
Harrow, Aram, Avinatan Hassidim, Seth Lloyd - Physical review letters - 2009
Given matrix A∈Rn×m, and a vector x∈Rm
It is possible to produce a state ∣z⟩ s.t.
In general:
f(A) = ∑_i^d f(σ_i) ∣u_i⟩⟨v_i∣
Iordanis Kerenidis, Anupam Prakash - 8th Innovations in Theoretical Computer Science Conference - 2017.
András Gilyén, et al. - Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing - 2019.
Guang Hao Low, Isaac L. Chuang - Physical review letters - 2017.
μ(A) = min_{p∈[0,1]} ( ∥A∥_F , √( max_{i∈[n]} ∥a_i∥_{2p}^{2p} · max_{j∈[d]} ∥a_{*j}∥_{2(1−p)}^{2(1−p)} ) )
| Estimate | Mapping | Time |
|---|---|---|
| Euclidean distance [QMEANS] | ∣i,j⟩ ↦ ∣i,j, ∥x_i − x_j∥₂⟩ | O(1/ε) |
| Distance induced by A [QEM] | ∣i,j⟩ ↦ ∣i,j, d_A(x_i, x_j)⟩ | O(μ(A)/ε) |
| Quadratic forms [Thesis] | ∣i,j⟩ ↦ ∣i,j, x_i^T A⁻¹ x_j⟩ | O(μ(A)κ(A)/ε) |
X∈Rn×d=[x1,…,xn]T xi∈Rd
We can produce an estimate x̄ of ∣x⟩ such that ∥x̄∥ = 1 and ∥x̄ − x∥₂ ≤ ε using O(d log d / ε²) samples (ℓ₂ tomography), or ∥x̄ − x∥_∞ ≤ ε using O(log d / ε²) samples (ℓ_∞ tomography).
Iordanis Kerenidis, Anupam Prakash - ACM Transactions on Quantum Computing - 2020.
Iordanis Kerenidis, Anupam Prakash, Jonas Landman - International Conference on Learning Representations - 2019.
Theorem: Assume quantum access to A, B ∈ ℝ^{n×n}.
There is an algorithm that performs the mapping
∑iαi∣i⟩↦∑iαi∣i,σi⟩
where σi is the i-th singular value of AB in time
O( (κ(A)+κ(B))(μ(A)+μ(B)) / ε )
Shantanav Chakraborty., et al. - 46th International Colloquium on Automata, Languages, and Programming - 2019.
[QSFA]
X∈Rn×d dataset
L∈[K]n labels
(classification)
Quantum slow feature analysis for dimensionality reduction
Quantum Frobenius distance classifier: a simple NISQ classifier
Simulation on MNIST dataset of handwritten digits
Extension to other generalized eigenvalue problems in ML
[QSFA, Thesis]
https://cmp.felk.cvut.cz/cmp/software/stprtool
X∈Rn×d (images)
Y∈Rn×K
yi=[w1Txi,…,wKTxi]
⟨Y_{*j}⟩ = 0
⟨Y_{*j}²⟩ = 1
∀ j′ < j: ⟨Y_{*j′} Y_{*j}⟩ = 0
L∈[K]n (labels)
Finding the model {wj}j=1K reduces to a constrained optimization problem:
The componentwise average should be zero.
The componentwise variance should be 1.
Signals are maximally uncorrelated.
Constraints
Finding the model {w_j}_{j=1}^K reduces to an optimization problem: minimize the output differences over same-class pairs,
min ∑_{k∈[K]} ∑_{s,t∈T_k, s<t} (y_s − y_t)²
Theorem:
Quantum access to X, derivative matrix X˙
Let ϵ,θ,δ,η>0
There are quantum algorithms to get:
Map the dataset into the slow-feature space ∣Y⟩ in time: Õ( (κ(X)+κ(Ẋ))(μ(X)+μ(Ẋ)) / (δθ γ_{K−1}) )
Find {w_j}_{j=0}^K in time: O( (d^{1.5}K/ε²) κ(X) κ(ẊX) (μ(X)+μ(Ẋ)) )
[QSFA]
Theorem: Assuming quantum access to ∣Y⟩ in time T, we can label images into k classes in time O(kT/ε).
Definition: QFDC (Quantum Frobenius distance classifier):
A point is assigned to the cluster with smallest
normalized average squared distance
between the point and the points of the cluster.
[QSFA]
Classification of handwritten digits of MNIST dataset
Ex: polynomial expansion of degree 2:
[x₁, x₂, x₃] ↦ [x₁², x₁x₂, …, x₃²]
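As a classical illustration of the degree-2 expansion above, a minimal sketch (the function name `poly_expand` is ours, not from the thesis):

```python
from itertools import combinations_with_replacement

def poly_expand(x, degree=2):
    """Map a feature vector to all monomials of exactly `degree`,
    e.g. [x1, x2, x3] -> [x1^2, x1*x2, x1*x3, x2^2, x2*x3, x3^2]."""
    out = []
    for idx in combinations_with_replacement(range(len(x)), degree):
        m = 1.0
        for i in idx:
            m *= x[i]
        out.append(m)
    return out
```

Such an expansion lets a linear method (like SFA) capture quadratic relations in the original features.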
Malware detection via DGA classification
| Accuracy | Original classifier | With slow feature |
|---|---|---|
| Logistic Regression | 89% | 90.5% (+1.5%) |
| Naive Bayes classifier | 89.3% | 92.3% (+3.0%) |
| Decision Trees | 91.4% | 94.0% (+2.6%) |
[Thesis]
Using SFA in a classification problem improves its accuracy
QSFA can process old datasets in new ways!
SFA as instance of a more general problem
The GEP (Generalized Eigenvalue Problem) is defined as:
In SFA:
ICA Independent Component Analysis
G-IBM Gaussian Information Bottleneck Method
CCA Canonical Correlation Analysis
SC (some) Spectral Clustering [1]
PLS Partial Least Squares
LE Laplacian Eigenmaps
FLD Fisher Linear Discriminant
SFA Slow Feature Analysis
KPCA Kernel Principal Component Analysis
AW=BWΛ
[1] Iordanis Kerenidis, Jonas Landman - Quantum spectral clustering. arXiv2007.00280. (2020)
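The GEP AW = BWΛ above can be solved classically by whitening with B^{−1/2}; a minimal numpy sketch, assuming A symmetric and B SPD (`solve_gep` is an illustrative name):

```python
import numpy as np

def solve_gep(A, B):
    """Solve the symmetric-definite GEP A W = B W Lambda by whitening:
    reduce to the ordinary eigenproblem (B^{-1/2} A B^{-1/2}) V = V Lambda,
    then map back with W = B^{-1/2} V."""
    lb, Ub = np.linalg.eigh(B)
    B_inv_sqrt = Ub @ np.diag(lb ** -0.5) @ Ub.T   # B^{-1/2}
    M = B_inv_sqrt @ A @ B_inv_sqrt                # whitened operator
    lam, V = np.linalg.eigh(M)
    W = B_inv_sqrt @ V
    return lam, W

# sanity check on a small SPD pair
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
A = X.T @ X + np.eye(4)
B = 2.0 * np.eye(4)
lam, W = solve_gep(A, B)
residual = np.linalg.norm(A @ W - B @ W @ np.diag(lam))
```

The same reduction underlies SFA, CCA, FLD and the other GEP instances listed above.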
X∈Rn×d
(clustering)
q-means for clustering (quantum version of k-means)
Quantum Expectation-Maximization
Simulation on VoxForge dataset for speaker recognition
t←0
Step 1:
Compute, for all points x_i and centroids μ_j^t, the distance d(x_i, μ_j^t)
Assign each point to its closest cluster: ℓ(x_i) = argmin_{c∈[K]} d(x_i, μ_c^t)
Step 2:
Compute the barycenters: μ_j^{t+1} = (1/∣C_j∣) ∑_{i∈C_j} x_i
t ←t+1
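One iteration of the classical k-means loop above can be sketched as follows (illustrative `kmeans_step`, not the quantum q-means):

```python
import numpy as np

def kmeans_step(X, centroids):
    """One Lloyd iteration: assign each point to its nearest centroid,
    then recompute each centroid as the barycenter of its cluster."""
    # pairwise distances d(x_i, mu_j): shape (n, k)
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    new_centroids = np.array([
        X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
        for j in range(len(centroids))
    ])
    return labels, new_centroids

# two well-separated blobs
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, cents = kmeans_step(X, np.array([[0.0, 0.1], [5.0, 4.9]]))
```

q-means replaces the distance computation, assignment, and barycenter steps with their quantum counterparts on the next slide.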
t←0
Step 1:
Compute, for all points v_i and centroids μ_j^t: ∣i,j⟩ ↦ ∣i,j, d(v_i, μ_j^t)⟩
Generate the characteristic vector of each cluster: ∣χ_j⟩ = (1/√∣C_j∣) ∑_{i∈C_j} ∣i⟩
Step 2:
Use quantum linear algebra to build ∣μ_j^{t+1}⟩ = (1/∣C_j∣) ∑_{i∈C_j} ∣v_i⟩
Perform tomography on ∣μ_j^{t+1}⟩
Build quantum access to μj.
t←t+1
Theorem: Given quantum access to a matrix X ∈ ℝ^{n×d}, there is a quantum algorithm that fits a k-means model in time:
Õ( k² d η^{2.5} / δ³ )
Classical: O(nkd)
[QMEANS]
∥μ_j − μ_j^*∥ ≤ δ
η = max_i ∥x_i∥²
t←0
Expectation:
Compute, for all points v_i and centroids μ_j^t: ∣i,j⟩ ↦ ∣i,j, d(v_i, μ_j^t)⟩
Generate the characteristic vector of each cluster: ∣χ_j⟩ = (1/√∣C_j∣) ∑_{i∈C_j} ∣i⟩
Maximization:
Use quantum linear algebra to build ∣μ_j^{t+1}⟩ = (1/∣C_j∣) ∑_{i∈C_j} ∣v_i⟩
Perform tomography on ∣μ_j^{t+1}⟩
Build quantum access to μj.
t←t+1
k labels
Multinomial distribution
[θ,μ1,…,μk,Σ1,…,Σk]
γ* = argmax_γ ∏_i ∑_{j∈[k]} θ_j p(x_i ∣ μ_j, Σ_j)
Gaussian distribution
γ
Error introduced by quantum algorithm parameters
Maximum Likelihood Estimation
t = 0
Repeat:
Compute the responsibilities r_ij (E-step)
Update the parameters θ, μ, Σ using the responsibilities r_ij (M-step)
t ← t+1
Until ∣ℓ(γ^{t−1}; V) − ℓ(γ^t; V)∣ < τ
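The classical EM iteration for a GMM can be sketched in numpy (illustrative helper names; a small ridge is added to keep covariances invertible):

```python
import numpy as np

def gaussian_pdf(X, mu, Sigma):
    """Multivariate normal density N(x | mu, Sigma), evaluated row-wise."""
    d = X.shape[1]
    diff = X - mu
    inv = np.linalg.inv(Sigma)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    return np.exp(-0.5 * np.einsum('ni,ij,nj->n', diff, inv, diff)) / norm

def em_step(X, theta, mus, Sigmas):
    """One EM iteration: responsibilities (E-step), then re-estimate
    mixing weights, means, and covariances (M-step)."""
    k = len(theta)
    # E-step: r_ij proportional to theta_j * N(x_i | mu_j, Sigma_j)
    R = np.stack([theta[j] * gaussian_pdf(X, mus[j], Sigmas[j])
                  for j in range(k)], axis=1)
    R /= R.sum(axis=1, keepdims=True)
    # M-step
    Nj = R.sum(axis=0)
    theta_new = Nj / len(X)
    mus_new = (R.T @ X) / Nj[:, None]
    Sigmas_new = []
    for j in range(k):
        diff = X - mus_new[j]
        Sigmas_new.append((R[:, j, None] * diff).T @ diff / Nj[j]
                          + 1e-6 * np.eye(X.shape[1]))
    return theta_new, mus_new, Sigmas_new

# two tiny, well-separated clusters
X = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
theta, mus = np.array([0.5, 0.5]), np.array([[0.0, 0.0], [5.0, 5.0]])
t2, m2, S2 = em_step(X, theta, mus, [np.eye(2), np.eye(2)])
```

The quantum version on the next slide replaces the E-step with the mapping U_R and the M-step with state preparation plus tomography.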
t = 0
Repeat:
Create the mapping U_R: ∣i,j⟩∣0⟩ ↦ ∣i,j⟩∣r̄_ij^t⟩
Use U_R to generate states proportional to θ^{t+1}, μ^{t+1}, Σ^{t+1}
Perform tomography and create quantum access
t ← t+1
Until ∣ℓ(γ^{t−1}; V) − ℓ(γ^t; V)∣ < τ
Theorem: Given quantum access to a matrix X∈Rn×d there is a quantum EM algorithm that fits a GMM in time:
[QEM]
Classical: O(d2kn)
O( d² k^{4.5} γ log n ), where γ = O( (η³/δ³) κ(X) κ²(Σ) μ(Σ) μ(X) )
Classical ML accuracy: 169/170
Quantum ML accuracy: 167/170
Maximum element of Σ_j⁻¹ set to 5 via κ = λ₁/τ
Speaker recognition problem on VoxForge dataset
S_f(A) = ∑_i^n f(λ_i)
S_f(A) = ∑_i^n f(σ_i)
A∈Rn×n SPD
f:R↦R
Theorem: Given quantum access to an SPD matrix A with ∥A∥ < 1 and ε ∈ (0,1), there is a quantum algorithm that estimates logdet(A) with relative error ε w.h.p. in time O(μ(A)κ(A)/ε).
S_log(A) = logdet(A) = ∑_i^n log(λ_i)
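As a classical baseline for S_log, the log-determinant of an SPD matrix can be computed via a Cholesky factorization (a sketch; `logdet_spd` is an illustrative name):

```python
import numpy as np

def logdet_spd(A):
    """log det(A) for an SPD matrix via Cholesky A = L L^T:
    det(A) = prod(L_ii)^2, so logdet(A) = 2 * sum(log L_ii)."""
    L = np.linalg.cholesky(A)
    return 2.0 * np.log(np.diag(L)).sum()

# agreement with the spectral-sum definition sum_i log(lambda_i)
A = np.array([[2.0, 0.5], [0.5, 1.0]])   # det = 1.75
via_eigs = np.log(np.linalg.eigvalsh(A)).sum()
```

The Cholesky route costs O(n³) in general; the quantum algorithm above avoids materializing the factorization.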
Application: Tyler's M-estimator.
[QSS]
Γ* ← (1/n) ∑_{i=1}^n x_i x_i^T / (x_i^T Γ*⁻¹ x_i)
Data from sub-Gaussian distributions.
Robust to outliers
Valid for data X ∈ ℝ^{n×d} with n, d → ∞
[Thesis, AQSS]
In many cases C=XTX is not a "good" sample covariance matrix
[Thesis, AQSS]
Goes, J, et al. - The Annals of Statistics 2020.
Might benefit from componentwise thresholding:
Runtime:
Õ( (d²/ε³) μ(X) κ(Σ_k) μ(Σ_k) γ )
Γ_{k+1} = ( ∑_{i=1}^n x_i x_i^T / (x_i^T Γ_k⁻¹ x_i) ) / Tr[ ∑_{i=1}^n x_i x_i^T / (x_i^T Γ_k⁻¹ x_i) ]
Stopping condition: log-likelihood with a log-determinant
Classical:
O(d²n)
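The trace-normalized Tyler fixed-point iteration above can be sketched classically as follows (illustrative; convergence checks omitted):

```python
import numpy as np

def tyler_estimator(X, iters=50):
    """Fixed-point iteration for Tyler's M-estimator of scatter,
    normalized to unit trace as in the update rule above."""
    n, d = X.shape
    Gamma = np.eye(d) / d
    for _ in range(iters):
        inv = np.linalg.inv(Gamma)
        # weights w_i = x_i^T Gamma^{-1} x_i
        w = np.einsum('ni,ij,nj->n', X, inv, X)
        # S = sum_i x_i x_i^T / w_i, then normalize by its trace
        S = (X / w[:, None]).T @ X
        Gamma = S / np.trace(S)
    return Gamma

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
Gamma = tyler_estimator(X)
```

Each classical iteration costs O(d²n), matching the runtime quoted above; the quantum subroutine targets the log-determinant in the stopping condition.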
Schatten p-norm: O( 2^{p/2} μ(A) (p + κ(A)) n / ε )
Von Neumann entropy: O( μ(A) κ(A) n / ε )
Trace of inverse: O( μ(A)² κ(A)² / ε )
Applications..
Counting number of spanning trees
Counting triangles
Estimating effective resistance
Training Gaussian processes..
...
[QSS]
We have a corpus of algorithms with provable speedups.
Simple to extend current algorithms to more powerful models.
Quantum algorithms seem to work promisingly well in ML; runtimes depend on parameters such as
κ(A), μ(A), s, η, ε, …
QML might allow solving new or existing problems:
better, faster, cheaper, or a combination.
In a glorious future, with fault-tolerant quantum computers and quantum access to data:
Applications to Artificial Intelligence might be promising to explore
Smaller QRAM?
We should work directly on state-of-the-art ML algorithms:
Interpretable, explainable, fair, robust, privacy-preserving ML.
Thanks for your time, there is never enough.
Quantum correspondence analysis:
O( (1/(εγ²) + θ/(εδ²)) k(n+m) )
Quantum latent semantic analysis:
O( (1/(εγ²) + θ/(εδ²)) k(n+m) μ(A) )
Armando Bellante - Master's thesis
Presented at the Quantum Natural Language Processing Conference 2020
Orthogonal factors:
Factor scores:
Factor score ratios:
Consider two categorical random variables X, Y, and let C be matrix of occurrences.
P̂_{X,Y} = C / ( ∑_{i=1}^{∣X∣} ∑_{j=1}^{∣Y∣} c_ij ) = (1/n) C
p̂_X = P̂_{X,Y} 1_{∣Y∣} and p̂_Y = 1_{∣X∣}^T P̂_{X,Y}
Comparing words:
AA^T = UΣ²U^T
L=U(k)Σ(k)
Comparing docs:
A^TA = VΣ²V^T
R=V(k)Σ(k)
Comparing W & D:
A = UΣV^T
L′ = U^{(k)} (Σ^{(k)})^{1/2}
R′ = V^{(k)} (Σ^{(k)})^{1/2}
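Classically, the factor computation above reduces to a truncated SVD of the term-document matrix; a sketch (illustrative `lsa_factors`):

```python
import numpy as np

def lsa_factors(A, k):
    """Rank-k latent semantic analysis factors from the SVD A = U S V^T:
    term factors L' = U_k sqrt(S_k), document factors R' = V_k sqrt(S_k)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    L = U[:, :k] * np.sqrt(s[:k])
    R = Vt[:k].T * np.sqrt(s[:k])
    return L, R

# toy term-document matrix (4 terms, 3 documents)
A = np.array([[2., 0., 1.],
              [1., 0., 1.],
              [0., 3., 0.],
              [0., 2., 0.]])
L, R = lsa_factors(A, 2)
# L @ R.T is the best rank-2 approximation of A (Eckart-Young)
err = np.linalg.norm(A - L @ R.T)
```

Rows of L (resp. R) then serve for comparing words (resp. documents) in the k-dimensional latent space.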
Evolution of Mutual Information between layers while training a QNN
The dropout technique for avoiding barren plateaus
Rebecca Erbanni - Master's thesis
Alexander Singh - IRIF internship
Δ(G) = (1/6) Tr[A³]
Create block encoding of B=A1.5
Estimate Tr[BTB]
Van Apeldoorn, Joran, et al. - 58th Annual Symposium on Foundations of Computer Science - 2017.
O( √n s²(A) κ(A) / (Δ(G) ε) )
Hamoudi, Y, F. Magniez - 46th International Colloquium on Automata, Languages, and Programming - 2019.
O( ( √n / Δ^{1/6}(G) + m^{3/4} / √Δ(G) ) · poly(1/ε) )
= O( √m s^{1.5}(A) κ(A) / (Δ(G) ε) )
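The identity Δ(G) = Tr[A³]/6 can be checked classically (dense-matrix sketch, not the quantum estimator):

```python
import numpy as np

def count_triangles(A):
    """Number of triangles in a simple undirected graph with adjacency
    matrix A: each triangle contributes 6 closed 3-walks to Tr[A^3]."""
    return int(round(np.trace(A @ A @ A) / 6.0))

# K4 (complete graph on 4 vertices) has C(4,3) = 4 triangles
K4 = np.ones((4, 4)) - np.eye(4)
```

The quantum approach instead block-encodes a power of A and estimates the trace, trading the O(n³) matrix products for the runtimes quoted above.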
Input: {(x_i, y_i)}_{i=0}^n where x_i ∈ ℝ^{d₁} and y_i ∈ ℝ^{d₂}, i.e. matrices X, Y
Classical:
Step 1: Solve the GEP Σ_XY Σ_YY⁻¹ Σ_YX w_x = λ² Σ_XX w_x
Step 2: Find wy by
w_y = (1/λ) Σ_YY⁻¹ Σ_YX w_x
CCA model: find wx,wy such that
w_x, w_y = argmax_{w_x, w_y} cos(Xw_x, Yw_y)
Quantum:
Σ_XX^{-1/2} Σ_XY Σ_YY^{-1/2} = UΣV^T
W_x = Σ_XX^{-1/2} U
W_y = Σ_YY^{-1/2} V
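The two classical steps above are equivalent to an SVD of the whitened cross-covariance; a numpy sketch (illustrative `cca`; a small ridge `reg` keeps the covariances invertible):

```python
import numpy as np

def cca(X, Y, reg=1e-8):
    """CCA via the whitened SVD:
    Sigma_XX^{-1/2} Sigma_XY Sigma_YY^{-1/2} = U S V^T,
    with W_x = Sigma_XX^{-1/2} U and W_y = Sigma_YY^{-1/2} V;
    the singular values S are the canonical correlations."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    n = len(X)
    Sxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n

    def inv_sqrt(S):
        lam, U = np.linalg.eigh(S)
        return U @ np.diag(lam ** -0.5) @ U.T

    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(M)
    return inv_sqrt(Sxx) @ U, inv_sqrt(Syy) @ Vt.T, s

# X and Y share one latent signal Z: first canonical correlation ~ 1
rng = np.random.default_rng(2)
Z = rng.normal(size=(200, 1))
X = np.hstack([Z, rng.normal(size=(200, 1))])
Y = np.hstack([Z + 0.01 * rng.normal(size=(200, 1)), rng.normal(size=(200, 1))])
Wx, Wy, corrs = cca(X, Y)
```

This whitened formulation is exactly the one the quantum algorithm targets with quantum linear algebra.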
A state-space exploration approach
Formalize our software as an automaton AP.
For a temporal property f we build the automaton A¬f.
Solve the emptiness problem for the product language: L(A_P × A_¬f) = ∅?
Software ↦ specification LTL
↦ Büchi automata ↦ ω-language
Idea: use quantum DFS!
Dürr, Christoph, et al. - SIAM Journal on Computing 35.6 (2006)
Theorem: The emptiness problem for ω-languages is decidable!