Algorithms for the network alignment problem
Emanuele Natale
Journées Julia et Optimisation
Oct 4 – 6, 2023, CNAM Paris
Centre Inria d'Université Côte d'Azur
COATI & Julia
COATI: Combinatorics, Optimization, and Algorithms for Telecommunications
Contributions to graph algorithms libraries
github.com/worlddynamics
Me & Julia:
- sponsoring it at INRIA Université Côte d'Azur since 2019
- developing Integrated Assessment Models, and using it for ML and graph algorithms
This talk: Network Alignment in Julia(?)
Theoretical overview of network alignment algorithms
Work In Progress: implement them in Julia
FAQ and GOAT: already in GraphsOptim.jl (A. Rossi)
Graph Isomorphism
Find correspondence between nodes of two graphs so that they look the same:
Given \(G_A=(V_A,E_A)\) and \(G_B=(V_B,E_B)\), we want a matching (a bijection) \(m:V_A\rightarrow V_B\) such that
\[(u,v)\in E_A \iff (m(u),m(v))\in E_B\]
Quasi-polynomial time \(2^{\operatorname{poly}\log n}\) (Babai '15&'17)
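To make the definition concrete, here is a minimal brute-force sketch in Python. It tries all \(n!\) bijections, so it is only meant for tiny graphs and has nothing to do with Babai's algorithm. The function name `find_isomorphism` and the edge-list input format are illustrative assumptions (undirected simple graphs on nodes 0..n-1).

```python
from itertools import permutations

def find_isomorphism(n, edges_a, edges_b):
    """Return m (list with m[u] = image of u) or None; brute force over n! bijections."""
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    if len(ea) != len(eb):
        return None
    for m in permutations(range(n)):
        # m is a bijection and |E_A| = |E_B|, so mapping every edge of A
        # onto an edge of B is equivalent to the "iff" condition.
        if all(frozenset((m[u], m[v])) in eb for u, v in edges_a):
            return list(m)
    return None

path_a = [(0, 1), (1, 2), (2, 3)]   # path 0-1-2-3
path_b = [(2, 0), (0, 3), (3, 1)]   # path 2-0-3-1
star = [(0, 1), (0, 2), (0, 3)]     # star centered at node 0
print(find_isomorphism(4, path_a, path_b))  # some valid matching
print(find_isomorphism(4, path_a, star))    # no matching exists
```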
Generalizing Graph Isomorphism
Induced Subgraph Isomorphism: \(|V_A|\leq |V_B|\)
(E.g. max clique, max ind. set \(\implies\) NP-complete)
Graph Edit Distance: find the cheapest sequence of edit operations \(e_1,...,e_k\) (edge deletions, insertions...) after which the graphs are isomorphic, where the cost is \(\sum_{i=1}^kc(e_i)\)
The network alignment problem
Find correspondence between nodes of two graphs so that they look as similar as possible
Example: [Frigo et al. 2021] aligns brain atlases via a generalized Weisfeiler-Lehman embedding
Another way to formalize it:
- \(A\) and \(B\) adjacency matrices of two graphs
- \(P\) permutation matrix
\[P^* = \arg\min_P\|A-PBP^T\|_F^2\]
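The objective \(\|A-PBP^T\|_F^2\) can be evaluated directly for a permutation, which gives a tiny exhaustive baseline. The sketch below is plain Python. The names `adjacency`, `alignment_cost` and `best_alignment`, and the convention that `perm[i]` is the node of \(B\) matched to node \(i\) of \(A\), are assumptions made for illustration.

```python
from itertools import permutations

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph on nodes 0..n-1."""
    M = [[0] * n for _ in range(n)]
    for u, v in edges:
        M[u][v] = M[v][u] = 1
    return M

def alignment_cost(A, B, perm):
    """||A - P B P^T||_F^2 for the permutation matrix with P[i][perm[i]] = 1."""
    n = len(A)
    return sum((A[i][j] - B[perm[i]][perm[j]]) ** 2
               for i in range(n) for j in range(n))

def best_alignment(A, B):
    """Exhaustive argmin over all n! permutations (tiny graphs only)."""
    return min(permutations(range(len(A))), key=lambda p: alignment_cost(A, B, p))

A = adjacency(4, [(0, 1), (1, 2), (2, 0), (2, 3)])        # triangle plus a pendant node
B = adjacency(4, [(2, 0), (0, 3), (3, 2), (3, 1)])        # the same graph, relabeled
B_noisy = adjacency(4, [(2, 0), (0, 3), (3, 2)])          # B with one edge removed
print(alignment_cost(A, B, best_alignment(A, B)))
print(alignment_cost(A, B_noisy, best_alignment(A, B_noisy)))
```

With symmetric 0/1 matrices every mismatched edge is counted twice, once per orientation.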
Many ways to relax: n°1
- \(\mathcal P\) permutation matrices
- \(\mathcal D\) doubly stochastic matrices: \(P\geq 0\), \(P\mathbf 1=P^T\mathbf 1=\mathbf 1\)
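\(\mathcal P\subset \mathcal D\), so relax
\(\arg\min_{P\in \mathcal P}\|A-PBP^T\|_F^2\) to \(\arg\min_{P\in \mathcal D}\|A-PBP^T\|_F^2\) and project the solution back onto \(\mathcal P\)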
Frank–Wolfe algorithm
Initialization: Let \(k \leftarrow 0\), and let \(\mathbf{x}_0\) be any point in \(\mathcal{D}\).
Step 1 - Find \(\mathbf{s}_k\) solving:
Minimize \(\mathbf{s}^T \nabla f(\mathbf{x}_k)\) subject to \(\mathbf{s} \in \mathcal{D}\)
Step 2 - Find \(\alpha\in [0,1]\) that minimizes \(f(\mathbf{x}_k+\alpha(\mathbf{s}_k -\mathbf{x}_k))\).
Step 3 - Update \(\mathbf{x}_{k+1}\leftarrow \mathbf{x}_k+\alpha(\mathbf{s}_k-\mathbf{x}_k)\), \(k \leftarrow k+1\) and go to Step 1.
Iteratively minimize the linear approximation of the problem given by the first-order Taylor approximation of \(f\) around \(\mathbf{x}_k\), constrained to stay within \(\mathcal{D}\)
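The three steps can be sketched in plain Python for one concrete choice, assumed here only for illustration: \(f(P)=\|AP-PB\|_F^2\) minimized over \(\mathcal D\). Two simplifications are mine, not from the slides: Step 1 is solved by brute force over permutation matrices (the vertices of \(\mathcal D\), where a linear function attains its minimum), and Step 2 uses the closed-form line search available because \(f\) is quadratic. All function names are hypothetical.

```python
from itertools import permutations

def mm(X, Y):
    """Matrix product."""
    return [[sum(x * y for x, y in zip(r, c)) for c in zip(*Y)] for r in X]

def tr(X):
    """Transpose."""
    return [list(c) for c in zip(*X)]

def sub(X, Y):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def dot(X, Y):
    """Frobenius inner product."""
    return sum(x * y for r, s in zip(X, Y) for x, y in zip(r, s))

def residual(A, B, X):
    return sub(mm(A, X), mm(X, B))

def objective(A, B, X):
    R = residual(A, B, X)
    return dot(R, R)

def frank_wolfe(A, B, steps=30):
    n = len(A)
    X = [[1.0 / n] * n for _ in range(n)]            # barycenter of D
    for _ in range(steps):
        R = residual(A, B, X)
        G = sub(mm(tr(A), R), mm(R, tr(B)))          # gradient of f, up to a factor 2
        # Step 1: linear minimization over D, attained at a permutation matrix.
        p = min(permutations(range(n)), key=lambda q: sum(G[i][q[i]] for i in range(n)))
        S = [[1.0 if p[i] == j else 0.0 for j in range(n)] for i in range(n)]
        # Step 2: exact line search, f(X + a*D) is a quadratic in a.
        D = sub(S, X)
        R1 = residual(A, B, D)
        denom = dot(R1, R1)
        alpha = 0.0 if denom == 0 else min(1.0, max(0.0, -dot(R, R1) / denom))
        # Step 3: move towards the vertex S.
        X = [[x + alpha * d for x, d in zip(rx, rd)] for rx, rd in zip(X, D)]
    return X

A = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
B = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 0, 0, 1], [1, 1, 1, 0]]
X = frank_wolfe(A, B)
print(objective(A, B, X))
```

Every iterate is a convex combination of doubly stochastic matrices, so it stays in \(\mathcal D\) without any explicit projection.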
Many ways to relax: n°2
\(\arg\min_{P\in \mathcal D}\|A-PBP^T\|_F^2 \) can be too far from \(\arg\min_{P\in \mathcal P}\|A-PBP^T\|_F^2\) for the projection \(\Pi_{\mathcal P}\) onto permutation matrices to succeed
Fundamental trick: \(\|M\|_F^2=\operatorname{tr}(M^TM)=\langle M,M\rangle \) where \(\langle A,B \rangle = \sum_{i,j}A_{i,j}B_{i,j}\) is the (Frobenius) dot product
\(A\) adjacency matrix, \(D_A\) degree matrix, \(L_A=D_A-A\) graph Laplacian. Then
\(\operatorname{tr}(P^TL_A^TPL_B)=\operatorname{vec}(P)^T(L_B^T\otimes L_A^T)\operatorname{vec}(P)\): the Kronecker product of Laplacians is positive semidefinite \(\implies\) this quadratic form is convex in \(P\)
The PATH algorithm [Zaslavskiy et al. 2009]
- \(F_0(P)=\|AP-PB\|_F^2\) convex
- \(F_1(P)=-\|D_AP-PD_B\|_F^2-2\operatorname{tr}(P^TL_A^TPL_B)\) concave; on permutation matrices it equals \(F_0(P)\) up to an additive constant
- \(F_\lambda(P)=(1-\lambda) F_0(P) + \lambda F_1(P)\)
PATH algorithm:
\(P_0= \arg\min F_0\) by FW
For \(i=1,...,n\)
\(P_{\frac in} = \arg \min F_{\frac in}\) by FW
starting from \(P_{\frac{i-1}n}\)
Output \(P_1\)
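A minimal Python sketch of the PATH loop above, under several assumptions that are mine: the Frank-Wolfe linear step is brute-forced over permutations (tiny graphs only), the line search fits the quadratic \(t\mapsto F_\lambda(P+t(S-P))\) through three evaluations, \(F_1\) uses the squared Frobenius norm, and the number of \(\lambda\) steps is a separate parameter `n_lambda`. All names are hypothetical and this is not the authors' implementation.

```python
from itertools import permutations

def mm(X, Y):
    return [[sum(x * y for x, y in zip(r, c)) for c in zip(*Y)] for r in X]

def tr(X):
    return [list(c) for c in zip(*X)]

def comb(a, X, b, Y):
    """Entrywise a*X + b*Y."""
    return [[a * x + b * y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def dot(X, Y):
    return sum(x * y for r, s in zip(X, Y) for x, y in zip(r, s))

def laplacian_parts(A):
    """Degree matrix D and Laplacian L = D - A."""
    n = len(A)
    D = [[float(sum(A[i])) if i == j else 0.0 for j in range(n)] for i in range(n)]
    return D, comb(1, D, -1, A)

def F0(A, B, P):
    R = comb(1, mm(A, P), -1, mm(P, B))
    return dot(R, R)

def F1(A, B, P):
    (DA, LA), (DB, LB) = laplacian_parts(A), laplacian_parts(B)
    R = comb(1, mm(DA, P), -1, mm(P, DB))
    return -dot(R, R) - 2 * dot(mm(LA, P), mm(P, LB))

def grad(A, B, P, lam):
    (DA, LA), (DB, LB) = laplacian_parts(A), laplacian_parts(B)
    R0 = comb(1, mm(A, P), -1, mm(P, B))
    g0 = comb(2, mm(tr(A), R0), -2, mm(R0, tr(B)))
    RD = comb(1, mm(DA, P), -1, mm(P, DB))
    g1 = comb(-2, mm(DA, RD), 2, mm(RD, DB))
    g1 = comb(1, g1, -2, comb(1, mm(mm(LA, P), tr(LB)), 1, mm(mm(tr(LA), P), LB)))
    return comb(1 - lam, g0, lam, g1)

def fw_minimize(A, B, P, lam, steps):
    n = len(A)
    F = lambda X: (1 - lam) * F0(A, B, X) + lam * F1(A, B, X)
    for _ in range(steps):
        G = grad(A, B, P, lam)
        p = min(permutations(range(n)), key=lambda q: sum(G[i][q[i]] for i in range(n)))
        S = [[1.0 if p[i] == j else 0.0 for j in range(n)] for i in range(n)]
        g0, gh, g1 = F(P), F(comb(0.5, P, 0.5, S)), F(S)
        a = 2 * (g0 + g1 - 2 * gh)      # F(P + t(S - P)) = g0 + b*t + a*t^2
        b = g1 - g0 - a
        ts = [0.0, 1.0]
        if a > 0 and 0 < -b / (2 * a) < 1:
            ts.append(-b / (2 * a))
        t = min(ts, key=lambda s: a * s * s + b * s)
        P = comb(1 - t, P, t, S)
    return P

def path_algorithm(A, B, n_lambda=10, fw_steps=20):
    n = len(A)
    P = fw_minimize(A, B, [[1.0 / n] * n for _ in range(n)], 0.0, fw_steps)
    for i in range(1, n_lambda + 1):    # follow the minimizer as lambda goes 0 -> 1
        P = fw_minimize(A, B, P, i / n_lambda, fw_steps)
    return P

A = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
B = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 0, 0, 1], [1, 1, 1, 0]]
P1 = path_algorithm(A, B)
```

Since \(F_1\) is concave, the last stage tends to end at a vertex of \(\mathcal D\), i.e. a permutation matrix, but the sketch does not check this.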
Many ways to relax: n°3
\(\|A-PBP^T\|_F^2 = \langle A-PBP^T, A-PBP^T\rangle \)
\(=\langle A,A \rangle + \langle B,B \rangle - 2\langle AP,PB\rangle\)
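A short check of this expansion, assuming \(P\in\mathcal P\) so that \(P^TP=I\), and using cyclicity of the trace:

```latex
\begin{align*}
\langle PBP^T, PBP^T\rangle &= \operatorname{tr}(PB^TP^TPBP^T) = \operatorname{tr}(B^TB) = \langle B,B\rangle \\
\langle A, PBP^T\rangle &= \operatorname{tr}(A^TPBP^T) = \operatorname{tr}(P^TA^TPB) = \operatorname{tr}\big((AP)^TPB\big) = \langle AP,PB\rangle
\end{align*}
```

So over \(\mathcal P\), minimizing \(\|A-PBP^T\|_F^2\) is the same as maximizing \(\langle AP,PB\rangle\).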
\(\Lambda : \Lambda_{i,j}\in [0,1]\), \(B\sim \operatorname{Bernoulli}(\Lambda)\)
\(A\) \(\rho\)-correlated to \(B\) iff \(A\sim \operatorname{Bernoulli}((1-\rho)\Lambda+\rho B)\)
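A minimal Python sketch of sampling such a pair, assuming undirected graphs without self-loops (an assumption of this sketch, the definition above does not say so). The function name `sample_correlated_pair` is hypothetical.

```python
import random

def sample_correlated_pair(Lam, rho, rng):
    """B ~ Bernoulli(Lam), then A ~ Bernoulli((1 - rho) * Lam + rho * B), entrywise for i < j."""
    n = len(Lam)
    A = [[0] * n for _ in range(n)]
    B = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            b = 1 if rng.random() < Lam[i][j] else 0
            a = 1 if rng.random() < (1 - rho) * Lam[i][j] + rho * b else 0
            A[i][j] = A[j][i] = a   # symmetric: undirected graph
            B[i][j] = B[j][i] = b
    return A, B

rng = random.Random(0)
Lam = [[0.5] * 6 for _ in range(6)]
A_full, B_full = sample_correlated_pair(Lam, 1.0, rng)   # rho = 1: A is a copy of B
A_part, B_part = sample_correlated_pair(Lam, 0.3, rng)   # rho = 0.3: partially correlated
```

At \(\rho=0\) the two graphs are independent samples from \(\operatorname{Bernoulli}(\Lambda)\), at \(\rho=1\) they coincide.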
Theorem. [Lyzinski et al. 2016].
\(A,B\) \(\rho\)-correlated with \(\Lambda_{i,j}\in (\alpha,1-\alpha)\) for some \(\alpha\in (0,\frac 12)\). \(A'=\tilde PA\tilde P^T\) for arbitrary perm. \(\tilde P\).
- If \(\rho<1\) then a.s. \(\tilde P\neq \arg\min_{P\in \mathcal D} \|A'P-PB\|_F^2\)
- If \((1-\alpha)(1-\rho)<\frac 12\) then a.s. \(\tilde P = \arg\max_{P\in \mathcal D}\langle A'P,PB\rangle\)
FAQ (Fast Approximate Quadratic) Algorithm
[Vogelstein et al. 2015]
Compute \(\Pi_{\mathcal P}\arg\max_{P\in \mathcal D}f(P) \) with \(f(P)=\langle AP,PB\rangle\), where \(\Pi_{\mathcal P}\) is the projection onto \(\mathcal P\):
\(P_0 = \frac 1n \mathbf 1\mathbf 1^T\)
Iterate Frank-Wolfe for \(i=0,...,k-1\):
\(\nabla f(P_i) = AP_iB^T+A^TP_iB\)
\(Q_{i}=\arg\max_{Q\in \mathcal D}\langle Q,\nabla f(P_i)\rangle \) via Hungarian algorithm
\(\alpha = \arg\max_{\alpha\in[0,1]} f(\alpha P_{i}+(1-\alpha)Q_i)\)
\(P_{i+1}=\alpha P_i + (1-\alpha)Q_i\)
Output \(\arg\max_{P\in\mathcal P} \langle P, P_k\rangle\) via Hungarian algorithm
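The FAQ iteration can be sketched in plain Python as follows. This is an illustration, not the GraphsOptim.jl code: the Hungarian algorithm is replaced by a brute-force search over permutations (so tiny graphs only), the line search uses the fact that \(f\) is quadratic along a segment, and the final step projects the last doubly stochastic iterate onto \(\mathcal P\). The names `best_perm` and `faq` are hypothetical.

```python
from itertools import permutations

def mm(X, Y):
    return [[sum(x * y for x, y in zip(r, c)) for c in zip(*Y)] for r in X]

def tr(X):
    return [list(c) for c in zip(*X)]

def dot(X, Y):
    return sum(x * y for r, s in zip(X, Y) for x, y in zip(r, s))

def best_perm(W):
    """argmax over permutation matrices of <P, W>; brute-force stand-in for Hungarian."""
    n = len(W)
    p = max(permutations(range(n)), key=lambda q: sum(W[i][q[i]] for i in range(n)))
    return [[1.0 if p[i] == j else 0.0 for j in range(n)] for i in range(n)]

def faq(A, B, k=10):
    n = len(A)
    P = [[1.0 / n] * n for _ in range(n)]
    for _ in range(k):
        # Gradient of f(P) = <AP, PB>.
        G = [[x + y for x, y in zip(r, s)]
             for r, s in zip(mm(mm(A, P), tr(B)), mm(mm(tr(A), P), B))]
        Q = best_perm(G)
        # f(Q + t(P - Q)) = f(Q) + b*t + a*t^2, maximized over t in [0, 1].
        D = [[p - q for p, q in zip(r, s)] for r, s in zip(P, Q)]
        AD, DB, AQ, QB = mm(A, D), mm(D, B), mm(A, Q), mm(Q, B)
        a = dot(AD, DB)
        b = dot(AD, QB) + dot(AQ, DB)
        cands = [0.0, 1.0]
        if a < 0 and 0 < -b / (2 * a) < 1:
            cands.append(-b / (2 * a))
        alpha = max(cands, key=lambda t: a * t * t + b * t)
        P = [[alpha * p + (1 - alpha) * q for p, q in zip(r, s)] for r, s in zip(P, Q)]
    # Project the final iterate onto permutation matrices.
    return [row.index(1.0) for row in best_perm(P)]

A = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]   # triangle plus pendant
B = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 0, 0, 1], [1, 1, 1, 0]]   # same graph, relabeled
perm = faq(A, B)
print(perm)   # perm[i] = node of B matched to node i of A
```

On this small example the first linear step already matches nodes by degree. In general FAQ only finds a local optimum.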
Approximate Optimal Transport
[Cuturi 2013]
\(\arg\max_P\langle P, C\rangle\)
\(\arg\max_P\big(\langle P, C\rangle + \epsilon H(P)\big)\) with \(H(P)=\sum_{i,j}P_{i,j}\log\frac 1{P_{i,j}}\)
\(\epsilon\rightarrow 0\) recovers the original problem
The entropy term \(H(P)\) pushes towards uniform \(P\)
Writing down the Lagrangian with \(P\mathbf 1 = P^T \mathbf 1 = \mathbf 1\) we get
\(P =D_r \exp.(\frac 1\epsilon C) D_c\)
with \(D_r,D_c\) making it doubly stochastic
Theorem [see Peyré, Cuturi 2020]: If \(C\geq 0\), then \(D_r,D_c\) are unique up to scalars and are computed by the Sinkhorn algorithm
The Sinkhorn Algorithm
Difficult to analyze!
[Altschuler et al. 2017]: A refined Sinkhorn computes \(\tilde P\) s.t.
\(\langle \tilde P, C \rangle \geq \max_P\langle P, C\rangle -\epsilon\)
in time \(\mathcal O(\frac{\|C\|_\infty^3 }{\epsilon^{3}}n^2\log n)\)
Implementations in OptimalTransport.jl
(Naive) Sinkhorn Algorithm.
Input: \(C\geq 0, \epsilon > 0\)
\(M = \exp.(\frac 1\epsilon C)\)
for \(i\in \{1,...,k\}\):
\(r = M\mathbf 1\) (row sums)
\( M \leftarrow \operatorname{diag}(r)^{-1} M\)
\( c = M^T \mathbf 1\) (column sums)
\(M \leftarrow M \operatorname{diag}(c)^{-1}\)
Output \(M\)
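The naive iteration reads as follows in plain Python: alternately divide each row by its sum and each column by its sum. This is a sketch for small dense matrices, not the OptimalTransport.jl implementation, and the function name and default iteration count are assumptions.

```python
import math

def sinkhorn(C, eps, iters=100):
    """Alternately normalize rows and columns of exp(C / eps), entrywise exponential."""
    n = len(C)
    M = [[math.exp(c / eps) for c in row] for row in C]
    for _ in range(iters):
        r = [sum(row) for row in M]                                   # row sums
        M = [[M[i][j] / r[i] for j in range(n)] for i in range(n)]
        c = [sum(M[i][j] for i in range(n)) for j in range(n)]        # column sums
        M = [[M[i][j] / c[j] for j in range(n)] for i in range(n)]
    return M

C = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
P = sinkhorn(C, eps=1.0)
print([round(sum(row), 6) for row in P])   # row sums of the result
```

For small \(\epsilon\) the entries of \(\exp(C/\epsilon)\) can overflow, so practical implementations usually work in the log domain.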
GOAT: Graph Matching via Optimal Transport
[Saad-Eldin et al. 2021]
Compute \(\Pi_{\mathcal P}\arg\max_{P\in \mathcal D}f(P) \) with \(f(P)=\langle AP,PB\rangle\), where \(\Pi_{\mathcal P}\) is the projection onto \(\mathcal P\):
\(P_0 = \frac 1n \mathbf 1\mathbf 1^T\)
Iterate Frank-Wolfe for \(i=0,...,k-1\):
\(\nabla f(P_i) = AP_iB^T+A^TP_iB\)
\(Q_{i}=\arg\max_{Q\in \mathcal D}\langle Q,\nabla f(P_i)\rangle +\epsilon H(Q)\) via Sinkhorn
\(\alpha = \arg\max_{\alpha\in[0,1]} f(\alpha P_{i}+(1-\alpha)Q_i)\)
\(P_{i+1}=\alpha P_i + (1-\alpha)Q_i\)
Output \(\arg\max_{P\in\mathcal P} \langle P, P_k\rangle\) via Hungarian algorithm
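A minimal Python sketch of the GOAT loop, where the only change from FAQ is that the linear step returns a Sinkhorn-scaled doubly stochastic matrix instead of a permutation. Assumptions of this sketch: the final projection is brute-forced over permutations instead of using the Hungarian algorithm, the gradient is shifted by its maximum before exponentiating (this does not change the Sinkhorn output), and the default \(\epsilon\) and iteration counts are arbitrary. The names `sinkhorn_lmo` and `goat` are hypothetical.

```python
import math
from itertools import permutations

def mm(X, Y):
    return [[sum(x * y for x, y in zip(r, c)) for c in zip(*Y)] for r in X]

def tr(X):
    return [list(c) for c in zip(*X)]

def dot(X, Y):
    return sum(x * y for r, s in zip(X, Y) for x, y in zip(r, s))

def f(A, B, P):
    return dot(mm(A, P), mm(P, B))

def sinkhorn_lmo(G, eps, iters=200):
    """Approximate argmax over D of <Q, G> + eps * H(Q) by scaling exp(G / eps)."""
    n = len(G)
    top = max(max(row) for row in G)          # shift for numerical stability
    M = [[math.exp((g - top) / eps) for g in row] for row in G]
    for _ in range(iters):
        r = [sum(row) for row in M]
        M = [[M[i][j] / r[i] for j in range(n)] for i in range(n)]
        c = [sum(M[i][j] for i in range(n)) for j in range(n)]
        M = [[M[i][j] / c[j] for j in range(n)] for i in range(n)]
    return M

def goat(A, B, eps=2.0, k=10):
    n = len(A)
    P = [[1.0 / n] * n for _ in range(n)]
    for _ in range(k):
        G = [[x + y for x, y in zip(r, s)]
             for r, s in zip(mm(mm(A, P), tr(B)), mm(mm(tr(A), P), B))]
        Q = sinkhorn_lmo(G, eps)
        # f(Q + t(P - Q)) = f(Q) + b*t + a*t^2, maximized over t in [0, 1].
        D = [[p - q for p, q in zip(r, s)] for r, s in zip(P, Q)]
        AD, DB, AQ, QB = mm(A, D), mm(D, B), mm(A, Q), mm(Q, B)
        a = dot(AD, DB)
        b = dot(AD, QB) + dot(AQ, DB)
        cands = [0.0, 1.0]
        if a < 0 and 0 < -b / (2 * a) < 1:
            cands.append(-b / (2 * a))
        alpha = max(cands, key=lambda t: a * t * t + b * t)
        P = [[alpha * p + (1 - alpha) * q for p, q in zip(r, s)] for r, s in zip(P, Q)]
    # Final projection onto permutations (brute force instead of Hungarian).
    perm = max(permutations(range(n)), key=lambda q: sum(P[i][q[i]] for i in range(n)))
    return list(perm), P

A = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
B = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 0, 0, 1], [1, 1, 1, 0]]
perm, P_final = goat(A, B)
print(perm)
```

Because the line search always includes \(\alpha=1\) (stay at \(P_i\)), the objective never decreases along the iterations.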
- Replacing the last Hungarian with Sinkhorn?
- Parallel Network Alignment algorithms?
- Wanna help? Reach out!
Thank You