## Algorithms for the network alignment problem

### Emanuele Natale

Journées Julia et Optimisation

Oct 4 – 6, 2023, CNAM Paris

Centre Inria d'Université Côte d'Azur

## COATI & Julia

COATI: Combinatorics, Optimization, and Algorithms for Telecommunications

Contributions to graph algorithms libraries

github.com/worlddynamics

Me & Julia:

• sponsoring it at INRIA Université Côte d'Azur since 2019
• developing Integrated Assessment Models, and using it for ML and graph algorithms

### This talk: Network Alignment in Julia(?)

Theoretical overview of network alignment algorithms

Work In Progress: implement them in Julia

FAQ and GOAT
(A. Rossi)

## Graph Isomorphism

Find correspondence between nodes of two graphs so that they look the same:

Given $$G_A=(V_A,E_A)$$ and $$G_B=(V_B,E_B)$$, we want a matching $$m:V_A\rightarrow V_B$$ such that

$(u,v)\in E_A \iff (m(u),m(v))\in E_B$

Quasi-polynomial time $$2^{\operatorname{poly}\log n}$$ (Babai '15&'17)
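
To make the definition concrete, here is a minimal brute-force sketch for tiny undirected simple graphs. It tries all $$n!$$ bijections, so it illustrates the definition only and has nothing to do with Babai's algorithm. The function name `find_isomorphism` and the edge-list encoding are assumptions made for this example.

```python
from itertools import permutations

def find_isomorphism(n, edges_a, edges_b):
    """Return m (as a dict) with (u,v) in E_A <=> (m(u),m(v)) in E_B, else None.

    Nodes are 0..n-1, graphs are undirected and simple (illustrative assumption).
    """
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    if len(ea) != len(eb):
        return None  # a bijection on nodes preserves the number of edges
    for perm in permutations(range(n)):
        # Image of E_A under the candidate matching m(u) = perm[u].
        mapped = {frozenset(perm[u] for u in e) for e in ea}
        if mapped == eb:
            return dict(enumerate(perm))
    return None

# Path 0-1-2 (center 1) versus path 1-0-2 (center 0).
m = find_isomorphism(3, [(0, 1), (1, 2)], [(1, 0), (0, 2)])
```

Any valid matching has to send the center of the first path to the center of the second.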

### Generalizing Graph Isomorphism

Induced Subgraph Isomorphism: $$|V_A|\leq |V_B|$$
(E.g. max clique, max independent set $$\implies$$ NP-complete)

Graph Edit Distance: find the cheapest sequence of edit operations $$e_1,...,e_k$$ (edge deletions, insertions...) after which the graphs are isomorphic, where the cost is $$\sum_{i=1}^kc(e_i)$$

## The network alignment problem

Find correspondence between nodes of two graphs so that they look as similar as possible

Another way to formalize it:

• $$A$$ and $$B$$ adjacency matrices of two graphs
• $$P$$ permutation matrix

$P^* = \arg\min_P\|A-PBP^T\|_F^2$

Example: [Frigo et al. 2021] aligns brain atlases via a generalized Weisfeiler-Lehman embedding
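
As a reference point for the objective $$\|A-PBP^T\|_F^2$$, the sketch below evaluates it for every permutation of a tiny instance. It assumes that $$P_{i,\pi(i)}=1$$ matches node $$i$$ of $$A$$ to node $$\pi(i)$$ of $$B$$, so that $$(PBP^T)_{i,j}=B_{\pi(i),\pi(j)}$$. The names `alignment_cost` and `best_alignment` are made up for this example.

```python
from itertools import permutations

def alignment_cost(A, B, perm):
    """||A - P B P^T||_F^2 where P matches node i of A to node perm[i] of B."""
    n = len(A)
    return sum((A[i][j] - B[perm[i]][perm[j]]) ** 2
               for i in range(n) for j in range(n))

def best_alignment(A, B):
    """Exhaustive search over all n! permutations (tiny graphs only)."""
    return min(permutations(range(len(A))),
               key=lambda p: alignment_cost(A, B, p))

# A: path 0-1-2-3.  B: the same path plus the chord 0-2.
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
B = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
perm = best_alignment(A, B)
cost = alignment_cost(A, B, perm)
```

The graphs differ by one edge, so the best achievable cost counts that edge twice (once per symmetric entry).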

## Many ways to relax: n°1

• $$\mathcal P$$ permutation matrices
• $$\mathcal D$$ doubly stoch. matrices: $$P\mathbf 1=P^T\mathbf 1=\mathbf 1$$

$$\mathcal P\subset \mathcal D$$

$$\arg\min_{P\in \mathcal P}\|A-PBP^T\|_F^2$$

$$\prod_{\mathcal P}\arg\min_{P\in \mathcal D}\|A-PBP^T\|_F^2$$

## Frank–Wolfe algorithm

Initialization: Let $$k \leftarrow 0$$, and let $$\mathbf{x}_0$$ be any point in $$\mathcal{D}$$.
Step 1 - Find $$\mathbf{s}_k$$ solving:
Minimize $$\mathbf{s}^T \nabla f(\mathbf{x}_k)$$ subject to $$\mathbf{s} \in \mathcal{D}$$

Step 2 - Find $$\alpha\in [0,1]$$ that minimizes $$f(\mathbf{x}_k+\alpha(\mathbf{s}_k -\mathbf{x}_k))$$.

Step 3 - Update $$\mathbf{x}_{k+1}\leftarrow \mathbf{x}_k+\alpha(\mathbf{s}_k-\mathbf{x}_k)$$, $$k \leftarrow k+1$$ and go to Step 1.

Iteratively minimize the linear approximation of the problem given by the first-order Taylor approximation of $$f$$ around $$\mathbf{x}_k$$ constrained to stay within $$\mathcal{D}$$
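
The three steps can be sketched for $$f(P)=\|AP-PB\|_F^2$$ over the doubly stochastic matrices. This is a minimal illustration under stated assumptions: the linear subproblem of Step 1 is solved by brute force over permutation matrices (the vertices of $$\mathcal D$$) rather than by a proper assignment solver, and the line search of Step 2 is exact because $$f$$ is quadratic along a segment. All function names are hypothetical.

```python
from itertools import permutations

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inner(X, Y):
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def transpose(X):
    return [list(row) for row in zip(*X)]

def residual(A, B, P):
    return sub(matmul(A, P), matmul(P, B))  # AP - PB, linear in P

def objective(A, B, P):
    R = residual(A, B, P)
    return inner(R, R)

def frank_wolfe(A, B, steps=50):
    n = len(A)
    P = [[1.0 / n] * n for _ in range(n)]  # barycenter of D
    for _ in range(steps):
        R = residual(A, B, P)
        # Half of the gradient: A^T R - R B^T (the factor 2 does not change Step 1).
        G = sub(matmul(transpose(A), R), matmul(R, transpose(B)))
        # Step 1: a linear function on D is minimized at a permutation matrix.
        s = min(permutations(range(n)),
                key=lambda p: sum(G[i][p[i]] for i in range(n)))
        S = [[1.0 if s[i] == j else 0.0 for j in range(n)] for i in range(n)]
        # Step 2: f(P + a*D) = |R|^2 + 2a<R,RD> + a^2|RD|^2, minimized on [0,1].
        D = sub(S, P)
        RD = residual(A, B, D)
        denom = inner(RD, RD)
        alpha = 0.0 if denom == 0 else min(1.0, max(0.0, -inner(R, RD) / denom))
        # Step 3: move towards the vertex.
        P = [[p + alpha * d for p, d in zip(rp, rd)] for rp, rd in zip(P, D)]
    return P

A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
B = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
P = frank_wolfe(A, B)
```

Every iterate is a convex combination of doubly stochastic matrices, so the output stays in $$\mathcal D$$ without any explicit projection.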

## Many ways to relax: n°2

$$\arg\min_{P\in \mathcal D}\|A-PBP^T\|_F^2$$ is too far from $$\arg\min_{P\in \mathcal P}\|A-PBP^T\|_F^2$$ for $$\prod_{\mathcal P}$$ to succeed

Fundamental trick: $$\|M\|_F^2=\operatorname{tr}(M^TM)=\langle M,M\rangle$$ where $$\langle A,B \rangle = \sum_{i,j}A_{i,j}B_{i,j}$$ is the (Frobenius) dot product

$$A$$ adjacency matrix, $$D_A$$ degree matrix, $$L_A=D_A-A$$ graph Laplacian, then (for a permutation matrix $$P$$)

$$\|A-PBP^T\|_F^2=\|AP-PB\|_F^2$$

$$=\|(D_A-L_A)P-P(D_B-L_B)\|_F^2= ...$$

$$=-\|D_AP-PD_B\|_F^2+\operatorname{tr}(L_A^2)+\operatorname{tr}(L_B^2)-2\operatorname{tr}(P^TL_A^TPL_B)$$

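$$\operatorname{tr}(P^TL_A^TPL_B)=\operatorname{vec}(P)^T(L_B\otimes L_A)\operatorname{vec}(P)$$: quadratic form of the Kronecker product of the Laplacians $$\implies$$ positive semidefinite $$\implies$$ convex in $$P$$ (so $$-2\operatorname{tr}(P^TL_A^TPL_B)$$ is concave)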
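
The steps hidden behind the "..." can be filled in as follows. This is a sketch that assumes simple undirected graphs (so $$L_A, L_B$$ are symmetric with the degrees on the diagonal) and a permutation matrix $$P$$ with $$P_{i,\pi(i)}=1$$; the symbols $$d_A(i)$$, $$d_B(j)$$ for node degrees and $$\pi$$ are introduced here for illustration.

```latex
\begin{aligned}
\|AP-PB\|_F^2
  &= \|(D_AP-PD_B)-(L_AP-PL_B)\|_F^2 \\
  &= \|D_AP-PD_B\|_F^2
     - 2\langle D_AP-PD_B,\; L_AP-PL_B\rangle
     + \|L_AP-PL_B\|_F^2 . \\[4pt]
(D_AP-PD_B)_{i,j} &= P_{i,j}\,\bigl(d_A(i)-d_B(j)\bigr), \\
(L_AP-PL_B)_{i,\pi(i)} &= (L_A)_{i,i}-(L_B)_{\pi(i),\pi(i)} = d_A(i)-d_B(\pi(i)), \\
\langle D_AP-PD_B,\; L_AP-PL_B\rangle
  &= \sum_i \bigl(d_A(i)-d_B(\pi(i))\bigr)^2 = \|D_AP-PD_B\|_F^2 , \\[4pt]
\|L_AP-PL_B\|_F^2
  &= \|L_AP\|_F^2+\|PL_B\|_F^2-2\langle L_AP,\,PL_B\rangle \\
  &= \operatorname{tr}(L_A^2)+\operatorname{tr}(L_B^2)
     -2\operatorname{tr}(P^TL_A^TPL_B) .
\end{aligned}
```

Substituting the cross term into the first expansion turns $$+\|D_AP-PD_B\|_F^2-2\|D_AP-PD_B\|_F^2$$ into the single negative degree term.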

### The PATH algorithm [Zaslavskiy et al. 2009]

• $$F_0(P)=\|AP-PB\|_F^2$$ convex
• $$F_1(P)=-\|D_AP-PD_B\|_F^2-2\operatorname{tr}(P^TL_A^TPL_B)$$ concave
• $$F_\lambda(P)=(1-\lambda) F_0(P) + \lambda F_1(P)$$
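
A quick numerical sanity check of why interpolating between $$F_0$$ and $$F_1$$ makes sense: on permutation matrices the two functions differ only by the constant $$\operatorname{tr}(L_A^2)+\operatorname{tr}(L_B^2)$$, so they share the same minimizers there. The sketch reads both norms as squared Frobenius norms and assumes simple undirected graphs; all helper names are hypothetical.

```python
from itertools import permutations

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inner(X, Y):
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def fro2(X):
    return inner(X, X)  # squared Frobenius norm

def degree_matrix(A):
    n = len(A)
    return [[sum(A[i]) if i == j else 0 for j in range(n)] for i in range(n)]

def perm_matrix(p):
    n = len(p)
    return [[1 if p[i] == j else 0 for j in range(n)] for i in range(n)]

def F0(A, B, P):
    return fro2(sub(matmul(A, P), matmul(P, B)))

def F1(A, B, P):
    DA, DB = degree_matrix(A), degree_matrix(B)
    LA, LB = sub(DA, A), sub(DB, B)
    degree_term = fro2(sub(matmul(DA, P), matmul(P, DB)))
    trace_term = inner(matmul(LA, P), matmul(P, LB))  # tr(P^T L_A^T P L_B)
    return -degree_term - 2 * trace_term

A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
B = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
LA = sub(degree_matrix(A), A)
LB = sub(degree_matrix(B), B)
constant = fro2(LA) + fro2(LB)  # tr(L^2) = ||L||_F^2 for symmetric L
gaps = {F0(A, B, perm_matrix(p)) - F1(A, B, perm_matrix(p))
        for p in permutations(range(4))}
```

With integer adjacency matrices the comparison is exact, and `gaps` should contain the single value `constant`.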

PATH algorithm:

$$P_0= \arg\min F_0$$ by FW

For $$i=1,...,n$$: $$P_{\frac in} = \arg \min F_{\frac in}$$ by FW, starting from $$P_{\frac{i-1}n}$$

Output $$P_1$$

## Many ways to relax: n°3

$$\|A-PBP^T\|_F^2 = \langle A-PBP^T, A-PBP^T\rangle$$

$$=\langle A,A \rangle + \langle B,B \rangle - 2\langle AP,PB\rangle$$

$$\Lambda : \Lambda_{i,j}\in [0,1]$$, $$B\sim \operatorname{Bernoulli}(\Lambda)$$
$$A$$ $$\rho$$-correlated to $$B$$ iff $$A\sim \operatorname{Bernoulli}((1-\rho)\Lambda+\rho B)$$

Theorem. [Lyzinski et al. 2016].
Let $$A,B$$ be $$\rho$$-correlated with $$\Lambda_{i,j}\in (\alpha,1-\alpha)$$ for some $$\alpha\in (0,\frac 12)$$, and let $$A'=\tilde PA\tilde P^T$$ for an arbitrary permutation matrix $$\tilde P$$.

• If $$\rho<1$$ then a.s. $$\tilde P\neq \arg\min_{P\in \mathcal D} \|A'P-PB\|_F^2$$
• If $$(1-\alpha)(1-\rho)<\frac 12$$ then a.s. $$\tilde P = \arg\max_{P\in \mathcal D}\langle A'P,PB\rangle$$
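
The $$\rho$$-correlated Bernoulli model can be sampled as in the sketch below. It assumes undirected graphs without self-loops and a symmetric $$\Lambda$$, which the slide does not state explicitly; the function names are hypothetical.

```python
import random

def sample_bernoulli_graph(prob, rng):
    """Symmetric 0/1 adjacency matrix: edge {i,j} is present w.p. prob[i][j]."""
    n = len(prob)
    M = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            M[i][j] = M[j][i] = int(rng.random() < prob[i][j])
    return M

def sample_correlated_pair(Lam, rho, rng):
    """B ~ Bernoulli(Lam), then A ~ Bernoulli((1 - rho) * Lam + rho * B)."""
    n = len(Lam)
    B = sample_bernoulli_graph(Lam, rng)
    mix = [[(1 - rho) * Lam[i][j] + rho * B[i][j] for j in range(n)]
           for i in range(n)]
    A = sample_bernoulli_graph(mix, rng)
    return A, B

rng = random.Random(0)
Lam = [[0.5] * 4 for _ in range(4)]
A, B = sample_correlated_pair(Lam, 1.0, rng)  # rho = 1: A copies B
```

At $$\rho=1$$ the two graphs coincide, while at $$\rho=0$$ they are independent draws from $$\operatorname{Bernoulli}(\Lambda)$$.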

### FAQ (Fast Approximate Quadratic) Algorithm

[Vogelstein et al. 2015]

Compute $$\prod_{\mathcal P}\arg\max_{P\in \mathcal D}f(P)$$ with $$f(P)=\langle AP,PB\rangle$$:

$$P_0 = \frac 1n \mathbf 1\mathbf 1^T$$

Iterate Frank-Wolfe $$i=0,...,k-1$$:

$$\nabla f(P_i) = AP_iB^T+A^TP_iB$$
$$Q_{i}=\arg\max_{Q\in \mathcal D}\langle Q,\nabla f(P_i)\rangle$$ via Hungarian algorithm
$$\alpha = \arg\max_{\alpha\in[0,1]} f(\alpha P_{i}+(1-\alpha)Q_i)$$
$$P_{i+1}=\alpha P_i + (1-\alpha)Q_i$$
Output $$\arg\max_{P\in\mathcal P} \langle P, P_k\rangle$$ via Hungarian algorithm
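
A minimal sketch of this loop for tiny graphs follows. Two assumptions depart from the slide: both assignment problems are solved by brute force over permutations instead of the Hungarian algorithm, and the line search uses the step $$t=1-\alpha$$ along the segment from $$P_i$$ to $$Q_i$$, where $$f$$ is an explicit quadratic in $$t$$. The names `faq` and `best_permutation` are made up here.

```python
from itertools import permutations

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(row) for row in zip(*X)]

def inner(X, Y):
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def best_permutation(W):
    """argmax of <Q, W> over permutation matrices Q (brute-force stand-in)."""
    n = len(W)
    return max(permutations(range(n)),
               key=lambda p: sum(W[i][p[i]] for i in range(n)))

def faq(A, B, steps=10):
    n = len(A)
    At, Bt = transpose(A), transpose(B)
    P = [[1.0 / n] * n for _ in range(n)]
    for _ in range(steps):
        # Gradient of f(P) = <AP, PB>: A P B^T + A^T P B.
        G1, G2 = matmul(matmul(A, P), Bt), matmul(matmul(At, P), B)
        G = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(G1, G2)]
        q = best_permutation(G)
        D = [[(1.0 if q[i] == j else 0.0) - P[i][j] for j in range(n)]
             for i in range(n)]
        # f(P + t D) = f(P) + b t + a t^2 with b = <G, D>, a = <AD, DB>.
        a = inner(matmul(A, D), matmul(D, B))
        b = inner(G, D)
        if a < 0:
            t = min(1.0, max(0.0, -b / (2 * a)))  # concave: clipped vertex
        else:
            t = 1.0 if a + b > 0 else 0.0         # convex: best endpoint
        P = [[p + t * d for p, d in zip(rp, rd)] for rp, rd in zip(P, D)]
    return best_permutation(P)  # project the final iterate onto permutations

# Path with center 1 versus the same path with center 0.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
B = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
perm = faq(A, B)
```

The returned tuple matches node `i` of `A` to node `perm[i]` of `B`; on this instance the centers of the two paths should be matched to each other.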

### Approximate Optimal Transport

[Cuturi 2013]

$$\arg\max_{P\in\mathcal D}\langle P, C\rangle$$

$$\arg\max_{P\in\mathcal D}\big(\langle P, C\rangle + \epsilon H(P)\big)$$ with $$H(P)=\sum_{i,j}P_{i,j}\log\frac 1{P_{i,j}}$$

$$\epsilon\rightarrow 0$$ recovers the original problem; the entropy term pushes towards uniform $$P$$

Writing down the Lagrangian with $$P\mathbf 1 = P^T \mathbf 1 = \mathbf 1$$ we get

$$P =D_r \exp.(\frac 1\epsilon C) D_c$$
with $$D_r,D_c$$ making it doubly stochastic

Theorem [see Peyré, Cuturi 2020]: If $$C\geq 0$$, then $$D_r,D_c$$ are unique up to scalars and are given by the Sinkhorn algorithm

### The Sinkhorn Algorithm

Difficult to analyze!

[Altschuler et al. 2017]: A refined Sinkhorn computes $$\tilde P$$ s.t.

$$\langle \tilde P, C \rangle \geq \max_P\langle P, C\rangle -\epsilon$$

in time $$\mathcal O(\frac{\|C\|_\infty^3 }{\epsilon^{3}}n^2\log n)$$

Implementations in OptimalTransport.jl

(Naive) Sinkhorn Algorithm.

Input: $$C\geq 0, \epsilon > 0$$

$$M = \exp.(\frac 1\epsilon C)$$

for $$i\in \{1,...,k\}$$:

$$r = M\mathbf 1$$

$$M \leftarrow \operatorname{diag}(r)^{-1} M$$
$$c = M^T \mathbf 1$$

$$M \leftarrow M \operatorname{diag}(c)^{-1}$$

Output $$M$$
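
The naive iteration fits in a few lines: start from the entrywise exponential of $$C/\epsilon$$, then alternately rescale rows and columns so that each sums to one. This is a plain-Python sketch, not the OptimalTransport.jl implementation, and the name `sinkhorn` and its `iters` parameter are assumptions of this example.

```python
import math

def sinkhorn(C, eps, iters=200):
    """Alternate row and column normalization of M = exp.(C / eps).

    For very small eps the exponentials can overflow; this naive version
    does not guard against that.
    """
    n = len(C)
    M = [[math.exp(c / eps) for c in row] for row in C]
    for _ in range(iters):
        for row in M:                      # M <- diag(r)^-1 M with r = M 1
            r = sum(row)
            for j in range(n):
                row[j] /= r
        col = [sum(M[i][j] for i in range(n)) for j in range(n)]
        for row in M:                      # M <- M diag(c)^-1 with c = M^T 1
            for j in range(n):
                row[j] /= col[j]
    return M

C = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 3.0]]
M = sinkhorn(C, eps=1.0)
```

After enough iterations both the row sums and the column sums of `M` are close to one, so `M` is approximately doubly stochastic.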

### GOAT: Graph Matching via Optimal Transport

Compute $$\prod_{\mathcal P}\arg\max_{P\in \mathcal D}f(P)$$ with $$f(P)=\langle AP,PB\rangle$$:

$$P_0 = \frac 1n \mathbf 1\mathbf 1^T$$

Iterate Frank-Wolfe $$i=0,...,k-1$$:

$$\nabla f(P_i) = AP_iB^T+A^TP_iB$$
$$Q_{i}=\arg\max_{Q\in \mathcal D}\langle Q,\nabla f(P_i)\rangle +\epsilon H(Q)$$ via Sinkhorn
$$\alpha = \arg\max_{\alpha\in[0,1]} f(\alpha P_{i}+(1-\alpha)Q_i)$$
$$P_{i+1}=\alpha P_i + (1-\alpha)Q_i$$
Output $$\arg\max_{P\in\mathcal P} \langle P, P_k\rangle$$ via Hungarian algorithm
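
The only change with respect to FAQ is the direction-finding step, which the sketch below makes explicit. It is a toy illustration under assumptions, not the reference GOAT implementation: Sinkhorn is the naive row/column normalization (with the gradient shifted by its maximum before exponentiating, which only rescales the kernel), the final projection is brute force instead of Hungarian, and `goat`, `eps`, `steps`, `iters` are hypothetical names.

```python
import math
from itertools import permutations

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(row) for row in zip(*X)]

def inner(X, Y):
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def sinkhorn(C, eps, iters=100):
    """Doubly stochastic scaling of exp.(C / eps) by alternating normalization."""
    n = len(C)
    top = max(max(row) for row in C)  # constant shift, absorbed by the scalings
    M = [[math.exp((c - top) / eps) for c in row] for row in C]
    for _ in range(iters):
        M = [[x / sum(row) for x in row] for row in M]
        col = [sum(M[i][j] for i in range(n)) for j in range(n)]
        M = [[M[i][j] / col[j] for j in range(n)] for i in range(n)]
    return M

def goat(A, B, eps=0.5, steps=10):
    n = len(A)
    At, Bt = transpose(A), transpose(B)
    P = [[1.0 / n] * n for _ in range(n)]
    for _ in range(steps):
        G1, G2 = matmul(matmul(A, P), Bt), matmul(matmul(At, P), B)
        G = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(G1, G2)]
        Q = sinkhorn(G, eps)          # entropic direction instead of Hungarian
        D = [[q - p for q, p in zip(rq, rp)] for rq, rp in zip(Q, P)]
        # Exact line search: f(P + t D) = f(P) + b t + a t^2, t = 1 - alpha.
        a = inner(matmul(A, D), matmul(D, B))
        b = inner(G, D)
        if a < 0:
            t = min(1.0, max(0.0, -b / (2 * a)))
        else:
            t = 1.0 if a + b > 0 else 0.0
        P = [[p + t * d for p, d in zip(rp, rd)] for rp, rd in zip(P, D)]
    # Final projection onto permutations (brute-force stand-in for Hungarian).
    return max(permutations(range(n)),
               key=lambda p: sum(P[i][p[i]] for i in range(n)))

A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
B = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
perm = goat(A, B)
```

Because the Sinkhorn output is doubly stochastic rather than a permutation, the final projection step is what turns the result into an actual matching.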

• Replacing the last Hungarian with Sinkhorn?
• Parallel Network Alignment algorithms?
• Wanna help? Reach out!

# Thank You

