Graph Neural Networks

Brian Liu @CRAI 23/09/2021

https://slides.com/brianliu/deck-955282

What is a GNN

GNNs are a class of neural networks that operate directly on graph-structured data by passing messages between neighbouring nodes.
A graph has arbitrary size and complex topological structure, no spatial locality as in images or grids, and no fixed node ordering (so a model must be invariant to node permutations); it is often dynamic.
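The point about node ordering can be checked numerically. The sketch below is only an illustration under assumed toy data (a hypothetical 4-node path graph, an arbitrary permutation, and a simple neighbour-sum standing in for a GNN layer): relabelling the nodes changes the matrices, the per-node output is reordered in the same way, and a sum over nodes does not change.

```python
import numpy as np

# Hypothetical toy graph: path 0-1-2-3 with 2-dimensional node features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.arange(8, dtype=float).reshape(4, 2)

# Relabel the nodes with an arbitrary permutation.
perm = np.array([2, 0, 3, 1])
P = np.eye(4)[perm]      # permutation matrix: new node i is old node perm[i]
A_perm = P @ A @ P.T     # same graph, different node order
X_perm = P @ X


def neighbour_sum(A, X):
    """Sum the feature vectors of each node's neighbours."""
    return A @ X


out = neighbour_sum(A, X)
out_perm = neighbour_sum(A_perm, X_perm)

# Node-level output is reordered together with the nodes (equivariance).
equivariant = np.allclose(out_perm, P @ out)
# A graph-level readout (sum over nodes) ignores the ordering (invariance).
invariant = np.allclose(out_perm.sum(axis=0), out.sum(axis=0))
```

Both flags come out true because the permutation matrix cancels inside the product, which is the property a graph model needs and an image-style model with a fixed grid order does not have.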

Graph Definition

Mathematically, a graph is a tuple of a set of nodes (vertices) and a set of edges (links).
Each edge is a pair of vertices and represents a connection between them.

\mathcal{G}=(\mathcal{V},\mathcal{E})

\mathcal{E} = \{\varepsilon_1, \dots, \varepsilon_n\}

\varepsilon_i = \{\varepsilon_{ij} = (i, j, \alpha_{i j})\}_{j \in \mathcal{N}(i)}

\mathcal{V} = \{v_1, \dots, v_n\}

v_i = \left(i, x_{i} \in \mathbb{R}^{d}\right)

\mathcal{G}=(A, X)

Summarized in matrix form, with node feature matrix

X = [x_1, \dots, x_i, \dots, x_n] \in \mathbb{R}^{n \times d}

Adjacency matrix

A = \left[\begin{array}{ccc} \alpha_{11} & \cdots & \alpha_{1n} \\ \vdots & \ddots & \vdots \\ \alpha_{n1} & \cdots & \alpha_{nn} \end{array}\right]
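As a rough illustration of how the set-based definition maps to the matrix form (A, X), the sketch below builds both matrices from node tuples (i, x_i) and edge tuples (i, j, alpha_ij). The helper name `to_matrices`, the 0-based node indices, and the toy values are assumptions made for this example only.

```python
import numpy as np


def to_matrices(nodes, edges):
    """Convert node tuples (i, x_i) and edge tuples (i, j, alpha_ij) to (A, X).

    Assumes node indices are 0-based and contiguous (hypothetical convention).
    """
    n = len(nodes)
    d = len(nodes[0][1])
    X = np.zeros((n, d))
    for i, x_i in nodes:
        X[i] = x_i              # row i of X holds the feature vector x_i
    A = np.zeros((n, n))
    for i, j, alpha_ij in edges:
        A[i, j] = alpha_ij      # entry (i, j) holds the edge weight alpha_ij
    return A, X


# Hypothetical toy data: 3 nodes with 2-dimensional features.
nodes = [(0, [1.0, 0.0]), (1, [0.0, 1.0]), (2, [1.0, 1.0])]
edges = [(0, 1, 0.5), (1, 0, 0.5), (1, 2, 2.0)]
A, X = to_matrices(nodes, edges)
```

Entries of A with no corresponding edge stay at zero, so in this toy case the edge from node 1 to node 2 is recorded but the reverse direction is not.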

Graph Definition

A simple example graph containing 4 nodes

\mathcal{V}=\{1,2,3,4\}

\mathcal{E}=\{(1,2), (2,3), (2,4), (3,4)\}

An undirected graph without self-loops, with its adjacency matrix (symmetric)

A directed graph with a self-loop, with its adjacency matrix (asymmetric)
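The two adjacency matrices can be reproduced from the edge list with a short sketch. The helper `adjacency` is a hypothetical name, and because the figure is not available here, the self-loop is assumed to sit on node 1 and the directed edges are assumed to point from the first to the second vertex of each pair.

```python
import numpy as np

V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (2, 4), (3, 4)]


def adjacency(V, E, directed=False):
    """Build a 0/1 adjacency matrix; row/column k corresponds to V[k]."""
    index = {v: k for k, v in enumerate(V)}
    A = np.zeros((len(V), len(V)), dtype=int)
    for i, j in E:
        A[index[i], index[j]] = 1
        if not directed:
            A[index[j], index[i]] = 1   # undirected: mirror every edge
    return A


A_undirected = adjacency(V, E)
# [[0 1 0 0]
#  [1 0 1 1]
#  [0 1 0 1]
#  [0 1 1 0]]

# Assumption: the self-loop is placed on node 1 for illustration.
A_directed = adjacency(V, E + [(1, 1)], directed=True)

is_symmetric_undirected = bool((A_undirected == A_undirected.T).all())
is_symmetric_directed = bool((A_directed == A_directed.T).all())
```

A self-loop shows up as a non-zero entry on the diagonal, and dropping the mirrored entries is what makes the directed matrix asymmetric.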

GNN: message-passing

GNNs rely on message passing: each vertex exchanges information with its neighbours by sending and receiving "messages".

Message passing rules describe how node embeddings are learned. A generalized abstract GNN model can be defined as:

{X}^{(k)} = \operatorname{GNN}\left({X}^{(k-1)}, A\right) = \mathcal{U}\left({X}^{(k-1)}, \mathcal{M}\left({X}^{(k-1)}, A\right)\right)

h^{(k)}_i = \overbrace{\mathcal{M}^{(k)}_{j \in \mathcal{N}(i)} \left(\mathbf{x}_i^{(k-1)}, \mathbf{x}_j^{(k-1)},{\alpha}_{ij}\right)}^{\textbf{neighborhood aggregation function}}

\mathbf{x}_i^{(k)} = \overbrace{\mathcal{U}^{(k)} \left( \mathbf{x}_i^{(k-1)}, h^{(k)}_i \right)}^{\textbf{embedding update function}}
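One possible instance of the aggregation and update functions is sketched below: a weighted sum over neighbours for M and a linear map followed by ReLU for U. These choices, the function name `message_passing_layer`, and the identity weights are assumptions for illustration; in a trained model the weight matrices would be learned. The graph is the undirected 4-node example from the earlier slide.

```python
import numpy as np


def message_passing_layer(X, A, W_self, W_neigh):
    """One message-passing round (illustrative choice of M and U)."""
    # Aggregation M: h_i = sum over neighbours j of alpha_ij * x_j.
    H = A @ X
    # Update U: x_i <- ReLU(x_i W_self + h_i W_neigh).
    return np.maximum(0.0, X @ W_self + H @ W_neigh)


# Undirected example graph with nodes 1..4 mapped to rows 0..3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
X0 = np.eye(4)   # one-hot features, so each column tracks one source node
W = np.eye(4)    # hypothetical identity weights instead of learned ones

X1 = message_passing_layer(X0, A, W, W)   # k = 1
X2 = message_passing_layer(X1, A, W, W)   # k = 2
```

With one-hot inputs, row 0 of X1 is non-zero only for node 1 and its neighbour node 2, while row 0 of X2 is non-zero for all four nodes: stacking k rounds lets each embedding depend on nodes up to k hops away.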

Variants of GNNs

[1] Kipf and Welling, "Semi-Supervised Classification with Graph Convolutional Networks" (ICLR 2017)

[2] Hamilton et al., "Inductive Representation Learning on Large Graphs" (NeurIPS 2017)

[3] Xu et al., "How Powerful are Graph Neural Networks?" (ICLR 2019)

[4] Veličković et al., "Graph Attention Networks" (ICLR 2018)

[5] Ying et al., "Hierarchical Graph Representation Learning with Differentiable Pooling" (NeurIPS 2018)

Based on aggregation and update functions

Spectral methods: GCN [1], ...
Non-spectral / Spatial methods: GraphSAGE [2], GIN [3], ...
Attention methods: GAT [4], ...

Based on tasks

Graph classification: DiffPool [5], ...
Node classification: GCN [1], GAT [4], GraphSAGE [2], ...

GCNs

GCN implements message passing on the graph by combining linear transformations over one-hop neighbourhoods with non-linearities, defined as: