| Objects: nodes, vertices | N |
| Interactions: links, edges | E |
| System: network, graph | G(N,E) |
How to build a graph:
Choice of the proper network representation of a given domain/problem determines our ability to use networks successfully:
Undirected Graphs
Links: undirected (symmetrical, reciprocal)
Examples: Collaborations Friendship on Facebook
Directed Graphs
Links: directed (arcs)
Examples: Phone calls Following on Twitter
The graph is called (un)directed iff. the set of pairs is (un)directed respectively.
The edge that consists of the same elements is called loop.
The subset of edges that consists of the same elements is called multiple edges (or multi-edge).
Two nodes/vertices are adjacent if they share a common edge An edge and a node on that edge are called incident.
Example:
(1, 1) — loop,
((1, 2),(1, 2),(1, 2)) — multiple edges.
Directed graph is called oriented graph if there are no two-side edges between any two vertices of the graph.
Two graphs are called isomorphic if one can re-number the vertices of one graph to obtain another.
Two nodes/vertices are adjacent if they share a common edge An edge and a node on that edge are called incident.
Properties of adjacency matrix:
Two nodes/vertices are adjacent if they share a common edge An edge and a node on that edge are called incident.
Properties of incidence matrix:
Two nodes/vertices are adjacent if they share a common edge An edge and a node on that edge are called incident.
For unweighted undirected graph incidence matrix is defined equivalently, but instead of -1 will be 1.
To generalize incidence matrix definition for weighted graphs (or weighted incidence matrix) multiply each column by weight of corresponding edge.
Graph on the left is connected but not strongly connected (e.g., there is no way to get from F to G by following the edge directions).
A set of vertices (edges) is called k-vertex (edge) cut if a graph becomes not connected after the deletion of this set.
A graph is called κ-vertex (edge) connected iff. if it doesn’t have any k − 1-vertex (edge) cuts.
κ-vertex connectivity is also shortly called κ-connectivity.The paths between two vertices are called k-vertex (edge) independent iff. there exist k paths from it which consist of disjoint sets of vertices (edges).
Undirected Graphs
Node degree, : the number of edges adjacent to node i
Directed Graphs
Node degree = in-degree + out-degree.
The maximum number of edges in an undirected graph on N nodes is
An undirected graph with the number of edges E = Emax is called a complete graph, and its average degree is N-1
Degree distribution \( P(k) \) :
Probability that a randomly chosen node has degree \( k \)
\( N_k \) - # nodes with degree \( k \)
Bipartite graph is a graph whose nodes can be divided into two disjoint sets U and V such that every link connects a node in U to one in V; that is, U and V are independent sets
Examples:
“Folded” networks:
Connected undirected graph is called bipartite iff. one can divide the vertices into two groups such that any two vertices from one group are not adjacent.
Consider the bipartite property testing algorithm:
Connected undirected graph is bipartite iff. it doesn’t contain odd length cycles.
1. Necessarity. Assume the contrary: the graph contains odd length cycle. Let’s start to divide vertices from this cycle to two groups by the rule: two vertices from one group should not be adjacent. By this rule vertices will sequentially alternate to each other. Since the length of cycle is odd, the first and the last vertices will be from the same group. This gives a contradiction with bipartite condition.
2. Sufficiency. Consider bipartite property testing algorithm with spanning tree T construction. It is easy to see that any tree is a bipartite graph. Let’s start to add remaining edges. Denote first edge by (v, w). Assume that vertices v and w corresponds to the same group by the testing algorithm. By the lemma 5 there exists unique path from v to w in T. Since the marks alternate to each other along this path by the algorithm, this path with the edge (v, w) form an odd length cycle. This gives a contradiction. Hence, all remaining edges connect vertices from different groups.
A path is a sequence of nodes in which each node is linked to the next one
A path can intersect itself and pass through the same edge multiple times
E.g.: ACBDCDEG
Cycle is a path consisted from distinct edges where the first and last vertices coincide.
Simple path is a path where any edges and vertices are distinct except the first and the last vertices.
Simple cycle is a simple path where the first and last vertices coincide.
Centrality is a function defined for each vertex of a graph that contains some information of a graph structure.
Let’s denote by N(v) the set of vertices which adjacent to the vertex v. The simplest example is the degree centrality deg v = ∥N(v)∥.
Let’s denote by G(N(v)) the maximal sub-graph on vertices V (N(v)). Then MC(v) be the largest connected component in G(N(v)). Maximum neighborhood component MNC(v) is the number of vertices in MC(v).
Density of maximum neighborhood component DMNC(v) = ∥E(MC(v))∥ ∥V (MC(v))∥ ϵ , for some ϵ ∈ [1, 2].
5. Average clustering coefficient:
2. Density:
1. The simplest example is the diametre:
3. Global efficiency:
4. Average shortest path length:
6. Small world coefficient:
where \(G_{rand}\) is a random graph \((|V(G)|, |E(G)|)\)
Distance (shortest path, geodesic) between a pair of nodes is defined as the number of edges along the shortest path connecting the nodes
In directed graphs, paths need to follow the direction of the arrows Consequence: Distance is not symmetric:
If the two nodes are not connected, the distance is usually defined as infinite (or zero)
Global clustering coefficient:
Local clustering coefficient (per vertex)
How connected are i’s neighbors to each other?
where ei is the number of edges between the neighbors of node i
Average clustering coefficient
How connected are i’s neighbors to each other?
M. Bronstein, Geometric deep learning: going beyond Euclidean data, 2017
where
The normalized Laplacian matrix \(\Delta\) is defined as \(D^{-\frac{1}{2}}LD^{-\frac{1}{2}}\), where \(L\) is the Laplacian matrix of \(G\), and \(D = diag(deg \ v_1, \dots, deg \ v_n)\) is the diagonal matrix consisting of degrees of \(G\). Then the \((i, j)\)-entry of \(\Delta\) is
in which we simply write \(d_i = deg \ v_i\) for convenience.
Properties of normalized Laplacian matrix
1. By viewing \(\Delta\) as a linear operator \(\Delta : \mathbb{R}^n \to \mathbb{R}^n\), \(\Delta\) is self-adjoint with respect to the scalar product \((\cdot, \cdot)\), i.e.,
\((x,\Delta y)\) = \((\Delta x, y)\)
for all \(x, y \in \mathbb{R}^n\). Here the scalar product is defined for pairs of vectors; formally, for any \(x, y \in \mathbb{R}^n\), \((x, y) := \sum_{i=1}^{n} x_i y_i\).
2. \(\Delta\) is non-negative:
\((\Delta x, x) \ge 0\), for all \(x\).
3. \(\Delta x = 0\) precisely when \(x\) is a vector collinear to \((\sqrt{d_1}, \dots, \sqrt{d_n})\).
4. The trace of \(\Delta\) is \(n\).
The normalized Laplacian matrix \(\Delta\) is defined as \(D^{-\frac{1}{2}}LD^{-\frac{1}{2}}\), where \(L\) is the Laplacian matrix of \(G\), and \(D = diag(deg \ v_1, \dots, deg \ v_n)\) is the diagonal matrix consisting of degrees of \(G\). Then the \((i, j)\)-entry of \(\Delta\) is
Karate club network
Fiedler Vector
Javer, An open-source platform for analyzing and sharing worm-behavior data, Nature, 2018
OpenWorm project visualization
The first comprehensive computational model of Caenorhabditis elegans (C. elegans), a microscopic roundworm. With only a thousand cells, it solves basic problems such as feeding, mate-finding and predator avoidance.