Marriage alliances among leading Florentine families, 15th century.
Determine the most "important" or "prominent" actors in the network based on actor location.
Degree centrality: number of nearest neighbors
Normalized degree centrality
High degree centrality - direct contact with many other actors
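A minimal sketch of degree centrality, assuming an undirected simple graph stored as a dictionary of neighbor sets and the common normalization by n - 1 (the maximum possible degree). The graph `toy` and the function name are hypothetical, chosen only for illustration.

```python
def degree_centrality(adj, normalized=True):
    """adj maps each node to the set of its neighbors (undirected graph)."""
    n = len(adj)
    # Assumption: normalize by n - 1, the largest degree a node can have.
    denom = (n - 1) if (normalized and n > 1) else 1
    return {v: len(nbrs) / denom for v, nbrs in adj.items()}

# Hypothetical toy graph: "a" touches everyone, "d" only touches "a".
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
scores = degree_centrality(toy)
print(max(scores, key=scores.get))  # node with the most direct contacts
```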
Closeness centrality: how close an actor is to all the other actors in the network
Normalized closeness centrality
High closeness centrality - short communication paths to others, minimal number of steps to reach others
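A minimal sketch of closeness centrality, assuming a connected, unweighted, undirected graph and the usual definition as the inverse of the summed shortest-path distances, multiplied by n - 1 when normalized. The helper names and the `path` graph are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in steps) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness_centrality(adj, normalized=True):
    """Assumes the graph is connected; isolated nodes get a score of 0."""
    n = len(adj)
    scores = {}
    for v in adj:
        total = sum(bfs_distances(adj, v).values())
        if total == 0:
            scores[v] = 0.0
        else:
            scores[v] = (n - 1) / total if normalized else 1.0 / total
    return scores

# Hypothetical path graph 1 - 2 - 3: the middle node reaches others fastest.
path = {1: {2}, 2: {1, 3}, 3: {2}}
scores = closeness_centrality(path)
```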
Betweenness centrality: number of shortest paths going through the actor, \(\sigma_{st}(i)\)
Normalized betweenness centrality
High betweenness centrality - vertex lies on many shortest paths
Probability that a communication from s to t will go through i
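A sketch of betweenness centrality for an unweighted, undirected graph, using Brandes-style accumulation of shortest-path counts. It assumes each pair (s, t) contributes the fraction of its shortest paths that pass through i, and that normalization divides by (n - 1)(n - 2)/2; both are common conventions rather than something fixed by the text above. The `star` graph is hypothetical.

```python
from collections import deque

def betweenness_centrality(adj, normalized=True):
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Walk back from the farthest nodes, accumulating path fractions.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    for v in bc:
        bc[v] /= 2.0  # every unordered pair {s, t} was visited twice
        if normalized and n > 2:
            bc[v] /= (n - 1) * (n - 2) / 2.0
    return bc

# Hypothetical star graph: every leaf-to-leaf path passes through "a".
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
scores = betweenness_centrality(star)
```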
Importance of a node depends on the importance of its neighbors (recursive definition)
Select the eigenvector associated with the largest eigenvalue: \(\lambda = \lambda_1\), \(v = v_1\)
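The leading eigenvector can be approximated by power iteration. The sketch below assumes a connected undirected graph and iterates with A + I instead of A; this shift keeps the same eigenvectors and is used here only to avoid oscillation on bipartite graphs. Scores are scaled so the largest equals 1. All names are hypothetical.

```python
def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # One multiplication by (A + I): own score plus the neighbors' scores.
        new = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        top = max(new.values())
        new = {v: val / top for v, val in new.items()}
        if max(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    return x

# Hypothetical star graph: the center is the most important neighbor of all.
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
scores = eigenvector_centrality(star)
```

For this star the largest eigenvalue of A is the square root of 3, so each leaf should end up with a score close to 1 divided by that value, relative to the center.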
A) Degree centrality
B) Closeness centrality
C) Betweenness centrality
D) Eigenvector centrality
from Claudio Rocchini
Centralization (network measure) - how central the most central node in the network is in relation to all other nodes.
\(C_x \) - one of the centrality measures
\(p_* \) - node with the largest centrality value
max - taken over all graphs with the same number of nodes (for degree, closeness and betweenness the most centralized structure is the star graph)
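A sketch of degree centralization, assuming Freeman's form: the sum over nodes of \(C_x(p_*) - C_x(p_i)\), divided by the maximum of that sum over all graphs with n nodes. For unnormalized degree the maximum is reached by the star graph and equals (n - 1)(n - 2). Function and graph names are hypothetical.

```python
def degree_centralization(adj):
    """Freeman-style centralization using raw degree as the centrality C_x."""
    n = len(adj)
    degrees = [len(nbrs) for nbrs in adj.values()]
    d_max = max(degrees)                      # centrality of the node p_*
    numerator = sum(d_max - d for d in degrees)
    denominator = (n - 1) * (n - 2)           # value attained by the star graph
    return numerator / denominator

# Hypothetical examples: a star is maximally centralized, a triangle is not.
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
```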
Linton Freeman, 1979
Directed graph: distinguish between choices made (outgoing edges) and choices received (incoming edges)
sending - receiving
export - import
cite papers - being cited
Degree centrality (normalized):
Closeness centrality (normalized):
Betweenness centrality (normalized):
All based on outgoing edges
Hyperlinks - implicit endorsements
Web graph - graph of endorsements (sometimes reciprocal)
"PageRank can be thought of as a model of user behavior. We assume there is a 'random surfer' who is given a web page at random and keeps clicking on links, never hitting 'back' but eventually gets bored and starts on another random page. The probability that the random surfer visits a page is its PageRank."
Sergey Brin and Larry Page, The anatomy of a large-scale hypertextual Web search engine, 1998
Random walk on graph
With teleportation
Perron-Frobenius theorem guarantees existence and uniqueness of the solution \(\lim_{t \to \infty} p = \pi\)
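A sketch of the random walk with teleportation as a power iteration. It assumes `alpha` is the probability of following a link (0.85 is a conventional choice, not taken from the text), that teleportation is uniform over all pages, and that a page without outgoing links jumps uniformly at random. The `web` graph is hypothetical.

```python
def pagerank(out_links, alpha=0.85, tol=1e-12, max_iter=1000):
    """out_links maps every page to the list of pages it links to."""
    nodes = list(out_links)
    n = len(nodes)
    p = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Teleportation: every page receives (1 - alpha) / n.
        new = {v: (1.0 - alpha) / n for v in nodes}
        for v in nodes:
            # Assumption: a dangling page spreads its mass over all pages.
            targets = out_links[v] or nodes
            share = alpha * p[v] / len(targets)
            for w in targets:
                new[w] += share
        if sum(abs(new[v] - p[v]) for v in nodes) < tol:
            return new
        p = new
    return p

# Hypothetical web graph: nobody links to "c", so it only gets teleport mass.
web = {"a": ["b"], "b": ["a"], "c": ["a"]}
ranks = pagerank(web)
```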
Personalized PageRank (PPR) is a measure for node proximity on large graphs. For a pair of nodes s and t, the PPR value \(\pi_s(t)\) equals the probability that an \(\alpha\)-discounted random walk from s terminates at t, and it reflects the importance between s and t in a bidirectional way.
Efficient Algorithms for Personalized PageRank Computation: A Survey, 2024, arXiv
is the indicator function. The degree matrix of \(G\) is an \(n \times n\) diagonal matrix \(\mathbf{D}\) whose \(i\)-th diagonal entry equals \(d_{\text{out}}(v_i)\). The transition matrix of \(G\) is formalized as \(\mathbf{P} = \mathbf{A}^\top \mathbf{D}^{-1}\). To ensure that \(\mathbf{P}\) is well-defined, we assume that \(d_{\text{out}}(v) > 0\)
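A sketch of PPR by power iteration, assuming the walk stops with probability `alpha` at each step and otherwise moves to a uniformly chosen out-neighbor, so that the PPR vector of s satisfies \(\pi_s = \alpha e_s + (1 - \alpha) \mathbf{P} \pi_s\). The value 0.2 for `alpha`, the function name, and the graph are hypothetical; every node must have at least one outgoing edge, matching the assumption \(d_{\text{out}}(v) > 0\).

```python
def personalized_pagerank(out_links, source, alpha=0.2, iterations=200):
    """Approximate the PPR vector of `source`; requires d_out(v) > 0 for all v."""
    nodes = list(out_links)
    pi = {v: 0.0 for v in nodes}
    pi[source] = 1.0
    for _ in range(iterations):
        new = {v: 0.0 for v in nodes}
        new[source] = alpha                      # walk terminates at the source
        for v in nodes:
            share = (1.0 - alpha) * pi[v] / len(out_links[v])
            for w in out_links[v]:               # one column of P = A^T D^-1
                new[w] += share
        pi = new
    return pi

# Hypothetical two-page cycle a -> b -> a, walks start at "a".
ppr = personalized_pagerank({"a": ["b"], "b": ["a"]}, source="a")
```

For this cycle the fixed point can be solved by hand: the value at "a" is alpha / (1 - (1 - alpha)^2), which gives a quick sanity check.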
Citation networks. Reviews vs original research (authoritative) papers
authorities: contain useful information, \(a_i\)
hubs: contain links to authorities, \(h_i\)
Mutual recursion
good authorities are referred to by good hubs
good hubs point to good authorities
System of linear equations
Symmetric eigenvalue problem
where eigenvalue \(\lambda = (\alpha\beta)^{-1}\)
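The mutual recursion between hubs and authorities can be sketched as an alternating iteration. The code assumes a directed graph given as out-link lists (every node must appear as a key) and rescales both score vectors to sum to 1 after each round; the scaling choice and all names are assumptions made for illustration.

```python
def hits(out_links, iterations=100):
    """Return (hub, authority) scores after a fixed number of rounds."""
    nodes = list(out_links)
    hub = {v: 1.0 for v in nodes}
    auth = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # Good authorities are pointed to by good hubs.
        auth = {v: 0.0 for v in nodes}
        for u in nodes:
            for v in out_links[u]:
                auth[v] += hub[u]
        # Good hubs point to good authorities.
        hub = {u: sum(auth[v] for v in out_links[u]) for u in nodes}
        # Rescale so the scores stay bounded.
        a_norm = sum(auth.values()) or 1.0
        h_norm = sum(hub.values()) or 1.0
        auth = {v: x / a_norm for v, x in auth.items()}
        hub = {v: x / h_norm for v, x in hub.items()}
    return hub, auth

# Hypothetical citation graph: two reviews (h1, h2) cite two papers (a1, a2).
web = {"h1": ["a1", "a2"], "h2": ["a1"], "a1": [], "a2": []}
hub, auth = hits(web)
```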
Hubs
Authorities
image from J. Leskovec, K. Lang, 2010
A k-core is the largest subgraph such that each vertex is connected to at least k others in the subgraph
Every vertex in a k-core has degree \(k_i \ge k\)
The (k + 1)-core is always a subgraph of the k-core
The core number of a vertex is the highest order of a core that contains this vertex
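Core numbers can be computed by repeatedly peeling off a vertex of minimum remaining degree. The sketch below assumes an undirected graph stored as neighbor sets and favors clarity over speed (it is quadratic in the number of vertices). The function names and the example graph are hypothetical.

```python
def core_numbers(adj):
    """Core number of every vertex, found by peeling minimum-degree vertices."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])      # the core level never decreases while peeling
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core

def k_core(adj, k):
    """Subgraph induced by the vertices whose core number is at least k."""
    core = core_numbers(adj)
    keep = {v for v in adj if core[v] >= k}
    return {v: adj[v] & keep for v in keep}

# Hypothetical graph: a triangle a-b-c with a pendant vertex d attached to a.
g = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
```

Here the triangle forms the 2-core, while `d` belongs only to the 1-core.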
k-cores: 1:1458, 2:594, 3:142, 4:12, 5:6
k-shells: 1:864-red, 2:452-pale green, 3:130-green, 4:6-blue, 5:6-purple
R: graph.coreness(gcc)
Find the 3-core of the given network
Dyad is a pair of vertices and possible relational ties between them:
- mutual
- asymmetric
- null (non-existent)
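The three dyad types can be counted directly from a list of directed ties. This is a minimal sketch assuming the network is given as a node list plus a collection of (source, target) pairs; the names and the example are hypothetical.

```python
from itertools import combinations

def dyad_census(nodes, arcs):
    """Count mutual, asymmetric and null dyads in a directed graph."""
    arcs = set(arcs)
    census = {"mutual": 0, "asymmetric": 0, "null": 0}
    for u, v in combinations(nodes, 2):
        forward = (u, v) in arcs
        backward = (v, u) in arcs
        if forward and backward:
            census["mutual"] += 1
        elif forward or backward:
            census["asymmetric"] += 1
        else:
            census["null"] += 1
    return census

# Hypothetical example: a and b choose each other, b chooses c, a and c are unrelated.
census = dyad_census(["a", "b", "c"], [("a", "b"), ("b", "a"), ("b", "c")])
```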
Triad is a subgraph of three vertices and possible ties between them:
Triad census: 16 isomorphism classes
D - down, U - up, T - transitive, C - cyclic.
| mutual dyads | asymmetric dyads | null dyads |
Network motifs are recurrent, statistically significant subgraphs or patterns in a graph: connected subgraphs that are over-represented compared to a random network
Motifs are not induced subgraphs, i.e. they do not contain all the graph edges between selected vertices.
Motifs appear in a network more frequently than in a comparable random network
- calculate the number of occurrences of a subgraph
- evaluate the significance
For \(G_t\), a subgraph (motif candidate) of \(G\):
R - random graph, \(\mu\) - mean frequency, \(\sigma\) - standard deviation
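Significance is commonly summarized as a Z-score: the subgraph's frequency in G minus its mean frequency in the random graphs, divided by the standard deviation of that frequency. The sketch below assumes this form and uses the population standard deviation; the counts are made-up numbers for illustration, not results from any network.

```python
from statistics import mean, pstdev

def motif_z_score(count_in_graph, counts_in_random):
    """Z-score of a subgraph count against counts from random graphs R."""
    mu = mean(counts_in_random)
    sigma = pstdev(counts_in_random)   # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("random counts have zero variance; Z-score undefined")
    return (count_in_graph - mu) / sigma

# Hypothetical counts of one subgraph in eight randomized networks.
random_counts = [2, 4, 4, 4, 5, 5, 7, 9]
z = motif_z_score(11, random_counts)
```

A large positive value suggests the subgraph is a motif candidate; the threshold to use is a modeling choice.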
Undirected graphs: motifs of size 3 and 4
Connected triads - motifs of size 3
More complicated motifs:
Ribeiro, 2011, Shen-Orr, 2002
S. Omidi, 2009
Ribeiro, 2011, Milo, 2002
146 nodes, 187 edges
What if we use Bag of node degrees?
Deg1: Deg2: Deg3:
Both Graphlet Kernel and Weisfeiler-Lehman (WL) Kernel use Bag-of-* representation of graph, where * is more sophisticated than node degrees!
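A bag of node degrees is simply a histogram of degrees. The sketch below, with hypothetical graphs and names, illustrates why richer features are useful: a 6-cycle and two disjoint triangles are different graphs but produce the same bag.

```python
from collections import Counter

def bag_of_degrees(adj):
    """Histogram mapping each degree to the number of nodes having it."""
    return Counter(len(nbrs) for nbrs in adj.values())

# Hypothetical graphs: one hexagon versus two separate triangles.
hexagon = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
two_triangles = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1},
    3: {4, 5}, 4: {3, 5}, 5: {3, 4},
}
same_bag = bag_of_degrees(hexagon) == bag_of_degrees(two_triangles)
```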
Given: A graph \(G\) with a set of nodes \(V\).
Assign initial colors
Aggregate neighboring colors
Hash aggregated colors
Aggregate neighboring colors (second round)
Hash aggregated colors (second round)
After color refinement, WL kernel counts number of nodes with a given color.
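A sketch of color refinement and the resulting kernel, assuming undirected graphs stored as neighbor sets. Instead of a hash table, each refined color is kept as a nested tuple (own color, sorted neighbor colors), which plays the role of the hash and is comparable across graphs. The kernel value is taken as the dot product of the color-count vectors collected over all rounds; the names are hypothetical.

```python
from collections import Counter

def wl_features(adj, rounds=2):
    """Count the colors seen at round 0 and after each refinement round."""
    colors = {v: () for v in adj}                 # same initial color everywhere
    features = Counter(colors.values())
    for _ in range(rounds):
        # Aggregate neighboring colors; the tuple itself stands in for the hash.
        colors = {
            v: (colors[v], tuple(sorted(colors[w] for w in adj[v])))
            for v in adj
        }
        features.update(colors.values())
    return features

def wl_kernel(adj1, adj2, rounds=2):
    """Dot product of the two bag-of-colors vectors."""
    f1, f2 = wl_features(adj1, rounds), wl_features(adj2, rounds)
    return sum(count * f2[color] for color, count in f1.items() if color in f2)

# Hypothetical graphs: a triangle and a path on three nodes.
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
path = {1: {2}, 2: {1, 3}, 3: {2}}
```

With two rounds, the triangle and the path share all three initial colors and one round-1 color (the degree-2 pattern), and nothing at round 2.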
Key idea: Bag-of-Words (BoW) for a graph