Spectral clustering and Louvain's algorithm
France ROSE
Machine Learning Journal Club
February 28th, 2018
From data to graph
Spectral theory and clustering
When your graph is too large: Louvain's algorithm
Retrieving cell categories with graph clustering
A Tutorial on Spectral Clustering, by U. von Luxburg (2007)
Fast unfolding of communities in large networks, by Blondel et al. (2008)
Data-Driven Phenotypic Dissection of AML Reveals Progenitor-like Cells that Correlate with Prognosis, by Levine et al. (2015)
C++ and MATLAB implementations of the Louvain algorithm
PhenoGraph Python package
Data are already a graph
Data are not a graph yet
Ex: protein interactions
Similarity graph
ε-neighborhood graph
k-nearest neighbors graph
Compute distances
RBF kernel
Euclidean, Mahalanobis, Manhattan...
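A minimal sketch of the three graph constructions listed above, assuming NumPy, Euclidean distances and a small toy dataset. The function names (`epsilon_graph`, `knn_graph`, `rbf_graph`) and the parameter values are illustrative choices, not taken from the slides.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def epsilon_graph(X, eps):
    """Unweighted epsilon-neighborhood graph: connect points closer than eps."""
    A = (pairwise_sq_dists(X) < eps ** 2).astype(float)
    np.fill_diagonal(A, 0.0)               # no self-loops
    return A

def knn_graph(X, k):
    """Symmetric k-nearest-neighbors graph: i~j if either is in the other's k nearest."""
    D2 = pairwise_sq_dists(X)
    np.fill_diagonal(D2, np.inf)           # a point is not its own neighbor
    nearest = np.argsort(D2, axis=1)[:, :k]
    A = np.zeros_like(D2)
    rows = np.repeat(np.arange(len(X)), k)
    A[rows, nearest.ravel()] = 1.0
    return np.maximum(A, A.T)              # symmetrize

def rbf_graph(X, sigma):
    """Fully connected graph weighted by the RBF (Gaussian) kernel."""
    W = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

# Toy data (assumed): two well-separated groups of three points
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
A_eps = epsilon_graph(X, eps=2.0)
A_knn = knn_graph(X, k=2)
W_rbf = rbf_graph(X, sigma=1.0)
```

On this toy data the ε-neighborhood and k-nearest-neighbors graphs coincide, because each point's two nearest neighbors are exactly the points within distance 2. Swapping in Mahalanobis or Manhattan distances only requires changing `pairwise_sq_dists`.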
Spectrum: eigenvalues and eigenvectors
Laplacian matrix = Degree matrix - Adjacency matrix
0 is now an eigenvalue with 3 orthogonal eigenvectors
Count how many times 0 is an eigenvalue: its multiplicity is the number of connected components
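The counting rule can be checked numerically. This is a small sketch assuming NumPy and a hypothetical 6-node graph made of three disconnected pieces. The tolerance `tol` is an assumption needed because computed eigenvalues are only zero up to floating-point error.

```python
import numpy as np

def laplacian(A):
    """Unnormalized graph Laplacian L = D - A."""
    D = np.diag(A.sum(axis=1))
    return D - A

def count_zero_eigenvalues(A, tol=1e-8):
    """Multiplicity of eigenvalue 0 of the Laplacian of A."""
    eigvals = np.linalg.eigvalsh(laplacian(A))   # L is symmetric
    return int(np.sum(np.abs(eigvals) < tol))

# Three disconnected pieces: a triangle {0,1,2}, an edge {3,4}, an isolated node {5}
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

n_components = count_zero_eigenvalues(A)
```

Here `n_components` equals 3, one zero eigenvalue per connected piece.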
fully connected graph (weighted)
Eigengap heuristic
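When the graph is fully connected, 0 is an eigenvalue only once, so the number of clusters is read from the largest gap between consecutive eigenvalues instead. A sketch assuming NumPy, the unnormalized Laplacian, and a made-up weight matrix with strong within-group and weak between-group weights. `eigengap_k` and `max_k` are illustrative names.

```python
import numpy as np

def eigengap_k(W, max_k=None):
    """Pick k such that the gap after the k-th smallest Laplacian eigenvalue is largest."""
    L = np.diag(W.sum(axis=1)) - W
    eigvals = np.linalg.eigvalsh(L)          # sorted in ascending order
    if max_k is not None:
        eigvals = eigvals[:max_k + 1]        # only consider k <= max_k
    gaps = np.diff(eigvals)                  # gaps[i] = eigvals[i+1] - eigvals[i]
    return int(np.argmax(gaps)) + 1, eigvals

# Assumed toy weights: two groups of 3 nodes, weight 1.0 inside, 0.01 across
W = np.full((6, 6), 0.01)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

k, eigvals = eigengap_k(W)
```

For these weights the eigenvalues are 0, 0.06 and 3.03 (four times), so the largest gap comes after the second eigenvalue and the heuristic suggests two clusters.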
Looking for communities/groups:
- many links inside a group
- few links between groups
Modularity
m: total number of edges
A: adjacency matrix
d: node degree
c: community membership
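With the symbols defined above, the modularity of a partition can be written as follows. This is the standard form used by Blondel et al. (2008), restated here with this deck's notation. The Kronecker delta \delta is an added symbol: it equals 1 when v and w are in the same community and 0 otherwise.

```latex
Q = \frac{1}{2m} \sum_{v,w} \left[ A_{vw} - \frac{d_v d_w}{2m} \right] \delta(c_v, c_w)
```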
Actual edge presence between v and w
Expected edge presence knowing the degrees and the total number of edges
Only count if v and w are classified in the same community
Sum over all pairs of nodes
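The four annotations above map directly onto a few lines of array code. A minimal sketch, assuming NumPy and an undirected graph given as a symmetric adjacency matrix. The example graph (two triangles joined by one edge) is made up for illustration.

```python
import numpy as np

def modularity(A, communities):
    """Modularity Q of a partition, for a symmetric adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    c = np.asarray(communities)
    d = A.sum(axis=1)                      # node degrees
    two_m = d.sum()                        # 2m: every edge is counted twice
    expected = np.outer(d, d) / two_m      # expected edge presence given the degrees
    same = c[:, None] == c[None, :]        # only count pairs in the same community
    return float(((A - expected) * same).sum() / two_m)   # sum over all pairs

# Two triangles {0,1,2} and {3,4,5} joined by the single edge 2-3
A = np.zeros((6, 6))
for v, w in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[v, w] = A[w, v] = 1.0

q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_all = modularity(A, [0, 0, 0, 0, 0, 0])     # everything in one community
```

Splitting along the bridging edge gives Q = 5/14, while putting every node in a single community gives Q = 0, which matches the intuition of many links inside a group and few links between groups.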
Cell categories
Number of cells
Examples of cells