Theory of Everything

Graph Signal Processing

with application to scRNA-seq

Quantifying the effect of experimental perturbations in single-cell RNA-sequencing data using graph signal processing


# MELD low-pass filter over the cell graph (parameters and inputs are introduced later in the talk)
def filterfunc(x):
    return np.exp(-b * np.abs(x / graph.lmax - a)**p)

filt = pygsp.filters.Filter(graph, filterfunc)
EES = filt.filter(RES, method="chebyshev", order=50)

Graph Signal Processing:

Overview, Challenges and Applications

Quick plan

  1. Spectral Graph Theory: study graph properties. Here I'll explain some existing algorithms.
     
  2. Graph Signal Processing: study signals on graphs. Here I'll propose some new algorithms, including the mentioned paper.

Intro to Spectral Graph Theory


Questions:

  • Who remembers what eigenvalues are?
  • Who knows what the Fourier transform is?
  • Who knows what the Laplace Operator (∆) does?
  • Markov Random Fields
  • Electrical networks
  • Heat diffusion
  • Finite element method
  • Google PageRank algorithm
  • Random Walks on graphs
  • Spectral clustering
  • Pseudo-time ordering of cells
  • DiffusionMap visualization
  • Conos annotation transfer

What do all these processes have in common?

Intro to Spectral Graph Theory

Why should you even care?


// Label-score smoothing over a cell graph (cf. Conos label propagation):
// each iteration spreads per-cell label scores along edges, with weights that
// decay exponentially with edge length.

// Assumed minimal definitions for this sketch; the real code has its own types.
#include <cmath>
#include <vector>
#include <Eigen/Dense>
using Mat = Eigen::MatrixXd;
struct Edge { int v_start; int v_end; double length; };

void smooth_cm(const std::vector<Edge> &edges, Mat &cm,
               int max_n_iters, double c, double f, double tol,
               const std::vector<bool> &is_label_fixed) {
  for (int iter = 0; iter < max_n_iters; ++iter) {
    Mat cm_new(cm);  // start from current scores, then add neighbour contributions
    for (auto const &e : edges) {
      double weight = std::exp(-f * (e.length + c));
      if (is_label_fixed.empty() || !is_label_fixed.at(e.v_start)) {
        cm_new.row(e.v_start) += cm.row(e.v_end) * weight;
      }

      if (is_label_fixed.empty() || !is_label_fixed.at(e.v_end)) {
        cm_new.row(e.v_end) += cm.row(e.v_start) * weight;
      }
    }

    // Stop once the update no longer changes the scores
    double inf_norm = (cm_new - cm).array().abs().matrix().lpNorm<Eigen::Infinity>();
    if (inf_norm < tol)
      break;

    cm = cm_new;
  }
}
What do all these processes have in common?

\frac{dx}{dt} = -L x

All of them can be described through diffusion on the graph, governed by the graph Laplacian L.

Some examples

Gene networks

[Figure: gene network with two genes fixed to 1 (associated with disease) and 0 (no association with disease); the remaining genes take smoothed values such as 0.375, 0.5, and 0.625.]

Markov Random Field

Minimize

\sum_{(a,b) \in E} \left( x(a) - x(b) \right)^2

Some examples

Electrical networks

[Figure: electrical network with potentials fixed at 0V and 1V; interior nodes settle at potentials such as 0.325, 0.5, and 0.675.]

Minimize

\sum_{(a,b) \in E} \left( x(a) - x(b) \right)^2

Some examples

Quadratic form minimization

Minimize over x the quadratic form:

\sum_{(a,b) \in E} \left( x(a) - x(b) \right)^2 = x^T L x

[Figure: the same network with potentials 0V and 1V fixed; the quadratic-form minimizer assigns the interior nodes values between 0.325 and 0.675.]

Graph Laplacian

System of linear equations

Applying the operator {\color{darkred} L} to a vector x, i.e. computing {\color{darkred} L} x, creates some flow on the graph.
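As a minimal sketch of that linear-system view (toy path graph and boundary values of my choosing, not from the talk): fixing two node values and minimizing x^T L x over the remaining ones reduces to solving a linear system in the free nodes.

import numpy as np

# Toy path graph 0-1-2-3: adjacency matrix A and Laplacian L = D - A
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

fixed = {0: 0.0, 3: 1.0}                      # boundary conditions (like 0V / 1V)
free = [i for i in range(len(A)) if i not in fixed]

# Minimizing x^T L x with fixed entries <=> L[free, free] x_free = -L[free, fixed] x_fixed
b = -L[np.ix_(free, list(fixed))] @ np.array(list(fixed.values()))
x = np.zeros(len(A))
x[list(fixed)] = list(fixed.values())
x[free] = np.linalg.solve(L[np.ix_(free, free)], b)
print(x)  # interior values interpolate smoothly between the fixed ones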

It's all about the Laplacian

Non-Normalized Graph Laplacian:

{\color{darkred}L} = {\color{darkblue} D} - {\color{darkgreen} A}

{\color{darkgreen} A}: adjacency matrix

{\color{darkblue} D}: degree matrix, {\color{darkblue} d_{i,i}} = \sum_{j=1}^{N} {\color{darkgreen} A_{i,j}}

Connection to the Laplace Operator

Laplace Operator:

{\color{darkred} \Delta}u = \sum _{i=1}^{n}{\frac {\partial ^{2}u}{\partial x_{i}^{2}}}

Heat Equation:

\frac{\partial u}{\partial t} = {\color{darkred} \Delta} u

Graph diffusion:

\frac{dx}{dt} = {\color{darkred} -L} x

Finite element method: discretizing the Laplace operator on a mesh yields a (weighted) graph Laplacian.

Informally, the Laplacian operator gives the difference between the average value of a function in the neighborhood of a point and its value at that point. [wiki]
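A small sketch of graph diffusion (the toy ring graph and step size are illustrative): integrating dx/dt = -Lx with explicit Euler steps spreads the signal along edges.

import numpy as np

# Toy ring graph on 6 nodes
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

x = np.zeros(n)
x[0] = 1.0                     # all "heat" starts at node 0
dt = 0.1                       # Euler step; must be small relative to 1/lambda_max
for _ in range(100):
    x = x - dt * (L @ x)       # dx/dt = -L x

print(x)                       # mass has diffused towards the uniform distribution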

Other graph matrices:

Adjacency matrix:

\color{darkgreen} A

Normalized Laplacian:

\widetilde{L} = {\color{darkblue} D^{-\frac{1}{2}}} {\color{darkred} L} {\color{darkblue} D^{-\frac{1}{2}}}

Random walk matrix:

P = {\color{darkblue} D^{-1}} {\color{darkgreen} A}

...
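A minimal numpy sketch (toy adjacency matrix of my choosing) of how these matrices are built from A:

import numpy as np

# From an adjacency matrix A, build the matrices mentioned above
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)

L = np.diag(deg) - A                                   # non-normalized Laplacian
L_norm = np.diag(deg**-0.5) @ L @ np.diag(deg**-0.5)   # normalized Laplacian
P = np.diag(1.0 / deg) @ A                             # random walk matrix (rows sum to 1)

print(P.sum(axis=1))  # [1. 1. 1. 1.]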

Other ways to define flow on graphs


Practical applications

Pseudotime estimation**

Conos Annotation Transfer

Google PageRank algorithm*

k-NN Smoothing

(no picture)


Can we re-write the RNA velocity equations on the graph?

Eigenvalues of the Laplacian matrix

L v_i = \lambda_i v_i

Eigenvalues:

\lambda_1 \le \lambda_2 \le ... \le \lambda_n

Eigenvectors:

v_1, ..., v_n

Eigenvalues of the Laplacian matrix

Courant-Fischer Theorem

\lambda_0 = \underset{||x||=1}{\mathrm{min}} x^T L x;\\ v_0 = \underset{||x||=1}{\mathrm{argmin}}\ x^T L x
x^T L x = \sum_{(a,b) \in E} w_{a,b} \left( x(a) - x(b) \right)^2
\lambda_1 = \underset{\substack{||x||=1\\x^Tv_0=0}}{\mathrm{min}} x^T L x;\\ v_1 = \underset{\substack{||x||=1\\x^Tv_0=0}}{\mathrm{argmin}}\ x^T L x
\lambda_0 = 0, and v_0 is a constant vector.

Connected nodes have close values in the low-eigenvalue eigenvectors!
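A quick numpy check of these properties on a toy graph (graph and code are illustrative, not from the talk): the smallest eigenvalue is zero with a constant eigenvector, and the next eigenvectors can serve as drawing coordinates.

import numpy as np

# Toy graph: two triangles joined by one edge
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

eigvals, eigvecs = np.linalg.eigh(L)   # sorted ascending for symmetric matrices
print(np.round(eigvals, 3))            # first eigenvalue is ~0
print(np.round(eigvecs[:, 0], 3))      # constant vector (up to sign/scale)

coords = eigvecs[:, 1:3]               # v_1, v_2 as 2D "spectral drawing" coordinates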

Eigenvalues of the Laplacian matrix

Spectral Drawing

[Figure: graph nodes drawn at coordinates (v_1, v_2).]

Eigenvalues of the Laplacian matrix

Spectral Drawing

# UMAP with spectral initialization: the starting layout comes from the
# eigenvectors of the (normalized) graph Laplacian of the k-NN graph
emb <- uwot::umap(
    X, 
    metric="cosine", 
    init="spectral"
)

Eigenvalues of the Laplacian matrix

Diffusion Map

Non-Normalized Laplacian:

{\color{darkred}L} = {\color{darkblue} D} - {\color{darkgreen} A}

Normalized Laplacian:

\widetilde{L_\alpha} = {\color{darkblue} D^{-\alpha}} {\color{darkred} L} {\color{darkblue} D^{-\alpha}}

Degree matrix:

\widetilde{d_{\alpha,i,i}} = \sum_{j=1}^{N} \widetilde{L}_{\alpha,i,j}

Random walk matrix:

P_\alpha = \widetilde{D_\alpha}^{-1} \widetilde{L_\alpha}

Generalizes spectral drawing; connected to hitting distances on the graph.
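A sketch of the standard diffusion-map construction (following Coifman & Lafon, applying the alpha-normalization to the affinity matrix rather than to the Laplacian; the toy data are made up):

import numpy as np

# Toy affinity (kernel) matrix from pairwise distances of random 2D points
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
d2 = ((X[:, None, :] - X[None, :, :])**2).sum(-1)
K = np.exp(-d2 / d2.mean())

alpha = 1.0                                    # anisotropic normalization strength
q = K.sum(axis=1)
K_a = K / np.outer(q**alpha, q**alpha)         # K_alpha = D^-alpha K D^-alpha
P = K_a / K_a.sum(axis=1, keepdims=True)       # row-stochastic random walk matrix

eigvals, eigvecs = np.linalg.eig(P)            # P is not symmetric, so use eig
order = np.argsort(-eigvals.real)
# Drop the trivial eigenvector (eigenvalue 1); scale by the eigenvalues (diffusion time t = 1)
dmap = eigvecs.real[:, order[1:3]] * eigvals.real[order[1:3]]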

Eigenvalues of the Laplacian matrix

Spectral Clustering and Min Cut

k-Means in spectral space approximates Minimum k-cut

[Figure: nodes plotted in (v_1, v_2) spectral coordinates.]
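A minimal spectral-clustering sketch (toy graph; scikit-learn's KMeans is one convenient choice, not prescribed by the talk): embed the nodes with the low eigenvectors of L and run k-means there.

import numpy as np
from sklearn.cluster import KMeans

# Two dense blobs of nodes connected by a single bridge edge
A = np.zeros((8, 8))
for i, j in [(0,1),(0,2),(1,2),(2,3),(1,3),(4,5),(4,6),(5,6),(6,7),(5,7),(3,4)]:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

eigvals, eigvecs = np.linalg.eigh(L)
emb = eigvecs[:, 1:3]                      # skip the constant eigenvector
labels = KMeans(n_clusters=2, n_init=10).fit_predict(emb)
print(labels)                              # nodes 0-3 vs 4-7 end up in different clusters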

Practical applications

  • Spectral visualization
  • Diffusion Maps
  • Spectral clustering
  • Fast Minimal Cuts

Graph Signal Processing

Signals on graphs

Example: Gene expression as a signal

Signals on graphs

Alternative view on the Quadratic Form

Signal {\color{purple} f} is given over the graph domain.

Derivative over an edge:

\frac{\partial {\color{purple} f(i)}}{\partial e_{i,j}} = \sqrt{A_{i,j}} \left({\color{purple} f(i)} - {\color{purple} f(j)}\right)

Gradient:

\nabla {\color{purple} f(i)} = \Big[ \frac{\partial {\color{purple} f(i)}}{\partial e_{i,j}},\ j \in adj(i) \Big]

Local variation:

||\nabla {\color{purple} f(i)}||_2^2 = \sum_{j \in adj(i)} \left( \sqrt{A_{i,j}} \left({\color{purple} f(i)} - {\color{purple} f(j)}\right) \right)^2 = \sum_{j \in adj(i)} A_{i,j} \left({\color{purple} f(i)} - {\color{purple} f(j)}\right)^2

Laplacian:

L = D - A = \nabla^T \nabla, \quad \Delta = -L

Signals on graphs

Alternative view on the Quadratic Form

Local variation:

||\nabla {\color{purple} f(i)}||_2^2 = \sum_{j \in adj(i)} A_{i,j} \left({\color{purple} f(i)} - {\color{purple} f(j)}\right)^2

Global variation:

S_2({\color{purple} f}) = \sum_{i \in V} \sum_{j \in adj(i)} A_{i,j} \left({\color{purple} f(i)} - {\color{purple} f(j)}\right)^2 = \sum_{(i, j) \in E} A_{i,j} \left({\color{purple} f(i)} - {\color{purple} f(j)}\right)^2 = {\color{purple} f^T} L {\color{purple} f}

Signals on graphs

Possible application: measure alignment quality

  • Signal = expression: we can measure the local / global variation of expression after the alignment and compare it to the variation before (see the sketch below).

  • Signal = sample labels: we can measure the local variation of sample labels across the graph.
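A minimal sketch of measuring the global variation f^T L f of a signal before and after a smoothing step (graph, signal, and the smoothing step are all made up for illustration):

import numpy as np

def global_variation(A, f):
    """f^T L f: total squared difference of the signal across edges."""
    L = np.diag(A.sum(axis=1)) - A
    return float(f @ L @ f)

# Toy symmetric adjacency matrix and a noisy signal on its nodes
rng = np.random.default_rng(1)
A = np.triu((rng.random((30, 30)) < 0.15).astype(float), 1)
A = A + A.T
f = rng.normal(size=30)

f_smooth = f - 0.05 * ((np.diag(A.sum(axis=1)) - A) @ f)   # one small diffusion step
print(global_variation(A, f))          # variation of the raw signal
print(global_variation(A, f_smooth))   # smaller after smoothing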

Example: Gene expression as a signal

k-NN smoothing is noise filtering

Image filtering


Filter size = k-hop on graph

Can we increase k?

Grid = Graph

Image filtering

Fourier Transform

  • Represents a signal as a combination of waves
  • Spectral domain: frequency vs. amplitude
  • Every spectral coefficient aggregates information from every pixel
  • We can work with frequencies!
  • Fast Fourier Transform: O(N log N)

Image filtering

Spectral Domain Filters

Low-pass filter*

High-pass filter**

Fourier transform on graphs

For an eigenvector v_i:

v_i^T L v_i = \lambda_i ||v_i||_2^2 = \lambda_i

Global variation:

S_2(f) = f^T L f

Eigenvectors with small \lambda_i vary slowly over the graph (low frequencies), while those with large \lambda_i oscillate quickly (high frequencies).

Fourier transform on graphs

Fourier transform of f:

\hat{f}(\lambda_k) = \sum_{i \in V} f(i) v_k(i)

Inverse Fourier transform of \hat{f}:

f(i) = \sum_{k = 1}^{N-1} \hat{f}(\lambda_k) v_k(i)

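A numpy sketch of the graph Fourier transform on a toy graph: the forward and inverse transforms are just projections onto the Laplacian eigenvectors.

import numpy as np

# Toy ring-graph Laplacian and a signal f on its nodes
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
f = np.array([1.0, 0.2, -0.5, 0.3, 0.9])

eigvals, V = np.linalg.eigh(L)     # columns of V are the eigenvectors v_k
f_hat = V.T @ f                    # graph Fourier transform: projections onto eigenvectors
f_rec = V @ f_hat                  # inverse transform reconstructs the signal
print(np.allclose(f, f_rec))       # True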

Fourier transform on graphs

Function projections on frequencies

Fourier transform on graphs

General definition of filters

Inverse Fourier transform of \hat{f}:

f(i) = \sum_{k = 1}^{N-1} \hat{f}(\lambda_k) v_k(i)

Applying a spectral filter g:

f(i) = \sum_{k = 1}^{N-1} {\color{darkred} g(\lambda_k)} \hat{f}(\lambda_k) v_k(i)
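A sketch of applying such a filter in the spectral domain (toy ring graph; the response g(lambda) = 1/(1 + 5*lambda) is an arbitrary low-pass choice, which happens to be the Tikhonov filter discussed next):

import numpy as np

# Toy ring-graph Laplacian and a noisy signal
A = np.zeros((20, 20))
for i in range(20):
    A[i, (i + 1) % 20] = A[(i + 1) % 20, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
rng = np.random.default_rng(0)
f = np.sin(np.linspace(0, 2 * np.pi, 20)) + 0.3 * rng.normal(size=20)

eigvals, V = np.linalg.eigh(L)
g = 1.0 / (1.0 + 5.0 * eigvals)        # example low-pass response g(lambda)
f_filtered = V @ (g * (V.T @ f))       # filtering = modulating the Fourier coefficients by g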

Fourier transform on graphs

Example: Tikhonov regularization

It's just a low-pass filter

\underset{x}{\mathrm{argmin}}\left( ||x - f||_2^2 + \gamma x^T L x \right)
f(i) = \sum_{k = 1}^{N-1} {\color{darkred} \frac{1}{1 + \gamma \lambda_k}} \hat{f}(\lambda_k) v_k(i)
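Equivalently (a sketch with a made-up toy graph), the Tikhonov minimizer can be computed without any eigendecomposition by solving the linear system (I + gamma*L) x = f:

import numpy as np

def tikhonov_smooth(L, f, gamma):
    """argmin_x ||x - f||^2 + gamma * x^T L x  <=>  solve (I + gamma * L) x = f."""
    return np.linalg.solve(np.eye(len(f)) + gamma * L, f)

# Toy usage on a 3-node path graph
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
x = tikhonov_smooth(L, np.array([1.0, 0.0, 1.0]), gamma=2.0)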

Fourier transform on graphs

Computational problems

  • The direct Fourier transform requires a full eigendecomposition:
\mathrm{O}(|V|^3)
  • A polynomial (e.g. Chebyshev) approximation of the filter needs only sparse matrix-vector products (see the sketch below):
\mathrm{O}(|E|)
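A sketch of why polynomial filters are cheap (this is a plain power series with made-up coefficients, not the Chebyshev recursion pygsp uses): applying p(L)f only needs repeated sparse matrix-vector products.

import numpy as np
import scipy.sparse as sp

def polynomial_filter(L, f, coeffs):
    """Apply p(L) f = sum_k coeffs[k] * L^k f using only mat-vec products, O(K|E|)."""
    result = np.zeros_like(f)
    power = f.copy()
    for c in coeffs:
        result += c * power
        power = L @ power          # next power of L applied to f
    return result

# Toy sparse Laplacian of a large ring graph and a random signal
n = 1000
rows = np.arange(n); cols = (rows + 1) % n
A = sp.coo_matrix((np.ones(n), (rows, cols)), shape=(n, n))
A = (A + A.T).tocsr()
L = sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A
f = np.random.default_rng(0).normal(size=n)

f_smooth = polynomial_filter(L, f, coeffs=[0.5, -0.3, 0.05])  # example coefficients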

Possible application

Tikhonov regularization for expression correction?

  • We can use Tikhonov regularization for correcting batch effects over the Conos graph
  • Gamma sets the correction strength
  • We can choose it by comparing the variation of the corrected expression to the expression variation within individual samples
\underset{x}{\mathrm{argmin}}\left( ||x - f||_2^2 + \gamma x^T L x \right)

Filters on graphs

Back to images

Low-pass filter*

High-pass filter**

MELD: low-pass filter

import numpy as np
import pygsp

b, a, p = 60, 0, 1   # beta, alpha, p from the slide; `graph` is the pygsp cell graph

def filterfunc(x):
    return np.exp(-b * np.abs(x / graph.lmax - a)**p)

filt = pygsp.filters.Filter(graph, filterfunc)
EES = filt.filter(RES, method="chebyshev", order=50)  # RES: raw, EES: enhanced experimental signal

g(\lambda) = e^{-\beta \left|\frac{\lambda}{\lambda_{max}} - \alpha\right|^p}

\beta = 60,\quad \alpha = 0,\quad p = 1

MELD: low-pass filter

Other filter parameters

g(\lambda) = e^{-\beta \left|\frac{\lambda}{\lambda_{max}} - \alpha\right|^p}


Possible applications

  • Whatever we need to smooth, low-pass filters work much better than k-NN smoothing.

  • Low-pass filters are a proper way to do kernel density estimation on graphs. We can at least apply them to estimate the density of healthy vs. control samples for cross-condition comparison (see the sketch below).

  • Can we use high-pass filters over expression to detect cell type boundaries? :)
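A sketch of graph kernel density estimation via a low-pass filter, assuming a pygsp graph G over the cells already exists; the Heat filter, tau value, and label names are illustrative choices, not the talk's prescription.

import numpy as np
import pygsp

# G: pygsp graph over cells (e.g. a k-NN graph); assumed to exist already
# sample_labels: array with one entry per cell, e.g. "disease" / "control"
def graph_density(G, sample_labels, label, tau=10):
    G.estimate_lmax()
    heat = pygsp.filters.Heat(G, tau)                  # low-pass (heat kernel) filter
    indicator = (np.asarray(sample_labels) == label).astype(float)
    return heat.filter(indicator, method="chebyshev", order=50)

# density_disease = graph_density(G, sample_labels, "disease")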

Other signal transformations

Translation

Modulation

Convolution

Dilation

Other signal transformations

Down-sampling


We can down-sample the graph (both vertices and edges), preserving both the graph topology and the signal!

We can do a better PAGA:

Other signal transformations

  • Wavelet analysis (illustrated by hierarchical subtraction)
  • Time-vertex domain: graph is one dimension and time is another
  • Random Processes on graphs

Multiscale Wavelet Analysis

Graph Signal Processing

By Viktor Petukhov


Journal club presentation based on https://www.biorxiv.org/content/10.1101/532846v3 and https://ieeexplore.ieee.org/document/8347162
