Some deep learning methods
for single-cell NGS data
Sasha Galitsyna
agalitzina@gmail.com

Single-cell NGS methods

  • scDNA-Seq (2012): genomic DNA
  • scRNA-Seq (2009): transcriptome
  • scBS-Seq (2013): methylation
  • scATAC-Seq (2015): accessibility
  • scDNase-Seq (2015): accessibility
  • scChIP-Seq (2015): protein binding
  • scNMT-Seq (2018): joint RNA,
    methylation, accessibility
  • scHi-C (2013): chromatin interactions

Clark et al. NatCom 2018

Single-cell NGS methods

  • scRNA-Seq (2009): transcriptome
  • scDNA-Seq (2012): genomic DNA
  • scBS-Seq (2013): methylation
  • scHi-C (2013): chromatin interactions
  • scATAC-Seq (2015): accessibility
  • scDNase-Seq (2015): accessibility
  • scChIP-Seq (2015): protein binding
  • scNMT-Seq (2018): joint RNA+methylation
    +accessibility
  • scCAT (2019): joint accessibility
    +transcriptome

Clark et al. NatCom 2018

Single-cell NGS methods

Liu et al. NatCom 2019

scCAT

Tasks in sc NGS methods

Hwang et al. 2018

  • Data imputation
  • Differential expression analysis
  • Cell type identification
  • Detection of rare cell types
  • Cell hierarchy reconstruction
  • Inference of regulatory networks

Confounding factors in scNGS

Hwang et al. 2018

Recent advances...

scAlign (Johansen, 2018 bioRxiv)

scBLAST (Cao 2019 bioRxiv)

BS-Seq: bisulfite sequencing

DeepCpG: methylation prediction in single cells

Angermueller et al. 2017 Genome Biology

https://github.com/cangermueller/deepcpg.git

DeepCpG: tasks

Angermueller et al. 2017 Genome Biology

DeepCpG architecture

Other methods

  • WinAvg
    (3001 bp)
  • CpGAvg
  • Random Forest
    (k-mers in 1001 bp window, CpG state in 25 neighbors)
  • Random Forest from Zhang
    (2 CpG neighbors, genomic context, histone modifications, DHS)
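
The two averaging baselines can be sketched as below. This is a minimal illustration, assuming WinAvg averages the observed CpG states of the same cell inside a 3001 bp window around the target site, and CpGAvg averages the state of the target site over the other cells. The function names and the array layout (NaN for unobserved sites) are hypothetical and not taken from the DeepCpG code.

```python
import numpy as np

def win_avg(positions, states, target_idx, window=3001):
    """Assumed WinAvg: mean state of observed CpGs of one cell within
    `window` bp centred on the target site (target itself excluded)."""
    half = window // 2
    mask = np.abs(positions - positions[target_idx]) <= half
    mask[target_idx] = False          # do not use the site being predicted
    mask &= ~np.isnan(states)         # keep observed sites only
    if not mask.any():
        return float("nan")
    return float(states[mask].mean())

def cpg_avg(states_by_cell, target_cell):
    """Assumed CpGAvg: mean state of one CpG site across all other cells
    in which the site is observed."""
    others = np.delete(states_by_cell, target_cell)
    others = others[~np.isnan(others)]
    return float(others.mean()) if others.size else float("nan")

# Toy data: five CpG sites in one cell, NaN marks an unobserved site.
positions = np.array([100, 900, 1500, 2400, 5000])
states = np.array([1.0, 0.0, np.nan, 1.0, 0.0])
win_pred = win_avg(positions, states, target_idx=2)

# Toy data: one CpG site in four cells.
site_states = np.array([1.0, np.nan, 0.0, 1.0])
cpg_pred = cpg_avg(site_states, target_cell=0)
```

Both baselines ignore the DNA sequence, which is the information DeepCpG adds on top of neighbouring CpG states.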

DeepCpG performance

Angermueller et al. 2017 Genome Biology

Motifs analysis

Effect of mutations

Modified model to predict mean and variance

estimated mean methylation rate of cell t computed by averaging the binary methylation state of all observed CpG sites in window s

estimated mean methylation level for a window of a certain size, indexed by s, centred on target site n

cell-to-cell variance
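
The three quantities above can be computed as in the sketch below. It assumes a cells x sites matrix of binary methylation states with NaN for unobserved sites, takes the per-cell rate as the average over observed sites in the window, and then takes the mean and variance of those rates across cells. The function name, the argument layout, and the choice to skip cells without coverage in the window are assumptions made for illustration.

```python
import numpy as np

def window_mean_and_variance(positions, states, target_pos, window):
    """states: array of shape (cells, sites), values 0/1 or NaN (unobserved).
    Returns (mean across cells, cell-to-cell variance) of the per-cell
    methylation rate in a window centred on target_pos."""
    half = window // 2
    in_window = np.abs(positions - target_pos) <= half
    sub = states[:, in_window]
    counts = (~np.isnan(sub)).sum(axis=1)       # observed sites per cell
    sums = np.nansum(sub, axis=1)               # methylated sites per cell
    # Per-cell rate; cells with no observed site in the window are skipped
    # (an assumption of this sketch).
    rates = sums[counts > 0] / counts[counts > 0]
    if rates.size == 0:
        return float("nan"), float("nan")
    mean = rates.mean()
    variance = ((rates - mean) ** 2).mean()
    return float(mean), float(variance)

positions = np.array([10, 20, 30, 500])
states = np.array([
    [1.0, 1.0, np.nan, 0.0],       # cell 0: fully methylated in the window
    [0.0, np.nan, 0.0, 1.0],       # cell 1: unmethylated in the window
    [np.nan, np.nan, np.nan, 1.0], # cell 2: no coverage in the window
])
mean_rate, cell_variance = window_mean_and_variance(positions, states, 20, 41)
```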

Modified model results


Left: correlation between activity and conservation or variation

Right: motif effects
brown: association with variance
purple: influence on mean

scVI: variational inference of RNA-Seq

Lopez et al. 2018 December Nature Methods

scRNA-Seq data problem:

Zero-inflated distribution of counts:

Zero-inflated negative binomial:
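
For reference, a zero-inflated negative binomial mixes a point mass at zero (weight pi) with a negative binomial count distribution. The sketch below assumes the mean / inverse-dispersion parameterisation (mu, theta); the parameter names are chosen for this example and may differ from the notation on the original slide.

```python
import math

def nb_pmf(k, mu, theta):
    """Negative binomial pmf with mean mu and inverse dispersion theta."""
    log_p = (math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
             + theta * math.log(theta / (theta + mu))
             + k * math.log(mu / (theta + mu)))
    return math.exp(log_p)

def zinb_pmf(k, mu, theta, pi):
    """Zero-inflated NB: with probability pi the count is forced to zero,
    otherwise it is drawn from NB(mu, theta)."""
    nb = nb_pmf(k, mu, theta)
    if k == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb

mu, theta, pi = 2.0, 1.5, 0.3
total = sum(zinb_pmf(k, mu, theta, pi) for k in range(500))
mean = sum(k * zinb_pmf(k, mu, theta, pi) for k in range(500))
```

The extra mass at zero lowers the expected count from mu to (1 - pi) * mu, which is how the model accounts for dropout.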

Variational autoencoders
for scRNA-Seq embedding

  • scvis, scVAE, VASC, DCA: do not distinguish technical and biological effects
  • Example for DCA (deep count autoencoder network):

Variational autoencoders
for scRNA-Seq embedding

  • Example for VASC:

scVI

n - cells

g - genes

multivariate normal prior

prior for log-scaling factor for library size

latent variables:
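
A rough simulation of a generative process of this kind is sketched below: a standard normal latent vector per cell, a log-normal library size, a Gamma-Poisson (negative binomial) count, and a dropout mask that zeroes some entries. Random linear maps stand in for the decoder neural networks, the batch covariate is omitted, and all parameter values are made up, so this illustrates the structure only and is not the scVI implementation.

```python
import numpy as np

def sample_scvi_like(n_cells, n_genes, latent_dim=10, theta=2.0,
                     log_lib_mean=7.0, log_lib_std=0.5, seed=0):
    rng = np.random.default_rng(seed)
    # Latent cell state: multivariate normal prior N(0, I).
    z = rng.standard_normal((n_cells, latent_dim))
    # Library-size scaling factor: normal prior on the log scale.
    lib = np.exp(rng.normal(log_lib_mean, log_lib_std, size=(n_cells, 1)))
    # Hypothetical stand-ins for the decoder networks (random linear maps).
    w_rho = rng.standard_normal((latent_dim, n_genes))
    w_drop = rng.standard_normal((latent_dim, n_genes))
    # Softmax over genes: expected fraction of counts per gene.
    logits = z @ w_rho
    logits -= logits.max(axis=1, keepdims=True)
    rho = np.exp(logits)
    rho /= rho.sum(axis=1, keepdims=True)
    # Gamma-Poisson mixture, i.e. negative binomial with mean lib * rho.
    # Parameterisation chosen here so that E[w] = rho (an assumption).
    w = rng.gamma(shape=theta, scale=rho / theta)
    y = rng.poisson(lib * w)
    # Dropout: each entry is zeroed with a cell- and gene-specific probability.
    p_drop = 1.0 / (1.0 + np.exp(-(z @ w_drop)))
    dropped = rng.random((n_cells, n_genes)) < p_drop
    x = np.where(dropped, 0, y)
    return x, z, lib

x, z, lib = sample_scvi_like(n_cells=50, n_genes=20)
```

Here n_cells and n_genes correspond to n and g above; inference in scVI runs in the opposite direction, from observed counts x to the latent variables.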

scANVI

Variables:

Grey: observed
Semi-shaded: observed or random
White: latent

scVI runtime

Imputation errors assessment

SIMLR: single-cell interpretation via multi-kernel learning

Wang et al. 2017 Nature Methods

scVI latent space

scVI: differential expression benchmarking

See also:

  • single-cell ANnotation using Variational Inference (scANVI) - Jan 2019, bioRxiv, Xu et al. (the same authors)
