A Bayesian model for single-cell transcript expression analysis with MERFISH data

Johannes Köster

 

2016

MERFISH

FISH

Fluorescence in-situ hybridization:

  • label RNA with fluorescent probes
  • see RNA molecules in single cells

 

Problem:

needs 1 color per transcript

MERFISH

...

MERFISH

Problem with raw counts:

20% loss and misidentification

MERFISH

Known error probabilities:

1→0 error: 10%

0→1 error: 4%
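
To make the per-bit error rates concrete, the sketch below computes how often a whole binary barcode would be read without error, or with exactly one flipped bit. It assumes that bit errors are independent and uses a hypothetical 16-bit word with four 1-bits. Neither assumption is stated on the slides, and the function names are illustrative.

```python
# Assumptions (not from the slides): bits flip independently, and the
# example word has 16 bits with four 1-bits. Error rates are the ones quoted above.

def prob_exact_readout(n_ones, n_zeros, p10=0.10, p01=0.04):
    """Probability that every bit of a codeword is read correctly."""
    return (1 - p10) ** n_ones * (1 - p01) ** n_zeros


def prob_single_error(n_ones, n_zeros, p10=0.10, p01=0.04):
    """Probability that exactly one bit of a codeword is misread."""
    # Exactly one of the 1-bits drops out, all other bits are correct.
    one_dropout = n_ones * p10 * (1 - p10) ** (n_ones - 1) * (1 - p01) ** n_zeros
    # Exactly one of the 0-bits lights up, all other bits are correct.
    one_false_on = n_zeros * p01 * (1 - p01) ** (n_zeros - 1) * (1 - p10) ** n_ones
    return one_dropout + one_false_on


if __name__ == "__main__":
    exact = prob_exact_readout(4, 12)
    single = prob_single_error(4, 12)
    print(f"exact readout: {exact:.3f}, exactly one error: {single:.3f}")
```

Under these assumptions, a large share of readouts contains at least one wrong bit, which is why raw counts alone are unreliable.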

 

Goal:

  • Bayesian model on top of error probabilities
  • estimate gene/transcript expression
  • "Bayesian"-style differential expression analysis

Approach

Urn model for expression likelihoods

Bayesian model for differential expression

Example results

Results

Simulation

[Figure: simulated hybridization; panels Pr(0→1) and Pr(1→0)]

Bayesian model recovers biased counts


Application

Characterize batch effects in real data

Published dataset:

  • ~200 fibroblast cells
  • 7 batches
  • same biological condition

 

Question:

Are expression profiles influenced by batch effects?

t-SNE analysis

[Figure: t-SNE panels: cell size, cell position, batch]

Approach

  • calculate posterior estimate of coefficient of variation (CV) between means of batches
  • null model: CV < 0.5
  • control expected FDR at 5%
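
The three steps above can be sketched as follows. This is a minimal illustration, not the implementation behind the talk. It assumes that for each gene a posterior probability of the null model (CV < 0.5) is already available, for example as the fraction of posterior samples with CV below 0.5. The function names and the use of the sample standard deviation are assumptions.

```python
import numpy as np


def coefficient_of_variation(batch_means):
    """CV between batch means: standard deviation divided by mean.

    Uses the sample standard deviation (ddof=1), which is an assumption.
    """
    means = np.asarray(batch_means, dtype=float)
    return means.std(ddof=1) / means.mean()


def select_at_expected_fdr(null_posteriors, alpha=0.05):
    """Return indices of the largest gene set whose expected FDR is <= alpha.

    null_posteriors[i] is the posterior probability that gene i follows the
    null model (here: CV < 0.5). The expected FDR of a selected set is the
    mean of these probabilities over the set.
    """
    pep = np.asarray(null_posteriors, dtype=float)
    order = np.argsort(pep)  # most confident calls first
    running_fdr = np.cumsum(pep[order]) / np.arange(1, len(pep) + 1)
    # The running mean of a sorted sequence never decreases, so counting
    # the entries below alpha yields the largest admissible prefix.
    n_selected = int(np.sum(running_fdr <= alpha))
    return np.sort(order[:n_selected])


if __name__ == "__main__":
    print(coefficient_of_variation([10.0, 12.0, 30.0]))
    print(select_at_expected_fdr([0.9, 0.01, 0.2, 0.02]))
```

Averaging the null posteriors over the selected set gives the expected proportion of false discoveries, so no separate multiple-testing correction is needed in this step.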

Differentially expressed genes

Gene ontology enrichment

Term                                                              expected  observed  size
response to temperature stimulus                                  0.36      3         6
cellular response to heat                                         0.36      3         6
negative regulation of endopeptidase activity                     0.12      2         2
second-messenger-mediated signaling                               0.12      2         2
positive regulation of protein kinase B signaling                 0.12      2         2
regulation of peptidase activity                                  0.12      2         2
positive regulation of reactive oxygen species metabolic process  0.12      2         2

FDR controlled at 5% with Benjamini-Yekutieli
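
For reference, the Benjamini-Yekutieli step-up procedure can be sketched as below. This is a generic NumPy illustration of the procedure, not the enrichment tooling used for the table. The function name and the example p-values are hypothetical.

```python
import numpy as np


def benjamini_yekutieli(pvalues, alpha=0.05):
    """Boolean rejection mask under the Benjamini-Yekutieli procedure.

    Controls the FDR at level alpha under arbitrary dependence between tests.
    """
    p = np.asarray(pvalues, dtype=float)
    m = len(p)
    # Harmonic-sum correction that distinguishes BY from Benjamini-Hochberg.
    c_m = np.sum(1.0 / np.arange(1, m + 1))
    order = np.argsort(p)
    thresholds = np.arange(1, m + 1) * alpha / (m * c_m)
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        # Step-up rule: reject everything up to the largest passing rank.
        k = int(np.max(np.nonzero(passed)[0]))
        reject[order[: k + 1]] = True
    return reject


if __name__ == "__main__":
    # Hypothetical p-values, for illustration only.
    print(benjamini_yekutieli([0.001, 0.5, 0.9]))
```

Benjamini-Yekutieli is more conservative than Benjamini-Hochberg, but it stays valid when tests are dependent, as is typical for overlapping gene ontology terms.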

Conclusion

Bayesian model for gene expression analysis on MERFISH data:

  • provides estimates of expression, fold change, and coefficient of variation
  • credible intervals, expected FDR

 

Simulation:

can correct for biases in MERFISH data

 

Application:

applied to characterize batch effects

Acknowledgements

Shirley Liu

Myles Brown

Bo Li

Peng Jiang

Eric Severson
