Machine learning and metagenomics

Adam R. Rivers

Cross-JGI informatics talk

October 1, 2015

The Netflix challenge

  • 100,480,507 ratings
  • 480,189 users
  • 17,770 movies
  • Most users rate only 200 movies
  • 10% improvement = $1,000,000

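[Figure: ratings matrix for three example users (DNAnerd, MrsDarcy85, ARivers_JGI). Known ratings are used to train a model, which then predicts the missing rating.]

Ratings shown in the figure:

5 4 2 5
4 5 1 2
2 1 5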
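The train-then-predict idea on this slide can be sketched with a small matrix factorization, one common approach to rating prediction (the slide does not say which method was meant). This is a minimal sketch under assumptions: the rows are taken to follow the order of the user names, the missing rating is assumed to sit in the last column of the third row, and `factorize` and `predict` are hypothetical helper names, not part of any library.

```python
import random

# Ratings copied from the slide; None marks the rating to predict.
# Assumption: row order matches the user names, and the gap is in the
# last column of the third row.
ratings = {
    "DNAnerd":     [5, 4, 2, 5],
    "MrsDarcy85":  [4, 5, 1, 2],
    "ARivers_JGI": [2, 1, 5, None],
}

def factorize(ratings, n_factors=2, lr=0.01, reg=0.05, epochs=5000, seed=0):
    """Learn user and movie factor vectors by stochastic gradient descent."""
    rng = random.Random(seed)
    users = list(ratings)
    n_items = len(next(iter(ratings.values())))
    P = {u: [rng.uniform(0.1, 1.0) for _ in range(n_factors)] for u in users}
    Q = [[rng.uniform(0.1, 1.0) for _ in range(n_factors)] for _ in range(n_items)]
    # Train only on the ratings that are actually observed.
    observed = [(u, i, r) for u in users
                for i, r in enumerate(ratings[u]) if r is not None]
    for _ in range(epochs):
        for u, i, r in observed:
            pred = sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            err = r - pred
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, user, item):
    """Dot product of the factor vectors, clipped to the 1-5 rating scale."""
    raw = sum(pf * qf for pf, qf in zip(P[user], Q[item]))
    return min(5.0, max(1.0, raw))

P, Q = factorize(ratings)
print("Predicted rating:", round(predict(P, Q, "ARivers_JGI", 3), 2))
```

With only three users the prediction is illustrative at best. The same idea becomes useful at the scale of the real data set, where most of the user-by-movie matrix is empty.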

What is machine learning?

Useful when:

  • A mechanistic model cannot be built
  • The number of random variables is large
  • The amount of training data is large

Evaluated with formal methods

A group of statistical methods historically associated with CS/AI for:

  • Classification
  • Prediction
  • Clustering
  • Dimensionality reduction

 

Machine learning applications

Supervised

Unsupervised
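The difference between the two categories can be sketched on sequence data. The example below is a minimal illustration, not the method used in the talk: it represents each sequence by its dinucleotide frequencies, then either trains a nearest-centroid classifier from labels (supervised) or groups the same vectors with a two-cluster k-means that never sees the labels (unsupervised). The sequences, labels, and function names are all invented for illustration.

```python
from itertools import product
from math import dist

KMERS = ["".join(p) for p in product("ACGT", repeat=2)]

def kmer_freqs(seq, k=2):
    """Return the dinucleotide frequency vector of a sequence."""
    counts = {km: 0 for km in KMERS}
    for i in range(len(seq) - k + 1):
        km = seq[i:i + k]
        if km in counts:
            counts[km] += 1
    total = sum(counts.values()) or 1
    return [counts[km] / total for km in KMERS]

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

# Toy sequences invented for this sketch (not real data).
labeled = [
    ("ATATTTAATTAAATATTA", "AT-rich"),
    ("TTAATATAAATTTATAAT", "AT-rich"),
    ("GCGGCCGCGGGCCGCGCC", "GC-rich"),
    ("CCGCGGCGCCGGGCGCGG", "GC-rich"),
]

# Supervised: labels are used to build one centroid per class.
def train_nearest_centroid(examples):
    by_label = {}
    for seq, label in examples:
        by_label.setdefault(label, []).append(kmer_freqs(seq))
    return {label: centroid(vs) for label, vs in by_label.items()}

def classify(model, seq):
    x = kmer_freqs(seq)
    return min(model, key=lambda label: dist(x, model[label]))

# Unsupervised: two-cluster k-means that never sees the labels.
def two_means(vectors, iterations=10):
    centers = [vectors[0], vectors[-1]]  # simple deterministic start
    assign = []
    for _ in range(iterations):
        assign = [min((0, 1), key=lambda c: dist(v, centers[c]))
                  for v in vectors]
        for c in (0, 1):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centers[c] = centroid(members)
    return assign

model = train_nearest_centroid(labeled)
print(classify(model, "GGCCGCGCGGCC"))
print(two_means([kmer_freqs(seq) for seq, _ in labeled]))
```

The supervised path returns a named class for a new sequence, while the unsupervised path returns only cluster indices that still have to be interpreted.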

Finding highly divergent RNA viruses
