iTag sequencing

Adam R. Rivers

Microbial genomics and metagenomics workshop

April 2016

DW Waite, P Deines, MW Taylor. 2012. Gut Microbiome of the Critically Endangered New Zealand Parrot, the Kakapo (Strigops habroptilus). PLOS One. v. 7, p. e35803.

Gustav Klimt, "Tree of life", 1909,  Wikimedia

Amplicon sequencing at JGI

Bacterial and Archaeal 16S rRNA: V4-V5 region

Eukaryotic 18S rRNA: V4 region

Fungal Internal Transcribed Spacer (ITS): ITS2 region, between the 5.8S and 28S rRNA genes

Questions iTags answer most often

  • Taxonomic composition
  • Sample similarity
  • Alpha diversity
  • Beta diversity
  • Community structure
  • Imputed function
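To make the alpha and beta diversity items concrete, here is a minimal sketch of two commonly used measures: the Shannon index for within-sample diversity and Bray-Curtis dissimilarity for between-sample comparison. The OTU counts are hypothetical, and the slides do not say which metrics are reported, so treat the choice of measures as an assumption.

```python
import math


def shannon(counts):
    """Shannon index H' (natural log) from a list of OTU counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same OTUs."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * shared / (sum(a) + sum(b))


# Hypothetical OTU counts for two samples (same OTU order in both).
sample_a = [50, 30, 20, 0]
sample_b = [10, 10, 40, 40]

print(round(shannon(sample_a), 3))                 # alpha diversity of sample A
print(round(bray_curtis(sample_a, sample_b), 3))   # beta diversity, A vs. B
```

Bray-Curtis is 0 for identical samples and 1 for samples that share no OTUs. Both measures are sensitive to sequencing depth, so counts are usually normalized before comparison.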

iTags sequencing is transitioning from descriptive to experimental applications

Langille et al. 2013

An overview of iTags sequencing at JGI

10,000 samples in 2016

4 billion iTags reads per year

  • Production scale sequencing
  • An automated QC system
  • The iTagger automated analysis system
  • Archival data storage 
  • Public web portal

Library creation and sequencing

Two 96-well plates

PCR with 1 forward and 196 reverse primers

  • Illumina MiSeq
  • 2 × 300 bp
  • 45-50M reads
  • 350,000 reads per sample

Multiplexing

Illumina iTags primer layout

Forward primer 515Y-F (5' to 3'):

AATGATACGGCGACCACCGAGATCTACAC TCTTTCCCTACA CGACGCTCTTCCGATCT NNNNN GTGYCAGCMGCCGCGGTAA

Reverse primer 926-R (5' to 3'):

AGCAGAAGACGGCATACGAGAT AGCGAACCTGTT GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT CTA CCGYCAATTYMTTTRAGTTT

Segment labels in the figure: Illumina adapter, Barcode, Spacer, Primer, Linker
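As an illustration of how barcoded reverse primers allow multiplexing, the sketch below assigns an index read to a sample by barcode lookup, tolerating one mismatch. It assumes the 12-mer AGCGAACCTGTT in the reverse primer is the sample barcode. The second barcode, the sample names, and the mismatch tolerance are hypothetical, and this is not the demultiplexing software used in production.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, barcode_to_sample, max_mismatches=1):
    """Return the only sample whose barcode is within max_mismatches, else None."""
    hits = [
        sample
        for barcode, sample in barcode_to_sample.items()
        if len(barcode) == len(index_read)
        and hamming(barcode, index_read) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


# The first barcode is the 12-mer from the reverse primer shown above;
# the second barcode and both sample names are hypothetical.
barcodes = {
    "AGCGAACCTGTT": "sample_01",
    "TTCAGGTCCAAG": "sample_02",
}

print(assign_sample("AGCGAACCTGTT", barcodes))  # exact match
print(assign_sample("AGCGAACCTGTA", barcodes))  # one mismatch, still assigned
print(assign_sample("GGGGGGGGGGGG", barcodes))  # no barcode is close enough
```

Depending on the read orientation, a real pipeline may need to compare against the reverse complement of the barcode; that detail is left out here.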

Current primers used

Bacterial and Archaeal 16S rRNA: V4-V5 region

Eukaryotic 18S rRNA: V4 region

Fungal Internal Transcribed Spacer (ITS): ITS2 region, between the 5.8S and 28S rRNA genes

Primer   Sequence (5' to 3')     Reference
515YF    GTGYCAGCMGCCGCGGTAA     Parada et al. 2015
926R     CCGYCAATTYMTTTRAGTTT    Parada et al. 2015
565F     CCAGCASCYGCGGTAATTCC    Stoeck et al. 2010
948R     ACTTTCGTTCTTGATYRA      Stoeck et al. 2010
ITS9F    GAACGCAGCRAAIIGYGA      Menkis et al. 2012
ITS4R    TCCTCCGCTTATTGATATGC    White et al. 1990
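The primer sequences above use IUPAC degenerate codes (Y, M, R, S), so each primer is really a small pool of oligos. The sketch below turns a degenerate primer into a regular expression, counts how many distinct sequences it represents, and trims it from the start of a read. Treating inosine (I, as in ITS9F) as matching any base is an assumption made for illustration, and the function names are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "ACGT",  # assumption: inosine treated as pairing with any base
}


def primer_regex(primer):
    """Compile a regex that matches every sequence the primer can represent."""
    parts = []
    for base in primer:
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))


def degeneracy(primer):
    """Number of distinct non-degenerate sequences encoded by the primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


def trim_primer(read, primer):
    """Return the read with a leading primer removed, or None if absent."""
    match = primer_regex(primer).match(read)
    return read[match.end():] if match else None


print(degeneracy("GTGYCAGCMGCCGCGGTAA"))    # 515YF: Y and M are 2-fold each
print(degeneracy("CCGYCAATTYMTTTRAGTTT"))   # 926R: Y, Y, M, R
print(trim_primer("GTGCCAGCAGCCGCGGTAAACGT", "GTGYCAGCMGCCGCGGTAA"))
```

This exact-match trimming does not tolerate sequencing errors inside the primer; production tools usually allow a few mismatches.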

Updates to iTags sequencing

  • V4-V5 region  515YF - 926R  (Parada et al. 2015)
  • Change to new annotation databases:
    • Silva version 123
    • UNITE version 7.0
    • Automated database updates
  • Major version release of iTagger 2.0
    • Simplified code base 
    • OTU clustering with USEARCH 8.1 UPARSE (Edgar 2013)
    • Taxonomic assignment with UTAX rather than RDP
    • Core diversity analysis with QIIME 1.9.1
    • A central database of OTUs for comparative research

 

 

Locked primers for symbiosis

In host-dominated communities, host DNA predominates

iTagger: Read quality control

Read QC

OTU Clustering

Taxonomic analysis

  1. Remove contaminants (JGI RQC)
  2. Merge reads (USEARCH)
  3. Primer trim (USEARCH)
  4. Size and expected error filtering
  5. Dereplication

Total reads                  40M
PhiX                         0.3%
Illumina artifacts           0.1%
Primer trimmed               5%
Mate pair extended           2%
Length filtered              <0.1%
Final quality filtering      15%
Final reads                  26M
Average seqs per sample      364k
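Steps 4 and 5 can be sketched in a few lines. The expected number of errors in a read is the sum of its per-base error probabilities, each derived from the Phred score Q as 10^(-Q/10); dereplication then collapses identical sequences and records their abundance. The thresholds and reads below are hypothetical, not iTagger's actual settings.

```python
from collections import Counter


def expected_errors(quals):
    """Sum of per-base error probabilities from Phred quality scores."""
    return sum(10 ** (-q / 10) for q in quals)


def passes_filter(seq, quals, min_len, max_len, max_ee):
    """Keep a read only if its length and expected errors are within bounds."""
    return min_len <= len(seq) <= max_len and expected_errors(quals) <= max_ee


def dereplicate(seqs):
    """Collapse identical sequences into (sequence, count), most abundant first."""
    return Counter(seqs).most_common()


# Hypothetical merged reads as (sequence, Phred scores).
reads = [
    ("ACGTACGT", [30] * 8),
    ("ACGTACGT", [30] * 8),
    ("ACGTACGA", [30] * 8),
    ("ACGTACGG", [10] * 8),  # low quality: expected errors = 0.8
    ("ACG", [40] * 3),       # too short
]

kept = [s for s, q in reads if passes_filter(s, q, min_len=5, max_len=10, max_ee=0.5)]
print(dereplicate(kept))
```

Sorting unique sequences by abundance at this stage is convenient because the clustering step consumes them in that order.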

iTagger: OTU clustering

  • Reads are sorted by abundance
  • USEARCH UPARSE clusters the reads iteratively, at 99% to 97% identity

Edgar (2013)
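The idea of abundance-sorted greedy clustering can be sketched as follows. This is a simplified illustration, not UPARSE itself: it uses a position-by-position identity on equal-length sequences instead of a real alignment, takes the first matching centroid rather than the best one, clusters in a single pass at 97%, and omits chimera detection. All sequences and abundances are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions; a crude stand-in for alignment identity."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(unique_seqs, threshold=0.97):
    """Cluster (sequence, abundance) pairs, most abundant first.

    Each sequence joins the first existing centroid it matches at or above
    the threshold; otherwise it becomes the centroid of a new OTU.
    """
    otus = []
    for seq, count in sorted(unique_seqs, key=lambda sc: sc[1], reverse=True):
        for otu in otus:
            if identity(seq, otu["centroid"]) >= threshold:
                otu["size"] += count
                break
        else:
            otus.append({"centroid": seq, "size": count})
    return otus


# Hypothetical 40-nt unique sequences with abundances.
base = "ACGT" * 10
one_mismatch = base[:-1] + "A"    # 39/40 = 97.5% identical to base
two_mismatches = "TT" + base[2:]  # 38/40 = 95.0% identical to base

for otu in greedy_cluster([(one_mismatch, 5), (base, 100), (two_mismatches, 20)]):
    print(otu["size"], otu["centroid"])
```

Processing the most abundant sequences first means centroids tend to be true biological sequences, while rarer, error-containing reads are absorbed into them.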

iTagger: Taxonomic analysis

The UTAX classifier assigns taxonomy to OTU centroids using 8-mers and a principled training model
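To show what classifying with 8-mers means in practice, the toy sketch below assigns a query to the reference that shares the largest fraction of its 8-mers. It is not the UTAX algorithm: there is no trained model and no confidence estimate, and the reference sequences and taxon names are hypothetical.

```python
def kmers(seq, k=8):
    """Set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify(query, references, k=8):
    """Return (taxon, score) for the reference sharing the most query k-mers.

    The score is the fraction of the query's k-mers found in that reference.
    """
    query_kmers = kmers(query, k)
    best_taxon, best_score = None, 0.0
    for taxon, ref_seq in references.items():
        if not query_kmers:
            break
        score = len(query_kmers & kmers(ref_seq, k)) / len(query_kmers)
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon, best_score


# Hypothetical reference sequences keyed by hypothetical taxon names.
references = {
    "Taxon_A": "ACGTACGGTTCAGCATGCAAGTCGA",
    "Taxon_B": "TTGACCGGATTAGCTAGTTGGTGAG",
}

# The query differs from the Taxon_A reference only at its last base.
taxon, score = classify("ACGTACGGTTCAGCATGCAAGTCGT", references)
print(taxon, round(score, 3))
```

A single mismatch changes every k-mer that overlaps it, which is why k-mer similarity drops quickly as sequences diverge.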

Databases

16S: Silva v. 123.1

18S: Silva v. 123.1

ITS2: UNITE v. 7.0

Precision

Error precision curve for V3-V5

Summary plots  (QIIME)

Making connections across data

JGI has one of the largest 16S repositories. We have created a database to record and compare V4-V5 OTUs across projects.

We are investigating adding visualization tools to explore iTag data.

Bik et al.  2014 (preprint)

Genome portal demonstration

Going further with iTags data

Future directions: Full length reads

Future directions: Ecological networks

A Bayesian network constructed from 350 wetland samples, estimated with bnlearn

Directed links indicate conditional dependence

Tringe and Theroux, unpublished

Questions?

Adam R. Rivers

arrivers@lbl.gov
