White Tailed Deer Genome Assembly

Theodore B. Davis, Richard D. Morgan, Bradley W. Langhorst

New England Biolabs

Motivation

Is it possible for a small group of scientists to sequence and assemble a large eukaryotic genome as an evening and weekend project?

 

Goals:

  • Produce a useful scientific resource
  • Gain first-hand experience with the latest methods
  • Fun project

Why Deer?

Odocoileus virginianus borealis

  • Interesting biology (prion disease, tick-borne illnesses, malaria host[1])
  • Likely to be inbred due to extensive historical hunting and constrained habitat
  • Minimal sequence data in public databases
  • Significant interest in the species ($10B spent annually)

[1] Martinsen, E. S., McInerney, N., Brightman, H., Ferebee, K., Walsh, T., McShea, W. J., … Fleischer, R. C. (2016). Hidden in plain sight: Cryptic and endemic malaria parasites in North American white-tailed deer (Odocoileus virginianus). Science Advances, (February), 1–8.

Assembly

DNA Libraries

  • PacBio
    • Proteinase K + Phenol-Chloroform
    • Modified 20 kb Template Preparation protocol
    • 283 SMRT Cells, 10M filtered reads
  • Illumina
    • Zymo Quick DNA Kit
    • NEBNext Ultra II DNA Library Prep
    • 6 cycles of PCR
    • 1.5B NextSeq Reads (2x150 bp)
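A rough depth estimate for the Illumina library can be sketched as total sequenced bases divided by genome size. This is only a back-of-the-envelope sketch: it assumes "1.5B" counts individual 150 bp reads (not read pairs) and uses the 2.5 Gb assembled size as the genome size. The function name `fold_coverage` is hypothetical.

```python
def fold_coverage(n_reads: float, read_length_bp: float, genome_size_bp: float) -> float:
    """Expected average depth: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


# Assumption: 1.5e9 individual reads of 150 bp each, against a 2.5 Gb genome.
illumina_depth = fold_coverage(1.5e9, 150, 2.5e9)
print(f"Illumina depth ~ {illumina_depth:.0f}X")
```

If "1.5B" instead counts read pairs, the estimate doubles.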

Assembly Strategy

  • PacBio RSII Reads
    • FALCON assembler
    • Quiver
  • Illumina
    • Pilon
    • REAPR

Read Lengths

Contig Lengths

Figure panels: 20X, 28X, and 35X coverage

Results

  • N50: 6 Mb
  • Number of contigs: 3,449
  • Longest contig: 33 Mb
  • Genome size (in contigs): 2.5 Gb
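For readers unfamiliar with N50: it is the contig length L such that contigs of length L or longer contain at least half of all assembled bases. A minimal sketch of the calculation follows; the contig lengths in the example are toy values, not data from this assembly.

```python
def n50(lengths):
    """Return the length L such that contigs >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # toy lengths, 32 bases in total
print(n50(toy_contigs))  # the two longest contigs (8 + 8) reach half of 32
```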

Annotation

RNA-Seq Library

 

  • Muscle tissue
  • Proteinase K + Qiagen RNA columns
  • NEBNext rRNA depletion
  • Ultra II RNA Library Prep
  • 470M NextSeq Reads (2x150 bp)

Annotation plan

  • De novo transcriptome
    • Trinity on a small sample of reads
    • Eliminate reads from the most abundant transcripts
      • (Salmon, top 10 -> mirabait)
    • Trinity on the remaining "interesting" reads
  • Genome alignment
    • HISAT2
    • StringTie
  • Annotation
    • PASA
    • AUGUSTUS
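The "top 10" selection in the plan above could look something like the sketch below. Salmon writes a tab-separated `quant.sf` file with the columns Name, Length, EffectiveLength, TPM, and NumReads. Ranking by TPM is an assumption, since the slides do not say which metric was used. The helper name `top_transcripts` and the transcript names are hypothetical, and the follow-on mirabait filtering step is not shown.

```python
import csv
import io


def top_transcripts(quant_sf_text, n=10):
    """Return the names of the n transcripts with the highest TPM in a Salmon quant.sf table."""
    reader = csv.DictReader(io.StringIO(quant_sf_text), delimiter="\t")
    rows = sorted(reader, key=lambda row: float(row["TPM"]), reverse=True)
    return [row["Name"] for row in rows[:n]]


# Toy quant.sf content with made-up transcripts (not data from this project).
example = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx_a\t1200\t1000.0\t50.0\t500\n"
    "tx_b\t800\t600.0\t5.0\t30\n"
    "tx_c\t2000\t1800.0\t900.0\t16000\n"
)
print(top_transcripts(example, n=2))
```

The resulting names would then select the sequences used as bait to pull the dominant reads out before the second Trinity run.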

Abundant Transcripts

Acknowledgments

Ted Davis - Initiative, Tissue, DNA Isolation, Illumina libraries

Rick Morgan - DNA Isolation, PacBio libraries

Brad Langhorst - Assembly, Annotation, RNA isolation, RNA-seq library

Josh McCoy - DNA/RNA source tissue

Images: https://www.flickr.com/photos/bullfrogphoto/2374256270

https://www.flickr.com/photos/lsmith2010/5950507945

https://www.flickr.com/photos/avexbhphotos/12332751825

Copy of White Tailed Deer Genome - ISCB NGS 2016

By theodorebdavis
