White-Tailed Deer Genome Assembly
Theodore B. Davis, Richard D. Morgan, Bradley W. Langhorst
New England Biolabs
Motivation
Is it possible for a small group of scientists to sequence and assemble a large eukaryotic genome as an evening and weekend project?
Goals:
- Produce a useful scientific resource
- Gain first-hand experience with the latest methods
- Fun project
Why Deer?
Odocoileus virginianus borealis
- Interesting biology (prion disease, tick-borne illnesses, malaria host[1])
- Likely to be inbred due to extensive historical hunting and constrained habitat
- Minimal sequence data in public databases
- Significant interest in species ($10B spent annually)
[1] Martinsen, E. S., McInerney, N., Brightman, H., Ferebee, K., Walsh, T., McShea, W. J., … Fleischer, R. C. (2016). Hidden in plain sight: Cryptic and endemic malaria parasites in North American white-tailed deer (Odocoileus virginianus). Science Advances, (February), 1–8.
Assembly
DNA Libraries
- PacBio
  - Proteinase K + Phenol-Chloroform
  - Modified 20 kb Template Preparation protocol
  - 283 SMRT Cells, 10M filtered reads
- Illumina
  - Zymo Quick DNA Kit
  - NEBNext Ultra II DNA Library Prep
  - 6x PCR
  - 1.5B NextSeq Reads (2x150 bp)
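The read counts above translate into fold coverage by dividing total sequenced bases by genome size. The sketch below is only an illustration of that arithmetic: it assumes "1.5B" counts individual 150 bp reads rather than read pairs (the slide does not say which), and it borrows the 2.5 Gb assembled genome size from the Results slide. The function name `estimated_coverage` is made up for this example.

```python
def estimated_coverage(num_reads, read_length_bp, genome_size_bp):
    """Return expected fold coverage: total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return num_reads * read_length_bp / genome_size_bp


# Assumption: 1.5e9 individual reads of 150 bp each, 2.5e9 bp genome.
# If 1.5B instead counts read pairs, the result doubles.
illumina_x = estimated_coverage(1.5e9, 150, 2.5e9)
print(f"Illumina: ~{illumina_x:.0f}X")  # Illumina: ~90X
```

The same function applies to the PacBio data once a mean filtered read length is known; that figure is not given here, so it is left out.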
Assembly Strategy
- PacBio RSII Reads
  - FALCON Assembler
  - Quiver
- Illumina
  - Pilon
  - REAPR
Read Lengths
Contig Lengths (panels: 20X, 28X, 35X)
Results
N50 | 6 Mb |
# of Contigs | 3449 |
Longest Contig | 33 Mb |
Genome size (in contigs) | 2.5 Gb |
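For readers unfamiliar with the N50 statistic reported above: it is the contig length L such that contigs of length L or longer contain at least half of all assembled bases. A minimal sketch of that definition follows; the function name `n50` and the toy contig lengths are invented for illustration and are not the deer assembly data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Made-up contig lengths (total = 300); 80 + 70 = 150 reaches half.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70
```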
Annotation
RNA-Seq Library
- Muscle tissue
- Proteinase K + Qiagen RNA columns
- NEBNext rRNA depletion
- Ultra II RNA Library Prep
- 470M NextSeq Reads (2x150 bp)
Annotation plan
- De novo Transcriptome
  - Small Sample Trinity
  - Eliminate most abundant transcripts (Salmon, top 10 -> Mirabait)
  - Trinity on "interesting" reads
- Genome alignment
  - HISAT2
  - StringTie
- Annotation
  - PASA
  - AUGUSTUS
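The "eliminate most abundant" step in the plan above starts by picking the top 10 transcripts from Salmon's quantification. A possible sketch of that selection is shown below. It relies on the standard tab-separated `quant.sf` layout (columns `Name`, `Length`, `EffectiveLength`, `TPM`, `NumReads`); ranking by TPM is an assumption, since the slide does not say which column was used. The function name `top_abundant` and the transcript names are hypothetical, and the subsequent Mirabait read filtering is not shown.

```python
import csv
import io


def top_abundant(quant_sf, n=10):
    """Return names of the n transcripts with the highest TPM.

    quant_sf is any iterable of lines in Salmon's quant.sf format
    (an open file handle works).
    """
    reader = csv.DictReader(quant_sf, delimiter="\t")
    rows = sorted(reader, key=lambda r: float(r["TPM"]), reverse=True)
    return [r["Name"] for r in rows[:n]]


# Toy quant.sf content with invented transcripts, for illustration only.
toy = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx_a\t1200\t1000.0\t5.0\t50\n"
    "tx_b\t900\t700.0\t900000.0\t9000000\n"
    "tx_c\t1500\t1300.0\t120.5\t1500\n"
)
print(top_abundant(toy, n=2))  # ['tx_b', 'tx_c']
```

The returned names could then be used to pull the matching sequences from the small-sample Trinity output to serve as bait for read removal.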
Abundant Transcripts
Acknowledgments
Ted Davis - Initiative, Tissue, DNA Isolation, Illumina libraries
Rick Morgan - DNA Isolation, PacBio libraries
Brad Langhorst - Assembly, Annotation, RNA isolation, RNA-seq library
Josh McCoy - DNA/RNA Source tissue
Images: https://www.flickr.com/photos/bullfrogphoto/2374256270
https://www.flickr.com/photos/lsmith2010/5950507945
https://www.flickr.com/photos/avexbhphotos/12332751825
Copy of White Tailed Deer Genome - ISCB NGS 2016
By theodorebdavis