White-Tailed Deer Genome Assembly
Theodore B. Davis, Richard D. Morgan, Bradley W. Langhorst
New England Biolabs
Is it possible for a small group of people to sequence and assemble a large eukaryotic genome as an evening and weekend project?
- Produce a useful scientific resource
- Gain first-hand experience with the latest methods
- Fun project
Why Deer?
Odocoileus virginianus borealis
- Interesting biology (prion disease, tick-borne illnesses, malaria host[1])
- Likely to be inbred due to extensive historical hunting and constrained habitat
- Minimal sequence data in public databases
- Significant interest in species ($10B spent annually)
DNA Libraries
- PacBio
  - Proteinase K + Phenol-Chloroform
  - Modified 20 kb Template Preparation protocol
  - 283 SMRT Cells, 10M filtered reads
- Illumina
  - Zymo Quick DNA Kit
  - NEBNext Ultra II DNA Library Prep
  - 6x PCR
  - 1.5B NextSeq Reads (2x150 bp)
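As a rough sanity check on sequencing depth, the Illumina read count can be converted to fold coverage. This is a minimal sketch under stated assumptions: that "1.5B" counts individual reads rather than read pairs, that every read is the full 150 bp, and that the genome size is the 2.5 Gb reported in contigs elsewhere on this poster. The function name `fold_coverage` is hypothetical.

```python
def fold_coverage(n_reads, read_length_bp, genome_size_bp):
    """Raw fold coverage: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


# Assumptions (not stated on the poster): 1.5B is a count of single reads,
# not pairs, and no bases are lost to trimming or filtering.
illumina = fold_coverage(n_reads=1.5e9, read_length_bp=150, genome_size_bp=2.5e9)
print(f"Illumina raw coverage: ~{illumina:.0f}x")
```

If the 1.5B figure instead counts read pairs, the estimate doubles.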
Assembly Strategy
- PacBio RSII Reads
  - FALCON Assembler
  - Quiver
- Illumina
  - Pilon
  - REAPR
Read Lengths
Contig Lengths
N50 | 6 Mb |
Number of contigs | 3,449 |
Longest contig | 33 Mb |
Genome size (in contigs) | 2.5 Gb |
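For readers unfamiliar with the N50 statistic reported above: it is the contig length L such that contigs of length L or longer contain at least half of the assembled bases. A minimal sketch of the calculation follows; the function name `n50` and the toy lengths are illustrative only and are not data from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half:
            return length


# Invented toy lengths (total 300): 80 + 70 = 150 reaches half the total.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70
```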
RNA-Seq Library
- Muscle tissue
- Proteinase K + Qiagen RNA columns
- NEBNext rRNA depletion
- Ultra II RNA Library Prep
- 470M NextSeq Reads (2x150 bp)
Annotation plan
- De novo Transcriptome
  - Small Sample Trinity
  - Eliminate most abundant transcripts (Salmon, top 10 -> mirabait)
  - Trinity on "interesting" reads
- Genome alignment
  - HISAT2
  - StringTie
- Annotation
  - PASA
  - AUGUSTUS
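The "top 10" selection step in the de novo transcriptome plan could look something like the sketch below, which ranks transcripts in a Salmon `quant.sf` table by TPM. The poster does not say how the top 10 were chosen, so ranking by TPM is an assumption, as are the function name `top_transcripts` and the invented example rows. The resulting names would then identify the bait sequences handed to mirabait, which is not shown here.

```python
import csv
import io


def top_transcripts(quant_file, n=10):
    """Return the names of the n transcripts with the highest TPM.

    quant_file is an open text handle on a Salmon quant.sf file
    (tab-separated: Name, Length, EffectiveLength, TPM, NumReads).
    """
    reader = csv.DictReader(quant_file, delimiter="\t")
    rows = sorted(reader, key=lambda row: float(row["TPM"]), reverse=True)
    return [row["Name"] for row in rows[:n]]


# Invented example rows standing in for a real quant.sf file.
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx_a\t1200\t1050.0\t15.5\t300\n"
    "tx_b\t800\t650.0\t900000.0\t5000000\n"
    "tx_c\t2400\t2250.0\t420.0\t9000\n"
)
print(top_transcripts(example, n=2))  # ['tx_b', 'tx_c']
```

With a real run, the handle would come from `open("quant.sf")` in the Salmon output directory.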
Abundant Transcripts
Ted Davis - Initiative, Tissue, DNA Isolation, Illumina Libraries
Rick Morgan - DNA Isolation, PacBio libraries
Brad Langhorst - Assembly, Annotation, RNA isolation, RNA-seq library
NEB Management
NEB Sequencing: Joanna Bybee, Danielle Rivizzigno, Laurie Mazzola
