Genomic epidemiology during a pandemic: lessons learned

Fred Hutch Cancer Research Center, Seattle, Washington, USA
Chan Zuckerberg Initiative & CZBiohub, San Francisco, California, USA

Sidney M. Bell (@sidneymbell)

Introduction

  • Genomic epidemiology
     
  • Nextstrain
     
  • COVIDTracker

Learnings

  • Open-source development
     
  • Data management
     
  • Partnership building

[Figure: example sequences ACG, AGG, AGT]

Many viruses mutate and spread on similar time-scales

[Figure: example sequences ACG, AGG, AGT]

Genomic epidemiology contextualizes outbreaks and resolves ambiguous cases
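
As a rough illustration of how sequence differences help link cases, the sketch below counts mutations between the three toy sequences from the slide (ACG, AGG, AGT). The sample names are hypothetical, and real analyses use full aligned genomes and phylogenetic methods rather than raw pairwise counts.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# Toy sequences from the slide; the sample names are hypothetical.
samples = {"case_1": "ACG", "case_2": "AGG", "case_3": "AGT"}

# Pairwise mutation counts: fewer differences suggest closer relatedness.
pairwise = {
    (m, n): hamming_distance(samples[m], samples[n])
    for m, n in combinations(samples, 2)
}

for pair, distance in pairwise.items():
    print(pair, distance)
```

Here case_2 sits one mutation from each of the other two, which is consistent with (but does not prove) it being an intermediate link between them.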

Project to conduct real-time genomic epidemiology and evolutionary analysis of emerging epidemics

 

James Hadfield, Emma Hodcroft, Thomas Sibley, John Huddleston, Louise Moncla, Cassia Wagner, Miguel Paredes, Misja Ilcisin, Kairsten Fay, Jover Lee, Allison Black, Colin Megill, Barney Potter, Charlton Callender

Richard Neher

Trevor Bedford

Behind the scenes: Nextstrain for SARS-CoV-2

  • Labs contribute directly to GISAID (now have >63k full genomes)
  • Nextstrain pulls a complete dataset from GISAID every 24 hours
  • This triggers an automatic rebuild on Amazon Web Services
  • We manually add GPS coordinates and other metadata for new locations
  • We push this build online to nextstrain.org and tweet the update from @nextstrain
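
The manual curation step above could be supported by a small check like the one below. This is a minimal sketch, not the Nextstrain implementation: the function name, the `location` field, and the example rows are all assumptions made for illustration, and the coordinates are approximate.

```python
def locations_needing_coordinates(metadata_rows, known_coordinates):
    """Return sorted location names that appear in metadata but lack coordinates."""
    seen = {row["location"] for row in metadata_rows if row.get("location")}
    return sorted(seen - set(known_coordinates))


# Hypothetical metadata from a daily pull; field names are assumptions.
metadata_rows = [
    {"strain": "sample_1", "location": "Seattle"},
    {"strain": "sample_2", "location": "Kinshasa"},
    {"strain": "sample_3", "location": "Example County"},
    {"strain": "sample_4", "location": ""},  # missing location is skipped
]

# Approximate (latitude, longitude) pairs for already-curated locations.
known_coordinates = {
    "Seattle": (47.6, -122.3),
    "Kinshasa": (-4.3, 15.3),
}

todo = locations_needing_coordinates(metadata_rows, known_coordinates)
print(todo)
```

Running a check like this after each rebuild would give curators a short list of new locations to fill in by hand.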

Early warning: Washington state

2 day turnaround

24 languages
~500k weekly readers

Early warning: Kinshasa

SARS-CoV-2 is everywhere.

The fight is now (hyper)local.

COVIDTracker

Project Leads: Josh Batson, David Dynerman, Amy Kistler
Genomic Epidemiology: Patrick Ayscue, Sidney Bell
Sequencing: Michelle Tan, Angela Detweiler, Renuka Kumar, Lienna Chan, Lusajo Mwakibete, Karan Bhatt
Data & Engineering: Shannon Axelrod, Sam Hao, Danielle Kain, Phoenix Logan, Jack Kamm, Aaron McGeever, Angela Pisco, Jonathan Sheu, Tony Tung, James Webber, Mark Zhang

Project to build capacity for genomic epidemiology in Departments of Health across California

Coverage in California

Genomic epidemiology in practice

  • Resolve ambiguous contact tracing

  • Link outbreaks together

  • Identify introductions to county

Learnings:

open-source development, data management & partnership building

Developing open-source software during a pandemic

Sidebar: if you intend for your code to ever be run by someone else, you're a developer

To grow, you need an 'architecture of participation'

Reuters/Christian Hartmann

README

Contributing guidelines

Groomed issue board

Code of conduct!!

Clear process for PR review

Continuous integration will save your bacon

(or: the boring stuff no one wants to think about but will bite you in the tuckus if you don't)

Data management

Context is everything

Equity: we have work to do

Data sharing: meeting people where they're at

JSON with public data

CSV with private (PII/PHI) metadata
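
One possible reading of this slide is that private metadata stays in a local CSV and is overlaid onto the public JSON records only on the analyst's machine. The sketch below assumes that reading; the `strain` key, the field names, and the sample values are hypothetical.

```python
import csv
import io
import json

# Hypothetical public records (safe to share).
public_json = """[
  {"strain": "sample_A", "division": "California"},
  {"strain": "sample_B", "division": "California"}
]"""

# Hypothetical private metadata (PII/PHI), kept local and never published.
private_csv = "strain,facility\nsample_A,Facility 1\n"


def overlay_private_metadata(public_records, private_rows, key="strain"):
    """Return new records with private fields merged in; inputs are not modified."""
    private_by_key = {row[key]: row for row in private_rows}
    return [{**rec, **private_by_key.get(rec[key], {})} for rec in public_records]


public_records = json.loads(public_json)
private_rows = list(csv.DictReader(io.StringIO(private_csv)))
merged = overlay_private_metadata(public_records, private_rows)

for record in merged:
    print(record)
```

Because the merge returns new dictionaries, the public records can still be shared unchanged after the private overlay has been used locally.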

Linking data is a huge challenge

DOH ID, CZB ID, GISAID ID

Queryable, version-controlled, single source of truth
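
A single linking table that can be queried by any of the three identifiers might look like the sketch below. The field names and ID values are made up for illustration; the source does not describe the actual schema or storage.

```python
ID_FIELDS = ("doh_id", "czb_id", "gisaid_id")  # assumed column names


def build_id_index(rows):
    """Index linking rows by each ID type, rejecting duplicate IDs."""
    index = {field: {} for field in ID_FIELDS}
    for row in rows:
        for field in ID_FIELDS:
            value = row.get(field)
            if not value:
                continue  # e.g. a sample not yet submitted to GISAID
            if value in index[field]:
                raise ValueError(f"duplicate {field}: {value}")
            index[field][value] = row
    return index


def lookup(index, field, value):
    """Return the linking row for an ID, or None if it is unknown."""
    return index[field].get(value)


# Hypothetical linking table; in practice this would live in a
# version-controlled file or database rather than in code.
rows = [
    {"doh_id": "DOH-0001", "czb_id": "CZB-0001", "gisaid_id": "GISAID-0001"},
    {"doh_id": "DOH-0002", "czb_id": "CZB-0002", "gisaid_id": ""},
]

index = build_id_index(rows)
print(lookup(index, "czb_id", "CZB-0002"))
```

Failing loudly on duplicate IDs is a deliberate choice here: a silent overwrite would undermine the idea of a single source of truth.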

[Figure: sequences ACTG...]

Scripted, reproducible QC & disambiguation
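
A scripted QC step could be as simple as the sketch below, which flags sequences that are too short or contain too many ambiguous bases. The thresholds are hypothetical and scaled for toy sequences; they are not the values used by the project.

```python
VALID_BASES = set("ACGT")


def qc_sequence(seq, min_length=10, max_ambiguous_fraction=0.1):
    """Return length, ambiguous-base fraction, and a pass/fail flag for one sequence."""
    seq = seq.upper()
    if not seq:
        return {"length": 0, "ambiguous_fraction": 1.0, "passes": False}
    ambiguous = sum(base not in VALID_BASES for base in seq)
    fraction = ambiguous / len(seq)
    passes = len(seq) >= min_length and fraction <= max_ambiguous_fraction
    return {"length": len(seq), "ambiguous_fraction": fraction, "passes": passes}


# Toy sequences; sample names are hypothetical.
sequences = {
    "sample_ok": "ACTGACTGACTG",
    "sample_ambiguous": "ACTGNNNNNNNN",
    "sample_short": "ACTG",
}

report = {name: qc_sequence(seq) for name, seq in sequences.items()}
for name, result in report.items():
    print(name, result)
```

Keeping the thresholds as explicit parameters means every QC decision can be reproduced by rerunning the script with the same inputs.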


Partnerships & teams

Impact requires trusted collaborators

CLIAHub

Lightweight "process" helps get things done

Standup + Expectations + Code review

Work yourself out of a job

  Iteration, rest, appreciation  

Takeaways

  • Global progress is largely the sum of local progress
     
  • Genomic epidemiology can provide key context and disambiguation during outbreaks
     
  • Building an architecture of participation is an investment in project resilience
     
  • A single, version-controlled source of truth will simplify your life
     
  • "Process" can be lightweight and amplify efforts
     
  • Partnerships and capacity building are required for long-term impact

Acknowledgements

COVIDTracker Project Leads: Josh Batson, David Dynerman, Amy Kistler
Genomic Epidemiology: Patrick Ayscue, Sidney Bell
Sequencing: Michelle Tan, Angela Detweiler, Renuka Kumar, Lienna Chan, Lusajo Mwakibete, Karan Bhatt
Data & Engineering: Shannon Axelrod, Sam Hao, Danielle Kain, Phoenix Logan, Jack Kamm, Aaron McGeever, Angela Pisco, Jonathan Sheu, Tony Tung, James Webber, Mark Zhang

Nextstrain: Trevor Bedford, Richard Neher, James Hadfield, Emma Hodcroft, Thomas Sibley, John Huddleston, Louise Moncla, Cassia Wagner, Miguel Paredes, Misja Ilcisin, Kairsten Fay, Jover Lee, Allison Black, Colin Megill, Barney Potter, Charlton Callender

Questions?

@sidneymbell

slides.com/sidneymbell

sbe-coronavirtual-2020

By Sidney Bell
