Inês Mendes

The role of bioinformatics in clinical microbiology

MRamirez Lab - iMM João Lobo Antunes

Instituto Gulbenkian de Ciência

14 July 2022

World Health Organization Global Priority Pathogens list. Besides Mycobacterium tuberculosis, considered the number one global priority, this catalogue includes twelve microorganisms grouped into three priority tiers according to their antimicrobial resistance: critical (Acinetobacter baumannii, Pseudomonas aeruginosa and Enterobacteriaceae), high (Enterococcus faecium, Helicobacter pylori, Salmonella species, Staphylococcus aureus, Campylobacter species and Neisseria gonorrhoeae), and medium (Streptococcus pneumoniae, Haemophilus influenzae and Shigella species). The major objective was to encourage the prioritisation of funding and incentives, align research and development priorities of public health relevance, and garner global coordination in the fight against antimicrobial-resistant bacteria. Adapted from World Health Organization, 2017.

Clinical Microbiology

Bacterial Population Genetics

Pathogenesis and Natural History of Infection

Outbreak Investigation and Control

Surveillance of Infectious Diseases

Principles of current processing of bacterial pathogens. Schematic representation of the current workflow for processing samples for bacterial pathogens is presented, with high complexity and a typical timescale of a few weeks to a few months. Samples that are likely to be normally sterile are often cultured on rich medium that will support the growth of any culturable organism. Samples contaminated with colonising flora present a challenge for growing the infecting pathogen. Many types of culture media (referred to as selective media) are used to favour the growth of the suspected pathogen. Once an organism is growing, the likely pathogens are then processed through a complex pathway that has many contingencies to determine species and antimicrobial susceptibility. Broadly, there are two approaches. One approach uses MALDI-TOF for species identification prior to setting up susceptibility testing. The other uses Gram staining followed by biochemical testing to determine species; susceptibility testing is often set up simultaneously with doing biochemical tests. Lastly, depending on the species and perceived likelihood of an outbreak, a small subset of isolates may be chosen for further investigation using a wide range of typing tests. Adapted from Didelot et al., 2012

Joana Silva, MRamirez Lab

Principles of current processing of bacterial pathogens based on whole genome sequencing. Schematic representation of the workflow for processing samples for bacterial pathogens after the adoption of whole genome sequencing, with an expected timescale that could fit within a single day. The culture steps would be the same as currently used in a routine microbiology laboratory (see Figure 1.2). Once a likely pathogen is ready for sequencing, DNA is extracted, taking as little as 2 hours to prepare the DNA for sequencing. After sequencing, the main processes for yielding information are computational. Automated sequence assembly algorithms are necessary for processing the raw sequence data, from which species, relationship to other isolates of the same species, antimicrobial resistance profile and virulence gene content can be assessed. All the results can also be used for outbreak detection and infectious diseases surveillance. Adapted from Didelot et al., 2012.

https://www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Costs-Data

The three revolutions in sequencing technology that have transformed the landscape of bacterial genome sequencing. The first generation, also known as Sanger sequencers, is represented by the ABI Capillary Sequencer (Applied Biosystems). The second generation, also known as high-throughput sequencers, is represented by MiSeq, a 4-channel sequencer, and NextSeq, a 2-channel sequencer (Illumina), both sequencing-by-synthesis instruments. Lastly, the third generation, also known as long-read sequencers, is represented by the Pacific Biosciences RS sequencer and the Oxford Nanopore MinION sequencer.

Hypothetical workflow based on metagenomic sequencing. Schematic representation of the hypothetical workflow for the direct processing of samples from suspected sources of pathogens after adoption of metagenomic sequencing, with an expected timescale that could fit within a single day. Adapted from Didelot et al., 2012

https://eurofinsgenomics.eu/en/eurofins-genomics/material-and-methods/metagenome-analysis/

https://www.frontiersin.org/articles/10.3389/fpubh.2022.899077/full

Clinical presentation of the patient. (A) On admission to the ICU, the left lower limb showed moderate swelling, hemorrhagic blisters and bullae, marked pain with movement, and overlying erythema. (B) Three days after admission, the area of redness and swelling of the left lower limb had gradually spread to the root of the thigh and the perineum.

  • A previously healthy 39-year-old man presented with sudden pain in the left lower extremity. He developed septic shock, coagulation dysfunction, acute kidney dysfunction, acute respiratory distress syndrome, and acute liver injury.

 

  • The diagnosis was based on the clinical manifestations and on metagenomic next-generation sequencing (mNGS) of pustule and deep soft tissue (lower limb) samples, while all bacterial cultures came back negative.

 

  • The pustule mNGS report detected a total of 132 unique group A streptococcus sequence reads, representing 96.3% of microbial reads while the soft tissue mNGS report identified a total of 142474 unique group A streptococcus sequence reads, representing 100% of microbial reads.

https://www.frontiersin.org/articles/10.3389/fpubh.2022.899077/full

Bioinformatics

 | Wishful thinking

Magic box of NGS Wonders for Microbiology

Completely characterized strain:

  • Identification & Typing
  • Antibiotic resistance profile
  • Virulence factors present
  • Other information
    • spa (S. aureus)
    • emm (GAS)

Bioinformatics

 | The role in microbiology

Read mapping

  • Using a reference strain:
    • Outbreak determination
    • Comparative studies
  • Caveats:
    • Recombination/horizontal gene transfer
    • Bias towards reference/reference dependent

Read mapping software

reads

reference genome
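The read mapping idea on this slide (placing reads onto a reference genome) can be sketched in a few lines. This is a toy illustration only, assuming a seed-and-extend strategy with exact k-mer seeds and a mismatch budget; the sequences, the function names `build_kmer_index` and `map_read`, and the parameters are all hypothetical, and real read mappers use far more sophisticated indexes and alignment.

```python
from collections import defaultdict


def build_kmer_index(reference: str, k: int) -> dict:
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read: str, reference: str, index: dict, k: int, max_mismatches: int = 1):
    """Return (position, mismatches) for the best placement of the read, or None."""
    best = None
    # Seed: look up the first k bases of the read in the reference index.
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run off the end of the reference
        # Extend: count mismatches over the full length of the read.
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (pos, mismatches)
    return best


# Hypothetical reference and reads, for illustration only.
reference = "ATGGCGTACGTTAGCCTAGGATCCA"
reads = ["GCGTACGTTA", "TAGCCTCGGA", "TTTTTTTTTT"]

index = build_kmer_index(reference, k=4)
placements = {read: map_read(read, reference, index, k=4) for read in reads}
for read, hit in placements.items():
    print(read, hit)
```

The second read maps with one mismatch, which is the kind of difference a variant caller would report against the reference. The third read has no seed in the reference and stays unmapped, which illustrates the reference-dependence caveat listed above: sequence absent from the reference is simply not seen.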

https://nextstrain.org/ncov/gisaid/global/6m

https://www.who.int/publications/m/item/weekly-epidemiological-update-on-covid-19---1-june-2022

https://covariants.org/per-country?region=World

https://nextstrain.org/ncov/gisaid/global/6m

Bioinformatics

 | The role in microbiology

Gene-by-Gene

  • No need for a reference strain
  • "Discovery" of new features
  • Caveats:
    • Missing data

de novo assembly software

reads

contigs

annotation/ comparison

Two different approaches to genome assembly: (a) in Overlap, Layout, Consensus assembly, (i) overlaps are found between reads and an overlap graph constructed (edges indicate overlapping reads). (ii) Reads are laid out into contigs based on the overlaps (dashed lines indicate overlapping portions). (iii) The most likely sequence is chosen to construct consensus sequence. (b) In dBg assembly, (i) reads are decomposed into kmers by sliding a window of size k across the reads. (ii) The kmers become vertices in the dBg, with edges connecting overlapping kmers. Polymorphisms (red) form branches in the graph. A count is kept of how many times a kmer is seen, shown here as numbers above kmers. (iii) Contigs are built by walking the graph from edge nodes. A variety of heuristics handle branches in the graphs—for example, low coverage paths, as shown here, may be ignored.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7299287/
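The de Bruijn graph steps in the caption above (decompose reads into kmers, connect overlapping kmers, keep counts, ignore low-coverage paths, walk the graph) can be sketched as follows. This is a minimal toy under stated assumptions: the reads, the value of k, the coverage threshold of 2, and the function names are all hypothetical, and real assemblers handle reverse complements, repeats and branches with many more heuristics.

```python
from collections import Counter


def count_kmers(reads, k):
    """Slide a window of size k across every read and count each kmer."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def build_graph(kmers):
    """Kmers are vertices; an edge a -> b exists when a's last k-1 bases
    equal b's first k-1 bases."""
    return {a: sorted(b for b in kmers if a[1:] == b[:-1]) for a in kmers}


def walk_contig(graph, start):
    """Follow unambiguous edges from start, appending one base per step."""
    contig, node, seen = start, start, {start}
    while len(graph[node]) == 1:
        nxt = graph[node][0]
        if nxt in seen:
            break  # avoid looping forever on a cycle
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Hypothetical reads; the last one carries a sequencing error.
reads = ["ATGGCGT", "ATGGCGTA", "GCGTACC", "CGTACC", "TGGCTTAC"]
k = 4

counts = count_kmers(reads, k)
full_graph = build_graph(set(counts))
print(full_graph["TGGC"])  # the error creates a branch at this kmer

# Ignore low-coverage kmers (seen fewer than 2 times), then rebuild the graph.
solid = {kmer for kmer, n in counts.items() if n >= 2}
graph = build_graph(solid)

# Start walking from vertices that have no incoming edge.
targets = {b for successors in graph.values() for b in successors}
starts = sorted(node for node in graph if node not in targets)
contigs = [walk_contig(graph, s) for s in starts]
print(contigs)
```

Before filtering, the erroneous read creates a second outgoing edge from one kmer; after dropping kmers seen only once, the graph collapses to a single path and a single contig.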

A billion piece puzzle with no reference image

Bioinformatics

 | The role in microbiology

Multilocus Sequence Typing

Why stop here?

Bioinformatics

 | The role in microbiology

Whole-Genome/Core-Genome MLST

  • Profile
    File            Locus 1    Locus 2
    genome1.fasta   Allele 1   Allele 2
    genome2.fasta   Allele 1   Allele 6
  • Visualization

https://online.phyloviz.net/; https://doi.org/10.1093%2Fnar%2Fgkw359
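As a rough sketch of how allelic profiles like the one above are compared, the snippet below counts the loci at which two genomes carry different alleles. The profile values are taken from the example table; the function name `allelic_distance` and the convention of using `None` for a locus that was not called are assumptions made for illustration.

```python
def allelic_distance(profile_a: dict, profile_b: dict) -> int:
    """Count loci with different alleles, skipping loci missing in either profile."""
    shared = [
        locus for locus in profile_a
        if profile_a[locus] is not None and profile_b.get(locus) is not None
    ]
    return sum(profile_a[locus] != profile_b[locus] for locus in shared)


# Profiles from the example table: one allele number per locus.
profiles = {
    "genome1.fasta": {"Locus 1": 1, "Locus 2": 2},
    "genome2.fasta": {"Locus 1": 1, "Locus 2": 6},
}

distance = allelic_distance(profiles["genome1.fasta"], profiles["genome2.fasta"])
print(distance)  # the two genomes differ at Locus 2 only
```

Computing this distance for every pair of genomes gives the matrix that tools for visualization typically take as input. How missing loci are treated is a design choice; here they are simply ignored.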

Clinical Microbiology

 | Global Impact

https://www.who.int/emergencies/disease-outbreak-news/item/2022-DON369

  • On 27 March 2022, the United Kingdom notified WHO of a cluster of cases with Salmonella Typhimurium sequence type 34 infection.
  • Investigations linked the outbreak to chocolate produced in Belgium, which has been distributed to at least 113 countries.

Clinical Microbiology

 | Global Impact

ECDC, EFSA. 2022. Rapid Outbreak Assessment; Sun et al. 2020. Foodborne Pathog Dis 17:87–97.

Cumulative number of confirmed and probable monophasic S. Typhimurium cases by week and country in nine EU/EEA countries and the UK, as of 8 April 2022.

Clinical Microbiology

 | Global Impact

Geographical distribution of reported Salmonella Typhimurium outbreak cases (n=151) and countries where implicated products have been distributed (n=113), as of 25 April 2022.

https://www.who.int/emergencies/disease-outbreak-news/item/2022-DON369

Clinical Microbiology

 | Global Impact

ECDC, EFSA. 2022. Rapid Outbreak Assessment; Sun et al. 2020. Foodborne Pathog Dis 17:87–97.

Minimum spanning tree of 24 human (dark blue) and nine non-human (light blue) monophasic Salmonella Typhimurium isolates. cgMLST with 3 225 loci. Cluster includes the representative isolates of the outbreak strain (SRR17830210 and SRR18021617), all human isolates from HC5_296366 and four non-human isolates from the Belgian Processing Plant B.
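To illustrate how a minimum spanning tree is derived from pairwise cgMLST distances, here is a small sketch using Prim's algorithm. The isolate names and allelic distances are entirely hypothetical and unrelated to the outbreak data, and the tie-breaking rule is an assumption; dedicated tools may use different algorithms and tie-breaking rules.

```python
def minimum_spanning_tree(names, dist):
    """Prim's algorithm: grow the tree by repeatedly adding the closest outside isolate.

    dist maps frozenset({a, b}) to the allelic distance between isolates a and b.
    """
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the shortest edge that links the tree to an isolate not yet in it;
        # ties are broken alphabetically so the result is deterministic.
        a, b = min(
            ((a, b) for a in in_tree for b in names if b not in in_tree),
            key=lambda edge: (dist[frozenset(edge)], edge),
        )
        edges.append((a, b, dist[frozenset((a, b))]))
        in_tree.add(b)
    return edges


# Hypothetical isolates and pairwise allelic distances.
names = ["isolate_A", "isolate_B", "isolate_C", "isolate_D"]
dist = {
    frozenset(("isolate_A", "isolate_B")): 1,
    frozenset(("isolate_A", "isolate_C")): 4,
    frozenset(("isolate_A", "isolate_D")): 5,
    frozenset(("isolate_B", "isolate_C")): 2,
    frozenset(("isolate_B", "isolate_D")): 6,
    frozenset(("isolate_C", "isolate_D")): 3,
}

tree = minimum_spanning_tree(names, dist)
for a, b, d in tree:
    print(f"{a} -- {b} ({d} allelic differences)")
```

Isolates joined by short edges differ at few loci, which is why tight groups in such a tree are read as candidate outbreak clusters.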

Bioinformatics

 | The role in microbiology

Soucy, S., Huang, J. & Gogarten, J. Horizontal gene transfer: building the web of life. Nat Rev Genet 16, 472–482 (2015). https://doi.org/10.1038/nrg3962

Pangenome:

"(...) hence, core and dispensable genes represent the essence and the diversity of the species, respectively."

Mendes, CI. Pan-genome comparison between Streptococcus dysgalactiae subsp. equisimilis isolates from human and animal sources. Tese de mestrado, Bioinformática e Biologia Computacional, FCUL, 2016. http://hdl.handle.net/10451/25958
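The split between core and dispensable genes can be expressed with simple set operations. The sketch below assumes each isolate has already been reduced to a set of gene names; the isolate and gene names are hypothetical placeholders, and real pangenome tools must first cluster genes into homologous groups, which is not shown here.

```python
# Hypothetical gene content per isolate (placeholder names, not real data).
gene_content = {
    "isolate_1": {"geneA", "geneB", "geneC"},
    "isolate_2": {"geneA", "geneB", "geneD"},
    "isolate_3": {"geneA", "geneB", "geneC", "geneE"},
}

# Pangenome: every gene seen in at least one isolate.
pangenome = set.union(*gene_content.values())

# Core genome: genes present in every isolate.
core = set.intersection(*gene_content.values())

# Dispensable genome: genes present in some isolates but not all.
dispensable = pangenome - core

print("pangenome:  ", sorted(pangenome))
print("core:       ", sorted(core))
print("dispensable:", sorted(dispensable))
```

In practice the core is often defined with a tolerance (for example, genes present in nearly all isolates) to cope with missing data in draft assemblies; the strict intersection used here is the simplest possible definition.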

https://pha4ge.org/

PHA4GE

 | A global coalition

 Why PHA4GE: 

  • Establish global consensus data standards
  • Document and share best practices
  • Improve the availability of critical bioinformatic tools and resources
  • Advocate for greater openness, interoperability, accessibility and reproducibility in public health bioinformatics

PHA4GE

 | A global coalition

reads

metadata

repository

stakeholders

https://doi.org/10.1093/gigascience/giac003

Algorithms

Data Structures

Interfaces

The role of bioinformatics in clinical microbiology

https://slides.com/inesmendes/igc_2022

@ines_cim

cimendes

Thank you for your attention

MRamirez Lab, iMM

2019

The role of bioinformatics in clinical microbiology - IGC 2022

By Inês Mendes
