Modelling notes, overview

Tags, SAGs and MAGs

Tags:

Pros:

  • Comprehensive
  • Quantitative
    • Networks / interactions
    • Simple to interpret

Cons:

  • No genome info
  • Comparatively low resolution

SAGs:

Pros:

  • Genome content
    • Traits
  • Many have rRNA sequences

Cons:

  • Incomplete (~1/3 on average)
  • Surface only
  • Rabbit hole

MAGs:

Pros:

  • Genome content
  • Deep sample info (BioGEOTRACES, BGT)
  • May be more complete

Cons:

  • Harder to process/interpret
  • No rRNA
  • Rabbit hole

Moving forward

Ways to proceed:

  • BioGEOTRACES samples are obvious first step:
    • Tags, MAGs, SAGs all together
    • Sarah is already working on Tags <=> SAGs integration
      • Input from Emily would be helpful here
    • Jesse is working on Tags:
      • Distributions in relation to water column features
      • Network analysis (e.g. with phototrophs)
        • Helps inform ecological interpretation
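
A minimal sketch of what the Tag network analysis could look like, assuming a table of Tag abundances per sample and Spearman rank correlation as the association measure. The notes do not specify a method, so both the measure and the 0.8 cutoff are assumptions, and the taxon names and counts are hypothetical toy values.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cooccurrence_edges(abundance, threshold=0.8):
    """Return (taxon_a, taxon_b, rho) for pairs with |rho| >= threshold."""
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        rho = spearman(abundance[a], abundance[b])
        if abs(rho) >= threshold:
            edges.append((a, b, round(rho, 3)))
    return edges


# Hypothetical Tag counts across five samples (not real data).
toy = {
    "phototroph_A": [10, 20, 30, 40, 50],
    "heterotroph_B": [1, 2, 3, 5, 8],
    "heterotroph_C": [9, 7, 5, 3, 1],
    "heterotroph_D": [4, 1, 5, 2, 3],
}
edges = cooccurrence_edges(toy)
for edge in edges:
    print(edge)
```

In the toy data, heterotroph_B tracks phototroph_A, heterotroph_C runs opposite to it, and heterotroph_D is unrelated, so only the first two end up in the network. Real Tag data are compositional, so a dedicated method may be more appropriate than raw rank correlation.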

New datasets

  • CLIVAR cruises of particular interest:
    • Unfractionated, pass through very distinct regions
    • Evenly sampled down to 1000 m, historical record
  • Cons:
    • No funding for metagenomes, only Tags + SAGs

GORG SAGs

Integrating 'omics and modelling

Suggest prioritizing "high-level" metrics:

  • Genome size, codon usage bias
  • Check for obvious pangenome differences
    • Could be combined with derived Tag data, e.g. phototroph-bacterioplankton interactions
  • Other metrics could include existing databases:
    • MEROPS (proteases)
    • CAZy (carbohydrate-active enzymes)
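
A rough sketch of how the two "high-level" metrics could be computed per genome, assuming contigs and predicted coding sequences are already available as plain strings. Relative synonymous codon usage (RSCU) is used here as one possible codon usage bias measure; the notes do not name a specific index. Dividing assembly length by a completeness estimate to approximate genome size is likewise an assumption, and all sequences and the completeness value are hypothetical.

```python
from collections import Counter

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def assembly_length(contigs):
    """Total number of bases across all contigs."""
    return sum(len(seq) for seq in contigs.values())


def estimated_genome_size(contigs, completeness):
    """Scale assembly length by a completeness fraction in (0, 1]."""
    if not 0 < completeness <= 1:
        raise ValueError("completeness must be in (0, 1]")
    return assembly_length(contigs) / completeness


def rscu(genes):
    """Relative synonymous codon usage for each codon of each observed amino acid.

    RSCU = observed codon count / mean count over that amino acid's codons.
    Stop codons and codons with ambiguous bases are ignored.
    """
    counts = Counter()
    for gene in genes:
        gene = gene.upper()
        for pos in range(0, len(gene) - len(gene) % 3, 3):
            codon = gene[pos:pos + 3]
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1

    # Group sense codons by the amino acid they encode.
    by_amino = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            by_amino.setdefault(aa, []).append(codon)

    result = {}
    for codons in by_amino.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid not observed in these genes
        expected = total / len(codons)
        for c in codons:
            result[c] = counts[c] / expected
    return result


# Hypothetical toy inputs (not real data).
toy_contigs = {"contig_1": "ATGC" * 5, "contig_2": "GGCC" * 3}
toy_genes = ["ATGAAAAAAAAGTAA"]  # Met, Lys, Lys, Lys, stop

size = estimated_genome_size(toy_contigs, completeness=0.5)
usage = rscu(toy_genes)
print(size, usage["AAA"], usage["AAG"])
```

An RSCU of 1 means a codon is used as often as expected under equal usage of its synonyms; values above or below 1 indicate preference or avoidance. For very incomplete SAGs, both metrics will be noisy, so comparing groups of genomes is likely more informative than comparing individual ones.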