Suggestion: prioritize "high-level" metrics:
- Genome size, Codon Usage Bias
- Check for obvious pangenome differences
- Could be combined with derived Tag data, e.g. phototroph-bacterioplankton interactions
- Other metrics could draw on existing databases:
  - MEROPS (proteases)
  - CAZy (carbohydrate-active enzymes)
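
As a rough illustration of the first two metrics, the sketch below computes genome size as the summed contig length and codon usage bias as relative synonymous codon usage (RSCU). RSCU is only one of several possible bias measures and is an assumption here, as are the function names `genome_size` and `rscu`; the notes above do not specify a method. The codon table is the standard genetic code, whose amino acid assignments match the bacterial code (NCBI table 11).

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G; third position varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid they encode (synonymous families).
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    FAMILIES[aa].append(codon)


def genome_size(contigs):
    """Total assembly length in bases (sum over all contigs)."""
    return sum(len(seq) for seq in contigs)


def rscu(cds_list):
    """Relative synonymous codon usage over a set of in-frame coding sequences.

    RSCU = observed codon count / mean count across its synonymous family.
    Codons with ambiguous bases and trailing partial codons are ignored;
    stop codons and amino acids that never occur are skipped.
    """
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    result = {}
    for aa, codons in FAMILIES.items():
        if aa == "*":
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            result[c] = counts[c] * len(codons) / total
    return result


if __name__ == "__main__":
    print(genome_size(["ACGT", "AC"]))
    print(rscu(["GCTGCTGCTGCC"]))
```

An RSCU of 1.0 means a codon is used as often as expected under uniform synonymous usage; values above or below 1.0 indicate over- or under-use. In practice the inputs would come from parsed assembly and CDS FASTA files rather than literal strings.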