BioCicle

A Tool for Summarizing and Comparing Taxonomic Profiles out of Biological Sequence Alignments

Meili Vanegas-Hernandez

Why BioCicle?

1

15,000-18,000 new species are discovered yearly

BLAST, HMM: algorithms for comparing primary biological sequence information

How can we classify them?

Biological Sequence Comparisons

2

misclassifications

erroneous data stored in database

widespread

future comparisons consider inaccurate information

Biological Sequence Comparisons RESULTS

4

VS.

output

________?

homolog protein

RS1: Sequence alignment

RS2: Taxonomic profile

RS3: Sequence description

Related Work

6

RS1: Sequence alignment

RS2: Taxonomic profile

RS3: Sequence description

BOV, Blast2Go, Artemis, HMM Editor

CLANS, GenoPlotR

Circoleto, Clustal W, Hmmer, GeneCluster VIZ, Megan, Blast Grabber

MG Rast

Amphora Vizu, MetaPHlAn, KRONA

MEGAN, METAREP, Blast Grabber

NCBI

7

VS.

VS.

Analysis Tasks Identification

ATa: Classify an unknown ORGANISM

ATb: Identify relationships between multiple ORGANISMS

Related Work

8

RS1: Sequence alignment

RS2: Taxonomic profile

RS3: Sequence description

BOV, Blast2Go, Artemis, HMM Editor

CLANS, GenoPlotR

Circoleto, Clustal W, Hmmer, GeneCluster VIZ, Megan, Blast Grabber

MG Rast

Amphora Vizu, MetaPHlAn, KRONA

MEGAN, METAREP, Blast Grabber

NCBI

AT1b

AT1a

AT2b

AT2a

AT3b

AT3a

Non-restrictive input: independent from the algorithm used for the comparison

AT2a: Taxonomic Profiles Single-Query Comparison

10

* Non-restrictive input: independent from the algorithm used for the comparison

METAREP **

Amphora Vizu *

MetaPHlAn *

UNIQUE RANK: TRADITIONAL GRAPHS

ALL RANKS:

TREE REPRESENTATIONS

** Multiple Organism Comparison: Supports analysis task AT2b

MG Rast

MEGAN **

Blast Grabber **

KRONA *

11

MG Rast

MEGAN **

METAREP **

Blast Grabber **

Amphora Vizu *

MetaPHlAn *

KRONA *

UNIQUE RANK: TRADITIONAL GRAPHS

ALL RANKS:

TREE REPRESENTATIONS

* Non-restrictive input: independent from the algorithm used for the comparison

No overview first

Detailed information

No score representation

Hard to read nodes

Overview

No overview

Hard to read leaves

Overview

Score representation

Efficient space filling

No score representation

Overview

Readable nodes

AT2a: Taxonomic Profiles Single-Query Comparison

** Multiple Organism Comparison: Supports analysis task AT2b

BioCicle

12

TAXONOMY

state-of-the-art

1

VISUALIZATION

taxonomic profiles

2

single comparison

VISUALIZATION

taxonomic profiles

3

multi-comparisons

OPEN SOURCE

WEB-BASED

NCBI/EBI UNIPROT APIs

BioCicle

13

VISUALIZATION

taxonomic profiles for single query comparison

Less efficient space filling than KRONA

Score representation

Overview of the results

Readable nodes

Details on demand
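
A minimal sketch of the data behind such a view, not BioCicle's actual implementation: alignment hits are folded into a tree of taxa, and each node gets a horizontal extent proportional to its number of hits, as an icicle-style layout would need. The names `TaxonNode`, `build_profile`, and `icicle_rows`, the `(lineage, score)` hit format, and the choice to keep the best score per node are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonNode:
    name: str
    rank: str
    score: float = 0.0   # best score among hits at or below this node (assumed rule)
    hits: int = 0        # number of hits at or below this node
    children: dict = field(default_factory=dict)


def build_profile(hits):
    """Fold hits into a taxonomy tree.

    hits: iterable of (lineage, score), where lineage is a list of
    (rank, name) pairs ordered from the highest rank to the lowest.
    """
    root = TaxonNode("root", "no rank")
    for lineage, score in hits:
        node = root
        node.hits += 1
        node.score = max(node.score, score)
        for rank, name in lineage:
            node = node.children.setdefault(name, TaxonNode(name, rank))
            node.hits += 1
            node.score = max(node.score, score)
    return root


def icicle_rows(node, depth=0, start=0.0, width=1.0):
    """Yield (depth, name, start, width, score), one rectangle per node.

    A child's width is its share of the parent's hits, so the top row
    gives the overview and deeper rows give the detail.
    """
    yield depth, node.name, start, width, node.score
    offset = start
    for child in node.children.values():
        child_width = width * child.hits / node.hits
        yield from icicle_rows(child, depth + 1, offset, child_width)
        offset += child_width
```

Each yielded row could be drawn as one rectangle: `depth` picks the rank row, `start` and `width` give the horizontal extent, and `score` could drive the color.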

14

METAREP **

Amphora Vizu *

MetaPHlAn *

UNIQUE RANK: TRADITIONAL GRAPHS

ALL RANKS:

TREE REPRESENTATIONS

* Non-restrictive input: independent from the algorithm used for the comparison

** Multiple Organism Comparison: Supports analysis task AT2b

MG Rast

MEGAN **

Blast Grabber **

KRONA *

AT2b: Taxonomic Profiles Multi-Query Comparisons

All have restrictive input

Group all query-results in one visualization

Overview

Support AT2b

16

VISUALIZATION

taxonomic profiles for multi-query comparisons

Not scalable: Relies on the user's memory

No details on demand

Overview

Independent visualizations for each query-result

Selection (ROI): Present only a subset of data considering the user's interest.

?

How can we define regions in the given dataset?

Interaction: Stop at a given section and present more information.

Identify similarities in dataset: group results with similar outputs.

Resume button: be able to stop iteration.

Sparklines: present an overview for each of the queries and indicate where we are in the iteration.

Provide interaction in a specific icicle.
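
One possible reading of "group results with similar outputs", sketched under assumptions: each query result is reduced to the set of taxa it hit, and two results count as similar when the Jaccard index of those sets reaches a cutoff. The function names, the `min_similarity` parameter, and the greedy grouping strategy are hypothetical and do not come from BioCicle.

```python
def jaccard(a, b):
    """Jaccard index of two collections of taxon names (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_similar(results, min_similarity=0.5):
    """Greedy grouping of query results.

    results: {query_id: set of taxon names}.
    Each query joins the first group whose representative (its first
    member) is similar enough; otherwise it starts a new group.
    """
    groups = []
    for query_id, taxa in results.items():
        for group in groups:
            representative = results[group[0]]
            if jaccard(taxa, representative) >= min_similarity:
                group.append(query_id)
                break
        else:
            groups.append([query_id])
    return groups
```

Only one representative per group then needs to be shown in full, which reduces how much the user has to keep in memory while iterating.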

18

VISUALIZATION

taxonomic profiles for multi-query comparisons

Identify similarities in dataset: group results with similar outputs.

?

Provide a score threshold set by the user and assign a temporary taxonomy to each of the unclassified organisms.

We have another tree!
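
A rough sketch of how the threshold idea might work, under stated assumptions: every unclassified query takes the lineage of its best hit at or above the user's threshold as a temporary taxonomy, and the queries are then nested under those lineages, which yields the second tree. The hit format, both function names, and the `"unassigned"` and `"_queries"` keys are hypothetical.

```python
def assign_temporary_taxonomy(query_hits, threshold):
    """Pick a temporary lineage per query.

    query_hits: {query_id: [(lineage, score), ...]}, lineage being a
    tuple of taxon names from the highest rank to the lowest.
    Returns {query_id: lineage, or None if no hit reaches the threshold}.
    """
    assigned = {}
    for query_id, hits in query_hits.items():
        passing = [(lineage, score) for lineage, score in hits if score >= threshold]
        if passing:
            best_lineage, _ = max(passing, key=lambda hit: hit[1])
            assigned[query_id] = best_lineage
        else:
            assigned[query_id] = None
    return assigned


def tree_of_queries(assigned):
    """Nest the queries under their temporary lineages as plain dicts."""
    tree = {}
    for query_id, lineage in assigned.items():
        node = tree
        for name in lineage or ("unassigned",):
            node = node.setdefault(name, {})
        node.setdefault("_queries", []).append(query_id)
    return tree
```

Queries that land in the same branch of this tree are candidates for being grouped in the multi-query view.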

19

VISUALIZATION

taxonomic profiles for multi-query comparisons

20

Current State

DEMO

21

What are we missing?

  1. Score threshold
  2. Results grouping
  3. Sparklines
  4. Resume option
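
For the sparklines item, a text-only sketch of the idea: each query's scores are compressed into one line of block characters, with an optional marker under the current position in the iteration. The `sparkline` function, its `current` parameter, and the min-max scaling are assumptions made for illustration, not a planned BioCicle API.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(scores, current=None):
    """Render scores as a one-line overview.

    Scores are scaled between their own minimum and maximum. If `current`
    is given, a second line marks that index with a caret.
    """
    if not scores:
        return ""
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all scores are equal
    line = "".join(
        BARS[int((s - lo) / span * (len(BARS) - 1))] for s in scores
    )
    if current is not None:
        line += "\n" + " " * current + "^"
    return line
```

Calling `sparkline(scores, current=i)` once per query would give one compact overview row per query, with the caret showing where the iteration currently stands.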
