Take a sad chart and make it better

Trang Le

University of Pennsylvania

Talk at UNC Institute of Marine Sciences

2021-08-30

Image by Tyler Morgan-Wall

Rules don't always apply.

what am I trying to say with my chart?

I would use histograms or stacked bar charts for each comparison.
Trade-off: node sizes and edges are not shown.

[Figure: panels "Day 3 vs. Day 0" and "Day 7 vs. Day 0"; categories: Metabolome, Proteome, Transcriptome, Novel nodes]

how much information should I show?

Histograms heavily depend on bin width and location
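To make this concrete, here is a minimal sketch using ggplot2 and simulated data (the data frame `scores` and its values are made up for illustration, not taken from the talk). The same values are drawn with two bin widths, then with the same width but shifted bin edges.

```r
library(ggplot2)

set.seed(1)  # simulated data, for illustration only
scores <- data.frame(value = c(rnorm(60, mean = 0), rnorm(40, mean = 3)))

# Same data, different bin widths
p_narrow <- ggplot(scores, aes(value)) + geom_histogram(binwidth = 0.25)
p_wide   <- ggplot(scores, aes(value)) + geom_histogram(binwidth = 2)

# Same bin width, different bin location (where the bin edges fall)
p_shifted <- ggplot(scores, aes(value)) +
  geom_histogram(binwidth = 2, boundary = 1)
```

Printing `p_narrow`, `p_wide`, and `p_shifted` side by side shows how much the apparent shape of the distribution can change with these two choices.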

[Figure: performance of Model A, Model B, and the difference Model B – Model A]

but sometimes binning is helpful, e.g. to solve the overcrowding problem

ggcyto::ggcyto() + geom_hex(bins = 64)
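The same idea can be tried without flow cytometry data. The sketch below uses plain ggplot2 on simulated points (the data frame `cells` is hypothetical); `geom_hex()` additionally requires the hexbin package to be installed.

```r
library(ggplot2)  # geom_hex() also needs the hexbin package installed

set.seed(1)  # simulated points, for illustration only
cells <- data.frame(x = rnorm(50000), y = rnorm(50000))

# Overcrowded: 50,000 points drawn on top of each other
p_points <- ggplot(cells, aes(x, y)) + geom_point()

# Binned: each hexagon is colored by the number of points it contains
p_hex <- ggplot(cells, aes(x, y)) + geom_hex(bins = 64)
```

Comparing `p_points` with `p_hex` shows where the density is highest, which the scatter plot hides.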

what other ways to reduce the cognitive burden?

  • direct labeling
  • reduce number of labels
  • highlighting
  • consistent color scheme
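A rough sketch of direct labeling and highlighting in ggplot2 follows. The data frame `trend`, the group names, and the colors are all made up for illustration; the talk's own figures may have been built differently.

```r
library(ggplot2)

# Made-up data: three groups measured over six years
trend <- data.frame(
  year  = rep(2015:2020, times = 3),
  group = rep(c("A", "B", "C"), each = 6),
  value = c(1, 2, 3, 4, 5, 6,  2, 2, 3, 3, 4, 4,  3, 3, 2, 2, 1, 1)
)
ends <- trend[trend$year == max(trend$year), ]  # last point of each line

p <- ggplot(trend, aes(year, value, group = group, color = group == "A")) +
  geom_line() +
  # direct labels at the line ends replace the legend
  geom_text(data = ends, aes(label = group), hjust = -0.5) +
  # highlight group A, mute the others
  scale_color_manual(values = c(`TRUE` = "firebrick", `FALSE` = "grey70")) +
  scale_x_continuous(limits = c(2015, 2020.5)) +  # leave room for the labels
  theme(legend.position = "none")
```

Here the reader never has to look back and forth between the lines and a legend, and only one color carries meaning.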

Risk of hypertension

utilize

(facets)

facet_grid(
  cols = vars(...),
  scales = 'free', 
  space = 'free'
)
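As a filled-in sketch of the call above, the example below uses the `mpg` dataset that ships with ggplot2. The faceting variable `drv` is an assumption chosen for illustration, since the slide leaves the column unspecified.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot() +
  facet_grid(
    cols = vars(drv),   # assumed faceting variable, for illustration
    scales = "free",    # each panel shows only the classes it contains
    space = "free"      # panel width is proportional to its number of classes
  )
```

With `space = "free"`, the boxes keep the same width across panels instead of being stretched to fill equal-sized facets.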

paper ≠ presentation

  • amount of information
  • annotation
  • highlights
  • abbreviations
  • builds
  • text sizes
  • ...

 

White scientists are overrepresented

Asian scientists are underrepresented

We were not adequately powered to detect the differences in other groups.


on colors

  • sequential, qualitative, diverging
  • colorblind friendly palettes
+ scale_color_viridis_c()
+ rcartocolor::scale_color_carto_d()
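A small sketch of how a viridis scale attaches to a plot, using only the scales built into ggplot2 and its bundled `mpg` dataset (the choice of variables is an assumption for illustration). The `_c` suffix is for continuous variables and `_d` for discrete ones.

```r
library(ggplot2)

# Continuous variable: sequential viridis palette
p_cont <- ggplot(mpg, aes(displ, hwy, color = cty)) +
  geom_point() +
  scale_color_viridis_c()

# Discrete variable: evenly spaced viridis colors
p_disc <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point() +
  scale_color_viridis_d()
```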

Resources

your turn

Tools > Global Options > R Markdown: uncheck "Show output inline"

Great data stories

library(openair)
windRose(openair_data)  # expects wind speed and direction columns (ws, wd by default)
library(treeheatr)
heat_tree(penguins, target_lab = 'species')

How do you remove legend title?
What about y-axis title?
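One possible answer, sketched on ggplot2's bundled `mpg` dataset (the plot itself is an assumption for illustration): setting a label to `NULL` in `labs()` removes it, and `theme()` with `element_blank()` is an alternative that achieves the same result.

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy, color = drv)) + geom_point()

# Option 1: set the labels to NULL
p_labs <- base + labs(color = NULL, y = NULL)

# Option 2: blank out the theme elements instead
p_theme <- base + theme(
  legend.title = element_blank(),
  axis.title.y = element_blank()
)
```

Note that `labs()` must name the aesthetic that produced the legend (`color` here; it would be `fill` for a fill legend).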