Shirley Wu
get data → explore data → design → code/make
Over the years, I've refined a process and collected tools to reduce the time it takes to finish a project.
But everyone's process is a little different. I offer mine as a starting point; feel free to experiment and see what works for you.
let's get started →
collect your own data
find pre-existing data
Notebooks/phones
Excel
APIs
↓
more intimately understand dataset
look for outliers, missing values, duplicates
↘ ↙
start with a curiosity!
↘ ↙
clean & validate your data
Excel
R, Python
Charts
ChatGPT
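A minimal sketch of the "look for outliers, missing values, duplicates" step, using only the Python standard library. The records, the field names, and the 1.5 × IQR outlier rule are assumptions chosen for illustration; a real dataset may need different checks.

```python
from statistics import quantiles

# Hypothetical records; None marks a missing value.
rows = [
    {"name": "A", "value": 10.0},
    {"name": "B", "value": 12.0},
    {"name": "B", "value": 12.0},  # exact duplicate of the row above
    {"name": "C", "value": None},  # missing value
    {"name": "D", "value": 11.0},
    {"name": "E", "value": 95.0},  # suspiciously large
    {"name": "F", "value": 9.0},
]

def find_missing(rows, key):
    """Indices of rows where `key` is absent or None."""
    return [i for i, r in enumerate(rows) if r.get(key) is None]

def find_duplicates(rows):
    """Indices of rows that exactly repeat an earlier row."""
    seen, dupes = set(), []
    for i, r in enumerate(rows):
        fingerprint = tuple(sorted(r.items()))
        if fingerprint in seen:
            dupes.append(i)
        seen.add(fingerprint)
    return dupes

def find_outliers(rows, key, k=1.5):
    """Indices of rows whose value lies outside the k * IQR fences."""
    values = [r[key] for r in rows if r.get(key) is not None]
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, r in enumerate(rows)
            if r.get(key) is not None and not (low <= r[key] <= high)]

print("missing:", find_missing(rows, "value"))
print("duplicates:", find_duplicates(rows))
print("outliers:", find_outliers(rows, "value"))
```

A flagged row is only a prompt to look closer: an outlier may be a typo, or it may be the most interesting point in the dataset.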
List all the attributes,
ask all the questions!
Bar chart
For categorical comparisons
Domain: categorical
Range: quantitative
Histogram
For quantitative distributions
Domain: quantitative bins
Range: frequency of values in each bin
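To make the histogram mapping concrete, here is a small sketch that groups values into fixed-width bins (the domain) and counts how many fall into each (the range). The `ages` list and the bin width of 10 are made-up illustration values.

```python
def histogram(values, bin_width):
    """Count values per bin; each bin covers [start, start + bin_width)."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

ages = [3, 7, 12, 15, 18, 21, 24, 29, 35]  # hypothetical data
bins = histogram(ages, 10)

# Quick text rendering: one row per bin, one '#' per value.
for start, count in bins.items():
    print(f"{start:>3}-{start + 9:<3} {'#' * count}")
```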
Scatterplot
For correlation
Two quantitative attributes and the relationship between their values
Line chart
For temporal trends
Domain: temporal
Range: quantitative
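The four charts above can be read as a lookup from (domain type, range type) to a chart. The sketch below encodes only that lookup; the function name and the fallback message are assumptions, and it is a starting point for brainstorming rather than a rule.

```python
# Lookup built from the four basic charts: (domain, range) -> chart.
CHARTS = {
    ("categorical", "quantitative"): "bar chart",
    ("quantitative bins", "frequency"): "histogram",
    ("quantitative", "quantitative"): "scatterplot",
    ("temporal", "quantitative"): "line chart",
}

def suggest_chart(domain, range_):
    """Return a basic chart for the given attribute types, if one fits."""
    return CHARTS.get((domain, range_), "no basic chart; sketch a few options")

print(suggest_chart("temporal", "quantitative"))
print(suggest_chart("categorical", "categorical"))
```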
how to choose charts:
charting tools:
Excel
R, Python
⚠️ ChatGPT
Brainstorm some charts
to answer the questions.
Quickly sketch them
and how you'd map the data.
Map individual or
aggregate data
elements to marks.
Map data attributes
to channels.
Visualization Analysis and Design. Tamara Munzner, with illustrations by Eamonn Maguire. A K Peters Visualization Series, CRC Press, 2014.
Quantitative
Categorical
Temporal
mark
bar
channels
x: category
y: quant
mark
point
channels
x: quant
y: quant
mark
point
channels
x: quant
y: quant
color: category
mark
point
channels
x: quant
y: quant
color: category
size: quant
Map each data attribute to one channel
A single mark can carry multiple channels (usually x, y, size, color)
Do not EVER map multiple data attributes to the same channel
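One way to check a sketch against these rules is to write the encoding down as (channel, attribute) pairs and look for channels that appear more than once. The attribute names below (`gdp`, `life_expectancy`, and so on) are hypothetical, and the check covers only the "same channel" rule.

```python
from collections import defaultdict

def overloaded_channels(pairs):
    """Return {channel: [attributes]} for channels mapped more than once."""
    by_channel = defaultdict(list)
    for channel, attribute in pairs:
        by_channel[channel].append(attribute)
    return {c: attrs for c, attrs in by_channel.items() if len(attrs) > 1}

# mark: point, with four channels each carrying one attribute
ok = [("x", "gdp"), ("y", "life_expectancy"),
      ("color", "continent"), ("size", "population")]

# Same encoding, but color is asked to show a second attribute.
bad = ok + [("color", "year")]

print(overloaded_channels(ok))
print(overloaded_channels(bad))
```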
Titles, descriptions, and legends
to explain the visualization
Axes and annotations
to describe the data
Resources:
Simulated Dendrochronology of U.S. Immigration 1790-2016
United States gun death data visualization
(cw: gun deaths)
Poppy Field - Visualising War Fatalities
(cw: war, death)
Create a more refined sketch, keeping in mind marks, channels, and visual metaphors
Break it down! What do you need to draw the marks? What do you need to calculate the channels?
To draw marks: SVG (or HTML5 Canvas)
To calculate channels: D3 scales, shapes, and layouts (or straight-up math!)
rect
x: x-coordinate of top-left
y: y-coordinate of top-left
width
height
circle
cx: x-coordinate of center
cy: y-coordinate of center
r: radius
text
x: x-coordinate
y: y-coordinate
dx: x-coordinate offset
dy: y-coordinate offset
text-anchor: horizontal text alignment
Hi!
path
d: path to follow
Moveto, Lineto, Curveto, Arcto
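As a rough sketch, the four elements and their attributes can be assembled into a tiny SVG document. This uses Python's standard `xml.etree.ElementTree` only to build the markup; all coordinates and sizes are arbitrary illustration values.

```python
import xml.etree.ElementTree as ET

def build_svg():
    svg = ET.Element("svg", xmlns="http://www.w3.org/2000/svg",
                     width="200", height="120")
    # rect: x/y are the top-left corner
    ET.SubElement(svg, "rect", x="10", y="10", width="40", height="60")
    # circle: cx/cy are the center, r is the radius
    ET.SubElement(svg, "circle", cx="100", cy="40", r="20")
    # text: x/y position, dy offset, text-anchor for horizontal alignment
    text = ET.SubElement(svg, "text", {"x": "100", "y": "90",
                                       "dy": "0.35em",
                                       "text-anchor": "middle"})
    text.text = "Hi!"
    # path: d lists commands; here one Moveto (M) and two Lineto (L)
    ET.SubElement(svg, "path", d="M 140 70 L 160 20 L 180 70",
                  fill="none", stroke="black")
    return ET.tostring(svg, encoding="unicode")

svg_markup = build_svg()
print(svg_markup)
```

Saving the printed string to a `.svg` file and opening it in a browser should show the four shapes.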
For translating raw data to what SVG needs to draw
Take output of layout calculations and draw SVG elements
Sometimes all you need are scales to get from data to screen space
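A scale is a function from a data domain to a screen range. The hand-rolled version below is the "straight-up math" option: a plain linear interpolation, not D3's implementation, with made-up domain and pixel values.

```python
def linear_scale(domain, range_):
    """Return a function mapping [d0, d1] linearly onto [r0, r1]."""
    d0, d1 = domain
    r0, r1 = range_

    def scale(value):
        t = (value - d0) / (d1 - d0)  # 0 at d0, 1 at d1
        return r0 + t * (r1 - r0)

    return scale

# Data values 0..100 mapped to x pixels 40..440.
x = linear_scale((0, 100), (40, 440))
# SVG's y axis points down, so the range is flipped: larger data sits higher.
y = linear_scale((0, 50), (300, 20))

print(x(50), y(25))
```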
Oftentimes, you need specific layouts.
These output x/y positions
And these generate path commands
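Two small sketches of those two kinds of output, written by hand rather than with D3: a grid layout that returns x/y positions, and a line generator that turns points into a path `d` string using Moveto and Lineto. The function names and parameters are assumptions for illustration.

```python
def grid_layout(count, columns, cell):
    """Return (x, y) positions for `count` items laid out row by row."""
    return [((i % columns) * cell, (i // columns) * cell)
            for i in range(count)]

def line_path(points):
    """Return an SVG path string: Moveto the first point, Lineto the rest."""
    return "".join(f"{'M' if i == 0 else 'L'}{px},{py}"
                   for i, (px, py) in enumerate(points))

positions = grid_layout(5, columns=3, cell=20)
d = line_path([(0, 100), (50, 40), (100, 70)])

print(positions)
print(d)
```

The positions would feed `cx`/`cy` or `x`/`y` attributes, and the string would go into a path element's `d` attribute.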
Great dataviz-specific interactions
↰
← great for interactivity
← great for prototyping
Books:
The Functional Art by Alberto Cairo
Online:
Data Visualization Society
Information is Beautiful Awards