with Altair and Vega-Lite
2020 TomTom Applied Machine Learning Conference
2020 Open Data Science Conference (ODSC) West
There is a lot of data in the world.
Transforming data from nonintuitive representations to intuitive representations is essential.
1. exploratory graphics
2. explanatory graphics
The way forward is not always clear.
Even after choosing a library, things can get hairy...
"All I want to do is change the tick format!"
"Ugh, why is the API for the scatterplot completely different from the bar chart?"
"What? This library doesn't support mouseover?"
"Well, I found the old .png file, but the code won't run in this new environment to recreate it."
There has to be a better way.
What makes a good data visualization workflow?
A good data visualization workflow must enable you to:
1. Produce visualizations quickly and intuitively.
2. Easily create interactive visualizations.
3. Easily share and reproduce visuals.
DATA VISUALIZATION 4EVA!!!
Who's excited?
Excellent! Now it's time for Altair and Vega-Lite.
D3.js
D3 is a JS library that manipulates the HTML, SVG, and CSS in your web browser to render visualizations.
Vega
Vega is a low-level declarative visualization grammar with a JSON syntax. Vega sits atop D3 and provides the building blocks to create interactive visualizations.
Vega-Lite
Vega-Lite is a high-level, concise declarative visualization grammar with a JSON syntax, based on Vega. It works by compiling its specifications to Vega specifications.
Altair
Altair is a declarative visualization library for Python that is built on top of Vega-Lite.
That is one sweet stack.
Thanks to D3, you aren't tied to a particular framework or backend to render your visualizations.
All you need is a modern web browser.
This makes your work highly portable and very shareable.
All visualizations in this stack can be represented as a JSON specification that is interchangeable amongst Altair, Vega-Lite, and Vega.
You can tinker at whichever stack level you prefer.
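As a minimal sketch of what such a specification looks like, the snippet below writes a small Vega-Lite bar chart as a plain Python dictionary and round-trips it through JSON using only the standard library. The data values and field names (`category`, `count`) are made up for illustration, and the v5 schema URL is an assumption about which Vega-Lite version is in use.

```python
import json

# A Vega-Lite specification written as a plain Python dictionary.
# The rows and field names are illustrative, not taken from the talk.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {
        "values": [
            {"category": "A", "count": 28},
            {"category": "B", "count": 55},
            {"category": "C", "count": 43},
        ]
    },
    "mark": "bar",
    "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "count", "type": "quantitative"},
    },
}

# Serialize to JSON text: this is the artifact you would share.
spec_json = json.dumps(spec, indent=2)

# Anyone who receives the text can load it back without loss.
restored = json.loads(spec_json)
print(restored == spec)
```

The JSON text can be pasted into the online Vega editor, and Altair can load the same dictionary with `alt.Chart.from_dict(spec)`, provided the installed Altair release targets a compatible Vega-Lite version.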
What if instead of passing around pixels, we pass around visualization specifications plus data?
With Vega-Lite, we can share open-source specs and data, not just static copies of visualizations.
Hello, reproducibility!
- Jake VanderPlas, Creator of Altair
"Altair is a declarative visualization library..."
"Vega-Lite is a high-level grammar of interactive graphics..."
"Vega is a low-level declarative grammar with a JSON syntax..."
No worries, let's break it down.
A grammar of graphics is a tool to concisely describe the components of a visualization.
Graphics grammars provide mental models and building blocks, rather than chart templates, to create visualizations.
The Vega-Lite grammar is declarative, enabling you to describe what you want to accomplish rather than how the computer must accomplish it.
This results in a more natural language-based (and human) approach to thinking about and implementing graphics.
declarative
imperative
Again, credit to: Jake VanderPlas, The Python Visualization Landscape. (PyCon 2017)
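The declarative versus imperative contrast can be sketched in a few lines. This is an illustration written for this text, not code from the cited talk, and the rows and field names (`origin`, `mpg`) are invented. The imperative half spells out how to compute a mean per group; the declarative half only states that a bar chart of mean `mpg` by `origin` is wanted and leaves the computation to Vega-Lite.

```python
from collections import defaultdict

# Illustrative rows; not real measurements.
rows = [
    {"origin": "USA", "mpg": 18.0},
    {"origin": "USA", "mpg": 22.0},
    {"origin": "Japan", "mpg": 31.0},
    {"origin": "Japan", "mpg": 35.0},
]

# Imperative: describe HOW to get the numbers, step by step.
totals = defaultdict(float)
counts = defaultdict(int)
for row in rows:
    totals[row["origin"]] += row["mpg"]
    counts[row["origin"]] += 1
mean_mpg = {origin: totals[origin] / counts[origin] for origin in totals}
# ...and the bars, axes, and labels would still have to be drawn by hand.

# Declarative: describe WHAT is wanted and let the grammar do the rest.
spec = {
    "data": {"values": rows},
    "mark": "bar",
    "encoding": {
        "x": {"field": "origin", "type": "nominal"},
        "y": {"aggregate": "mean", "field": "mpg", "type": "quantitative"},
    },
}

print(mean_mpg)
```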
When visualizing, we want to focus on data and relationships, not syntax and libraries.
Talk is cheap.
Now for some examples.
data
encoding
mark
data transformation
chart customization
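The five components named above map directly onto the top-level keys of a Vega-Lite specification. The sketch below is a hypothetical example, written as a Python dictionary, with each component marked by a comment; the sales figures, field names, and title are invented, and the v5 schema URL is an assumption.

```python
import json

spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    # data: inline records (illustrative values only)
    "data": {
        "values": [
            {"year": 2017, "sales": 1200},
            {"year": 2018, "sales": 1850},
            {"year": 2019, "sales": 2400},
            {"year": 2020, "sales": 2100},
        ]
    },
    # data transformation: keep only rows from 2018 onward
    "transform": [{"filter": "datum.year >= 2018"}],
    # mark: the geometric shape that represents each record
    "mark": "line",
    # encoding: which field drives which visual channel
    "encoding": {
        "x": {"field": "year", "type": "ordinal"},
        "y": {
            "field": "sales",
            "type": "quantitative",
            # tick format is a one-line change (d3-format syntax)
            "axis": {"format": "$,.0f", "title": "Sales"},
        },
    },
    # chart customization: title and size
    "title": "Sales by year",
    "width": 400,
    "height": 250,
}

print(json.dumps(spec, indent=2))
```

Changing the chart type means editing the `mark` value; the data and encoding blocks stay the same, which is the sense in which a grammar provides building blocks rather than chart templates.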
Nifty, huh?
Vega-Lite enables quick, intuitive creation of visualizations.
In 2017, Vega-Lite introduced a grammar of interactions.
mouseover
pan and zoom
interval selection
brushing
...and more, all as first-class citizens of the graphics grammar.
data
encoding
mark
data transformation
chart customization
interactions
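Interactions slot into the same specification as one more component. The following sketch, again a Python dictionary with invented data, declares an interval selection (a brush) and uses it to color only the selected points, with a tooltip for mouseover. It assumes the Vega-Lite v5 `params` syntax; earlier Vega-Lite versions expressed the same idea under a `selection` property.

```python
import json

spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "title": "Brush to highlight points",
    # data: illustrative rows, not real measurements
    "data": {
        "values": [
            {"hp": 95, "mpg": 24.0, "origin": "Japan"},
            {"hp": 130, "mpg": 18.0, "origin": "USA"},
            {"hp": 165, "mpg": 15.0, "origin": "USA"},
            {"hp": 88, "mpg": 27.0, "origin": "Europe"},
        ]
    },
    # interactions: an interval selection named "brush" (a hypothetical name)
    "params": [{"name": "brush", "select": "interval"}],
    "mark": "point",
    "encoding": {
        "x": {"field": "hp", "type": "quantitative"},
        "y": {"field": "mpg", "type": "quantitative"},
        # points inside the brush are colored by origin, the rest turn gray
        "color": {
            "condition": {"param": "brush", "field": "origin", "type": "nominal"},
            "value": "lightgray",
        },
        # mouseover: show these fields when hovering over a point
        "tooltip": [
            {"field": "hp", "type": "quantitative"},
            {"field": "mpg", "type": "quantitative"},
            {"field": "origin", "type": "nominal"},
        ],
    },
}

print(json.dumps(spec, indent=2))
```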
Interactive visualizations?
Check.
Pythonistas can reap all the benefits of the Vega-Lite grammar when using Altair.
The Python visualization landscape is diverse, and there are many great libraries out there.
It would be shortsighted to crown a single library as the only tool to use for visualization.
However...
I contend that making Altair and Vega-Lite the backbone of your data visualization process will enable you to achieve the following in your work:
Intuitive, quick creation of explanatory and exploratory graphics.
Easy generation of interactive visualizations.
Reproducibility of shared visualizations.
Now, go forth and visualize!
Vega: https://vega.github.io/vega/
Vega-Lite: https://vega.github.io/vega-lite/
Altair: https://altair-viz.github.io/
The Python Visualization Landscape, Jake VanderPlas. (PyCon 2017)
https://www.youtube.com/watch?v=FytuB8nFHPQ
How to Think about Data Visualization, Jake VanderPlas. (PyCon 2019)
https://www.youtube.com/watch?v=vTingdk_pVM