Visualizing Statistical and Machine Learning Concepts
Michael Freeman, University of Washington
@mf_viz
#ODSC
Today's Objective
Develop a process for designing and building visual explanations of statistical and machine learning concepts.
What is division?
With the people around you, take 5 minutes and draw a visual explanation of this concept.
In order to visualize concepts, we need to isolate specific ideas, identify underlying data structures, and leverage corresponding algorithms.
Process
Concepts → Ideas
Ideas → Data
Data → Algorithm
Concepts → Ideas
What foundational ideas underlie your statistical concept?
"What is the Central Limit Theorem?"
Central Limit Theorem
"Distribution of the sample mean"
What foundational ideas underlie this statistical concept?
Ideas Underlying CLT
Variation within a population
Sampling and how it varies
Repeated sampling from your population
Distributions and normality
Distributions of sample means
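The ideas listed above can be tried out in a few lines of code. What follows is a minimal sketch, not the code behind the talk's visuals: the exponential population, the sample size, the number of repetitions, and the seeded generator are all assumptions chosen for illustration.

```javascript
// Small seeded generator (mulberry32) so the sketch is reproducible; any RNG would do.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const random = mulberry32(42);

// Variation within a population: a hypothetical, clearly skewed population
// (exponential with mean 1 and standard deviation 1).
const drawFromPopulation = () => -Math.log(1 - random());

const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;
const sd = (xs) => {
  const m = mean(xs);
  return Math.sqrt(mean(xs.map((x) => (x - m) ** 2)));
};

// Sampling: one sample of size n, summarized by its mean.
const sampleMean = (n) => mean(Array.from({ length: n }, drawFromPopulation));

// Repeated sampling: collect the mean of many independent samples.
const sampleSize = 30;
const repetitions = 2000;
const sampleMeans = Array.from({ length: repetitions }, () => sampleMean(sampleSize));

// Distribution of sample means: centered near the population mean, with a
// spread close to (population sd) / sqrt(sample size).
console.log('mean of sample means:', mean(sampleMeans));
console.log('sd of sample means:', sd(sampleMeans));
console.log('expected sd, 1 / sqrt(n):', 1 / Math.sqrt(sampleSize));
```

The `sampleMeans` array is the kind of data a histogram would be bound to. Each idea in the list maps to one step of the script.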
Ideas → Data
What data expresses your idea?
"What is hierarchical modeling?"
Hierarchical Modeling
What data expresses these ideas?
Source code available here
Data Generation Demo
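To show what data expressing a hierarchical idea might look like, here is a minimal sketch of generating grouped observations. It is not the source code linked from the slide. The group names, the parameter values, and the random-intercept structure are assumptions made for illustration.

```javascript
// Small seeded generator (mulberry32) so the sketch is reproducible; any RNG would do.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const random = mulberry32(7);

// Normal draw via the Box-Muller transform.
const randomNormal = (mu = 0, sigma = 1) => {
  const u = 1 - random(); // in (0, 1], which avoids log(0)
  const v = random();
  return mu + sigma * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
};

// Hypothetical parameters: every group shares one slope but has its own intercept.
const groupNames = ['A', 'B', 'C', 'D'];
const overallIntercept = 50;
const slope = 2;
const betweenGroupSd = 10; // how much group intercepts vary around the overall intercept
const withinGroupSd = 3;   // how much observations vary around their group's line
const perGroup = 25;

// Level 2: draw one intercept per group.
const groups = groupNames.map((name) => ({
  name,
  intercept: overallIntercept + randomNormal(0, betweenGroupSd),
}));

// Level 1: draw observations around each group's own line.
const hierarchicalData = groups.flatMap((group) =>
  Array.from({ length: perGroup }, () => {
    const x = random() * 10;
    const y = group.intercept + slope * x + randomNormal(0, withinGroupSd);
    return { group: group.name, x, y };
  })
);

console.log(groups);
console.log(hierarchicalData.slice(0, 3));
```

Each row carries its group label, so the same array can be drawn as one cloud of points or split by group to show the nested structure.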
Data → Algorithm
What algorithm enables you to express your data?
"What is conditional probability?"
Bouncing
What algorithms are necessary to express this data?
// Select circles inside the SVG and bind data to the selection
var bubbles = mySvg.selectAll('circle')
    .data(myData);

// Use D3.js to create and position the entering circles
bubbles
    .enter()
    .append('circle')
    .attr('cx', (d) => xScale(d.x))
    .attr('cy', (d) => yScale(d.y))
    .attr('r', radius)
    // Merge entering circles with updating ones and stage a transition
    .merge(bubbles)
    .transition()
    .delay(() => Math.random() * 50)
    .ease(d3.easeBounce)
    .attr('cx', (d) => xScale(d.x))
    .attr('cy', (d) => yScale(d.y));
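The snippet above assumes a `myData` array whose items have `x` and `y` fields. For the conditional probability question, one possible shape for that data is a set of random points tagged by the events they fall into. The sketch below is an assumption about such data, not the talk's actual dataset: the event ranges for `inA` and `inB`, the point count, and the seeded generator are all hypothetical.

```javascript
// Small seeded generator (mulberry32) so the sketch is reproducible; any RNG would do.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const random = mulberry32(2017);

// Hypothetical events, defined as horizontal ranges that overlap on [0.4, 0.6).
const inA = (d) => d.x >= 0.2 && d.x < 0.6;
const inB = (d) => d.x >= 0.4 && d.x < 0.9;

// Uniform random points in the unit square, one object per circle.
const myData = Array.from({ length: 5000 }, (_, id) => ({
  id,
  x: random(),
  y: random(),
}));

// P(B | A) is the share of points in A that are also in B.
const countA = myData.filter(inA).length;
const countAandB = myData.filter((d) => inA(d) && inB(d)).length;
const pBgivenA = countAandB / countA;

console.log('P(A) is about', countA / myData.length);
console.log('P(B | A) is about', pBgivenA);
```

With these ranges the exact values are P(A) = 0.4 and P(B | A) = 0.2 / 0.4 = 0.5, so the simulated estimates should land close to those numbers.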
Bouncing
What algorithms are necessary to express this data?
// D3.js bounceOut algorithm (the easing behind d3.easeBounce)
var b1 = 4 / 11,
    b2 = 6 / 11,
    b3 = 8 / 11,
    b4 = 3 / 4,
    b5 = 9 / 11,
    b6 = 10 / 11,
    b7 = 15 / 16,
    b8 = 21 / 22,
    b9 = 63 / 64,
    b0 = 1 / b1 / b1;

export function bounceOut(t) {
  return (t = +t) < b1 ? b0 * t * t :
      t < b3 ? b0 * (t -= b2) * t + b4 :
      t < b6 ? b0 * (t -= b5) * t + b7 :
      b0 * (t -= b8) * t + b9;
}
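To see what the easing function does, it helps to evaluate it directly. In a D3 transition, normalized time `t` (from 0 to 1) is passed through the easing function, and the eased value drives the interpolation between the start and end attribute values. The sketch below repeats the function without `export` so it runs as a plain script. The `startY` and `endY` pixel values and the `positionAt` helper are hypothetical.

```javascript
// Constants and easing function as in the snippet above, without `export`.
const b1 = 4 / 11,
  b2 = 6 / 11,
  b3 = 8 / 11,
  b4 = 3 / 4,
  b5 = 9 / 11,
  b6 = 10 / 11,
  b7 = 15 / 16,
  b8 = 21 / 22,
  b9 = 63 / 64,
  b0 = 1 / b1 / b1;

function bounceOut(t) {
  return (t = +t) < b1 ? b0 * t * t :
    t < b3 ? b0 * (t -= b2) * t + b4 :
    t < b6 ? b0 * (t -= b5) * t + b7 :
    b0 * (t -= b8) * t + b9;
}

// Hypothetical start and end positions (in pixels) for a circle's cy attribute.
const startY = 0;
const endY = 300;

// Position at normalized time t: linear interpolation driven by the eased time.
const positionAt = (t) => startY + (endY - startY) * bounceOut(t);

// The circle first reaches the end position at t = 4/11, then rebounds
// in progressively smaller arcs until it settles at t = 1.
for (const t of [0, 0.2, 4 / 11, 0.5, 8 / 11, 0.85, 1]) {
  console.log(t.toFixed(3), positionAt(t).toFixed(1));
}
```

Each branch of the function is a parabola: the first accelerates toward the target like a falling object, and the remaining three model the shrinking bounces.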
All 3 stages need to be done well
Concepts → Ideas
Ideas → Data
Data → Algorithm
In order to visualize concepts, we need to isolate specific ideas, identify underlying data structures, and leverage corresponding algorithms.
Thank you
All materials available at mfviz.com/odsc-2017
@mf_viz