Interactive "Ibry" + MetroNorth
+ framework for self learning dataviz
Cameron Yick
DVS NYC Chapter - June 2019 Edition
![](https://cl.ly/6db91c3df934/Image%252525202019-06-23%25252520at%252525204.57.05%25252520PM.png)
Outline
1. Motivation:
Why build Ibry (Marey) Charts of MetroNorth Data?
2. Process:
A Prioritization Framework for Side Learning Projects
3. Demo (🚂)
4. Recap (Slides will be shared)
- 🏊♂️ Swimmer
- Season Progress
- Splits, fatigue rate, DPS, SPS
- ⚡️ Studying Electrical Engineering
- Signal processing / brain data (MATLAB)
- 🧠 Studying Cognitive Science / Design
- Visual perception
- Motivation (game design, menus, playgrounds)
- 🕵️♂️ Data Engineer / Analyst
- Data quality (esp. "public data") over time
- 📊 Software Engineer for Data Visualization Product
- Reusability / Performance (Interactivity / Density)
- Exploratory vs Tactical/Operational Tooling
My Visualization Past
#location-nyc
![](https://cl.ly/f94b0d259ddb/Image%252525202019-06-25%25252520at%252525204.23.48%25252520PM.png)
Data collected / analyzed / visualized by Stephanie Coker
Outline
1. Motivation:
Why build Ibry Charts of MetroNorth Data?
2. Process:
A Prioritization Framework for Side Learning Projects
3. Demo
4. Recap
![](https://images-na.ssl-images-amazon.com/images/I/41IXJO4Mk6L.jpg)
![](https://www.3cs.ch/wp-content/uploads/marey-train-schedule.jpg)
E.J. Marey's rendition (Concept by Charles Ibry)
![](https://sandrarendgen.files.wordpress.com/2019/03/bildschirmfoto-2019-03-14-um-13.46.34.png)
1847 Traffic Engineering: Constructing, not just displaying
(via Sandra Rendgen's Post)
![](https://pasarelapr.com/images/mnr-map/mnr-map-20.gif)
![](https://upload.wikimedia.org/wikipedia/commons/thumb/7/72/NY_Metropolitan_Area.png/1920px-NY_Metropolitan_Area.png)
![](https://cdn.theatlantic.com/assets/media/img/posts/2019/01/public_transit-2/dfc61f299.png)
MetroNorth enables 2 of the top 7!
2017 U.S. Census American Community Survey
Karen King's CityLab Analysis
Graph via Vega-Lite Editor
Just 7 US Metros where 10%+ workers take public transit to work.
Existing Train Tables
![](https://pbs.twimg.com/media/DfhQ5vxWsAEvFrx.jpg)
(See Physical Handouts)
Checkpoint
1. Motivation:
Why Build an Ibry Chart of MetroNorth Transit Data?
2. Process:
My Prioritization Framework for Side Learning Projects
3. Demo
4. Recap
Why Sharing Learning Approaches Matters in Data Visualization
Many (most?) practitioners are self-taught!
![](https://cl.ly/ea84f9f78a03/Image%252525202019-06-23%25252520at%2525252010.27.32%25252520PM.png)
![](https://cl.ly/e9c5045d4086/Image%252525202019-06-23%25252520at%2525252010.55.01%25252520PM.png)
Data: github.com/data-visualization-society/data_visualization_survey
Survey by Elijah Meeks, data rendered in Tableau
Cathy O'Neil - most important lesson in data science
![](https://images-na.ssl-images-amazon.com/images/I/51V3piRZY4L.jpg)
![](https://covers.oreillystatic.com/images/0636920028529/lrg.jpg)
How can we keep people in the room?
Obstacles to Self-Directed Learning
- Common fears
- What if the result is bad?
- What if I waste time?
- Overwhelmed by possibilities
- Can happen at start, in the middle...
- 🐰 Rabbit Holes
Taming Rabbit Holes
- Tutorial blog post you started with
- Javascript / Python Syntax Cheatsheet
- StackOverflow Answers
- Links from Slack
- Pudding / 538 / NYT / Kantar IIB Awards
- Property "x" does not exist
- Github issues open since 2015
- undefined is not a function
- New charting library
- What is this React thing?
- Property "y" is not a date
- etc...
![](https://cl.ly/3a1e3a5ebd10/Image%252525202019-06-24%25252520at%2525252010.01.57%25252520PM.png)
Make it run,
Make it right,
Make it fast.
![](https://media3.giphy.com/media/HuVCpmfKheI2Q/giphy.gif)
![](https://media3.giphy.com/media/YSB6SGgVKmfGpPfGyl/giphy.gif)
![](https://i.kinja-img.com/gawker-media/image/upload/s--8YpFiwyo--/c_scale,f_auto,fl_progressive,q_80,w_800/rmmbygnafoink4yexlzr.jpg)
- Kent Beck (Software Engineer)
Skipping Steps?
Who needs "first drafts" / "validation"?
Agile - not just for work
![](https://vignette.wikia.nocookie.net/simpsons/images/0/05/TheHomer.png/revision/latest?cb=20090908145331)
- Sandi Metz (Rubyist)
- Purpose of design is to enable doing design later
- Goal of design is to reduce the cost of change
Design and Code Quality Extras
- Uncle Bob (Robert Martin) - "Clean Architecture"
- Defer architectural choices as late as possible
- You'll have more information later, which will help with making better decisions
Over time, drafts get cleaner, but the ceiling moves
Overcoming Obstacles with Remakes
- Common fears
- What if the result is bad?
- What if I waste time?
- Overwhelmed by possibilities
- Can happen at start, in the middle...
- 🐰 Rabbit Holes
You already like the idea
The mantra gives you milestones
You're guaranteed to learn
Making it (run|right|fast) in practice
Make it Run: Overview
1. Find existing code
2. Find your own data
3. Adapt (2) to fit (1) (or vice versa)
Goal:
Decide quickly if spending more time is worth it!
Making it Run - Find Code + Test Drive
![](https://cl.ly/a672062f4d5f/Image%2525202019-06-24%252520at%25252010.58.21%252520PM.png)
Google, Slack, Pinterest, Blockbuilder-Search (Links Below)
Prior Art - Code
- San Francisco - Mike Bostock
- MBTA (Massachusetts) - Twitter Viz
- Explore variety of transit data
- Metra (Chicago) - Nicholas Rougeux
- Examine impact of removing some lines
Prior Art - History
- Tufte Bulletin on Graphical Timetables
- Sandra Rendgen - From Paris with Love (1845)
- Charles Ibry Chart not Marey Chart
- (Rename project?)
- (h/t Jason Forrest)
Prior Art - Usecases
- Singapore Open Data - Catching a Rogue Train
- Analyzing the Flow of Work with Marey Charts (Agile)
- Reading patterns
- Attend to slope of line
- Attend to what lines are in parallel
- Attend to whether segments overall line up
- TrainVis Student Final Project
Make it Run: Data
- Finding MTA data
- 26 Hours in a day
- Data Enrichment/parsing/cleaning
- Vincenty Distance (ellipsoidal, not spherical)
- Precompute Slopes
- Jupyter Notebook for Route 3 - Link (pre-open)
- Python environment to mix code + docs
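Two of the bullets above lend themselves to a short sketch. Assuming the MTA files follow the GTFS convention, a trip that runs past midnight keeps counting hours upward (for example `25:10:00`), which is where the "26 hours in a day" comes from. A line's slope on the chart is then distance over time between consecutive stops. The helper names and the sample trip below are made up for illustration; this is not the notebook's actual code.

```python
from typing import List, Tuple


def gtfs_time_to_seconds(value: str) -> int:
    """Convert an 'HH:MM:SS' string to seconds since the service day began.

    Hours may exceed 23 for trips that continue past midnight.
    """
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    return hours * 3600 + minutes * 60 + seconds


def segment_slopes(stops: List[Tuple[str, float]]) -> List[float]:
    """Return km per hour for each pair of consecutive stops on one trip.

    `stops` holds (time string, cumulative distance in km), in stop order.
    """
    slopes = []
    for (t0, d0), (t1, d1) in zip(stops, stops[1:]):
        elapsed_hours = (gtfs_time_to_seconds(t1) - gtfs_time_to_seconds(t0)) / 3600
        # Two stops sharing a timestamp would divide by zero, so mark them as NaN.
        slopes.append((d1 - d0) / elapsed_hours if elapsed_hours > 0 else float("nan"))
    return slopes


# Hypothetical late-night trip that crosses midnight.
trip = [("23:40:00", 0.0), ("24:10:00", 30.0), ("25:10:00", 60.0)]
print(segment_slopes(trip))
```

Precomputing the slopes once in the notebook keeps that arithmetic out of the browser.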
![](https://cl.ly/fe4cdb83a612/%255B38b2945a269bdae36c092cd209696f56%255D_Image%2525202019-06-24%252520at%25252010.43.20%252520PM.png)
MTA Data - 10 TXT Files
![](https://cl.ly/bd650c2c429d/Image%2525202019-06-24%252520at%25252010.31.13%252520PM.png)
Data Quality with Missingno
![](https://cl.ly/d3e96af0afa0/Image%252525202019-06-24%25252520at%2525252010.51.47%25252520PM.png)
![](https://cl.ly/f6056db27a4d/Image%252525202019-06-24%25252520at%2525252010.52.36%25252520PM.png)
Vincenty Distance Data Quality Check
![](https://cl.ly/1a2a9da9ebee/Image%252525202019-06-24%25252520at%2525252010.54.55%25252520PM.png)
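For readers who have not met Vincenty distance: it measures the distance between two latitude/longitude points on an ellipsoid rather than on a sphere. The slides do not say which library the notebook used, so what follows is a plain-Python sketch of Vincenty's inverse formula on the WGS-84 ellipsoid. The function name, tolerance, and sample coordinates are assumptions for illustration.

```python
import math

# WGS-84 ellipsoid parameters
A = 6378137.0            # semi-major axis, metres
F = 1 / 298.257223563    # flattening
B = (1 - F) * A          # semi-minor axis, metres


def vincenty_distance_m(lat1, lon1, lat2, lon2, max_iter=200, tol=1e-12):
    """Ellipsoidal distance in metres between two points given in degrees."""
    if lat1 == lat2 and lon1 == lon2:
        return 0.0
    # Reduced latitudes and longitude difference.
    u1 = math.atan((1 - F) * math.tan(math.radians(lat1)))
    u2 = math.atan((1 - F) * math.tan(math.radians(lat2)))
    big_l = math.radians(lon2 - lon1)
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)

    lam = big_l
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cos_u2 * sin_lam,
                               cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam)
        if sin_sigma == 0:
            return 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos2_alpha = 1 - sin_alpha ** 2
        # cos2_alpha is zero only for lines along the equator.
        cos_2sm = cos_sigma - 2 * sin_u1 * sin_u2 / cos2_alpha if cos2_alpha else 0.0
        c = F / 16 * cos2_alpha * (4 + F * (4 - 3 * cos2_alpha))
        lam_prev = lam
        lam = big_l + (1 - c) * F * sin_alpha * (
            sigma + c * sin_sigma * (cos_2sm + c * cos_sigma * (-1 + 2 * cos_2sm ** 2)))
        if abs(lam - lam_prev) < tol:
            break
    else:
        # Nearly antipodal points can fail to converge.
        raise ValueError("Vincenty formula failed to converge")

    usq = cos2_alpha * (A ** 2 - B ** 2) / B ** 2
    big_a = 1 + usq / 16384 * (4096 + usq * (-768 + usq * (320 - 175 * usq)))
    big_b = usq / 1024 * (256 + usq * (-128 + usq * (74 - 47 * usq)))
    delta_sigma = big_b * sin_sigma * (cos_2sm + big_b / 4 * (
        cos_sigma * (-1 + 2 * cos_2sm ** 2)
        - big_b / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2) * (-3 + 4 * cos_2sm ** 2)))
    return B * big_a * (sigma - delta_sigma)


# Two made-up points, used only to exercise the function.
print(vincenty_distance_m(40.75, -73.98, 41.05, -73.54))
```

One way to use this as a quality check is to flag consecutive stops whose computed distance is implausibly large or small for a commuter line.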
- 1 Giant file
![](https://cl.ly/791d3c1e7958/Image%252525202019-06-24%25252520at%2525252011.48.07%25252520PM.png)
(historical re-enactment w/ 2019 data)
Speedbumps
- Where to parametrize?
- Showed coworker different line
- Hardcoded separate file
- Magic numbers
- Global variables
- CPU fan working hard 🚀
- Visual Noise
- Updating the DOM was verbose/tricky
It Runs - We're Done? 🏁
- MVP January 2018 (D3 V4)
- Unsure about next steps
- Filtering
- Changing source
- (Data-view relationship)
- Skills / Goals Gap
- Plan
- Bridge gap, then return
- OR wait for a need
The Return: Making it "Right"
- Revisited April 2019
- What Changed?
- New tools (React/Redux/Typescript/Parcel)
- Motivated by "run" problems
- Copied other things
- Practiced teaching (writing/in person)
- DVS #historical-viz / #location-nyc
Making it "Right" - 2 Hats
- Design
- Chasing a moving target
![](https://images-na.ssl-images-amazon.com/images/I/81StjJ9-goL._UX569_.jpg)
- Technical
- It already "runs"
- Enabling Change
- Wrangling State (Power Cycle)
Defining "Right"
- Reusability with React
- Declarative / Modular
- Contain state
- Safety with Typescript / Redux Dataflow
- Smart Spellcheck + Complete (demo)
- Visualizations Enabling Visualization
- Time Travel
- Livecode Debugging
- Fast Feedback!
Evolution (Sessions)
- Starting: Pure HTML / JS / D3, with Prepros to bundle
- Prepros out, Parcel in
- Break up large file into reusable functions
- Commit to React, replace Parcel with CRA
- Add "rescripts" for HMR (demo)
- Incremental SVG -> Canvas with Konva
- (Managed positioning bugs)
- Redux in a single file (copy from prev app)
- Add Rewired-typescript to type reducer/actions
- Use reselect
- Rewrite reducer with typesafe-actions
- React-hooks / redux hooks (released last month)
- Future - web workers, gatsby, svelte, elm?
Defining Right
- Where do trains usually bunch up?
- How many trains is the MTA managing right now? (Empathy)
- Assess options to get from (A,B,C) to (F)
...
![](https://images-na.ssl-images-amazon.com/images/I/81StjJ9-goL._UX569_.jpg)
Too many trains to visualize all at once! (Common Q)
Future: commuter (me) vs Train Engineer
Picking Features
![](https://images-na.ssl-images-amazon.com/images/I/81StjJ9-goL._UX569_.jpg)
Fun to make
Useful to have
Error handling
Login form
Xenographics*
Exit animations*
👌
(not to scale)
"Sweet Spot"
Gradients*
Date parsing
Selected Features
- Data source toggle
- IBM Carbon
- Widen Audience
- Station Filter
- Redux + Typescript
- typesafe-actions
- Reduce noise
- Time Filter (Brush)
- Integrating d3-brush with React
- @vx/brush (framework pivot)
- Direct Manipulation (Shneiderman)
![](https://cl.ly/19eb112fb911/Image%252525202019-06-25%25252520at%2525252012.35.03%25252520AM.png)
- Fun
- Useful
Make it Fast (Optional 🐰!)
- (No extra work) - React Fiber Updates
- Caching in Redux with Reselect
- "Recently Visited" Shelf @ Library
- SVG + Canvas Together
- SVG for axes
- Canvas for workhorse elements
- React-Konva
- Future
- Optimize data structures (~50k stops)
- Web workers
- (Very easy to get distracted with optimizations)
- Talk to me later for details!
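The "Recently Visited" shelf idea behind selector caching can be sketched in a few lines: remember the last inputs and the last result, and only recompute when an input changes. This is a Python analogy with hypothetical names, not Reselect's actual API, and the sample stops are made up. It keeps a single cached entry and compares inputs by identity.

```python
from typing import Any, Callable, Sequence


def make_cached_selector(input_selectors: Sequence[Callable[[Any], Any]],
                         compute: Callable[..., Any]) -> Callable[[Any], Any]:
    """Build a selector that reuses its last result while its inputs are unchanged."""
    last_inputs = None
    last_result = None

    def selector(state):
        nonlocal last_inputs, last_result
        inputs = tuple(pick(state) for pick in input_selectors)
        unchanged = last_inputs is not None and all(
            new is old for new, old in zip(inputs, last_inputs))
        if not unchanged:
            # Cache miss: run the expensive derivation and remember it.
            last_result = compute(*inputs)
            last_inputs = inputs
        return last_result

    return selector


compute_calls = []


def filter_stops(stops, station):
    """Stand-in for an expensive derivation over tens of thousands of stops."""
    compute_calls.append(station)
    return [stop for stop in stops if stop["station"] == station]


select_visible_stops = make_cached_selector(
    [lambda state: state["stops"], lambda state: state["station"]],
    filter_stops,
)

state = {
    "stops": [{"station": "Stamford", "time": "08:00:00"},
              {"station": "Harlem-125th", "time": "08:45:00"}],
    "station": "Stamford",
}
first = select_visible_stops(state)
# An unrelated change (a new top-level key) leaves both inputs untouched.
second = select_visible_stops({**state, "hovered": 3})
# Changing the selected station forces a recompute.
third = select_visible_stops({**state, "station": "Harlem-125th"})
print(len(compute_calls))
```

The payoff is that unrelated state updates, such as hover events, do not trigger a full re-filter of the stop list.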
Checkpoint
1. Motivation:
Why Build a Marey Chart of MetroNorth Transit Data?
2. Process:
A Prioritization Framework for Side Learning Projects
3. Demo
- Filters, Brushing, HMR, Devtools, Time-Travel
4. Next Steps / Recap
Next Steps
- Pull requests welcome!
- Sync state to URL (make settings shareable)
- Mobile friendly / responsive
- Investigate late night data issues
- Filter trips by direction
- Replace Jupyter pipeline
- Layer in true "lateness" data
Next Steps
- UI overhaul
- Thoughtful control panel
- "Admin" visual variables / canvas size
- Tooltips
- MVP: onClick to console
- Fare, track #, pricing zone
- Marginal plots (trip count per time instant)
- Express vs semi-express vs local trips
- Animation / transitions
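The marginal plot idea above (trip count per time instant) reduces to counting how many trips are in progress at each sampled moment. A minimal sketch, assuming each trip has already been reduced to a (departure, arrival) pair in seconds since the start of the service day; the function name and sample trips are hypothetical.

```python
from typing import Dict, List, Tuple


def active_trip_counts(trips: List[Tuple[int, int]],
                       start: int, end: int, step: int) -> Dict[int, int]:
    """Count trips in progress at each sampled instant from start to end, inclusive."""
    counts = {}
    for instant in range(start, end + 1, step):
        counts[instant] = sum(
            1 for departure, arrival in trips if departure <= instant <= arrival)
    return counts


# Three made-up trips, sampled every 30 minutes.
sample_trips = [(0, 3600), (1800, 5400), (7200, 9000)]
print(active_trip_counts(sample_trips, start=0, end=9000, step=1800))
```

The same counts would also speak to the earlier question of how many trains are being managed at a given moment.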
![](https://images-na.ssl-images-amazon.com/images/I/81StjJ9-goL._UX569_.jpg)
Takeaways
1. Motivation: Remake classics with modern data
- "Ibry Charts" reveal patterns tables don't
- MetroNorth train schedules affect millions
![](https://images-na.ssl-images-amazon.com/images/I/81StjJ9-goL._UX569_.jpg)
2. Process: This works for me, YMMV
- Frameworks help manage 🐰 holes + keep you sane!
- Make it run, make it right, make it fast (use both hats)
3. Demo:
- Try it / share with a fellow commuter!
- Try it on other movement data (races, agile, progress, etc)
Interactive "Ibry" Charts / Structuring Self-Learning
Twitter / Github: @hydrosquall
Blog: serendipidata.com (Writeup coming soon)
Thanks!
![](https://images-na.ssl-images-amazon.com/images/I/81StjJ9-goL._UX569_.jpg)