The Changing Landscape of Data Visualization Tools in Large Organizations
By Justin Gosses
April 26th, 2017
link to slides https://slides.com/justingosses/history_data_visualization_tools/
Hard to talk about just data visualization tools
Analytics and other tasks sometimes get done by the same software as data visualization.
The same four tasks (Data Transfer, Data Cleaning, Analytics, Data Visualization) can be divided up in three ways:
- Everything done via code
- Tasks done by separate software & code (separate GUIs, code library)
- Single software (GUI used for some tasks, code inside the GUI for others)
There is a large range of software and code libraries for data visualization. Putting them all into context is difficult.
History presents one way to think about this complexity
Digital data visualization rides changes in:
- Processing Power
- Code Libraries
- Internet speeds
- Browser standards
- Who creates
Data Visualization tools & computer technology
Mainframe, printed charts (SAS)
Pre-installed spreadsheet-based application on personal computer
Business Intelligence, expensive applications, not Excel
Early web data viz, in JS but not pure JS
Pure JavaScript libraries, web default
A timeline of tool development
mainframe
PCs
Integrated Enterprise IT widespread
Early Internet 1.0
Key Browser Standards
Broadband > Dial Up
WebGL
easier
AR/VR
Github
data viz libraries for desktop
Spreadsheets & Charts
1976-1992 Timeline
- Spreadsheet analytics as way to sell computers
- Pattern: tools developed by small companies, which are then bought by larger companies and provided to users for free with the OS
- Charting secondary to calculation
1976 - SAS: Statistics and calculations on mainframe
1979 - VisiCalc: Early spreadsheet on Apple II
1983 - Lotus 1-2-3: The IBM PC's first "killer app"; combined spreadsheet, charting, and basic database functions
1985 - Excel: Controlled the market since version 5 in 1993
Early internet doldrums
1993-2006 Timeline
- Data Limit Implications
- Data visualization too big for puny internet
- Data often stored on local machines or on-site servers
- Software not pre-installed must be physically delivered, which requires distribution, marketing, big scale.
- Early visualization libraries but not pure-JavaScript
- Start of Business Intelligence tools
1993 - Qlik: Early B.I. tool from Sweden. Stayed small for many years, now much larger.
2001 - ChartDirector: Chart library for many languages but directed at desktops, not web
2003 - FusionCharts: Flash-based chart library, not pure JavaScript
2006 - Tableau: Started as university spin-off, later grew into popular BI tool
Data on the Web is Easier
2006-2016 Timeline
- Key trends (discuss more on next slides)
- Who codes changes
- Internet speeds up
- Browser standards
- Open-source increases rate of development
2000s
2007 - Google Chart API: JS library, but sends data to an API & returns an image of the chart
2007 - Flot charts: Early JS library not using Flash, but limited in chart types
2010 - Protovis: Early JS library that led to the more flexible D3.js
2011 - D3.js: Pure JS library, open-source, direct DOM manipulation, & flexible
2013 - Chart.js: Canvas-based JavaScript data visualization
Who writes code? 2017 compared to the past
1994
1. People with C.S. degrees
2. Hackers
3. People making things in their garages
2016
1. Everyone from 1996
2. A lot of people who know a bit for work, often within another piece of software
3. A bunch of students who were taught in a college class but aren't C.S. students
4. Code bootcamp students
5. Internet-taught
These groups are growing fast!
Stack Overflow 2016 Developer Survey found that the majority don't have a traditional C.S. degree
More people are doing more advanced data visualization, because more people know how to code
Internet speeds are less of a constraint
Many of the data visualizations that load in <1 second today would have taken tens of seconds to download (plus additional time for the browser to display them) years ago
(you can use this calculator to figure out how long your data visualizations would take to download in the past)
Nielsen's "Law" of Bandwidth
Edholm's "Law" of Bandwidth
Both of these use top-of-the-line speeds at the time and show more or less the same thing
The last D3.js visualization I made: 2 minutes in 1998, 5 seconds in 2003, <1 second in 2005
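The download-time arithmetic behind these slides can be sketched in a few lines of Python. This is only an illustration, not the calculator linked from the slide: the 700 KB payload size and the 56 kbit/s starting speed in 1998 are assumed values, the function names are made up for this example, and the 50% yearly growth rate is the figure usually quoted for Nielsen's Law. Real downloads also pay for latency and protocol overhead, which this ignores.

```python
def download_seconds(size_bytes: float, speed_bits_per_sec: float) -> float:
    """Idealized transfer time: payload size in bits divided by line speed."""
    return size_bytes * 8 / speed_bits_per_sec


def nielsen_speed(base_bps: float, base_year: int, year: int,
                  growth: float = 0.5) -> float:
    """Project a high-end connection speed, assuming ~50% growth per year."""
    return base_bps * (1 + growth) ** (year - base_year)


# Assumed, illustrative inputs (not measurements from the talk).
PAYLOAD_BYTES = 700_000   # hypothetical size of a visualization plus its data
BASE_YEAR = 1998
BASE_BPS = 56_000         # a 56 kbit/s dial-up modem

if __name__ == "__main__":
    for year in (1998, 2003, 2008, 2016):
        speed = nielsen_speed(BASE_BPS, BASE_YEAR, year)
        seconds = download_seconds(PAYLOAD_BYTES, speed)
        print(f"{year}: {seconds:8.2f} s at {speed / 1e6:8.2f} Mbit/s")
```

Swapping in the real size of a visualization and a measured connection speed gives a rough sense of how long the same page would have taken to arrive in earlier years.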
A Google Chrome experiment from 2012
Key changes to browsers & web standards influence what data visualizations can do
Past Landscape
Small Data Analytics in Large Organizations Dominated by 3 Types of Tools
Originally, Not Much Between The Islands
Excel
Code
Industry Specific Desktop GUIs
Present Landscape
More Options
Excel
Code
Industry Specific Desktop GUIs
Salesforce
Tableau
D3.js
Chart.js
Vega
QlikView
Domo
cloud-based
platforms as a service
open-source within $ software
Spotfire
Cost pressure?
More libraries
More GUIs have code as option
More GUIs talk with other things
bokeh
templates & add-ons purchased piecemeal
BI tools have R and Python as default instead of VBA
Altair
Microsoft BI
A MORE CROWDED LANDSCAPE WITH MORE HYBRIDS
FASTER TO BUILD THINGS & MORE TOOLS TO PICK FROM
More Hybrids
Writing code is faster
Galleries Save Time
Near-Future Landscape
Excel
Code
Industry Specific Desktop GUIs
A more efficient environment to move data in, with new ways to move into three dimensions
AR / VR
AI
better data engineering, better data architecture, and more machine-readable data
AI removes some grunt work
- More things move to WebGL / OpenGL (over SVG) for speed
(0-2 yr)
Summary
Digital data visualization rides changes in:
- Processing Power
  - Mainframe -> PC
  - Desktop -> Browser
- Code Libraries
  - Flash -> Pure JavaScript
- Internet speeds
  - What % (or size) of datasets can be visualized
- Browser standards = available tools & features
- Who creates = speed & direction of innovation
End
There is a large range of software and code libraries for data visualization. Putting them all into context is difficult.
History presents one way to think about this complexity
Digital data visualization rides changes in:
- Processing Power
- User interfaces
- Code Libraries
- Internet speeds
- Browser standards
- User groups
- VR & AR - more dimensions! but uncharted road
  - We've been working in 2D for hundreds of years as 3D representation was always too expensive
- AI in data prep & chart style selection
  - The return of Clippy? but less annoying?
- WebGL will overtake SVG for animation speed
- Less grunt work
  - better data engineering, better data architecture, and more machine-readable data
- More blending of BI & Data Science & IT (people & tools)
- APIs that talk to APIs that talk to APIs
- Growth of data visualization as a discipline
  - More "data visualization" jobs on LinkedIn now than a year ago
  - Attempts to define what a 100% data visualization job title should be responsible for, but many differing opinions
Near-Future trends (0-2 years)
Changes Currently Pushing New Data Visualization Tool Adoption
more data and increasingly complex data require different tools
data interpretations increasingly need to be shared & not only presented
Internet is faster & use of cloud is normalizing
more competition, more open-source, easier to share examples
Infrastructure
New Tools & New Features
People
more people know how to code
easier to push to larger audiences as more things move to browser/cloud
more real-time / mobile expectations
Data
Task
data visualization being applied in new ways
Near-Future Landscape
(0-2 years)
User- or third-party-generated examples, extensions, templates, and plug-ins
Instead of standing on the shoulders of long-ago giants, you stand on the shoulders of anyone doing similar work, somewhere, right now.
The speed and diversity of new things go way up, for both open-source and $ license-based tools.
Tableau
Petrel
D3.js
Spotfire
Tableau
Spotfire
D3.js
Ruths.ai template
Future Landscape
Excel
Code
Industry Specific Desktop GUIs
Salesforce
Tableau
D3.js
Vega
QlikView
Domo
cloud-based
platforms as a service
open-source within $ software
Spotfire
bokeh
templates & add-ons purchased piecemeal
Altair
Microsoft BI
A MORE CROWDED LANDSCAPE WITH MORE HYBRIDS
FASTER TO BUILD THINGS & MORE WAYS TO VISUALIZE DATA
AR / VR
AI
- better data engineering, better data architecture, and more machine-readable data
- AI removes some grunt work
- More things move to WebGL / OpenGL (over SVG) for speed