The Changing Landscape of Data Visualization Tools in Large Organizations

By Justin Gosses

April 26th, 2017

link to slides https://slides.com/justingosses/history_data_visualization_tools/

 link to timeline

link to browser standards data visualization

Data Transfer

Data Cleaning 

Analytics

Analytics and other tasks sometimes get done by the same software as data visualization

Everything done via code

Data Visualization

Analytics

Data Visualization

Data Transfer

Separate GUIs

code library

Tasks done by separate software & code

Hard to talk about just data visualization tools

Data Transfer

Data Cleaning 

Analytics

Data Visualization

Code inside GUI for these

GUI used for these

Single Software

GUI used for these

Data Cleaning 

There is a large range of software and code libraries for data visualization. Putting them all into context is difficult.

 

History presents one way to think about this complexity

Digital data visualization rides changes in:

  • Processing Power
  • Code Libraries
  • Internet speeds
  • Browser standards
  • Who creates

Data Visualization tools & computer technology

mainframe

printed charts (SAS)

pre-installed spreadsheet based application on personal computer

Business Intelligence, expensive applications, not excel

early web data viz, in JS but not pure JS

pure JavaScript libraries, web default

A timeline of tool development

mainframe

 

PCs

 

Integrated Enterprise IT widespread

Early Internet 1.0

 

Key Browser Standards

Broadband > Dial Up

WebGL

easier

AR/VR

GitHub

data viz libraries for desktop

Spreadsheets & Charts

1976-1992 Timeline

 

  • Spreadsheet analytics as way to sell computers
    • Pattern of tool development in small companies, then bought by larger companies to be provided to user for free with OS
  • Charting secondary to calculation

1976 - SAS: Statistics and calculations on mainframe

1979 - VisiCalc: Early spreadsheet on Apple II

1983 - Lotus 1-2-3: The IBM PC's first "killer app"; combined spreadsheet, charting, and database functions

1985 - Excel: Has controlled the market since version 5 in 1993

Early internet doldrums

1993-2006 Timeline

  • Data Limit Implications
    • Data visualization too big for puny internet
    • Data often stored on local machines or on-site servers
    • Software not pre-installed must be physically delivered, which requires distribution, marketing, big scale.
  • Early visualization libraries but not pure-JavaScript
  • Start of Business Intelligence tools

1993 - Qlik: Early B.I. tool from Sweden. Stayed small for many years, now much larger.

2001 - ChartDirector: Chart library for many languages but directed at desktops, not web

2003 - FusionCharts: Flash-based chart library, not pure JavaScript

2006 - Tableau: Started as university spin-off, later grew into popular BI tool

Data on Web is Easier

2006-2016 Timeline

  • Key trends (discuss more on next slides)
    • Who codes changes
    • Internet speeds up
    • Browser standards
    • Open-source increases rate of development

2007 - Google Chart API: JS library but sends data to an API & returns an image of the chart

2007 - Flot charts: Early JS library not using Flash, but limited in chart types

2010 - Protovis: Early JS library that led to the more flexible D3.js

2011 - D3.js: Pure JS library, open-source, direct DOM manipulation, & flexible

2013 - Chart.js: Canvas-based JavaScript data visualization

1994

2016

Who writes code?

1. People with C.S. degrees

2. Hackers

3. People making things in their garages

2017 Compared to past

1. Everyone from 1996

2. A lot of people who know a bit of code for work, often within another piece of software

3. A bunch of students who were taught in a college class but aren't C.S. students

4. Code bootcamp students

5. Internet-taught coders

These groups are growing fast!

Stack Overflow 2016 Developer Survey found that the majority don't have a traditional C.S. degree

More people are doing more advanced data visualization because more people know how to code

Internet speeds less of a constraint

Many of the data visualizations that load in <1 second today would have taken tens of seconds to download years ago (plus additional time for the browser to display them)

(you can use this calculator to figure out how long your data visualizations would take to download in the past) 

Nielsen's "Law" of Bandwidth

Edholm's "Law" of Bandwidth

both of these use top-of-the-line speeds at the time and show more or less the same thing
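Nielsen's Law is usually stated as high-end connection speed growing by roughly 50% per year. A minimal Python sketch of that compounding follows. The 50% rate is the commonly quoted figure, and the 1 Mbps starting speed is a hypothetical value chosen for illustration; neither number comes from these slides.

```python
def projected_bandwidth(base_mbps: float, years: int, annual_growth: float = 0.5) -> float:
    """Compound a starting bandwidth forward by a fixed yearly growth rate.

    annual_growth=0.5 is the commonly quoted 50%-per-year figure (an assumption here).
    """
    return base_mbps * (1.0 + annual_growth) ** years


# Hypothetical starting point of 1 Mbps, projected 0, 5, and 10 years ahead.
for years in (0, 5, 10):
    print(f"after {years:2d} years: {projected_bandwidth(1.0, years):8.2f} Mbps")
```

At this rate, speed grows by a bit more than 57x over a decade, which is why a visualization that was impractical to ship over the web can become routine within a few years.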

The last D3.js visualization I made: 2 minutes in 1998, 5 seconds in 2003, <1 second in 2005
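The arithmetic behind this kind of estimate is simply file size divided by connection speed. A rough Python sketch is below. The 500 KB payload and the three connection speeds are hypothetical values picked for illustration, not the figures behind the numbers above, and the calculation ignores latency and browser rendering time.

```python
def download_seconds(size_bytes: int, speed_bits_per_second: float) -> float:
    """Return the ideal transfer time: payload size in bits divided by line speed."""
    return size_bytes * 8 / speed_bits_per_second


# Hypothetical payload: a visualization page plus its data, about 500 KB.
PAGE_SIZE_BYTES = 500_000

# Hypothetical connection speeds in bits per second.
CONNECTIONS = {
    "56 kbps dial-up": 56_000,
    "1.5 Mbps DSL": 1_500_000,
    "25 Mbps broadband": 25_000_000,
}

for name, speed in CONNECTIONS.items():
    print(f"{name}: {download_seconds(PAGE_SIZE_BYTES, speed):.2f} s")
```

With these assumed inputs the same page goes from over a minute on dial-up to a fraction of a second on broadband.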

A Google Chrome experiment from 2012

Key changes to browsers & web standards influence what data visualizations can do

Past Landscape

Small Data Analytics in Large Organizations Dominated by 3 Types of Tools

Originally, Not Much Between The Islands

Excel

Code

Industry Specific Desktop GUIs

Present Landscape

More Options

Excel

Code

Industry Specific Desktop GUIs

Salesforce

Tableau

D3.js

Chart.js

Vega

QlikView

Domo

cloud-based

platforms as a service 

open-source within $ software

Spotfire

Cost pressure?

More libraries

More GUIs have code as option

More GUIs talk with other things

Bokeh

templates & add-ons purchased piecemeal

BI tools have R and Python as default instead of VBA

Altair

Microsoft BI

A MORE CROWDED LANDSCAPE WITH MORE HYBRIDS

 FASTER TO BUILD THINGS & MORE TOOLS TO PICK FROM

More Hybrids

 Writing code is faster

Galleries Save Time

Near-Future Landscape

Excel

Code

Industry Specific Desktop GUIs

A more efficient environment to move data in, with new ways to move into 3D

AR / VR

AI

better data engineering, better data architecture, and more machine-readable data

AI removes some grunt work

  • More things move to WebGL / OpenGL (over SVG) for speed

(0-2 yr)

Summary

Digital data visualization rides changes in:

  • Processing Power
    • Mainframe -> PC
    • Desktop -> Browser
  • Code Libraries
    • Flash -> Pure JavaScript
  • Internet speeds
    • What % (or size) of datasets can be visualized 
  • Browser standards =  available tools & features
  • Who creates = speed & direction of innovation

End

There is a large range of software and code libraries for data visualization. Putting them all into context is difficult.

 

History presents one way to think about this complexity

Digital data visualization rides changes in:

  • Processing Power
  • User interfaces
  • Code Libraries
  • Internet speeds
  • Browser standards
  • User groups

Near-Future trends (0-2 years)

  • VR & AR - more dimensions! but uncharted road
    • We've been working in 2D for hundreds of years as 3D representation was always too expensive
  • AI in data prep & chart style selection
    • The return of Clippy? but less annoying?
  • WebGL will overtake SVG for animation speed
  • Less grunt work
    • better data engineering, better data architecture, and more machine-readable data
    • More blending of BI & Data Science & IT (people & tools)
    • APIs that talk to APIs that talk to APIs
  • Growth of data visualization as a discipline
    • More "data visualization" jobs on LinkedIn now than a year ago
    • Attempts to define what a 100% data visualization job title should be responsible for, but many differing opinions

Changes Currently Pushing New Data Visualization Tool Adoption

more data and increasingly complex data require different tools

data interpretations increasingly need to be shared & not only presented

Internet is faster & use of cloud is normalizing

more competition, more open-source, easier to share examples

Infrastructure

New Tools & New Features

People

more people know how to code

easier to push to larger audiences as more things move to browser/cloud

more real-time / mobile expectations

Data

Task

data visualization being applied in new ways

User- or 3rd-party-generated examples, extensions, templates, and plug-ins

Instead of standing on the shoulders of long ago giants, you stand on the shoulders of anyone doing similar work, somewhere, right now.

Speed and diversity of new things go way up, for both open-source and $ license-based tools

Tableau

Petrel

D3.js

Spotfire

Tableau

Spotfire

D3.js

Ruths.ai template

Future Landscape

Excel

Code

Industry Specific Desktop GUIs

Salesforce

Tableau

D3.js

Vega

QlikView

Domo

cloud-based

platforms as a service 

open-source within $ software

Spotfire

Bokeh

templates & add-ons purchased piecemeal

Altair

Microsoft BI

A MORE CROWDED LANDSCAPE WITH MORE HYBRIDS

 FASTER TO BUILD THINGS & MORE WAYS TO VISUALIZE DATA

AR / VR

AI

  • better data engineering, better data architecture, and more machine-readable data
  • AI removes some grunt work
  • More things move to WebGL / OpenGL (over SVG) for speed
