Interactive Data Visualisations Built with Python

Jesús Martínez Blanco

Data Scientist

Python

Before we start:

git pull

Jake Vanderplas

PyCon 2017

Visit also Anaconda's PyViz.org

Article elaborating on the diagram above

Data Visualisation:

which type to choose?

Pretty much any chart type is possible in Plotly.

RESOURCES:

  • From Data to Viz: an interactive website with a decision tree for choosing a visualisation type, with links to code snippets.

What is Plotly

A Canadian company building products around data analytics and visualisation tools:

 

  • Charts: Web UI for building plots online.
  • Dashboards: Online dashboards with D3.js Plotly charts.
  • Slide Decks: Powerpoint-like slide decks online that have interactive Plotly charts.
  • Falcon: open-source SQL editor with inline Data visualization.

 

They make money hosting your plots privately (Chart-Studio) and providing consulting and training services.

Website: https://plot.ly

Disclaimer: I do not hold any professional or commercial relationship with Plotly

A famous example

Video: How Does Income Relate to Life Expectancy? (The Gapminder Foundation)

A nice example

Plotly's open source libraries for Data Science

Apart from their paid products, they have open sourced their plotting libraries:

  • Plotly.js: JavaScript library for front-end graphs and dashboards (example here).
  • Plotly for R: the Javascript code is generated from R code.
  • Plotly for Python: the Javascript code is generated from Python code.
  • Dash: Python framework for building analytical web applications (including server side).
  • React: component suite for React web applications.

They are free to use and are fully functional OFFLINE 

(no need to use their servers).

Contains D3.js

Plotly for Python

Write Python code and get interactive plots rendered in the browser.

 

You'll need (see Getting Started with Plotly in Python):

  • Python > 3.5 installation (for example Anaconda distro)
  • pip install plotly kaleido nbformat

Bookmark these (docs):

(kaleido is needed for static image export)

Plotly for Python

If you have Docker, navigate in the terminal to the directory with your notebooks and execute this:

docker run --rm -it -p 1789:8888 -v $PWD:/home/jovyan/HOST chumo/plotly_course
 run : executes the Docker image chumo/plotly_course. 
       If you don't have it in your local machine, 
       it will be downloaded automatically from DockerHub.

--rm : to remove the Docker container after 
       finishing the Jupyter notebook server with CTRL+c.

 -it : to be able to stop the JupyterLab server running on the container.

 -p  : to map the port 8888 internal to the running container 
       (where the notebook is running) to the port 1789 outside the container.

 -v  : to mount a local directory into the 
       directory /home/jovyan/HOST inside the container.

 

... and then see the Jupyter Notebook at http://localhost:1789

use any available port

Are you all set up?

In JupyterLab you may need to install the jupyterlab-plotly extension and to change the default renderer:
import plotly.io as pio
pio.renderers.default = 'jupyterlab'

Plotly from scratch

# Import the library

import plotly as py

Building blocks

# Import the Plotly building blocks

import plotly.graph_objects as go

A Plotly figure is built upon objects from plotly.graph_objects 

and Python dictionaries and lists.

Examples of such objects are:

go.Scatter(x=X, y=Y)

go.Bar(x=X, y=Y)

go.Histogram(x=X)

go.Layout(title='My Title')

It is possible to use Python dictionaries instead:

dict(type='scatter', x=X, y=Y)

dict(type='bar', x=X, y=Y)

dict(type='histogram', x=X)

dict(title='My Title')
# Display the result in the notebook with...
fig.show()

Constructing a figure

import numpy as np
x = np.linspace(0, 2*np.pi)

# Traces
trace0 = dict(type='scatter', x=x, y=np.sin(x))
trace1 = dict(type='scatter', x=x, y=np.cos(x))




# Figure
fig = go.Figure(data=[trace0, trace1]) 

and now with a Layout

import numpy as np
x = np.linspace(0, 2*np.pi)

# Traces
trace0 = dict(type='scatter', x=x, y=np.sin(x), name='sin(x)')
trace1 = dict(type='scatter', x=x, y=np.cos(x), name='cos(x)')

# Layout
layout = dict(
  title='SIN and COS functions',
  xaxis=dict(title='x'),
  yaxis=dict(title='f(x)'),
)

# Figure
fig = go.Figure(data=[trace0, trace1], layout=layout) 
# Syntactic sugar: inside the layout, the nested axis titles
# can also be written with "magic underscores":
#   xaxis_title='x',
#   yaxis_title='f(x)',

display with configurations

# For example, remove the Plotly logo for a cleaner layout,
# and make texts editable:
fig.show(
    config=dict(
        displaylogo=False,
        editable=True,
    ),
)

You can set a configuration to alter how the plot is displayed (see configuration options)

Exercise

Reproduce this:

Subplots

# Contrary to what we saw so far, 
# the figure object with subplots is defined beforehand:

from plotly.subplots import make_subplots

fig = make_subplots(
    rows=1, 
    cols=2, 
    subplot_titles=('f(x) = sin(x)', 'f(x) = cos(x)'), 
    shared_yaxes=True,
)

Modify the fig object

# Once the figure object has been created, 
# we can concatenate .add_ methods (to add specific traces)
# or the .update_ methods (to update existing features):

(
    fig
  
    .add_scatter(...)
    .add_histogram(...)
    .add_box(...)
    .add_bar(...)
  
    .update_xaxes(...)
    .update_yaxes(...)
    .update_layout(...)
    .update_traces(...)
)

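.update_traces(...) accepts a selector parameter as well

In a figure with subplots, the .add_ methods and .update_traces, .update_xaxes and .update_yaxes accept row and col parameters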

Exercise

Reproduce this:

Hints:

  • make_subplots: vertical_spacing, subplot_titles, shared_xaxes
  • layout xaxis range
  • layout xaxis domain
  • layout margin t, r, b, l

Things you can do with the Figure object

# Display it in the notebook
fig.show()  # or fig.show(renderer='png') for a static image
# Export it as static image (done by KALEIDO under the hood)
fig.write_image(file='sin_cos.png', width=700, height=500)
# Create a stand-alone html file
fig.write_html(file='sin_cos.html', include_plotlyjs='cdn')
# or just a <div> element with the plot to embed in your web page
fig.write_html(file='mydiv.html', include_plotlyjs=False, full_html=False)
# Share it via the Plotly cloud platform (CHART STUDIO, with the chart_studio package)
# Introspect its dictionary representation
fig.to_dict()
fig.full_figure_for_development(as_dict=True)  # includes defaults too
# fig.show() and fig.write_html() accept a config parameter
# Use fig.to_html() instead of fig.write_html() to get the raw HTML string

The figure, hosted in Plotly

HTML file with the figure

# Generate the HTML code of the plot in a <div> element:

fig.write_html(
    file='mydiv.html', 
    include_plotlyjs=False,
    config=dict(displaylogo=False),
    full_html=False,
)
# Then you can use the resulting string in your own HTML code:

with open('mydiv.html', 'r') as f:
    div_str = f.read()


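# NOTE: plotly-latest.min.js is frozen at Plotly.js v1.58.5; with a recent plotly
# release, pin the script URL below to the Plotly.js version your plotly package bundles.
html_str = f'''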
<!DOCTYPE html>
<html>

  <head>
    <script src='https://cdn.plot.ly/plotly-latest.min.js'></script>
  </head>

  <body>
    <h1>Simple Dashboard</h1>
    <p>The following plot is static and interactive at the same time ;)</p>

    {div_str}

  </body>

</html>
'''


# The final string can be saved in a file
with open('simple_dashboard.html', 'w') as f:
    f.write(html_str)

Sharing your

Interactive Visualisations

  • Share the stand-alone html generated with fig.write_html()
  • Host your plot in Plotly, after sending it via chart_studio, and use the provided sharing link.
  • Use a hosting service: for example, GitHub Pages (or Netlify if you also want serverless functions available).

Hosting Visualisations in GitHub

  1. Create a repo in your GitHub called, say, SharedPlots.
  2. Upload any .html file created with Plotly to this repository.
  3. Commit your changes to the repository.
  4. Click on the "Settings" tab, go to the "Pages" menu, and set the "main" branch to be served. Press "Save".
  5. Your visualisation is automatically served under the URL:

https://<USERNAME>.github.io/<REPOSITORY>/<FILENAME>.html

Ta-dá !!!

where USERNAME is your GitHub username, REPOSITORY is the name of the repo (SharedPlots in this example) and FILENAME is the html file that was added to the repo.

NOTE 1: if you don't specify <FILENAME>.html, GitHub Pages tries to serve index.html or README.md by default.

NOTE 2: it may take a few minutes to be available online.

Plotly Express

Check:

What is Plotly Express?

  • A wrapper for Plotly.py
  • Complex interactive visualisations with one-liners       (not all chart types are supported yet)
  • Inspired by the Layered Grammar of Graphics
  • Takes a tidy Pandas dataframe as input (from version 4.8, no longer a requirement! See this post)

Grammar of Graphics

  • Leland Wilkinson (1999) publishes a seminal book defining the components (layers) that make up a data visualization.

Grammar of Graphics

image credit: this post

Grammar of Graphics in Plotly Express

import plotly.express as px

# Gapminder dataset bundled with Plotly Express
gapminder = px.data.gapminder()

px.scatter(
    data_frame=gapminder[gapminder.year.isin([1952, 2007])],
    x='gdpPercap',
    y='lifeExp',
    log_x=True,
    color='continent',
    size='pop',
    size_max=60,
    facet_col='year',
    width=800,
    height=500,
    title='Life Expectancy vs. GDP per capita',
)

Layers of the grammar used in this call: data, aesthetics, geometries, facets, coordinates, themes

Grammar of Graphics in Plotly Express

The result:

Tidy DataFrames as input in Plotly Express

Tidy Data according to Hadley Wickham: (original paper)

  • Each variable forms a column and contains values
  • Each observation forms a row
  • Each type of observational unit forms a table

messy (wide-form)

tidy (long-form)

tidy = messy.melt(
    id_vars=['patient'],
    value_vars=['Treatment A', 'Treatment B'],
    var_name='Treatment',
    value_name='Result',
)
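
A runnable sketch of the same melt call, assuming a small made-up messy table (the patient names and result values are invented for illustration):

```python
import pandas as pd

# Made-up wide-form data: one column per treatment
messy = pd.DataFrame({
    'patient': ['Ana', 'Ben', 'Cal'],
    'Treatment A': [1.0, 4.0, 6.0],
    'Treatment B': [2.0, 5.0, 8.0],
})

# Long-form: one row per (patient, treatment) observation
tidy = messy.melt(
    id_vars=['patient'],
    value_vars=['Treatment A', 'Treatment B'],
    var_name='Treatment',
    value_name='Result',
)

print(tidy)
```

The three rows and two treatment columns of the wide table become six rows, one per observation, which is the shape Plotly Express arguments such as color or facet_col work with most naturally.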

Plotly Express powered backend for Pandas plotting

Since Pandas 0.25 it is possible to provide any backend for the .plot plotting API.

Plotly backend (see here) can be set with:

import pandas as pd
pd.options.plotting.backend = 'plotly'

Then you can pass plotly.express parameters to the .plot method of pandas, and get a Plotly figure out of it:

df.plot(
    x='columnA',
    y='columnB',
    log_x=True,
)

Exercise

Reproduce this:

Hints:

  • plotly express scatter: range_x, range_y, facet_row, labels, color_discrete_map, facet_row_spacing
  • After the figure has been created, you can hide the legend with: .update_layout(showlegend=False)

Exercise

Reproduce this:

Hints:

  • plotly express scatter: range_x, range_y, facet_row, labels, color, color_discrete_map, facet_row_spacing

Exercise

Reproduce these:

Hints:

  • plotly express scatter: animation_frame, category_orders
  • plotly express choropleth: locations
  • plotly express box: points

Building dashboards

A few solutions that you can consider:

  1. Bare metal with HTML, CSS and Javascript (Plotly.js).
  2. Plotly Dash: web based & server assisted Plotly plots.
  3. Streamlit: from Python scripts to server backed dashboards.
  4. Pynecone: to build web apps in Python and deploy them.
  5. nicegui: a Python library to build UIs with simple scripts.
  6. Voila: from notebooks to standalone apps and dashboards.
  7. Panel: a high-level app and dashboarding solution for Python.

DASH

pip install plotly dash gunicorn

(gunicorn is required for deployment with Heroku)

minimal DASH app

import pandas as pd
import numpy as np
import plotly.express as px
from dash import Dash, callback, html, dcc, Input, Output


external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']

app = Dash(__name__, external_stylesheets=external_stylesheets)

server = app.server

# Some example data and its corresponding Plotly figure
df = pd.DataFrame(np.random.rand(10,2), columns=['A', 'B'])
fig = px.scatter(df, x='A', y='B')

# Structure of my page
app.layout = html.Div([
    html.H1('A simple dashboard'),
    html.Button('Generate Random Data', id='randomize', n_clicks=0),
    html.Div(id='display-value'),
    dcc.Graph(id="myGraph", figure=fig),
])

@callback(
    Output(component_id='display-value', component_property='children'),
    Output(component_id='myGraph', component_property='figure'),
    Input(component_id='randomize', component_property='n_clicks'),
)
def do_something(n_clicks):
    # Regenerate the figure
    df = pd.DataFrame(np.random.rand(10,2), columns=['A', 'B'])
    fig = px.scatter(df, x='A', y='B')

    return [
        f'You have clicked {n_clicks} times',
        fig,
    ]

if __name__ == '__main__':
    # app.run() replaces app.run_server(), which recent Dash releases no longer provide
    app.run(debug=True)

DASH

see deployment with Heroku for details

  • Make sure your app has the following files at least:

app.py (with the dash code, including server = app.server)

requirements.txt (gunicorn should be included)

Procfile (with the line web: gunicorn app:server)

.gitignore (with files to ignore in Git)

  • Initialize Heroku, add files to Git, and deploy:

heroku create dsr-test # feel free to use any other name
git add . # add all files to git
git commit -m 'Initial app boilerplate'
git push heroku master # deploy code to heroku
heroku ps:scale web=1  # run the app with 1 heroku "dyno"

  • Your app should be available at: https://dsr-test.herokuapp.com . See the logs with:

heroku logs --tail

Alternatives to Plotly

Some more cool stuff

  • Bokeh: from Anaconda (f.k.a. Continuum Analytics).
  • Altair: declarative visualizations based on Vega and Vega-Lite.
  • plotnine: ggplot2 for Python.
  • bqplot: Plotting library for IPython/Jupyter Notebooks.
