Making sense of geospatial data with CartoDB maps

Raquel Alegre

Research Software Development Group

University College London

UCL, 14th March 2016

ShipViz

  • UCL Energy Institute 

  • Tools for visualization and analysis of shipping data

 

  • Objective:
    • Research on the behaviour of the maritime transport sector
      • Response to policy changes
      • Identification of transport routes
      • Environmental impact (GHG emission)
      • ...

ShipViz - Sample use case

 

  • In 2012, the International Maritime Organisation introduced a regulation for Emission Control Areas (ECAs)

  • Identify the number of ships affected by these regulations, and assess the regulations' effectiveness, by analysing the speed, movement and type of ships sailing through these areas.
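
As a rough illustration of this use case, the sketch below counts the vessels that report at least one position inside an area and averages their speed there. It is a minimal sketch under stated assumptions: the `Position` record, the rectangular `AREA` and the sample positions are all made up for illustration, real ECA boundaries are not used, and coordinates are treated as planar (no handling of the antimeridian).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Hypothetical record for one reported ship position."""
    mmsi: int      # vessel identifier
    lon: float     # degrees east
    lat: float     # degrees north
    speed: float   # knots


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    `polygon` is a list of (lon, lat) vertices; the last vertex
    connects back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def ships_in_area(positions, polygon):
    """Return {mmsi: mean speed inside the area} for vessels seen in it."""
    speeds = {}
    for p in positions:
        if point_in_polygon(p.lon, p.lat, polygon):
            speeds.setdefault(p.mmsi, []).append(p.speed)
    return {mmsi: sum(v) / len(v) for mmsi, v in speeds.items()}


# Made-up rectangular area and positions, for illustration only.
AREA = [(-5.0, 48.0), (10.0, 48.0), (10.0, 62.0), (-5.0, 62.0)]
POSITIONS = [
    Position(111, 2.0, 55.0, 12.0),
    Position(111, 3.0, 56.0, 10.0),
    Position(222, -20.0, 40.0, 15.0),
    Position(333, 9.0, 50.0, 8.0),
]

result = ships_in_area(POSITIONS, AREA)
print(result)
```

Comparing such per-vessel speeds before and after a regulation date would be one possible way to look at its effect; the deck does not say which method was used.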

ShipViz + RSDG

 

  • Develop a visualisation tool for ingestion, manipulation and analysis of geo-temporal data
    • Time slider / animation capabilities
    • Filtering options (type of vessel, type of fuel, area, ...)
    • Interactive and responsive
    • Import / Export data (CSV, KML, ...)
    • Secure (data provider restrictions)
    • Easy for the Shipping Group to maintain and build on
    • Produces nice visualisations to inform their community
    • Input SQL queries
    • ... all of this to be completed in around 30 days of work

 

 

  • Very large dataset in a PostgreSQL DB
    • Messages sent by ships all around the world containing:
      • MMSI, position, time, speed, fuel type, etc.
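
To make the shape of these messages concrete, here is a minimal sketch that parses them from CSV text and applies one of the filters mentioned earlier (fuel type). The column names, the `ShipMessage` record and the sample rows are assumptions for illustration; the real schema is not shown in these slides.

```python
import csv
import io
from datetime import datetime
from typing import NamedTuple


class ShipMessage(NamedTuple):
    """Hypothetical record for one message sent by a ship."""
    mmsi: int
    timestamp: datetime
    lon: float
    lat: float
    speed: float       # knots
    fuel_type: str


def parse_messages(csv_text):
    """Parse CSV text with a header row into ShipMessage records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        ShipMessage(
            mmsi=int(row["mmsi"]),
            timestamp=datetime.fromisoformat(row["timestamp"]),
            lon=float(row["lon"]),
            lat=float(row["lat"]),
            speed=float(row["speed"]),
            fuel_type=row["fuel_type"],
        )
        for row in reader
    ]


# Made-up sample rows, for illustration only.
SAMPLE = """mmsi,timestamp,lon,lat,speed,fuel_type
111,2015-01-01T00:00:00,2.0,55.0,12.0,HFO
111,2015-01-01T01:00:00,2.3,55.1,11.5,HFO
222,2015-01-01T00:00:00,-20.0,40.0,15.0,MDO
"""

messages = parse_messages(SAMPLE)

# Filter: which vessels report a given fuel type?
hfo_vessels = sorted({m.mmsi for m in messages if m.fuel_type == "HFO"})
print(len(messages), hfo_vessels)
```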

ShipViz status before work

 

  • Some Python and MATLAB tools for data analysis and visualisation
    • Didn't deal with temporal data
    • Only worked with a small number of data points
    • Time consuming: ingest + extract + process + plot
    • Required some software development skills

 RSDG work on ShipViz

 

  • Research into the available GIS tools that could meet the requirements
    • Best option: CartoDB  

 

 RSDG summary of work

 

  • Custom installation and configuration

 

 

  • Analysis of current status of project
    • Improvement on data storage 

 

 

 

  • Training on usage as well as maintenance and improvements

 RSDG work - Database

 

  • PostgreSQL
    • Open Source relational database
    • More than 1 year's worth of data, cleaned and sampled hourly
      • 15 million data points, 5000 vessels
    • Treats latitudes and longitudes as doubles

 

  • A standard ShipViz query took around 60 minutes to complete
    • A move to Neo4j was being considered to speed this up

 RSDG work - Database

 

  • PostGIS
    • Spatial database extension for PostgreSQL
    • Adds spatial indexing
      • R-trees 
    • Adds support for spatial data types
      • Point, Line, Polygon...
    • Adds support for reprojections
    • Extends PostgreSQL with geo-queries

 

  • A standard ShipViz query takes around 5 minutes to complete
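
The kind of statements involved in moving from latitude/longitude doubles to an indexed PostGIS geometry column might look like the sketch below. The table and column names (`ship_positions`, `areas`, `lon`, `lat`, `geom`) are hypothetical; only the PostGIS functions are standard. The SQL is held in Python strings so it can be passed to any PostgreSQL driver; nothing here connects to a database, and this is not necessarily the exact migration used for ShipViz.

```python
# Hypothetical table and column names (ship_positions, areas, lon, lat, geom);
# the PostGIS functions themselves are standard.

ADD_GEOMETRY = "ALTER TABLE ship_positions ADD COLUMN geom geometry(Point, 4326)"

# ST_MakePoint takes (x, y), i.e. (longitude, latitude); 4326 is WGS 84.
FILL_GEOMETRY = (
    "UPDATE ship_positions "
    "SET geom = ST_SetSRID(ST_MakePoint(lon, lat), 4326)"
)

# PostGIS spatial indexes are built on GiST.
CREATE_INDEX = (
    "CREATE INDEX ship_positions_geom_idx "
    "ON ship_positions USING GIST (geom)"
)

MIGRATION = [ADD_GEOMETRY, FILL_GEOMETRY, CREATE_INDEX]

# Count distinct vessels with a position inside a named area.
# %s is a driver-side placeholder (psycopg2 style), not string formatting.
COUNT_SHIPS_IN_AREA = (
    "SELECT COUNT(DISTINCT p.mmsi) "
    "FROM ship_positions AS p "
    "JOIN areas AS a ON ST_Contains(a.geom, p.geom) "
    "WHERE a.name = %s"
)

for statement in MIGRATION + [COUNT_SHIPS_IN_AREA]:
    print(statement + ";")
```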

 

 

  • PostGIS - Indexing with R-trees
    • Group nearby objects and represent them with their minimum bounding rectangle
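
The idea behind this kind of index can be sketched in a few lines: store one minimum bounding rectangle per group of nearby points, and skip a whole group whenever its rectangle misses the search window. This is a single-level illustration with made-up groups, not PostGIS's implementation; a real R-tree nests rectangles hierarchically and chooses the groups itself.

```python
def bounding_rectangle(points):
    """Minimum bounding rectangle (min_x, min_y, max_x, max_y) of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def rectangles_overlap(a, b):
    """True if two (min_x, min_y, max_x, max_y) rectangles intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def build_index(groups):
    """Precompute one bounding rectangle per group of points."""
    return [(bounding_rectangle(points), points) for points in groups]


def search(index, window):
    """Return (points inside `window`, number of groups actually scanned)."""
    hits, groups_scanned = [], 0
    for rectangle, points in index:
        if not rectangles_overlap(rectangle, window):
            continue  # whole group pruned with one rectangle comparison
        groups_scanned += 1
        for x, y in points:
            if window[0] <= x <= window[2] and window[1] <= y <= window[3]:
                hits.append((x, y))
    return hits, groups_scanned


# Made-up groups of nearby points, for illustration only.
GROUPS = [
    [(1, 1), (2, 3), (3, 2)],
    [(10, 10), (12, 11), (11, 13)],
    [(2, 11), (3, 12)],
]

index = build_index(GROUPS)
hits, groups_scanned = search(index, (0, 0, 2.5, 2.5))
print(hits, groups_scanned)
```

Only one of the three groups is scanned point by point here; the other two are rejected by their rectangles alone, which is where the speed-up comes from.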

 RSDG work - GIS tools

  • PostGIS + QGIS

 

  • Open Source
  • Plugins can be developed in Python to add functionality
  • Good performance with large datasets
  • But: very limited temporal visualisation 

 RSDG work - GIS tools

  • PostGIS + GeoServer/Mapserver + OpenLayers/Leaflet

 

  • Traditional solution, but:

 

  • Tricky to make it performant with so much data

 

 

  • ... All the JavaScript

 

 

  • Not enough time to develop all requirements

 RSDG work - GIS tools

  • ESRI GIS / ArcGIS
    • Not Open Source
      • Next!

 RSDG work - GIS tools

  • PostGIS + MapBox + TileMill
    • Good performance with large amounts of data
    • Out of the box solution
      • Ticks some requirements with no extra work

 

  • However:
    • Unstable release at the time
    • Focus on tiling design, not so much interaction with data
    • No temporal data animation at the time

 RSDG work - GIS tools

  • PostGIS + GeoServer + WebGL

 RSDG work - GIS tools

  • CartoDB
    • Out of the box solution that ticked most requirements
  • However:
    • Data security requirements not met, since data would be saved in the cloud
    • Standard online version only allows up to 250MB of data for free
  • Luckily:
    • Since it's Open Source, you can install the full stack of CartoDB locally as well! 

 RSDG work - CartoDB

  • Traditional GIS application architecture: DB + Server + Client
  • Can be used:
    • Online with limitations
    • Local full stack - (almost) no limitations
  • Can be used as a unit or in bits:
    • CartoDB Python/R/Java libraries to connect to the CartoDB PostGIS database
    • D3/Plotly/ChartJS can visualise CartoDB data

 RSDG work - CartoDB

           20% of the data points in CartoDB

 RSDG work - CartoDB

          400K points:  ~3% of the data points in CartoDB

Demo!

Were Emission Control Areas effective? 

Future of ShipViz

  • The CartoDB stack is currently up and running on a server installed at the Energy Institute

 

  • It is being used by several ShipViz team members for their research and other use cases

 

  • Paper in preparation about the work with CartoDB and the ECA findings

CartoDB and Python

CartoDB-Python

  • Python client for the CartoDB SQL API and Import API
  • Install:

pip install cartodb

  • Connect to CartoDB using your API key or OAuth:

from cartodb import CartoDBAPIKey

API_KEY = 'YOUR_CARTODB_API_KEY'
cartodb_domain = 'YOUR_CARTODB_DOMAIN'

# Client object reused by the import and SQL examples
cl = CartoDBAPIKey(API_KEY, cartodb_domain)

CartoDB-Python

  • Import local or remote datasets:

from cartodb import FileImport

# Import a local file
fi = FileImport("test.csv", cl)
fi.run()

from cartodb import URLImport

# Import a remote file
fi = URLImport("http://acdmy.org/d/counties.zip", cl)
fi.run()

  • Send SQL query:

from cartodb import CartoDBAPIKey, CartoDBException

API_KEY = 'YOUR_CARTODB_API_KEY'
cartodb_domain = 'YOUR_CARTODB_DOMAIN'
cl = CartoDBAPIKey(API_KEY, cartodb_domain)
try:
    print(cl.sql('select * from mytable'))
except CartoDBException as e:
    print("some error occurred", e)

CartoDB and JavaScript

CartoDB

Cool maps

 

The End

ShipViz and RSDG

By Raquel Alegre

Presentation about the work done on ShipViz (Energy Institute) by Raquel Alegre, given at one of the events in UCL's Technical Socials series, March 2016
