The Data Scientist Open Source Cartography Toolbox

Francois Dion

Chief Data Scientist

Dion Research LLC

About me

Francois Dion

Chief Data Scientist

Dion Research LLC

fdion@dionresearch.com

LinkedIn

@f_dion

GitHub

About you

The Hitchhiker's Guide to the Open Source Data Science Galaxy

 

Francois Dion

June 10, 2016

John Tukey

 "Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise."

The Data Science Wheel

Why?

Apps

Databases

Python (Data Wrangling)

  • SQLAlchemy: ORM and abstraction layer for database queries (over a dozen dialects of SQL)
  • GDAL/OGR: abstraction layer for raster (GDAL) and vector (OGR) formats (200+)
  • fiona: another Python API over OGR
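
As a rough illustration of the SQLAlchemy bullet above, the sketch below runs a parameterized query against an in-memory SQLite database. The `places` table, its columns, and the sample rows (with approximate coordinates) are hypothetical; the assumption is that switching to another SQL dialect is mostly a matter of changing the connection URL.

```python
from sqlalchemy import create_engine, text

# In-memory SQLite keeps the sketch self-contained; a different URL
# (PostgreSQL, MySQL, ...) would select another dialect.
engine = create_engine("sqlite:///:memory:")

with engine.begin() as conn:
    # Hypothetical table of named points with latitude/longitude columns.
    conn.execute(text("CREATE TABLE places (name TEXT, lat REAL, lon REAL)"))
    conn.execute(
        text("INSERT INTO places (name, lat, lon) VALUES (:name, :lat, :lon)"),
        [
            {"name": "Charlotte", "lat": 35.23, "lon": -80.84},
            {"name": "Raleigh", "lat": 35.78, "lon": -78.64},
            {"name": "Miami", "lat": 25.76, "lon": -80.19},
        ],
    )
    # Bound parameters (:min_lat) avoid building SQL by string concatenation.
    rows = conn.execute(
        text("SELECT name FROM places WHERE lat > :min_lat ORDER BY name"),
        {"min_lat": 30.0},
    ).fetchall()

northern = [row[0] for row in rows]
print(northern)
```

The same pattern should carry over to spatially enabled databases, although spatial column types and functions are outside what this sketch covers.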

Python (Feature Engineering)

  • geopy: geocoding addresses (18 APIs)
  • censusgeocode: Python wrapper for the US Census Geocoder
  • ip2geotools: geocoding IP addresses (14 APIs)
  • pyproj: cartographic transformations and geodetic computations
  • cartopy: cartographic Python library with matplotlib support
  • scipy.spatial: spatial algorithms and data structures (k-d trees, Voronoi diagrams, Minkowski distance...)
  • scikit-image: image processing, feature engineering
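
To make the scipy.spatial bullet above more concrete, here is a minimal sketch that derives a "distance to nearest store" feature with a k-d tree and compares two Minkowski distances. The `stores` and `customers` arrays are made-up planar coordinates; the sketch assumes the points are already in a projected coordinate system where Euclidean distance is meaningful.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import minkowski

# Hypothetical planar (already projected) coordinates, for illustration only.
stores = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 0.0]])
customers = np.array([[0.0, 1.0], [9.0, 0.0]])

# Build the tree once, then query the nearest store for every customer.
tree = cKDTree(stores)
nearest_dist, nearest_idx = tree.query(customers)

# Minkowski distance generalizes Manhattan (p=1) and Euclidean (p=2).
manhattan = minkowski(stores[0], stores[1], p=1)
euclidean = minkowski(stores[0], stores[1], p=2)

print(nearest_dist, nearest_idx, manhattan, euclidean)
```

`nearest_dist` can be attached to each customer row as an engineered feature, and `nearest_idx` identifies which store produced it.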

Waldo Tobler

 "everything is related to everything else, but near things are more related than distant things."

Python (Model)

  • scikit-learn: machine learning, data mining
    • sklearn.cluster
    • sklearn.metrics.pairwise (distance computations)
  • networkx: complex graphs and networks
  • osmnx: OSM+networkx

  • prophet: time series
  • PySAL: Python Spatial Analysis Library (also esda, giddy, spaghetti)
  • PyTorch: Deep learning platform
  • rpy2: interface with R packages

Examples of R packages:

rpostgis, dplyr, tidyr, sp, shapefiles, raster, geojson, geosphere, leafletR, cartography, choroplethr, amdai, ggmap, caret, glm, forecast, deepboost
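
As one possible reading of the sklearn.cluster entry in the Python (Model) list, the sketch below groups latitude/longitude points with DBSCAN using the haversine metric. The coordinates, the 5 km radius, and the `min_samples` value are illustrative assumptions, not settings from the talk.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical [lat, lon] points in degrees: two tight groups and one outlier.
coords_deg = np.array([
    [35.22, -80.84],
    [35.23, -80.85],
    [35.21, -80.83],
    [35.78, -78.64],
    [35.79, -78.65],
    [36.50, -79.50],
])

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used to convert km to radians
eps_km = 5.0              # assumed neighborhood radius

# The haversine metric expects [lat, lon] in radians and returns angular
# distance, so eps is expressed in radians as well.
model = DBSCAN(
    eps=eps_km / EARTH_RADIUS_KM,
    min_samples=2,
    metric="haversine",
    algorithm="ball_tree",
)
labels = model.fit_predict(np.radians(coords_deg))

print(labels)
```

Points labeled `-1` are treated as noise by DBSCAN, which is often convenient for geographic data containing stray observations.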

Python (Visualization)

Thank you!

fdion@dionresearch.com

 

https://slides.com/fdion