Geoposition with Elasticsearch


Philly ES Meetup - Nov 2014

Michael Holroyd, Ph.D.



What is this stuff?

node.js

 server-side event-driven javascript

postgres

 classic row-based SQL server 

elasticsearch

 full-text search using Apache Lucene

 ... supports geospatial queries!

Data

 scraped from instagram.com


Official API endpoint for geolocation based search:
https://api.instagram.com/v1/media/search?lat=48.858844&lng=2.294351

Response:
https://gist.githubusercontent.com/anonymous/ce06a8e8ce7c47314357/raw/058d40624c01e41a64ecdb9cbbb56b45bb8fe172/gistfile1.txt
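
A minimal sketch of building that search URL in node.js for an arbitrary coordinate. The helper name `buildMediaSearchUrl` is hypothetical, and only the `lat` and `lng` parameters shown above are used; a real request would likely also need API credentials, which the slide omits.

```javascript
// Hypothetical helper: builds the geolocation search URL shown above.
// Only the lat/lng parameters from the slide are included.
function buildMediaSearchUrl(lat, lng) {
  const params = new URLSearchParams({ lat: String(lat), lng: String(lng) });
  return "https://api.instagram.com/v1/media/search?" + params.toString();
}

// Example: the Eiffel Tower coordinates from the slide.
const exampleUrl = buildMediaSearchUrl(48.858844, 2.294351);
console.log(exampleUrl);
```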


Go look at downloadInstagram.js and centroids();

Kibana

http://kibana.arqball.com/#/dashboard/elasticsearch/Instagram%20Oldscrape


Maps

leaflet.js

 open-source JS library for interactive maps


  • Handles base-layers
  • Supports tiled raster layers (use these for anything > 100k elements)
  • Plays nice with d3.js for interactive SVG elements


Maps

Originally we used SVG elements for every dot on the map. Looked awesome, but ran the browser into the ground.

Maps

Instead, rasterize all the dots and use a tile server to load the correct data at run-time.


 psql -c "SELECT latitude, longitude FROM instagram" -F , -t -o instagram.csv

Check out https://github.com/ericfischer/datamaps for a multi-core rendering solution. It can still be really slow; future work is to parallelize across machines.
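
To make "load the correct data at run-time" concrete, here is a sketch of how a point maps to a tile in the standard Web Mercator XYZ scheme that Leaflet tile layers request by default. The function name `latLngToTile` is an assumption for illustration; this is the common slippy-map formula, not code taken from datamaps or the demo.

```javascript
// Sketch: which XYZ tile contains a given point at a given zoom level.
// Uses the standard Web Mercator "slippy map" tile formula.
function latLngToTile(lat, lng, zoom) {
  const n = Math.pow(2, zoom); // number of tiles along each axis
  const x = Math.floor(((lng + 180) / 360) * n);
  const latRad = (lat * Math.PI) / 180;
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  return { x: x, y: y, zoom: zoom };
}

// The browser only fetches tiles covering the current view, so the
// number of dots inside a tile never reaches the DOM.
console.log(latLngToTile(48.858844, 2.294351, 2));
```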


Go look at demo. NO PEEKING AT COOL ELASTICSEARCH FEATURES YET.

Elasticsearch

Yank everything out of postgres and stuff it into elasticsearch

It's important to set up your mapping ahead of time to accommodate geo-position queries:
properties: {
  geo: {
    type: "geo_point",
    lat_lon: true,
    fielddata: {
      format: "compressed",
      precision: "1cm"
    }
  }
}
See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-geo-point-type.html
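
A rough sketch of the "stuff it into elasticsearch" step: turning rows with `latitude` and `longitude` columns (the columns from the psql export) into a bulk API request body, storing the point in the `geo` field as a `[lon, lat]` array. The function name, index name, and type name are hypothetical, and `_type` reflects the Elasticsearch 1.x era of this talk; newer versions drop it.

```javascript
// Sketch: build a newline-delimited bulk body from postgres rows.
// indexName and typeName are placeholders chosen by the caller.
function toBulkBody(rows, indexName, typeName) {
  const lines = [];
  for (const row of rows) {
    // Action line, then the document itself.
    lines.push(JSON.stringify({ index: { _index: indexName, _type: typeName } }));
    // Array form of a geo_point is ordered [lon, lat].
    lines.push(JSON.stringify({ geo: [row.longitude, row.latitude] }));
  }
  // The bulk API expects a trailing newline after the last line.
  return lines.join("\n") + "\n";
}

const sampleRows = [{ latitude: 48.858844, longitude: 2.294351 }];
const bulkBody = toBulkBody(sampleRows, "instagram", "media");
console.log(bulkBody);
```

The body would then be POSTed to the `_bulk` endpoint with whichever HTTP client the project already uses.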

Query

You MUST query in the same geo-format your data was indexed in (bug?). The query here uses the array form, which is ordered [lon, lat].


{
  "size": 10000,
  "query" : {
    "match_all" : {}
  },
  "filter" : {
    "geo_distance" : {
      "distance" : "100m",
      "distance_type": "plane",
      "geo" : [e.latlng.lng,e.latlng.lat]
    }
  }
}
            
See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-geo-distance-filter.html#query-dsl-geo-distance-filter

What's going on?

Multiple strategies available in Lucene:
BoundingBox, Quadtrees, SpatialPrefixTree

For points, elasticsearch just uses a simple boundingbox strategy. Lots of room for improvement here actually.
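
As an illustration of the bounding-box idea (not Elasticsearch's actual implementation), the sketch below first discards points outside a cheap lat/lng box around the center, then applies an exact flat-plane distance check to the survivors. All function names are hypothetical, and edge cases near the poles and the antimeridian are ignored.

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius in meters
const DEG_TO_RAD = Math.PI / 180;

// Box that fully contains the circle of radius distanceM around the center.
function boundingBox(lat, lng, distanceM) {
  const dLat = (distanceM / EARTH_RADIUS_M) / DEG_TO_RAD;
  const dLng = dLat / Math.cos(lat * DEG_TO_RAD);
  return { minLat: lat - dLat, maxLat: lat + dLat, minLng: lng - dLng, maxLng: lng + dLng };
}

// Flat-plane (equirectangular) distance, fine for short ranges like 100m.
function planeDistance(lat1, lng1, lat2, lng2) {
  const x = (lng2 - lng1) * DEG_TO_RAD * Math.cos(((lat1 + lat2) / 2) * DEG_TO_RAD);
  const y = (lat2 - lat1) * DEG_TO_RAD;
  return Math.sqrt(x * x + y * y) * EARTH_RADIUS_M;
}

function withinDistance(points, center, distanceM) {
  const box = boundingBox(center.lat, center.lng, distanceM);
  return points
    // Cheap pre-filter: four comparisons per point.
    .filter((p) => p.lat >= box.minLat && p.lat <= box.maxLat &&
                   p.lng >= box.minLng && p.lng <= box.maxLng)
    // Exact check removes points in the box corners.
    .filter((p) => planeDistance(center.lat, center.lng, p.lat, p.lng) <= distanceM);
}
```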

For shapes, elasticsearch uses either a geohash strategy (if your data fits) or quadtrees.
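
For reference, a geohash is built by repeatedly halving the longitude and latitude ranges and packing the resulting bits into base-32 characters, so nearby points tend to share a prefix. The sketch below is the standard public geohash encoding, not Elasticsearch's internal code; the function name `encodeGeohash` is an assumption.

```javascript
const GEOHASH_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Sketch: encode a lat/lon pair as a geohash of the given length.
function encodeGeohash(lat, lon, precision) {
  let latMin = -90, latMax = 90;
  let lonMin = -180, lonMax = 180;
  let hash = "";
  let bits = 0;       // accumulates 5 bits per output character
  let bitCount = 0;
  let useLon = true;  // bits alternate, starting with longitude

  while (hash.length < precision) {
    if (useLon) {
      const mid = (lonMin + lonMax) / 2;
      if (lon >= mid) { bits = bits * 2 + 1; lonMin = mid; }
      else { bits = bits * 2; lonMax = mid; }
    } else {
      const mid = (latMin + latMax) / 2;
      if (lat >= mid) { bits = bits * 2 + 1; latMin = mid; }
      else { bits = bits * 2; latMax = mid; }
    }
    useLon = !useLon;
    bitCount += 1;
    if (bitCount === 5) {
      hash += GEOHASH_BASE32[bits];
      bits = 0;
      bitCount = 0;
    }
  }
  return hash;
}

console.log(encodeGeohash(48.858844, 2.294351, 7));
```

Each extra character narrows the cell, which is why a prefix match can stand in for a coarse "is it nearby?" test.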

Geospatial or Fulltext search


Check out search.js and show Demo

Thanks!


Arqball (computer vision kung-fu):
http://arqspin.com

Knollop (online education search):
http://knollop.com

Learnstream (link curation and sharing):
http://learnstream.com

Michael Holroyd
http://meekohi.com