anthony.fox AT ccri.com


What is GeoMesa?

- Distributed Spatio-Temporal Database built on Accumulo
- Standard GeoTools DataStore API
- Standard OGC access
- GeoServer plugins and WPS analytics
- LocationTech Open Source

High-Velocity Spatio-Temporal Data

- Twitter (100-150k tweets/second)
- Foursquare (1M check-ins/day)
- Geolocated clickstreams
- Satellite imagery
- Near-real-time traffic sensors
- FAA flight information

Distributed Databases

| ID | SYMBOL | DATE | CLOSE |
| 1 | MSFT | 2014-05-20T00:00:00.000Z | 39.64 |

- Flexible schema
- Query planning pushed into the application layer
- Presents challenges

Distributing Data
Space Filling Curves
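
As a rough illustration of how a space-filling curve maps two-dimensional locations onto a one-dimensional key space, here is a minimal Z-order (Morton) sketch. It is not GeoMesa's actual index code; the function names `interleave_bits` and `z_order_key`, the 16-bit resolution, and the choice of a Z-order curve are all assumptions made for this example.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x bits fill the even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # y bits fill the odd positions
    return key


def z_order_key(lon: float, lat: float, bits: int = 16) -> int:
    """Quantize lon/lat onto a grid and return the cell's Z-order key.

    Hypothetical helper: the grid resolution is an illustrative choice.
    """
    max_cell = (1 << bits) - 1
    x = int((lon + 180.0) / 360.0 * max_cell)  # column index, 0..max_cell
    y = int((lat + 90.0) / 180.0 * max_cell)   # row index, 0..max_cell
    return interleave_bits(x, y, bits)
```

Because the key is a single sortable integer, it can serve as (part of) a row key in a sorted key-value store, so that points close together in space tend to be stored close together on disk.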

Query Planning

Analytics

- Interpolated line select
- Density computations
- WPS
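
A density computation can be pictured as counting points per grid cell. The sketch below is a single-machine illustration under assumed names (`density_grid`, `cell_size_deg`); it is not the WPS process GeoMesa ships, and a distributed version would compute partial counts near the data and merge them.

```python
import math
from collections import Counter


def density_grid(points, cell_size_deg=1.0):
    """Count (lon, lat) points per square grid cell.

    Returns a Counter keyed by (column, row) cell indices.
    """
    counts = Counter()
    for lon, lat in points:
        cell = (math.floor(lon / cell_size_deg),
                math.floor(lat / cell_size_deg))
        counts[cell] += 1
    return counts
```

Since per-cell counts are additive, partial grids computed on separate nodes can be merged by summing the counters.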
Near-Real-Time Architecture
Challenges
Geospatial Joins
select coffee_shop.name, tweets.handle
from coffee_shop, tweets
where dwithin(tweets.location, coffee_shop.location, 500 meters)
- Relational databases leverage shared memory and bitset intersections
- No such luck in distributed databases
- Goal: compute the result while limiting network overhead
- Naive algorithm: buffer every coffee shop location, generate an index query for each resulting polygon, and run each query against the tweets table
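
The naive algorithm can be sketched as follows. This is an in-memory illustration only: `query_tweets` is a hypothetical stand-in for an index query against the tweets table (here a linear scan), the buffer is simplified to a bounding box instead of a true buffered polygon, and all function and field names are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def buffer_bbox(lon, lat, radius_m):
    """Bounding box around a point (simplified stand-in for a buffer polygon)."""
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    dlon = dlat / max(math.cos(math.radians(lat)), 1e-12)
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


def query_tweets(tweets, bbox):
    """Hypothetical index query: return the tweets inside the bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return [t for t in tweets
            if min_lon <= t["lon"] <= max_lon and min_lat <= t["lat"] <= max_lat]


def naive_dwithin_join(coffee_shops, tweets, radius_m=500.0):
    """One query per coffee shop, then an exact distance filter on candidates."""
    results = []
    for shop in coffee_shops:
        bbox = buffer_bbox(shop["lon"], shop["lat"], radius_m)
        for tweet in query_tweets(tweets, bbox):
            d = haversine_m(shop["lon"], shop["lat"], tweet["lon"], tweet["lat"])
            if d <= radius_m:
                results.append((shop["name"], tweet["handle"]))
    return results
```

The cost is visible in the outer loop: one query per coffee shop, each of which would be a network round trip in a distributed deployment. That is the overhead the stated goal tries to limit.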
GeoMesa
By anthonyccri
