Challenges
Geospatial Joins
select coffee_shop.name, tweets.handle
from coffee_shop, tweets
where dwithin(tweets.location, coffee_shop.location, 500 meters)
- joining on geometries from two tables
- relational databases leverage shared memory and bitset intersections
- no such luck in distributed databases
- goal: compute result while limiting network overhead
- Naive algorithm: buffer all coffee shop locations, generate index query for each polygon, run query against tweets table
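
The naive algorithm above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of any particular database: `TweetStore` is a hypothetical in-memory stand-in for the remote tweets table, the buffer is approximated by a bounding box rather than a true polygon, and the names `Shop`, `Tweet`, `buffer_bbox`, and `naive_join` are invented for the example. The `queries` counter shows the cost the notes worry about: one remote query per coffee shop.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


@dataclass(frozen=True)
class Shop:
    name: str
    lon: float
    lat: float


@dataclass(frozen=True)
class Tweet:
    handle: str
    lon: float
    lat: float


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def buffer_bbox(lon, lat, radius_m):
    """Approximate bounding box around a point buffered by radius_m.

    Assumption: a box stands in for the buffered polygon; the approximation
    is adequate for small radii away from the poles.
    """
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    dlon = dlat / math.cos(math.radians(lat))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


class TweetStore:
    """Hypothetical stand-in for a remote, spatially indexed tweets table."""

    def __init__(self, tweets):
        self._tweets = list(tweets)
        self.queries = 0  # each call models one network round trip

    def query_bbox(self, bbox):
        self.queries += 1
        min_lon, min_lat, max_lon, max_lat = bbox
        return [
            t for t in self._tweets
            if min_lon <= t.lon <= max_lon and min_lat <= t.lat <= max_lat
        ]


def naive_join(shops, store, radius_m=500.0):
    """Buffer every shop, issue one query per buffer, refine by exact distance."""
    results = []
    for shop in shops:
        bbox = buffer_bbox(shop.lon, shop.lat, radius_m)
        for tweet in store.query_bbox(bbox):
            # The box over-covers the circle, so re-check the true distance.
            if haversine_m(shop.lon, shop.lat, tweet.lon, tweet.lat) <= radius_m:
                results.append((shop.name, tweet.handle))
    return results
```

Calling `naive_join(shops, store)` returns `(shop name, tweet handle)` pairs, and `store.queries` ends up equal to the number of shops, so the number of round trips grows linearly with the size of the coffee shop table.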