Big Data Zombies

The PLAN

250K drones with IR cameras*

Detects temperature variations

One report per second per drone

Max resolution: 1000 zombies

*Each drone observes 1 km^2

Format*

{
    "drone" : 1000000,
    "zombie-id" : 34,
    "lat" : 42.0343,
    "lon" : 2.40324
}

*After the prototype we will use CSV
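The move from JSON to CSV mentioned in the footnote could look roughly like the sketch below. The class name `ZombieReport`, the column order (drone, zombie-id, lat, lon) and the five-decimal precision are assumptions for illustration; the slides do not define the CSV layout.

```java
import java.util.Locale;

/** Hypothetical value class mirroring the four fields of the JSON report. */
final class ZombieReport {
    final int drone;
    final int zombieId;
    final double lat;
    final double lon;

    ZombieReport(int drone, int zombieId, double lat, double lon) {
        this.drone = drone;
        this.zombieId = zombieId;
        this.lat = lat;
        this.lon = lon;
    }

    /** Encodes the report as one CSV line; the column order is an assumption. */
    String toCsv() {
        // Locale.ROOT keeps '.' as the decimal separator on every machine.
        return String.format(Locale.ROOT, "%d,%d,%.5f,%.5f", drone, zombieId, lat, lon);
    }

    /** Parses a line produced by toCsv(). */
    static ZombieReport fromCsv(String line) {
        String[] f = line.split(",");
        if (f.length != 4) {
            throw new IllegalArgumentException("expected 4 columns: " + line);
        }
        return new ZombieReport(Integer.parseInt(f[0]), Integer.parseInt(f[1]),
                Double.parseDouble(f[2]), Double.parseDouble(f[3]));
    }

    public static void main(String[] args) {
        ZombieReport report = new ZombieReport(1000000, 34, 42.0343, 2.40324);
        System.out.println(report.toCsv()); // prints 1000000,34,42.03430,2.40324
    }
}
```

Dropping the field names makes each record shorter, which matters when a quarter of a million of them arrive every second.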

Amazon Kinesis

Managed ingestion & storage

Virtually unlimited capacity (scale by adding shards)

Unit of capacity: Shards

Records

Sequence number +

Partition key +

Data blob

Shard Limits

Max size of a record: 1MB (puts are billed in 25KB payload units)

Max throughput: 1000 rec/s

IO limits:

  • 1MB/s writes (ingress)
  • 2MB/s reads (egress)

Do the maths

250,000 records/s

1,000 records/s per shard + 10% headroom

= 250 shards + 10% = 275 shards
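The sizing above can be sketched as a small calculation that checks both per-shard write limits from the previous slide (records per second and bytes per second) and then adds the headroom. The class name `ShardSizing`, the 100-byte record size and the reading of 1MB as 10^6 bytes are assumptions, not figures from the slides.

```java
/** Hypothetical sizing helper based on the per-shard write limits quoted in the slides. */
final class ShardSizing {
    static final long RECORDS_PER_SHARD = 1_000;   // max 1000 records/s per shard
    static final long BYTES_PER_SHARD = 1_000_000; // 1MB/s writes, taken here as 10^6 bytes

    /** Integer ceiling division for positive values. */
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    /**
     * Shards needed for a given write load: the stricter of the two limits
     * decides, then the headroom percentage is added on top.
     */
    static long shardsNeeded(long recordsPerSecond, long bytesPerRecord, int headroomPercent) {
        long byRecords = ceilDiv(recordsPerSecond, RECORDS_PER_SHARD);
        long byBytes = ceilDiv(recordsPerSecond * bytesPerRecord, BYTES_PER_SHARD);
        long base = Math.max(byRecords, byBytes);
        return ceilDiv(base * (100 + headroomPercent), 100);
    }

    public static void main(String[] args) {
        // 250,000 records/s of roughly 100 bytes each (assumed size), 10% headroom.
        System.out.println(shardsNeeded(250_000, 100, 10)); // prints 275
    }
}
```

With records this small the record-count limit is the binding one; if each record grew past 1,000 bytes, the byte limit would take over.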

Show me the money

$0.015 per shard-hour

$0.014 per million PUT payload units

$2,970 + $9,072 per month*

* we can always scale it down if you want to be cheap
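The two monthly figures can be reproduced with the arithmetic below. It is a sketch that assumes a 30-day month and that every record fits in a single billed put; `KinesisCost` is a hypothetical name, and the prices are the ones quoted on the slide.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical cost calculator using the prices quoted in the slides. */
final class KinesisCost {
    static final BigDecimal PER_SHARD_HOUR = new BigDecimal("0.015");
    static final BigDecimal PER_MILLION_PUTS = new BigDecimal("0.014");
    static final int DAYS_PER_MONTH = 30; // assumption: 30-day month

    /** Monthly cost of keeping the given number of shards open. */
    static BigDecimal shardCost(int shards) {
        long shardHours = (long) shards * 24 * DAYS_PER_MONTH;
        return PER_SHARD_HOUR.multiply(BigDecimal.valueOf(shardHours));
    }

    /** Monthly cost of the puts, assuming one billed put per record. */
    static BigDecimal putCost(long recordsPerSecond) {
        long putsPerMonth = recordsPerSecond * 86_400L * DAYS_PER_MONTH;
        return PER_MILLION_PUTS.multiply(BigDecimal.valueOf(putsPerMonth))
                .divide(BigDecimal.valueOf(1_000_000L));
    }

    public static void main(String[] args) {
        BigDecimal shards = shardCost(275).setScale(2, RoundingMode.HALF_UP);
        BigDecimal puts = putCost(250_000).setScale(2, RoundingMode.HALF_UP);
        System.out.println("$" + shards + " + $" + puts); // prints $2970.00 + $9072.00
    }
}
```

The puts cost about three times as much as the shards, so sending fewer, larger records would save more than trimming the shard count.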

The question is the key

What do we want to do?

To visualise the position of the zombies

Lat/Lon vs UTM coordinates

Lat/Lon locates a point on a 3D globe

Maps are 2D projections

UTM is one of the most popular projections

It makes aggregation easy!

Example

41.390205N 2.154007E

31N 429271E 4582420N

31N 429 4582 <- 1 km x 1 km cell (1 km^2)
31N 42  458  <- 10 km x 10 km cell (100 km^2)
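The example above can be reproduced with a short conversion routine. The sketch below uses the Krüger series for the transverse Mercator projection on the WGS84 ellipsoid, then truncates easting and northing to get the grid cell. The class `UtmCell` is hypothetical (it is not the `Projections` helper used later in the slides), and the special UTM zones around Norway and Svalbard are ignored.

```java
/** Hypothetical helper: WGS84 lat/lon to UTM (Krueger series) plus grid-cell keys. */
final class UtmCell {
    static final double A = 6378137.0;           // WGS84 semi-major axis, metres
    static final double F = 1.0 / 298.257223563; // WGS84 flattening
    static final double K0 = 0.9996;             // UTM scale factor on the central meridian

    final int zone;
    final char hemisphere; // 'N' or 'S', as in the "31N" of the example
    final double easting;
    final double northing;

    private UtmCell(int zone, char hemisphere, double easting, double northing) {
        this.zone = zone;
        this.hemisphere = hemisphere;
        this.easting = easting;
        this.northing = northing;
    }

    private static double atanh(double x) {
        return 0.5 * Math.log((1 + x) / (1 - x));
    }

    static UtmCell fromLatLon(double latDeg, double lonDeg) {
        int zone = (int) Math.floor((lonDeg + 180.0) / 6.0) + 1;
        double phi = Math.toRadians(latDeg);
        double dLam = Math.toRadians(lonDeg - (zone * 6 - 183)); // offset from central meridian

        double n = F / (2 - F);
        double n2 = n * n;
        double n3 = n2 * n;
        double radius = A / (1 + n) * (1 + n2 / 4 + n2 * n2 / 64);
        double[] alpha = {
            n / 2 - 2 * n2 / 3 + 5 * n3 / 16,
            13 * n2 / 48 - 3 * n3 / 5,
            61 * n3 / 240
        };

        double e = 2 * Math.sqrt(n) / (1 + n); // first eccentricity
        // t is the tangent of the conformal latitude
        double t = Math.sinh(atanh(Math.sin(phi)) - e * atanh(e * Math.sin(phi)));
        double xi = Math.atan2(t, Math.cos(dLam));
        double eta = atanh(Math.sin(dLam) / Math.sqrt(1 + t * t));

        double x = eta;
        double y = xi;
        for (int j = 1; j <= 3; j++) {
            x += alpha[j - 1] * Math.cos(2 * j * xi) * Math.sinh(2 * j * eta);
            y += alpha[j - 1] * Math.sin(2 * j * xi) * Math.cosh(2 * j * eta);
        }
        double easting = 500000.0 + K0 * radius * x;
        double northing = K0 * radius * y + (latDeg < 0 ? 10000000.0 : 0.0);
        return new UtmCell(zone, latDeg < 0 ? 'S' : 'N', easting, northing);
    }

    /** Key of the square grid cell with the given side length in metres. */
    String key(int cellMetres) {
        long col = (long) Math.floor(easting / cellMetres);
        long row = (long) Math.floor(northing / cellMetres);
        return zone + "" + hemisphere + " " + col + " " + row;
    }

    public static void main(String[] args) {
        UtmCell cell = fromLatLon(41.390205, 2.154007);
        System.out.println(cell.key(1000));  // prints 31N 429 4582
        System.out.println(cell.key(10000)); // prints 31N 42 458
    }
}
```

Because the cell key is just a truncation, aggregating at a coarser level means dropping more digits, with no further geometry involved.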

UTM 1 km^2 as partition key

Good enough distribution

Allows km^2 aggregation
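To get a feel for "good enough distribution": Kinesis maps each partition key to a shard by taking the MD5 hash of the key as a 128-bit number and finding the shard whose hash-key range contains it. The sketch below imitates that routing under the assumption that the ranges split the hash space evenly, and counts how a 500 x 500 grid of made-up cell keys spreads over 275 shards. `ShardRouting` and the grid are illustrative, not part of any AWS library.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical imitation of partition-key routing over evenly split hash ranges. */
final class ShardRouting {
    static final BigInteger HASH_SPACE = BigInteger.ONE.shiftLeft(128); // 2^128 MD5 values

    /** Index of the shard whose (assumed even) hash range contains the key's MD5 hash. */
    static int shardFor(String partitionKey, int shardCount) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            BigInteger hash = new BigInteger(1, digest); // unsigned 128-bit value
            return hash.multiply(BigInteger.valueOf(shardCount)).divide(HASH_SPACE).intValue();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Number of cell keys landing on each shard for a gridSide x gridSide block of cells. */
    static int[] loadPerShard(int gridSide, int shardCount) {
        int[] load = new int[shardCount];
        for (int col = 0; col < gridSide; col++) {
            for (int row = 0; row < gridSide; row++) {
                load[shardFor("31N " + col + " " + row, shardCount)]++;
            }
        }
        return load;
    }

    public static void main(String[] args) {
        int[] load = loadPerShard(500, 275); // 250,000 cells, one per drone
        int min = Integer.MAX_VALUE;
        int max = 0;
        for (int cells : load) {
            min = Math.min(min, cells);
            max = Math.max(max, cells);
        }
        System.out.println("cells per shard: min=" + min + " max=" + max);
    }
}
```

The hash spreads the cells evenly, but it cannot help when many zombies crowd into a few cells: all records for one cell always land on the same shard.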

Kinesis Streams dev kit

AWS Mobile SDK

Kinesis Agent (Java tool)

Kinesis Producer Library (C++)

KPL Java wrapper

Kinesis Client Library (Java)

KCL daemon (Java tool)

Kinesis Producer Library (KPL)

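import com.amazonaws.services.kinesis.producer.KinesisProducer;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

ObjectMapper mapper = new ObjectMapper();
KinesisProducer kinesis = new KinesisProducer();

while (true) {
    ZombieLecture lect = nextLecture();   // next drone reading
    LatLng latLng = lect.getCoordinates();
    UTM utm = Projections.toUTM(latLng);
    // Partition key: the 1 km^2 UTM cell that contains the zombie
    String shardKey = utm.asString(1000 /*meters*/);
    String json = mapper.writeValueAsString(lect);
    ByteBuffer data =
         ByteBuffer.wrap(json.getBytes(StandardCharsets.UTF_8));
    // Asynchronous: the KPL buffers and batches records before sending
    kinesis.addUserRecord("zombieStream", shardKey, data);
}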

Kinesis Client Library (KCL)

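import com.amazonaws.services.kinesis.model.GetRecordsRequest;
import com.amazonaws.services.kinesis.model.GetRecordsResult;
import com.amazonaws.services.kinesis.model.Record;
import java.util.List;

// iterator to iterate *inside* a shard
String shardIterator = getShardIteratorResult.getShardIterator();
List<Record> records;
// the next iterator is null once a closed shard has been fully read
while (shardIterator != null) {
  GetRecordsRequest recRequest = new GetRecordsRequest();
  recRequest.setShardIterator(shardIterator);
  recRequest.setLimit(25);   // at most 25 records per call

  GetRecordsResult result = client.getRecords(recRequest);
  records = result.getRecords();
  this.processRecords(records);
  shardIterator = result.getNextShardIterator();
}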

Kinesis Analytics

Simple real-time processing

Based on good-old-SQL!

Supports sliding windows
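Kinesis Analytics expresses sliding windows in SQL. To show what such a window computes, here is a plain-Java sketch (it does not use the Kinesis Analytics API) that keeps a count of reports per km^2 cell over the last N milliseconds. The class `SlidingCellCounter` is hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sliding-window counter: reports per cell over the last windowMillis. */
final class SlidingCellCounter {
    private static final class Event {
        final long timestamp;
        final String cell;

        Event(long timestamp, String cell) {
            this.timestamp = timestamp;
            this.cell = cell;
        }
    }

    private final long windowMillis;
    private final Deque<Event> events = new ArrayDeque<>();
    private final Map<String, Integer> counts = new HashMap<>();

    SlidingCellCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Registers one report; timestamps must not go backwards. */
    void add(long timestamp, String cell) {
        evict(timestamp);
        events.addLast(new Event(timestamp, cell));
        counts.merge(cell, 1, Integer::sum);
    }

    /** Reports seen for the cell in the window (now - windowMillis, now]. */
    int count(long now, String cell) {
        evict(now);
        return counts.getOrDefault(cell, 0);
    }

    /** Drops events that have slid out of the window and adjusts the counts. */
    private void evict(long now) {
        while (!events.isEmpty() && events.peekFirst().timestamp <= now - windowMillis) {
            Event old = events.removeFirst();
            counts.computeIfPresent(old.cell, (key, value) -> value == 1 ? null : value - 1);
        }
    }

    public static void main(String[] args) {
        SlidingCellCounter counter = new SlidingCellCounter(10_000); // 10 s window
        counter.add(0, "31N 429 4582");
        counter.add(5_000, "31N 429 4582");
        System.out.println(counter.count(9_000, "31N 429 4582"));  // prints 2
        System.out.println(counter.count(12_000, "31N 429 4582")); // prints 1
    }
}
```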

Integration

Apache Storm

Sophisticated real-time processing

Based on topologies

Kinesis Storm Spout as source

Uses Bolts for aggregation

Visualisation with Leaflet.js

Lightweight JavaScript library

Works perfectly on mobile

Excellent API

Tons of plugins

Including one for heatmaps

Sample code

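// Map centred on Barcelona at zoom level 13
var map = L.map('map')
  .setView([41.3902, 2.15400], 13);

// OpenStreetMap base layer
L.tileLayer('http://{s}.tile.osm.org' +
            '/{z}/{x}/{y}.png', {})
  .addTo(map);

// Fetch the points with jQuery and draw them as a heatmap
// (L.heatLayer comes from the Leaflet.heat plugin)
$.ajax(...)
 .done(function(points) {
    var heat = L.heatLayer(points)
                .addTo(map);
});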
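The `points` handed to `L.heatLayer` are an array of `[lat, lng, intensity]` triples. A backend could serialise the aggregated cells into that shape roughly as sketched below. The `HeatPoints` class, the fixed decimal precision and the scaling of intensity against the 1000-zombie limit are assumptions; the endpoint behind `$.ajax(...)` is not specified in the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical serialiser for the [lat, lng, intensity] triples a heat layer consumes. */
final class HeatPoints {
    /** One heatmap sample: a position plus an intensity between 0 and 1. */
    static final class Point {
        final double lat;
        final double lon;
        final double intensity;

        Point(double lat, double lon, double intensity) {
            this.lat = lat;
            this.lon = lon;
            this.intensity = intensity;
        }
    }

    /** Scales a zombie count to 0..1 against the 1000-zombie resolution limit (assumed scale). */
    static double intensity(int zombies) {
        return Math.min(1.0, zombies / 1000.0);
    }

    /** Renders the points as a JSON array of arrays. */
    static String toJson(List<Point> points) {
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < points.size(); i++) {
            Point p = points.get(i);
            if (i > 0) {
                json.append(',');
            }
            // Locale.ROOT keeps '.' as the decimal separator, as JSON requires.
            json.append(String.format(Locale.ROOT, "[%.5f,%.5f,%.2f]", p.lat, p.lon, p.intensity));
        }
        return json.append(']').toString();
    }

    public static void main(String[] args) {
        List<Point> points = new ArrayList<>();
        points.add(new Point(41.3902, 2.154, intensity(500)));
        System.out.println(toJson(points)); // prints [[41.39020,2.15400,0.50]]
    }
}
```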

Zombies

By CAPSiDE
