Big Data ecosystem

A presentation of Espeo's Big Data ecosystem, based on 1000 simulated drones flying around the city of Poznan.
The drones produce and collect real-time sample data such as:
- latitude,
- longitude,
- height (m),
- temperature (C),
- wind speed (m/s),
- humidity,
- air pollution
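
A minimal sketch of what one simulated reading could look like. It is written in Python for brevity (the drone software itself is in Scala). The field names, value ranges, units for humidity and air pollution, and the approximate city-centre coordinates are all assumptions made for illustration, not values from the presentation.

```python
import random
from dataclasses import dataclass

# Assumed approximate centre of Poznan; the slides give no coordinates.
POZNAN_LAT, POZNAN_LON = 52.41, 16.93


@dataclass(frozen=True)
class DroneSample:
    drone_id: int         # hypothetical identifier
    latitude: float       # degrees
    longitude: float      # degrees
    height_m: float       # metres
    temp_c: float         # degrees Celsius
    wind_ms: float        # metres per second
    humidity: float       # assumed: relative humidity, percent
    air_pollution: float  # assumed: unit-less index


def simulate_sample(drone_id: int, rng: random.Random) -> DroneSample:
    """Draw one random reading near the assumed city centre."""
    return DroneSample(
        drone_id=drone_id,
        latitude=POZNAN_LAT + rng.uniform(-0.05, 0.05),
        longitude=POZNAN_LON + rng.uniform(-0.05, 0.05),
        height_m=rng.uniform(10.0, 150.0),
        temp_c=rng.uniform(-5.0, 30.0),
        wind_ms=rng.uniform(0.0, 15.0),
        humidity=rng.uniform(20.0, 100.0),
        air_pollution=rng.uniform(0.0, 100.0),
    )


rng = random.Random(42)  # fixed seed so the simulation is repeatable
samples = [simulate_sample(i, rng) for i in range(1000)]
```

Each of the 1000 entries in `samples` stands for one reading from one drone; a real simulator would emit such readings repeatedly over time.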




The drone software is written in Scala.
It streams data to the server in real time, using Kafka.
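
Kafka records are key/value pairs of bytes, so each reading has to be serialized before it is sent. The sketch below shows one possible encoding, assuming JSON as the wire format and the drone id as the record key (with Kafka's default partitioner, records with the same key go to the same partition). The function names and the field names are hypothetical, and the actual send through a Kafka producer is left out.

```python
import json
from typing import Tuple


def to_record(sample: dict) -> Tuple[bytes, bytes]:
    """Encode one reading as a (key, value) pair of bytes.

    Assumption: the drone id is used as the key and JSON as the value format.
    """
    key = str(sample["drone_id"]).encode("utf-8")
    value = json.dumps(sample, sort_keys=True).encode("utf-8")
    return key, value


def from_record(value: bytes) -> dict:
    """Decode the value bytes back into a reading on the consumer side."""
    return json.loads(value.decode("utf-8"))


# Hypothetical reading; the field names are illustrative only.
sample = {
    "drone_id": 7,
    "latitude": 52.41,
    "longitude": 16.93,
    "height_m": 120.0,
    "temp_c": 14.5,
    "wind_ms": 3.2,
    "humidity": 61.0,
    "air_pollution": 38.0,
}
key, value = to_record(sample)
```

The pair `key, value` is what a producer would hand to Kafka; `from_record(value)` gives the consumer the original reading back.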



On the server, the data is read by Spark Streaming.
This allows us to:
- save data to Cassandra
- send calculated data to the browser through a websocket
- send it to another Kafka consumer
- save the whole log to a Hadoop cluster
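
The fan-out listed above can be sketched without any Spark dependency. In the sketch below each batch of readings is handed to four stand-in sinks; plain Python lists take the place of Cassandra, the websocket, the second Kafka topic, and the Hadoop log. All names are hypothetical, and the choice of summary statistics is an assumption. In Spark Streaming itself this kind of per-batch output logic would typically live inside `foreachRDD`.

```python
import json
from statistics import mean
from typing import List

# In-memory stand-ins for the four destinations (assumptions for illustration).
cassandra_rows: List[dict] = []    # rows saved to Cassandra
browser_messages: List[dict] = []  # calculated data pushed over the websocket
kafka_out: List[dict] = []         # messages for another Kafka consumer
hadoop_log: List[str] = []         # raw log lines kept on the Hadoop cluster


def summarize(batch: List[dict]) -> dict:
    """Compute the 'calculated data' for one batch (assumed statistics)."""
    return {
        "count": len(batch),
        "avg_temp_c": mean(s["temp_c"] for s in batch),
        "max_wind_ms": max(s["wind_ms"] for s in batch),
    }


def process_batch(batch: List[dict]) -> None:
    """Send one batch of readings to all four destinations."""
    if not batch:
        return  # nothing arrived in this interval
    # The raw log keeps every reading exactly as received.
    hadoop_log.extend(json.dumps(s, sort_keys=True) for s in batch)
    cassandra_rows.extend(batch)
    summary = summarize(batch)
    browser_messages.append(summary)
    kafka_out.append(summary)
```

Calling `process_batch` once per streaming interval leaves one summary per batch in `browser_messages` and `kafka_out`, and one raw line per reading in `hadoop_log`.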

Because the whole log is saved to the Hadoop cluster, we can go back to it later for anything we did not save in Cassandra.
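
As an illustration of going back to the raw log, the sketch below scans log lines and pulls out one field per reading. It assumes the log is stored as one JSON object per line and that each reading carries a `drone_id`; both are assumptions, as the presentation does not describe the log format.

```python
import json
from typing import Iterable, Iterator, Tuple


def recover_field(log_lines: Iterable[str], field: str) -> Iterator[Tuple[int, float]]:
    """Yield (drone_id, value) for every logged reading that has `field`."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(record, dict) and field in record:
            yield record.get("drone_id"), record[field]


# Hypothetical log excerpt, including one corrupt line.
log_lines = [
    '{"drone_id": 1, "wind_ms": 3.5}',
    'not json',
    '',
    '{"drone_id": 2, "temp_c": 9.0}',
]
recovered = list(recover_field(log_lines, "wind_ms"))
```

Here `recovered` holds only the wind reading of drone 1, since the other lines are either unreadable or lack the requested field.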


By sending data to the browser through a websocket, we can see where our drones are in real time, monitor their sensors, and much more.
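
A rough sketch of what a position update for the browser might contain: the most recent position of each drone, packed into one JSON message. The message layout, the function names, and the field names are hypothetical, and the readings are assumed to arrive in time order. The websocket transport itself is not shown.

```python
import json
from typing import Dict, Iterable


def latest_positions(samples: Iterable[dict]) -> Dict[int, dict]:
    """Keep only the last seen position of each drone (input in arrival order)."""
    latest: Dict[int, dict] = {}
    for s in samples:
        latest[s["drone_id"]] = {
            "lat": s["latitude"],
            "lon": s["longitude"],
            "height_m": s["height_m"],
        }
    return latest


def to_browser_message(latest: Dict[int, dict]) -> str:
    """Serialize the positions as one JSON text message, ordered by drone id."""
    drones = [{"id": drone_id, **pos} for drone_id, pos in sorted(latest.items())]
    return json.dumps({"type": "positions", "drones": drones})
```

A map in the browser could then redraw each marker from the `drones` array whenever such a message arrives.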


With Cassandra and Apache Spark, data scientists can analyze the collected data later, using:
1. Apache Zeppelin
- Apache Spark (DataFrames, RDDs) + Scala
- Apache Spark MLlib
2. Azure Machine Learning

We prefer Azure Machine Learning over Spark MLlib because we find it much easier to understand and to design new predictions with.
Read our blog post about Azure ML:

Complete ecosystem diagram
[Diagram labels: Drones, Wifi, websocket, API]
Big Data ecosystem
By Sebastian Superczynski