Big Data ecosystem

A presentation of Espeo's Big Data ecosystem, based on 1000 simulated drones flying around the city of Poznan

The drones produce and collect real-time sample data such as:

 

  • latitude,
  • longitude,
  • height(m),
  • temp(C),
  • wind(m/s),
  • humidity,
  • air pollution
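To make the sample format concrete, here is a minimal sketch of how one sample could be modelled in Scala. The field names follow the list above; the droneId and timestampMs fields, the comma-separated wire format, and the units of humidity and air pollution are assumptions made for illustration and are not taken from the presentation.

```scala
import scala.util.Try

// One telemetry sample built from the fields listed above.
// droneId and timestampMs are assumptions added for illustration.
final case class DroneSample(
  droneId: Int,
  timestampMs: Long,
  latitude: Double,
  longitude: Double,
  heightM: Double,
  tempC: Double,
  windMs: Double,
  humidity: Double,     // unit not stated in the presentation
  airPollution: Double  // unit not stated in the presentation
) {
  // Serialize as one comma-separated line (a hypothetical wire format).
  def toCsv: String =
    s"$droneId,$timestampMs,$latitude,$longitude,$heightM,$tempC,$windMs,$humidity,$airPollution"
}

object DroneSample {
  // Parse a line produced by toCsv; returns None for malformed input.
  def fromCsv(line: String): Option[DroneSample] =
    line.split(",").map(_.trim) match {
      case Array(id, ts, lat, lon, h, t, w, hum, pol) =>
        Try(
          DroneSample(id.toInt, ts.toLong, lat.toDouble, lon.toDouble,
            h.toDouble, t.toDouble, w.toDouble, hum.toDouble, pol.toDouble)
        ).toOption
      case _ => None
    }
}
```

A flat, line-oriented format like this keeps each message small and easy to log as text, but any serialization (JSON, Avro) would work the same way.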

The drone software is written in Scala

In real time it streams data to the server

Using Kafka
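As a rough sketch of what the drone side might look like, the following uses the standard Kafka Java client from Scala. The broker address, the topic name drone-telemetry, the string payload, and the choice to key messages by drone id are all assumptions for illustration; the presentation does not show the actual producer code. It needs the kafka-clients library on the classpath and a reachable broker to run.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object DroneProducerSketch {
  // Hypothetical topic name, not taken from the presentation.
  val Topic = "drone-telemetry"

  // Minimal producer configuration with string keys and values.
  def producerProps(bootstrapServers: String): Properties = {
    val props = new Properties()
    props.setProperty("bootstrap.servers", bootstrapServers)
    props.setProperty("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props.setProperty("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props
  }

  // Keying by drone id keeps all samples of one drone in the same partition,
  // which preserves their order.
  def toRecord(droneId: Int, payload: String): ProducerRecord[String, String] =
    new ProducerRecord[String, String](Topic, droneId.toString, payload)

  def main(args: Array[String]): Unit = {
    // Assumed broker address for a local test setup.
    val producer = new KafkaProducer[String, String](producerProps("localhost:9092"))
    try {
      // Example payload: latitude, longitude, height, temp, wind, humidity, air pollution.
      producer.send(toRecord(1, "52.4064,16.9252,120.0,18.5,3.2,0.61,12.0"))
      producer.flush()
    } finally producer.close()
  }
}
```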

On the server, the data is read by Spark Streaming.

It allows us to:

  • save data to Cassandra
  • send calculated data to the browser over a WebSocket
  • send it to another Kafka consumer
  • save the whole log to a Hadoop cluster
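The fan-out described above could be wired up roughly as follows. This is only a sketch that assumes the spark-streaming-kafka-0-10 integration; the topic name, consumer group, broker address, batch interval, and HDFS path are placeholders, and the presentation does not say which integration or settings were actually used. Only the Hadoop sink is written out; the Cassandra, WebSocket, and Kafka sinks are marked with a comment where they would attach.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object DroneStreamSketch {
  // Consumer settings; group id and offset policy are assumptions.
  def kafkaParams(bootstrapServers: String): Map[String, Object] = Map(
    "bootstrap.servers" -> bootstrapServers,
    "key.deserializer" -> classOf[StringDeserializer],
    "value.deserializer" -> classOf[StringDeserializer],
    "group.id" -> "drone-stream",
    "auto.offset.reset" -> "latest"
  )

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("drone-stream").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5)) // 5-second micro-batches (assumed)

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](
        Seq("drone-telemetry"), kafkaParams("localhost:9092"))
    )

    // Keep only the message payload of each Kafka record.
    val lines = stream.map(_.value)

    // Sink: save the whole log to Hadoop, one output directory per batch.
    // The namenode address and path are placeholders.
    lines.saveAsTextFiles("hdfs://namenode:8020/drones/raw/batch")

    // The other sinks (Cassandra, WebSocket, another Kafka topic) would
    // attach here in the same way, each consuming the batch RDD.
    lines.foreachRDD { rdd =>
      println(s"samples in this batch: ${rdd.count()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because every sink reads from the same stream, adding a new consumer of the data does not require any change on the drone side.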

By saving the logs to a Hadoop cluster, we can access them later if something was not saved in Cassandra

By sending data to the browser over a WebSocket, we can see where our drones are in real time, monitor their sensors, and much more

With Cassandra and Apache Spark, data scientists can analyze the collected data later, using:

1. Apache Zeppelin

  - Apache Spark (DataFrame, RDD) + Scala

  - Apache Spark MLlib

2. Azure Machine Learning
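For the Spark DataFrame route in option 1, an analysis step such as one might run from a Zeppelin notebook could look like the sketch below. The column names drone_id and temp_c and the in-memory sample rows are assumptions for illustration; in the real setup the DataFrame would presumably be loaded from Cassandra rather than built by hand.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.avg

object DroneAnalysisSketch {
  // Average temperature per drone; column names are hypothetical.
  def avgTempPerDrone(samples: DataFrame): DataFrame =
    samples.groupBy("drone_id").agg(avg("temp_c").as("avg_temp_c"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("drone-analysis")
      .master("local[*]") // local mode, for trying the query out
      .getOrCreate()
    import spark.implicits._

    // In-memory stand-in for a table that would normally come from Cassandra.
    val samples = Seq((1, 18.0), (1, 20.0), (2, 15.5)).toDF("drone_id", "temp_c")

    avgTempPerDrone(samples).orderBy("drone_id").show()
    spark.stop()
  }
}
```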

We prefer Azure Machine Learning over Spark MLlib because it is much easier to understand and makes it easier to design new predictions

Complete ecosystem diagram

[Diagram labels: Drones, Wifi, websocket, API]

Copy of Big Data ecosystem

By Sebastian Superczynski
