Flink 102

Core Concepts

Stream Basics

Life is Easy without State

(Source Float) DataStream<Float>

DataStream<Float> (map Float->Float) DataStream<Float>

DataStream<Float> (map Float->Boolean) DataStream<Boolean>

DataStream<Boolean> (sink Boolean)
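The pipeline above can be sketched as a chain of generators. This is a toy model in plain Python, not the Flink API; the slides do not say what the two maps compute, so the Fahrenheit-to-Celsius conversion, the 100-degree threshold, and all function names here are assumptions for illustration.

```python
from typing import Iterable, Iterator, List

ALERT_THRESHOLD_C = 100.0  # assumed threshold, not from the slides


def source() -> Iterator[float]:
    """Stand-in for an unbounded source; here just three Fahrenheit readings."""
    yield from [194.0, 221.0, 50.0]


def to_celsius(stream: Iterable[float]) -> Iterator[float]:
    """map Float -> Float (assumed to be a unit conversion)."""
    for fahrenheit in stream:
        yield (fahrenheit - 32.0) * 5.0 / 9.0


def is_over_threshold(stream: Iterable[float]) -> Iterator[bool]:
    """map Float -> Boolean: each output depends only on the current element."""
    for celsius in stream:
        yield celsius > ALERT_THRESHOLD_C


def sink(stream: Iterable[bool]) -> List[bool]:
    """Stand-in for a sink; collects the results into a list."""
    return list(stream)


result = sink(is_over_threshold(to_celsius(source())))
```

No operator remembers anything between elements, which is what makes this case easy.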

Stream Basics

  • DataStreams are infinite
  • Operators perform transformations
  • Data is processed in the order in which it is received
  • Backup/restore is easy: record the source offset and resume from it
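A minimal sketch of why restore is easy without state, assuming a replayable source modeled as a Python list. The names `LOG` and `run_from` are hypothetical; the only thing saved between runs is the offset.

```python
from typing import List, Optional, Tuple

# Stand-in for a replayable source such as a log with offsets.
LOG = [10.0, 20.0, 30.0, 40.0]


def run_from(offset: int, limit: Optional[int] = None) -> Tuple[List[float], int]:
    """Process records starting at `offset`.

    Returns the outputs and the offset of the next unprocessed record.
    `limit` simulates a crash after that many records.
    """
    outputs: List[float] = []
    position = offset
    for value in LOG[offset:]:
        if limit is not None and len(outputs) == limit:
            break
        outputs.append(value * 2)  # a stateless map
        position += 1
    return outputs, position


# First run stops after two records; the offset is the whole "backup".
first, saved_offset = run_from(0, limit=2)
# Restart from the saved offset and finish the job.
rest, end_offset = run_from(saved_offset)
```

Stitching the two runs together gives the same output as one uninterrupted run, because nothing besides the offset had to be remembered.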

State Instantly Makes Life HARD

  • Say we want a temperature alert that fires once when a reading goes above 100 degrees C, then does not fire again until the next time the threshold is crossed.

State Instantly Makes Life HARD

  • Requires data in order
  • Requires knowing whether we have already alerted
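A sketch of the alert as a stateful operator, in plain Python rather than Flink. The class name `AlertOnce` and the reading of "fires once" as "fires on the upward crossing only" are assumptions.

```python
class AlertOnce:
    """Fires when a reading first exceeds the threshold, then stays quiet
    until readings have dropped back to or below the threshold."""

    def __init__(self, threshold: float = 100.0) -> None:
        self.threshold = threshold
        self.alerting = False  # the state: are we currently over the threshold?

    def process(self, temperature: float) -> bool:
        over = temperature > self.threshold
        fire = over and not self.alerting  # only the upward crossing fires
        self.alerting = over
        return fire


op = AlertOnce()
in_order = [op.process(t) for t in [90.0, 101.0, 105.0, 99.0, 102.0]]

# The same readings with 99.0 arriving early give a different answer,
# which is why the operator needs its input in order.
op2 = AlertOnce()
shuffled = [op2.process(t) for t in [90.0, 99.0, 101.0, 105.0, 102.0]]
```

The in-order run fires twice and the shuffled run fires once, even though both saw the same five readings.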

Watermarking

  • Uses event time (when the reading happened) rather than arrival time
  • Buffers events for a defined period, until the watermark is reached
  • Then emits the buffered data

Watermarking

  • How long can we wait?
  • What do we do if an event arrives after our wait has expired?
  • How much memory does this use?
  • What does this do to system latency?
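The trade-offs above can be seen in a small model of watermarking. This is a sketch in plain Python under simplifying assumptions, not how Flink implements it: the watermark is taken to be the largest timestamp seen minus a fixed `max_delay`, and late events are set aside in a list. `WatermarkBuffer` and its fields are hypothetical names.

```python
import heapq
from typing import List, Tuple

Event = Tuple[int, str]  # (event-time timestamp, payload)


class WatermarkBuffer:
    def __init__(self, max_delay: int) -> None:
        self.max_delay = max_delay       # how long we are willing to wait
        self.max_seen = float("-inf")    # largest timestamp seen so far
        self.heap: List[Event] = []      # buffered events (this costs memory)
        self.late: List[Event] = []      # events that missed the watermark

    def watermark(self) -> float:
        return self.max_seen - self.max_delay

    def add(self, timestamp: int, payload: str) -> List[Event]:
        """Buffer one event and return any events now safe to emit,
        in event-time order."""
        if timestamp <= self.watermark():
            self.late.append((timestamp, payload))  # too late to reorder
            return []
        heapq.heappush(self.heap, (timestamp, payload))
        self.max_seen = max(self.max_seen, timestamp)
        ready: List[Event] = []
        while self.heap and self.heap[0][0] <= self.watermark():
            ready.append(heapq.heappop(self.heap))
        return ready


buf = WatermarkBuffer(max_delay=2)
emitted: List[Event] = []
# Arrival order differs from event-time order.
for ts, payload in [(1, "a"), (3, "c"), (2, "b"), (6, "f"), (4, "d")]:
    emitted.extend(buf.add(ts, payload))
```

Events come out sorted by event time, but each waits in the buffer until the watermark passes it (latency and memory), and event 4 arrives after the watermark has already reached 4, so it lands in `late` and needs a policy of its own.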

Flink State

  • Flink can capture state variables and remember them
  • This hurts composition
  • This means operators need to be serializable
  • This means backup and restore become an operational concern
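A sketch of what backup/restore looks like once an operator carries state, using plain Python and JSON rather than Flink's checkpointing. The `snapshot` and `restore` methods and the snapshot layout are assumptions for illustration; the point is that operator state must be saved together with the source offset.

```python
import json
from typing import List, Tuple


class AlertOnce:
    """Stateful alert operator whose state can be serialized."""

    def __init__(self, threshold: float = 100.0, alerting: bool = False) -> None:
        self.threshold = threshold
        self.alerting = alerting

    def process(self, temperature: float) -> bool:
        over = temperature > self.threshold
        fire = over and not self.alerting
        self.alerting = over
        return fire

    def snapshot(self, source_offset: int) -> str:
        """Serialize the operator state together with the source offset."""
        return json.dumps({"threshold": self.threshold,
                           "alerting": self.alerting,
                           "offset": source_offset})

    @classmethod
    def restore(cls, blob: str) -> Tuple["AlertOnce", int]:
        data = json.loads(blob)
        return cls(data["threshold"], data["alerting"]), data["offset"]


READINGS = [90.0, 105.0, 110.0, 95.0, 120.0]

# Uninterrupted run, for comparison.
reference_op = AlertOnce()
reference = [reference_op.process(t) for t in READINGS]

# Run that "crashes" after two records, then resumes from a snapshot.
op = AlertOnce()
before: List[bool] = [op.process(t) for t in READINGS[:2]]
blob = op.snapshot(source_offset=2)
restored, offset = AlertOnce.restore(blob)
after: List[bool] = [restored.process(t) for t in READINGS[offset:]]

# Resuming from the offset alone, with fresh state, re-fires the alert.
fresh_after = [AlertOnce().process(READINGS[2])]
```

Restoring both state and offset reproduces the uninterrupted output; restoring only the offset produces a duplicate alert at the first resumed record.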

Flink Windows

  • What if we wanted to know the average temperature during an alert?
  • We now need to collect data for an arbitrary period
  • Windowing lets us aggregate data according to some criteria and emit a collection.
  • The window starts when we pass the threshold and collects data until we drop back under it
  • The collected data points are then fired to an operator that computes an average
  • Windows can be very simple too (e.g. collect 1 minute of data)
  • Windows don't like to reopen (tricky)
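Both kinds of window described above can be sketched with generators. This is plain Python, not Flink's windowing API; the function names are hypothetical, input is assumed to arrive in order, and a window that is still open when the input ends is deliberately not emitted.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def alert_window_averages(readings: Iterable[float],
                          threshold: float = 100.0) -> Iterator[float]:
    """Open a window when a reading passes the threshold, collect until a
    reading is back at or under it, then emit the window's average."""
    window: List[float] = []
    for temperature in readings:
        if temperature > threshold:
            window.append(temperature)
        elif window:
            yield sum(window) / len(window)
            window = []


def tumbling_window_averages(events: Iterable[Tuple[int, float]],
                             size: int = 60) -> Iterator[Tuple[int, float]]:
    """Fixed-size windows over (timestamp_seconds, temperature) events.
    Emits (window_start, average) once an event from a later window arrives."""
    current_start: Optional[int] = None
    window: List[float] = []
    for timestamp, temperature in events:
        start = timestamp - timestamp % size
        if current_start is not None and start != current_start:
            yield current_start, sum(window) / len(window)
            window = []
        current_start = start
        window.append(temperature)


alert_averages = list(
    alert_window_averages([90.0, 102.0, 104.0, 99.0, 110.0, 95.0, 120.0]))
minute_averages = list(
    tumbling_window_averages([(0, 10.0), (30, 20.0), (61, 30.0), (125, 50.0)]))
```

In both cases a window is emitted once and then discarded, so a late reading that belongs to an already-emitted window has nowhere to go; that is the "reopen" problem in miniature.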

Final Graph

More interesting things

  • Streams form a graph
  • Streams can be forked
  • Operators can emit zero to many data points.
  • Cycles should be avoided
  • We can query running state
  • Data must always advance forward through the graph
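Forking and zero-to-many operators can be sketched with generators and `itertools.tee`. This is a toy model in plain Python, not the Flink API; the comma-separated input format and the names `split_batches`, `branch_alerts`, and `branch_count` are assumptions for illustration.

```python
import itertools
from typing import Iterable, Iterator


def split_batches(lines: Iterable[str]) -> Iterator[float]:
    """Zero-to-many operator: each input line may hold no readings,
    one reading, or several comma-separated readings."""
    for line in lines:
        for field in line.split(","):
            if field.strip():
                yield float(field)


raw = ["98.5,101.0", "", "103.5"]  # the empty line produces no output
readings = split_batches(raw)

# Fork the stream: both branches see every reading.
branch_alerts, branch_count = itertools.tee(readings)

alerts = [t for t in branch_alerts if t > 100.0]  # branch 1: filter
count = sum(1 for _ in branch_count)              # branch 2: count
```

Data only flows forward here: each branch consumes what the upstream operator produced and nothing feeds back into `split_batches`.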

Streaming vs Micro Batching

  • Streaming lets us play data in event-time order
  • Windowing sounds similar to a batch, but allows more flexibility in terms of window size

Nothing New Under the Sun

  • People have done this all before, but Flink as a framework assembles these features for us.

By Philip Doctor
