Intro to
Akka Streams & HTTP
Lance Arlaus
ny-scala
blog.lancearlaus.com
Going Reactive
quick take
-
Reactive is Real
- four tenets
- expanding options
-
Streams are Effective
- versatile abstraction
Intro to Akka Streams and HTTP
Overview
- Reactive Streams Basics
- concept: push-based streams
- innovation: demand-based flow
- Akka Streams Basics
- capability: reusable flow components
- abstraction: visual flow construction (DSL)
- Akka HTTP Basics
- improvement: fully stream-based
- Sample Service
What is a stream?
natural abstraction for sequenced data
(file, network, events, ...)
- has a beginning
- may be unbounded
- may be non-repeatable
Example | Bounded | Repeat | |
---|---|---|---|
(6, 7, 8, 9) | Fixed integer sequence | Yes | Yes |
(3, 4, 5, ...) | Infinite integer sequence | No | Yes |
(9, 3, 1, 6) | Fixed-length random sequence | Yes | No |
(7, 2, 4, ...) | Infinite random sequence | No | No |
(GET, ...) | Incoming HTTP requests | ? | ? |
What is the challenge?
mismatched producer and consumer
-
fast producer / slow consumer
- producer blocks (sync only)
- consumer drops
- slow producer / fast consumer
- consumer blocks e.g. iterator.next()
FAST (10/sec)
slooww (2/sec)
Robust Stream Processing
Reactive Streams
handle large data sets or rapid events with bounded resources
- data only flows downstream in response to demand
- all interfaces non-blocking
Akka Streams
reusable, natural expression of stream processing atop Akka
- data transformations
- flow graph components
Reactive Streams Basics
-
Publisher
-
Subscriber
-
Subscription
-
Processor
Publisher
-
how does data flow?
-
key concept: streams are push-based
-
-
data is never directly pulled from a Publisher
- no next() method
-
subscribe to Publisher, receive events (later) via Subscriber
Reactive Streams Basics
public interface Publisher<T> {
void subscribe(Subscriber<? super T> s);
}
Subscriber
-
asynchronous data events
- publisher.subscribe(subscriber)
- subscriber.onNext(element), ...
- subscriber.onComplete()
-
events pushed to subscriber
-
what about errors?
-
subscriber.onError(error)
-
Reactive Streams Basics
public interface Subscriber<T> {
void onSubscribe(Subscription s);
void onNext(T t);
void onError(Throwable t);
void onComplete();
}
how is this different than traditional event listeners or reactive extensions (Rx.NET, et al)?
Subscription
- deconstruct Iterator.next()
- signal demand (call implies demand)
- deliver data (return value)
- signal error (exception)
- data & error via subscriber.on(Next|Error)
- signal demand
-
- subscribe() implies demand? no
Reactive Streams Basics
Subscription
-
when does data flow?
- key innovation: flow is demand-based
- explicitly signal demand
-
-
subscription.request(count)
-
- no data events flow until demand is signaled
-
-
publisher.subscribe(subscriber)
-
subscriber.onSubscribe(subscription)
-
subscription.request(count)
-
subscriber.onNext(element), ...
-
Reactive Streams Basics
public interface Subscription {
void request(long n);
void cancel();
}
Processor
-
both Subscriber and Publisher
- processing stage (e.g. data transformation)
- non-terminal
Reactive Streams Basics
public interface Processor<T, R>
extends Subscriber<T>, Publisher<R> {
}
Reactive Streams Basics
Flow Visualization
the flow of demand and data
Summary
- key concept: streams are push-based
- key innovation: flow is demand-based
- Reactive Streams footprint
-
- Publisher
- Subscriber
- Subscription
- Processor
- async/non-blocking
Reactive Streams Basics
public interface Publisher<T> {
void subscribe(Subscriber<? super T> s);
}
public interface Subscriber<T> {
void onSubscribe(Subscription s);
void onNext(T t);
void onError(Throwable t);
void onComplete();
}
public interface Subscription {
void request(long n);
void cancel();
}
public interface Processor<T, R>
extends Subscriber<T>, Publisher<R> {
}
Streams: From Reactive to Akka
Reactive Streams
objective: minimal, well-specified integration API
- multi-vendor integration (part of Java 9?)
- not a user level API (really a SPI)
Akka Streams
objective: develop streaming applications
- component definition (source, sink, flow, etc.)
- transformation library (data, stream)
- graph construction (fan out/in, DSL)
-
- linear, branching, cyclic
- integration / customization
-
- Akka publisher / subscriber
- custom stages
|
Level1
(Basic)
|
Level 2
(Intermediate)
|
Level 3
(Advanced)
|
Concept
|
Stream
Graph, Shape, Inlet, Outlet
Materialization
|
Buffers
Stream of streams
Attributes
|
Cyclic Graphs
Recovery
|
Component
Shape Library
|
Source, Sink, Flow
|
Broadcast, Zip, ZipWith, Unzip, UnzipWith, Merge
|
FlexiRoute, FlexiMerge
BidiFlow
|
Transform
|
Data Transformation
|
Custom Transformation
Stream Transformation
Other
|
Custom Materialization
Stream Transformation
Error Handling
|
Construct
|
Linear Flows
DSL, Builder
|
Branching Flows
|
Cyclic Flows
Protocol Flows
|
Customize
|
N/A
|
|
|
Test
|
|
|
|
Akka Streams Topic Map
Today
Basic Building Blocks
- Source
- Sink
- Flow
Akka Streams Level 1 (Basics)
*conceptually
val source = Source(1 to 3)
val sum = Flow[Int].fold(0.0)(_ + _)
val sink = Sink.foreach[Double](println)
// Prints '6.0'
source.via(sum).to(sink).run
the streaming function
Function
Input
Output
Flow
Source
Sink
→
~>
~>
Publisher*
Processor*
Subscriber*
Inlet
→
Outlet
Shape: Inlets & Outlets
Akka Streams Level 1 (Basics)
Shape is to Graph as Signature is to Function
Function :
Graph :
Inputs & Outputs (Signature)
Inlets & Outlets (Shape)
val source: Source[Int] = Source(1 to 3)
val sum: Flow[Int, Double] = Flow[Int].fold(0.0)(_ + _)
val sink: Sink[Double] = Sink.foreach[Double](println)
// What is the shape of the following?
val runnable: RunnableGraph = source.via(sum).to(sink)
runnable.run
note: types intentionally simplified
Running a Graph
Akka Streams Level 1 (Basics)
implicit val system = ActorSystem("akka-streams")
implicit val materializer = ActorMaterializer()
// Create a runnable graph, steps omitted
val runnable: RunnableGraph = source.via(flow).to(sink)
// Run the graph with implicit Materializer
runnable.run()
Materializer is to Graph as
ExecutionContext is to Future
- Graph defines blueprint (akin to function def)
- Materializer runs a RunnableGraph (akin to function call)
- materialization allocates runtime resources
- Akka Streams uses Actors
- other (Spark, for example) theoretically possible
Materialized Value
Akka Streams Level 1 (Basics)
// Materialized type is the last type parameter by convention
// Sink that materializes a Future that completes when stream completes
val printer: Sink[Int, Future[Unit]] = Sink.foreach[Int](println)
// Sink that materializes a Future that completes with the first stream element
val head: Sink[Int, Future[Int]] = Sink.head[Int]
// Sources often don't materialize anything
val source: Source[Int, Unit] = Source(1 to 3)
// Source that emits periodically until cancelled via the materialized Cancellable
val ticks: Source[Int, Cancellable] = Source(1.second, 5.seconds, 42)
// Note that the above are merely blueprints
// No materialized values are produced until a graph is materialized
// Materialize a Graph which will run indefinitely or until cancelled
// Any graph can only materialize a single value
// Both printer and ticks materialize values (Future[Unit] and Cancellable)
// runWith() selects the target's materialized value
val cancellable: Cancellable = printer.runWith(ticks)
// Cancel the materialized ticks source
cancellable.cancel
Graph materialization result
- runtime resource produced by a Graph during materialization
- related to / used by processing, but not part of the stream itself
Concepts Checkpoint
Akka Streams Level 1 (Basics)
// Create flow materializer
implicit val system = ActorSystem("akka-streams")
implicit val materializer = ActorMaterializer()
// Create graph components
val nums = (1 to 10)
val source: Source[Int, Unit] = Source(nums)
val sum: Flow[Int, Int, Unit] = Flow[Int].fold(0)(_ + _)
val triple: Flow[Int, Int, Unit] = Flow[Int].map(_*3)
val head: Sink[Int, Future[Int]] = Sink.head[Int]
// Assemble and run a couple of graphs
val future1a: Future[Int] = source.via(sum).to(head).run
val future2a: Future[Int] = source.via(triple).via(sum).to(head).run
// Perform some basic tests
whenReady(future1a)(_ shouldBe nums.sum)
whenReady(future2a)(_ shouldBe (nums.sum * 3))
// Equivalent to the above graphs, using shortcuts for brevity
val future1b = Source(nums).runFold(0)(_ + _)
val future2b = Source(nums).via(triple).runFold(0)(_ + _)
Transformations
- map, mapAsync
- fold, scan, filter, collect
- take, drop
- take/drop(While|Within)
- grouped, groupedWithin
Akka Streams Level 1 (Basics)
the usual suspects...
// Source[Out, Mat]
def mapAsync[T](parallelism: Int)(f: (Out) ⇒ Future[T]): Source[T, Mat]
def takeWhile(p: (Out) ⇒ Boolean): Source[Out, Mat]
def takeWithin(d: FiniteDuration): Source[Out, Mat]
Transformations
- concat, concatMat
- flatten
- prefixAndTail, split(After|When)
- groupBy
- conflate, expand
Akka Streams Level 1 (Basics)
...and a few more
// Source[Out, Mat]
def prefixAndTail[U >: Out](n: Int): Source[(Seq[Out], Source[U, Unit]), Mat]
def splitWhen[U >: Out](p: (Out) ⇒ Boolean): Source[Source[U, Unit], Mat]
def groupBy[K, U >: Out](f: (Out) ⇒ K): Source[(K, Source[U, Unit]), Mat]
def conflate[S](seed: (Out) ⇒ S)(aggregate: (S, Out) ⇒ S): Source[S, Mat]
def expand[S, U](seed: (Out) ⇒ S)(extrapolate: (S) ⇒ (U, S)): Source[U, Mat]
Sample Application
Akka Streams & HTTP Level 1 (Basics)
enhanced historical price service
- calculate simple moving average (SMA)
- parse historical price CSV stream
- enhance historical price CSV stream
Part I (Streams)
- expose enhanced price service endpoint
- request historical prices via HTTP
- stream enhanced prices
Part II (HTTP)
Enhanced Price Service
Akka Streams Level 1 (Basics)
sample data
Date, Open, High, Low, Close, Volume, Adj Close
2014-12-31, 25.3, 25.3, 24.19, 24.84, 1438600, 24.84
2014-12-30, 26.28, 26.37, 25.29, 25.36, 766100, 25.36
2014-12-29, 26.64, 26.8, 26.13, 26.42, 619700, 26.42
2014-12-26, 27.25, 27.25, 26.42, 26.71, 360400, 26.71
Date, Open, High, Low, Close, Volume, Adj Close, Adj Close SMA(3)
2014-12-31, 25.3, 25.3, 24.19, 24.84, 1438600, 24.84, 25.54
2014-12-30, 26.28, 26.37, 25.29, 25.36, 766100, 25.36, 26.16
input
output
- reverse time series (most recent price first)
http://localhost:8080/stock/price/daily/yhoo?calculated=sma(3)
http://real-chart.finance.yahoo.com/table.csv?s=yhoo& ...
Enhanced Price Service
Akka Streams Level 1 (Basics)
- calculate
- enhance
- expose
Akka Streams Level 1 (Basics)
// Calculate simple moving average (SMA) using scan()
// to maintain a running sum and a sliding window
def sma(N: Int) = Flow[Double]
.scan((0.0, Seq.empty[Double])) {
case ((sum, win), x) =>
win.size match {
case N => (sum + x - win.head, win.tail :+ x)
case _ => (sum + x, win :+ x)
}
}
.drop(N) // Drop initial and incomplete windows
.map { case (sum, _) => sum / N.toDouble }
Source(1 to 5)
.map(n => (n*n).toDouble)
.via(sma(3))
.runForeach(sma => println(f"$sma%1.2f")
// Output:
// 4.67
// 9.67
// 16.67
Enhanced Price Stream
simple moving average (SMA)
Akka Streams Level 1 (Basics)
import akka.stream.io.Framing
// A CSV file row, parsed into columns
type Row = Array[String]
// Parse incoming bytes into CSV record stream
// Note: Each ByteString may contain more (or less) than one line
def parse(maximumLineLength: Int = 256): Flow[ByteString, Row, Unit] =
Framing.delimiter(ByteString("\n"), maximumLineLength, allowTruncation = true)
.map(_.utf8String.split("\\s*,\\s*"))
// Select a specific column (including header) by name
def select(name: String): Flow[Row, String, Unit] = Flow[Row]
.prefixAndTail(1).map { case (header, rows) =>
header.head.indexOf(name) match {
case -1 => Source.empty[String] // Named column not found
case index => Source.single(name)
.concatMat(rows.map(_(index)))(Keep.right)
}
}.flatten(FlattenStrategy.concat)
// Convert row into CSV formatted ByteString
val format = Flow[Row].map(row => ByteString(row.mkString("", ",", "\n")))
Enhanced Price Stream
CSV handling
Akka Streams Level 1 (Basics)
Enhanced Price Stream
SMA column
// Calculate and format SMA for a column, renaming the column
def smaCol(name: String, n: Int, format: String = "%1.2f") = Flow[String]
.prefixAndTail(1)
.map { case (header, data) =>
Source.single(name).concatMat(
data.map(_.toDouble)
.via(calculate.sma((n)))
.map(_.formatted(format))
)(Keep.right)
}
.flatten(FlattenStrategy.concat)
Adj Close
24.84
25.36
26.42
26.71
Adj Close SMA(3)
25.54
26.16
Branching Flows
Akka Streams Level 2 (Intermediate)
a common pattern
- branch stream (fan-out)
- transform each branch
- combine results (fan-in)
Fan-Out Shapes:
Fan-In Shapes:
Broadcast, Unzip, UnzipWith
Zip, ZipWith, Merge, MergePreferred
Graph DSL
Akka Streams Level 1 (Basics)
visual flow construction
bcast ~> ~> append.in0
bcast ~> select ~> smaCol ~> append.in1
// Calculate and append SMA column
def appendSma(n: Int): Flow[Row, Row, Unit] = Flow(
Broadcast[Row](2),
csv.select("Adj Close"),
smaCol(s"Adj Close SMA($n)", n),
ZipWith((row: Row, col: String) => row :+ col)
)((_, _, _, mat) => mat) {
implicit builder => (bcast, select, smaCol, append) =>
bcast ~> append.in0
bcast ~> select ~> smaCol ~> append.in1
(bcast.in, append.out)
}
appendSma
Akka Streams Level 1 (Basics)
import akka.stream.io._
val inSource = SynchronousFileSource(new File("input.csv"))
val expSource = SynchronousFileSource(new File("expected.csv"))
val builder = new ByteStringBuilder()
val outSink = Sink.foreach[ByteString](builder ++= _)
val outSource = Source(() => Iterator.single(builder.result()))
val window = 3
val smaName = s"Adj Close SMA($window)"
val future = inSource.via(csv.parse()).via(quote.appendSma(window)).via(csv.format)
.runWith(outSink)
whenReady(future) { unit =>
// Compare SMA column from output and expected
val selectSma = csv.parse().via(csv.select(smaName)).drop(1).map(_.toDouble)
val outFuture = outSource.via(selectSma).runFold(List.empty[Double])(_ :+ _)
val expFuture = expSource.via(selectSma).runFold(List.empty[Double])(_ :+ _)
whenReady(Future.sequence(Seq(outFuture, expFuture))) { case out :: exp :: Nil =>
out should have size exp.size
out.zip(exp).foreach { case (out, exp) =>
out shouldBe exp
}
}
}
Enhanced Price Stream
testing
Sample Application
Akka HTTP Level 1 (Basics)
enhanced historical price service
- calculate simple moving average (SMA)
- parse historical price CSV stream
- enhance historical price CSV stream
Part I (Streams)
- expose enhanced price service endpoint
- request historical prices via HTTP
- stream enhanced prices
Part II (HTTP)
Introducing Akka HTTP
stream-based web services
- "Spray 2.0" - address weaknesses and polish features
- key improvement: fully stream-based
- easily handle chunked responses and large entities
- address missing features (WebSockets, anyone?)
- reusable streams transformations
- extensible HTTP model and spec implementation
Akka HTTP Level 1 (Basics)
From Akka Streams to HTTP
a natural mapping
in
out
Akka HTTP Level 1 (Basics)
Akka HTTP
fully stream-based
// TCP and HTTP protocols modeled as Flows
// Flow is materialized for each incoming connection
// TCP
Tcp().bindAndHandle(handler: Flow[ByteString, ByteString, _], ...)
// HTTP (server low-level)
// Q: How is HTTP pipelining supported?
Http().bindAndHandle(handler: Flow[HttpRequest, HttpResponse, Any], ...)
// HTTP (client low-level)
Http().outgoingConnection(host: String, port: Int = 80, ...):
Flow[HttpRequest, HttpResponse, Future[OutgoingConnection]]
// HTTP entities modeled using ByteString Sources
// Request entity modeled as a source of bytes (Source[ByteString])
val request: HttpRequest
val dataBytes: Source[ByteString, Any] = request.entity.dataBytes
// Use Source[ByteString] to create response entity
val textSource: Source[ByteString, Any]
val chunked = HttpEntity.Chunked.fromData(MediaTypes.`text/plain`, textSource)
val response = HttpResponse(entity = chunked)
Akka HTTP Level 1 (Basics)
Enhanced Price Service
Akka HTTP Level 1 (Basics)
mock service endpoint
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.model.MediaTypes._
import akka.stream.io.SynchronousFileSource
// Find mock data file (if it exists) for the given symbol
def mockFile(symbol: String): Option[File] =
Option(getClass.getResource(s"/mock/stock/price/$symbol.csv"))
.map(url => new File(url.getFile))
.filter(_.exists)
// http://localhost/stock/price/daily/yhoo
val route: Route =
(get & path("stock"/"price"/"daily" / Segment)) { (symbol) =>
complete {
mockFile(symbol) match {
// Create chunked response from Source[ByteString]
case Some(file) => HttpEntity.Chunked.fromData(`text/csv`, SynchronousFileSource(file))
case None => NotFound
}
}
}
// Run a server with the given route (implicit conversion to Flow[HttpRequest, HttpResponse])
val binding = Http().bindAndHandle(route, "localhost", 8080)
Enhanced Price Service
Akka HTTP Level 1 (Basics)
fetching data
import csv.Row
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.HttpEntity
import akka.http.scaladsl.model.MediaTypes._
import akka.http.scaladsl.server.Directives._
trait StockPriceClient {
def history(symbol: String): Future[Either[(StatusCode, String), Source[Row, Any]]]
}
case class YahooStockPriceClient
(implicit system: ActorSystem, executor: ExecutionContextExecutor, materializer: Materializer)
extends StockPricesClient
{
override def history(symbol: String) = {
val uri = buildUri(symbol) // http://real-chart.finance.yahoo.com/table.csv?s=$symbol&...
Http().singleRequest(RequestBuilding.Get(uri)).map { response =>
response.status match {
case OK => Right(response.entity.dataBytes.via(csv.parse())
case NotFound => Left(NotFound -> s"No data found for $symbol")
case status => Left(status -> s"Request to $uri failed with status $status")
}
}
}
}
Enhanced Price Service
Akka HTTP Level 1 (Basics)
expose service endpoint
import csv.Row
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.HttpEntity
import akka.http.scaladsl.model.MediaTypes._
import akka.http.scaladsl.server.Directives._
trait StockPriceClient {
def history(symbol: String): Future[Either[(StatusCode, String), Source[Row, Any]]]
}
val client: StockPriceClient
// http://localhost/stock/prices/daily/yhoo/sma(10)
val route: Route =
(get & path("stock"/"prices"/"daily" / Segment / "sma(" ~ IntValue ~ ")")) {
(symbol, window) =>
client.history(symbol).map[ToResponseMarshallable] {
case Right(source) => HttpEntity.Chunked.fromData(`text/csv`,
source.via(quote.appendSma(window)).via(csv.format))
case Left(err @ (NotFound, _)) => err
case Left(_) => ServiceUnavailable -> "Error calling underlying service"
}
}
Enhanced Price Service
Akka Streams Level 1 (Basics)
solution review
Full Sample
Intro to Akka Streams & HTTP
https://github.com/lancearlaus/akka-streams-http-intro
- Bitcoin trades OHLCV service
- stackable services
- WebSockets
- custom stage
- custom route segments / parameters
- flow graph packaging
more to explore
⇒ git clone https://github.com/lancearlaus/akka-streams-http-introduction
⇒ sbt run
[info] Running Main
Starting server on localhost:8080...STARTED
Get started with the following URLs:
Stock Price Service:
Yahoo with default SMA : http://localhost:8080/stock/price/daily/yhoo
Yahoo 2 years w/ SMA(200) : http://localhost:8080/stock/price/daily/yhoo?period=2y&calculated=sma(200)
Facebook 1 year raw history : http://localhost:8080/stock/price/daily/fb?period=1y&raw=true
Bitcoin Trades Service:
Hourly OHLCV (Bitstamp USD) : http://localhost:8080/bitcoin/price/hourly/bitstamp/USD
Daily OHLCV (itBit USD) : http://localhost:8080/bitcoin/price/daily/itbit/USD
Recent trades (itBit USD) : http://localhost:8080/bitcoin/trades/itbit/USD
Trades raw response : http://localhost:8080/bitcoin/trades/bitstamp/USD?raw=true
Intro to Akka Streams and HTTP
Summary
- Reactive Streams Basics
- Akka Streams Basics
- Akka HTTP Basics
- Sample Service
Questions?
Intro to Akka Streams and HTTP
By Lance Arlaus
Intro to Akka Streams and HTTP
Got Streams? Streams are an effective way to model many real-world data problems from event processing to the typical service request/response cycle. Learn the basics of Akka Streams & HTTP through a practical introductory training talk, no prior Streams experience required. The talk will cover Akka Streams from the ground up, starting with Reactive Streams concepts, to give participants a solid foundation. Akka Streams & HTTP building blocks and capabilities will be explained and showcased via a stream processing example exposed as an Akka HTTP service.
- 5,644