Streams

What are streams for?

  • Infinite data
  • Complex event handling
  • IO: side-effects, concurrency and resource management
  • Iteration

What is fs2?

  • Building streams
  • Transforming streams
  • Based on cats effect
  • Haskell's IterateeIO

Trivial Example

case class Cat(name: String)

Stream(Cat("Mao"))
 .repeat
 .take(3)
 .compile
 .toList
 
 // List(Cat("Mao"), Cat("Mao"), Cat("Mao"))

How does it work?

sealed trait Pull[R]
  
case class Output(value: Cat) extends Pull[Unit]
case class Uncons(inner: Pull[Unit]) extends Pull[Option[Cat]]
case class Bind[R, R2](inner: Pull[R], f: R => Pull[R2]) 
  extends Pull[R2]
case class Pure[R](value: R) extends Pull[R]

A free monad

Trivial Example

case class Cat(name: String)

Stream(Cat("Mao")) // Build the DSL
 .repeat
 .take(3)
 .compile // Interpret the DSL
 .toList
 
 // List(Cat("Mao"), Cat("Mao"), Cat("Mao"))

Effects

Stream.eval(IO.println("Mao"))
 .repeat
 .take(3)
 .compile
 .drain
 .unsafeRunSync()
 
 // Prints
 // Mao
 // Mao
 // Mao

Effects

Stream.eval(IO.println("Mao"))
 .repeat
 .take(3)
 .evalMap(_ => IO.println("Nyan Cat"))
 .compile
 .drain
 .unsafeRunSync()

fs2 is pull-based

  • "Pull" as opposed to "Push"
  • "Internal" as opposed to "External"
  • "Cold" as opposed to "Hot"
  • Only pull as many elements as requested
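
A minimal sketch of what "only pull as many elements as requested" looks like in practice, assuming fs2 3.x and Cats Effect 3 are on the classpath. The names PullDemo, numbers and pulled are illustrative and do not come from the slides. The source stream is infinite, yet the logging effect should only run for the three elements that take(3) asks for.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream

// Hypothetical demo object; not part of the original talk.
object PullDemo {
  // An infinite stream of integers: 1, 2, 3, ...
  val numbers: Stream[IO, Int] = Stream.iterate(1)(_ + 1).covary[IO]

  // evalTap logs each element at the moment it is pulled.
  // take(3) stops requesting elements after the third one,
  // so the rest of the infinite stream is never evaluated.
  val pulled: List[Int] =
    numbers
      .evalTap(n => IO.println(s"Pulled $n"))
      .take(3)
      .compile
      .toList
      .unsafeRunSync()
}
```

Try moving take(3) above evalTap, or replacing it with take(5), and predict what gets printed before running it.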

How many cats are present?

val randomCat: IO[Cat] = ???

Stream.eval(randomCat)
  .repeat
  .take(10)
  .compile
  .drain

How can we tell?

val randomCat: IO[Cat] = ???

val randomCatResource: Resource[IO, Cat] = Resource.make(
  IO.println("Added a cat") >> randomCat
)(_ => IO.println("Removed a cat"))

Stream.resource(randomCatResource)
  .repeat
  .take(10)
  .compile
  .drain

Performance

Mao

Maru

Bob

Nyan Cat

Chunks

  • Processed atomically
  • Some operations (evalMap, evalScan) break down chunks
  • Favour evalMapChunk and evalScanChunk
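
One way to observe the difference is to count chunk sizes at the end of a pipeline. This is a small experiment rather than a benchmark, assuming fs2 3.x and Cats Effect 3. ChunkDemo, source and chunkSizes are hypothetical names introduced for illustration.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream

// Hypothetical demo object; not part of the original talk.
object ChunkDemo {
  // Stream(...) emits its arguments as a single chunk of four elements.
  val source: Stream[IO, Int] = Stream(1, 2, 3, 4).covary[IO]

  // Collects the size of every chunk that reaches the end of the pipeline.
  def chunkSizes(s: Stream[IO, Int]): List[Int] =
    s.chunks.map(_.size).compile.toList.unsafeRunSync()

  // evalMap runs the effect per element and emits singleton chunks.
  val viaEvalMap: List[Int] =
    chunkSizes(source.evalMap(n => IO.pure(n + 1)))

  // evalMapChunk runs the effects for a whole chunk and keeps it together.
  val viaEvalMapChunk: List[Int] =
    chunkSizes(source.evalMapChunk(n => IO.pure(n + 1)))
}
```

With these versions, viaEvalMap should come out as List(1, 1, 1, 1) and viaEvalMapChunk as List(4). The trade-off is that evalMapChunk evaluates the effects for a whole chunk before emitting any of its results.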

Pull vs Push

Welcome back!

  • Concurrency and non-determinism
  • IO and resource safety
  • Streams are programs, not data types
  • Chunking
  • Creating streams from queues
  • Backpressure: pull vs push
  • Concurrency with streams
  • Combining streams

Experimentation

  • Small experiments
  • Make predictions
  • Tweak the code
  • Break the code

Thread Pools

What are they?

Why have several?

[error]    |  program.unsafeRunSync()
[error]    |                     ^
[error]    |Could not find an implicit IORuntime.
[error]    |
[error]    |The following import might fix the problem:
[error]    |
[error]    |  import cats.effect.unsafe.implicits.global

Reacting to errors

// http4s
BlazeClientBuilder.withExecutionContext(ec)

Possibly incorrectly

implicit val ec: ExecutionContext = ExecutionContext.global

Different kinds of work

  • Compute
  • Blocking
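
A rough sketch of how the two kinds of work can be marked in Cats Effect 3, which runs compute-bound and thread-blocking work on separate pools. WorkKinds and its fields are hypothetical names, and Thread.sleep stands in for a real blocking call.

```scala
import cats.effect.{IO, IOApp}

// Hypothetical demo application; not part of the original talk.
object WorkKinds extends IOApp.Simple {
  // Compute: CPU-bound work that stays on the runtime's compute pool.
  val compute: IO[Int] = IO((1 to 1000).sum)

  // Blocking: IO.blocking marks the body as thread-blocking so the
  // runtime can run it away from the compute threads.
  val blocking: IO[String] = IO.blocking {
    Thread.sleep(100) // stands in for a blocking call such as file or JDBC access
    "done"
  }

  val run: IO[Unit] =
    for {
      total  <- compute
      status <- blocking
      _      <- IO.println(s"sum = $total, blocking call = $status")
    } yield ()
}
```

IOApp supplies its own IORuntime, so no implicit runtime import is needed here.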

Learning to Learn

Spot the problem

Exception in thread "main" java.lang.IndexOutOfBoundsException: 3
        at scala.collection.LinearSeqOps.apply(LinearSeq.scala:117)
        at scala.collection.LinearSeqOps.apply$(LinearSeq.scala:114)
        at scala.collection.immutable.List.apply(List.scala:79)
        at index.App$.lastUserId(App.scala:6)

  def lastUserId(userIds: List[Int]): Int = 
    userIds(userIds.length)

 The collection of user ids is described by the Scala List. This is zero-indexed, so the last user id is found at index N - 1 where N is the length of the list. When an index greater than this number is used, an exception is raised and the program exits.
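
Following that explanation, one possible correction is sketched below using only the Scala standard library. UserIdLookup and lastUserIdOption are hypothetical names added for illustration.

```scala
// Hypothetical object name; the slides only show the lastUserId method.
object UserIdLookup {
  // The last valid index of a zero-indexed list of length N is N - 1.
  // This still throws IndexOutOfBoundsException for an empty list.
  def lastUserId(userIds: List[Int]): Int =
    userIds(userIds.length - 1)

  // Alternative: lastOption returns None for an empty list instead of throwing.
  def lastUserIdOption(userIds: List[Int]): Option[Int] =
    userIds.lastOption
}
```

For example, UserIdLookup.lastUserId(List(7, 8, 9)) returns 9, and UserIdLookup.lastUserIdOption(Nil) returns None.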

We're going to think about

  • How to think about problems you've seen
  • How to think about problems you haven't
  • How to learn to solve them
  • How to learn effectively

Abstraction

 The collection of user ids is described by the Scala List. This is zero-indexed, so the last user id is found at index N - 1 where N is the length of the list. When an index greater than this number is used, an exception is raised and the program exits.

Notice we don't mention solid state physics

Concept Map

Notional Machine

     The computer:

  • Finds the length of the list of user ids
  • Gets the user id at the length index
  • Throws an exception
  • Skips the rest of the program

 

  def lastUserId(userIds: List[Int]): Int = 
    userIds(userIds.length)

An abstraction of abstraction

The Problem

Exception in thread "main" java.lang.NoClassDefFoundError: 
  scalacache/CacheConfig$
        at utils.UserIds$.cache(UserIds.scala:12)
        at index.App$.index(App.scala:15)
        at index.index.main(App.scala:13)
Caused by: java.lang.ClassNotFoundException: 
  scalacache.CacheConfig$
        at java.base/jdk.internal.loader...

import utils.UserIds
...
val cache = UserIds.cache[IO]
 

The worst that can happen...

  • It takes several days to resolve the problem.
  • You find something that works, but don't really understand why.
  • Several months later, you see another NoClassDefFoundError.
  • You can't remember how you resolved it before.

Exercise

  • Write your own explanation
  • Draw a concept map to help
    • Use the terms "Scala", "SBT" and "scalacache"

Scala compiled the program, but failed to run it. Perhaps our Scala compiler didn't compile scalacache properly?

SBT threw an error when running the program. This is a problem with our SBT configuration.

The error originates in scalacache. There's a bug in scalacache, or in the way we're using it.

Explanations

A mental model...

  • is always incomplete
  • may be incorrect
  • is usually good enough

Refining a mental model

  • Draw it on paper
  • Explain it to someone else
  • Cross reference it
    • Read the docs
    • Ask someone more experienced
  • Experiment
  • Review and redraw

Metacognition

Think about your thinking

  • As you solved the problem, what did you spend your time doing?
  • Did your mental model change?
  • Draw concept maps as you go

Review

  • The importance of abstraction
  • Mental models and concept maps
  • How to refine a mental model

Thank you!

Fs2 Streams

By Zainab Ali
