Dispelling Magic behind Concurrency in FP

Piotr Gawryś

Who am I?

  • One of the maintainers of Monix
  • Contributor to the Typelevel ecosystem
  • Kraków Scala User Group co-organizer
    (let me know if you'd like to speak!)
  • Super excited to be here!

https://github.com/Avasil

twitter.com/p_gawrys

What's this talk about

  • Concurrency fundamentals on the JVM
  • Focused around purely functional programming (Cats-Effect, FS2, Monix, ZIO)
  • No overview/marketing of any of them

Concurrency

  • Processing multiple tasks by interleaving their execution
  • Doesn't require more than one thread to work
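
To make the single-thread point concrete, here is a minimal sketch that uses only the Scala standard library and java.util.concurrent (Scala 2.12+ assumed), not any of the libraries named above. Two tasks are split into steps, and every step re-submits the next one to a single-threaded executor, so the tasks interleave on one thread. The names InterleavingDemo, step and run are made up for this illustration.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.collection.mutable.ArrayBuffer

object InterleavingDemo {
  // Runs two 3-step tasks on ONE thread and returns the order the steps ran in.
  def run(): List[String] = {
    val pool = Executors.newSingleThreadExecutor()
    val log  = ArrayBuffer.empty[String] // written only by the single pool thread
    val done = new CountDownLatch(2)

    // A step logs itself, then re-submits the next step to the pool,
    // which lets the other task run in between.
    def step(name: String, i: Int): Unit =
      pool.execute(() => {
        log += s"$name$i"
        if (i < 3) step(name, i + 1) else done.countDown()
      })

    // Submit both first steps from inside the pool so the queue order is fixed.
    pool.execute(() => { step("A", 1); step("B", 1) })

    done.await(5, TimeUnit.SECONDS) // the latch also makes `log` visible here
    pool.shutdown()
    log.toList
  }
}
```

Calling InterleavingDemo.run() should return List("A1", "B1", "A2", "B2", "A3", "B3"): neither task finishes before the other one starts, yet only one thread is involved.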

Parallelism

  • Executing multiple tasks at the same time to finish them faster

Threads

  • Basic unit of CPU utilization
  • Typically managed by the operating system
  • Threads can share memory
  • A processor can usually run one thread per core at a time (more with simultaneous multithreading)

Threads on JVM

  • Map 1:1 to native OS threads
  • Each thread reserves around 1 MB of memory for its stack on 64-bit systems by default

JVM Memory Model

  • Heap: objects
  • Thread Stack (one per thread): call stack, local variables
  • Hardware underneath the JVM: Main Memory, plus CPU cores 1 to 3, each with its own registers and CPU cache

Context Switch

  • Happens when a different thread starts running on a CPU core
  • The OS needs to store the state of the old task and restore the state of the new one
  • Cache locality is lost
  • Synchronous => no context switches => best performance

Thread Pools

  • Take care of managing threads for us
  • Can reuse threads
  • Several types, e.g. Cached, SingleThreaded, WorkStealing etc.
  • Think ExecutionContext (Future), ContextShift (cats.effect.IO), Scheduler (Monix Task)
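
A small sketch of the "can reuse threads" point, assuming only the Scala standard library (Scala 2.12+) and a plain fixed-size pool rather than a Monix Scheduler. It runs many short Futures on a small pool and counts how many distinct threads actually did the work. PoolReuseDemo and distinctThreads are illustrative names.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object PoolReuseDemo {
  // Runs `tasks` tiny Futures on a pool of `poolSize` threads and
  // returns the number of distinct threads that executed them.
  def distinctThreads(tasks: Int, poolSize: Int): Int = {
    val pool = Executors.newFixedThreadPool(poolSize)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      // Every Future reports the name of the thread it ran on.
      val names = Future.sequence(
        List.fill(tasks)(Future(Thread.currentThread().getName)))
      Await.result(names, 10.seconds).distinct.size
    } finally pool.shutdown()
  }
}
```

PoolReuseDemo.distinctThreads(1000, 2) can never report more than 2: a thousand tasks are served by at most two reused threads.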

Thread Pools

import java.util.concurrent.Executors
import monix.execution.Scheduler
import scala.concurrent.ExecutionContext

val ec = ExecutionContext.fromExecutor(Executors.newCachedThreadPool())
val scheduler = Scheduler(ec)

Extra capabilities

import cats.implicits._
import monix.eval.Task
import monix.execution.CancelableFuture
import monix.execution.schedulers.TestScheduler
import scala.concurrent.duration._

val sc = TestScheduler()

val failedTask: Task[Int] = Task.sleep(2.days) >> 
  Task.raiseError[Int](new Exception("boom"))

val f: CancelableFuture[Int] = failedTask.runToFuture(sc)
  
println(f.value) // None
sc.tick(10.hours)
println(f.value) // None
sc.tick(1000.days)
println(f.value) // Some(Failure(java.lang.Exception: boom))

Asynchronous Boundary

  • The Task returns to the thread pool to be scheduled again
  • Many scenarios can occur, e.g.:
    • Cancelation
    • Another Task starts executing
    • The same Task continues on the same thread as before
    • A new thread is created
    • and more!

Asynchronous Boundary

import monix.eval.Task
import monix.execution.Scheduler

val s: Scheduler = Scheduler.global

def repeat: Task[Unit] =
  for {
    _ <- Task.shift
    _ <- Task(println(s"Shifted to: ${Thread.currentThread().getName}"))
    _ <- repeat
  } yield ()

repeat.runToFuture(s)

// Output:
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// ...

Task Scheduling

val s1: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => repeat(id))

val prog = (repeat(1), repeat(2)).parTupled

prog.runToFuture(s1)

// Output:
// 1111111111111111111111111111111111111111111111111111111111...

Task Scheduling

val s1: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => Task.shift >> repeat(id))

val prog = (repeat(1), repeat(2)).parTupled

prog.runToFuture(s1)

// Output:
// 121212121212121212121212121212121212121212121 ...

Task Scheduling

val s1: Scheduler = Scheduler( // 4 = number of cores on my laptop
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4)),
  ExecutionModel.SynchronousExecution)
val s2: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1)),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => repeat(id))

val program = (repeat(1), repeat(2), repeat(3), repeat(4), repeat(5),
  repeat(6).executeOn(s2)).parTupled

program.runToFuture(s1)

// Output:
// 143622331613424316134424316134243161342431613424316134243 ...
// no '5' !

Light Async Boundary

  • Continues on the same thread by means of a trampoline
  • Checks with thread pool for cancelation status (in Monix & Cats-Effect IO)
  • Can help with stack safety
  • Low level!
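
To show what "by means of a trampoline" can look like, here is a hand-rolled sketch in plain Scala (2.12+). It is not Monix's TrampolineExecutionContext, only an illustration of the idea: nested submissions are queued and run in a loop on the current thread instead of being called directly, so the stack stays flat. Trampoline, TrampolineDemo and countTo are hypothetical names, and the class is deliberately not thread-safe.

```scala
import scala.collection.mutable

object TrampolineDemo {
  // Runs tasks on the CURRENT thread. A task submitted while another one is
  // running is queued and picked up by the outer loop, not called recursively.
  final class Trampoline {
    private val queue   = mutable.Queue.empty[Runnable]
    private var running = false

    def execute(r: Runnable): Unit = {
      queue.enqueue(r)
      if (!running) {
        running = true
        try {
          while (queue.nonEmpty) queue.dequeue().run()
        } finally {
          running = false
        }
      }
    }
  }

  // Counts to `n` with one nested `execute` call per step and returns the count.
  // With direct recursion a large `n` would overflow the stack.
  def countTo(n: Int): Int = {
    val trampoline = new Trampoline
    var count = 0
    def loop(i: Int): Unit =
      if (i < n) trampoline.execute(() => { count += 1; loop(i + 1) })
    loop(0)
    count
  }
}
```

TrampolineDemo.countTo(1000000) completes on the calling thread without a StackOverflowError, which is the stack-safety benefit mentioned above.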

Light Async Boundary

implicit val s = Scheduler.global
  .withExecutionModel(ExecutionModel.SynchronousExecution)

def task(i: Int): Task[Unit] =
  Task(println(s"$i: ${Thread.currentThread().getName}")) >> 
    Task.shift(TrampolineExecutionContext.immediate) >> task(i + 1)

val t =
  for {
    fiber <- task(0)
      .doOnCancel(Task(println("cancel")))
      .start
    _ <- fiber.cancel
  } yield ()

t.runToFuture 

// Output: 
// 0: scala-execution-context-global-11
// 1: scala-execution-context-global-11
// 2: scala-execution-context-global-11
// cancel

Light Async Boundary

val immediate: TrampolineExecutionContext =
  TrampolineExecutionContext(new ExecutionContext {
    def execute(r: Runnable): Unit = r.run()
    def reportFailure(e: Throwable): Unit = throw e
  })

Blocking Threads

  • An operation that occupies an entire thread without doing any useful work
  • Don't do it if you have a choice
  • The thread is wasted and, if the thread pool is bounded, other tasks might not get scheduled
  • Use a dedicated thread pool for blocking operations
  • Use timeouts
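
The last two bullets can be sketched with the standard library alone (Scala 2.12+ assumed; this is not the Monix API). Blocking calls go to their own unbounded pool, and the caller gives up after a time limit. BlockingDemo, blockingCall and withTimeout are illustrative names, and Thread.sleep stands in for a real blocking call.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.Try

object BlockingDemo {
  // Dedicated, unbounded pool for blocking calls (daemon threads named "blocking-io").
  private val blockingPool = Executors.newCachedThreadPool(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "blocking-io")
      t.setDaemon(true)
      t
    }
  })
  private val blockingEc = ExecutionContext.fromExecutor(blockingPool)

  // Stand-in for a blocking call; returns the name of the thread it ran on.
  def blockingCall(millis: Long): Future[String] =
    Future {
      Thread.sleep(millis)
      Thread.currentThread().getName
    }(blockingEc)

  // Waits for the result but gives up after `limitMillis`.
  // Note: Await blocks the CALLER, so use it only at the edge of a program.
  def withTimeout[A](fa: Future[A], limitMillis: Long): Try[A] =
    Try(Await.result(fa, limitMillis.millis))
}
```

BlockingDemo.withTimeout(BlockingDemo.blockingCall(10), 2000) yields Success("blocking-io"), while a 2000 ms call with a 50 ms limit yields a Failure holding a TimeoutException. The timed-out call keeps running on its thread; a timeout does not by itself interrupt it.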

Dealing with blocking ops

implicit val globalScheduler = Scheduler.global
val blockingScheduler = Scheduler.io()

val blockingOp = Task {
  Thread.sleep(1000)
  println(s"${Thread.currentThread().getName}: done blocking")
}

val cpuBound = Task {
  // keep cpu busy
  println(s"${Thread.currentThread().getName}: done calculating")
}

val t =
  for {
    _ <- cpuBound
    _ <- blockingOp.executeOn(blockingScheduler)
    _ <- cpuBound
  } yield ()

t.runSyncUnsafe()

// scala-execution-context-global-11: done calculating
// monix-io-12: done blocking
// scala-execution-context-global-11: done calculating

Semantic / Asynchronous Blocking

  • A Task waits for a certain signal before it can complete
  • Doesn't block any threads, so other tasks can execute in the meantime
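
The same idea can be sketched without Monix or a Semaphore, using a standard-library Promise as the signal (Scala 2.12+ assumed). The waiter only registers a callback, so the single pool thread stays free to run the task that later completes the Promise. SemanticBlockingDemo and the log messages are made up for this sketch.

```scala
import java.util.concurrent.Executors
import scala.collection.mutable.ArrayBuffer
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object SemanticBlockingDemo {
  // Returns the order of events when everything shares ONE pool thread.
  def run(): List[String] = {
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    val log = ArrayBuffer.empty[String]
    def record(msg: String): Unit = log.synchronized { log += msg; () }

    val signal = Promise[Unit]()

    // "Waits" for the signal: a callback is registered, no thread is parked.
    val waiter: Future[Unit] = signal.future.map(_ => record("waiter: done"))

    // The only pool thread is free, so this work can run and release the waiter.
    Future(record("other: running")).foreach { _ =>
      record("releaser: signalling")
      signal.success(())
    }

    try {
      Await.result(waiter, 5.seconds) // blocks only the calling (main) thread
      log.synchronized(log.toList)
    } finally pool.shutdown()
  }
}
```

SemanticBlockingDemo.run() returns List("other: running", "releaser: signalling", "waiter: done"). If the waiter had parked the pool's only thread instead, the releasing task could never have run.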

Semantic / Asynchronous Blocking

val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
implicit val scheduler = Scheduler(ec)

val otherTask = Task.sleep(50.millis) >> Task(println("Running concurrently"))

val t =
  for {
    sem <- Semaphore[Task](0L)
    // start means it will run asynchronously "in the background"
    // and the next step in for comprehension will begin
    _ <- (Task.sleep(100.millis) >>
      Task(println("Releasing semaphore")) >> sem.release).start
    _ <- Task(println("Waiting for permit")) >> sem.acquire
    _ <- Task(println("Done!"))
  } yield ()

(t, otherTask).parTupled.runSyncUnsafe()

// Waiting for permit
// Running concurrently
// Releasing semaphore
// Done!

Fairness

  • The likelihood that different tasks are able to advance
  • Without fairness, latencies can differ widely between requests
  • Monix and ZIO introduce async boundaries from time to time (configurable)
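
A rough model of "async boundaries from time to time", written against plain java.util.concurrent (Scala 2.12+ assumed) rather than any library's run loop. Each task runs batch steps in a row and then re-submits itself, which is the moment the other task gets its turn. FairnessDemo, run and batch are illustrative names, and batch is assumed to be at least 1.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

object FairnessDemo {
  // Runs tasks 1 and 2 (`steps` steps each) on one thread. A task prints its id
  // `batch` times in a row, then yields by re-submitting itself to the pool.
  def run(steps: Int, batch: Int): String = {
    val pool = Executors.newSingleThreadExecutor()
    val out  = new StringBuffer() // thread-safe, read by the caller at the end
    val done = new CountDownLatch(2)

    def task(id: Int, remaining: Int): Unit =
      pool.execute(() => {
        val n = math.min(batch, remaining)
        (1 to n).foreach(_ => out.append(id))
        if (remaining - n > 0) task(id, remaining - n) else done.countDown()
      })

    // Start both tasks from inside the pool so the queue order is deterministic.
    pool.execute(() => { task(1, steps); task(2, steps) })

    done.await(5, TimeUnit.SECONDS)
    pool.shutdown()
    out.toString
  }
}
```

FairnessDemo.run(4, 4) gives "11112222", FairnessDemo.run(4, 2) gives "11221122" and FairnessDemo.run(4, 1) gives "12121212". A smaller batch means better fairness but more trips through the thread pool.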

Cancelation / Interruption

  • Ability to stop running Tasks
  • Not all tasks are cancelable
  • Cancelation might not happen immediately or even at all

What makes Task cancelable?

  • Asynchronous boundaries
  • flatMaps (and more methods depending on implementation)
  • Handling java.lang.InterruptedException (opt-in)
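
Why boundaries and flatMaps matter can be shown with a toy run loop in plain Scala (2.12+), which is only an assumption about the general shape and not how Monix or Cats-Effect implement it. The cancel flag is checked between steps, so a step that never returns can never be canceled. CancelDemo and runCancelable are hypothetical names.

```scala
import java.util.concurrent.atomic.AtomicBoolean

object CancelDemo {
  // Runs `steps` in order, checking the cancel flag only BETWEEN steps.
  // Returns how many steps actually ran.
  def runCancelable(steps: List[() => Unit], canceled: AtomicBoolean): Int = {
    var ran = 0
    val it  = steps.iterator
    while (it.hasNext && !canceled.get()) {
      it.next().apply() // a step is never interrupted once it has started
      ran += 1
    }
    ran
  }
}
```

With steps List(() => println("start"), () => flag.set(true), () => println("unreachable")), the call returns 2: cancelation is noticed only at the next check, after the second step has finished. A step containing while(true){} would keep the loop from ever reaching that check.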

Canceling Task

implicit val s = Scheduler.global

def foo(i: Int): Task[Unit] =
  for {
    _ <- Task(println(s"start $i"))
    _ <- if (i % 2 == 0) Task.raiseError(DummyException("error"))
         else Task.sleep(10.millis)
    _ <- Task(println(s"end $i"))
  } yield ()

val tasks: List[Task[Unit]] = List(foo(1), foo(2), foo(3), foo(4))
val result: Task[List[Unit]] = Task.gather(tasks)

println(result.attempt.runSyncUnsafe())

// start 4
// start 1
// start 2
// start 3
// Left(monix.execution.exceptions.DummyException: error)

Canceling Task

// (...) code from the previous slide

val tasks: List[Task[Unit]] = List(foo(1), foo(2), foo(3), foo(4))
val result: Task[List[Either[Throwable, Unit]]] = Task.wander(tasks)(_.attempt)

println(result.runSyncUnsafe())

// start 2
// start 1
// start 3
// start 4
// end 1
// end 3
// List(
//  Right(()), Left(monix.execution.exceptions.DummyException: error), 
//  Right(()), Left(monix.execution.exceptions.DummyException: error)
// )

Canceling Task

implicit val s = Scheduler.global

val t1 =
  for {
    _ <- Task(println("t1: start"))
    _ <- Task.sleep(100.millis)
    _ <- Task(println("t1: middle"))
    _ <- Task.sleep(100.millis)
    _ <- Task(println("t1: end"))
  } yield ()

t1.timeout(150.millis).runSyncUnsafe()

// t1: start
// t1: middle
// Exception in thread "main" java.util.concurrent.TimeoutException: 
// Task timed-out after 150 milliseconds of inactivity

Canceling Task

val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
implicit val scheduler = Scheduler(ec)

def middle: Task[Unit] = Task {
  while(true){}
}

val t1 =
  for {
    _ <- Task(println("t1: start"))
    _ <- Task.sleep(100.millis)
    _ <- middle >> Task(println("t1: middle"))
    _ <- Task.sleep(100.millis)
    _ <- Task(println("t1: end"))
  } yield ()

t1.timeout(150.millis).runSyncUnsafe()

// t1: start
// ... never stops running

Green Threads and Fibers

Green Thread

A thread that is scheduled by the VM instead of the OS. Cheap to start, and M green threads can be mapped onto N OS threads.

Fiber

"Lightweight thread". Fibers voluntarily yield control to the scheduler.

Is Task like a Green Thread/Fiber?

  • We can map thousands of Tasks to very few OS threads
  • Async boundaries give other tasks in the thread pool a chance to execute, much like cooperative multitasking
  • "Blocking" a Task itself is OK since it doesn't block the underlying threads

Fiber in Cats-Effect

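// Simplified signatures only, not complete definitions
trait Fiber[F[_], A] {

  // Stops the running computation
  def cancel: F[Unit]

  // Semantically waits for the result
  def join: F[A]
}

sealed abstract class Task[+A] {
  // (...)

  // Runs the Task "in the background" and returns a handle to it
  final def start: Task[Fiber[A]]

  // (...)
}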

Thank you!

I will appreciate any feedback. :)

If you're looking for help/discussion:

https://gitter.im/typelevel/cats-effect

https://gitter.im/functional-streams-for-scala/fs2

https://gitter.im/monix/monix

https://gitter.im/ZIO/core

Dispelling Magic behind Concurrency in FP (flatMap(Oslo) 2019)

By Piotr Gawryś
