Functional Concurrency in Scala 101

Piotr Gawryś

Who am I?

One of the maintainers of Monix
Contributor to the Typelevel ecosystem
Kraków Scala User Group co-organizer
(let me know if you'd like to speak!)
Super excited to be here!

https://github.com/Avasil

twitter.com/p_gawrys

What's the talk about?

Concurrency fundamentals on the JVM
Focused on purely functional programming (Cats-Effect, FS2, Monix, ZIO)

Concurrency

Processing multiple tasks in an interleaved fashion
Doesn't require more than one thread to work
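
A minimal sketch of interleaving on a single thread, using only the standard library rather than any of the effect libraries covered in this talk. The names `step`, `log` and `done` are made up for this illustration. Each task runs one step and then re-submits its next step to the back of the queue, so both tasks make progress in turns on one worker thread.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.collection.mutable.ArrayBuffer

// One worker thread only: any interleaving below is concurrency, not parallelism.
val pool = Executors.newSingleThreadExecutor()
val log  = ArrayBuffer.empty[String] // only written from the single worker thread
val done = new CountDownLatch(2)

// Runs one step of a task, then re-submits the next step to the back of the queue.
def step(name: String, i: Int, total: Int): Unit =
  pool.execute { () =>
    log += s"$name$i"
    if (i < total) step(name, i + 1, total) else done.countDown()
  }

// Enqueue both first steps from the worker thread itself so the order is deterministic.
pool.execute { () =>
  step("A", 1, 3)
  step("B", 1, 3)
}

done.await(5, TimeUnit.SECONDS)
pool.shutdown()
println(log.mkString(" ")) // A1 B1 A2 B2 A3 B3
```

Neither task ever runs at the same time as the other, yet both advance: that is concurrency without parallelism.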

Parallelism

Executing multiple tasks at the same time to finish them faster
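
A small sketch of parallelism with plain `scala.concurrent.Future`, assuming the global execution context has more than one thread available. The chunk size and the names `chunks` and `partialSums` are arbitrary choices for this example. The work is split into independent pieces that can run at the same time on different cores.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val numbers = (1L to 1000000L).toVector

// Split the input into 4 independent chunks.
val chunks = numbers.grouped(250000).toList

// Each chunk is summed in its own Future, so chunks may run on different cores.
val partialSums: List[Future[Long]] = chunks.map(chunk => Future(chunk.sum))

// Combine the partial results once all of them are ready.
val total: Long = Await.result(Future.sequence(partialSums), 10.seconds).sum

println(total) // 500000500000
```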

Threads

Basic unit of CPU utilization
Typically managed by the operating system
Threads within a process share memory
A processor can usually run at most one thread per core at any given moment
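
To make "threads can share memory" concrete, here is a minimal sketch with two plain JVM threads updating one shared counter. The names `counter`, `body` and `workers` are illustrative only. An `AtomicInteger` is used because a plain `var` would lose updates under concurrent increments.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Shared state living on the heap, visible to every thread.
val counter = new AtomicInteger(0)

// The same Runnable is executed by both threads.
val body: Runnable = () => (1 to 1000).foreach(_ => counter.incrementAndGet())

val workers = (1 to 2).map(n => new Thread(body, s"worker-$n"))
workers.foreach(_.start())
workers.foreach(_.join()) // wait for both threads to finish

println(s"cores available: ${Runtime.getRuntime.availableProcessors()}")
println(counter.get()) // 2000
```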

Threads on JVM

Map 1:1 to native OS threads
Each thread reserves about 1 MB of stack memory by default on 64-bit systems (configurable with -Xss)

JVM Memory Model

[Diagram]
Heap: Objects
Thread Stack (one per thread): call stack, local variables

CPU

[Diagram]
Each CPU core (Core 1, Core 2, Core 3): registers, CPU, Cache
Main Memory: JVM

Context Switch

Happens when a different thread starts running on a CPU core
The OS needs to save the state of the old thread and restore the state of the new one
Cache locality is lost
Synchronous execution => no context switches => best performance

Thread Pools

Take care of managing threads for us
Can reuse threads
Several types, e.g. Cached, SingleThreaded, WorkStealing
Think ExecutionContext (Future), ContextShift (cats.effect.IO), Scheduler (Monix Task)

Thread Pools

import java.util.concurrent.Executors
import monix.execution.Scheduler
import scala.concurrent.ExecutionContext

val ec = ExecutionContext.fromExecutor(Executors.newCachedThreadPool())
val scheduler = Scheduler(ec)
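
A short standard-library sketch of the "can reuse threads" point. It assumes a fixed pool of 2 threads and 10 tiny tasks; the names `fixedPool` and `seenThreads` are made up for the example. However the tasks get distributed, they can only ever run on the pool's two threads.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}

val fixedPool   = Executors.newFixedThreadPool(2)
val seenThreads = ConcurrentHashMap.newKeySet[String]() // thread-safe set

// Submit 10 tasks; each records the name of the thread it ran on.
(1 to 10).foreach { _ =>
  fixedPool.execute { () =>
    seenThreads.add(Thread.currentThread().getName)
    ()
  }
}

fixedPool.shutdown()
fixedPool.awaitTermination(5, TimeUnit.SECONDS)

println(s"10 tasks ran on ${seenThreads.size} thread(s)")
```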

Extra capabilities

import cats.implicits._
import monix.eval.Task
import monix.execution.CancelableFuture
import monix.execution.schedulers.TestScheduler
import scala.concurrent.duration._

// TestScheduler simulates the passage of time, so nothing really sleeps
val sc = TestScheduler()

val failedTask: Task[Int] = Task.sleep(2.days) >>
  Task.raiseError[Int](new Exception("boom"))

val f: CancelableFuture[Int] = failedTask.runToFuture(sc)

println(f.value) // None
sc.tick(10.hours)
println(f.value) // None
sc.tick(1000.days)
println(f.value) // Some(Failure(java.lang.Exception: boom))

Asynchronous Boundary

The Task is handed back to the thread pool to be scheduled again
Many scenarios can occur, e.g.:
- Cancelation
- Another Task starts executing
- The same Task resumes on the same thread as before
- A new thread is created
- and more!

Asynchronous Boundary

val s: Scheduler = Scheduler.global

def repeat: Task[Unit] =
  for {
    _ <- Task.shift
    _ <- Task(println(s"Shifted to: ${Thread.currentThread().getName}"))
    _ <- repeat
  } yield ()

repeat.runToFuture(s)

Asynchronous Boundary

// Output:
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// ...

Task Scheduling

val s1: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => repeat(id))

val prog = (repeat(1), repeat(2)).parTupled

prog.runToFuture(s1)

// Output:
// 1111111111111111111111111111111111111111111111111111111111...
// repeat(1) never yields the only thread, so repeat(2) never runs

Task Scheduling

val s1: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => Task.shift >> repeat(id))

val prog = (repeat(1), repeat(2)).parTupled

prog.runToFuture(s1)

// Output:
// 121212121212121212121212121212121212121212121 ...
// Task.shift adds an async boundary after each print, so the tasks take turns

Task Scheduling

val s1: Scheduler = Scheduler( // 4 = number of cores on my laptop
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4)),
  ExecutionModel.SynchronousExecution)
val s2: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1)),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => repeat(id))

val program = (repeat(1), repeat(2), repeat(3), repeat(4), repeat(5),
  repeat(6).executeOn(s2)).parTupled

program.runToFuture(s1)

// Output:
// 143622331613424316134424316134243161342431613424316134243 ...
// no '5': the four threads of s1 are taken by tasks that never yield,
// so repeat(5) is starved; repeat(6) runs on its own scheduler s2

Light Async Boundary

Continues on the same thread by means of a trampoline
Checks with thread pool for cancelation status (in Monix & Cats-Effect IO)
Can help with stack safety
Low level!
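
The trampoline idea can be sketched with the standard library alone. This is a hypothetical, single-threaded `Trampoline` executor written for this illustration, not Monix's `TrampolineExecutionContext`: nested `execute` calls are queued and drained in a loop on the current thread instead of growing the call stack.

```scala
import java.util.concurrent.Executor
import scala.collection.mutable

// Hypothetical single-threaded trampoline; illustration only, not thread-safe.
final class Trampoline extends Executor {
  private val queue    = mutable.Queue.empty[Runnable]
  private var draining = false

  def execute(r: Runnable): Unit = {
    queue.enqueue(r)
    // Only the outermost call drains the queue; nested calls just enqueue and return.
    if (!draining) {
      draining = true
      try {
        while (queue.nonEmpty) queue.dequeue().run()
      } finally {
        draining = false
      }
    }
  }
}

val trampoline = new Trampoline
var count = 0

// Each step schedules the next one from inside its own Runnable.
def countTo(limit: Int): Unit =
  trampoline.execute { () =>
    count += 1
    if (count < limit) countTo(limit)
  }

countTo(1000000) // one million nested steps without a StackOverflowError
println(count)   // 1000000
```

Everything stays on the calling thread, which is why such a boundary is "light": there is no hand-off to another thread, only a bounded stack.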

Light Async Boundary

implicit val s: Scheduler = Scheduler.global
  .withExecutionModel(ExecutionModel.SynchronousExecution)

def task(i: Int): Task[Unit] =
  Task(println(s"$i: ${Thread.currentThread().getName}")) >> 
    Task.shift(TrampolineExecutionContext.immediate) >> task(i + 1)

val t =
  for {
    fiber <- task(0)
      .doOnCancel(Task(println("cancel")))
      .start
    _ <- fiber.cancel
  } yield ()

t.runToFuture 

// Output: 
// 0: scala-execution-context-global-11
// 1: scala-execution-context-global-11
// 2: scala-execution-context-global-11
// cancel

Light Async Boundary

val immediate: TrampolineExecutionContext =
  TrampolineExecutionContext(new ExecutionContext {
    def execute(r: Runnable): Unit = r.run()
    def reportFailure(e: Throwable): Unit = throw e
  })

Blocking Threads

Don't
A blocked thread is wasted and, if the thread pool is bounded, may prevent other tasks from being scheduled
Use a dedicated thread pool for blocking operations
Use timeouts
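
A minimal standard-library sketch of the last two points, assuming a cached pool reserved for blocking calls and an arbitrary 200 ms budget. The names `blockingPool`, `slow` and `outcome` are made up for this example. Note that `Await.result` only stops waiting; it does not cancel the underlying operation, which is why the pool is shut down explicitly here.

```scala
import java.util.concurrent.{Executors, TimeoutException}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.Try

// Dedicated, unbounded pool so blocking calls cannot starve the CPU-bound pool.
val blockingPool = Executors.newCachedThreadPool()
val blockingEc   = ExecutionContext.fromExecutor(blockingPool)

// A blocking operation that takes far longer than we are willing to wait.
val slow: Future[String] = Future {
  Thread.sleep(2000)
  "late"
}(blockingEc)

// Give up after 200 ms instead of waiting indefinitely.
val outcome: Try[String] = Try(Await.result(slow, 200.millis))

blockingPool.shutdownNow() // interrupt the sleeping thread and release the pool

println(outcome.isFailure) // true
```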

Dealing with blocking ops

implicit val globalScheduler: Scheduler = Scheduler.global
val blockingScheduler = Scheduler.io()

val blockingOp = Task {
  Thread.sleep(1000)
  println(s"${Thread.currentThread().getName}: done blocking")
}

val cpuBound = Task {
  // keep cpu busy
  println(s"${Thread.currentThread().getName}: done calculating")
}

val t =
  for {
    _ <- cpuBound
    _ <- blockingOp.executeOn(blockingScheduler)
    _ <- cpuBound
  } yield ()

t.runSyncUnsafe()

// scala-execution-context-global-11: done calculating
// monix-io-12: done blocking
// scala-execution-context-global-11: done calculating

Semantic / Asynchronous Blocking

A Task waits for a certain signal before it can continue
No thread is actually blocked, so other tasks can execute in the meantime
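
The same idea can be sketched with a plain `Promise`, assuming a pool with a single thread. The names `signal`, `waiter` and `other` are illustrative. The waiting computation is only a registered callback, so the one and only thread stays free to run other work until the signal arrives.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

val single = Executors.newSingleThreadExecutor()
implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(single)

val signal = Promise[Int]()

// "Waits" for the signal, but only as a callback: no thread is parked.
val waiter: Future[String] = signal.future.map(n => s"got $n")

// Runs on the same single thread while the waiter is still waiting.
val other: Future[String] = Future("ran while the waiter was waiting")
val otherResult  = Await.result(other, 5.seconds)
val stillWaiting = !waiter.isCompleted

// Deliver the signal; the waiter can now complete.
signal.success(42)
val waiterResult = Await.result(waiter, 5.seconds)
single.shutdown()

println(stillWaiting) // true
println(waiterResult) // got 42
```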

Semantic / Asynchronous Blocking

val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
implicit val scheduler: Scheduler = Scheduler(ec)

val otherTask = Task.sleep(50.millis) >> Task(println("Running concurrently"))

val t =
  for {
    sem <- Semaphore[Task](0L)
    // start means it will run asynchronously "in the background"
    // and the next step in for comprehension will begin
    _ <- (Task.sleep(100.millis) >>
      Task(println("Releasing semaphore")) >> sem.release).start
    _ <- Task(println("Waiting for permit")) >> sem.acquire
    _ <- Task(println("Done!"))
  } yield ()

(t, otherTask).parTupled.runSyncUnsafe()

// Waiting for permit
// Running concurrently
// Releasing semaphore
// Done!

Fairness

Likelihood that different tasks are able to advance
Without fairness, latencies can differ widely from one request to another
Monix and ZIO insert async boundaries automatically after a configurable number of operations (e.g. ExecutionModel.BatchedExecution in Monix)
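
A standard-library analogy for periodic async boundaries, not how Monix or ZIO implement them. The helper `runBatched` and its parameters are hypothetical: each task runs at most `batchSize` steps and then re-submits itself to the back of the queue of a single-threaded pool, so a smaller batch size yields a fairer interleaving.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.collection.mutable.ArrayBuffer

// Runs two tasks ("A" and "B") of `steps` steps each on one thread,
// yielding to the queue after every `batchSize` steps.
def runBatched(batchSize: Int, steps: Int): List[String] = {
  val pool = Executors.newSingleThreadExecutor()
  val log  = ArrayBuffer.empty[String] // only written from the worker thread
  val done = new CountDownLatch(2)

  def loop(name: String, from: Int): Unit =
    pool.execute { () =>
      val last = math.min(from + batchSize - 1, steps)
      (from to last).foreach(i => log += s"$name$i")
      // The "async boundary": go to the back of the queue before continuing.
      if (last < steps) loop(name, last + 1) else done.countDown()
    }

  // Enqueue both tasks from the worker thread so the order is deterministic.
  pool.execute { () =>
    loop("A", 1)
    loop("B", 1)
  }
  done.await(5, TimeUnit.SECONDS)
  pool.shutdown()
  log.toList
}

println(runBatched(batchSize = 2, steps = 4).mkString(" ")) // A1 A2 B1 B2 A3 A4 B3 B4
println(runBatched(batchSize = 4, steps = 4).mkString(" ")) // A1 A2 A3 A4 B1 B2 B3 B4
```

With a batch as large as the whole task, "B" cannot start until "A" has finished; with smaller batches both advance at a similar pace, at the cost of more scheduling overhead.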

Green Threads and Fibers

Green Thread

A thread scheduled by the VM instead of the OS. Cheap to start, and M green threads can be mapped onto N OS threads.

Fiber

"Lightweight thread". Fibers voluntarily yield control to the scheduler.

Is Task like a Green Thread/Fiber?

We can map thousands of Tasks to very few OS threads
Async boundaries give other tasks on the thread pool a chance to execute, very much like cooperative multitasking
"Blocking" a Task itself is OK since it doesn't block the underlying threads
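
A rough standard-library sketch of the first point, using `Future` as a stand-in for `Task`. The pool size of 2 and the names `smallPool` and `threadsUsed` are assumptions made for this example. Ten thousand small computations are multiplexed onto just two OS threads.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

val smallPool = Executors.newFixedThreadPool(2)
implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(smallPool)

val threadsUsed = ConcurrentHashMap.newKeySet[String]() // thread-safe set

// 10,000 logical tasks, each recording the OS thread it ran on.
val tasks = (1 to 10000).map { i =>
  Future {
    threadsUsed.add(Thread.currentThread().getName)
    i
  }
}

val sum = Await.result(Future.sequence(tasks), 30.seconds).sum
smallPool.shutdown()

println(sum) // 50005000
println(s"OS threads used: ${threadsUsed.size}")
```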

Thank you!

I will appreciate any feedback. :)

If you're looking for help/discussion:

https://gitter.im/typelevel/cats-effect

https://gitter.im/functional-streams-for-scala/fs2

https://gitter.im/monix/monix

https://gitter.im/scalaz/scalaz-zio
