Functional Concurrency in Scala 101

Piotr Gawryś

Who am I?

  • One of the maintainers of Monix
  • Contributor to Typelevel ecosystem
  • Kraków Scala User Group co-organizer
    (let me know if you'd like to speak!)
  • Super excited to be here!

https://github.com/Avasil

twitter.com/p_gawrys

What's the talk about?

  • Concurrency fundamentals on the JVM
  • Focused on purely functional programming (Cats-Effect, FS2, Monix, ZIO)
  • No overview/marketing of any of them

What I'll cover today: the fundamentals of concurrency on the JVM, and how they are realized in purely functional programming.

I won't compare the libraries or recommend any of them.

Concurrency

  • Processing multiple tasks interleaved
  • Doesn't require more than one thread to work

Concurrency means interleaving multiple tasks. It works even on a single thread.
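To make the "one thread is enough" point concrete, here is a minimal sketch that uses only the Scala standard library. Each task is modelled as an iterator of steps, and a tiny round-robin loop advances every task by one step per round on the calling thread. `InterleavingDemo` and its `run` method are hypothetical names chosen for this illustration; they are not part of any of the libraries mentioned in the talk.

```scala
object InterleavingDemo {
  // Advances every task by one step per round, on the calling thread only.
  // Returns the steps in the order in which they were executed.
  def run(tasks: List[Iterator[String]]): List[String] = {
    val out    = List.newBuilder[String]
    var active = tasks
    while (active.nonEmpty) {
      // Run one step of each active task and keep only unfinished tasks
      active = active.filter { t =>
        if (t.hasNext) { out += t.next(); t.hasNext } else false
      }
    }
    out.result()
  }
}
```

Calling `InterleavingDemo.run(List(Iterator("A1", "A2"), Iterator("B1", "B2", "B3")))` yields `List("A1", "B1", "A2", "B2", "B3")`: both tasks make progress although nothing runs in parallel.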

Parallelism

  • Executing multiple tasks at the same time to finish them faster

Parallelism means running multiple tasks at the same time so that they finish sooner.

Threads

  • Basic unit of CPU utilization
  • Typically managed by the operating system
  • Threads can share memory
  • A processor can usually run one thread per core at any given moment

Threads are the basic unit of CPU utilization and are typically part of the OS. Threads can share memory.

As a rule, one thread runs per CPU core.

Threads on JVM

  • Map 1:1 to native OS threads
  • Each thread reserves around 1 MB of stack memory by default on 64-bit systems

On the JVM, threads map 1:1 to native OS threads. By default, each thread needs about 1 MB on 64-bit systems.

JVM Memory Model

[Diagram: inside the JVM, every thread has its own Thread Stack (call stack, local variables), and all threads share the Heap, where objects live. On the hardware side, each CPU core (Core 1, Core 2, Core 3) has its own registers and cache, and all cores share Main Memory.]

Context Switch

  • Happens when a new thread starts running on a CPU core
  • The OS needs to store the state of the old task and restore the state of the new one
  • Cache locality is lost
  • Synchronous => no context switches => best performance

When a new thread starts running, the state of the old task is saved and the state of the new task is restored.

Synchronous execution causes no context switches, so it is the most efficient.

Thread Pools

  • Take care of managing threads for us
  • Can reuse threads
  • Several types, e.g. Cached, SingleThreaded, WorkStealing etc.
  • Think ExecutionContext (Future), ContextShift (cats.effect.IO), Scheduler (Monix Task)

Thread pools manage threads for us, which makes it possible to reuse them.
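Thread reuse is easy to observe with a plain `java.util.concurrent` pool. The sketch below submits many small jobs to a fixed pool and counts how many distinct threads ran them. `PoolReuseDemo` is a hypothetical name used only for this example, and the 10-second wait is an arbitrary safety limit.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}

object PoolReuseDemo {
  // Submits `tasks` small jobs to a pool of `threads` threads and returns
  // how many distinct threads actually executed them.
  def run(threads: Int, tasks: Int): Int = {
    val pool  = Executors.newFixedThreadPool(threads)
    val names = ConcurrentHashMap.newKeySet[String]()
    (1 to tasks).foreach { _ =>
      pool.execute(new Runnable {
        def run(): Unit = {
          names.add(Thread.currentThread().getName)
          ()
        }
      })
    }
    pool.shutdown()                              // accept no new jobs
    pool.awaitTermination(10, TimeUnit.SECONDS)  // wait for queued jobs to finish
    names.size
  }
}
```

`PoolReuseDemo.run(2, 100)` returns at most 2: a hundred jobs are served by the same one or two threads instead of a hundred new ones.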

Thread Pools

import java.util.concurrent.Executors
import monix.execution.Scheduler
import scala.concurrent.ExecutionContext

val ec = ExecutionContext.fromExecutor(Executors.newCachedThreadPool())
val scheduler = Scheduler(ec)

Extra capabilities

import cats.implicits._
import monix.eval.Task
import monix.execution.CancelableFuture
import monix.execution.schedulers.TestScheduler
import scala.concurrent.duration._

// TestScheduler runs on a simulated clock that only advances on tick()
val sc = TestScheduler()

val failedTask: Task[Int] = Task.sleep(2.days) >>
  Task.raiseError[Int](new Exception("boom"))

val f: CancelableFuture[Int] = failedTask.runToFuture(sc)

println(f.value) // None
sc.tick(10.hours)
println(f.value) // None, only 10 simulated hours have passed
sc.tick(1000.days)
println(f.value) // Some(Failure(java.lang.Exception: boom))

Asynchronous Boundary

  • Task returns to the thread pool to be scheduled again
  • Many scenarios can occur, e.g.:
    • Cancelation
    • Another Task starts executing
    • The same Task continues on the same thread it was running on before
    • A new thread is created
    • and more!

An asynchronous boundary is the point where a task returns to the thread pool to be scheduled again.

Asynchronous Boundary

val s: Scheduler = Scheduler.global

def repeat: Task[Unit] =
  for {
    _ <- Task.shift
    _ <- Task(println(s"Shifted to: ${Thread.currentThread().getName}"))
    _ <- repeat
  } yield ()

repeat.runToFuture(s)

Asynchronous Boundary

// Output:
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// ...

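Task Scheduling

import java.util.concurrent.Executors
import cats.implicits._ // for parTupled
import monix.eval.Task
import monix.execution.{ExecutionModel, Scheduler}
import scala.concurrent.ExecutionContext

// A single thread, and SynchronousExecution inserts no automatic async boundaries
val s1: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => repeat(id))

val prog = (repeat(1), repeat(2)).parTupled

prog.runToFuture(s1)

// Output (repeat(1) never yields the only thread, so repeat(2) never runs):
// 1111111111111111111111111111111111111111111111111111111111...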

Task Scheduling

val s1: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  // Task.shift is an explicit async boundary, so the other task gets its turn
  Task(print(id)).flatMap(_ => Task.shift >> repeat(id))

val prog = (repeat(1), repeat(2)).parTupled

prog.runToFuture(s1)

// Output:
// 121212121212121212121212121212121212121212121 ...

Task Scheduling

val s1: Scheduler = Scheduler( // 4 = number of cores on my laptop
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4)),
  ExecutionModel.SynchronousExecution)
val s2: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1)),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => repeat(id))

val program = (repeat(1), repeat(2), repeat(3), repeat(4), repeat(5),
  repeat(6).executeOn(s2)).parTupled

program.runToFuture(s1)

// Output:
// 143622331613424316134424316134243161342431613424316134243 ...
// no '5': tasks 1-4 never yield the four threads of s1,
// while repeat(6) runs on its own scheduler s2

Light Async Boundary

  • Continues on the same thread by means of a trampoline
  • Checks with thread pool for cancelation status (in Monix & Cats-Effect IO)
  • Can help with stack safety
  • Low level!

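With a light async boundary, execution continues on the same thread by means of a trampoline, which helps with stack safety.

In Monix and Cats-Effect IO it also checks whether the task has been canceled.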
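The trampoline idea itself can be illustrated without any effect library. The sketch below uses `scala.util.control.TailCalls` from the standard library: each step returns a description of the next step, and a loop runs the steps one by one on the same thread with constant stack depth. This is not how Monix implements `TrampolineExecutionContext`; it only shows the general technique, and `TrampolineDemo` is a hypothetical name.

```scala
import scala.util.control.TailCalls.{TailRec, done, tailcall}

object TrampolineDemo {
  // Mutual recursion that would overflow the stack for large n
  // if the functions called each other directly.
  def isEven(n: Int): TailRec[Boolean] =
    if (n == 0) done(true) else tailcall(isOdd(n - 1))

  def isOdd(n: Int): TailRec[Boolean] =
    if (n == 0) done(false) else tailcall(isEven(n - 1))

  // `result` is the trampoline loop: it keeps running the next step
  // on the current thread until a `done` value is reached.
  def run(n: Int): Boolean = isEven(n).result
}
```

`TrampolineDemo.run(1000000)` completes with `true` instead of throwing a `StackOverflowError`, and no thread switch is involved.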

Light Async Boundary

implicit val s = Scheduler.global
  .withExecutionModel(ExecutionModel.SynchronousExecution)

def task(i: Int): Task[Unit] =
  Task(println(s"$i: ${Thread.currentThread().getName}")) >> 
    Task.shift(TrampolineExecutionContext.immediate) >> task(i + 1)

val t =
  for {
    fiber <- task(0)
      .doOnCancel(Task(println("cancel")))
      .start
    _ <- fiber.cancel
  } yield ()

t.runToFuture 

// Output: 
// 0: scala-execution-context-global-11
// 1: scala-execution-context-global-11
// 2: scala-execution-context-global-11
// cancel

Light Async Boundary

val immediate: TrampolineExecutionContext =
  TrampolineExecutionContext(new ExecutionContext {
    def execute(r: Runnable): Unit = r.run()
    def reportFailure(e: Throwable): Unit = throw e
  })

Blocking Threads

  • An operation that occupies an entire thread without doing any useful work
  • Don't do it if you have a choice
  • The thread is wasted, which might prevent other tasks from being scheduled if the thread pool is limited
  • Use a dedicated thread pool for blocking operations
  • Use timeouts

Blocking a thread means an operation occupies it.

It wastes thread resources, so avoid it whenever you have another option.
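The last two recommendations can be sketched with plain `scala.concurrent.Future`, without any of the libraries from the talk. The example assumes a small fixed pool for CPU-bound work and a separate cached pool for blocking calls; the pool sizes, the 100 ms sleep and the 2-second timeout are arbitrary values for illustration, and `BlockingDemo` is a hypothetical name.

```scala
import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

object BlockingDemo {
  // Returns the names of the threads that ran the blocking and the CPU-bound part.
  def run(): (String, String) = {
    val cpuPool      = Executors.newFixedThreadPool(2) // small pool for CPU-bound work
    val blockingPool = Executors.newCachedThreadPool() // separate pool for blocking work
    val cpuEc        = ExecutionContext.fromExecutor(cpuPool)
    val blockingEc   = ExecutionContext.fromExecutor(blockingPool)

    val blockingOp: Future[String] = Future {
      Thread.sleep(100) // stands in for a blocking call such as JDBC or file IO
      Thread.currentThread().getName
    }(blockingEc)

    val cpuBound: Future[String] = Future(Thread.currentThread().getName)(cpuEc)

    try {
      // Bound every wait: Await.result throws a TimeoutException after 2 seconds
      val b = Await.result(blockingOp, 2.seconds)
      val c = Await.result(cpuBound, 2.seconds)
      (b, c)
    } finally {
      cpuPool.shutdown()
      blockingPool.shutdown()
    }
  }
}
```

The two returned thread names differ, because the sleeping call never occupies one of the two CPU-pool threads.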

Dealing with blocking ops

implicit val globalScheduler = Scheduler.global
val blockingScheduler = Scheduler.io()

val blockingOp = Task {
  Thread.sleep(1000)
  println(s"${Thread.currentThread().getName}: done blocking")
}

val cpuBound = Task {
  // keep cpu busy
  println(s"${Thread.currentThread().getName}: done calculating")
}

val t =
  for {
    _ <- cpuBound
    _ <- blockingOp.executeOn(blockingScheduler)
    _ <- cpuBound
  } yield ()

t.runSyncUnsafe()

// scala-execution-context-global-11: done calculating
// monix-io-12: done blocking
// scala-execution-context-global-11: done calculating

Semantic / Asynchronous Blocking

  • Task waits for certain signal before it can complete
  • Doesn't really block any threads, other tasks can execute in the meantime

With semantic (asynchronous) blocking, no thread is actually blocked, so other tasks can run in the meantime.
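The same effect can be reproduced with a standard-library `Promise`, which is a rough stand-in for the signal a task waits on. In the sketch below a "waiter" depends on a signal that has not arrived yet, while the pool's only thread stays free to run another task, which then sends the signal. `SemanticBlockingDemo` is a hypothetical name and the 2-second timeout is arbitrary.

```scala
import java.util.concurrent.Executors
import scala.collection.mutable.ListBuffer
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future, Promise}

object SemanticBlockingDemo {
  def run(): List[String] = {
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    val log    = ListBuffer.empty[String] // only touched by the single pool thread
    val signal = Promise[Unit]()

    // "Waits" for the signal: a callback is registered, no thread is parked
    val waiter = signal.future.map(_ => log += "waiter: got signal")
    // Runs on the only thread while the waiter is still waiting, then signals
    val other = Future { log += "other: running"; signal.success(()) }

    try {
      Await.result(waiter.zip(other), 2.seconds)
      log.toList
    } finally pool.shutdown()
  }
}
```

`SemanticBlockingDemo.run()` returns `List("other: running", "waiter: got signal")`. If the waiter had truly blocked the single thread, the other task could never have run and the signal would never have been sent.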

Semantic / Asynchronous Blocking

val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
implicit val scheduler = Scheduler(ec)

val otherTask = Task.sleep(50.millis) >> Task(println("Running concurrently"))

val t =
  for {
    sem <- Semaphore[Task](0L)
    // start means it will run asynchronously "in the background"
    // and the next step in for comprehension will begin
    _ <- (Task.sleep(100.millis) >>
      Task(println("Releasing semaphore")) >> sem.release).start
    _ <- Task(println("Waiting for permit")) >> sem.acquire
    _ <- Task(println("Done!"))
  } yield ()

(t, otherTask).parTupled.runSyncUnsafe()

// Waiting for permit
// Running concurrently
// Releasing semaphore
// Done!

Fairness

  • Likelihood that different tasks are able to advance
  • Without fairness, latencies can differ widely between requests
  • Monix and ZIO introduce async boundaries from time to time (configurable)

Fairness is the likelihood that different tasks are able to make progress.

Without it, the spread of latencies grows.
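The effect of periodic async boundaries can be sketched with a single-threaded `java.util.concurrent` pool. Two tasks share one thread, and each task hands the thread back after `batch` consecutive steps by resubmitting its remainder. This only imitates the idea of a configurable batch size; it is not how Monix or ZIO implement it. `FairnessDemo` and its parameters are hypothetical, and `batch` is assumed to be at least 1.

```scala
import java.util.concurrent.{CountDownLatch, Executors}

object FairnessDemo {
  // Two tasks of `steps` steps share ONE thread. Each task yields back to the
  // pool after `batch` consecutive steps. Returns the execution order.
  def run(steps: Int, batch: Int): String = {
    val pool = Executors.newSingleThreadExecutor()
    val log  = new StringBuilder // only touched by the single pool thread
    val done = new CountDownLatch(2)

    def task(id: Char, from: Int): Runnable = new Runnable {
      def run(): Unit = {
        val end = math.min(from + batch, steps)
        (from until end).foreach(_ => log.append(id))
        if (end < steps) pool.execute(task(id, end)) // forced async boundary
        else done.countDown()
      }
    }

    // Enqueue both tasks from inside the pool thread so the start order is fixed
    pool.execute(new Runnable {
      def run(): Unit = {
        pool.execute(task('1', 0))
        pool.execute(task('2', 0))
      }
    })
    done.await()
    pool.shutdown()
    log.toString
  }
}
```

`FairnessDemo.run(4, 4)` gives `"11112222"` (the second task waits until the first is finished), while `FairnessDemo.run(4, 1)` gives `"12121212"`. Smaller batches mean fairer scheduling at the price of more trips through the pool.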

Cancelation / Interruption

  • Ability to stop running Tasks
  • Not all tasks are cancelable
  • Cancelation might not happen immediately or even at all

What makes Task cancelable?

  • Asynchronous boundaries
  • flatMaps (and more methods depending on implementation)
  • Handling java.lang.InterruptedException (opt-in)

A task becomes cancelable through asynchronous boundaries, flatMap, and handling InterruptedException (opt-in).
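Why boundaries matter for cancelation can be shown with a deliberately simplified run loop. The sketch assumes cancelation is just a flag that the loop checks between steps, much like an interpreter can check between flatMaps; a step that never returns would never reach the check. `CancelationDemo` and `runLoop` are hypothetical names, not the API of any library from the talk.

```scala
import java.util.concurrent.atomic.AtomicBoolean

object CancelationDemo {
  // Runs up to `maxSteps` steps and checks the cancel flag between steps
  // (the analogue of a flatMap or an async boundary).
  // Returns the number of steps that were completed.
  def runLoop(maxSteps: Int, canceled: AtomicBoolean)(step: Int => Unit): Int = {
    var i = 0
    while (i < maxSteps && !canceled.get()) {
      step(i)
      i += 1
    }
    i
  }

  def run(): Int = {
    val canceled = new AtomicBoolean(false)
    // The third step (i == 2) requests cancelation; the loop only notices it
    // at the next boundary, after that step has finished.
    runLoop(10, canceled)(i => if (i == 2) canceled.set(true))
  }
}
```

`CancelationDemo.run()` returns 3: the step that was running when cancelation was requested still completes, and the remaining seven steps are skipped.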

Canceling Task

implicit val s = Scheduler.global

def foo(i: Int): Task[Unit] =
  for {
    _ <- Task(println(s"start $i"))
    _ <- if (i % 2 == 0) Task.raiseError(DummyException("error"))
         else Task.sleep(10.millis)
    _ <- Task(println(s"end $i"))
  } yield ()

val tasks: List[Task[Unit]] = List(foo(1), foo(2), foo(3), foo(4))
val result: Task[List[Unit]] = Task.gather(tasks)

println(result.attempt.runSyncUnsafe())

// start 4
// start 1
// start 2
// start 3
// Left(monix.execution.exceptions.DummyException: error)

Canceling Task

// (...) code from the previous slide

val tasks: List[Task[Unit]] = List(foo(1), foo(2), foo(3), foo(4))
val result: Task[List[Either[Throwable, Unit]]] = Task.wander(tasks)(_.attempt)

println(result.runSyncUnsafe())

// start 2
// start 1
// start 3
// start 4
// end 1
// end 3
// List(
//  Right(()), Left(monix.execution.exceptions.DummyException: error), 
//  Right(()), Left(monix.execution.exceptions.DummyException: error)
// )

Canceling Task

implicit val s = Scheduler.global

val t1 =
  for {
    _ <- Task(println("t1: start"))
    _ <- Task.sleep(100.millis)
    _ <- Task(println("t1: middle"))
    _ <- Task.sleep(100.millis)
    _ <- Task(println("t1: end"))
  } yield ()

t1.timeout(150.millis).runSyncUnsafe()

// t1: start
// t1: middle
// Exception in thread "main" java.util.concurrent.TimeoutException: 
// Task timed-out after 150 milliseconds of inactivity

Canceling Task

val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
implicit val scheduler = Scheduler(ec)

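// A tight loop with no flatMap or async boundary inside:
// there is no point at which cancelation could be checked
def middle: Task[Unit] = Task {
  while (true) {}
}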

val t1 =
  for {
    _ <- Task(println("t1: start"))
    _ <- Task.sleep(100.millis)
    _ <- middle >> Task(println("t1: middle"))
    _ <- Task.sleep(100.millis)
    _ <- Task(println("t1: end"))
  } yield ()

t1.timeout(150.millis).runSyncUnsafe()

// t1: start
// ... never stops running

Green Threads and Fibers

Green Thread

A thread that is scheduled by the VM instead of the OS. Cheap to start, and we can map M green threads to N OS threads.

Fiber

"Lightweight thread". Fibers voluntarily yield control to the scheduler.

A green thread is scheduled by the VM rather than the OS, so it does not need to map 1:1 to an OS thread.

A fiber is a lightweight thread that hands control over to the scheduler.

Is Task like a Green Thread/Fiber?

  • We can map thousands of Tasks to very few OS threads
  • Async boundaries give a chance to other tasks from the thread pool to execute, very much like cooperative multitasking
  • "Blocking" a Task itself is OK since it doesn't block underlying Threads

Fiber in Cats-Effect

trait Fiber[F[_], A] {

  def cancel: F[Unit]

  def join: F[A]
}


sealed abstract class Task[+A] {
  // (...)

  final def start: Task[Fiber[A]]

  // (...)
}

Thank you!

If you're looking for help/discussion:

https://gitter.im/typelevel/cats-effect

https://gitter.im/functional-streams-for-scala/fs2

https://gitter.im/monix/monix

https://gitter.im/ZIO/core

I will appreciate any feedback. :)

twitter.com/p_gawrys

Functional Concurrency in Scala 101 (ScalaMatsuri 2019)

By Piotr Gawryś
