Functional Concurrency in Scala 101
Piotr Gawryś
Who am I?
- One of the maintainers of Monix
- Contributor to Typelevel ecosystem
- Kraków Scala User Group co-organizer (let me know if you'd like to speak!)
- Super excited to be here!
https://github.com/Avasil
twitter.com/p_gawrys
What's the talk about
- Concurrency fundamentals on JVM
- Focused around purely functional programming (Cats-Effect, FS2, Monix, ZIO)
- No overview/marketing of any of them
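Today's topics: the fundamentals of concurrency on the JVM, and how to achieve it in purely functional programming.
I won't compare the libraries with each other or recommend any one of them.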
Concurrency
- Processing multiple tasks interleaved
- Doesn't require more than one thread to work
Concurrency means interleaving multiple tasks. It works even on a single thread.
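To make "interleaved on one thread" concrete, here is a minimal sketch that uses only the JDK and the Scala standard library, not any of the libraries mentioned above. The names InterleavingDemo and step are hypothetical. Each task is cut into three steps, and after every step the task hands its continuation back to a single-threaded executor, so the other task gets a turn.

```scala
import java.util.concurrent.{CountDownLatch, Executors}
import scala.collection.mutable.ListBuffer

object InterleavingDemo {
  // Runs two 3-step tasks on ONE thread and returns the order in which steps ran.
  def run(): List[String] = {
    val pool = Executors.newSingleThreadExecutor()
    val log  = ListBuffer.empty[String] // written only by the single worker thread
    val done = new CountDownLatch(2)    // one count per task

    // After each step the task re-submits its continuation to the pool,
    // so the step already queued by the other task runs first.
    def step(name: String, i: Int): Runnable = () => {
      log += s"$name$i"
      if (i < 3) pool.execute(step(name, i + 1)) else done.countDown()
    }

    // Enqueue both tasks from inside the worker thread to keep the order deterministic.
    pool.execute(() => {
      pool.execute(step("A", 1))
      pool.execute(step("B", 1))
    })

    done.await()
    pool.shutdown()
    log.toList
  }

  def main(args: Array[String]): Unit =
    println(run().mkString(" ")) // A1 B1 A2 B2 A3 B3
}
```

Both tasks make progress although only one thread ever executes them, which is concurrency without parallelism.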
Parallelism
- Executing multiple tasks at the same time to finish them faster
Parallelism means running multiple tasks at the same time so that they finish sooner.
Threads
- Basic unit of CPU utilization
- Typically managed by the operating system
- Threads can share memory
- A processor can usually run at most one thread per core at a time
Threads are the basic unit of CPU utilization and are typically part of the OS. Threads can share memory.
As a rule, one thread runs per CPU core.
Threads on JVM
- Map 1:1 to native OS threads
- Each thread reserves around 1 MB of stack memory on 64-bit systems by default (tunable with -Xss)
Threads on the JVM map 1:1 to native OS threads. On 64-bit systems each thread needs about 1 MB by default.
JVM Memory Model
(Diagram: on the JVM side, a shared Heap holding Objects and three Thread Stacks, each with its own call stack and local variables. On the hardware side, a CPU with Core 1, Core 2 and Core 3, each with its own registers and CPU cache, plus Main Memory.)
Context Switch
- Happens when a different thread starts running on a CPU core
- The OS needs to store the state of the old task and restore the state of the new one
- Cache locality is lost
- Synchronous => no context switches => best performance
On a context switch, when a new thread starts running, the state of the old task is saved and the state of the new task is restored.
Synchronous execution causes no context switches, so it is the most efficient.
Thread Pools
- Take care of managing threads for us
- Can reuse threads
- Several types, e.g. Cached, SingleThreaded, WorkStealing etc.
- Think ExecutionContext (Future), ContextShift (cats.effect.IO), Scheduler (Monix Task)
A thread pool looks after multiple threads for us, which makes it possible to reuse them.
Thread Pools
import java.util.concurrent.Executors
import monix.execution.Scheduler
import scala.concurrent.ExecutionContext
val ec = ExecutionContext.fromExecutor(Executors.newCachedThreadPool())
val scheduler = Scheduler(ec)
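Thread reuse is easy to observe with plain scala.concurrent.Future. The sketch below is an illustration under assumptions (ThreadReuseDemo and distinctThreads are hypothetical names, and it uses a fixed pool rather than the cached pool shown above): it submits many small tasks to a pool of a given size and collects the names of the threads that ran them.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ThreadReuseDemo {
  // Runs `tasks` tiny Futures on a fixed pool and returns the distinct worker thread names.
  def distinctThreads(tasks: Int, poolSize: Int): Set[String] = {
    val pool = Executors.newFixedThreadPool(poolSize)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val names = Future.sequence(List.fill(tasks)(Future(Thread.currentThread().getName)))
      Await.result(names, 10.seconds).toSet
    } finally pool.shutdown() // the pool owns its threads, so release them when done
  }

  def main(args: Array[String]): Unit =
    println(distinctThreads(tasks = 100, poolSize = 2).size) // never more than 2
}
```

A hundred tasks are served by at most two threads, so no thread is created per task.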
Extra capabilities
import monix.eval.Task
import monix.execution.CancelableFuture
import monix.execution.schedulers.TestScheduler
import scala.concurrent.duration._
import cats.implicits._
// TestScheduler simulates the passage of time, so nothing really sleeps
val sc = TestScheduler()
val failedTask: Task[Int] = Task.sleep(2.days) >>
  Task.raiseError[Int](new Exception("boom"))
val f: CancelableFuture[Int] = failedTask.runToFuture(sc)
println(f.value) // None
sc.tick(10.hours)
println(f.value) // None
sc.tick(1000.days)
println(f.value) // Some(Failure(java.lang.Exception: boom))
Asynchronous Boundary
- The Task returns to the thread pool to be scheduled again
- Many scenarios can occur, e.g.:
  - Cancelation
  - Another Task starts executing
  - The same Task continues on the same thread as before
  - A new thread is created
  - and more!
An asynchronous boundary is the point where a task returns to the thread pool to be scheduled again.
Asynchronous Boundary
val s: Scheduler = Scheduler.global
def repeat: Task[Unit] =
  for {
    _ <- Task.shift
    _ <- Task(println(s"Shifted to: ${Thread.currentThread().getName}"))
    _ <- repeat
  } yield ()
repeat.runToFuture(s)
Asynchronous Boundary
// Output:
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// ...
Task Scheduling
val s1: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
  ExecutionModel.SynchronousExecution)
// No async boundary: repeat(1) never gives up the only thread
def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => repeat(id))
val prog = (repeat(1), repeat(2)).parTupled
prog.runToFuture(s1)
// Output:
// 1111111111111111111111111111111111111111111111111111111111...
Task Scheduling
val s1: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
  ExecutionModel.SynchronousExecution)
// Task.shift inserts an async boundary, so the other task gets a turn
def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => Task.shift >> repeat(id))
val prog = (repeat(1), repeat(2)).parTupled
prog.runToFuture(s1)
// Output:
// 121212121212121212121212121212121212121212121 ...
Task Scheduling
val s1: Scheduler = Scheduler( // 4 = number of cores on my laptop
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4)),
  ExecutionModel.SynchronousExecution)
val s2: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1)),
  ExecutionModel.SynchronousExecution)
def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => repeat(id))
// Five tasks compete for the four threads of s1 and never yield; repeat(6) runs on s2
val program = (repeat(1), repeat(2), repeat(3), repeat(4), repeat(5),
  repeat(6).executeOn(s2)).parTupled
program.runToFuture(s1)
// Output:
// 143622331613424316134424316134243161342431613424316134243 ...
// no '5' !
Light Async Boundary
- Continues on the same thread by means of a trampoline
- Checks with thread pool for cancelation status (in Monix & Cats-Effect IO)
- Can help with stack safety
- Low level!
A light async boundary continues on the same thread by means of a trampoline, which keeps it stack-safe.
In Monix and Cats-Effect IO it also checks whether the task has been canceled.
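The trampoline idea itself can be shown with the standard library's scala.util.control.TailCalls. This is a sketch of the general technique, not of how Monix or Cats-Effect implement it, and TrampolineDemo is a hypothetical name. Instead of calling the next step directly and growing the call stack, each step returns a description of the next call, and a loop on the same thread runs them one by one.

```scala
import scala.util.control.TailCalls._

object TrampolineDemo {
  // Mutually recursive functions: a direct call chain of this depth would overflow the stack.
  // tailcall(...) returns the next step as a value instead of calling it.
  def isEven(n: Int): TailRec[Boolean] =
    if (n == 0) done(true) else tailcall(isOdd(n - 1))

  def isOdd(n: Int): TailRec[Boolean] =
    if (n == 0) done(false) else tailcall(isEven(n - 1))

  def main(args: Array[String]): Unit =
    // .result runs the steps in a loop on the current thread, using constant stack space
    println(isEven(1000000).result) // true
}
```

Because control returns to the loop between steps, that loop is also a natural place for a runtime to check things such as cancelation status.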
Light Async Boundary
implicit val s = Scheduler.global
  .withExecutionModel(ExecutionModel.SynchronousExecution)
def task(i: Int): Task[Unit] =
  Task(println(s"$i: ${Thread.currentThread().getName}")) >>
    Task.shift(TrampolineExecutionContext.immediate) >> task(i + 1)
val t =
  for {
    fiber <- task(0)
      .doOnCancel(Task(println("cancel")))
      .start
    _ <- fiber.cancel
  } yield ()
t.runToFuture
// Output:
// 0: scala-execution-context-global-11
// 1: scala-execution-context-global-11
// 2: scala-execution-context-global-11
// cancel
Light Async Boundary
val immediate: TrampolineExecutionContext =
  TrampolineExecutionContext(new ExecutionContext {
    def execute(r: Runnable): Unit = r.run()
    def reportFailure(e: Throwable): Unit = throw e
  })
Blocking Threads
- An operation that occupies an entire thread without doing any useful work
- Don't do it if you have a choice
- The thread is wasted, and other tasks might not get scheduled if the thread pool is limited
- Use a dedicated thread pool for blocking operations
- Use timeouts
Blocking a thread means running an operation that occupies the thread.
It wastes thread resources, so avoid it whenever there is another option.
Dealing with blocking ops
implicit val globalScheduler = Scheduler.global
val blockingScheduler = Scheduler.io()
val blockingOp = Task {
  Thread.sleep(1000)
  println(s"${Thread.currentThread().getName}: done blocking")
}
val cpuBound = Task {
  // keep cpu busy
  println(s"${Thread.currentThread().getName}: done calculating")
}
val t =
  for {
    _ <- cpuBound
    _ <- blockingOp.executeOn(blockingScheduler)
    _ <- cpuBound
  } yield ()
t.runSyncUnsafe()
// scala-execution-context-global-11: done calculating
// monix-io-12: done blocking
// scala-execution-context-global-11: done calculating
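The same two pieces of advice, a dedicated pool and timeouts, can be sketched with plain scala.concurrent.Future. Everything here is illustrative: BlockingDemo is a hypothetical name, Thread.sleep stands in for a real blocking call, and Await is used only at the edge of the demo to read the results.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.Try

object BlockingDemo {
  // Returns (cpuResult, fastFinishedInTime, slowFinishedInTime).
  def run(): (Int, Boolean, Boolean) = {
    // Dedicated, growable pool reserved for blocking calls
    val blockingPool = Executors.newCachedThreadPool()
    val blockingEc   = ExecutionContext.fromExecutorService(blockingPool)
    try {
      val fast = Future { Thread.sleep(50); "done" }(blockingEc)
      val slow = Future { Thread.sleep(1000); "done" }(blockingEc)
      // CPU-bound work stays on the global pool and is not held up by the sleeps
      val cpu = Future((1 to 1000).sum)(ExecutionContext.global)
      (
        Await.result(cpu, 5.seconds),
        Try(Await.result(fast, 1.second)).isSuccess,
        Try(Await.result(slow, 100.millis)).isSuccess // gives up with a TimeoutException
      )
    } finally blockingPool.shutdown() // accept no new work; running calls may finish
  }

  def main(args: Array[String]): Unit =
    println(run()) // (500500,true,false)
}
```

Note that the timeout only stops the caller from waiting. The slow call keeps its thread until it returns, which is one more reason to keep such calls away from the main pool.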
Semantic / Asynchronous Blocking
- The Task waits for a certain signal before it can complete
- Doesn't actually block any threads; other tasks can execute in the meantime
Semantic (asynchronous) blocking: no thread is actually blocked, so other tasks can run in the meantime.
Semantic / Asynchronous Blocking
import cats.effect.concurrent.Semaphore // Cats-Effect 1.x/2.x package
val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
implicit val scheduler = Scheduler(ec)
val otherTask = Task.sleep(50.millis) >> Task(println("Running concurrently"))
val t =
  for {
    sem <- Semaphore[Task](0L)
    // start means it will run asynchronously "in the background"
    // and the next step in the for comprehension will begin
    _ <- (Task.sleep(100.millis) >>
      Task(println("Releasing semaphore")) >> sem.release).start
    _ <- Task(println("Waiting for permit")) >> sem.acquire
    _ <- Task(println("Done!"))
  } yield ()
(t, otherTask).parTupled.runSyncUnsafe()
// Waiting for permit
// Running concurrently
// Releasing semaphore
// Done!
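A rough standard-library analogue of waiting without blocking is a Promise: the waiting side registers a callback instead of parking a thread. The sketch below is an illustration only (SemanticBlockingDemo is a hypothetical name, and a Promise is a one-shot signal, not a semaphore). It runs on a single-threaded pool, where blocking the thread while waiting would leave no thread for anything else.

```scala
import java.util.concurrent.Executors
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object SemanticBlockingDemo {
  def run(): List[String] = {
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    val log    = ListBuffer.empty[String] // written only by the single pool thread
    val permit = Promise[Unit]()

    // "Waits" for the permit by registering a callback; no thread is parked.
    val waiter: Future[Unit] = permit.future.map { _ => log += "waiter: done"; () }
    // The only pool thread is still free, so this task can run in the meantime.
    val other: Future[Unit] = Future { log += "other: running"; () }

    try {
      Await.result(other, 5.seconds)
      permit.success(()) // release the permit
      Await.result(waiter, 5.seconds)
      log.toList
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit =
    println(run()) // List(other: running, waiter: done)
}
```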
Fairness
- The likelihood that different tasks are all able to advance
- Without fairness you can get a high disparity in latency between different requests
- Monix and ZIO introduce async boundaries from time to time (configurable)
Fairness is the likelihood that different tasks are all able to make progress.
Without it, latencies vary widely.
Cancelation / Interruption
- Ability to stop running Tasks
- Not all tasks are cancelable
- Cancelation might not happen immediately or even at all
What makes Task cancelable?
- Asynchronous boundaries
- flatMaps (and more methods depending on implementation)
- Handling java.lang.InterruptedException (opt-in)
In short: asynchronous boundaries, flatMap, and opt-in handling of InterruptedException.
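The InterruptedException route can be illustrated with plain JDK executors, independent of any effect library. In this sketch (InterruptDemo is a hypothetical name) the worker opts in to cancelation by making a blocking call that reacts to interrupts and by handling the resulting exception. A bare busy loop that never checks for interruption would keep running.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

object InterruptDemo {
  // Returns true if the worker observed the cancelation and stopped.
  def run(): Boolean = {
    val pool    = Executors.newSingleThreadExecutor()
    val started = new CountDownLatch(1)
    val stopped = new AtomicBoolean(false)

    val handle = pool.submit(new Runnable {
      def run(): Unit =
        try {
          started.countDown()
          while (true) Thread.sleep(10) // blocking call that reacts to interrupts
        } catch {
          case _: InterruptedException => stopped.set(true) // cancelation observed
        }
    })

    started.await()     // make sure the worker is actually running
    handle.cancel(true) // mayInterruptIfRunning = true: interrupts the worker thread
    pool.shutdown()
    pool.awaitTermination(1, TimeUnit.SECONDS)
    stopped.get()
  }

  def main(args: Array[String]): Unit =
    println(run()) // true
}
```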
Canceling Task
implicit val s = Scheduler.global
def foo(i: Int): Task[Unit] =
  for {
    _ <- Task(println(s"start $i"))
    _ <- if (i % 2 == 0) Task.raiseError(DummyException("error"))
         else Task.sleep(10.millis)
    _ <- Task(println(s"end $i"))
  } yield ()
val tasks: List[Task[Unit]] = List(foo(1), foo(2), foo(3), foo(4))
val result: Task[List[Unit]] = Task.gather(tasks)
println(result.attempt.runSyncUnsafe())
// start 4
// start 1
// start 2
// start 3
// Left(monix.execution.exceptions.DummyException: error)
Canceling Task
// (...) code from the previous slide
val tasks: List[Task[Unit]] = List(foo(1), foo(2), foo(3), foo(4))
val result: Task[List[Either[Throwable, Unit]]] = Task.wander(tasks)(_.attempt)
println(result.runSyncUnsafe())
// start 2
// start 1
// start 3
// start 4
// end 1
// end 3
// List(
//   Right(()), Left(monix.execution.exceptions.DummyException: error),
//   Right(()), Left(monix.execution.exceptions.DummyException: error)
// )
Canceling Task
implicit val s = Scheduler.global
val t1 =
  for {
    _ <- Task(println("t1: start"))
    _ <- Task.sleep(100.millis)
    _ <- Task(println("t1: middle"))
    _ <- Task.sleep(100.millis)
    _ <- Task(println("t1: end"))
  } yield ()
t1.timeout(150.millis).runSyncUnsafe()
// t1: start
// t1: middle
// Exception in thread "main" java.util.concurrent.TimeoutException:
//   Task timed-out after 150 milliseconds of inactivity
Canceling Task
val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
implicit val scheduler = Scheduler(ec)
def middle: Task[Unit] = Task {
  while (true) {}
}
val t1 =
  for {
    _ <- Task(println("t1: start"))
    _ <- Task.sleep(100.millis)
    _ <- middle >> Task(println("t1: middle"))
    _ <- Task.sleep(100.millis)
    _ <- Task(println("t1: end"))
  } yield ()
t1.timeout(150.millis).runSyncUnsafe()
// t1: start
// ... never stops running
Green Threads and Fibers
Green Thread
A thread scheduled by the VM instead of the OS. Cheap to start, and we can map M green threads onto N OS threads.
Fiber
"Lightweight thread". Fibers voluntarily yield control to the scheduler.
A green thread is scheduled by the VM rather than the OS, so it does not have to map 1:1 to an OS thread.
A fiber is a lightweight thread that hands control back to the scheduler.
Is Task like a Green Thread/Fiber?
- We can map thousands of Tasks to very few OS threads
- Async boundaries give a chance to other tasks from the thread pool to execute, very much like cooperative multitasking
- "Blocking" a Task itself is OK since it doesn't block underlying Threads
Fiber in Cats-Effect
trait Fiber[F[_], A] {
  def cancel: F[Unit]
  def join: F[A]
}
sealed abstract class Task[+A] {
  // (...)
  final def start: Task[Fiber[A]]
  // (...)
}
Thank you!
I will appreciate any feedback. :)
twitter.com/p_gawrys
If you're looking for help/discussion:
https://gitter.im/typelevel/cats-effect
https://gitter.im/functional-streams-for-scala/fs2
https://gitter.im/monix/monix
https://gitter.im/ZIO/core
Functional Concurrency in Scala 101 (ScalaMatsuri 2019)
By Piotr Gawryś