Dispelling Magic behind Concurrency in FP
Piotr Gawryś
Who am I?
- One of the maintainers of Monix
- Contributor to Typelevel ecosystem
-
Kraków Scala User Group co-organizer
(let me know if you'd like to speak!) - Super excited to be here!
https://github.com/Avasil
twitter.com/p_gawrys
What's the talk about
- Concurrency fundamentals on JVM
- Focused around purely functional programming (Cats-Effect, FS2, Monix, ZIO)
- No overview/marketing of any of them
twitter.com/p_gawrys
Concurrency
- Processing multiple tasks interleaved
- Doesn't require more than one thread to work
twitter.com/p_gawrys
Concurrency
twitter.com/p_gawrys
Parallelism
- Executing multiple tasks at the same time to finish them faster
twitter.com/p_gawrys
Parallelism
twitter.com/p_gawrys
Threads
- Basic unit of CPU utilization
- Typically part of Operating System
- Threads can share memory
- Processors can usually run up to 1 thread per CPU's core at the same time
twitter.com/p_gawrys
Threads on JVM
- Map 1:1 to native OS threads
- Each thread takes around 1 MB of memory on 64-bit systems by default
twitter.com/p_gawrys
JVM Memory Model
Heap
Thread Stack
Objects
call stack
local variables
Thread Stack
call stack
local variables
Thread Stack
call stack
local variables
twitter.com/p_gawrys
CPU
CPU's Core 1
Main Memory
registers
CPU
Cache
JVM
CPU's Core 3
registers
CPU
Cache
CPU's Core 2
registers
CPU
Cache
twitter.com/p_gawrys
Context Switch
- Happens when new thread starts working on CPU's core
- OS needs to store the state of old task and restore the state of the new one
- Cache locality is lost
- Synchronous => no context switches => best performance
twitter.com/p_gawrys
Context Switch
twitter.com/p_gawrys
Thread Pools
- Take care of managing threads for us
- Can reuse threads
- Several types, e.g. Cached, SingleThreaded, WorkStealing etc.
- Think ExecutionContext (Future), ContextShift (cats.effect.IO), Scheduler (Monix Task)
twitter.com/p_gawrys
Thread Pools
import java.util.concurrent.Executors
import monix.execution.Scheduler
import scala.concurrent.ExecutionContext
val ec = ExecutionContext.fromExecutor(Executors.newCachedThreadPool())
val scheduler = Scheduler(ec)
twitter.com/p_gawrys
Extra capabilities
import monix.execution.schedulers.TestScheduler
import scala.concurrent.duration._
import cats.implicits._
val sc = TestScheduler()
val failedTask: Task[Int] = Task.sleep(2.days) >>
Task.raiseError[Int](new Exception("boom"))
val f: CancelableFuture[Int] = failedTask.runToFuture(sc)
println(f.value) // None
sc.tick(10.hours)
println(f.value) // None
sc.tick(1000.days)
println(f.value) // Some(Failure(java.lang.Exception: boom))
twitter.com/p_gawrys
Asynchronous Boundary
- Task returns to Thread Pool to be scheduled again
- Many scenarios can occur, e.g.:
- Cancelation
- Another Task starts executing
- The same Task executes on the same thread it was before
- New Thread is created
- and more!
twitter.com/p_gawrys
Asynchronous Boundary
val s: Scheduler = Scheduler.global
def repeat: Task[Unit] =
for {
_ <- Task.shift
_ <- Task(println(s"Shifted to: ${Thread.currentThread().getName}"))
_ <- repeat
} yield ()
repeat.runToFuture(s)
twitter.com/p_gawrys
Asynchronous Boundary
// Output:
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// ...
twitter.com/p_gawrys
val s1: Scheduler = Scheduler(
ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
ExecutionModel.SynchronousExecution)
def repeat(id: Int): Task[Unit] =
Task(print(id)).flatMap(_ => repeat(id))
val prog = (repeat(1), repeat(2)).parTupled
prog.runToFuture(s1)
// Output:
// 1111111111111111111111111111111111111111111111111111111111...
Task Scheduling
twitter.com/p_gawrys
val s1: Scheduler = Scheduler(
ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
ExecutionModel.SynchronousExecution)
def repeat(id: Int): Task[Unit] =
Task(print(id)).flatMap(_ => Task.shift >> repeat(id))
val prog = (repeat(1), repeat(2)).parTupled
prog.runToFuture(s1)
// Output:
// 121212121212121212121212121212121212121212121 ...
Task Scheduling
twitter.com/p_gawrys
val s1: Scheduler = Scheduler( // 4 = number of cores on my laptop
ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4)),
ExecutionModel.SynchronousExecution)
val s2: Scheduler = Scheduler(
ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1)),
ExecutionModel.SynchronousExecution)
def repeat(id: Int): Task[Unit] =
Task(print(id)).flatMap(_ => repeat(id))
val program = (repeat(1), repeat(2), repeat(3), repeat(4), repeat(5),
repeat(6).executeOn(s2)).parTupled
program.runToFuture(s1)
// Output:
// 143622331613424316134424316134243161342431613424316134243 ...
// no '5' !
Task Scheduling
twitter.com/p_gawrys
Light Async Boundary
- Continues on the same thread by means of a trampoline
- Checks with thread pool for cancelation status (in Monix & Cats-Effect IO)
- Can help with stack safety
- Low level!
twitter.com/p_gawrys
Light Async Boundary
implicit val s = Scheduler.global
.withExecutionModel(ExecutionModel.SynchronousExecution)
def task(i: Int): Task[Unit] =
Task(println(s"$i: ${Thread.currentThread().getName}")) >>
Task.shift(TrampolineExecutionContext.immediate) >> task(i + 1)
val t =
for {
fiber <- task(0)
.doOnCancel(Task(println("cancel")))
.start
_ <- fiber.cancel
} yield ()
t.runToFuture
// Output:
// 0: scala-execution-context-global-11
// 1: scala-execution-context-global-11
// 2: scala-execution-context-global-11
// cancel
twitter.com/p_gawrys
Light Async Boundary
val immediate: TrampolineExecutionContext =
TrampolineExecutionContext(new ExecutionContext {
def execute(r: Runnable): Unit = r.run()
def reportFailure(e: Throwable): Unit = throw e
})
twitter.com/p_gawrys
Blocking Threads
- Operation that takes entire thread without doing any useful work
- Don't do it if you have a choice
- Thread is being wasted, might prevent other tasks from being scheduled if the thread pool is limited
- Use dedicated thread pool for blocking operations
- Use timeouts
twitter.com/p_gawrys
Dealing with blocking ops
implicit val globalScheduler = Scheduler.global
val blockingScheduler = Scheduler.io()
val blockingOp = Task {
Thread.sleep(1000)
println(s"${Thread.currentThread().getName}: done blocking")
}
val cpuBound = Task {
// keep cpu busy
println(s"${Thread.currentThread().getName}: done calculating")
}
val t =
for {
_ <- cpuBound
_ <- blockingOp.executeOn(blockingScheduler)
_ <- cpuBound
} yield ()
t.runSyncUnsafe()
// scala-execution-context-global-11: done calculating
// monix-io-12: done blocking
// scala-execution-context-global-11: done calculating
Semantic / Asynchronous Blocking
- Task waits for certain signal before it can complete
- Doesn't really block any threads, other tasks can execute in the meantime
twitter.com/p_gawrys
Semantic / Asynchronous Blocking
val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
implicit val scheduler = Scheduler(ec)
val otherTask = Task.sleep(50.millis) >> Task(println("Running concurrently"))
val t =
for {
sem <- Semaphore[Task](0L)
// start means it will run asynchronously "in the background"
// and the next step in for comprehension will begin
_ <- (Task.sleep(100.millis) >>
Task(println("Releasing semaphore")) >> sem.release).start
_ <- Task(println("Waiting for permit")) >> sem.acquire
_ <- Task(println("Done!"))
} yield ()
(t, otherTask).parTupled.runSyncUnsafe()
// Waiting for permit
// Running concurrently
// Releasing semaphore
// Done!
Fairness
- Likelihood that different tasks are able to advance
- Without fairness you could get high disparity of latencies between processing different requests
- Monix and ZIO introduce async boundaries from time to time (configurable)
twitter.com/p_gawrys
Fairness
Cancelation / Interruption
- Ability to stop running Tasks
- Not all tasks are cancelable
- Cancelation might not happen immediately or even at all
twitter.com/p_gawrys
What makes Task cancelable?
- Asynchronous boundaries
- flatMaps (and more methods depending on implementation)
- Handling java.lang.InterruptedException (opt-in)
twitter.com/p_gawrys
Canceling Task
twitter.com/p_gawrys
implicit val s = Scheduler.global
def foo(i: Int): Task[Unit] =
for {
_ <- Task(println(s"start $i"))
_ <- if (i % 2 == 0) Task.raiseError(DummyException("error"))
else Task.sleep(10.millis)
_ <- Task(println(s"end $i"))
} yield ()
val tasks: List[Task[Unit]] = List(foo(1), foo(2), foo(3), foo(4))
val result: Task[List[Unit]] = Task.gather(tasks)
println(result.attempt.runSyncUnsafe())
// start 4
// start 1
// start 2
// start 3
// Left(monix.execution.exceptions.DummyException: error)
Canceling Task
twitter.com/p_gawrys
// (...) code from the previous slide
val tasks: List[Task[Unit]] = List(foo(1), foo(2), foo(3), foo(4))
val result: Task[List[Either[Throwable, Unit]]] = Task.wander(tasks)(_.attempt)
println(result.runSyncUnsafe())
// start 2
// start 1
// start 3
// start 4
// end 1
// end 3
// List(
// Right(()), Left(monix.execution.exceptions.DummyException: error),
// Right(()), Left(monix.execution.exceptions.DummyException: error)
// )
Canceling Task
twitter.com/p_gawrys
implicit val s = Scheduler.global
val t1 =
for {
_ <- Task(println("t1: start"))
_ <- Task.sleep(100.millis)
_ <- Task(println("t1: middle"))
_ <- Task.sleep(100.millis)
_ <- Task(println("t1: end"))
} yield ()
t1.timeout(150.millis).runSyncUnsafe()
// t1: start
// t1: middle
// Exception in thread "main" java.util.concurrent.TimeoutException:
// Task timed-out after 150 milliseconds of inactivity
Canceling Task
twitter.com/p_gawrys
val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
implicit val scheduler = Scheduler(ec)
def middle: Task[Unit] = Task {
while(true){}
}
val t1 =
for {
_ <- Task(println("t1: start"))
_ <- Task.sleep(100.millis)
_ <- middle >> Task(println("t1: middle"))
_ <- Task.sleep(100.millis)
_ <- Task(println("t1: end"))
} yield ()
t1.timeout(150.millis).runSyncUnsafe()
// t1: start
// ... never stops running
Green Threads and Fibers
Green Thread
Thread is scheduled by VM instead of OS. Cheap to start and we can map M Green Threads to N OS Threads.
Fiber
"Lightweight thread". Fibers voluntarily yield control to the scheduler.
twitter.com/p_gawrys
Is Task like a Green Thread/Fiber?
- We can map thousands of Tasks to very few OS threads
- Async boundaries give a chance to other tasks from the thread pool to execute, very much like cooperative multitasking
- "Blocking" a Task itself is OK since it doesn't block underlying Threads
twitter.com/p_gawrys
Fiber in Cats-Effect
twitter.com/p_gawrys
trait Fiber[F[_], A] {
def cancel: F[Unit]
def join: F[A]
}
sealed abstract class Task[+A] {
// (...)
final def start: Task[Fiber[A]]
// (...)
}
Thank you !
https://gitter.im/typelevel/cats-effect
https://gitter.im/functional-streams-for-scala/fs2
https://gitter.im/monix/monix
https://gitter.im/ZIO/core
I will appreciate any feedback. :)
If you're looking for help/discussion:
twitter.com/p_gawrys
Dispelling Magic behind Concurrency in FP (flatMap(Oslo) 2019)
By Piotr Gawryś
Dispelling Magic behind Concurrency in FP (flatMap(Oslo) 2019)
- 1,644