Functional Concurrency in Scala 101
Piotr Gawryś
Who am I?
- One of the maintainers of Monix
- Contributor to Typelevel ecosystem
- Kraków Scala User Group co-organizer (let me know if you'd like to speak!)
- Super excited to be here!
https://github.com/Avasil
twitter.com/p_gawrys
What's the talk about?
- Concurrency fundamentals on the JVM
- Focused around purely functional programming (Cats-Effect, FS2, Monix, ZIO)
Concurrency
- Processing multiple tasks interleaved
- Doesn't require more than one thread to work
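A minimal sketch of the two points above, using only the Scala standard library. The names `InterleavingSketch`, `task` and `interleave` are made up for illustration: each "task" is an iterator of steps, and a round-robin loop advances them in turn on the calling thread.

```scala
object InterleavingSketch {
  // A "task" is modelled as an iterator of named steps (illustrative only).
  def task(name: String, steps: Int): Iterator[String] =
    (1 to steps).iterator.map(i => s"$name$i")

  // Advances every unfinished task by one step per round, all on the calling thread.
  def interleave(tasks: List[Iterator[String]]): List[String] = {
    val out = List.newBuilder[String]
    var pending = tasks
    while (pending.nonEmpty) {
      pending = pending.filter(_.hasNext)
      pending.foreach(t => out += t.next())
    }
    out.result()
  }
}
```

Calling `InterleavingSketch.interleave(List(InterleavingSketch.task("A", 3), InterleavingSketch.task("B", 2)))` yields `List("A1", "B1", "A2", "B2", "A3")`: both tasks make progress although only one thread is involved.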
Parallelism
- Executing multiple tasks at the same time to finish them faster
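As a rough illustration, the sketch below splits a sum into chunks and computes them at the same time on a fixed pool of threads, using plain `scala.concurrent.Future`. `ParallelSumSketch`, `parallelSum` and the 30 second timeout are assumptions made for this example.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ParallelSumSketch {
  // Splits `numbers` into roughly `parallelism` chunks and sums each chunk on its own thread.
  def parallelSum(numbers: Vector[Long], parallelism: Int): Long = {
    require(parallelism > 0, "parallelism must be positive")
    val pool = Executors.newFixedThreadPool(parallelism)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val chunkSize = math.max(1, math.ceil(numbers.size.toDouble / parallelism).toInt)
      // Every Future is submitted to the pool right away, so chunks can run simultaneously.
      val partials = numbers.grouped(chunkSize).toList.map(chunk => Future(chunk.sum))
      // Blocking the caller here is only for the sake of the demo.
      Await.result(Future.sequence(partials), 30.seconds).sum
    } finally pool.shutdown()
  }
}
```

The speed-up only materialises if the machine has enough free cores; on a single core the same code would still be concurrent, but not parallel.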
Threads
- Basic unit of CPU utilization
- Typically managed by the operating system
- Threads can share memory
- A CPU can usually run one thread per core at a time
Threads on JVM
- Map 1:1 to native OS threads
- Each thread reserves around 1 MB of stack memory by default on 64-bit systems
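The default stack size can be changed for the whole JVM with the `-Xss` flag, and `java.lang.Thread` also has a constructor that takes a requested stack size. The sketch below uses that constructor; `ThreadStackSketch`, `runOnThread` and the thread name are made up for illustration, and the JVM is free to treat the requested size as a hint.

```scala
import java.util.concurrent.atomic.AtomicReference

object ThreadStackSketch {
  // Starts one OS-backed thread with a requested stack size and returns its name.
  def runOnThread(stackSizeBytes: Long): String = {
    val seen = new AtomicReference[String]("")
    val body: Runnable = () => seen.set(Thread.currentThread().getName)
    // Thread(group, target, name, stackSize): the size is a suggestion to the VM.
    val t = new Thread(null, body, "sketch-thread", stackSizeBytes)
    t.start()
    t.join() // wait until the thread has finished
    seen.get()
  }
}
```

For example, `ThreadStackSketch.runOnThread(256L * 1024)` asks for a 256 KB stack instead of the default.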
JVM Memory Model
[Diagram: JVM with a Heap (objects) and three Thread Stacks (call stack, local variables); CPU with Cores 1-3, each with registers and cache, connected to Main Memory]
Context Switch
- Happens when a different thread starts running on a CPU core
- The OS needs to store the state of the old thread and restore the state of the new one
- Cache locality is lost
- Synchronous => no context switches => best performance
Thread Pools
- Take care of managing threads for us
- Can reuse threads
- Several types, e.g. Cached, SingleThreaded, WorkStealing etc.
- Think ExecutionContext (Future), ContextShift (cats.effect.IO), Scheduler (Monix Task)
Thread Pools
import java.util.concurrent.Executors
import monix.execution.Scheduler
import scala.concurrent.ExecutionContext
val ec = ExecutionContext.fromExecutor(Executors.newCachedThreadPool())
val scheduler = Scheduler(ec)
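To make the "can reuse threads" point concrete, here is a small sketch with plain `scala.concurrent.Future` (no Monix needed). It runs many tasks on a small fixed pool and collects the names of the threads that executed them. `PoolReuseSketch` and `distinctThreads` are hypothetical names.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object PoolReuseSketch {
  // Runs `tasks` tiny jobs on a pool of `poolSize` threads and
  // returns the set of thread names that ended up doing the work.
  def distinctThreads(poolSize: Int, tasks: Int): Set[String] = {
    val pool = Executors.newFixedThreadPool(poolSize)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val names = Future.traverse((1 to tasks).toList) { _ =>
        Future(Thread.currentThread().getName)
      }
      Await.result(names, 30.seconds).toSet
    } finally pool.shutdown()
  }
}
```

`PoolReuseSketch.distinctThreads(2, 100)` never contains more than two names: the 100 tasks are queued and the same two threads pick them up one after another.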
Extra capabilities
import cats.implicits._
import monix.eval.Task
import monix.execution.CancelableFuture
import monix.execution.schedulers.TestScheduler
import scala.concurrent.duration._

// TestScheduler simulates the passage of time, so nothing really sleeps
val sc = TestScheduler()
val failedTask: Task[Int] = Task.sleep(2.days) >>
  Task.raiseError[Int](new Exception("boom"))
val f: CancelableFuture[Int] = failedTask.runToFuture(sc)
println(f.value) // None
sc.tick(10.hours)
println(f.value) // None
sc.tick(1000.days)
println(f.value) // Some(Failure(java.lang.Exception: boom))
Asynchronous Boundary
- The Task returns to the thread pool to be scheduled again
- Many scenarios can occur, e.g.:
  - Cancelation
  - Another Task starts executing
  - The same Task resumes on the same thread as before
  - A new thread is created
  - and more!
Asynchronous Boundary
val s: Scheduler = Scheduler.global
def repeat: Task[Unit] =
  for {
    _ <- Task.shift
    _ <- Task(println(s"Shifted to: ${Thread.currentThread().getName}"))
    _ <- repeat
  } yield ()
repeat.runToFuture(s)
Asynchronous Boundary
// Output:
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-14
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-13
// Shifted to: scala-execution-context-global-12
// Shifted to: scala-execution-context-global-12
// ...
Task Scheduling
val s1: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => repeat(id))

val prog = (repeat(1), repeat(2)).parTupled
prog.runToFuture(s1)
// Output:
// 1111111111111111111111111111111111111111111111111111111111...
Task Scheduling
val s1: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor()),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => Task.shift >> repeat(id))

val prog = (repeat(1), repeat(2)).parTupled
prog.runToFuture(s1)
// Output:
// 121212121212121212121212121212121212121212121 ...
Task Scheduling
val s1: Scheduler = Scheduler(
  // 4 = number of cores on my laptop
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4)),
  ExecutionModel.SynchronousExecution)
val s2: Scheduler = Scheduler(
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1)),
  ExecutionModel.SynchronousExecution)

def repeat(id: Int): Task[Unit] =
  Task(print(id)).flatMap(_ => repeat(id))

val program = (repeat(1), repeat(2), repeat(3), repeat(4), repeat(5),
  repeat(6).executeOn(s2)).parTupled
program.runToFuture(s1)
// Output:
// 143622331613424316134424316134243161342431613424316134243 ...
// no '5'! Tasks 1-4 never yield, so they occupy all 4 threads of s1,
// while task 6 runs on its own scheduler s2
Light Async Boundary
- Continues on the same thread by means of a trampoline
- Checks with thread pool for cancelation status (in Monix & Cats-Effect IO)
- Can help with stack safety
- Low level!
Light Async Boundary
implicit val s: Scheduler = Scheduler.global
  .withExecutionModel(ExecutionModel.SynchronousExecution)

def task(i: Int): Task[Unit] =
  Task(println(s"$i: ${Thread.currentThread().getName}")) >>
    Task.shift(TrampolineExecutionContext.immediate) >> task(i + 1)

val t =
  for {
    fiber <- task(0)
      .doOnCancel(Task(println("cancel")))
      .start
    _ <- fiber.cancel
  } yield ()
t.runToFuture
// Output:
// 0: scala-execution-context-global-11
// 1: scala-execution-context-global-11
// 2: scala-execution-context-global-11
// cancel
Light Async Boundary
val immediate: TrampolineExecutionContext =
  TrampolineExecutionContext(new ExecutionContext {
    def execute(r: Runnable): Unit = r.run()
    def reportFailure(e: Throwable): Unit = throw e
  })
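To show the idea behind a trampoline in isolation, here is a deliberately simplified, single-threaded sketch. It is not Monix's actual implementation: `TrampolineSketch`, `Trampoline` and `countDown` are hypothetical names. Work submitted while the trampoline is already running is queued and drained by the outermost call, so the stack does not grow.

```scala
import scala.collection.mutable

object TrampolineSketch {
  // Not thread-safe: meant to be used from a single thread only.
  final class Trampoline {
    private val queue = mutable.Queue.empty[Runnable]
    private var running = false

    def execute(r: Runnable): Unit = {
      queue.enqueue(r)
      if (!running) { // only the outermost call drains the queue
        running = true
        try {
          while (queue.nonEmpty) queue.dequeue().run()
        } finally {
          running = false
        }
      }
    }
  }

  // A "recursive" loop of n steps that stays on the calling thread
  // and uses constant stack depth thanks to the trampoline.
  def countDown(n: Int): Int = {
    val trampoline = new Trampoline
    var steps = 0
    def loop(i: Int): Unit =
      if (i > 0) trampoline.execute(() => { steps += 1; loop(i - 1) })
    loop(n)
    steps
  }
}
```

`TrampolineSketch.countDown(1000000)` completes without a `StackOverflowError`, which is the stack-safety benefit mentioned above.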
Blocking Threads
- Don't
- A blocked thread is wasted and might prevent other tasks from being scheduled if the thread pool is bounded
- Use dedicated thread pool for blocking operations
- Use timeouts
Dealing with blocking ops
implicit val globalScheduler = Scheduler.global
val blockingScheduler = Scheduler.io()
val blockingOp = Task {
  Thread.sleep(1000)
  println(s"${Thread.currentThread().getName}: done blocking")
}

val cpuBound = Task {
  // keep cpu busy
  println(s"${Thread.currentThread().getName}: done calculating")
}

val t =
  for {
    _ <- cpuBound
    _ <- blockingOp.executeOn(blockingScheduler)
    _ <- cpuBound
  } yield ()
t.runSyncUnsafe()
// scala-execution-context-global-11: done calculating
// monix-io-12: done blocking
// scala-execution-context-global-11: done calculating
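The "use timeouts" advice is not covered by the example above, so here is a rough standard-library sketch that combines a dedicated pool for the blocking call with a timeout on the result. `BlockingTimeoutSketch` and `blockingCall` are made-up names, and `Thread.sleep` stands in for any blocking operation.

```scala
import java.util.concurrent.{Executors, TimeoutException}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingTimeoutSketch {
  // Runs a blocking call on its own pool and gives up waiting after `timeout`.
  def blockingCall(sleepMillis: Long, timeout: FiniteDuration): Either[String, String] = {
    // Dedicated, unbounded pool so blocking never starves the CPU-bound pool.
    val blockingPool = Executors.newCachedThreadPool()
    val blockingEc = ExecutionContext.fromExecutor(blockingPool)
    try {
      val result = Future { Thread.sleep(sleepMillis); "done blocking" }(blockingEc)
      // Await blocks the caller too; it is used here only to keep the demo short.
      try Right(Await.result(result, timeout))
      catch { case _: TimeoutException => Left(s"timed out after $timeout") }
    } finally blockingPool.shutdown()
  }
}
```

Note that the timeout only stops the caller from waiting; the blocked thread keeps sleeping until the operation finishes, which is one more reason to keep such calls away from the main pool.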
Semantic / Asynchronous Blocking
- A Task waits for a certain signal before it can complete
- Doesn't block any threads, so other tasks can execute in the meantime
Semantic / Asynchronous Blocking
val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
implicit val scheduler = Scheduler(ec)
val otherTask = Task.sleep(50.millis) >> Task(println("Running concurrently"))
val t =
  for {
    sem <- Semaphore[Task](0L)
    // start means it will run asynchronously "in the background"
    // and the next step in for comprehension will begin
    _ <- (Task.sleep(100.millis) >>
      Task(println("Releasing semaphore")) >> sem.release).start
    _ <- Task(println("Waiting for permit")) >> sem.acquire
    _ <- Task(println("Done!"))
  } yield ()
(t, otherTask).parTupled.runSyncUnsafe()
// Waiting for permit
// Running concurrently
// Releasing semaphore
// Done!
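The same idea can be sketched without Monix, using a `Promise` as the signal. This is only an analogy to the semaphore example, with hypothetical names (`SemanticBlockingSketch`, `run`): the waiter is registered as a callback, so the pool's single thread stays free for other work until the signal arrives.

```scala
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicReference
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object SemanticBlockingSketch {
  def run(): Vector[String] = {
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    val log = new AtomicReference(Vector.empty[String])
    def record(msg: String): Unit = { log.updateAndGet(_ :+ msg); () }
    try {
      val signal = Promise[Unit]()
      // "Waits" for the signal without occupying the pool's only thread.
      val waiter = signal.future.map(_ => record("waiter: done"))
      // Runs on the single thread even though the waiter has not finished.
      val other = Future(record("other: running"))
      // Completes the signal once `other` is done.
      other.foreach { _ => record("releaser: completing signal"); signal.success(()); () }
      Await.result(waiter, 10.seconds)
      log.get()
    } finally pool.shutdown()
  }
}
```

`SemanticBlockingSketch.run()` returns `Vector("other: running", "releaser: completing signal", "waiter: done")`. Had the waiter blocked the thread instead, the single-threaded pool would have deadlocked.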
Fairness
- Likelihood that different tasks are able to advance
- Without fairness you could get a high disparity in latency between different requests
- Monix and ZIO introduce async boundaries from time to time (configurable)
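A hand-rolled sketch of what "introducing an async boundary from time to time" could look like on a plain single-threaded executor. It does not mirror how Monix or ZIO implement it; `FairnessSketch`, `run` and `batchSize` are illustrative names. Each task re-submits itself to the pool after `batchSize` steps, which gives the other task a turn.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

object FairnessSketch {
  // Two tasks print their id `steps` times on ONE thread,
  // yielding back to the executor after every `batchSize` steps.
  def run(batchSize: Int, steps: Int): String = {
    require(batchSize > 0, "batchSize must be positive")
    val pool = Executors.newSingleThreadExecutor()
    val out = new StringBuffer() // thread-safe buffer for the "output"
    val done = new CountDownLatch(2)

    def loop(id: Int, remaining: Int): Unit =
      pool.execute { () =>
        val n = math.min(batchSize, remaining)
        (1 to n).foreach(_ => out.append(id))
        // Forced async boundary: go back to the end of the executor's queue.
        if (remaining - n > 0) loop(id, remaining - n) else done.countDown()
      }

    try {
      // Enqueue both tasks from inside the pool so their order is deterministic.
      pool.execute(() => { loop(1, steps); loop(2, steps) })
      done.await(10, TimeUnit.SECONDS)
      out.toString
    } finally pool.shutdown()
  }
}
```

`FairnessSketch.run(2, 4)` produces `"11221122"`, while `FairnessSketch.run(4, 4)` produces `"11112222"`: a smaller batch size means more boundaries and more evenly spread progress, at the cost of more scheduling overhead.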
Green Threads and Fibers
Green Thread
A thread scheduled by the VM instead of the OS. Cheap to start, and M green threads can be mapped onto N OS threads.
Fiber
"Lightweight thread". Fibers voluntarily yield control to the scheduler.
Is Task like a Green Thread/Fiber?
- We can map thousands of Tasks to very few OS threads
- Async boundaries give other tasks on the thread pool a chance to execute, very much like cooperative multitasking
- "Blocking" a Task itself is OK since it doesn't block underlying Threads
Thank you!
I'd appreciate any feedback. :)
If you're looking for help/discussion:
https://gitter.im/typelevel/cats-effect
https://gitter.im/functional-streams-for-scala/fs2
https://gitter.im/monix/monix
https://gitter.im/scalaz/scalaz-zio