Sleeping well in the lion's den with Monix Catnap

Piotr Gawryś

About me

  • An open source contributor for fun
  • One of the maintainers of Monix
  • Kraków Scala User Group co-organizer
    (let me know if you'd like to speak!)

https://github.com/Avasil

twitter.com/p_gawrys

Monix

  • Scala / Scala.js library for asynchronous programming
  • Multiple modules exposing Task, Observable, Iterant, Coeval and many concurrency utilities
  • Favors purely functional programming but accommodates other styles as well
  • 2.0.0 released August 31, 2016
  • 3.0.0 released September 11, 2019

Monix Niche

  • Mixed codebases 
  • Reactive Programming
  • Good integration and consistency with Cats ecosystem
  • Performance-sensitive applications
  • Stability

Monix Modules

  • monix-execution - low level concurrency abstractions, companion to scala.concurrent
  • monix-catnap - purely functional abstractions, Cats-Effect friendly
  • monix-eval - Task and Coeval
  • monix-reactive - Observable, functional take on RxObservable
  • monix-tail - Iterant, purely functional pull-based stream
  • monix-bio - bifunctor implementation

Cats-Effect

  • Library which abstracts over different effect type implementations 
  • Opens doors to the entire ecosystem regardless of your choice, e.g. http4s, finch, doobie, fs2, ...

Problem: Limiting parallelism

object SharedResource {
  private val counter = AtomicInt(2)

  def access(i: Int): Unit = {
    if (counter.decrementAndGet() < 0)
      throw new IllegalStateException("counter less than 0")
    Thread.sleep(100)
    counter.increment()
  }
}

implicit val ec = 
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4))

val f: Int => Future[Unit] = i => Future {
  SharedResource.access(i)
}

Await.result(Future.traverse(List(1, 2, 3, 4, 5))(f), 60.second)

Exception in thread "main" java.lang.IllegalStateException: counter less than 0

Semaphore

  • Synchronization primitive
  • A counter is incremented when semaphore's permit is released
  • A counter is decremented when permit is acquired
  • acquire blocks until there is a permit available
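
As a small illustration of the counter semantics listed above, here is a sketch using the JDK's java.util.concurrent.Semaphore. The object name SemaphoreCounterDemo and the method demo are made up for this example. It uses the non-blocking tryAcquire for the third attempt so that no thread is ever parked.

```scala
import java.util.concurrent.Semaphore

// Hypothetical demo object, not part of Monix or the original slides.
object SemaphoreCounterDemo {
  // Returns the permit counts observed after each step,
  // plus the result of trying to take a third permit.
  def demo(): (List[Int], Boolean) = {
    val semaphore = new Semaphore(2)
    val atStart = semaphore.availablePermits()

    semaphore.acquire() // counter is decremented
    val afterFirst = semaphore.availablePermits()

    semaphore.acquire() // counter is decremented again
    val afterSecond = semaphore.availablePermits()

    // No permits left: acquire() would block the thread here,
    // while tryAcquire() only reports that nothing is available.
    val gotThird = semaphore.tryAcquire()

    semaphore.release() // counter is incremented
    val afterRelease = semaphore.availablePermits()

    (List(atStart, afterFirst, afterSecond, afterRelease), gotThird)
  }

  def main(args: Array[String]): Unit = println(demo())
}
```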

java.util.concurrent.Semaphore

implicit val ec = 
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4))

def traverseN(n: Int, list: List[Int])(
  f: Int => Future[Unit]
): Future[List[Unit]] = {
  // java.util.concurrent.Semaphore
  val semaphore = new Semaphore(n)

  Future.traverse(list) { i =>
    val future = Future(semaphore.acquire()).flatMap(_ => f(i))
    future.onComplete(_ => semaphore.release())
    future
  }
}

val f: Int => Future[Unit] = i => Future {
  SharedResource.access(i)
}

Await.result(traverseN(2, List.range(1, 5))(f), Duration.Inf) // works!
Await.result(traverseN(2, List.range(1, 10))(f), Duration.Inf) // hangs forever...

Semantic/Asynchronous blocking

  • Blocks a fiber instead of an underlying thread
  • Can we do it for a Future?
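
To make the idea of blocking without occupying a thread more concrete, here is a minimal sketch that uses only scala.concurrent. The names SemanticBlockingDemo and demo are assumptions for illustration. A Future that waits on a Promise merely registers a callback, so even a single-threaded pool stays free for other work.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

// Hypothetical demo object, not part of Monix or the original slides.
object SemanticBlockingDemo {
  def demo(): (String, String) = {
    val executor = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(executor)
    try {
      // Stands in for a permit that is not available yet.
      val permit = Promise[Unit]()

      // "Waits" for the permit by registering a callback; no thread is parked.
      val waiting: Future[String] = permit.future.map(_ => "got permit")

      // The only pool thread is still free, so unrelated work can complete.
      val other: Future[String] = Future("other work done")
      val otherResult = Await.result(other, 5.seconds)

      // Handing out the permit resumes the waiting computation.
      permit.success(())
      val waitingResult = Await.result(waiting, 5.seconds)

      (otherResult, waitingResult)
    } finally executor.shutdown()
  }

  def main(args: Array[String]): Unit = println(demo())
}
```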

Let's see how!

Acquire

type Listener[A] = Either[Throwable, A] => Unit

private final case class State(
  available: Long,
  awaitPermits: Queue[(Long, Listener[Unit])],
  awaitReleases: List[(Long, Listener[Unit])])
  
def unsafeAcquireN(n: Long, await: Listener[Unit]): Cancelable
  • Check state
  • Are n permits available?
    • NO => add Listener to awaitPermits queue
    • YES => decrement permits and call Listener callback

Acquire Cancelation

type Listener[A] = Either[Throwable, A] => Unit

private final case class State(
  available: Long,
  awaitPermits: Queue[(Long, Listener[Unit])],
  awaitReleases: List[(Long, Listener[Unit])])

def cancelAcquisition(n: Long, isAsync: Boolean): (Listener[Unit] => Unit)
  • Check state
  • Find Listener in awaitPermits and remove it
  • Release n permits

Release

type Listener[A] = Either[Throwable, A] => Unit

private final case class State(
  available: Long,
  awaitPermits: Queue[(Long, Listener[Unit])],
  awaitReleases: List[(Long, Listener[Unit])])
  
def unsafeReleaseN(n: Long): Unit
  • Check state
  • Is anything awaiting permit?
    • NO => add permit, go through awaitReleases
    • YES => go through queue and give permits
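
The acquire and release steps above can be condensed into a much smaller sketch built only on the standard library. This is not the Monix implementation: SimpleAsyncSemaphore is a hypothetical class that hands out one permit per call, keeps waiters as Promises instead of Listener callbacks, and leaves out cancelation and awaitReleases entirely.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec
import scala.collection.immutable.Queue
import scala.concurrent.{Future, Promise}

// Hypothetical, simplified semaphore: one permit per call, no cancelation.
final class SimpleAsyncSemaphore(permits: Int) {
  private case class State(available: Int, waiting: Queue[Promise[Unit]])

  private val state =
    new AtomicReference[State](State(permits, Queue.empty[Promise[Unit]]))

  // Check state: if a permit is available, take it and complete immediately,
  // otherwise enqueue a Promise that release() will complete later.
  @tailrec
  def acquire(): Future[Unit] = {
    val current = state.get()
    if (current.available > 0) {
      val next = current.copy(available = current.available - 1)
      if (state.compareAndSet(current, next)) Future.unit else acquire()
    } else {
      val promise = Promise[Unit]()
      val next = current.copy(waiting = current.waiting.enqueue(promise))
      if (state.compareAndSet(current, next)) promise.future else acquire()
    }
  }

  // Check state: if somebody is waiting, hand the permit to the oldest waiter,
  // otherwise put the permit back into the counter.
  @tailrec
  def release(): Unit = {
    val current = state.get()
    current.waiting.dequeueOption match {
      case Some((promise, rest)) =>
        if (state.compareAndSet(current, current.copy(waiting = rest))) {
          promise.success(())
          ()
        } else release()
      case None =>
        val next = current.copy(available = current.available + 1)
        if (!state.compareAndSet(current, next)) release()
    }
  }

  def available: Int = state.get().available
}
```

With a single permit, the first acquire() returns an already completed Future, the second one stays pending, and it completes as soon as release() is called. No thread is parked while it waits.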

Implementing with Future

type Listener[A] = Either[Throwable, A] => Unit

private final case class State(
  available: Long,
  awaitPermits: Queue[(Long, Listener[Unit])],
  awaitReleases: List[(Long, Listener[Unit])])


def acquireN(n: Long): CancelableFuture[Unit] = {
  if (unsafeTryAcquireN(n)) {
    CancelableFuture.unit
  } else {
    val p = Promise[Unit]()
    unsafeAcquireN(n, Callback.fromPromise(p)) match {
      case Cancelable.empty => CancelableFuture.unit
      case c => CancelableFuture(p.future, c)
    }
  }
}

monix.execution.AsyncSemaphore

implicit val ec = 
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4))

def traverseN(n: Int, list: List[Int])(
  f: Int => Future[Unit]
): Future[List[Unit]] = {
  // monix.execution.AsyncSemaphore
  val semaphore = AsyncSemaphore(n)

  Future.traverse(list) { i =>
    semaphore.withPermit(() => f(i))
  }
}

val f: Int => Future[Unit] = i => Future {
  SharedResource.access(i)
}

Await.result(traverseN(2, List.range(1, 10))(f), Duration.Inf) // works!

object LocalExample extends App with StrictLogging {
  implicit val ec = ExecutionContext.global

  def req(requestId: String, userName: String): Future[Unit] = Future {
    logger.info(s"Received a request to create a user $userName")
    // do sth
  }.flatMap(_ => registerUser(userName))
  
  def registerUser(name: String): Future[Unit] = {
    // business logic
    logger.info(s"Registering a new user named $name")
    Future.unit
  }

  val requests = List(req("1", "Clark"), req("2", "Bruce"), req("3", "Diana"))
  Await.result(Future.sequence(requests), Duration.Inf)
}

Received a request to create a user Bruce
Registering a new user named Bruce
Received a request to create a user Diana
Registering a new user named Diana
Received a request to create a user Clark
Registering a new user named Clark

Problem: Logging Requests

def req(requestId: String, userName: String): Future[Unit] = Future {
  logger.info(s"$requestId: Received a request to create a user $userName")
  // do sth
}.flatMap(_ => registerUser(requestId, userName))

def registerUser(requestId: String, name: String): Future[Unit] = {
  // business logic
  logger.info(s"$requestId: Registering a new user named $name")
  Future.unit
}

3: Received a request to create a user Diana
3: Registering a new user named Diana
1: Received a request to create a user Clark
1: Registering a new user named Clark
2: Received a request to create a user Bruce
2: Registering a new user named Bruce

Logging Requests

logger.info("Logging something.")
MDC.put("requestId", "1")
logger.info("Logging something with MDC.")


: Logging something.
1: Logging something with MDC.

Propagating context with MDC

def req(requestId: String, userName: String): Future[Unit] = Future {
  MDC.put("requestId", requestId)
  logger.info(s"Received a request to create a user $userName")
  // more flatmaps to add async boundaries
}.flatMap(_ => Future(()).flatMap(_ => Future(()))).flatMap(_ => registerUser(userName))

def registerUser(name: String): Future[Unit] = {
  // business logic
  logger.info(s"Registering a new user named $name")
  Future.unit
}

3: Received a request to create a user Diana
2: Received a request to create a user Bruce
1: Received a request to create a user Clark
1: Registering a new user named Clark
2: Registering a new user named Bruce
2: Registering a new user named Diana

MDC and concurrency

monix.execution.misc.Local

  • ThreadLocal with a flexible scope which can be propagated over async boundaries
  • Supports Future and Monix Task
  • Good for context propagation like MDC and OpenTracing without manually passing parameters
  • Quite low level and still has rough edges
  • First version introduced in 2017
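
For background on why a plain ThreadLocal is not enough, here is a minimal sketch using only the JDK. MDC implementations are typically backed by thread-local storage, so they behave similarly. The names ThreadLocalDemo, requestId and demo are made up for this example. A value set on one thread is simply not visible on another thread, which is the gap a Local that follows async boundaries is meant to close.

```scala
// Hypothetical demo object, not part of Monix or the original slides.
object ThreadLocalDemo {
  private val requestId: ThreadLocal[String] = new ThreadLocal[String] {
    override def initialValue(): String = "unset"
  }

  // Returns what the calling thread sees and what a second thread sees.
  def demo(): (String, String) = {
    requestId.set("1")

    var seenByOtherThread = ""
    val other = new Thread(new Runnable {
      // Runs on a different thread, so it reads that thread's own copy.
      def run(): Unit = seenByOtherThread = requestId.get()
    })
    other.start()
    other.join() // join makes the write to seenByOtherThread visible here

    (requestId.get(), seenByOtherThread)
  }

  def main(args: Array[String]): Unit = println(demo())
}
```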

Local Model

  • Local is shared unless told otherwise
  • Needs TracingScheduler for Future
  • TaskLocal is a pure version just for a Task
  • Task is a bit smarter about it and does not always require manual isolation

implicit val s = Scheduler.traced

// from https://github.com/mdedetrich/monix-mdc
MonixMDCAdapter.initialize()

def req(requestId: String, userName: String): Future[Unit] = Local.isolate {
  Future {
    MDC.put("requestId", requestId)
    logger.info(s"Received a request to create a user $userName")
    // more flatmaps to add async boundaries
  }.flatMap(_ => Future(()).flatMap(_ => Future(()))).flatMap(_ => registerUser(userName))
}

1: Received a request to create a user Clark
3: Received a request to create a user Diana
2: Received a request to create a user Bruce
3: Registering a new user named Diana
1: Registering a new user named Clark
2: Registering a new user named Bruce

MDC with Monix Local

Blackbox Asynchronous Code

implicit val ec = Scheduler.traced

val local = Local(0)

def blackbox: Future[Unit] = {
  val p = Promise[Unit]()
  new Thread {
    override def run(): Unit = {
      Thread.sleep(100)
      p.success(())
    }
  }.start()
  p.future
}

val f = Local.isolate {
  for {
    _ <- Future { local := local.get + 100 }
    _ <- blackbox
    _ <- Future { local := local.get + 100 }
  // can print 100 if blackbox is not isolated!
  } yield println(local.get) 
}

Await.result(f, Duration.Inf)

Asynchronous Queue

  • A collection which allows adding elements to one end of the sequence and removing them from the other end
  • Producer is backpressured on offer if a queue is full
  • Consumer is backpressured on poll if a queue is empty
  • Useful for decoupling producer and consumer, distributing work
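
As a rough, thread-based analogue of these properties, the sketch below uses the JDK's java.util.concurrent.ArrayBlockingQueue rather than a Monix queue. The names BoundedQueueDemo and demo are assumptions for illustration. The non-blocking offer and poll variants report a full or empty queue instead of waiting, which marks the points where an asynchronous queue would backpressure the producer or the consumer.

```scala
import java.util.concurrent.ArrayBlockingQueue

// Hypothetical demo object, not part of Monix or the original slides.
object BoundedQueueDemo {
  // Returns the results of three offers into a queue of capacity 2,
  // followed by everything that could be polled afterwards.
  def demo(): (List[Boolean], List[Option[String]]) = {
    val queue = new ArrayBlockingQueue[String](2)

    val offers = List(
      queue.offer("a"), // accepted
      queue.offer("b"), // accepted, queue is now full
      queue.offer("c")  // rejected: a producer would have to wait here
    )

    val polls = List(
      Option(queue.poll()), // oldest element comes out first
      Option(queue.poll()),
      Option(queue.poll())  // queue is empty: a consumer would have to wait here
    )

    (offers, polls)
  }

  def main(args: Array[String]): Unit = println(demo())
}
```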

Monix Queues

  • ConcurrentQueue[F[_], A] - a purely functional asynchronous queue for any Cats-Effect compliant effect
  • AsyncQueue[A] - impure asynchronous queue for scala.concurrent.Future

Example - Fast Producer

implicit val s = Scheduler.singleThread("example-pool")

def consumer(queue: ConcurrentQueue[Task, Int]): Task[Unit] = {
  queue.poll
    .flatMap(i => Task(println(s"Consuming $i")))
    .delayExecution(1.second)
    .loopForever
}
def producer(queue: ConcurrentQueue[Task, Int], n: Int = 0): Task[Unit] = {
  for {
    _ <- queue.offer(n)
    _ <- Task(println(s"Produced $n"))
    _ <- producer(queue, n + 1)
  } yield ()
}

val t =
  for {
    queue <- ConcurrentQueue.bounded[Task, Int](2)
    _     <- consumer(queue).startAndForget
    _     <- producer(queue)
  } yield ()

t.executeAsync.runSyncUnsafe()

== Output ==
Produced 0
Produced 1
Produced 2
Consuming 0
Produced 3
Consuming 1
Produced 4
...

Streaming with Queue

implicit val s = Scheduler.singleThread("example-pool")

def consumer(queue: ConcurrentQueue[Task, Long]): Task[Unit] = {
  Observable
    .repeatEvalF(queue.poll)
    .consumeWith(Consumer.foreachTask(i => Task(println(s"Consumed $i"))))
}

def producer(queue: ConcurrentQueue[Task, Long]): Task[Unit] = {
  Observable.intervalAtFixedRate(1.second)
    .doOnNext(i => queue.offer(i))
    .consumeWith(Consumer.foreachTask(i => Task(println(s"Produced $i"))))
}

val t =
  for {
    queue <- ConcurrentQueue.bounded[Task, Long](2)
    _     <- consumer(queue).startAndForget
    _     <- producer(queue)
  } yield ()

t.executeAsync.runSyncUnsafe()

== Output ==
Produced 0
Consumed 0
Consumed 1
Produced 1
Consumed 2
Produced 2
Consumed 3
...

Other implementations

              MONIX                           FS2                                  ZIO
Effects       Cats-Effect and Future-native   Cats-Effect native                   ZIO-native
API           basic, lacks termination        a lot of different types of queues   adds methods like map, filter, contramap, etc.
Fairness      no                              yes                                  yes
Performance   3 - 10x faster                  baseline                             0.6 - 2x faster

Fairness

implicit val s = Scheduler.global

def consumer(id: Int, queue: ConcurrentQueue[Task, Int]): Task[Unit] = {
  queue.poll
    .flatMap(i => Task(println(s"$id: Consuming $i")))
    .loopForever
}

def producer(id: Int, queue: ConcurrentQueue[Task, Int], n: Int = 0): Task[Unit] = {
  for {
    _ <- queue.offer(n)
    _ <- producer(id, queue, n + 1).delayExecution(1.second)
  } yield ()
}

val t =
  for {
    queue <- ConcurrentQueue.bounded[Task, Int](2)
    _     <- consumer(1, queue).startAndForget
    _     <- consumer(2, queue).startAndForget
    _     <- consumer(3, queue).startAndForget
    _     <- producer(1, queue)
  } yield ()

t.executeAsync.runSyncUnsafe()

Fairness

== Monix Output ==
1: Consuming 0
1: Consuming 1
3: Consuming 2
3: Consuming 3
2: Consuming 4
1: Consuming 5
3: Consuming 6
2: Consuming 7
2: Consuming 8
3: Consuming 9
...
== FS2 Output ==
1: Consuming 0
3: Consuming 1
1: Consuming 2
2: Consuming 3
3: Consuming 4
1: Consuming 5
2: Consuming 6
3: Consuming 7
1: Consuming 8
2: Consuming 9
...

Benchmark Results

[info] Benchmark                                   Mode  Cnt      Score    Error  Units
[info] QueueBackPressureBenchmark.fs2Queue        thrpt   30    541.506 ± 78.125  ops/s
[info] QueueBackPressureBenchmark.monixQueue      thrpt   30   2906.331 ± 64.291  ops/s
[info] QueueBackPressureBenchmark.zioQueue        thrpt   30   1011.617 ± 13.278  ops/s

[info] QueueParallelBenchmark.fs2Queue            thrpt   30   1622.631 ± 21.835  ops/s
[info] QueueParallelBenchmark.monixQueue          thrpt   30   6073.337 ± 75.890  ops/s
[info] QueueParallelBenchmark.zioQueue            thrpt   30   2585.049 ± 76.588  ops/s

[info] QueueSequentialBenchmark.fs2Queue          thrpt   30   2679.532 ± 11.624  ops/s
[info] QueueSequentialBenchmark.monixQueue        thrpt   30  12829.954 ± 61.343  ops/s
[info] QueueSequentialBenchmark.zioQueue          thrpt   30   1685.545 ± 11.178  ops/s

// Source of benchmarks
// https://github.com/zio/zio/tree/master/benchmarks

monix.catnap.ConcurrentChannel

  • Created for the sole purpose of modeling complex producer-consumer scenarios
  • Supports multicasting / broadcasting to multiple consumers and workers
  • Sort of like ConcurrentQueue per Consumer with higher level API which allows termination, waiting on consumers etc.
  • Inspired by Haskell's ConcurrentChannel

monix.catnap.ConcurrentChannel

final class ConcurrentChannel[F[_], E, A] {
  def push(a: A): F[Boolean]
  def pushMany(seq: Iterable[A]): F[Boolean]
  def halt(e: E): F[Unit]
  def consume: Resource[F, ConsumerF[F, E, A]]
  def consumeWithConfig(config: ConsumerF.Config): Resource[F, ConsumerF[F, E, A]]
  def awaitConsumers(n: Int): F[Boolean]
}

trait ConsumerF[F[_], E, A] {
  def pull: F[Either[E, A]]
  def pullMany(minLength: Int, maxLength: Int): F[Either[E, Seq[A]]]
}

Usage Example

def consume(consumerId: Int, consumer: ConsumerF[Task, String, Int]): Task[Unit] = {
  consumer.pull.flatMap {
    case Right(element) =>
      Task(println(s"$consumerId: is processing $element"))
        .flatMap(_ => consume(consumerId, consumer))
    case Left(msg) =>
      Task(println(s"$consumerId: is done with msg: $msg"))
  }
}

val simple: Task[Unit] =
  for {
    channel <- ConcurrentChannel.of[Task, String, Int]
    _ <- channel.consume.use(consume(1, _)).startAndForget
    _ <- channel.consume.use(consume(2, _)).startAndForget
    _ <- channel.consume.use(consume(3, _))
      .delayExecution(100.millis).startAndForget
    _ <- channel.awaitConsumers(2)
    _ <- channel.pushMany(List(1, 2))
    _ <- channel.awaitConsumers(3)
    _ <- channel.push(3)
    _ <- channel.halt("good job!")
  } yield ()

2: is processing 1
1: is processing 1
1: is processing 2
2: is processing 2
3: is processing 3
2: is processing 3
1: is processing 3
3: is done with msg: good job!
2: is done with msg: good job!
1: is done with msg: good job!

def consume(consumerId: Int, workerId: Int, consumer: ConsumerF[Task, String, Int]): Task[Unit] = {
  consumer.pull.flatMap {
    case Right(element) =>
      Task(println(s"Worker $consumerId-$workerId is processing $element"))
        .flatMap(_ => consume(consumerId, workerId, consumer))
    case Left(msg) =>
      Task(println(s"Worker $consumerId-$workerId is done with msg: $msg"))
  }.delayExecution(100.millis)
}

def parallelConsumer(consumerId: Int, workers: Int, consumer: ConsumerF[Task, String, Int]): Task[Unit] = {
  Task.wander(List.range(0, workers))(i => consume(consumerId, i, consumer)).map(_ => ())
}

val app =
  for {
    channel <- ConcurrentChannel.of[Task, String, Int]
    _ <- channel.consume.use(consumer => parallelConsumer(1, 4, consumer)).startAndForget
    _ <- channel.consume.use(consumer => consume(2, 0, consumer)).startAndForget
    _ <- channel.awaitConsumers(2)
    _ <- channel.pushMany(List(1, 2, 3, 4))
    _ <- Task.sleep(1.second)
    _ <- channel.halt("good job!")
  } yield ()

monix.catnap.ConcurrentChannel

Worker 2-0 is processing 1
Worker 1-0 is processing 1
Worker 1-2 is processing 2
Worker 1-1 is processing 3
Worker 1-3 is processing 4
Worker 2-0 is processing 2
Worker 2-0 is processing 3
Worker 2-0 is processing 4
Worker 1-2 is done with msg: good job!
Worker 1-0 is done with msg: good job!
Worker 1-1 is done with msg: good job!
Worker 2-0 is done with msg: good job!
Worker 1-3 is done with msg: good job!

val backpressure: Task[Unit] = for {
  channel <- ConcurrentChannel.of[Task, String, Int]
  customConfig = ConsumerF.Config.default
    .copy(capacity = Some(BufferCapacity.Bounded(2)))
  fiber <- channel.consumeWithConfig(customConfig)
    .use(consumer => consume(1, 1, consumer)).start
  _ <- channel.awaitConsumers(1)
  _ <- Task.traverse(List.range(1, 10))(i => 
      channel.push(i) >> Task(println(s"push($i)"))
    )
  _ <- fiber.join
} yield ()

Backpressure

push(1)
push(2)
push(3)
Worker 1-1 is processing 1
Worker 1-1 is processing 2
push(4)
Worker 1-1 is processing 3
push(5)
push(6)
Worker 1-1 is processing 4
Worker 1-1 is processing 5
push(7)
push(8)
Worker 1-1 is processing 6
Worker 1-1 is processing 7
push(9)
Worker 1-1 is processing 8
Worker 1-1 is processing 9

What about Observable?

def parallelConsumer(consumerId: Int, n: Int): Consumer[Either[String, Int], Unit] = {
  val workers: List[Consumer[Either[String, Int], Unit]] =
    List.range(0, n).map(i => Consumer.foreachTask[Either[String, Int]](consume(consumerId, i, _)))
  Consumer.loadBalance(workers: _*).map(_ => ())
}

val app: Task[Unit] =
  for {
    queue  <- ConcurrentQueue.unbounded[Task, Either[String, Int]]()
    signal <- Semaphore[Task](0)
    _      <-  Observable
          .repeatEvalF(queue.poll)
          .takeWhileInclusive(_.isRight)
          .publishSelector { sharedSource =>
            val c1 = sharedSource
                .doOnSubscribe(signal.release)
                .mapEval(elem => consume(consumerId = 1, workerId = 0, elem))
            val c2 = sharedSource
                .doOnSubscribe(signal.release)
                .consumeWith(parallelConsumer(consumerId = 2, n = 4))
            
            Observable.fromTask(Task.parZip2(c1.completedL, c2))
          }.completedL.startAndForget
    _     <- signal.acquireN(2) // awaitConsumers
    _     <- queue.offerMany(List(1, 2, 3, 4).map(i => Right[String, Int](i)))
    _     <- Task.sleep(1.second)
             // only one worker per consumer will get it
    _     <- queue.offer(Left[String, Int]("good job!")) 
    _     <- Task.sleep(1.second)
  } yield ()

...And there's more!

  • CircuitBreaker, Cancelables, CancelableFuture, Future utils, TestScheduler, Future-based MVar, ...
  • If you have any questions or more ideas, make sure to let us know at https://github.com/monix/monix or https://gitter.im/monix/monix
  • Contributions are very welcome!

Thank you!

Some of the projects worth checking out:

https://github.com/typelevel/cats-effect

https://github.com/functional-streams-for-scala/fs2

https://github.com/monix/monix

https://github.com/zio/zio
