Overload and backpressure

 

GenStage in theory

Thomas Depierre

Twitter: @DianaO

Github: Diana Olympos

  • B2B
  • Intelligence, Events, Advice
  • Unlock the future
  • Big Company ™
    • FTSE 250
    • Founded 1947
  • New organisation : Ascential Makers
  • Elixir for everything in the backend

Audience time

  • Who uses a task queue?
  • Who has had catastrophic failures due to overload?
  • Who uses a pipeline of jobs (web crawling, maybe?) that spans multiple queues?

Losing half of my audience

Little's law

L = λW

  • L: number of tasks in the system
  • λ: arrival rate
  • W: average time spent in the system
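A quick worked example, with made-up numbers chosen only for illustration: if tasks arrive at 50 per second and each spends 2 seconds in the system on average, Little's law gives the number of tasks in the system at any moment.

```latex
L = \lambda W = 50\,\text{tasks/s} \times 2\,\text{s} = 100\,\text{tasks}
```

If overload pushes the average time in the system to 10 seconds while the arrival rate stays the same, the same formula gives 500 tasks sitting in the system, most of them waiting in a queue.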

The problem with task-queues

  • A queue is a buffer

  • It distributes and/or smooths load

  • Queues tend to be used to introduce asynchronicity to a synchronous system...

  • But that creates a problem: how do you handle pathological load?

Queues do not handle overload

  • Backpressure

  • Load-shedding

I am not going to talk about load-shedding, but it is an interesting area

Backpressure

  • Easy way: use a pool and scale horizontally
  • Sadly, workloads where this is enough are really rare.
  • But it can be a good optimization, as with database connections

 

Backpressure

  • Just use calls ™
  • Make it synchronous at your slowest point.
  • Done
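A minimal sketch of the "just use calls" idea, using only the standard library. `SlowWorker` and `process/3` are hypothetical names standing in for whatever the slowest point of your system is; the point is only that `GenServer.call/3` blocks the caller until the worker replies.

```elixir
defmodule SlowWorker do
  use GenServer

  # Hypothetical worker standing in for the slowest point of the system.
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Synchronous API: the caller blocks until the job is done, so it cannot
  # submit work faster than the worker completes it.
  def process(worker, job, timeout \\ 5_000) do
    GenServer.call(worker, {:process, job}, timeout)
  end

  @impl true
  def init(:ok), do: {:ok, 0}

  @impl true
  def handle_call({:process, job}, _from, done) do
    # Simulate slow work.
    Process.sleep(10)
    {:reply, {:ok, job * 2}, done + 1}
  end
end

{:ok, worker} = SlowWorker.start_link()

# Each call returns only after the previous job has been handled.
results = Enum.map(1..3, &SlowWorker.process(worker, &1))
```

If the worker falls too far behind, callers hit the call timeout and exit, which is how the overload becomes visible upstream instead of piling up silently in a mailbox.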

Wait what if...

Decoupling kills...

  • Throttle
  • Unsafe approach: "it is OK, nothing bad will happen, pathological load is rare."

Pull-based control flow

  • Workers ask for work when they have capacity
  • That is how GenStage works
  • This moves the problem to the producers
  • With a large enough batch size, this behaves much like a bounded queue...

Thanks Fred

Basic high level view

[Diagram: one Producer connected to three Consumers]

How does it work?

Consumer side

handle_events(events, from, state)

Producer side

handle_demand(demand, state)

Data flow

GenStage.async_subscribe(stage, opts)
GenStage.sync_subscribe(stage, opts, timeout \\ 5000)
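The three pieces above can be put together in a small sketch. It assumes the `:gen_stage` package is available as a dependency; `CounterProducer` and `ForwardingConsumer` are hypothetical modules written for this illustration, and the consumer simply forwards each batch to another process so the flow can be observed.

```elixir
# Assumes :gen_stage is available, e.g. {:gen_stage, "~> 1.0"} in mix.exs.
defmodule CounterProducer do
  use GenStage

  def start_link(first), do: GenStage.start_link(__MODULE__, first)

  @impl true
  def init(first), do: {:producer, first}

  # Runs only when a consumer asks for events; emits exactly `demand` integers.
  @impl true
  def handle_demand(demand, next) when demand > 0 do
    events = Enum.to_list(next..(next + demand - 1))
    {:noreply, events, next + demand}
  end
end

defmodule ForwardingConsumer do
  use GenStage

  def start_link(notify), do: GenStage.start_link(__MODULE__, notify)

  @impl true
  def init(notify), do: {:consumer, notify}

  # Receives a batch of events; a consumer never emits events itself.
  @impl true
  def handle_events(events, _from, notify) do
    # Simulate slow work, then report the batch to the observing process.
    Process.sleep(100)
    send(notify, {:handled, events})
    {:noreply, [], notify}
  end
end

{:ok, producer} = CounterProducer.start_link(0)
{:ok, consumer} = ForwardingConsumer.start_link(self())

# Demand starts flowing upstream as soon as the subscription is in place.
{:ok, _subscription} =
  GenStage.sync_subscribe(consumer, to: producer, max_demand: 10, min_demand: 5)
```

The producer never runs ahead of the consumer: it only produces what `handle_demand/2` was asked for, so the slow `handle_events/3` sets the pace of the whole pipeline.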


What if I have multiple steps?

[Diagram: one Producer, three ProducerConsumer stages (GenStage's :producer_consumer type), and two Consumers]

Only makes sense if the steps need different workload behaviour...
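A sketch of a middle step, assuming the `:gen_stage` package is available. `Doubler` and `Collector` are hypothetical modules for this illustration, and the producer is built with `GenStage.from_enumerable/2` over an infinite stream to keep the example short.

```elixir
# Assumes :gen_stage is available, e.g. {:gen_stage, "~> 1.0"} in mix.exs.
defmodule Doubler do
  use GenStage

  def start_link, do: GenStage.start_link(__MODULE__, :ok)

  @impl true
  def init(:ok), do: {:producer_consumer, :ok}

  # A :producer_consumer receives events from upstream and emits events
  # downstream; demand from downstream is forwarded upstream for it.
  @impl true
  def handle_events(events, _from, state) do
    {:noreply, Enum.map(events, &(&1 * 2)), state}
  end
end

defmodule Collector do
  use GenStage

  def start_link(notify), do: GenStage.start_link(__MODULE__, notify)

  @impl true
  def init(notify), do: {:consumer, notify}

  @impl true
  def handle_events(events, _from, notify) do
    # Simulate slow work, then report the batch to the observing process.
    Process.sleep(50)
    send(notify, {:collected, events})
    {:noreply, [], notify}
  end
end

{:ok, producer} = GenStage.from_enumerable(Stream.iterate(1, &(&1 + 1)))
{:ok, doubler} = Doubler.start_link()
{:ok, collector} = Collector.start_link(self())

# Subscribe from the bottom up, so demand reaches the producer last.
{:ok, _} = GenStage.sync_subscribe(collector, to: doubler, max_demand: 10)
{:ok, _} = GenStage.sync_subscribe(doubler, to: producer, max_demand: 10)
```

Here the doubling is cheap and could just as well live inside the consumer; a separate stage starts to pay off when the step has its own concurrency or demand settings.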

What if I want one process per event?

[Diagram: one Producer, one ConsumerSupervisor, and two Consumers]
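A sketch of the one-process-per-event setup, assuming the `:gen_stage` package (which ships `ConsumerSupervisor`) is available. `EventWorker` and `PerEventSupervisor` are hypothetical names, and the `:producer` and `:notify` options are inventions of this example, not part of the library.

```elixir
# Assumes :gen_stage is available, e.g. {:gen_stage, "~> 1.0"} in mix.exs.
defmodule EventWorker do
  # ConsumerSupervisor appends the event to the start arguments,
  # so each event gets its own short-lived process.
  def start_link(notify, event) do
    Task.start_link(fn ->
      # Simulate slow work, then report back to the observing process.
      Process.sleep(50)
      send(notify, {:processed, event, self()})
    end)
  end
end

defmodule PerEventSupervisor do
  use ConsumerSupervisor

  def start_link(opts), do: ConsumerSupervisor.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    children = [
      %{
        id: EventWorker,
        start: {EventWorker, :start_link, [opts[:notify]]},
        # Children must be :temporary or :transient.
        restart: :transient
      }
    ]

    # max_demand caps how many workers run at the same time.
    ConsumerSupervisor.init(children,
      strategy: :one_for_one,
      subscribe_to: [{opts[:producer], max_demand: 5}]
    )
  end
end

{:ok, producer} = GenStage.from_enumerable(Stream.iterate(1, &(&1 + 1)))
{:ok, _sup} = PerEventSupervisor.start_link(producer: producer, notify: self())
```

The supervisor asks for more events only as workers finish, so the number of live worker processes stays bounded by `max_demand`.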

Buffering producer side

  • Up to you!
  • There is an example of demand buffering in the GenStage docs.
  • How you buffer produced events depends on what your events are
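A sketch of one way to buffer on the producer side, closely modelled on the queue-based example in the GenStage documentation and assuming the `:gen_stage` package is available. `BufferingProducer`, `push/3` and `Forwarder` are hypothetical names. The producer keeps both a queue of pushed events and a count of unmet demand, and a push only returns once a consumer has asked for the event.

```elixir
# Assumes :gen_stage is available, e.g. {:gen_stage, "~> 1.0"} in mix.exs.
defmodule BufferingProducer do
  use GenStage

  def start_link, do: GenStage.start_link(__MODULE__, :ok)

  # Blocks the caller until a consumer has demanded the event.
  def push(producer, event, timeout \\ 5_000) do
    GenStage.call(producer, {:push, event}, timeout)
  end

  @impl true
  def init(:ok), do: {:producer, {:queue.new(), 0}}

  @impl true
  def handle_call({:push, event}, from, {queue, pending}) do
    dispatch(:queue.in({from, event}, queue), pending, [])
  end

  @impl true
  def handle_demand(incoming, {queue, pending}) do
    dispatch(queue, incoming + pending, [])
  end

  # No demand left: emit what we gathered and keep the rest queued.
  defp dispatch(queue, 0, events), do: {:noreply, Enum.reverse(events), {queue, 0}}

  defp dispatch(queue, demand, events) do
    case :queue.out(queue) do
      {{:value, {from, event}}, queue} ->
        # Unblock the pusher now that its event is being delivered.
        GenStage.reply(from, :ok)
        dispatch(queue, demand - 1, [event | events])

      {:empty, queue} ->
        # No events left: remember the unmet demand for later pushes.
        {:noreply, Enum.reverse(events), {queue, demand}}
    end
  end
end

defmodule Forwarder do
  use GenStage

  def start_link(notify), do: GenStage.start_link(__MODULE__, notify)

  @impl true
  def init(notify), do: {:consumer, notify}

  @impl true
  def handle_events(events, _from, notify) do
    send(notify, {:handled, events})
    {:noreply, [], notify}
  end
end

{:ok, producer} = BufferingProducer.start_link()
{:ok, consumer} = Forwarder.start_link(self())
{:ok, _} = GenStage.sync_subscribe(consumer, to: producer, max_demand: 4)
push_result = BufferingProducer.push(producer, :hello)
```

Because `push/3` is a call, the backpressure reaches whoever feeds the producer: with no demand outstanding, pushers wait (and eventually time out) instead of growing an unbounded buffer.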

Optimisations

  • A couple of possibilities
  • Needs experimentation with your workload and runtime.
  • Which stages you subscribe to
  • :min_demand
  • :max_demand
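A sketch of where these knobs live, assuming the `:gen_stage` package is available. `TunedConsumer` and its `:producer` and `:notify` options are hypothetical; `:min_demand` and `:max_demand` are the real subscription options. The values below are arbitrary starting points, not recommendations.

```elixir
# Assumes :gen_stage is available, e.g. {:gen_stage, "~> 1.0"} in mix.exs.
defmodule TunedConsumer do
  use GenStage

  def start_link(opts), do: GenStage.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    # max_demand: the most events this consumer will have in flight.
    # min_demand: the threshold below which it asks the producer for more.
    subscription =
      {opts[:producer], min_demand: opts[:min_demand], max_demand: opts[:max_demand]}

    {:consumer, opts[:notify], subscribe_to: [subscription]}
  end

  @impl true
  def handle_events(events, _from, notify) do
    # Simulate slow work, then report the batch size so it can be observed.
    Process.sleep(50)
    send(notify, {:batch, length(events)})
    {:noreply, [], notify}
  end
end

{:ok, producer} = GenStage.from_enumerable(Stream.iterate(1, &(&1 + 1)))

{:ok, _consumer} =
  TunedConsumer.start_link(producer: producer, notify: self(), min_demand: 1, max_demand: 4)
```

A small `max_demand` keeps little work in flight per consumer, while a larger one amortises message passing over bigger batches; which side wins depends on your workload, hence the need to measure.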

Stay for more on that subject with Evadne.

Thank You

Questions?