FP and Event Based Systems

Hamish Dickson - LambdAle 2018

About me

  • I'm Hamish and I'm an FPer
  • First time talking at an FP conference
  • Scala Engineer at DriveTribe
  • Data Things/Machine Learning
  • I tweet about my cat a lot @_mishy

Distributed Systems

CAP Theorem

Partition

Tolerance

Consistency

Availability

CP

AP

Stay up if you start losing nodes

Same data on every node

Always get a response

up to date or error

always get a response, but could be old data

But why?

  • Need to send messages between nodes to decide on truth
  • Networks are rubbish, messages get:
    • delayed
    • dropped
    • out of order
    • duplicated

Most Distributed Systems are AP

  • Erroring on CP is scary
  • We don't often need strong consistency
  • AP systems can be designed to be eventually consistent

Eventual Consistency

  • We can provably make data in an AP system eventually consistent across nodes
  • Most common approach: Conflict-free Replicated Data Types (CRDTs)

CRDTs

  • Basically just let network/whatever failures happen
  • Fix conflicts later
  • Each node can do this fixup process independently

General Process:

  • Associative
  • Commutative
  • Idempotent
  • Distributive(?!)
a \bigoplus (b \bigoplus c) = (a \bigoplus b) \bigoplus c
a \bigoplus b = b \bigoplus a
a \bigoplus a = a

ACID 2.0

To do this we need some kind of "merge"

Made up to spell "ACID"

That's a semilattice!

This sounds like some FP thing

Let's go to our favourite Algebra website

Implementations

First attempt: (Int, +)

1 + (2 + 3) = (1 + 2) + 3
1 + 2 = 2 + 1
1 + 1 ≠ 1

Second attempt: (Set[T], ∪)

(\{1\} ∪ \{2\}) ∪ \{3\} = \{1\} ∪ (\{2\} ∪ \{3\})
\{1\} ∪ \{2\} = \{2\} ∪ \{1\}
\{1\} ∪ \{1\} = \{1\}
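
The two attempts above can be spot-checked in code. This is a minimal sketch, assuming a hand-rolled `Semilattice` trait and a hypothetical `lawsHold` helper (neither is from the talk); it tests the three laws on sample values only, so it is an illustration rather than a proof.

```scala
object LawCheck {
  // Hand-rolled type class, assumed for illustration only.
  trait Semilattice[A] { def combine(x: A, y: A): A }

  // Hypothetical helper: spot-checks the three laws on the given sample values.
  def lawsHold[A](a: A, b: A, c: A)(s: Semilattice[A]): Boolean = {
    val associative = s.combine(a, s.combine(b, c)) == s.combine(s.combine(a, b), c)
    val commutative = s.combine(a, b) == s.combine(b, a)
    val idempotent  = s.combine(a, a) == a
    associative && commutative && idempotent
  }

  // First attempt: (Int, +)
  val intAddition: Semilattice[Int] = new Semilattice[Int] {
    def combine(x: Int, y: Int): Int = x + y
  }

  // Second attempt: (Set[T], union)
  def setUnion[T]: Semilattice[Set[T]] = new Semilattice[Set[T]] {
    def combine(x: Set[T], y: Set[T]): Set[T] = x union y
  }

  // Addition is associative and commutative but not idempotent (1 + 1 != 1).
  val additionOk: Boolean = lawsHold(1, 2, 3)(intAddition)
  // Set union satisfies all three laws on these samples.
  val unionOk: Boolean = lawsHold(Set(1), Set(2), Set(3))(setUnion[Int])
}
```

Here `additionOk` is false because the idempotence check fails, while `unionOk` is true.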

In reality you probably want to use a HyperLogLog or something

Example

  • Alice makes some change
  • sends change to other nodes
  • nodes get updated

*For legal reasons not based on any Pete in this room.. maybe

  • Bob makes a change
  • Alice gets update
  • Something happens and Drunk Pete never gets the message
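
The scenario above can be sketched with replicas whose state merges by set union. The `Replica` class and the change strings are hypothetical, and the late redelivery to Pete is an assumption added to show how the states converge once the message does arrive, even out of order and duplicated.

```scala
object ReplicaDemo {
  // Hypothetical state-based replica: state is a set of changes, merged by union.
  final case class Replica(name: String, state: Set[String]) {
    def receive(update: Set[String]): Replica = copy(state = state union update)
  }

  val aliceChange: Set[String] = Set("alice: edit title")
  val bobChange: Set[String]   = Set("bob: fix typo")

  // Alice sees both changes in order.
  val alice: Replica = Replica("alice", Set.empty).receive(aliceChange).receive(bobChange)

  // Pete never gets Bob's message, so he is behind.
  val peteBehind: Replica = Replica("pete", Set.empty).receive(aliceChange)

  // Assumed redelivery: Bob's change arrives late, then both arrive again as duplicates.
  val peteCaughtUp: Replica =
    peteBehind.receive(bobChange).receive(aliceChange).receive(bobChange)
}
```

Because union is associative, commutative and idempotent, `peteCaughtUp.state` equals `alice.state` no matter how the late messages were ordered or repeated.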

DRIVETRIBE

Rest API

alice likes

Likes = 721
Likes = 721
Likes = 720

Cool right?

  • Has the same problems as the Google Docs example
  • Because of networks innit
  • Has the same fix (Semilattices, because FP conference)

This thing still needs to resolve conflicts

Shut up and show me code

Counting likes

  • Let's try to count likes as they come in
  • Want the count to live on some PostStats model
  • Like is just postId, userId pair
case class PostStats(
  id: Id[Post],
  likeCounter: Set[Like],
  impressionCounter: Set[Impression]
)
case class Like(
  postId: Id[Post],
  userId: Id[User]
)

Here's the plan

  • read Like event out of queue
  • map Like event to dummy PostStats thing
  • use Semilattice to combine with what we already have for that post
Like(Id("post-1"), Id("bob"))
PostStats(
  Id("post-1"),
  Set(Like(Id("post-1"), Id("bob"))),
  Set()
)
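
The three steps of the plan can be sketched end to end. Everything here is an assumption for illustration: `Id`, `Post`, `User` and `Impression` are stand-in definitions, `toStats`, `merge` and `process` are hypothetical names, and a plain `Seq` stands in for the queue.

```scala
object LikePipeline {
  // Stand-in definitions, assumed for this sketch; the real models are not shown.
  final case class Id[A](value: String)
  trait Post
  trait User
  final case class Like(postId: Id[Post], userId: Id[User])
  final case class Impression(postId: Id[Post], userId: Id[User])
  final case class PostStats(
    id: Id[Post],
    likeCounter: Set[Like],
    impressionCounter: Set[Impression]
  )

  // Step 2: lift one Like event into a dummy PostStats holding just that like.
  def toStats(like: Like): PostStats =
    PostStats(like.postId, Set(like), Set.empty)

  // Step 3: merge with what we already have. Assumes both sides are for the same post.
  def merge(a: PostStats, b: PostStats): PostStats =
    PostStats(
      a.id,
      a.likeCounter union b.likeCounter,
      a.impressionCounter union b.impressionCounter
    )

  // Step 1 is simulated: the "queue" is just a sequence of events.
  def process(current: PostStats, events: Seq[Like]): PostStats =
    events.map(toStats).foldLeft(current)(merge)

  val post: Id[Post] = Id[Post]("post-1")
  val bob: Like      = Like(post, Id[User]("bob"))
  val alice: Like    = Like(post, Id[User]("alice"))

  // Bob's like is delivered twice but is only counted once.
  val result: PostStats = process(PostStats(post, Set.empty, Set.empty), Seq(bob, alice, bob))
}
```

The like count is then `result.likeCounter.size`, which stays at 2 even though three events were processed.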

likes events from bob

code is going to live in our stream processor

going to cheat and only consider one post

We need to find a Semilattice

case class PostStats(
  id: Id[Post],
  likeCounter: Set[Like],
  impressionCounter: Set[Impression]
)
  • Typically encode type classes with implicits in Scala
  • just need to have an implicit for type `Semilattice[PostStats]` in scope
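
A minimal sketch of that encoding, assuming a hand-rolled `Semilattice` trait and a hypothetical `SemilatticeOps` wrapper that supplies the `|+|` operator (libraries such as cats ship their own versions of both):

```scala
object SyntaxSketch {
  // Hand-rolled type class, assumed for illustration only.
  trait Semilattice[A] { def combine(x: A, y: A): A }

  // The instance: found by the compiler whenever a Semilattice[Set[T]] is required.
  implicit def setSemilattice[T]: Semilattice[Set[T]] = new Semilattice[Set[T]] {
    def combine(x: Set[T], y: Set[T]): Set[T] = x union y
  }

  // Hypothetical syntax wrapper: adds |+| to any type with an instance in scope.
  implicit class SemilatticeOps[A](x: A) {
    def |+|(y: A)(implicit s: Semilattice[A]): A = s.combine(x, y)
  }

  // The compiler resolves Semilattice[Set[String]] through setSemilattice.
  val merged: Set[String] = Set("bob") |+| Set("alice", "bob")
}
```

With no instance in scope for a type, the `|+|` call simply fails to compile, which is the property the derivation later relies on.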

How do we define a Semilattice?

case class PostStats(
  id: Id[Post],
  likeCounter: Set[Like],
  impressionCounter: Set[Impression]
)

and this is the same thing

we can make one for this

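// Assumes cats: Semilattice from cats.kernel, |+| from cats.syntax.semigroup
import cats.kernel.Semilattice
import cats.syntax.semigroup._

implicit def setSemilattice[T]: Semilattice[Set[T]] = new Semilattice[Set[T]] {
  def combine(s1: Set[T], s2: Set[T]): Set[T] =
    s1 ++ s2
}
implicit val postStatsSemilattice: Semilattice[PostStats] = new Semilattice[PostStats] {
  def combine(ps1: PostStats, ps2: PostStats): PostStats =
    PostStats(
      ps1.id |+| ps2.id, // needs a Semilattice[Id[Post]] in scope
      ps1.likeCounter |+| ps2.likeCounter,
      ps1.impressionCounter |+| ps2.impressionCounter
    )
}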

We have a way to do this, but it's kind of off topic

And this actually works

  • if everything in your case class has a Semilattice, so does your case class*
  • you have a CRDT
  • you have a happy life

Can we generalise this?

  • We didn't know how to build a Semilattice for PostStats
  • We did know how to build one for each of its elements
  • We combined them
Id[Post] :: Set[Like] :: Set[Impression] :: HNil

Shapeless can do this for us

PostStats
  • We can break up this type into a list of its elements
  • Then traverse the list at compile time finding our Semilatti(ces?)
  • Get Shapeless to build the final Semilattice
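
The same induction can be sketched without Shapeless by using nested pairs in place of an HList, with `Unit` playing the role of `HNil`. This is a simplified stand-in under that assumption, not what Shapeless does internally; the `Semilattice` trait and the `Stats` type are hand-rolled for the example.

```scala
object InductionSketch {
  // Hand-rolled type class, assumed for illustration only.
  trait Semilattice[A] { def combine(x: A, y: A): A }

  implicit def setSemilattice[T]: Semilattice[Set[T]] = new Semilattice[Set[T]] {
    def combine(x: Set[T], y: Set[T]): Set[T] = x union y
  }

  // Base case: the empty list (Unit stands in for HNil).
  implicit val unitSemilattice: Semilattice[Unit] = new Semilattice[Unit] {
    def combine(x: Unit, y: Unit): Unit = ()
  }

  // Inductive step: an instance for the head plus one for the tail gives one for the pair.
  implicit def pairSemilattice[H, T](
    implicit h: Semilattice[H],
    t: Semilattice[T]
  ): Semilattice[(H, T)] = new Semilattice[(H, T)] {
    def combine(x: (H, T), y: (H, T)): (H, T) =
      (h.combine(x._1, y._1), t.combine(x._2, y._2))
  }

  // A "list" of two sets, shaped like Set[String] :: Set[Int] :: HNil.
  type Stats = (Set[String], (Set[Int], Unit))

  // The compiler assembles this instance at compile time from the pieces above.
  val derived: Semilattice[Stats] = implicitly[Semilattice[Stats]]

  val merged: Stats =
    derived.combine((Set("bob"), (Set(1), ())), (Set("alice"), (Set(1, 2), ())))
}
```

If any element of the nested pair lacked an instance, `implicitly` would fail at compile time, which is the "no soup for you" case.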

Inductively traverse HList

Id[Post] :: Set[Like] :: Set[Impression] :: HNil

but I can find one for this

wonder if I can find one for this?

Set[Like] :: Set[Impression] :: HNil

oh I found one here!

and here!

Hey compiler, find me a Semilattice for this:

PostStats

no soup for you!

Set[Impression] :: HNil

we need to define this

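// Assumed imports; the slides do not show them
import cats.kernel.Semilattice
import shapeless.{::, Generic, HList, HNil, Lazy}

object AutoSemilattice {

  // Base case: the empty HList
  implicit def autoSemilatticeHNil: Semilattice[HNil] = new Semilattice[HNil] {
    override def combine(x: HNil, y: HNil): HNil = HNil
  }

  // Inductive step: combine the heads, then recurse on the tails
  implicit def autoSemilatticeHCons[H, L <: HList](
    implicit headSemilattice: Lazy[Semilattice[H]],
    tailSemilattice: Lazy[Semilattice[L]]
  ): Semilattice[H :: L] = new Semilattice[H :: L] {
    override def combine(x: H :: L, y: H :: L): H :: L =
      headSemilattice.value.combine(x.head, y.head) :: tailSemilattice.value.combine(x.tail, y.tail)
  }

  // Convert T to its HList representation, combine there, convert back
  implicit def autoSemilattice[T, Repr](
    implicit generic: Generic.Aux[T, Repr],
    genericSemilattice: Lazy[Semilattice[Repr]]
  ): Semilattice[T] = new Semilattice[T] {
    override def combine(x: T, y: T): T =
      generic.from(genericSemilattice.value.combine(generic.to(x), generic.to(y)))
  }
}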

Here, have code

Turn our type into an HList

Last bit

Find Semilattice for Head, then Tail

Beer?

Or twitter me for cat pics @_mishy

CRDTs and Semilattices

By Hamish Dickson