DDD/CQRS/ES with Akka - case study


 


Andrzej Dębski,  Bartłomiej Szczepanik

AGENDA

  1. Who are we? Why are we here?
  2. CQRS/ES introduction
  3. Akka toolkit introduction
  4. Implementation overview
  5. Main challenge
  6. Thoughts, lessons learned
  7. Future work




who? why?

 

our initial background

    • AGH I@IET students
    • Almost no previous Scala & Akka experience
    • Only theoretical knowledge about 
      distributed systems, DDD, NoSQL, scalability
    • Ready to dive into the topic

    thesis background

    • PaaSage EU project
    • Lufthansa Systems proposed the thesis
    • Diversity of initial goals:
      • PaaSage - industrial scalability use case
      • Lufthansa - evaluate Scala/Akka stack
      • Ours - learn a lot, try DDD/CQRS
    • Pivoting several times
    • Final thesis goal: CQRS/ES/DDD scalability evaluation

    project timeline

    • 11/2013 - Master thesis proposal
    • 02/2014 - Started working on the prototype
    • 09/2014 - Presentation for PaaSage technical board 
    • 11/2014 - Finished most important parts of the app
    • now - Writing a paper and assembling the thesis
    • now - Polishing, publishing the source code

    today we would like to

    • Demonstrate CQRS/ES concepts
    • Show that CQRS/ES scales up/out
    • Discuss Distributed DDD (DDDD)
    • Get external feedback
    • Share our thoughts and lessons learned
    • Find people interested in using scalable CQRS/ES




    CQRS/ES

     

    Command-query responsibility segregation (CQRS)



    CQRS advantages

    • Different databases on read and write side
    • Separation of different query use cases
    • Database tailored to the use case
    • Lower latency of queries


    CQRS DISADVANTAGES

    • Eventual consistency (?)
    • Possible code duplication
    • More components to maintain
    • Uniqueness constraint problem
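
The split can be sketched in plain Scala, without Akka. All names here (`AddRotation`, `RotationAdded`, `WriteSide`, `ReadSide`) are hypothetical illustrations rather than code from the project. The write side accepts commands and emits events; the read side folds those events into a model shaped for one query.

```scala
// Minimal CQRS sketch; every name below is hypothetical, not project code.
object CqrsSketch {
  // Commands express intent; events record what actually happened.
  final case class AddRotation(rotationId: String, airplaneId: String)
  final case class RotationAdded(rotationId: String, airplaneId: String)

  // Write side: validates commands and emits events. It answers no queries.
  final class WriteSide {
    private var known = Set.empty[String]

    def handle(cmd: AddRotation): Either[String, RotationAdded] =
      if (known.contains(cmd.rotationId))
        Left(s"rotation ${cmd.rotationId} already exists")
      else {
        known = known + cmd.rotationId
        Right(RotationAdded(cmd.rotationId, cmd.airplaneId))
      }
  }

  // Read side: a denormalized view tailored to one query (rotations per airplane).
  final class ReadSide {
    private var byAirplane = Map.empty[String, Set[String]]

    def project(evt: RotationAdded): Unit = {
      val current = byAirplane.getOrElse(evt.airplaneId, Set.empty[String])
      byAirplane = byAirplane.updated(evt.airplaneId, current + evt.rotationId)
    }

    def rotationsOf(airplaneId: String): Set[String] =
      byAirplane.getOrElse(airplaneId, Set.empty[String])
  }
}
```

Until `project` has been called for an event, the read side does not see it. That gap is the eventual consistency listed among the disadvantages.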

    event sourcing


    ES advantages

    • Full event log for free
    • Append-only is enough
    • Makes it possible to add more CQRS read models in the future
    • No object-relational impedance mismatch
    • Event storming model maps 1:1 to ES

    ES DISADVANTAGES

    • Requires a fine-grained model to perform well
    • Performance issues after some time (snapshots help)
    • Upcasting needed when the event format changes
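
A minimal sketch of the idea, assuming a hypothetical rotation/leg event model (none of these names come from the project): the current state is a left fold over the event log, and a snapshot only shortens the fold.

```scala
// Event-sourcing sketch; the event and state types are hypothetical.
object EventSourcingSketch {
  sealed trait Event
  final case class LegAdded(legId: String) extends Event
  final case class LegRemoved(legId: String) extends Event

  final case class RotationState(legs: Vector[String] = Vector.empty) {
    def updated(event: Event): RotationState = event match {
      case LegAdded(id)   => copy(legs = legs :+ id)
      case LegRemoved(id) => copy(legs = legs.filterNot(_ == id))
    }
  }

  // A snapshot is a state plus the number of events already folded into it.
  final case class Snapshot(state: RotationState, eventCount: Int)

  // Full recovery: the current state is a left fold over the whole log.
  def replay(log: Seq[Event]): RotationState =
    log.foldLeft(RotationState())((state, event) => state.updated(event))

  // Recovery from a snapshot: fold only the events recorded after it.
  def replayFrom(snapshot: Snapshot, log: Seq[Event]): RotationState =
    log.drop(snapshot.eventCount)
      .foldLeft(snapshot.state)((state, event) => state.updated(event))
}
```

Both recovery paths give the same state; the snapshot variant simply touches fewer events, which is why snapshots help once the log grows.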

    CQRS+ES available solutions




    AKKA

     

    • A toolkit, not a framework
    • Distributed by design
    • Scala and Java API
    • Message passing style
    • Actor concurrency model
    • Supervision hierarchy (let it crash)
    • Location transparency

    Actor model

    • Active object flavour
    • ActorRef, location transparency
    • Message passing
    • Mailbox - message queue
    • Unit of concurrency
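    • Hundreds of thousands of instances

    import akka.actor.{Actor, ActorSystem, Props}
    import akka.pattern.ask
    import akka.util.Timeout
    import scala.concurrent.duration._

    class MyActor(magicNumber: Int) extends Actor {
      // Replies to the sender with the received number plus magicNumber
      def receive = {
        case x: Int => sender() ! (x + magicNumber)
      }
    }

    val system = ActorSystem("mySystem")
    // MyActor has a constructor argument, so Props[MyActor] would not work;
    // 42 is an arbitrary example value
    val myActor = system.actorOf(Props(new MyActor(42)), "myactor2")
    implicit val timeout: Timeout = Timeout(5.seconds) // required by ask (?)

    myActor ! 97 // tell: fire-and-forget, no sender to reply to here
    val futureResponse = (myActor ? 2).mapTo[Int] // ask: reply arrives as a Future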

    akka routers

    • Pools and groups
    • Round robin routing
    • Consistent hashing routing
    • Broadcast routing
    • Balancing routing

    akka.actor.deployment {
      /parent/router3 {
        router = round-robin-group
        routees.paths = ["/user/workers/w1", "/user/workers/w2", "/user/workers/w3"]
      }
    }
    
    val router3: ActorRef = context.actorOf(FromConfig.props(), "router3")
    router3 ! Work()

    akka modules

    • Akka streams (Rx)
    • Akka HTTP
    • Akka clustering
    • Distributed Publish Subscribe in Cluster
    • Akka cluster sharding
    • Akka persistence

    akka persistence

    • Persistence mechanism for actors
    • Based on Command/Event Sourcing concept
    • PersistentActor
      • defines "persistenceId"
      • persists messages/events bound to the id
      • when created, all correlated messages are replayed
    • Variety of journal plugins
    • Views
    • Snapshots
    • No support for CQRS and upcasting

    AKKA Persistence

    class ExamplePersistentActor extends PersistentActor {
      override def persistenceId = "sample-id-1"
     
      var state = ExampleState()
      def updateState(event: Evt): Unit = { state = state.updated(event) }
     
      def numEvents = state.size
    
    
      val receiveRecover: Receive = {
        case evt: Evt                                 => updateState(evt)
        case SnapshotOffer(_, snapshot: ExampleState) => state = snapshot
      }
            
      val receiveCommand: Receive = {
        case Cmd(data) =>
          persist(Evt(s"${data}-${numEvents}"))(updateState)
          persist(Evt(s"${data}-${numEvents + 1}")) { event =>
            updateState(event)
            context.system.eventStream.publish(event)
          }
        case "snap"  => saveSnapshot(state)
        case "print" => println(state)
      } 
    }

    akka clustering

    • P2P (gossip) clustering protocol
    • Membership service
    • Automatic failure detection
    • Cluster-aware routers
    • JMX metrics

    class SimpleClusterListener extends Actor with ActorLogging {
      val cluster = Cluster(context.system)
     
      override def preStart(): Unit = cluster.subscribe(
          self, initialStateMode = InitialStateAsEvents, classOf[MemberEvent])
      override def postStop(): Unit = cluster.unsubscribe(self)
     
      def receive = {
        case MemberUp(member) => log.info("Member is Up: {}", member.address)
        case MemberRemoved(member, previousStatus) =>
          log.info("Member is Removed: {} after {}", member.address, previousStatus)
        case _: MemberEvent => // ignore
      }
    }
    

    akka cluster sharding

    • Shards of stateful actors
    • ShardRegion and ShardCoordinator services
    • Rebalancing shards using akka-persistence
    • Passivation of actors

    // The extractors are defined first, so that start() does not
    // reference them before their definition
    val idExtractor: ShardRegion.IdExtractor = {
      case EntryEnvelope(id, payload) ⇒ (id.toString, payload)
      case msg @ Get(id)              ⇒ (id.toString, msg)
    }

    val shardResolver: ShardRegion.ShardResolver = msg ⇒ msg match {
      case EntryEnvelope(id, _) ⇒ (id % 10).toString
      case Get(id)              ⇒ (id % 10).toString
    }

    ClusterSharding(system).start(
      typeName = "Counter",
      entryProps = Some(Props[Counter]),
      idExtractor = idExtractor,
      shardResolver = shardResolver)
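
The shard resolver above maps numeric ids with `id % 10`. For non-numeric identifiers, a comparable mapping can be sketched in plain Scala as follows; `numberOfShards` and `shardOf` are hypothetical names, and this is only one possible way to derive a shard id.

```scala
// Sketch of a shard-id function; names and the shard count are assumptions.
object ShardIdSketch {
  val numberOfShards = 10

  // Math.floorMod keeps the result in [0, numberOfShards) even when
  // hashCode is negative, which a plain `%` would not.
  def shardOf(entityId: String): String =
    Math.floorMod(entityId.hashCode, numberOfShards).toString
}
```

The same entity id always lands in the same shard, so the mapping stays stable as long as `numberOfShards` does not change.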
    




    implementation

     

    UBIQUITOUS LANGUAGE

    • An airplane is assigned to a rotation. 
    • A rotation consists of legs.
    • A leg is a directed relocation of an airplane
      between two airports on a given date.

    • A flight consists of legs and has a flight designator.
    • Each airport defines a standard ground time, which is the minimum time that an airplane has to spend on the ground between consecutive legs.
    • One can check whether all legs in a rotation satisfy the continuity property, standard ground times are not violated, and flight numbers are not duplicated.
    • Schedule can be imported from SSIM file (industry standard)
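
The language above translates fairly directly into case classes. The sketch below is an illustration only: field names and violation types are hypothetical, legs are assumed to be ordered by departure, and the flight-number check is left out because the slides do not spell out its exact rule.

```scala
import java.time.{Duration, LocalDateTime}

// Sketch of the rotation checks; field names and violation types are hypothetical.
object RotationCheckSketch {
  final case class Leg(origin: String, destination: String,
                       departure: LocalDateTime, arrival: LocalDateTime)

  sealed trait Violation
  final case class BrokenContinuity(arrivedAt: String, departsFrom: String)
      extends Violation
  final case class GroundTimeTooShort(airport: String, actual: Duration,
                                      required: Duration) extends Violation

  // `standardGroundTime` maps an airport code to its minimum time on the ground.
  // Airports missing from the map are assumed to impose no minimum.
  def check(legs: Seq[Leg],
            standardGroundTime: Map[String, Duration]): Seq[Violation] =
    legs.zip(legs.drop(1)).flatMap { case (prev, next) =>
      if (prev.destination != next.origin)
        Seq[Violation](BrokenContinuity(prev.destination, next.origin))
      else {
        val actual   = Duration.between(prev.arrival, next.departure)
        val required = standardGroundTime.getOrElse(prev.destination, Duration.ZERO)
        if (actual.compareTo(required) < 0)
          Seq[Violation](GroundTimeTooShort(prev.destination, actual, required))
        else Seq.empty[Violation]
      }
    }
}
```

An empty result means every pair of consecutive legs connects at the same airport and respects that airport's standard ground time.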

    CONTINUITY CHECK EXAMPLE


    technology stack

    APPLICATION ARCHITECTURE


    WRITE model design

    • Airplane and Rotation aggregates
    • (Persistent) Actor = Aggregate
    • Scala case classes = Value Objects, Events, Commands
    • Publishing domain events, e.g. RotationAdded
    • Separated domain from infrastructural concerns
    • Hexagonal architecture
    • REST API

    read model design

    • Graph-oriented database (Neo4j)
    • Denormalization of events from event bus
    • REST API

    READ MODEL Scalability

    • Simple replication of read model instances
    • Round robin load balancing
    • Took advantage of the replayability

    WRITE MODEL SCALABILITY

    • Aggregate roots sharding and rebalancing
    • Round robin routers as load balancers
    • Scalable event store - Cassandra

    TOOLS


    • Ansible
    • Kamon
    • Zipkin
    • Akka tracing
    • Gatling
    • Akka clustering JMX
    • Visual VM
    • SBT






    replayable event bus

    main challenge


    distributed event bus

    • Message delivery is not an issue
      • DistributedPubSub Akka extension
      • ZeroMQ
      • RabbitMQ
    • We need to replay events from the past

    event bus #1

    Akka Persistence Views

    • Views can replay only events for a single persistent actor
    • Views are polling the event store
    • This will change in 2015 Q3 (see Akka Roadmap) [21]

    EVENT BUS #2

    Apache Kafka as an event store

    Nearly perfect solution! But...

    Kafka was not designed with this use case in mind: [51]
    • Retention time is usually 1-14 days
    • The maximum number of partitions is far too low
    • Designed mainly for log processing


    EVENT BUS #3

    Kafka + Cassandra tandem

    1. Replay past events from Cassandra
    2. Subscribe to Kafka

    We can miss events! 



    EVENT BUS #3

    Kafka + Cassandra tandem

    • We leveraged Kafka durability
    • Kafka retention time set to X (e.g. 24h)
    • Subscribing from scratch in Kafka after Cassandra replay
    • Filtering duplicate events
    • If replay takes less than X we won't miss any event
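
The duplicate filtering can be sketched as follows. This is an assumption-laden illustration, not the project's code: `Event`, `persistenceId` and `sequenceNr` are assumed names, sequence numbers are assumed to start at 1 and to arrive in order per entity, and plain collections stand in for the Cassandra replay and the Kafka topic.

```scala
// Sketch of the duplicate filter used when switching from replay to live events.
object ReplayDedupSketch {
  final case class Event(persistenceId: String, sequenceNr: Long, payload: String)

  final class DuplicateFilter {
    // Highest sequence number delivered so far, per persistent entity.
    private var lastSeen = Map.empty[String, Long]

    // True only the first time a given (persistenceId, sequenceNr) is offered.
    def accept(e: Event): Boolean =
      if (e.sequenceNr <= lastSeen.getOrElse(e.persistenceId, 0L)) false
      else {
        lastSeen = lastSeen.updated(e.persistenceId, e.sequenceNr)
        true
      }
  }

  // Step 1: replay the store. Step 2: read the bus from its oldest retained
  // event. Whatever the replay already delivered is dropped by the filter.
  def catchUp(fromStore: Seq[Event], fromBus: Seq[Event]): Seq[Event] = {
    val filter = new DuplicateFilter
    (fromStore ++ fromBus).filter(filter.accept)
  }
}
```

As long as the bus still retains everything published since the replay started, the two sources overlap and the filter removes the overlap instead of leaving a gap.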

    STILL room for improvement

    • Events ordering between aggregates [58]
    • Single Kafka topic/partition for now
    • Subscription only to all events
      • often we need to listen for specific events only
        (e.g. from a single aggregate)
      • a database is better at filtering data
      • unnecessary traffic
    • Not optimal for simultaneous replays on different nodes
    • ATOM interface
    • Spark connectors  




    lessons learned

    and those still not learned...

     

    distributed ddd

    • Pat Helland's entities [45] match the aggregate definition
    • CQRS makes scalability and distribution easier
    • Saga/Business process is hard to implement efficiently
    • Application services need to be replicated
    • A single aggregate may be still a bottleneck
      • Solution: CRDTs [20, 46, 47]
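
To give a feel for what a CRDT is, here is a grow-only counter (G-Counter), one of the simplest examples. It is a generic illustration with hypothetical node ids, not something taken from the project: each node updates only its own slot, and replicas converge by merging.

```scala
// Grow-only counter (G-Counter) sketch; node ids are hypothetical.
object GCounterSketch {
  final case class GCounter(counts: Map[String, Long] = Map.empty) {
    // Each node increments only its own slot.
    def increment(nodeId: String): GCounter =
      copy(counts = counts.updated(nodeId, counts.getOrElse(nodeId, 0L) + 1))

    def value: Long = counts.values.sum

    // Merge takes the per-node maximum, so it is commutative,
    // associative and idempotent.
    def merge(other: GCounter): GCounter =
      GCounter((counts.keySet ++ other.counts.keySet).map { k =>
        k -> math.max(counts.getOrElse(k, 0L), other.counts.getOrElse(k, 0L))
      }.toMap)
  }
}
```

Because merging in any order gives the same result, replicas of the counter can accept updates independently instead of funnelling every update through one place.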

    ddd implementation concerns

    • Actors in the domain code? NO! [27]
    • Prefer TypedIdClasses over UUID
    • What to do when command validation fails?
      • Return error? 
      • Throw exception? 
      • Publish event?
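
A sketch of two of these points in actor-free Scala. All names (`RotationId`, `AirplaneId`, `AssignAirplane`, `decide`) are hypothetical, and returning an `Either` is only one of the three validation options listed above.

```scala
import java.util.UUID

// Sketch only: the id, command and event types are hypothetical.
object DomainSketch {
  // Distinct wrapper types: passing an AirplaneId where a RotationId is
  // expected does not compile, which a bare UUID cannot guarantee.
  final case class RotationId(value: UUID) extends AnyVal
  final case class AirplaneId(value: UUID) extends AnyVal

  final case class AssignAirplane(rotation: RotationId, airplane: AirplaneId)
  final case class AirplaneAssigned(rotation: RotationId, airplane: AirplaneId)
  final case class Rejected(reason: String)

  // "Return error" option: the failure is a value, not an exception.
  def decide(assigned: Option[AirplaneId],
             cmd: AssignAirplane): Either[Rejected, AirplaneAssigned] =
    assigned match {
      case Some(current) if current != cmd.airplane =>
        Left(Rejected(s"rotation already uses airplane ${current.value}"))
      case _ =>
        Right(AirplaneAssigned(cmd.rotation, cmd.airplane))
    }
}
```

`decide` is a pure function of the current state and the command, so it can be unit-tested without starting an actor system.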

    akka stuff

    • Avoid ask pattern if possible [22, 24] 
      • timeout hell
      • performance issues
      • tell, don't ask
    • It's not easy to manage dependencies
      • We don't like cake pattern!
    • How should we handle stateless business logic?
      • Actors behind routers? Futures? Static classes?
        Continuation monad? Dataflow? [23, 25, 26]
    • Cluster sharding is not fully dynamic yet
      • Cannot change number of shards
        without restarting the app

    testing

    • Test Data Builder Pattern rocks! [59]
    • Start with integration tests on application service level
    • Given/When/Then fits DDD perfectly: [30]
      • given past events
      • when command fired
      • then expected event(s)
    • Eventual consistency forces you to wait in tests :(
      • Sometimes it is possible to avoid it by using hacks,
        e.g. waiting for the expected number of entities
    • Where to put e2e tests?
      • Completely outside of the app?
      • In the REST port?
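
The Given/When/Then shape mentioned above can be sketched without any test framework. The aggregate, command and event names below are hypothetical; the point is only that past events set up the state, a command is fired, and the resulting events are compared with the expected ones.

```scala
// Given/When/Then sketch for an event-sourced aggregate; all names are hypothetical.
object GivenWhenThenSketch {
  sealed trait Event
  final case class LegAdded(legId: String) extends Event

  final case class AddLeg(legId: String)

  final case class Rotation(legs: Set[String] = Set.empty) {
    def applyEvent(e: Event): Rotation = e match {
      case LegAdded(id) => copy(legs = legs + id)
    }

    // Handling a command never mutates state; it only returns events to persist.
    def handle(cmd: AddLeg): Seq[Event] =
      if (legs.contains(cmd.legId)) Seq.empty else Seq(LegAdded(cmd.legId))
  }

  // given past events, when a command is fired, then these events are expected
  def scenario(past: Seq[Event], command: AddLeg, expected: Seq[Event]): Boolean = {
    val state = past.foldLeft(Rotation())((s, e) => s.applyEvent(e))
    state.handle(command) == expected
  }
}
```

At this level everything runs synchronously, so there is nothing to wait for; the waiting only appears once a test crosses over to the read side.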

    other lessons

    • EJB and Akka are in fact similar! [19-20]
      • Active Object
      • #unpopularopinion 
    • Eventual consistency 
      • often is feasible and realistic
      • introduces new problems
    • Performance measurement is not straightforward
    • App monitoring is challenging
    • Automated deployment is crucial

    FUTURE work


    • Open source - on the way!
    • More detailed performance evaluation
    • Replayable event bus improvements
      • ATOM, Spark Streaming, Redis
    • Causal consistency [58]
    • Effective sagas implementation
    • Effective stateless logic implementation
    • Upcasting and snapshotting [33]
    • Geo-based sharding

    Spin-offs

    • Maciek & Adam
      • Integration with PaaSage platform
      • Metrics exposure
      • Second read model
      • More detailed performance evaluation
    • Mariusz & Michał [55]
      • Akka debugging tool
      • Akka-tracing enhancement [54]




    Let's talk! Give us feedback!




    Andrzej Dębski
    andrzejdebski91 @ gmail.com

    Bartłomiej Szczepanik
    mequrel @ gmail.com, @bszczepanik

    References


    http://goo.gl/EFFU9N