Year with EventSourcing & CQRS
as the web is growing
in complexity and size
more core business systems are web-based
simple CMS
simple webshops
<html>
<?php
$mysql = mysql_connect(....);
$users = mysql_fetch('SELECT * FROM ...');
?>
<table>
<?php for($i=0;$i<count($users);$i++){ ?>
<tr>
<td><?php echo $users[$i]['first_name']; ?> </td>
...
</tr>
...
<?php
if( $x = 123) {
echo '<span> .... </span>';
}
?>
Good old days
lots of old school devs 'thought' this was good enough
84% of web
MVC
CRUD
complexity increases
existing practices
Feels like
tired of 'I have no idea why that happened'
grepping logs
stuff not logged
joining 5 tables to get list of most popular X
looking for data instead of creating it
logging
build state from changes
create data sources for each purpose
separate business complexity and scalability
Issue tracker: Issue
field | value |
---|---|
id | 123456789 |
title | Register button doesn't work in IE |
assigned | null |
type | BUG |
status | OPEN |
priority | URGENT |
createdAt | 2016-03-11 22:22:11 |
updatedAt | 2017-05-27 11:01:35 |
closedAt | |
How long has this urgent bug been open?
Why is no one assigned to it?
Let's check the logs
- IssueOpened: 123456789 [2016-03-11 22:22:11]
- IssueClosed: 123456789 [2016-03-12 12:00:11]
- IssueReopened: 123456789 [2017-01-04 11:01:11]
- We are missing details now :(
- Let's add logging when ..
- Let's add logging when ..
- Let's add logging when ..
What if we record all changes?
IssueCreated(123, 'Something', 'text', BUG, NORMAL, '2016..')
IssueAssigned(123, TeamManager, '2016-03-12 10:00:00')
PriorityChanged(123, URGENT, '2016-08-11 15:46:33')
IssueDeallocated(123, '2016...')
IssueClosed(123, '2016...')
IssueReopened(123, '2017-05-27 11:01:35')
TitleChanged(123, 'Register button doesn't work in IE', '2016...')
and create state by applying them?
event sourcing
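A minimal PHP sketch of that idea: the issue's current state is never stored directly, it is rebuilt by applying the recorded events in order. The `Issue` class, the array-shaped events and the field names are assumptions made for illustration (the deallocation event is spelled `IssueDeallocated` here); this is not code from any of the projects mentioned in the talk.

```php
<?php
// Hypothetical sketch: every event is a plain [name, payload] array.
final class Issue
{
    private $title;
    private $assignee = null;
    private $priority;
    private $status;

    /** Rebuild the current state by replaying all recorded events in order. */
    public static function fromEvents(array $events): self
    {
        $issue = new self();
        foreach ($events as $event) {
            $issue->apply($event[0], $event[1]);
        }
        return $issue;
    }

    /** Each event only describes a change; applying it mutates the state. */
    private function apply(string $name, array $data): void
    {
        switch ($name) {
            case 'IssueCreated':
                $this->title = $data['title'];
                $this->priority = $data['priority'];
                $this->status = 'OPEN';
                break;
            case 'IssueAssigned':
                $this->assignee = $data['assignee'];
                break;
            case 'IssueDeallocated':
                $this->assignee = null;
                break;
            case 'PriorityChanged':
                $this->priority = $data['priority'];
                break;
            case 'TitleChanged':
                $this->title = $data['title'];
                break;
            case 'IssueClosed':
                $this->status = 'CLOSED';
                break;
            case 'IssueReopened':
                $this->status = 'OPEN';
                break;
        }
    }

    public function toArray(): array
    {
        return [
            'title' => $this->title,
            'assigned' => $this->assignee,
            'priority' => $this->priority,
            'status' => $this->status,
        ];
    }
}

$issue = Issue::fromEvents([
    ['IssueCreated', ['title' => 'Something', 'priority' => 'NORMAL']],
    ['IssueAssigned', ['assignee' => 'TeamManager']],
    ['PriorityChanged', ['priority' => 'URGENT']],
    ['IssueDeallocated', []],
    ['IssueClosed', []],
    ['IssueReopened', []],
    ['TitleChanged', ['title' => "Register button doesn't work in IE"]],
]);
$state = $issue->toArray();
```

The resulting state matches the issue row shown earlier (open, urgent, unassigned), but now every step that led there is still available.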
[diagram: the events of an issue feed several read models, one per question]
- number of issues opened today: 678
- number of issues opened on day YYYY-MM-DD: 456
- number of issues reopened this month: 1,234
- opened urgent issue ids: abc123-123-121313, daa-2-2-3-3-2-3-2
- open issues with number of comments:
  - abc-123-23, Login doesn't work, 85
  - acd-342-43, Bug X, 75
Command Query Responsibility Segregation
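A rough sketch of the query side, assuming a projector that listens to issue events and keeps one small read model per question. The class name `IssueStatsProjector`, the payload keys and the in-memory arrays are hypothetical; a real read model would normally live in a database table or cache. Reopening is ignored to keep the sketch short.

```php
<?php
// Hypothetical read-side projector: each read model is shaped for one query.
final class IssueStatsProjector
{
    /** Opened issues per day, keyed by YYYY-MM-DD. */
    private $openedPerDay = [];
    /** Ids of open urgent issues, stored as array keys. */
    private $openUrgent = [];

    /** Write side of the read model: update prepared data on every event. */
    public function project(string $name, array $data): void
    {
        switch ($name) {
            case 'IssueCreated':
                $day = substr($data['at'], 0, 10);
                $this->openedPerDay[$day] = ($this->openedPerDay[$day] ?? 0) + 1;
                if ($data['priority'] === 'URGENT') {
                    $this->openUrgent[$data['id']] = true;
                }
                break;
            case 'PriorityChanged':
                if ($data['priority'] === 'URGENT') {
                    $this->openUrgent[$data['id']] = true;
                } else {
                    unset($this->openUrgent[$data['id']]);
                }
                break;
            case 'IssueClosed':
                unset($this->openUrgent[$data['id']]);
                break;
        }
    }

    // Queries do no calculation, they only return what was prepared.
    public function openedOn(string $day): int
    {
        return $this->openedPerDay[$day] ?? 0;
    }

    public function openUrgentIds(): array
    {
        return array_keys($this->openUrgent);
    }
}

$projector = new IssueStatsProjector();
$projector->project('IssueCreated', ['id' => 'abc-1', 'priority' => 'NORMAL', 'at' => '2016-03-11 22:22:11']);
$projector->project('IssueCreated', ['id' => 'abc-2', 'priority' => 'URGENT', 'at' => '2016-03-11 23:00:00']);
$projector->project('PriorityChanged', ['id' => 'abc-1', 'priority' => 'URGENT']);
$projector->project('IssueClosed', ['id' => 'abc-2']);

$openedThatDay = $projector->openedOn('2016-03-11');
$urgentIds = $projector->openUrgentIds();
```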
Miro Svrtan
senior developer
ZgPHP user group organizer
@msvrtan
My ES + CQRS experience:
external concerts on TicketSwap
- part of existing app
- imports external concerts
- admin flow to match to our concerts
project X
- ver 1 had a lot of 'what happened' issues
- ver 2 built with full ES + CQRS
- test ground to find when ES makes sense
devboard
- my pet project
- tracking github builds, PRs and issues
- ver 1 had a lot of existence issues
- version 2 going full ES + CQRS
existence issues?
- GitHub hook
- branch exists? if not create one
- commit exists? if not create one
- author exists? if not create one
- branch needs commit
- commit needs author
Thinking of objects and relations
[diagram: entities branch, branch2, branch3, commit, commit2 and author connected by relations]
After rethinking it a bit
- in my domain
- commit is a value object
- author is a value object
- I don't care about relationship of branch->commit->author
- I don't care how it looks in DB
- I don't care how many commits an author made
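A minimal sketch of what treating commit and author as value objects could look like. The class shapes below are assumptions for illustration, not the devboard code: a value object has no identity of its own, is compared by its values, and is simply embedded in the branch, so there is no "does the author exist?" lookup.

```php
<?php
// Hypothetical value objects: no ids, no own tables, compared by value.
final class Author
{
    private $name;
    private $email;

    public function __construct(string $name, string $email)
    {
        $this->name = $name;
        $this->email = $email;
    }

    public function equals(Author $other): bool
    {
        return $this->name === $other->name && $this->email === $other->email;
    }
}

final class Commit
{
    private $sha;
    private $author;

    public function __construct(string $sha, Author $author)
    {
        $this->sha = $sha;
        $this->author = $author;
    }

    public function sha(): string
    {
        return $this->sha;
    }

    public function author(): Author
    {
        return $this->author;
    }
}

// The branch just carries its last commit; nothing has to exist beforehand.
final class Branch
{
    private $name;
    private $lastCommit;

    public function __construct(string $name, Commit $lastCommit)
    {
        $this->name = $name;
        $this->lastCommit = $lastCommit;
    }

    public function push(Commit $commit): void
    {
        $this->lastCommit = $commit;
    }

    public function lastCommit(): Commit
    {
        return $this->lastCommit;
    }
}

$first = new Commit('a1b2c3', new Author('Jane Doe', 'jane@example.com'));
$branch = new Branch('master', $first);
$branch->push(new Commit('d4e5f6', new Author('Jane Doe', 'jane@example.com')));

// Two separately created authors with the same values are the same author.
$sameAuthor = $first->author()->equals($branch->lastCommit()->author());
```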
No relation needed
[diagram: 1st row, 2nd row and 3rd row; each row holds one branch (branch, branch2, branch3) together with its own commit and author, with no relations between rows]
Defining bounded context
- my projects don't have domain experts
- often they are the 'domain'
- separation is hard
You as a TicketSwap user:
- can be a seller
- can be a buyer too
- just use notifications?
User is a buyer and seller
- what are the bounded contexts here?
- user
- buyer
- seller
- I'm still trying to figure it out :)
In a "boring" domain:
domain experts will tell you who does what
In a "boring" domain:
- buying a computer goes through the 'supply office'
- fill in a request
- manager says OK
- supply orders it
In a "boring" domain:
- your paycheck is calculated by human resources
- talk to team lead and ask for more money
- team lead says OK
- manager says OK
- HR does its magic
In a "boring" domain:
both will be paid by accounting
In a "boring" domain:
CEO will get expense reports from finance people
In startups:
- airbnb for bicycles
- we don't know what we want
- or where the road will take us
- we might try X out
- and we need it yesterday
Concentrate on bringing value, not perfect code
TicketSwap example:
User, buyer, seller
- you as a user
- can be a seller
- can be a buyer too
- just use notifications?
TicketSwap example:
User, buyer, seller
- unclear boundaries
- too much at the "core" of the app
- don't know future direction
- existing code
- looks ok
- works ok
Domain - application - infrastructure separation
- we often start by defining
- language
- framework
- tools
- design the ER model
- 80/20 - 80% of the time spent on 20% of the problem
- try to make our problem fit into our solution
- designing a 'house of cards'
- implementing business logic & rules at crunch time
Saving invoice into RDBMS
- relation to a person/company
- company address changes?
- person last name changes?
blog post + comments in NoSQL
- building an entity with relations as a document
- comments are related to a blog post
- comments are related to a user
Domain - application - infrastructure separation
- solve the problem
- locate application usages
- store data
Saga/process manager
- connects multiple bounded contexts / aggregates
- possible inconsistent state if Xth step fails
User registration saga
- aggregates:
- user
- buyer
- seller
User registration saga
- BuyerSaga
- listens for UserRegisteredEvent
- sends a CreateBuyerCommand
- SellerSaga
- listens for UserRegisteredEvent
- sends a CreateSellerCommand
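The two sagas above can be sketched roughly as follows. `RecordingCommandBus` is a hypothetical stand-in for a real command bus (it only records what was dispatched), and the string-based event and command names with array payloads are simplifications made for this example.

```php
<?php
// Hypothetical command bus that only records the commands it receives.
final class RecordingCommandBus
{
    public $sent = [];

    public function dispatch(string $command, array $payload): void
    {
        $this->sent[] = [$command, $payload];
    }
}

// Reacts to a registered user by asking for a buyer to be created.
final class BuyerSaga
{
    private $bus;

    public function __construct(RecordingCommandBus $bus)
    {
        $this->bus = $bus;
    }

    public function handle(string $event, array $payload): void
    {
        if ($event === 'UserRegisteredEvent') {
            $this->bus->dispatch('CreateBuyerCommand', ['userId' => $payload['userId']]);
        }
    }
}

// Same trigger, different bounded context: asks for a seller to be created.
final class SellerSaga
{
    private $bus;

    public function __construct(RecordingCommandBus $bus)
    {
        $this->bus = $bus;
    }

    public function handle(string $event, array $payload): void
    {
        if ($event === 'UserRegisteredEvent') {
            $this->bus->dispatch('CreateSellerCommand', ['userId' => $payload['userId']]);
        }
    }
}

$bus = new RecordingCommandBus();
$sagas = [new BuyerSaga($bus), new SellerSaga($bus)];
foreach ($sagas as $saga) {
    $saga->handle('UserRegisteredEvent', ['userId' => '64c3e987-905a-4426-8dc3-ddb61650b86b']);
}
$sentCommands = array_column($bus->sent, 0);
```

If the seller command fails after the buyer command succeeded, the user exists as a buyer but not as a seller, which is the inconsistent state the saga slide warns about.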
Queues
Your best friend
And worst enemy
Queues
- isolation
- instead of 1 queue per 1 payload you can queue all commands to the same queue
Queue: multiple queues & workers
- 2 simultaneous changes on same aggregate root -> error
- if there are no concurrency issues, increase the worker count above 1
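The slides do not say how the conflict is detected. One common approach, assumed here, is an expected-version check when appending to the event store (optimistic concurrency). `InMemoryEventStore` and `ConcurrencyException` are hypothetical names for this sketch.

```php
<?php
// Hypothetical exception raised when two writers race on one aggregate.
final class ConcurrencyException extends RuntimeException
{
}

final class InMemoryEventStore
{
    /** Events per aggregate id; the version is the number of stored events. */
    private $streams = [];

    public function append(string $aggregateId, int $expectedVersion, array $events): void
    {
        $currentVersion = $this->version($aggregateId);
        if ($currentVersion !== $expectedVersion) {
            // Someone else appended since this writer loaded the aggregate.
            throw new ConcurrencyException(
                "Expected version $expectedVersion, found $currentVersion"
            );
        }
        foreach ($events as $event) {
            $this->streams[$aggregateId][] = $event;
        }
    }

    public function version(string $aggregateId): int
    {
        return count($this->streams[$aggregateId] ?? []);
    }
}

$store = new InMemoryEventStore();
$store->append('issue-123', 0, ['IssueCreated']);

// Two workers both loaded the aggregate at version 1.
$store->append('issue-123', 1, ['IssueAssigned']);
$conflict = false;
try {
    $store->append('issue-123', 1, ['PriorityChanged']);
} catch (ConcurrencyException $e) {
    $conflict = true; // the second worker has to reload and retry
}
```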
Queue: multiple queues & workers
- each external concert I import is guaranteed to be unique
- X workers on that queue
- concurrency issues on GitHub notifications
- 1 worker per queue
- queue sharding by starting letter
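A sketch of sharding by starting letter, under the assumption that the shard key is the repository name and that queues are called `github-<letter>`; both are made up for this example. Everything for one repository lands on the same queue, so a single worker per queue never processes two changes to the same aggregate at once.

```php
<?php
/**
 * Pick a queue from the first letter of the routing key.
 * Keys that do not start with a letter share one fallback queue.
 */
function queueForRepository(string $repositoryName): string
{
    $first = strtolower(substr($repositoryName, 0, 1));
    if ($first === '' || !ctype_alpha($first)) {
        return 'github-other';
    }
    return 'github-' . $first;
}

$queueA = queueForRepository('msvrtan/devboard');
$queueB = queueForRepository('Symfony/symfony');
$queueC = queueForRepository('123/abc');
```

The shards will be uneven (some letters are far more common than others), which is a trade-off of this simple scheme.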
Eventual consistency
- workers are ASYNC
- queues can be clogged
Eventual consistency
- never blindly trust your read model data
- aggregates are the ONLY source of truth
Performance
- loading events from an event store is blazing fast
- unserializing is fast
- applying is fast
Loading aggregate root
events | load time |
---|---|
<10 | 10 ms |
~50 | 20 ms |
~100 | 50 ms |
~2000 | 1500 ms |
Big aggregate root
- ~700 entities inside
- 580,000 events
Snapshotting
- avoid replaying all events for performance reasons
- record the state of aggregate root and save it
- load snapshot + load events afterwards
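A minimal sketch of "load snapshot + load events afterwards". The `Counter` aggregate and the `loadAggregate()` helper are hypothetical; a real event store would query only the events with a number higher than the snapshot's event number instead of slicing an in-memory array.

```php
<?php
// Hypothetical aggregate whose state is the sum of all applied events.
final class Counter
{
    public $total = 0;
    public $version = 0; // number of events applied so far

    public function apply(int $delta): void
    {
        $this->total += $delta;
        $this->version++;
    }
}

/**
 * Start from the snapshot when there is one, then replay only the
 * events recorded after the snapshot was taken.
 */
function loadAggregate(?string $snapshotPayload, array $allEvents): Counter
{
    $aggregate = $snapshotPayload === null
        ? new Counter()
        : unserialize($snapshotPayload);

    foreach (array_slice($allEvents, $aggregate->version) as $delta) {
        $aggregate->apply($delta);
    }
    return $aggregate;
}

$events = range(1, 500);

// Take a snapshot after the first 400 events.
$snapshot = serialize(loadAggregate(null, array_slice($events, 0, 400)));

$fromSnapshot = loadAggregate($snapshot, $events); // replays only 100 events
$fromScratch = loadAggregate(null, $events);       // replays all 500 events
```

Because the snapshot is a serialized object, changing the aggregate class can make old snapshots unusable, which is why the warning slide says to delete them.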
Snapshotting: DIY
- had to implement it myself :(
- simple MySQL table with
- autoincrement id
- aggregate id
- event number
- payload (serialized object)
Snapshotting: DIY
- creates a snapshot every ~200 events
- 17 MB of payload
- 1000 snapshots is 17 GB of data
- aggregate load time < 100ms
Snapshotting: warning
Changing aggregate
- delete all snapshots :(
- manually generate snapshots for big aggregate roots
No read side logic
- read side should be "anemic models"
- putting logic there is a design smell
No read side logic example
- events have price with and without VAT
- you need VAT amount or percentage in the application
- don't calculate it on the read side
- update your domain events
- design smell that your domain is missing "crucial" data
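The VAT example could look roughly like this. The event name `TicketPriced`, the `priceTicket()` helper, amounts in cents and the rounding rule are all assumptions for illustration. The point is that the domain does the calculation once and the event carries the result, so the projector only copies fields.

```php
<?php
// Hypothetical domain event; all amounts are integers in cents.
final class TicketPriced
{
    public $priceExclVat;
    public $priceInclVat;
    public $vatAmount;
    public $vatPercentage;

    public function __construct(int $priceExclVat, int $priceInclVat, int $vatAmount, int $vatPercentage)
    {
        $this->priceExclVat = $priceExclVat;
        $this->priceInclVat = $priceInclVat;
        $this->vatAmount = $vatAmount;
        $this->vatPercentage = $vatPercentage;
    }
}

/** Domain side: the only place where VAT is calculated (rounded half up). */
function priceTicket(int $priceExclVat, int $vatPercentage): TicketPriced
{
    $vatAmount = intdiv($priceExclVat * $vatPercentage + 50, 100);
    return new TicketPriced($priceExclVat, $priceExclVat + $vatAmount, $vatAmount, $vatPercentage);
}

/** Read side: an anemic projection that copies values and calculates nothing. */
function projectPrice(TicketPriced $event): array
{
    return [
        'price_excl_vat' => $event->priceExclVat,
        'price_incl_vat' => $event->priceInclVat,
        'vat_amount' => $event->vatAmount,
        'vat_percentage' => $event->vatPercentage,
    ];
}

$row = projectPrice(priceTicket(1000, 21));
```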
Testing
- skip ES + CQRS if no testing experience
- don't test state of aggregate root
- concentrate on unit/integration testing
- do some end-to-end tests
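One way to read "don't test state of aggregate root" is to test in a given/when/then style: given past events, when a command is handled, then assert on the newly recorded events (or the rejection). The `User` aggregate below is a hypothetical, stripped-down example written in plain PHP rather than with a test framework.

```php
<?php
// Hypothetical aggregate that records events instead of exposing state.
final class User
{
    private $registered = false;
    private $recorded = [];

    /** Given: rebuild the aggregate from past events. */
    public static function fromHistory(array $events): self
    {
        $user = new self();
        foreach ($events as $event) {
            $user->apply($event);
        }
        return $user;
    }

    /** When: a command is handled, a new event is recorded or it is rejected. */
    public function register(string $email): void
    {
        if ($this->registered) {
            throw new DomainException('User is already registered');
        }
        $event = ['UserRegisteredEvent', $email];
        $this->apply($event);
        $this->recorded[] = $event;
    }

    private function apply(array $event): void
    {
        if ($event[0] === 'UserRegisteredEvent') {
            $this->registered = true;
        }
    }

    /** Then: tests look at the new events, never at private state. */
    public function recordedEvents(): array
    {
        return $this->recorded;
    }
}

// Given no history, when registering, then one event is recorded.
$user = User::fromHistory([]);
$user->register('jane@example.com');
$recorded = $user->recordedEvents();

// Given an already registered user, when registering again, then it is rejected.
$rejected = false;
try {
    User::fromHistory($recorded)->register('jane@example.com');
} catch (DomainException $e) {
    $rejected = true;
}
```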
Testing
- mock the infrastructure
- use InMemoryRepositories
- not locked to infrastructure while developing
- 10-100x faster testing
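A small sketch of the in-memory repository idea. The interface and method names are assumptions; the point is that the code under test depends only on the interface, so tests can swap the database-backed implementation for an array-backed one.

```php
<?php
// Hypothetical read repository contract shared by all implementations.
interface UserReadRepository
{
    public function save(string $id, array $row): void;

    public function find(string $id): ?array;
}

// Test double: keeps rows in an array instead of talking to a database.
final class InMemoryUserReadRepository implements UserReadRepository
{
    private $rows = [];

    public function save(string $id, array $row): void
    {
        $this->rows[$id] = $row;
    }

    public function find(string $id): ?array
    {
        return $this->rows[$id] ?? null;
    }
}

// Code under test only knows the interface, not the storage behind it.
function projectUserRegistered(UserReadRepository $repository, string $id, string $email): void
{
    $repository->save($id, ['email' => $email]);
}

$repository = new InMemoryUserReadRepository();
projectUserRegistered($repository, 'user-1', 'jane@example.com');
$found = $repository->find('user-1');
$missing = $repository->find('user-2');
```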
Refactoring read side
- external concert reads were stored in ElasticSearch
- ElasticSearch is not a database :)
- lots of issues -> lots of end to end tests
Refactoring read side
- dropping old ElasticSearch cluster
- refactored read side to use Doctrine ORM + MySQL
- half a day's work
- updating tests -> ~2 days
Refactoring read side: replaying events
- we had all events stored
- replay all events to fill up new read models
- 30 line PHP CLI command
- 16,000 events ~ 1 min
- turned off ElasticSearch
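The replay step can be sketched as a loop that pushes every stored event through the new projectors. This is not the original 30-line CLI command; `replay()`, the event shapes and the array standing in for the new MySQL read model are assumptions for illustration.

```php
<?php
/**
 * Feed every stored event, in order, to each projector.
 * Returns the number of events replayed.
 */
function replay(iterable $events, callable ...$projectors): int
{
    $count = 0;
    foreach ($events as $event) {
        foreach ($projectors as $project) {
            $project($event);
        }
        $count++;
    }
    return $count;
}

// Stand-in for the new read model; a real one would write to a table.
$concertsByCity = [];
$project = function (array $event) use (&$concertsByCity): void {
    if ($event['type'] === 'ExternalConcertImported') {
        $concertsByCity[$event['city']][] = $event['name'];
    }
};

// Hypothetical events as they might come out of the event store.
$storedEvents = [
    ['type' => 'ExternalConcertImported', 'city' => 'Amsterdam', 'name' => 'Concert A'],
    ['type' => 'ExternalConcertImported', 'city' => 'Zagreb', 'name' => 'Concert B'],
    ['type' => 'ExternalConcertMatched', 'city' => 'Amsterdam', 'name' => 'Concert A'],
    ['type' => 'ExternalConcertImported', 'city' => 'Amsterdam', 'name' => 'Concert C'],
];

$replayed = replay($storedEvents, $project);
```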
UUIDs
- instead of expecting DB to provide ID
- commands/aggregates need you to provide ID
- ramsey/uuid
UUID performance
- "64c3e987-905a-4426-8dc3-ddb61650b86b" takes more space than "1"
- instead of 32 hex chars, you can save them as 16 bytes of binary
- not continuous
- index rebalancing
- need "createdAt" to sort
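A sketch of the 16-byte storage trick using only PHP built-ins. The helper names are made up for this example, and generating the UUID itself is left to a library; only the conversion between the textual form and the raw bytes is shown.

```php
<?php
/** Strip the dashes and pack the 32 hex digits into 16 raw bytes. */
function uuidToBinary(string $uuid): string
{
    $hex = str_replace('-', '', $uuid);
    if (strlen($hex) !== 32 || !ctype_xdigit($hex)) {
        throw new InvalidArgumentException("Not a UUID: $uuid");
    }
    return hex2bin($hex);
}

/** Turn 16 raw bytes back into the usual 8-4-4-4-12 textual form. */
function binaryToUuid(string $bytes): string
{
    $hex = bin2hex($bytes);
    return sprintf(
        '%s-%s-%s-%s-%s',
        substr($hex, 0, 8),
        substr($hex, 8, 4),
        substr($hex, 12, 4),
        substr($hex, 16, 4),
        substr($hex, 20, 12)
    );
}

$binary = uuidToBinary('64c3e987-905a-4426-8dc3-ddb61650b86b');
$roundTrip = binaryToUuid($binary);
```

The binary value fits a `BINARY(16)` column, at the cost of ids that are no longer readable when browsing the table directly.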
UUIDs everywhere
- I try to avoid autoincrement in CRUD
- helps avoid problems where an entity might not have an ID yet
Use familiar tech
- ES+CQRS is a big shift in thinking
- if you have experience using MySQL/Mongo/X/Y/Z, try to use those instead of learning new tech as well
Class number explosion
CRUD:
- UserController
- User (Entity)
- UserRepository
ES + CQRS:
- UserController
- RegisterUserCommand
- UserCommandHandler
- User (AggregateRoot)
- UserRegisteredEvent
- UserReadProjector
- UserReadEntity
- UserReadRepository
Class number explosion
CRUD:
- UserController
- User (Entity)
- UserRepository
ES + CQRS:
- UserController
- RegisterUserCommand
- ChangePasswordCommand
- UserCommandHandler
- User (AggregateRoot)
- UserRegisteredEvent
- PasswordChangedEvent
- UserReadProjector
- UserReadEntity
- UserReadRepository
Recap
event sourcing and CQRS is the best thing EVER
Thank you!
Please please please leave feedback https://joind.in/talk/67342
@msvrtan
Any questions?
"Microservice" inside monolith
- separate event store
- important data for debugging
- not business important
- truncate data every X
Hammer problem
"When you have a hammer, everything looks like a nail"
Hammer problem
- self-doubt: does everything really look like a nail now?
Year with EventSourcing and CQRS
By Miro Svrtan
Slides for my 'Year with EventSourcing and CQRS' talk at the PHP Srbija 2017 conference in Belgrade, Serbia