Strangling The Monolith
Applied Patterns & Practices From The Trenches
Thomas Ploch - Principal Software Architect @ FLIX
Quick Outlook
A small history tour
2015 merger with competitor Flixbus to form the largest long-distance bus travel provider in the German market
Mid-2017 our modernisation journey started
Development began in 2012 with a single development team, supported by one nearshoring team in Ukraine
Tech Stack
- PHP 5.2
- MySQL 5.5
- jQuery 1.8
Architecture
Image: Cover "No Silver Bullet—Essence and Accident in Software Engineering". IEEE Computer (April 1987).
Yet we, as an industry, still repeatedly end up reaching for silver bullets.
And it was no different for us!
JEOPARDY TIME!
This architecture pattern was the most widely used software & framework architecture for web-based projects in 2012.
What is ...?
MVC
Model-View-Controller Architecture
- MVC is fast to get started with. The market liberalization was a hard deadline, so delivering fast was paramount.
- All layers should be changeable independently of each other, so scaling development should not be a problem.
- Ability to provide multiple views, e.g. HTML & JSON for the API.
- Routing, Security, Templating - batteries included.
Houston, we have a problem!
In our case the cycle time kept increasing!
At some point we couldn't even implement certain features - "that's not possible..."
Lead & Cycle Times
Failure Patterns
1. Anemic Domain Models
class OrderEntity {
    private DateTime $createdAt;
    private string $status;
    private MutableCollection $items;

    public function getStatus(): string
    {
        return $this->status;
    }

    public function setStatus(string $status): void
    {
        $this->status = $status;
    }

    public function getItems(): MutableCollection
    {
        return $this->items;
    }
}
// Business logic is spread out to many services
class OrderService {
    public function cancelOrder(int $id): void
    {
        $order = $this->fetchOrderFromDB($id);
        $status = $order->getStatus();
        // some validation & processing logic
        $order->setStatus('cancelled');
        $this->saveOrderToDB($order);
    }
}
Mmmmm, Lasagna!
A domain model is anemic when the classes that describe the model and the classes that perform operations on the model are separate. The services contain all the domain logic, while the domain objects themselves contain practically none.
The classic Lasagna architecture
Image: Chart of "Blood, Sweat & Tears" over "Time" for the Anemic Model and the Rich Model, with the "Time to complete" regions labelled "The really sweet spot", "The sweet spot" and "The horrible spot".
2. Weak Boundaries
Image: Diagram with Use Case A, Use Case B, Use Case C, Use Case D, the Order Service and the Order Entity (Anemic Model).
Initially you start with a single use case and everything seems perfectly fine.
Then the service starts to be shared between differing use cases.
More and more use cases make use of the shared service because DRY, right?
Over time, the initially well-fitting Order Service becomes an unmaintainable mess that shows all the symptoms of a God Class.
Image: Diagram with Use Case A, Use Case B, the Order Repository and the Order Item Repository.
Keeping consistency is IMPOSSIBLE!
Use Case A works on Orders and lets the Order model handle the internal consistency.
Use Case B breaks the consistency boundary by operating on the collection of Order Items directly.
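One way to close this gap is to make the Order the only entry point for changes to its items. The following is a minimal sketch under assumptions, not the code from our system: the `OrderItem` class, the `addItem()` and `totalInCents()` methods and the "no changes after cancellation" rule are hypothetical and only illustrate the idea of a consistency boundary.

```php
<?php
declare(strict_types=1);

// Hypothetical value object for a single line of an order.
final class OrderItem
{
    public function __construct(
        public readonly string $sku,
        public readonly int $priceInCents,
    ) {}
}

// Hypothetical aggregate root: every item change goes through Order,
// so the consistency rules live in exactly one place.
final class Order
{
    /** @var OrderItem[] */
    private array $items = [];
    private string $status = 'new';

    public function addItem(OrderItem $item): void
    {
        // Assumed invariant: a cancelled order can no longer change.
        if ($this->status === 'cancelled') {
            throw new LogicException('Cannot add items to a cancelled order');
        }
        $this->items[] = $item;
    }

    public function cancel(): void
    {
        $this->status = 'cancelled';
    }

    public function totalInCents(): int
    {
        return array_sum(array_map(
            fn (OrderItem $item): int => $item->priceInCents,
            $this->items,
        ));
    }

    /** @return OrderItem[] a copy, since PHP arrays have value semantics */
    public function items(): array
    {
        return $this->items;
    }
}
```

With this shape there is no Order Item Repository for Use Case B to reach for: items are loaded and saved together with their Order.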
Image: Diagram of an Order and its Order Items.
3. That's how we do things here...
Developers will very often take the path that was paved before they joined
Developer is fed up with the messy system and leaves the company
New developer is hired
New developer follows the paved path within the existing system
Problems get worse and new developer struggles with delivery
Developer wants to change things but stakeholders do not see the necessity
Image: Diagram of Use Case A and Services A to E, with State held in each of them.
Failure Patterns
- Lasagna architecture made it very complicated to add and change features
- State management was spread across many services and was a major source of bugs and incidents
- Developers were very unhappy, and the re-hire cycle accelerated the problems even further
- All problems reinforced each other and spun out of control very quickly
What can we do?
New path
The 5-step Improvement Process
1. Identify the most valuable components and their boundaries
2. Align the organisation around the future boundary cut-marks
3. Measure & analyze the current system's migration complexity
4. Pick the next most valuable component and start the migration
5. Finish the migration and go back to step 4
1. Identify the Most Valuable Components
The Map places the Value Chain components on an economic evolution horizon. The more commoditized a component is, the less there is a need to build & maintain components yourself.
Purpose & Scope set the stage for the mapping exercise.
Anchoring the Value Chain on the users' needs is important to keep the focus on the customer value.
The Value Chain identifies dependencies in the value delivery. This helps with understanding your actual core capabilities.
You want everything in core to take priority because you believe that this will give you an edge over your competitors.
Becoming faster in delivering value has the highest ROI in the core quadrant.
2. Re-Align the Organisation
...if you have four groups working on a compiler, you’ll get a four-pass compiler.
Melvin Conway
Image: Diagram with Team A, Team B, Module A and Module B, connected through .git CODEOWNERS.
3. Measure & Analyze
Static
- Average cyclomatic complexity
- Afferent/efferent coupling & instability
- No. of module dependencies
- Logical lines of code (size)
Behavioral
- VCS change coupling
  - Which things are changing together?
- VCS change rate per module & team
  - Which things are changed by a team?
Risk
- Is revenue generation affected?
- Scale of operations?
- High throughput / low latency?
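Two of these metrics are simple enough to sketch. Instability is commonly defined as I = Ce / (Ca + Ce), where Ca is the afferent (incoming) and Ce the efferent (outgoing) coupling of a module. Change coupling can be approximated by counting how often two modules appear in the same commit. The function names and the input shape below are assumptions for illustration; the list of modules per commit could, for example, be derived from the output of `git log --name-only`.

```php
<?php
declare(strict_types=1);

/**
 * Instability I = Ce / (Ca + Ce).
 * 0.0 means maximally stable, 1.0 means maximally unstable.
 */
function instability(int $afferent, int $efferent): float
{
    $total = $afferent + $efferent;
    return $total === 0 ? 0.0 : (float) $efferent / $total;
}

/**
 * Counts how often two modules were changed in the same commit.
 *
 * @param list<list<string>> $commits modules touched by each commit
 * @return array<string, int> pair "a|b" => number of shared commits, highest first
 */
function changeCoupling(array $commits): array
{
    $pairs = [];
    foreach ($commits as $modules) {
        // Normalise so that "a|b" and "b|a" end up under the same key.
        $modules = array_values(array_unique($modules));
        sort($modules);
        $count = count($modules);
        for ($i = 0; $i < $count; $i++) {
            for ($j = $i + 1; $j < $count; $j++) {
                $key = $modules[$i] . '|' . $modules[$j];
                $pairs[$key] = ($pairs[$key] ?? 0) + 1;
            }
        }
    }
    arsort($pairs);
    return $pairs;
}
```

Pairs with a high count that sit in different future components are a hint that the planned boundary cuts through something that changes as one unit.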
Image: Chart with the areas labelled A, B, C and D.
This will probably never be migrated, and may even be a trigger to give up on this component completely!
4. Let's get Started!
SELECT * FROM orders WHERE ...
INSERT INTO orders VALUES ...
UPDATE orders SET ... WHERE ...
Image: Diagram connecting these statements to Use Cases A to F.
Traces connect the database and use cases through the code
Tip: Start with the INSERT use cases first, since they represent the genesis cases.
Most tracing tools/agents support the major database clients and connect queries with code traces and/or logs.
Image: Diagram of Spans, collected by OpenTelemetry / an observability platform, that link each query to its use case.
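The trace analysis can be sketched as a small script over exported spans. This is an illustration under assumptions, not a feature of any particular platform: the array shape of a span, the function name `entryPointsFor()` and the use of a `db.statement` attribute (the name used by older OpenTelemetry semantic conventions) are all assumed here, and a real export will look different.

```php
<?php
declare(strict_types=1);

/**
 * Finds the entry points (root spans) of all traces that executed a matching query.
 *
 * @param list<array{traceId: string, parentSpanId: ?string, name: string, attributes: array<string, string>}> $spans
 * @return list<string> sorted, de-duplicated root span names
 */
function entryPointsFor(array $spans, string $statementPrefix): array
{
    $roots = [];    // traceId => name of the root span
    $matching = []; // traceId => true if the trace contains a matching query
    foreach ($spans as $span) {
        if ($span['parentSpanId'] === null) {
            $roots[$span['traceId']] = $span['name'];
        }
        // Assumed attribute name; spans without a query simply do not match.
        $statement = $span['attributes']['db.statement'] ?? '';
        if (stripos($statement, $statementPrefix) === 0) {
            $matching[$span['traceId']] = true;
        }
    }
    $entryPoints = array_values(array_unique(array_intersect_key($roots, $matching)));
    sort($entryPoints);
    return $entryPoints;
}
```

Calling it with the prefix `INSERT INTO orders` first follows the tip above and lists the genesis use cases.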
class Order {
    private DateTime $date;
    private string $status;

    // Impossible to construct an invalid state
    public function __construct() {
        $this->date = new DateTime(
            'now',
            new DateTimeZone('UTC'),
        );
        $this->status = 'new';
    }
}
class Order {
    // ...
    public function cancel(): void {
        // now the validation is within the model
        if ($this->status === 'cancelled') {
            throw new LogicException(
                'Order is already cancelled'
            );
        }
        $this->status = 'cancelled';
    }
}
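class OrderService {
    // The repository is injected, so the property is always initialised
    public function __construct(
        private OrderRepository $repository,
    ) {}

    public function cancel(int $id): void
    {
        // Load, delegate to the model, persist: no business logic here
        $order = $this->repository->get($id);
        $order->cancel();
        $this->repository->save($order);
    }
}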
// The Strangler Facade is just an abstraction
// It will have multiple implementations
interface OrderFacade {
    public function place(
        Id $id,
        Money $value,
        CustomerId $customerId,
    ): OrderCreatedEvent;

    public function cancel(Id $id): OrderCancelledEvent;
}
Image: Diagram of Use Case A with the Strangler Facade, the Transactional Outbox inside the TX boundary, the Order Event, the Outbox Consumer and the New Order Service.
This has to be repeated for all the identified use cases
Once we have covered all the use cases with the Strangler Facade the new service is completely in sync with the legacy system!
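The Transactional Outbox part can be sketched with plain PDO. This is a minimal illustration under assumptions, not our production code: the table and column names, the function names and the use of an in-memory SQLite database are hypothetical. The point it shows is that the state change and the event are written in the same transaction, and a separate consumer publishes the event afterwards.

```php
<?php
declare(strict_types=1);

// Hypothetical schema; table and column names are illustrative only.
function createSchema(PDO $db): void
{
    $db->exec('CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL)');
    $db->exec('CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        type TEXT NOT NULL,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    )');
}

// State change and event are written in ONE transaction (the TX boundary).
function placeOrder(PDO $db, string $orderId): void
{
    $db->beginTransaction();
    try {
        $db->prepare('INSERT INTO orders (id, status) VALUES (?, ?)')
            ->execute([$orderId, 'new']);
        $db->prepare('INSERT INTO outbox (type, payload) VALUES (?, ?)')
            ->execute(['OrderCreated', json_encode(['orderId' => $orderId])]);
        $db->commit();
    } catch (Throwable $e) {
        $db->rollBack(); // neither the order nor the event is persisted
        throw $e;
    }
}

// The outbox consumer polls unpublished rows and hands them to a publisher.
function relayOutbox(PDO $db, callable $publish): int
{
    $rows = $db->query('SELECT id, type, payload FROM outbox WHERE published = 0 ORDER BY id')
        ->fetchAll(PDO::FETCH_ASSOC);
    $mark = $db->prepare('UPDATE outbox SET published = 1 WHERE id = ?');
    foreach ($rows as $row) {
        $publish($row['type'], json_decode($row['payload'], true));
        // At-least-once delivery: a crash before this line re-sends the event.
        $mark->execute([$row['id']]);
    }
    return count($rows);
}
```

Because delivery is at-least-once in this sketch, whatever receives the events has to tolerate duplicates.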
Image: Diagram with the Strangler Facade, a Remote Client, the New Order Service, the Order Event and a Legacy Sync Consumer.
All the changes behind the Strangler Facade are completely transparent to the clients!
Keeping the legacy system up to date for some time will be necessary due to reporting and other use cases!
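A Legacy Sync Consumer could look roughly like the sketch below. Everything in it is an assumption for illustration: the event shape, the event type names and the in-memory array that stands in for the legacy orders table. It shows one property such a consumer usually needs, namely that a redelivered event must not overwrite newer state.

```php
<?php
declare(strict_types=1);

// Hypothetical event shape: ['id' => string, 'type' => string, 'orderId' => string].
final class LegacySyncConsumer
{
    /** @var array<string, true> ids of events that were already applied */
    private array $processed = [];

    /** @param array<string, string> $legacyOrders stand-in for the legacy orders table */
    public function __construct(private array $legacyOrders = []) {}

    public function handle(array $event): void
    {
        if (isset($this->processed[$event['id']])) {
            return; // duplicate delivery, already applied
        }
        $status = match ($event['type']) {
            'OrderCreated' => 'new',
            'OrderCancelled' => 'cancelled',
            default => null, // unknown events are ignored
        };
        if ($status !== null) {
            $this->legacyOrders[$event['orderId']] = $status;
        }
        $this->processed[$event['id']] = true;
    }

    public function statusOf(string $orderId): ?string
    {
        return $this->legacyOrders[$orderId] ?? null;
    }
}
```

In a real system the set of processed event ids would have to be stored durably, ideally in the same transaction as the legacy update.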
5. Finish the migration and go back to step 4
Image: Diagram of Use Cases A to E with the Strangler Facade, the Feature Flag Aware Decorator, the Feature Flag Repository, the New Order Service and the Legacy System.
The feature flag can control access per use case, and also enable post-deployment testing, e.g. through development cookies or other methods.
Each use case is switched over gradually. This limits the risk and the blast radius, since we can enable the new path for only a very small part of the traffic at first.
WARNING! Splitting traffic between use cases might be harder than you think. The coupling often poses big challenges.
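The Feature Flag Aware Decorator can be sketched as another implementation of the facade that wraps the legacy and the new implementation. This is a simplified illustration under assumptions: the `OrderFacade` here is reduced to a single `cancel()` method that returns the name of the system that handled the call, and the flag repository with its percentage rollout, the use case name `cancel_order` and the bucketing by `crc32()` are hypothetical.

```php
<?php
declare(strict_types=1);

// Simplified stand-in for the OrderFacade; the return value names the handling system.
interface OrderFacade
{
    public function cancel(string $orderId): string;
}

final class LegacyOrderFacade implements OrderFacade
{
    public function cancel(string $orderId): string { return 'legacy'; }
}

final class NewOrderServiceFacade implements OrderFacade
{
    public function cancel(string $orderId): string { return 'new'; }
}

// Hypothetical flag store: use case => percentage of traffic (0-100) for the new service.
final class FeatureFlagRepository
{
    /** @param array<string, int> $rollout */
    public function __construct(private array $rollout) {}

    public function isEnabled(string $useCase, string $key): bool
    {
        $percentage = $this->rollout[$useCase] ?? 0;
        // Stable bucket per key, so the same order always takes the same path.
        return abs(crc32($useCase . ':' . $key)) % 100 < $percentage;
    }
}

final class FeatureFlagAwareOrderFacade implements OrderFacade
{
    public function __construct(
        private OrderFacade $legacy,
        private OrderFacade $newService,
        private FeatureFlagRepository $flags,
    ) {}

    public function cancel(string $orderId): string
    {
        $target = $this->flags->isEnabled('cancel_order', $orderId)
            ? $this->newService
            : $this->legacy;
        return $target->cancel($orderId);
    }
}
```

Raising the percentage for one use case moves more of its traffic to the new service, and setting it back to 0 is the rollback.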
Success Patterns
- Start with creating a new target picture based on strategic and economic factors. For us Wardley Maps, Strategic DDD and Team Topologies worked.
- Realign the organisation around the new system architecture. Inverse-Conway Maneuvers worked for us, but it's not for everyone since it is a huge social perturbation.
- Measure & analyze the code, the behaviors and the risks before diving into implementation. You'll often find surprises!
- Design your new APIs using collaborative modelling and modern approaches. In the end, that's what modernization is all about.
- Leverage observability platforms and other tooling to help you find the entry points to the use cases you want to tackle - by analyzing the traces from the DB to the code entry-points.
- Tackle each of the entry-points with the newly designed Strangler Facade. We used the Transactional Outbox pattern and asynchronous event integration successfully.
- Continue while it's valuable for the organisation, but constantly inspect and adapt.
Thank You!
We're Hiring!
Slides online
Strangling the Monolith - Applied Patterns & Practices From The Trenches @ Lightweight Java Usergroup Munich 2022
By Thomas Ploch
Many developers have been there. A beautifully crafted system that has helped the company grow slowly deteriorates into a big ball of mud. The calls for a rewrite are becoming louder and louder, but these big-bang rewrites historically have a high risk of failure. Is there a way to iteratively step out of the mud? We at Flixbus faced a similar situation and were looking for answers, and we found one: the strangler (fig) pattern, which allowed us to incrementally evolve even our high-risk core services towards a modern reactive system architecture. This sounds easy, but our path was paved with hard learning and many failures. In this session I will present the patterns and practices that helped us on our journey and the many pitfalls we found along the way.