Writing Robust Systems
Dom Finn, Lead Developer at UNiDAYS
@cleverfinn
UNiDAYS
Leading global student marketing provider
Agenda
- Excuses
- Diatribe
- Hyperbole
- Dissent
- Questions
- Pub
What This Isn't...
- CAP Theorem
- SOLID / General good design
- CQRS / Architecture
Assumptions
- You've tested it
- Your domain makes sense
Robustness
- What is Robustness
- Designing for failure
- Risk Evaluation
- Implementing for failure
- Panic
What Is Robustness
Robust
/roh-buhst/ - adjective
Strong and effective in all or most situations and conditions
Murphy's Law
- Things break all the time
- Don't deny it
- Embrace it
- Design for it
Graceful Degradation
It's Not This
try
{
    businessCriticalThing1();
    businessCriticalThing2();
}
catch (BadDevelopmentPractice b)
{
    b.sweepUnder(rug);
}
Graceful Degradation
Progressive Enhancement
Failure Handling
- Is a business concern
- Not a decision devs should be making (alone)
- Involves UX, Dev, Ops, RoB
Code-Issue Failure
Who Cares?
Code-Issue Failure
Accounts - Code Management
Ops - Code Management
Support - On-Site Known Issues
Support - Support Team
Support - Social Team
Legal - SLAs
Commercial Analytics - Data loss
Business - KPI Impact
Fail-Well
- Fail-Safe
- Fail-Secure
- Fail-Planned
Checklist
- What can fail?
- How many ways can it fail?
- How can we (partially) recover?
- What is the business impact?
- What stakeholders are affected?
- What work does a failure generate?
What Can Fail?
Everything (else)
CAP
What Can Fail?
Side Effect 1
Side Effect 2
Side Effect 3
Action
How many ways can that code be written / fail?
(Combinatorics)
MVC Example
public ActionResult UpdateUser(UserViewModel viewModel)
{
var user = userRepository.Get(viewModel.Id);
user.Email = viewModel.email;
user.Password = viewModel.password;
emailService.SendAccountUpdatedEmail(user);
userRepository.Update(user);
reportService.RecordEvent(new UserUpdatedEvent(user));
return Redirect("/user-updated");
}
Combinatorics
1) Email | 2) Update | 3) Report | Acceptable? |
---|---|---|---|
Pass | Pass | Pass | Yes |
Pass | Pass | Fail | Yes |
Pass | Fail | - | No |
Fail | - | - | No |
50%
1) Update | 2) Report | 3) Email | Acceptable? |
---|---|---|---|
Pass | Pass | Pass | Yes |
Pass | Pass | Fail | Yes |
Pass | Fail | - | Yes |
Fail | - | - | No |
75%
1) Email | 2) Report | 3) Update | Acceptable? |
---|---|---|---|
Pass | Pass | Pass | Yes |
Pass | Pass | Fail | No |
Pass | Fail | - | No |
Fail | - | - | No |
25%
1) Update | 2) Email | 3) Report | Acceptable? |
---|---|---|---|
Pass | Pass | Pass | Yes |
Pass | Pass | Fail | Yes |
Pass | Fail | - | Yes |
Fail | - | - | No |
75%
1) Report | 2) Email | 3) Update | Acceptable? |
---|---|---|---|
Pass | Pass | Pass | Yes |
Pass | Pass | Fail | No |
Pass | Fail | - | No |
Fail | - | - | No |
25%
1) Report | 2) Update | 3) Email | Acceptable? |
---|---|---|---|
Pass | Pass | Pass | Yes |
Pass | Pass | Fail | Yes |
Pass | Fail | - | No |
Fail | - | - | No |
50%
MVC Example
public ActionResult UpdateUser(UserViewModel viewModel)
{
var user = userRepository.Get(viewModel.Id);
user.Email = viewModel.email;
user.Password = viewModel.password;
+ userRepository.Update(user);
emailService.SendAccountUpdatedEmail(user);
- userRepository.Update(user);
reportService.RecordEvent(new UserUpdatedEvent(user));
return Redirect("/user-updated");
}
How Can We Recover?
- Redrive / Retry
- Deadletter
- Blackhole / Ignore
- Explode
Redrive / Retry
Requires:
- Atomicity
- Idempotence
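A minimal sketch of why a redrive needs idempotence. The names `IdempotentEmailSender` and `Redrive` are hypothetical and not from the talk, and the in-memory set stands in for whatever durable store a real system would use. The scenario: the side effect succeeds, the caller does not hear about it, and the operation is redriven.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sender: remembers handled operation IDs so a redrive cannot send twice.
public sealed class IdempotentEmailSender
{
    private readonly HashSet<Guid> handled = new HashSet<Guid>();

    public int EmailsSent { get; private set; }

    public void Send(Guid operationId, string address)
    {
        // HashSet.Add returns false when the ID was already recorded.
        // In a real system, recording the ID and sending must be atomic.
        if (!handled.Add(operationId)) return;
        EmailsSent++; // stand-in for the real side effect
    }
}

public static class Redrive
{
    // Calls attempt until it reports success or maxAttempts is reached.
    public static bool Run(Func<bool> attempt, int maxAttempts)
    {
        for (var i = 0; i < maxAttempts; i++)
        {
            if (attempt()) return true;
        }
        return false;
    }
}
```

Because `Send` ignores an operation ID it has already seen, the caller can redrive the whole operation without working out which side effects already happened.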
Side Effect 1
Side Effect 2
Side Effect 3
Action
Redrive / Retry
Action
Side Effect 1
Side Effect 2
Side Effect 3
Forgotten Password
Critical Path
(Immediately Consistent Path)
Critical Paths
- Determine Critical Paths
- Remove non-critical actions from CP
- Determine recovery strategy for (non)critical actions
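One way to read the steps above, as a sketch only: keep the user update on the critical path and push the email and report onto a queue to be processed (and retried) elsewhere. `UpdateUserHandler`, the delegate, and the string work items are hypothetical stand-ins, not the talk's design.

```csharp
using System;
using System.Collections.Generic;

public sealed class UpdateUserHandler
{
    private readonly Func<string, bool> updateUser;                 // critical action
    private readonly Queue<string> deferred = new Queue<string>();  // non-critical work

    public UpdateUserHandler(Func<string, bool> updateUser)
    {
        this.updateUser = updateUser;
    }

    public int DeferredCount => deferred.Count;

    // Returns true only when the critical action succeeded.
    public bool Handle(string userId)
    {
        // Critical path: if the update fails, nothing else is attempted.
        if (!updateUser(userId)) return false;

        // Off the critical path: queued for a worker with its own recovery strategy.
        deferred.Enqueue("email:" + userId);
        deferred.Enqueue("report:" + userId);
        return true;
    }
}
```

A failing email or report can then no longer fail the request; each queued item gets whichever recovery strategy was chosen for it.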
Blackhole
- Does it matter if Side Effect X doesn't work?
- Can be a quick, interim solution
Explode
- Do it nicely
- 500 vs 503 vs 404 vs 403
- Help the user recover / get help
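A small sketch of picking a status code deliberately rather than returning 500 for everything. The `FailureKind` enum and its members are hypothetical; the status code meanings are the standard HTTP ones.

```csharp
public enum FailureKind { NotFound, Forbidden, DependencyUnavailable, Bug }

public static class FailureStatus
{
    public static int ToHttpStatus(FailureKind kind)
    {
        switch (kind)
        {
            case FailureKind.NotFound: return 404;              // resource does not exist
            case FailureKind.Forbidden: return 403;             // caller is not allowed
            case FailureKind.DependencyUnavailable: return 503; // temporary, retry later
            default: return 500;                                // unexpected: a bug on our side
        }
    }
}
```

A 503 tells the client that trying again later is reasonable, while a 500 signals something that needs fixing on the server.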
No!
What work does a failure generate?
- Support Emails / Tweets
- Account Management
- Data Cleansing
- Reporting alterations
- Billing alterations
What problems does a failure hide?
- Data corruption
- Conversion failures
- Loss of revenue
- User disengagement
Help API Consumers Succeed, Not Fail
Encapsulation
try, try, try again
MVC Example
public ActionResult UpdateUser(UserViewModel viewModel)
{
    var user = userRepository.Get(viewModel.Id);
    user.Email = viewModel.email;
    user.Password = viewModel.password;
    userRepository.Update(user);
    emailService.SendAccountUpdatedEmail(user);
    reportService.RecordEvent(new UserUpdatedEvent(user));
    return Redirect("/user-updated");
}
MVC Example
try
{
    var user = userRepository.Get(resource.Id);
    try
    {
        userRepository.Update(user);
        try { emailService.SendAccountCreatedEmail(user); } catch { }
        try { reportService.RecordEvent(new UserUpdatedEvent(user)); } catch { }
    }
    catch { }
}
catch { }
try { } catch {}
- Is a hack*
- Bad design, equivalent to GOTO
- Propagation hard to follow
- Indication of developer ignorance
- Illegible
*mostly
When did you last write a try/catch and consult your PO / BA / Stakeholder?
try { } catch {}
try
{
    // do stuff
}
catch
{
    // just in case lolol
}
1) Bury your head
try { } catch {}
try
{
    // do stuff
}
catch (Exception e)
{
    // no idea what will be thrown
}
2) Catch all the things
try { } catch {}
try
{
    // do stuff
}
catch
{
    throw;
}
3) Hot potato
Exceptions
There's a time and a place...
(Probably not when and where you’re using them)
Cue sweeping generalisations...
Things that might / could / should throw
- Low Level APIs
- I/O
- Disk
- Network
- Third Party Code (Including BCL)
(Most of this doesn’t need to, but does)
Types of throw;
Development vs production
To throw or not to throw?
- Protect APIs with helpful guard clauses
- Don’t expect these to throw in production
- Log every unhandled exception and turn it into a Bug Ticket
- Pipe them into your bug tracker!
- Don’t try and recover*
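A sketch of a guard clause in the sense of the list above: it fails loudly with a helpful message during development and is not expected to fire in production. `UserGuards.RequireEmail` and its deliberately crude '@' check are illustrative assumptions, not part of the talk.

```csharp
using System;

public static class UserGuards
{
    // Returns the value unchanged when it passes; throws with a clear message otherwise.
    public static string RequireEmail(string email)
    {
        if (email == null)
            throw new ArgumentNullException(nameof(email));
        if (!email.Contains("@"))
            throw new ArgumentException("Expected an email address containing '@'.", nameof(email));
        return email;
    }
}
```

If one of these ever throws in production, the calling code has a bug, so the exception should reach the log and the bug tracker rather than be caught and recovered from.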
Basic Example
Guid ConvertToGuid(string guid)
{
return Guid.Parse(guid);
}
No
Guid ConvertToGuid(string guid)
{
try
{
return Guid.Parse(guid);
}
catch
{
return Guid.Empty;
}
}
No
Guid ConvertToGuid(string guid)
{
    Guid g;
    if (Guid.TryParse(guid, out g))
        return g;
    return Guid.Empty;
}
Yes
MVC Example
try
{
    var user = userRepository.Get(resource.Id);
    try
    {
        userRepository.Update(user);
        try { emailService.SendAccountCreatedEmail(user); } catch { }
        try { reportService.RecordEvent(new UserUpdatedEvent(user)); } catch { }
    }
    catch { }
}
catch { }
Solution
Enforce Boundaries
Solutions
public enum ExecutionResult
{
    Success = 1,
    Failure = 2
}

public sealed class ExecutionResult<T>
{
    public T Data;
    public ExecutionResult Result;
}
Solutions
Before
User IUserRepository.Get(Guid id);
void IUserRepository.Update(User user);
void IEmailService.SendWelcomeEmail(User user);
void IReportService.RecordEvent<TEvent>(TEvent @event);
After
ExecutionResult<User> IUserRepository.Get(Guid id);
ExecutionResult IUserRepository.Update(User user);
ExecutionResult IEmailService.SendWelcomeEmail(User user);
ExecutionResult IReportService.RecordEvent<TEvent>(TEvent @event);
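The slides do not show how an implementation produces these results. One possible sketch, under the assumption that the boundary translates a single expected I/O exception into a `Failure` and lets everything else propagate as a bug: the `Boundary.Run` helper is hypothetical, and the two result types are repeated from the slide only so the example is self-contained.

```csharp
using System;
using System.IO;

public enum ExecutionResult { Success = 1, Failure = 2 }

public sealed class ExecutionResult<T>
{
    public T Data;
    public ExecutionResult Result;
}

public static class Boundary
{
    // Wraps one I/O call at the edge of the system.
    // Only IOException is treated as an expected failure; anything else still throws.
    public static ExecutionResult<T> Run<T>(Func<T> io)
    {
        try
        {
            return new ExecutionResult<T> { Data = io(), Result = ExecutionResult.Success };
        }
        catch (IOException)
        {
            return new ExecutionResult<T> { Result = ExecutionResult.Failure };
        }
    }
}
```

The single try/catch lives at the boundary, so callers further in can branch on `Result` instead of nesting their own.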
MVC Example
var userResult = userRepository.Get(resource.Id);
if (userResult.Result == ExecutionResult.Failure)
    return ErrorResult.ServiceUnavailable();

var user = userResult.Data;
var result = userRepository.Update(user);
if (result == ExecutionResult.Failure)
    return ErrorResult.ServiceUnavailable();

emailService.SendAccountUpdatedEmail(user); // ignore result status
reportService.RecordEvent(new UserUpdatedEvent(user)); // ignore result status
Other Anti-patterns
// double check, for safety
var thing = getThing(thingId);
// just in case, lol!
if (thing == null)
    return;
thing.doThing();
// null?
var thing = getThing(thingId);
var things = getThings();
Panic!
Summary
- Progressive Enhancement instead of Graceful Degradation
- Let code fail under undesigned conditions
- Write APIs that help you succeed
- Consult business when handling/designing failure
Writing Robust Systems
By trullock