COMP2511
🎨 9.1 - Risk Engineering
In this lecture
- What is risk in Software Engineering?
- Mitigating risk
- Designing for Risk
The Flaw in the Plan
- Why do plans (designs) not go according to plan? What went wrong?
- Flaws in the implementation / execution of the plan/design
- Flaws in the design/plan itself
- We can't always plan for everything up front
- Design flaws are often hard to spot; risk is invisible
- We can only detect them through design smells / red flags
- Over time, we learn to become better at recognising warning signs and identifying flaws earlier on
- It's not what happened right before things went wrong that was the problem - it is what happened every step along the way that got us to that point
Design Debt, or Design Risk?
- Risk - the probability of a bad outcome occurring
- Design decisions come with a cost - "technical debt". The more technical debt we take on, the more risk we accumulate
- Greater software complexity leads to more risk
- The design decisions and trade-offs we make are often the flaws in the plan - risks are inevitable
- How does this manifest itself?
- Design problems often build in a "slow burn" fashion
- Incidents, defects, bugs
- Resistance to changes in software
- These in turn present Business Risks
Mitigating Risk
- Risks are centred around events, e.g. software breaking.
- Risk is often assessed in terms of probability and impact
- Mitigations of probability
  - Preventative measures that lower the chance of a bad outcome occurring
  - E.g. Looking both ways before crossing the street
- Mitigations of impact
  - Reactive measures that decrease the negative outcome in the event that something bad does occur
  - E.g. Wearing a bike helmet
- This is often termed Quality Assurance
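One way to make the probability / impact split concrete is to score a risk as probability multiplied by impact (an expected loss). The sketch below assumes that scoring convention. The `Risk` class, its `expected_loss` method and all of the numbers are made up for illustration and are not from the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """A risk event with a chance of occurring and a cost if it does."""
    probability: float  # chance of the bad outcome, between 0 and 1
    impact: float       # cost of the bad outcome, in arbitrary units

    def expected_loss(self) -> float:
        # Assumed scoring convention: probability multiplied by impact.
        return self.probability * self.impact


# Hypothetical baseline: a 20% chance of an outage costing 1000 units.
baseline = Risk(probability=0.2, impact=1000.0)

# Mitigating probability (preventative), e.g. more checks before release.
prevented = Risk(probability=0.05, impact=1000.0)

# Mitigating impact (reactive), e.g. a fast rollback that limits the damage.
contained = Risk(probability=0.2, impact=250.0)

for label, risk in [("baseline", baseline),
                    ("prevented", prevented),
                    ("contained", contained)]:
    print(f"{label}: expected loss {risk.expected_loss():.1f}")
```

Under these invented numbers both mitigations reduce the score by the same amount, but they act on different factors: one lowers how often the event happens, the other lowers how much it costs when it does.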
How do we design for risk?
Designing for Risk: Swiss Cheese Model
- James Reason - Major accidents and catastrophes reveal multiple, smaller failures that allow hazards to manifest as risks
- Each slice of cheese represents a barrier, each one of which can prevent a hazard from turning into consequences
- No single barrier is foolproof - each slice of cheese has "holes"
- When the holes all align, a risk event manifests as negative consequences
- Taking a layered approach to Software Safety
- Testing at multiple levels:
- Static verification
- Unit and integration tests
- Usability tests
- Design and code reviews
- CI pipelines
- Sometimes referred to as containment barriers
- A defensive approach; multiple checks and balances in place
- Probability is multiplicative: if the barriers fail independently, P(X AND Y AND Z) = P(X) * P(Y) * P(Z), where X, Y and Z are the events of each barrier failing
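The multiplicative rule can be sketched in a few lines. The miss rates below are invented for illustration, the function name `escape_probability` is hypothetical, and the calculation only holds if the barriers fail independently of one another.

```python
import math


def escape_probability(miss_rates):
    """Probability that a defect slips past every barrier.

    Assumes each barrier misses independently of the others.
    """
    return math.prod(miss_rates)


# Hypothetical miss rates for three barriers: linter, tests, code review.
miss_rates = [0.5, 0.3, 0.2]
print(escape_probability(miss_rates))
```

Even though each barrier in this example is individually weak, the chance of a defect passing all three is much smaller than the chance of passing any one of them. If the holes are correlated (for example, the reviewer trusts the tests and skips the same cases), the real figure will be worse than the product suggests.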
Designing for Risk: Shifting Left
A waterfall / big design up front approach to quality assurance.
Shift Left: A practice intended to find and prevent problems early in the engineering process.
- Shifting Left in principle: Moving risk forward in the software development timeline and designing systems and processes that are built for continuous testing
- What does shifting left involve in practice?
  - Automated testing over manual testing
  - Continuous Integration
  - Test-Driven Development
Shifting Left: An Example
- Let's take an example - a Python script which runs on a remote server
- There is an error in the code, and the code fails when attempting to run a usability test
$ python3 -m svc.create_repo test
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/nicholaspatrikeos/Desktop/COMP2511-22T3/administration/svc/create_repo.py", line 11, in <module>
    PROJECT = gl.projects.get(f'{NAMESPACE}/{TERM}/STAFF/repos/{REPO}')
NameError: name 'REPO' is not defined
- How could we shift left here?
Shifting Left: Dynamic Verification + CI
- We can dynamically verify the correctness of the code and automatically run the tests in a pipeline:
$ pytest
============================= test session starts ==============================
platform darwin -- Python 3.8.8, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /Users/nicholaspatrikeos/Desktop/COMP2511-22T3/administration
plugins: hypothesis-6.1.1, xdist-2.2.1, timeout-1.4.2, forked-1.3.0
collected 1 item
create_repo_test.py F [100%]
- Problem here - we still have to run our tests to pick up a simple name error, which is a slow way to catch a small problem
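The reason dynamic verification is needed to catch this bug is that Python only resolves a name when the line using it actually executes. The sketch below mirrors the failing expression from the traceback. `build_project_path`, `error_from_running` and the placeholder arguments are hypothetical stand-ins, not the real `create_repo.py`.

```python
def build_project_path(namespace, term):
    # REPO is deliberately left undefined to mirror the bug in the traceback.
    return f"{namespace}/{term}/STAFF/repos/{REPO}"


def error_from_running():
    """Run the code and report the error message, or None if it succeeds."""
    try:
        build_project_path("NAMESPACE", "TERM")  # placeholder arguments
    except NameError as err:
        return str(err)
    return None


# Defining build_project_path raised nothing; only calling it exposes the bug.
print(error_from_running())
```

A test suite finds the problem only if some test reaches the offending line, which is why the whole suite has to run before the error shows up.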
Shifting Left: Static Verification + CI
- We can statically verify the correctness of our code using a linter or a type checker, which is faster than running all the tests:
$ pylint svc/*.py
************* Module svc.create_repo
svc/create_repo.py:11:64: E0602: Undefined variable 'REPO' (undefined-variable)
svc/create_repo.py:19:31: E0602: Undefined variable 'REPO' (undefined-variable)
svc/create_repo.py:27:24: E0602: Undefined variable 'REPO' (undefined-variable)
-------------------------------------------------------------------
Your code has been rated at 9.61/10 (previous run: 10.00/10, -0.39)
- Problem here - we still have to push to the CI for our breaking changes to be contained. Can we enforce running these checks before pushing?
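To give a feel for how a tool can report an undefined name without running the program, here is a toy checker built on Python's standard `ast` module. It is a simplified sketch and not how pylint is implemented: it ignores scoping and ordering, and misses several ways a name can be bound. The function `undefined_names` and the `SOURCE` snippet are made up for illustration.

```python
import ast
import builtins


def undefined_names(source):
    """Return names that are read somewhere but never bound anywhere.

    Toy analysis: treats the whole file as one scope and ignores ordering.
    """
    tree = ast.parse(source)  # parse only; the source is never executed
    defined = set(dir(builtins))
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:
                defined.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
    return sorted(used - defined)


# Hypothetical snippet with the same mistake as the lecture example.
SOURCE = '''
NAMESPACE = "namespace"
TERM = "term"
path = f"{NAMESPACE}/{TERM}/STAFF/repos/{REPO}"
'''

print(undefined_names(SOURCE))
```

Because the checker only parses the file, it finishes quickly and does not need the remote server, credentials or test data that actually running the script would require.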
Shifting Left: Local Configurations
- Pre-commit hooks and IDE tools can give us more friendly experiences that detect these problems earlier in the development loop
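As a rough sketch of a pre-commit hook, the script below lints only the staged Python files and returns a non-zero exit code when pylint reports errors, which causes git to abort the commit. It assumes `git` and `pylint` are on the PATH. The function names `staged_python_files` and `run_hook` are hypothetical, and the `run` parameter exists only so the logic can be exercised without invoking pylint.

```python
import subprocess


def staged_python_files():
    """Ask git for staged (added, copied or modified) files; keep .py ones."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [name for name in out.splitlines() if name.endswith(".py")]


def run_hook(files, run=subprocess.call):
    """Return the exit code the hook should finish with."""
    if not files:
        return 0  # nothing to lint, so allow the commit
    # --errors-only limits the report to error-level messages.
    return run(["pylint", "--errors-only", *files])
```

To use it as a real hook, the file would be saved as `.git/hooks/pre-commit`, made executable, given a Python shebang line, and finished with `sys.exit(run_hook(staged_python_files()))` after importing `sys`.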
- Ideally, static verification is "baked in" to our programming language rather than added on...
Shifting Left: Type Safety
- Types are statically verifiable - meaning that we can ensure correctness earlier on in the development process, shifting left
- In Java, code that doesn't adhere to the rules of the type system fails to compile - a significant containment barrier
- Tools like mypy (for Python) and TypeScript (for JavaScript) add type checking on top of a dynamically typed language
- Unlike Java, however, type safety wasn't part of the Big Design Up Front for Python and JS
- Modern software design is favouring statically typed languages for these reasons
def my_function(message):
    if message == 'hello':
        return 1  # returns an int on this path
    return '0'  # but a str on this one

result = my_function('goodbye')  # result is the str '0', not an int
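Adding type annotations to the function above gives a checker such as mypy enough information to reject the inconsistent return before the code runs. The annotated version below is a sketch. The choice of `int` as the intended return type is an assumption, since the original does not say which of the two types was meant.

```python
def my_function(message: str) -> int:
    if message == 'hello':
        return 1
    # A static type checker would flag this line: a str is returned where
    # the annotation promises an int.
    return '0'


# Python itself does not enforce annotations, so this still runs.
result = my_function('goodbye')
print(type(result).__name__)
```

The annotations change nothing at runtime. They only become a containment barrier once a type checker is run over the file, for example in a pre-commit hook or CI pipeline.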
- Features of type systems:
  - Ability to define custom types (typedefs)
  - Inheritance, Subtypes and Supertypes
  - Interfaces
  - Generics
  - Unit types
  - Enums
- Well-designed type systems allow us to verify more of our code statically
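A few of these features can be sketched with Python's `enum` and `typing` modules, which a checker such as mypy understands. All of the class names here (`Severity`, `Check`, `LengthCheck`, `Pipeline`) are invented for illustration, and the 80-character threshold is arbitrary.

```python
from enum import Enum
from typing import Generic, List, Protocol, TypeVar


class Severity(Enum):
    """Enum: a closed set of allowed values."""
    LOW = 1
    HIGH = 2


class Check(Protocol):
    """Interface: anything with a matching run() method conforms."""
    def run(self, source: str) -> Severity: ...


class LengthCheck:
    """A hypothetical check that flags long source text."""
    def run(self, source: str) -> Severity:
        return Severity.HIGH if len(source) > 80 else Severity.LOW


T = TypeVar("T")


class Pipeline(Generic[T]):
    """Generic: one container definition reused for any item type."""
    def __init__(self) -> None:
        self.items: List[T] = []

    def add(self, item: T) -> None:
        self.items.append(item)


# The annotation says this pipeline may only hold objects that satisfy Check.
pipeline: Pipeline[Check] = Pipeline()
pipeline.add(LengthCheck())
results = [check.run("print('hi')") for check in pipeline.items]
print(results)
```

With these declarations a type checker can reject, before the program runs, an attempt to add something without a `run` method to the pipeline or to compare a `Severity` against an unrelated value.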
Shifting Left: More Static Verification & Design by Contract
- Some programming languages (e.g. Dafny) allow for more static verification than just type checking - they can prove or disprove code according to a declarative contract where preconditions, postconditions and invariants are specified
- Dafny makes use of a theorem prover which checks how well the implementation matches the specification (contractual correctness)
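The shape of a contract can be approximated in Python with assertions, as in the sketch below. This is only a runtime approximation: the assertions check the inputs that actually get executed, whereas a verifier like Dafny attempts to prove the contract for all inputs before the program runs. The function `integer_sqrt` is a made-up example.

```python
def integer_sqrt(n: int) -> int:
    """Return the largest integer whose square does not exceed n."""
    # Precondition: the caller must supply a non-negative number.
    assert n >= 0, "precondition violated: n must be non-negative"

    root = 0
    while (root + 1) * (root + 1) <= n:
        root += 1
        # Loop invariant: root squared never exceeds n.
        assert root * root <= n

    # Postcondition: root is the floor of the square root of n.
    assert root * root <= n < (root + 1) * (root + 1)
    return root


print(integer_sqrt(10))
```

The precondition, invariant and postcondition here play the same roles as the corresponding clauses in a declarative contract. The difference is when a violation is discovered.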
Summary
- Risk forms a large part of modern-day Software Engineering
- Designing for risk:
  - Considering risks in the design process;
  - Designing processes to accommodate risk.
- Murphy's law: Anything that can go wrong, will.
COMP2511 23T2 - Risk Engineering
By npatrikeos