A tried-and-tested workflow for software quality assurance

Mark Woodbridge, Mayeul d’Avezac, Jeremy Cohen

Research Computing Service

Imperial College London

Preparation

Import/launch RSE18 VM from USB key
Open a terminal and `cd woodbridge`
Open a browser and visit slides.com/mwoodbri/rse18

Annotations on these slides:
- ⌨️ = Hands-on exercise
- Text in corner of slide = A git branch
  - ```
    git checkout <branch>
    ```

Purpose of this workshop

Demonstrate a set of tools and a workflow that can be used to automate some valuable quality checks for Python software projects

You will leave with:

Transferable knowledge
A working, automated QA setup
A reusable template for your own projects

Purpose of this workshop 2

Why is this important?

Software sustainability:

Reliability: correctness, robustness, performance
Collaboration: accessibility, efficiency

Ultimately:

Save you time (and potentially some embarrassment)
Make your project more attractive to users/developers

Purpose of this workshop 3

We are not aiming to:

Provide an in-depth guide to any single tool
- See e.g. Matt Williams’ workshop on pytest
Thoroughly cover languages other than Python...
- ...though the approach is transferable and we will discuss alternatives in other ecosystems
Address reproducibility, how to structure your code/projects, how to test user interfaces…

Agenda

Introduction
Getting started
Coding standards
Testing: Basics, Coverage, Fixtures
Static analysis: Linting, Type checking
Automation
Advanced topics
Discussion

Introduction

Our toy project is Conway’s Game of Life

Cellular automaton
“Zero-player game”
Grid of cells that are either on or off
Survival if 2 or 3 neighbours
Birth if 3 neighbours
We'll implement using neighbour counting
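The rules above can be sketched in plain Python. This is only an illustration: the workshop code uses SciPy and NumPy instead, the nested-list representation is an assumption made for this example, and cells beyond the edge are treated as off.

```python
def count_neighbours(board):
    """Count the live neighbours of every cell; cells beyond the edge count as off."""
    rows, cols = len(board), len(board[0])
    counts = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if (dr, dc) == (0, 0):
                        continue  # a cell is not its own neighbour
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        counts[r][c] += board[rr][cc]
    return counts


def step(board):
    """Apply the birth and survival rules once and return the new board."""
    counts = count_neighbours(board)
    return [
        [
            # Birth or survival with 3 neighbours; survival only with 2.
            1 if n == 3 or (cell == 1 and n == 2) else 0
            for cell, n in zip(row, count_row)
        ]
        for row, count_row in zip(board, counts)
    ]


blinker = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(step(blinker))  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

A horizontal "blinker" becomes vertical after one step and returns to its starting state after two.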

Getting started

1. Launch the VM, open a terminal and switch directory:

cd woodbridge

2. Get the most up-to-date copy of this tutorial:

git fetch
git reset --hard origin/master

3. Install and launch Atom (password "workshops"):

sudo apt-get install -y atom
atom .

Alternative: VS Code

Branch: master

⌨️

Getting started 2

We're using Python 3 (provided by Ubuntu)
We have pre-installed:
- git via apt
- ide-python and linter-mypy Atom packages via apm
  - Their Python dependencies via pip
    - pyls-black/pyls-isort, mypy
- Full details here
And cloned:
- github.com/ImperialCollegeLondon/RSE18

Coding standards

Strict (automated) code formatting results in:

Consistency within projects
Consistency between projects
- Python has (mostly) clear guidelines, unlike some other languages (see PEP 8, PEP 257)
Less bikeshedding
Improved readability (and hopefully quality)
Basic syntactic verification
Time savings

Coding standards 2

We're using the black formatter
Describes itself as "uncompromising"
Numerous integrations
- Atom, VS Code, Jupyter...
Alternative: yapf
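A formatter changes layout, not meaning. One way to convince yourself of this is to compare syntax trees before and after. The sketch below uses only the standard library; the `tidy` string is a hand-written tidy version for illustration and is not guaranteed to match black's exact output.

```python
import ast

messy = """from scipy import signal
def count_neighbours(board):
    return signal.convolve2d( board ,
        [[1, 1, 1], [1, 0, 1], [1, 1, 1]], mode ='same')
"""

tidy = '''from scipy import signal


def count_neighbours(board):
    return signal.convolve2d(
        board, [[1, 1, 1], [1, 0, 1], [1, 1, 1]], mode="same"
    )
'''


def same_program(source_a, source_b):
    """Return True if both sources parse to the same syntax tree."""
    # ast.dump omits line and column numbers by default, so only structure counts.
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


print(same_program(messy, tidy))  # True
```

Whitespace, line breaks and quote style all differ between the two strings, yet the parsed programs are identical.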

Coding standards 3

1. Paste the following (poorly formatted) code into a new file named `life.py`:

from scipy import signal
def count_neighbours(board):
    """Return an array of neighbour counts for each element of `board`"""
    return signal.convolve2d( board , 
        [[1, 1, 1], [1, 0, 1], [1, 1, 1]], mode ='same')

2. Save and observe automatic reformatting

Branch: formatting

⌨️

Testing: Basics

Testing is crucial to software engineering
Bedrock of correct, robust code
An enormous topic in itself
We won't cover functional testing, regression testing...
Nor details of fixtures, mocking etc
Or how to write effective tests!

Testing: Basics 2

1. Create a new file `requirements-dev.txt`:

-r requirements.txt
pytest==3.7.4

2. Update installed packages:

pip install -r requirements-dev.txt

Alternative: unittest

⌨️

Testing: Basics 3

1. Create a new file `test_life.py`:

from life import count_neighbours

def test_count_neighbours():
    board = [
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    assert count_neighbours(board).tolist() == [
        [0, 0, 0, 0, 0],
        [1, 2, 3, 2, 1],
        [1, 1, 2, 1, 1],
        [1, 2, 3, 2, 1],
        [0, 0, 0, 0, 0],
    ]

2. Run `pytest` and ensure your test passes

Branch: testing

⌨️

Testing: Coverage

Any testing is almost certainly better than none
But how thorough is your test suite?
One metric: run the tests and see how much of your code they exercise
This is the idea behind test coverage
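The idea can be sketched with the standard library's tracing hook. This is a toy line recorder, not how pytest-cov is implemented; `classify` and `executed_lines` are hypothetical names used only for this illustration.

```python
import sys


def classify(n):
    """Toy function with two branches to exercise."""
    if n < 0:
        return "negative"
    return "non-negative"


def executed_lines(func, *args):
    """Run func(*args) and return the set of line numbers executed inside func."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        # Record every "line" event that belongs to the function under test.
        if frame.f_code is code and event == "line":
            seen.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore the previous tracer
    return seen


first = classify.__code__.co_firstlineno
only_positive = executed_lines(classify, 5)
both = only_positive | executed_lines(classify, -5)
print(len(only_positive), len(both))  # 2 3
```

A single test with a positive input never reaches the `return "negative"` line; adding a negative input closes the gap, which is what a coverage report makes visible.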

Testing: Coverage 2

1. Update `requirements-dev.txt`:

-r requirements.txt
pytest==3.7.4
pytest-cov==2.5.1

2. Update installed packages:

pip install -r requirements-dev.txt

3. Create `pytest.ini`:

[pytest]
addopts = --cov=life --cov-report term-missing

4. Run `pytest` and observe statistics

⌨️

Testing: Coverage 3

1. Append to `life.py`:

def step(board):
    """Return a new board corresponding to one step of the game"""
    nbrs_count = count_neighbours(board)
    return (nbrs_count == 3) | (board & (nbrs_count == 2))

2. Re-run `pytest` and observe coverage

3. Add a test for `step` to `test_life.py` (similar to `test_count_neighbours`) and re-run `pytest`

Branch: coverage

⌨️

Testing: Fixtures

Tests will often require some common inputs or initialisation/finalisation
- e.g. databases
Fixtures can be used for this
Avoids global state
Reduces repetition
Feature provided by `pytest`

Testing: Fixtures 2

1. Add a `board` fixture to `test_life.py`:

import pytest

@pytest.fixture
def board():
    return [
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
    ]

2. Change the signatures of the relevant functions e.g.

def test_count_neighbours(board):

3. Re-run your tests

⌨️

Testing: Fixtures 3

1. Add a `play` function to `life.py`:

def play(board, iterations):
    """Return a new board corresponding to `iterations` steps of the game"""
    for _ in range(iterations):
        board = step(board)
    return board.tolist()

2. Add a `test_play` function to `test_life.py` using a fixture

Hint: consider what `play(board, 2)` should return

3. Run your tests and ensure they all pass

Branch: fixtures

⌨️

Testing: Basics 4

1. Add the following test case to `test_life.py`:

def test_play_wrap():
    board = [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    assert play(board, 20) == board

2. Run `pytest` and observe output

3. Resolve the root cause of the failure for this glider

Hint: Review the SciPy documentation for `convolve2d`

⌨️

Static analysis: Introduction

Code analysis prior to execution
Utility depends on language
Can range from basic syntactic verification through to sophisticated type analysis
Aims to detect bugs before they happen
Ideally via integration with editors

Static analysis: Linting

A linter "analyzes source code to flag programming errors, bugs, stylistic errors, and suspicious constructs"
Not limited to checking formatting
Analysis typically requires parsing code
- Hence black doesn’t sort imports
flake8 = pycodestyle + pyflakes + mccabe
Alternative: coala
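To make "analysis requires parsing" concrete, here is a toy check for unused imports built on the standard `ast` module. It is a deliberately simplified sketch and not how flake8 or pyflakes work internally; `unused_imports` is a hypothetical helper, and it ignores cases such as `__all__` and star imports.

```python
import ast


def unused_imports(source):
    """Return the sorted names that are imported in `source` but never referenced."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the name `a`; `import a.b as c` binds `c`.
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


example = """
import os
import numpy as np
from scipy import signal

def count_neighbours(board):
    return signal.convolve2d(board, np.ones((3, 3)), mode="same")
"""
print(unused_imports(example))  # ['os']
```

The source is only parsed, never executed, so the check works even when `numpy` and `scipy` are not installed.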

Static analysis: Linting 2

1. Update `requirements-dev.txt`:

...
flake8==3.5.0

2. Update installed packages:

pip install -r requirements-dev.txt

3. Create a `.flake8` file in the current directory:

[flake8]
exclude = venv/,.atom,.tox
max-line-length = 88

4. Run `flake8`

⌨️

Static analysis: Linting 3

Introduce some inconsistent formatting (spacing, quotes etc.) into `life.py` using an editor other than Atom (e.g. vim, nano)
Run `flake8` and note errors
Fix problems by using the same editor, or by refreshing in Atom and then saving
Re-run `flake8` to verify

Branch: linting

⌨️

Static analysis: Type checking

Dynamic typing doesn't typically enable extensive static type analysis
Python 3.5 introduced type hints (PEP 484)
Resources:
We're using mypy (Python 2 and 3)
Alternatives: pyre, Pytype
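A short sketch of what type hints do and do not give you. The `play` function here is a toy stand-in that leaves the board unchanged, and the `Board` alias is an assumption made for this example. A checker such as mypy would be expected to report the call with `2.5` without running the code, whereas Python itself only fails once `range` receives the float.

```python
from typing import List, get_type_hints

Board = List[List[int]]


def play(board: Board, iterations: int) -> Board:
    """Toy stand-in for the workshop's `play`: returns the board unchanged."""
    for _ in range(iterations):
        pass  # a real implementation would advance the game here
    return board


# Annotations are stored on the function but not enforced when it is called.
print(get_type_hints(play)["iterations"])  # <class 'int'>

runtime_error = None
try:
    play([[0]], 2.5)  # a static type checker can flag this line before execution
except TypeError as error:
    # At run time the mistake only surfaces when `range` rejects the float.
    runtime_error = error
print(type(runtime_error).__name__)  # TypeError
```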

Static analysis: Type checking 2

1. Modify `test_life.py` so that it tries to invoke `play` with a non-integer number of iterations:

def test_play(board) -> None:
    assert play(board, 2.5) == board

2. Run `pytest` and observe output

3. Add a type annotation to `life.py` to protect against this:

-def play(board, iterations):
+def play(board, iterations: int):

4. Observe the resultant warning in Atom

Branch: typing

⌨️

Automation

Promotes consistency, efficiency and quality
"Continuous integration" ensures that linting, tests, coverage are all performed whenever code is pushed (or before) and automates reporting and notification
We're using GitLab
- Free private repositories
- Turnkey fully integrated CI solution
- Just need to add a simple YAML file to repo
Alternative: GitHub + Travis/CircleCI
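Conceptually, a CI job runs a list of commands in order and reports failure as soon as one exits with a non-zero status. The sketch below mimics that behaviour locally; `run_pipeline` is a hypothetical helper and the step commands are placeholders that only imitate success and failure exit codes, not real lint or test invocations.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run each (name, command) step in order; stop at the first failure.

    Returns the name of the failing step, or None if every step succeeded.
    """
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            return name
    return None


# Placeholder commands standing in for real tools.
steps = [
    ("lint", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("test", [sys.executable, "-c", "raise SystemExit(1)"]),
    ("coverage", [sys.executable, "-c", "raise SystemExit(0)"]),
]
failed = run_pipeline(steps)
print(failed)  # test
```

Because the "test" step fails, the "coverage" step never runs, which mirrors how a failing command ends a CI job.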

Automation 2

1. Create a `.gitlab-ci.yml` file:

test:
  script:
  - apt-get update -y && apt-get install -y tox
  - tox

2. Create a `tox.ini` file:

[tox]
envlist = py3, flake8
skipsdist = True

[testenv]
deps = -rrequirements-dev.txt
commands = pytest

[testenv:flake8]
deps = flake8
commands = flake8

⌨️

Automation 3

1. Create a GitLab account (if necessary)

2. Create an "api" scope Access Token (if required)

Ensure that you save this somewhere

3. Add a `gitlab` remote:

git remote add gitlab https://gitlab.com/<username>/rse18.git

4. Push to GitLab:

git push gitlab

5. Visit https://gitlab.com/<username>/rse18

Branch: ci

⌨️

Automation 4

GitLab sends notification email on pipeline failure
Also provides badges to indicate status, coverage etc
- e.g. github.com/robinandeer/puzzle

Enable badges for your repository:

Visit Repository → Settings → CI/CD → General pipelines
- https://gitlab.com/<username>/rse18/settings/ci_cd
Set "Test coverage parsing" to suggested pytest-cov regex
Copy "Pipeline status" and "Coverage report" Markdown into `README.md`. Add, commit, push and refresh page.
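Coverage parsing works by applying a regular expression to the job log and extracting the percentage. The sketch below shows the mechanism with Python's `re` module. Both the pattern and the log excerpt are assumptions for illustration; use whichever pattern GitLab suggests on the settings page.

```python
import re

# Assumed pattern; copy the one GitLab actually suggests in its settings page.
COVERAGE_PATTERN = re.compile(r"^TOTAL.+?(\d+%)$", re.MULTILINE)

# Hypothetical excerpt of a pytest-cov terminal report.
sample_log = """\
Name      Stmts   Miss  Cover   Missing
---------------------------------------
life.py       9      1    89%   12
---------------------------------------
TOTAL         9      1    89%
"""

match = COVERAGE_PATTERN.search(sample_log)
print(match.group(1) if match else "no coverage figure found")  # 89%
```

Only the `TOTAL` line matches, so the per-file figure on the `life.py` line is ignored.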

Branch: badges

⌨️

Advanced topics: Hypothesis

1. Update `requirements-dev.txt`:

pytest==3.7.4
...
hypothesis==3.70.00

2. Update installed packages:

pip install -r requirements-dev.txt

3. Update `test_life.py`:

from hypothesis import given
from hypothesis.strategies import integers, lists

@given(lists(lists(integers(0, 1))), integers(max_value=20))
def test_play_fuzz(board, iterations):
    play(board, iterations)

4. Run `pytest` and observe output

Branch: hypothesis

⌨️
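The underlying idea is to generate many inputs and check a property that must hold for all of them. The sketch below is a hand-rolled stand-in using `random`, not Hypothesis itself (which also shrinks failing inputs to minimal examples). The pure-Python, wrap-around `count_neighbours` and the helper names are assumptions made for this illustration.

```python
import random


def count_neighbours(board):
    """Wrap-around neighbour counts for a rectangular board of 0s and 1s."""
    rows, cols = len(board), len(board[0])
    return [
        [
            sum(
                board[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def random_board(rng, max_size=6):
    """Generate a random rectangular board with at least one row and column."""
    rows, cols = rng.randint(1, max_size), rng.randint(1, max_size)
    return [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]


def check_property(trials=200, seed=0):
    """Return the first board that violates the property, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        board = random_board(rng)
        live = sum(map(sum, board))
        total = sum(map(sum, count_neighbours(board)))
        # On a wrapped board every live cell is counted once by each of its
        # eight neighbour positions, so the totals must agree.
        if total != 8 * live:
            return board
    return None


print(check_property())  # None
```

No expected output is written by hand for any individual board; the test states a relationship that every valid result must satisfy.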

Advanced topics: Jupyter

1. Update `requirements-dev.txt`:

pytest==3.7.4
...
nbval==0.9.1

2. Update installed packages:

pip install -r requirements-dev.txt

3. Update `pytest.ini`:

[pytest]
addopts = --cov=life --cov-report term-missing --nbval

4. Create `life.ipynb`

5. Run `pytest` and observe output

Branch: jupyter

⌨️

Advanced topics: Benchmarking

1. Update `requirements-dev.txt`:

pytest==3.7.4
...
pytest-benchmark==3.1.1

2. Update installed packages:

pip install -r requirements-dev.txt

3. Update `pytest.ini`:

[pytest]
addopts = ... --benchmark-autosave --benchmark-compare --benchmark-compare-fail=min:5%

4. Run `pytest` twice and observe statistics

Branch: benchmarking

⌨️

Ignore warnings on first run
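The comparison step can be pictured as follows: time the code, keep the fastest run, and fail if it is more than 5% slower than the saved baseline. This is a minimal sketch using `timeit`, not pytest-benchmark's implementation; `fastest`, `regressed` and `workload` are hypothetical names.

```python
import timeit


def fastest(func, repeat=5, number=1000):
    """Return the minimum time per call, in seconds, over several repeats."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


def regressed(baseline_min, current_min, tolerance=0.05):
    """True if the current minimum is more than `tolerance` slower than baseline."""
    return current_min > baseline_min * (1 + tolerance)


def workload():
    """Placeholder for the code being benchmarked."""
    return sum(range(100))


current = fastest(workload)
print(regressed(baseline_min=1.00, current_min=1.04))  # False
print(regressed(baseline_min=1.00, current_min=1.06))  # True
```

The minimum is less affected by background noise than the mean, which makes it a reasonable statistic for a pass/fail threshold.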

Advanced topics: Benchmarking 2

1. Update `life.py` to use numpy rather than scipy:

import numpy as np


def count_neighbours(board):
    """Return an array of neighbour counts for each element of `board`"""
    return sum(
        np.roll(np.roll(board, i, 0), j, 1)
        for i in (-1, 0, 1)
        for j in (-1, 0, 1)
        if (i != 0 or j != 0)
    )

2. Run `pytest` and observe statistics

Branch: numpy

⌨️

Discussion: Other languages

Build/configuration systems
- CMake, Meson
Formatters
- Clang-Format
Linters/static analyzers
- Language Server Protocol
Runtime analyzers
- Address/memory sanitizers (viz security)

See github.com/mre/awesome-static-analysis

Discussion: Other topics

Pre/post-commit hooks
Documentation
- Coverage
- Automated generation
- Executability
HPC/cloud tools
Cookiecutter

Thank you!

Acknowledgements

Jake VanderPlas: Conway's Game of Life in Python

Feedback

m.woodbridge@imperial.ac.uk or @ImperialRSE

Code

github.com/ImperialCollegeLondon/RSE18

RSE18

Suggested approaches to quality assurance for Python projects, presented at RSE18.
