A 5 minute intro to Hypothesis

20/04/2018 @PyCon Italy

Giacomo Debidda

Your project

(intro_hypothesis) jack@localhost:~/Repos/intro-to-hypothesis$ tree -L 2
.
├── my_package
│   ├── __init__.py
│   └── my_module.py
└── tests
    ├── __init__.py
    └── test_my_module.py

Your code

# my_package/my_module.py

def add_numbers(a, b):
    return a + b

# tests/test_my_module.py

from my_package.my_module import add_numbers


def test_add_numers_is_commutative():
    assert add_numbers(1.23, 4.56) == add_numbers(4.56, 1.23)

Your tests

pytest -v

# or...
python -m pytest -v
==================================== test session starts =====================================
platform linux -- Python 3.6.3, pytest-3.5.0, py-1.5.3, pluggy-0.6.0
               -- /home/jack/.virtualenvs/intro_hypothesis/bin/python3.6
cachedir: .pytest_cache
rootdir: /home/jack/Repos/intro-to-hypothesis, inifile:
collected 1 item                                                                             

tests/test_my_module.py::test_add_numers_is_commutative PASSED                         [100%]

================================== 1 passed in 0.01 seconds ==================================

Success?

You didn't prove that the commutative property holds in general. You only showed that it holds for this specific case.

The pair 1.23 and 4.56 is a tiny subset of the entire input space of numbers that your function can receive...

# tests/test_my_module.py

from my_package.my_module import add_numbers


def test_add_numers_is_commutative():
    assert add_numbers(1.23, 4.56) == add_numbers(4.56, 1.23)

So...?

The lazy solution

# tests/test_my_module.py

from my_package.my_module import add_numbers


def test_add_numers_is_commutative():
    assert add_numbers(1.23, 4.56) == add_numbers(4.56, 1.23)

def test_add_numers_is_commutative_another_case():
    assert add_numbers(0.789, 321) == add_numbers(321, 0.789)

# more tests here...

You simply write more test cases.

For these other specific cases the commutative property holds. But you don't want to write a million test cases by hand...
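
The hand-written cases can at least live in one table instead of one function each. A minimal sketch, assuming a hypothetical helper named check_commutative; add_numbers is repeated so the block runs on its own, and the two extra pairs are made up for illustration:

```python
def add_numbers(a, b):
    # Same one-line function as in my_package/my_module.py,
    # repeated here so the sketch is self-contained.
    return a + b


# Hand-picked pairs: every new case is one more line someone has to think of.
HAND_WRITTEN_CASES = [
    (1.23, 4.56),
    (0.789, 321),
    (-2.5, 2.5),     # illustrative extra case
    (0.0, 1e300),    # illustrative extra case
]


def check_commutative(cases):
    # Hypothetical helper: returns the pairs for which a + b != b + a.
    return [(a, b) for a, b in cases if add_numbers(a, b) != add_numbers(b, a)]


failures = check_commutative(HAND_WRITTEN_CASES)
print(failures)
```

The table is only as good as the imagination of whoever fills it in, which is the limit of this approach.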

The risky solution

# tests/test_my_module.py
import random
import unittest
from ddt import ddt, idata, unpack
from my_package.my_module import add_numbers

def float_pairs_generator():
    num_test_cases = 100
    for i in range(num_test_cases):
        a = random.random() * 10.0
        b = random.random() * 10.0
        yield (a, b)

@ddt
class TestAddNumbers(unittest.TestCase):
    @idata(float_pairs_generator())
    @unpack
    def test_add_floats_ddt(self, a, b):
        self.assertEqual(add_numbers(a, b), add_numbers(b, a))

You use fuzzing to create random test cases at every run.

You use ddt to multiply the test cases (works with pytest too).
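
One reason this can be risky: the inputs change at every run, so a pair that fails today may not come back tomorrow. A sketch of one possible mitigation, assuming a fixed seed and a private random.Random instance (the seed parameter is an addition for illustration, not part of the generator above):

```python
import random


def float_pairs_generator(seed, num_test_cases=100):
    # A private Random instance, so the sequence depends only on `seed`
    # and not on whatever else touched the global random state.
    rng = random.Random(seed)
    for _ in range(num_test_cases):
        yield (rng.random() * 10.0, rng.random() * 10.0)


first_run = list(float_pairs_generator(seed=42))
second_run = list(float_pairs_generator(seed=42))
other_seed = list(float_pairs_generator(seed=43))

print(first_run == second_run)   # same seed, same pairs
print(first_run == other_seed)   # different seed, different pairs
```

A fixed seed makes a run repeatable, but it also means the same 100 pairs are tested every time.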

100 test cases

==================================== test session starts =====================================
platform linux -- Python 3.6.3, pytest-3.5.0, py-1.5.3, pluggy-0.6.0
               -- /home/jack/.virtualenvs/intro_hypothesis/bin/python3.6
cachedir: .pytest_cache
rootdir: /home/jack/Repos/intro-to-hypothesis, inifile:
collected 100 items 

tests/test_my_module.py::TestAddNumbers::test_add_floats_ddt_00001__1_5626962926374943__1_9960917540401857_ PASSED [  1%]
tests/test_my_module.py::TestAddNumbers::test_add_floats_ddt_00002__8_826169800117212__0_46531523690026333_ PASSED [  2%]
tests/test_my_module.py::TestAddNumbers::test_add_floats_ddt_00003__5_9016174415787415__9_626363288868493_ PASSED [  3%]
tests/test_my_module.py::TestAddNumbers::test_add_floats_ddt_00004__4_2946227013991685__6_73085683629837_ PASSED [  4%]
tests/test_my_module.py::TestAddNumbers::test_add_floats_ddt_00005__5_758774597260805__9_72994743211482_ PASSED [  5%]

...

tests/test_my_module.py::TestAddNumbers::test_add_floats_ddt_00096__7_322612946947605__3_3536474120855377_ PASSED [ 96%]
tests/test_my_module.py::TestAddNumbers::test_add_floats_ddt_00097__1_075369396293605__4_872490525884292_ PASSED [ 97%]
tests/test_my_module.py::TestAddNumbers::test_add_floats_ddt_00098__1_5173664261532571__1_2611556220323872_ PASSED [ 98%]
tests/test_my_module.py::TestAddNumbers::test_add_floats_ddt_00099__6_727606012779317__2_4322871197800144_ PASSED [ 99%]
tests/test_my_module.py::TestAddNumbers::test_add_floats_ddt_00100__9_319277865106583__3_9858815547475537_ PASSED [100%]

================================= 100 passed in 0.15 seconds =================================

Still not enough

If you think about it, we are still testing some random combinations of numbers between 0.0 and 10.0 here...

It's not a very extensive portion of the input domain of the function add_numbers.
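
To see how narrow that range is, here is a quick sketch that evaluates the same expression many times and checks what never shows up. The sample size and the seed are arbitrary choices:

```python
import math
import random

rng = random.Random(0)
samples = [rng.random() * 10.0 for _ in range(100000)]

# random() returns values in [0.0, 1.0), so the product stays within [0.0, 10.0]
all_in_range = all(0.0 <= x <= 10.0 for x in samples)
any_negative = any(x < 0.0 for x in samples)
any_nan = any(math.isnan(x) for x in samples)
any_inf = any(math.isinf(x) for x in samples)

print(all_in_range, any_negative, any_nan, any_inf)
```

No negative numbers, no nan, no inf, nothing larger than 10: whole regions of the float domain are never exercised.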

What to do?

  1. find a way to generate domain objects that your function can accept. In this case the domain objects are the floats that add_numbers can receive.
  2. use hypothesis
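
To make step 1 concrete, here is a hand-rolled sketch of what "generating floats from the whole domain" could look like with the standard library only. It reinterprets random 64-bit patterns as IEEE 754 doubles, so nan, inf, negative numbers and extreme magnitudes can all appear. This is only an illustration and not how Hypothesis generates floats; float_from_bits and any_float are hypothetical names.

```python
import math
import random
import struct


def float_from_bits(bits):
    # Pack a 64-bit unsigned integer, then read the same 8 bytes back
    # as a double. Every bit pattern is some float.
    return struct.unpack('<d', struct.pack('<Q', bits))[0]


def any_float(rng):
    # Hypothetical generator: a float drawn from a random bit pattern.
    return float_from_bits(rng.getrandbits(64))


# A few well-known bit patterns
print(float_from_bits(0x3FF0000000000000))   # 1.0
print(float_from_bits(0x7FF0000000000000))   # inf
print(float_from_bits(0x7FF8000000000000))   # nan

rng = random.Random(2018)
values = [any_float(rng) for _ in range(10000)]
print(any(x < 0.0 for x in values))          # negative numbers show up
print(any(abs(x) > 1e100 for x in values))   # so do huge magnitudes
print(sum(math.isnan(x) for x in values))    # nan count depends on the seed
```

Writing and maintaining such a generator for every input type is tedious, which is the part step 2 takes off your hands.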

The best solution

# tests/test_my_module.py

from hypothesis import given
from hypothesis.strategies import floats
from my_package.my_module import add_numbers


@given(a=floats(), b=floats())
def test_add_numbers(a, b):
    assert add_numbers(a, b) == add_numbers(b, a)

Run the tests and...

What happened?

Increase the verbosity level of your test with the @settings decorator to find out.

# tests/test_my_module.py

from hypothesis import given, settings, Verbosity
from hypothesis.strategies import floats
from my_package.my_module import add_numbers


@settings(verbosity=Verbosity.verbose)
@given(a=floats(), b=floats())
def test_add_numbers(a, b):
    assert add_numbers(a, b) == add_numbers(b, a)

A falsifying example

Trying example: test_add_numbers(a=nan, b=0.0)
Traceback (most recent call last):
  [...]
  File "/home/jack/Repos/intro-to-hypothesis/tests/test_my_module_hypothesis.py", line 10,
  in test_add_numbers
    assert add_numbers(a, b) == add_numbers(b, a)
AssertionError: assert nan == nan
 +  where nan = add_numbers(nan, 0.0)
 +  and   nan = add_numbers(0.0, nan)
Trying example: test_add_numbers(a=6.4348585852518236e-232, b=0.0)
Trying example: test_add_numbers(a=-0.99999, b=3.402823466e+38)
Trying example: test_add_numbers(a=-inf, b=-1.175494351e-38)
Falsifying example: test_add_numbers(a=0.0, b=nan)

Some test cases are fine.

Some other ones are not.

Fail? It depends...

from hypothesis import given
from hypothesis.strategies import floats
from my_package.my_module import add_numbers

@given(
  a=floats(allow_nan=False, allow_infinity=False),
  b=floats(allow_nan=False, allow_infinity=False))
def test_add_numbers(a, b):
    assert add_numbers(a, b) == add_numbers(b, a)

The test fails because in Python nan is a valid float, and nan never compares equal to anything, not even itself.
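
The failing assertion can be reproduced without Hypothesis. A short sketch showing that both sums are nan and that the comparison is what fails:

```python
import math

nan = float('nan')

left = nan + 0.0
right = 0.0 + nan

print(math.isnan(left), math.isnan(right))  # both results are nan
print(left == right)                        # the equality check is False
print(nan == nan)                           # nan is not even equal to itself
```

To check for nan, use math.isnan rather than ==.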

Is nan a valid input for your application? What about inf?

For example, if you are absolutely sure that add_numbers will never receive a nan or an inf as input, you can write a test that never generates either nan or inf.
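
As a sketch of what inf does to the original one-line add_numbers: adding inf to a finite number stays commutative, but adding inf to -inf produces nan, which brings the comparison problem back.

```python
import math

inf = float('inf')

# inf plus a finite number is inf, in either order
print(inf + 1.0 == 1.0 + inf)

# inf plus -inf is nan, so the equality check fails again
mixed = inf + (-inf)
print(math.isnan(mixed))
print((inf + (-inf)) == ((-inf) + inf))
```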

Test exceptions 1/4

What if add_numbers could in fact receive nan or inf as (invalid) inputs?

  • the test should be able to generate nan or inf as (invalid) inputs
  • add_numbers should raise a specific exception for each (invalid) input
  • the test should reject that failure (i.e. the test should not consider that specific exception a failure)

Test exceptions 2/4

# my_package/my_module.py

import math

class NaNIsNotAllowed(ValueError):
    pass

class InfIsNotAllowed(ValueError):
    pass

def add_numbers(a, b):
    if math.isnan(a) or math.isnan(b):
        raise NaNIsNotAllowed('nan is not a valid input')
    elif math.isinf(a) or math.isinf(b):
        raise InfIsNotAllowed('inf is not a valid input')
    return a + b
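
A quick sketch of how this version behaves when called directly. The function and the exception classes are repeated so the block runs on its own; call is a hypothetical helper that returns either the result or the name of the exception raised.

```python
import math


class NaNIsNotAllowed(ValueError):
    pass


class InfIsNotAllowed(ValueError):
    pass


def add_numbers(a, b):
    if math.isnan(a) or math.isnan(b):
        raise NaNIsNotAllowed('nan is not a valid input')
    elif math.isinf(a) or math.isinf(b):
        raise InfIsNotAllowed('inf is not a valid input')
    return a + b


def call(a, b):
    # Hypothetical helper: the result, or the exception class name.
    try:
        return add_numbers(a, b)
    except ValueError as exc:
        return type(exc).__name__


nan, inf = float('nan'), float('inf')
print(call(1.5, 2.5))
print(call(0.0, nan))
print(call(0.0, inf))
print(call(nan, inf))  # the nan check runs first
```

Both classes inherit from ValueError, so callers that already catch ValueError keep working.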

Test exceptions 3/4

========================================== FAILURES ==========================================
______________________________________ test_add_numbers ______________________________________

    @given(a=floats(), b=floats())
>   def test_add_numbers(a, b):
E   hypothesis.errors.MultipleFailures: Hypothesis found 2 distinct failures.

tests/test_my_module_hypothesis.py:7: MultipleFailures
----------------------------------------- Hypothesis -----------------------------------------
Falsifying example: test_add_numbers(a=0.0, b=nan)
Traceback (most recent call last):
  [...]
    raise NaNIsNotAllowed('nan is not a valid input')
my_package.my_module.NaNIsNotAllowed: nan is not a valid input

Falsifying example: test_add_numbers(a=0.0, b=inf)
Traceback (most recent call last):
  [...]
    raise InfIsNotAllowed('inf is not a valid input')
my_package.my_module.InfIsNotAllowed: inf is not a valid input

Test exceptions 4/4

# tests/test_my_module.py

from hypothesis import given, reject
from hypothesis.strategies import floats
from my_package.my_module import add_numbers, NaNIsNotAllowed, InfIsNotAllowed


@given(a=floats(), b=floats())
def test_add_numbers_invalid_inputs(a, b):
    try:
        assert add_numbers(a, b) == add_numbers(b, a)
    except (NaNIsNotAllowed, InfIsNotAllowed):
        reject()
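
Roughly speaking, reject() tells Hypothesis to discard the current example instead of counting it as a pass or a failure. A stdlib-only analogy of that bookkeeping, assuming a small hand-picked input list; this loop is not Hypothesis code, and run_property is a hypothetical name.

```python
import math


class NaNIsNotAllowed(ValueError):
    pass


class InfIsNotAllowed(ValueError):
    pass


def add_numbers(a, b):
    if math.isnan(a) or math.isnan(b):
        raise NaNIsNotAllowed('nan is not a valid input')
    elif math.isinf(a) or math.isinf(b):
        raise InfIsNotAllowed('inf is not a valid input')
    return a + b


def run_property(pairs):
    # Count examples that were checked versus examples that were discarded.
    checked, discarded = 0, 0
    for a, b in pairs:
        try:
            assert add_numbers(a, b) == add_numbers(b, a)
        except (NaNIsNotAllowed, InfIsNotAllowed):
            discarded += 1  # plays the role of reject()
            continue
        checked += 1
    return checked, discarded


nan, inf = float('nan'), float('inf')
result = run_property([(1.23, 4.56), (0.0, nan), (-inf, 2.0), (-1.0, 1e308)])
print(result)
```

An AssertionError is not caught by the except clause, so a genuine commutativity failure would still surface.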

Explicit examples

# tests/test_my_module.py

from hypothesis import given, example
from hypothesis.strategies import floats
from my_package.my_module import add_numbers


@example(a=1.23, b=4.56)
@given(
  a=floats(allow_nan=False, allow_infinity=False),
  b=floats(allow_nan=False, allow_infinity=False))
def test_add_numbers_explicit_example(a, b):
    assert add_numbers(a, b) == add_numbers(b, a)

Test all the things!

# tests/test_my_module.py

from hypothesis import given, example, reject
from hypothesis.strategies import floats
from my_package.my_module import add_numbers, NaNIsNotAllowed, InfIsNotAllowed

@example(a=1.23, b=4.56)
@given(
  a=floats(allow_nan=False, allow_infinity=False),
  b=floats(allow_nan=False, allow_infinity=False))
def test_add_numbers_explicit_example(a, b):
    assert add_numbers(a, b) == add_numbers(b, a)

@given(
  a=floats(allow_nan=False, allow_infinity=False),
  b=floats(allow_nan=False, allow_infinity=False))
def test_add_numbers_valid_inputs(a, b):
    assert add_numbers(a, b) == add_numbers(b, a)

@given(a=floats(), b=floats())
def test_add_numbers_invalid_inputs(a, b):
    try:
        assert add_numbers(a, b) == add_numbers(b, a)
    except (NaNIsNotAllowed, InfIsNotAllowed):
        reject()

Success!

==================================== test session starts =====================================
platform linux -- Python 3.6.3, pytest-3.5.0, py-1.5.3, pluggy-0.6.0
               -- /home/jack/.virtualenvs/intro_hypothesis/bin/python3.6
cachedir: .pytest_cache
rootdir: /home/jack/Repos/intro-to-hypothesis, inifile:
plugins: hypothesis-3.55.1
collected 3 items                                                                            

tests/test_my_module_hypothesis.py::test_add_numbers_explicit_example PASSED           [ 33%]
tests/test_my_module_hypothesis.py::test_add_numbers_valid_inputs PASSED               [ 66%]
tests/test_my_module_hypothesis.py::test_add_numbers_invalid_inputs PASSED             [100%]

================================== 3 passed in 0.51 seconds ==================================

Reference

@jackdbd

giacomodebidda.com

Property based testing in Python. It works by letting you write tests that assert that something should be true for every case, not just the ones you happen to think of.
