Assumptions at every step.
(for data scientists)
... what could go wrong?
def mean(values):
    return sum(values) / len(values)
def test_mean():
    assert mean([1, 2, 3, 4, 5]) == 3
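A few inputs that a single happy-path assertion never exercises, as a rough sketch. `mean` is repeated so the block runs on its own; the specific inputs are illustrative choices, not taken from the talk.

```python
def mean(values):
    return sum(values) / len(values)

# An empty input has no mean; the naive version fails with ZeroDivisionError.
try:
    mean([])
    empty_result = "returned a value"
except ZeroDivisionError:
    empty_result = "ZeroDivisionError"

# Numbers that arrived as text fail inside sum() with TypeError.
try:
    mean(["1", "2"])
    text_result = "returned a value"
except TypeError:
    text_result = "TypeError"

# A missing value encoded as NaN raises nothing and silently poisons the result.
nan_result = mean([1.0, float("nan"), 3.0])

print(empty_result, text_result, nan_result)
```

The last case is the uncomfortable one: no exception, just a wrong answer flowing downstream.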
... deterministic answers may not exist
Test properties, not specific values
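One way this can look when the exact answer depends on randomness. The sketch below assumes a hypothetical helper, `bootstrap_mean`, that averages one resample of the data; its return value changes from draw to draw, but some properties must hold on every draw.

```python
import random

def bootstrap_mean(values, rng):
    """Mean of one resample drawn with replacement (hypothetical helper)."""
    resample = [rng.choice(values) for _ in values]
    return sum(resample) / len(resample)

def check_bootstrap_mean_properties(trials=200):
    data = [1, 2, 3, 4, 5]
    rng = random.Random(0)  # seeded only so a failure can be reproduced
    for _ in range(trials):
        estimate = bootstrap_mean(data, rng)
        # Property: a mean of resampled points never leaves the data range.
        assert min(data) <= estimate <= max(data)
    # Property: resampling constant data always returns that constant.
    assert bootstrap_mean([7, 7, 7], rng) == 7
    return True

print(check_bootstrap_mean_properties())
```

No assertion names the value of any single estimate; the test only pins down what every valid estimate has in common.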
Make assumptions about data shape & type
For "defensive" data analysis
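A minimal sketch of stating shape and type assumptions up front, using plain dictionaries as rows. The column names and the `check_rows` helper are hypothetical, chosen only for illustration.

```python
def check_rows(rows, schema):
    """Raise AssertionError if any row breaks the assumed columns or types."""
    for i, row in enumerate(rows):
        assert set(row) == set(schema), f"row {i}: unexpected columns {sorted(row)}"
        for column, expected_type in schema.items():
            assert isinstance(row[column], expected_type), (
                f"row {i}: {column!r} is {type(row[column]).__name__}"
            )
    return rows

schema = {"user_id": int, "amount": float}   # hypothetical columns
clean = [{"user_id": 1, "amount": 9.5}, {"user_id": 2, "amount": 3.0}]
messy = [{"user_id": 1, "amount": "9.5"}]    # amount arrived as a string

check_rows(clean, schema)                    # passes silently
try:
    check_rows(messy, schema)
    messy_ok = True
except AssertionError as err:
    messy_ok = False
    print("rejected:", err)
```

Returning `rows` unchanged lets the check sit in the middle of a pipeline without altering it.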
"The raison d’être for engarde is the fact of life that data are messy."
Property-based testing inspired
by Haskell's QuickCheck
(and be slightly diabolical about it)
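To show the idea without depending on any particular library, here is a hand-rolled, QuickCheck-style loop using only the standard library. The generator name `diabolical_lists` and the chosen properties are assumptions for illustration; a real property-based testing tool would also shrink failing inputs, which this sketch does not attempt.

```python
import random

def mean(values):
    return sum(values) / len(values)

def diabolical_lists(rng, count):
    """Yield integer lists, starting with hand-picked nasty cases."""
    yield [0]
    yield [-1, 1]
    yield [10**18, -10**18, 1]
    yield [5] * 1000
    for _ in range(count):
        size = rng.randint(1, 50)
        yield [rng.randint(-10**6, 10**6) for _ in range(size)]

def run_property_checks(seed=0, count=500):
    rng = random.Random(seed)
    checked = 0
    for values in diabolical_lists(rng, count):
        result = mean(values)
        # Property 1: the mean lies between the smallest and largest value.
        assert min(values) <= result <= max(values), values
        # Property 2: the order of the input does not matter.
        assert mean(sorted(values)) == result, values
        checked += 1
    return checked

print(run_property_checks(), "generated inputs satisfied both properties")
```

Inputs are restricted to non-empty lists of integers on purpose: the mean of an empty list is undefined, and floating-point inputs would need tolerance-based comparisons.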
engarde: is_monotonic(), within_n_std(), within_set()
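What checks with these names might do, sketched as plain-Python stand-ins over lists. These are not engarde's implementations or signatures (engarde operates on pandas DataFrames); the parameters below are assumptions made for illustration.

```python
import statistics

def is_monotonic(values, increasing=True):
    """True if values never decrease (or never increase)."""
    pairs = list(zip(values, values[1:]))
    if increasing:
        return all(a <= b for a, b in pairs)
    return all(a >= b for a, b in pairs)

def within_n_std(values, n=3):
    """True if every value is within n sample standard deviations of the mean."""
    center = statistics.mean(values)
    spread = statistics.stdev(values)
    return all(abs(v - center) <= n * spread for v in values)

def within_set(values, allowed):
    """True if every value belongs to the allowed set."""
    return all(v in allowed for v in values)

timestamps = [1, 2, 2, 5]
readings = [10.0, 10.5, 9.5, 10.2, 9.8]
labels = ["a", "b", "a"]

print(is_monotonic(timestamps))
print(within_n_std(readings))
print(within_n_std(readings + [50.0], n=2))
print(within_set(labels, {"a", "b"}))
```

With only a handful of points, one outlier inflates the standard deviation enough to hide itself at `n=3`, which is why the outlier example uses `n=2`.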
scikit-learn, SciPy, NumPy have excellent test suites
pandas, SciPy, NumPy also ship excellent testing utilities
Numerical computing is tricky.
Try to use existing tools as much as possible.
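A small illustration using only Python's standard library, as one example of "existing tools"; the same point applies to the larger libraries named above.

```python
import math
import statistics

# statistics.mean rejects empty input with a dedicated, descriptive error.
try:
    statistics.mean([])
    empty_outcome = "returned a value"
except statistics.StatisticsError:
    empty_outcome = "StatisticsError"

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so exact equality fails where a tolerance-based comparison succeeds.
exact_equal = (0.1 + 0.2 == 0.3)
close_enough = math.isclose(0.1 + 0.2, 0.3)

# math.fsum tracks partial sums to avoid accumulated rounding error.
careful_total = math.fsum([0.1] * 10)

print(empty_outcome, exact_equal, close_enough, careful_total)
```

Each of these behaviours is something a hand-written `mean` or `==` comparison would have to rediscover the hard way.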