Testing for data scientists

Trey Causey

@treycausey

treycausey.com

slides.com/treycausey/pydata2015

whoami
@treycausey

Day
- Data scientist at Dato
Nights/weekends
- Sports analytics fan & consultant
- Blog at thespread.us

Today

Why testing for data scientists?
Kinds of tests
Quick review of unit testing
Unique problems for data scientists
Promising Python packages

Life of a data scientist

Flickr: michaelmattiphotography

Flickr: sukiweb

Work looks like this

http://www3.cs.stonybrook.edu/~aychakrabort/

Data is messy...

... so is your code

Assumptions at every step.

Software development skills for data scientists

treycausey.com/software_dev_skills.html

Writing modular, reusable code
Documentation
Version control
Testing
Logging

What is testing?

Ned Batchelder
"Getting Started Testing"

PyCon 2014

When to test?

Extract
Transform
Model -- the Wild West
Load
Repeat

Testing helps you:

Find bugs
Check your assumptions
(by making them explicit)
Write simpler code
Work with others
Be surprised less

Most relevant kinds of tests

(for data scientists)

Unit tests
Regression (why?!?) tests
Integration tests

Unit tests

Test one "unit" of code
No dependencies on
other code you've written
Don't require access
to databases, APIs, etc.

The standard Python unit testing landscape

py.test
unittest, unittest2
nose
fixtures
mock

A simple function...

... what could go wrong?

def mean(values):
    return sum(values) / len(values)

A simple test

import pytest

def test_mean():
    assert(mean([1, 2, 3, 4, 5]) == 2)

Why py.test?

Less boilerplate
Fewer classes
Gets you testing quickly
Easy to interpret errors

When & what to test?

When you change code, add a test.
Test the outcome, not the implementation
When you find a bug, add a test.
Help identify complexity
Don't test code that's already tested!

Test-driven development?

Write failing tests first,
fix code until tests pass.

Testing for data science can be a little different...

... deterministic answers may not exist

Overfitting

Laziness (not the good kind)

Extract data once, build many models
Data is representative of the future
Using specific samples to spot-check

Better ways to test

Test properties, not specific values

Make assumptions about data shape & type

Test probabalistically

Some promising
projects

engarde
Hypothesis

Feature Forge

engarde

Tom Augspurger

For "defensive" data analysis

"The raison d’être for engarde is the fact of life that data are messy."

Great for ETL on changing data

Built with pandas
Very lightweight
Use for functions that
accept & return a Dataframe
Just add decorators!
(or use DataFrame.pipe)

Usage

Hypothesis

David R. MacIver

Property-based testing inspired
by Haskell's Quickcheck

How it works

Generate data randomly
according to some specs

(and be slightly diabolical about it)

Details

Very flexible
Plugin support
Finds corner cases fast
Ideal for code that will be
accepting input "from the wild"

Other features

Works with existing testing frameworks
Works with Faker
Has a datetime plugin
Tests many Python WTFs
Experimental NumPy support

Feature Forge

Daniel Moisset
Javier Mansilla
Rafael Carrascosa

Declare feature schemas & test them

Other features

Designed with ML and sklearn in mind
Can build feature creation & testing pipelines
Supports experiments for testing variety of
features and evaluating many models,
storing results to a database

Probabilistic testing

Model testing is the wild west

engarde: is_monotonic(), within_n_std(), within_set()

Don't test algorithms you haven't personally implemented

scikit-learn, SciPy, NumPy have excellent test suites

Testing algorithms you have implemented

pandas, SciPy, NumPy have excellent testing methods

Numerical computing is tricky.

Try to use existing tools as much as possible.

What I didn't cover

Testing MCMC code
- Follow @tdhopper
Continuous integration
See @digitallogic's answer here:
http://bit.ly/testing_algorithms

Questions?

@treycausey

Testing for data scientists

whoami @treycausey

Today

Life of a data scientist

Work looks like this

Data is messy...

... so is your code

Software development skills for data scientists

What is testing?

Ned Batchelder "Getting Started Testing"

PyCon 2014

When to test?

Testing helps you:

Most relevant kinds of tests

Unit tests

The standard Python unit testing landscape

A simple function...

A simple test

Why py.test?

When & what to test?

Test-driven development?

Write failing tests first, fix code until tests pass.

Testing for data science can be a little different...

Overfitting

Laziness (not the good kind)

Better ways to test

Some promising projects

engarde Hypothesis

Feature Forge

engarde

Tom Augspurger

Great for ETL on changing data

Usage

Hypothesis

David R. MacIver

How it works

Generate data randomly according to some specs

Details

Other features

Feature Forge

Daniel Moisset Javier Mansilla Rafael Carrascosa

Declare feature schemas & test them

Other features

Probabilistic testing

Model testing is the wild west

Don't test algorithms you haven't personally implemented

Testing algorithms you have implemented

What I didn't cover

Questions?

whoami
@treycausey

Ned Batchelder
"Getting Started Testing"

Write failing tests first,
fix code until tests pass.

Some promising
projects

engarde
Hypothesis

Generate data randomly
according to some specs

Daniel Moisset
Javier Mansilla
Rafael Carrascosa