Failing Faster
Christopher Gandrud
IQSS Tech Talk
22 March 2017
Caveat
Talk is most applicable to R statistical software development
But, We Want You
Part of a larger IQSS effort ("Social Science Software Toolkit")
We especially want your contributions for other languages.
What is software development?
(often) failure management
Complexity
Software involves many interconnected parts and user behaviours.
Difficult to anticipate how a change will affect software behaviour.
So your software will fail to meet expectations.
When do you want to fail?
As soon as possible
How can we fail faster?
Test-Driven Development
We all Test
But (frequently) not:
- systematically
- automatically
- regularly
Testing Basics
What is a test?
Comparison of an output to an expectation
Fail if output does not meet expectation
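For example, a minimal test in base R (a sketch, not from the talk):
# Compute an output and compare it to an expectation
output <- sum(1:4)
expected <- 10
# stopifnot() errors (the test fails) if the expectation is not met
stopifnot(output == expected)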
Expectation -> Collaboration
Spelling out your expectations so that they can be automatically tested makes your expectations clear to collaborators.
-- Lets you all know when collaborators have broken something --
Enforce API & Backwards Compatibility
Tests help ensure that software actually follows its stated API, avoiding API drift.
By enforcing an API across software updates, tests enhance backwards compatibility.
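As a sketch of what such a test might look like (col_means is a hypothetical function, not from the talk), a test can pin parts of an interface so that renaming arguments or changing the return structure makes it fail:
library(testthat)
# Hypothetical function whose interface we want to keep stable
col_means <- function(data, na.rm = TRUE) {
    vapply(data, mean, numeric(1), na.rm = na.rm)
}
test_that('col_means keeps its documented interface', {
    # Argument names are part of the API; renaming them breaks existing callers
    expect_named(formals(col_means), c('data', 'na.rm'))
    # The return value stays a named numeric vector
    out <- col_means(mtcars[, c('mpg', 'cyl')])
    expect_type(out, 'double')
    expect_named(out, c('mpg', 'cyl'))
})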
Types of Tests
- Unit tests: test individual units (lowest level) of source code
  - In R: usually individual functions or classes
- Integration tests: test units in combination
  - In R: functions that work in combination in an expected workflow
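A toy sketch of the distinction (both functions are hypothetical, not from the talk):
library(testthat)
# Two small units
clean_x <- function(x) x[!is.na(x)]
my_sum <- function(x) sum(x)
# Unit test: one function, in isolation
test_that('clean_x drops missing values', {
    expect_equal(clean_x(c(1, NA, 3)), c(1, 3))
})
# Integration test: the units combined in an expected workflow
test_that('cleaning then summing gives the right total', {
    expect_equal(my_sum(clean_x(c(1, NA, 3))), 4)
})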
Other Types
Proliferation of testing types, e.g. the V-Model:
- Require tests: test whether your software is able to do some required task
- Failure tests: test whether your software fails when asked to do tasks it is not "required" to do
Aside: Limitations
Software necessarily has limitations.
Let your users know when they have reached these limitations and why, and suggest what to do about it.
Let them know as soon as possible.
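One way to do this (a sketch; fit_small_model and its 30-row rule are hypothetical) is to check inputs up front and stop with a message that explains the limitation and suggests a fix:
# Hypothetical function with a documented limitation
fit_small_model <- function(data) {
    if (nrow(data) < 30) {
        stop('fit_small_model needs at least 30 rows; you supplied ', nrow(data),
             '. Consider pooling additional data.', call. = FALSE)
    }
    lm(mpg ~ wt, data = data)
}
# Fails immediately, with an informative message
fit_small_model(mtcars[1:10, ])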
Fail Fast
# Initialize Zelig5 least squares object
z5 <- zls$new()
# Estimate ls model
z5$zelig(Fertility ~ Education, data = swiss)
# Simulate quantities of interest
z5$sim()
# Plot quantities of interest
z5$graph()
Invalid call: setx() is missing.
But graph() returns only:
Warning message:
In par(old.par) : calling par(new=TRUE) with no plot
Fail Fast
# Initialize Zelig5 least squares object
z5 <- zls$new()
# Estimate ls model
z5$zelig(Fertility ~ Education, data = swiss)
# Simulate quantities of interest
z5$sim()
The call is still invalid (setx() is missing), but now sim() immediately warns:
Warning message:
No simulations drawn, likely due to insufficient inputs.
Be Informative
# Invalid call: no estimation model type specified
z.out <- zelig(y ~ x1 + x2, data = example)
Not informative:
Error in models4[[model]]: invalid subscript type 'symbol'.
Better:
Estimation model type was not specified.
Select estimation model type with the "model" argument.
Testing surface
Definition:
The sum of the different software behaviours that are tested.
Aim:
Maximise the testing surface.
Trade-off
But there is a trade-off between maximising the testing surface and keeping test run time reasonable.
The longer your tests take to run, the less likely you are to run them.
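One common way to manage the trade-off (an option, not necessarily what the talk recommends) is to skip the slowest tests in some environments with testthat's skip functions:
library(testthat)
test_that('full simulation workflow runs end to end', {
    skip_on_cran()   # long-running test: run locally and on CI, but not during CRAN checks
    # ... slow integration test would go here ...
    expect_true(TRUE)
})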
Obligatory xkcd comic
This is not CRAN
Coverage
Definition:
The proportion of source code that is run during testing.
Why?:
Proxy for the testing surface.
Proxy
High test coverage does not mean that your tests are an accurate proxy for the testing surface.
# Create function to find mean
my_mean <- function(x) {
    mean <- x    # Bug: returns x unchanged, not its mean
    return(mean)
}
# Test: only checks that no error is thrown
testthat::expect_error(my_mean(1:10), NA)
100% coverage, but a poor proxy
Effective Tests
Have well-designed expectations
# Create function to find mean
my_mean <- function(x) {
    mean <- x    # Bug: returns x unchanged, not its mean
    return(mean)
}
# Test: checks the actual value returned
testthat::expect_equal(my_mean(1:10), 5.5)
The well-designed expectation catches the bug with an informative error message:
Error: my_mean(1:10) not equal to 5.5.
Lengths differ: 10 vs 1
(R) Testing Tools
Dynamic Documentation
Executable code can be included as documentation examples with Roxygen2.
These examples are run as part of R CMD check (e.g. for CRAN).
#' Find the mean of a numeric vector
#'
#' @param x a numeric vector
#'
#' @examples
#' my_mean(1:10)
# Create function to find mean
my_mean <- function(x) {
    mean <- x
    return(mean)
}
Keeps documentation and function source code in the same place.
But the only implicit expectation is that, given the numeric vector 1:10, the function will not return an error.
Expectations
Better than nothing, but not great.
testthat
The testthat package allows you to specify a broader range of expectations, including:
- expect_equal
- expect_equivalent
- expect_match
- expect_true
- expect_false
- expect_error
- expect_warning
- and more
library(testthat)
# Create function to find mean
my_mean <- function(x) {
    mean <- x
    return(mean)
}
test_that("Correct mean of a numeric vector is returned", {
    expect_equal(my_mean(1:10), 5.5)
})
Require Testing (Example Replay)
Error: my_mean(1:10) not equal to 5.5.
Lengths differ: 10 vs 1
Error Message
# Create function to find mean
my_mean <- function(x) {
    if (!is.numeric(x)) stop('x must be numeric.', call. = FALSE)
    mean <- sum(x) / length(x)
    return(mean)
}
# Test
test_that('my_mean failure test when supplied character string', {
    expect_error(my_mean('A'), 'x must be numeric.')
})
Failure testing
Hard to Test
Stubs and mocks can sometimes be used to stand in for components that are hard to test directly (e.g. network calls or interactive input).
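One simple pattern (a sketch with hypothetical functions, not the only approach) is dependency injection: pass the hard-to-test component in as an argument, then substitute a stub for it in the test:
library(testthat)
# Hypothetical dependency that normally downloads data over the network
download_data <- function(url) stop('network unavailable in this sketch')
# Function under test takes its downloader as an argument
summarise_remote <- function(url, downloader = download_data) {
    x <- downloader(url)
    sum(x) / length(x)
}
test_that('summarise_remote works with a stubbed downloader', {
    stub <- function(url) 1:10   # stands in for the real download
    expect_equal(summarise_remote('http://example.com', downloader = stub), 5.5)
})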
Set up Test Suite
In the package directory, run:
devtools::use_testthat()
Creates:
- tests/testthat.R: test suite set up, including what packages to load (see the sketch below)
- tests/testthat/: directory of R files containing the tests
- testthat added to Suggests in DESCRIPTION
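For a package called mypackage (a placeholder name), the generated tests/testthat.R typically contains something like:
library(testthat)
library(mypackage)
test_check("mypackage")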
Running Tests Locally
Run tests locally with:
testthat::test_package('yourpackage')
# or
devtools::test()
# or
devtools::check(args = c('--as-cran'))
or in RStudio
Continuous Integration
- Original aim: avoid "integration hell" by merging changes into master as often as possible
- Also refers to build servers that build the software and (can) run included tests
- Useful for testing remotely on "clean" systems
- Can test on multiple operating systems
Services: Travis CI (Linux/macOS) and AppVeyor (Windows)
Setup Steps
- Have your package source code on GitHub
- Include .travis.yml and appveyor.yml in your project's root directory (a minimal example follows this list)
- Can automate with devtools::use_travis() and devtools::use_appveyor()
- Log in to the services and tell them to watch your package's GitHub repo (e.g. in Travis CI)
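A minimal .travis.yml for an R package might look like the following (a sketch; real projects often add more fields):
# Use the community-supported R build environment on Travis CI
language: r
cache: packages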
Now, every time you push changes to GitHub, the services build your package and run its tests.
Dynamically Report CI results with README badges
Code Coverage
Once a package uses testthat, you can find and explore code coverage with the covr package:
library(covr)
cov <- package_coverage()
shine(cov)
shine(cov) returns a Shiny app for exploring code coverage.
Travis CI + codecov.io
Save code coverage results from the Travis CI build and display/track them with codecov.io.
Setup
1. Add to .travis.yml:
r_github_packages:
- jimhester/covr
after_success:
- Rscript -e 'covr::codecov()'
2. Log in to codecov.io with your GitHub username and add the package repo
Whenever Travis CI builds, Codecov updates.
Dynamically Report CodeCov results with README badges
Workflow
Zelig Test-Driven Workflow
Want: bugfix/new feature
1. Always start at master
2. Create a feature/hotfix branch
3. Create a test
4. Create the feature/fix
5. Run the test locally
6. Did it pass? If yes, merge into master
7. Push master to GitHub to initiate CI
8. Passes + accumulated changes
Future: IQSSdevtools
IQSSdevtools (R) Report Card
IQSSdevtools::check_best_practices()
Documentation:
  readme: yes
  news: yes
  bugreports: yes
  vignettes: yes
  pkgdown_website: no
License:
  gpl3_license: yes
Version_Control:
  git: yes
  github: yes
Testing:
  uses_testthat: yes
  uses_travis: yes
  uses_appveyor: yes
  build_check:
    build_check_completed: yes
    no_check_warnings: yes
    no_check_errors: yes
    no_check_notes: yes
  test_coverage: 86
Background:
  package_name: Zelig
  package_version: 5.0-18
  package_commit_sha: d5a8dcf0c9655ea187d2533fa977919be90612f6
  iqssdevtools_version: 0.0.0.9000
  check_time: 2017-03-17 16:16:55
Suggestions?
Variations for Other Languages?