Mutation Testing: How good are your unit tests, really?
Mark Robinson
How do I test the quality of a test suite?
That's QA's problem
I do TDD, I know my tests are good
- Are you sure?
- What about the tests you didn't write?
- How do you have confidence in test refactors?
Pull requests enforce test quality
- Do not always catch everything
- Time intensive activity
We measure code coverage
- Line
- Branch
- Statement
- Data, Path, Modified Condition etc.
None of these coverage metrics tell you which parts of your code have been tested
What code coverage does tell you
This code was executed as part of a test
This code was not executed as part of a test
Executing code and testing code is not the same
import org.junit.Test; // assuming JUnit 4

public class CalculatorTest {
    @Test
    public void seniorEngineerSaysMustHaveTestCoverage() {
        // Executes add() but never asserts on the result
        int result = Calculator.add(5, 2);
    }
}
public class Calculator {
    public static int add(int first, int second) {
        return first + second;
    }
}
Code coverage tells you what code has not been tested
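The difference can be made concrete with a small sketch. It assumes a plain-Java stand-in for the calculator (the class name `CoverageVersusAssertion` and both check methods are hypothetical, and no test framework is used so the example stays self-contained). Both methods give `add` full line coverage, but only the second would notice a wrong result.

```java
public class CoverageVersusAssertion {
    public static int add(int first, int second) {
        return first + second;
    }

    // Executes add() but checks nothing: still passes if add() returns garbage.
    public static void executesOnly() {
        add(5, 2);
    }

    // Executes add() and checks the result: fails if add() is wrong.
    public static void executesAndChecks() {
        int result = add(5, 2);
        if (result != 7) {
            throw new AssertionError("expected 7 but was " + result);
        }
    }

    public static void main(String[] args) {
        executesOnly();
        executesAndChecks();
        System.out.println("both checks ran without failing");
    }
}
```

If `add` were changed to subtract, `executesOnly` would still pass while `executesAndChecks` would throw.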
All OK?
In 1971 Richard Lipton proposed a good solution to the problem
"Fault diagnosis of computer programs"
If we want to know if a test suite has properly checked some code...
1) Introduce a bug!
2) See if the test suite fails
Here's a bug:
(But our tests still pass!)
public void countIfGreaterThanNine(int number) {
    if (number > 10) { // bug: should be number > 9
        count++;
    }
}
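Why can the tests still pass? A minimal sketch, assuming the condition is pulled out into a predicate so two versions can be compared (the class `BoundaryTestSketch` and both test methods are hypothetical). A test that only uses values far from the boundary cannot tell the intended code from the buggy code; a test at the boundary can.

```java
import java.util.function.IntPredicate;

public class BoundaryTestSketch {
    // Intended behaviour: count numbers greater than nine.
    public static final IntPredicate INTENDED = number -> number > 9;
    // The off-by-one bug: the boundary has moved to ten.
    public static final IntPredicate BUGGY = number -> number > 10;

    // Weak test: only tries values far away from the boundary.
    public static boolean weakTestPasses(IntPredicate shouldCount) {
        return shouldCount.test(20) && !shouldCount.test(0);
    }

    // Boundary test: checks the values on either side of the boundary.
    public static boolean boundaryTestPasses(IntPredicate shouldCount) {
        return shouldCount.test(10) && !shouldCount.test(9);
    }

    public static void main(String[] args) {
        System.out.println("weak test, intended code: " + weakTestPasses(INTENDED));
        System.out.println("weak test, buggy code: " + weakTestPasses(BUGGY));
        System.out.println("boundary test, intended code: " + boundaryTestPasses(INTENDED));
        System.out.println("boundary test, buggy code: " + boundaryTestPasses(BUGGY));
    }
}
```

Only the last combination reports a failure, which is exactly the signal we want from a test suite.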
We can introduce these bugs in many ways, called mutation operators
- >= to <=
- >= to ==
- == to !=
- a == b to false
- object.aMethod() to // object.aMethod()
- object.aMethod() to object.anotherMethod()
- null returns
- etc etc
Applying mutation operators to code creates a mutant
We can create a lot of mutants and do it automatically
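As a rough illustration of "automatically", here is a toy generator that rewrites one token in a line of source text. It is only a sketch: the class `MutantGeneratorSketch` and its operator table are hypothetical, and real tools typically work on the compiled code or syntax tree rather than on strings.

```java
import java.util.ArrayList;
import java.util.List;

public class MutantGeneratorSketch {
    // Hypothetical operator table: {token to find, token to put in its place}.
    private static final String[][] OPERATORS = {
        {">=", "<="},
        {">=", "=="},
        {"==", "!="},
    };

    // Produce one mutant for each operator whose token appears in the line.
    // Only the first occurrence is rewritten, so each mutant has a single change.
    public static List<String> mutate(String sourceLine) {
        List<String> mutants = new ArrayList<>();
        for (String[] operator : OPERATORS) {
            String from = operator[0];
            String to = operator[1];
            int at = sourceLine.indexOf(from);
            if (at >= 0) {
                mutants.add(sourceLine.substring(0, at) + to
                        + sourceLine.substring(at + from.length()));
            }
        }
        return mutants;
    }

    public static void main(String[] args) {
        for (String mutant : mutate("if (a >= b) {")) {
            System.out.println(mutant);
        }
    }
}
```

Each mutant differs from the original in exactly one place, so a failing test points at one specific change.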
Test Suite Fails: Killed
Test Suite Passes: Survived
Killing is good!
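The killed/survived bookkeeping can be sketched in a few lines. Everything here is an assumption for illustration: the class `MutationRunSketch`, the three hand-written mutants of a "greater than nine" check, and a deliberately weak test suite that only uses the values 20 and 0.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntPredicate;

public class MutationRunSketch {
    // The "test suite": true when every check passes for the given implementation.
    public static boolean suitePasses(IntPredicate isGreaterThanNine) {
        return isGreaterThanNine.test(20) && !isGreaterThanNine.test(0);
    }

    // Hand-written mutants of the original condition (n > 9).
    public static Map<String, IntPredicate> mutants() {
        Map<String, IntPredicate> mutants = new LinkedHashMap<>();
        mutants.put("> to <", n -> n < 9);
        mutants.put("> to >=", n -> n >= 9);
        mutants.put("condition to false", n -> false);
        return mutants;
    }

    // A mutant is killed when the suite fails against it, otherwise it survived.
    public static String classify(IntPredicate mutant) {
        return suitePasses(mutant) ? "Survived" : "Killed";
    }

    public static long killedCount() {
        return mutants().values().stream().filter(m -> !suitePasses(m)).count();
    }

    public static void main(String[] args) {
        mutants().forEach((name, mutant) ->
                System.out.println(name + ": " + classify(mutant)));
        System.out.println("killed " + killedCount() + " of " + mutants().size());
    }
}
```

The surviving mutant is the boundary change, which tells us the suite needs a test at 9.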
If the test suite can find these artificial bugs, can it find real ones?
The competent programmer hypothesis
Programmers are generally competent enough to produce code which is at least almost correct
Three types of Bugs
- Built the wrong thing
- Built it wrong
- Oops
The coupling effect
Tests that can distinguish a program differing from a correct one by only simple errors can also implicitly distinguish more complex errors
- A. Offutt. 1989. The coupling effect: fact or fiction. In Proceedings of the ACM SIGSOFT '89 third symposium on Software testing, analysis, and verification (TAV3), Richard A. Kemmerer (Ed.). http://dx.doi.org/10.1145/75308.75324
Strong empirical evidence for this
So if your tests can find these mutants, they will probably find real bugs
But what about this?
public void someFunction(int i) {
    if (i <= 100) {
        throw new IllegalArgumentException();
    }
    if (i > 100) { // changed from >= to >
        doSomething();
    }
}
It is not possible to write a test to kill this mutant
The mutant is said to be equivalent
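One way to build intuition for equivalence is to compare a mutant with the original over a range of inputs. The sketch below assumes a boundary mutation, `>=` changed to `>`, sitting behind a guard that rejects everything up to 100. The class `EquivalentMutantSketch` is hypothetical, and the boolean return value stands in for "doSomething() was called". A sampled range is evidence rather than proof; here the guard is what guarantees the two versions agree.

```java
import java.util.function.IntPredicate;

public class EquivalentMutantSketch {
    // Original: returns true when doSomething() would be called.
    public static boolean original(int i) {
        if (i <= 100) {
            throw new IllegalArgumentException();
        }
        return i >= 100;
    }

    // Mutant: >= changed to >.
    public static boolean boundaryMutant(int i) {
        if (i <= 100) {
            throw new IllegalArgumentException();
        }
        return i > 100;
    }

    // Describe the observable outcome so both versions can be compared.
    private static String outcome(IntPredicate function, int i) {
        try {
            return function.test(i) ? "called" : "skipped";
        } catch (IllegalArgumentException e) {
            return "rejected";
        }
    }

    // True when no input in [from, to] distinguishes the mutant from the original.
    public static boolean sameBehaviour(int from, int to) {
        for (int i = from; i <= to; i++) {
            String expected = outcome(EquivalentMutantSketch::original, i);
            String actual = outcome(EquivalentMutantSketch::boundaryMutant, i);
            if (!expected.equals(actual)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("same behaviour: " + sameBehaviour(-1000, 1000));
    }
}
```

No input separates the two versions, so no test can either.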
Equivalent mutants can highlight redundancy...
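// Original: the second check is always true once the guard has passed
public void someFunction(int i) {
    if (i <= 100) {
        throw new IllegalArgumentException();
    }
    if (i >= 100) {
        doSomething();
    }
}
// Simplified: the redundant check is removed
public void someFunction(int i) {
    if (i <= 100) {
        throw new IllegalArgumentException();
    }
    doSomething();
}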
Mutation testing highlights code which definitely is tested
Gives a very high confidence in the test suite
It can highlight redundant code
It can sometimes find bugs
It effectively tests your tests
Does it fit?
- A development activity, not a QA step
...mutation testing with productive mutants does not add a significant overhead to the software development process...
Based on testing at Google
How do I try it out?
<plugin>
    <groupId>org.pitest</groupId>
    <artifactId>pitest-maven</artifactId>
    <version>1.4.1</version>
</plugin>
mvn clean install org.pitest:pitest-maven:mutationCoverage
Tips
Mutation testing takes time
- Target specific parts of the code base on large projects
- On CI, probably do not run on every commit
The Demo