Mutation Testing: How good are your unit tests, really?

Mark Robinson

How do I test the quality of a test suite?

That's QA's problem

I do TDD, I know my tests are good
  • Are you sure?
  • What about the tests you didn't write?
  • How do you have confidence in test refactors?
Pull requests enforce test quality
  • Do not always catch everything
  • Time intensive activity
We measure code coverage
  • Line
  • Branch
  • Statement
  • Data, Path, Modified Condition etc

None of these coverage metrics tell you which parts of your code have been tested

What code coverage does tell you

This code was executed as part of a test

This code was not executed as part of a test

Executing code and testing code are not the same thing

import org.junit.Test; // JUnit 4 assumed

public class CalculatorTest {

    @Test
    public void seniorEngineerSaysMustHaveTestCoverage() {
        // Executes add() but never asserts on the result
        int result = Calculator.add(5, 2);
    }

}

public class Calculator {

    public static int add(int first, int second) {
        return first + second;
    }

}
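To make the gap concrete, here is a minimal sketch in plain Java (no test framework). The class and method names are made up for illustration. Both "tests" execute the same call, so both would give add() full coverage, but only the one that checks the result can tell a correct add from a broken one.

```java
import java.util.function.IntBinaryOperator;

public class CoverageVsAssertion {

    // Mirrors the test above: runs add() but never checks the result,
    // so it passes for any implementation that does not throw.
    static boolean coverageOnlyTest(IntBinaryOperator add) {
        add.applyAsInt(5, 2);
        return true;
    }

    // Runs the same call and checks the answer.
    static boolean assertingTest(IntBinaryOperator add) {
        return add.applyAsInt(5, 2) == 7;
    }

    public static void main(String[] args) {
        IntBinaryOperator correct = (first, second) -> first + second;
        IntBinaryOperator broken = (first, second) -> first - second; // deliberate bug

        System.out.println("coverage-only test, broken add: " + coverageOnlyTest(broken));
        System.out.println("asserting test, correct add:    " + assertingTest(correct));
        System.out.println("asserting test, broken add:     " + assertingTest(broken));
    }
}
```

The coverage-only test still passes against the broken implementation; the asserting test does not.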

Code coverage tells you what code has not been tested

All OK?

In 1971 Richard Lipton proposed a good solution to the problem

 

"Fault diagnosis of computer programs"

If we want to know if a test suite has properly checked some code... 

1) Introduce a bug!

2) See if the test suite fails

Here's a bug:

(But our tests still pass!)

  public void countIfGreaterThanNine(int number) {
    if (number > 10) {
      count++;
    }
  }
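Why might the tests still pass? One plausible reason is that they only use values far from the boundary. The sketch below assumes exactly that: the predicates stand in for the condition inside countIfGreaterThanNine, and the two "suites" are hypothetical, not taken from a real project.

```java
import java.util.function.IntPredicate;

public class BoundaryTestSketch {

    // What the method name promises, and what the buggy code actually does.
    static final IntPredicate INTENDED = number -> number > 9;
    static final IntPredicate BUGGY = number -> number > 10;

    // Weak suite: only checks values far away from the boundary.
    static boolean weakSuitePasses(IntPredicate counts) {
        return !counts.test(5) && counts.test(15);
    }

    // Stronger suite: also checks the values on either side of the boundary.
    static boolean boundarySuitePasses(IntPredicate counts) {
        return weakSuitePasses(counts) && !counts.test(9) && counts.test(10);
    }

    public static void main(String[] args) {
        System.out.println("weak suite, buggy code:     " + weakSuitePasses(BUGGY));
        System.out.println("boundary suite, buggy code: " + boundarySuitePasses(BUGGY));
        System.out.println("boundary suite, intended:   " + boundarySuitePasses(INTENDED));
    }
}
```

Only the suite that checks 10 notices the bug.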

We can introduce these bugs in many ways, called mutation operators

  • >= to <=
  • >= to ==
  • == to !=
  • a == b to false
  • object.aMethod() to // object.aMethod()
  • object.aMethod() to object.anotherMethod()
  • null returns
  • etc etc

Applying mutation operators to code creates a mutant

We can create a lot of mutants and do it automatically

Test Suite Fails: the mutant is Killed

Test Suite Passes: the mutant Survived

Killing is good!
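A hand-rolled sketch of the whole loop, assuming the code under test is the single condition `number >= 10`. The mutants are written out by hand here; a real tool generates them for you. All names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntPredicate;
import java.util.function.Predicate;

public class TinyMutationRun {

    // The original condition; both suites below pass against it.
    static final IntPredicate ORIGINAL = number -> number >= 10;

    // Mutants of ORIGINAL, keyed by the mutation operator that produced them.
    static Map<String, IntPredicate> mutants() {
        Map<String, IntPredicate> mutants = new LinkedHashMap<>();
        mutants.put(">= to <=", number -> number <= 10);
        mutants.put(">= to ==", number -> number == 10);
        mutants.put(">= to >", number -> number > 10);
        mutants.put("condition to false", number -> false);
        return mutants;
    }

    // A suite returns true when all of its checks pass.
    static boolean weakSuite(IntPredicate condition) {
        return !condition.test(5) && condition.test(10);
    }

    static boolean strongSuite(IntPredicate condition) {
        return !condition.test(9) && condition.test(10) && condition.test(11);
    }

    // A mutant is killed when the suite fails against it.
    static long countKilled(Predicate<IntPredicate> suite) {
        return mutants().values().stream().filter(mutant -> !suite.test(mutant)).count();
    }

    public static void main(String[] args) {
        mutants().forEach((operator, mutant) ->
            System.out.println(operator + ": "
                + (weakSuite(mutant) ? "survived" : "killed") + " by weak suite, "
                + (strongSuite(mutant) ? "survived" : "killed") + " by strong suite"));
    }
}
```

The weak suite lets the `==` mutant survive because it never tries a value above 10. That surviving mutant points directly at the missing test.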

If the test suite can find these artificial bugs, can it find real ones?

The competent programmer hypothesis

Programmers are generally competent enough to produce code which is at least almost correct 

Three types of Bugs

  • Built the wrong thing
  • Built it wrong
  • Oops

The coupling effect

Tests that can distinguish a program differing from a correct one by only simple errors can also implicitly distinguish more complex errors

Strong empirical evidence for this [1]

  1. A. Offutt. 1989. The coupling effect: fact or fiction. In Proceedings of the ACM SIGSOFT '89 third symposium on Software testing, analysis, and verification (TAV3), Richard A. Kemmerer (Ed.). http://dx.doi.org/10.1145/75308.75324

So if your tests can find these mutants, they will probably find real bugs

But what about this?

public void someFunction(int i) {
    if (i <= 100) {
        throw new IllegalArgumentException();
    }
    if (i > 100) { // changed from >= to >
        doSomething();
    }
}

It is not possible to write a test to kill this mutant

Any value that gets past the first check is already greater than 100, so >= and > behave identically

The mutant is said to be equivalent
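A small sketch of what "equivalent" means in practice, assuming the second condition is mutated from >= to >. The methods return whether doSomething() would be reached, which is a simplification made for this example. Sampling a range of inputs is not a proof, but here the guard guarantees i > 100 on every path that reaches the second check, so no input can tell the two versions apart.

```java
import java.util.function.IntPredicate;

public class EquivalentMutantCheck {

    // Original: returns true if doSomething() would be reached.
    static boolean original(int i) {
        if (i <= 100) {
            throw new IllegalArgumentException();
        }
        return i >= 100;
    }

    // Mutant: >= changed to > in the second condition.
    static boolean mutant(int i) {
        if (i <= 100) {
            throw new IllegalArgumentException();
        }
        return i > 100;
    }

    // Turns the observable behaviour (result or exception) into a comparable string.
    static String observe(IntPredicate function, int i) {
        try {
            return String.valueOf(function.test(i));
        } catch (IllegalArgumentException e) {
            return "throws";
        }
    }

    // True if original and mutant are indistinguishable for every input in [from, to].
    static boolean behaveTheSame(int from, int to) {
        for (int i = from; i <= to; i++) {
            String expected = observe(EquivalentMutantCheck::original, i);
            String actual = observe(EquivalentMutantCheck::mutant, i);
            if (!expected.equals(actual)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("same behaviour on -1000..1000: " + behaveTheSame(-1000, 1000));
    }
}
```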

Equivalent mutants can highlight redundancy...

// Before: the second check is always true
public void someFunction(int i) {
    if (i <= 100) {
        throw new IllegalArgumentException();
    }
    if (i >= 100) {
        doSomething();
    }
}

// After: redundant check removed
public void someFunction(int i) {
    if (i <= 100) {
        throw new IllegalArgumentException();
    }
    doSomething();
}

Mutation testing highlights code which definitely is tested

Gives a very high confidence in the test suite

It can highlight redundant code

It can sometimes find bugs

It effectively tests your tests

Does it fit?

  • A development activity, not a QA step
  "...mutation testing with productive mutants does not add a significant overhead to the software development process..." [1]

Based on testing at Google

  1. Petrović, Goran, Marko Ivanković, et al. "An Industrial Application of Mutation Testing: Lessons, Challenges, and Research Directions." Proceedings of the International Workshop on Mutation Analysis (Mutation). IEEE Press, Piscataway, NJ, USA. 2018.

How do I try it out?

<plugin>
    <groupId>org.pitest</groupId>
    <artifactId>pitest-maven</artifactId>
    <version>1.4.1</version>
</plugin>

 

Tips

  • Mutation testing takes time
    • Target specific parts of the code base on large projects
    • On CI, probably do not run on every commit
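One way to target part of a code base is the plugin's targetClasses and targetTests settings. The fragment below is a sketch only: the package name com.example.calculator is hypothetical, and the patterns need adjusting to your own project layout.

```xml
<plugin>
    <groupId>org.pitest</groupId>
    <artifactId>pitest-maven</artifactId>
    <version>1.4.1</version>
    <configuration>
        <!-- Hypothetical package: only classes matching this pattern are mutated -->
        <targetClasses>
            <param>com.example.calculator.*</param>
        </targetClasses>
        <!-- Hypothetical package: only tests matching this pattern are run -->
        <targetTests>
            <param>com.example.calculator.*</param>
        </targetTests>
    </configuration>
</plugin>
```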

mvn clean install org.pitest:pitest-maven:mutationCoverage

The Demo
