MUTATION TESTING

Can we write perfect tests? - Maybe!

MY TESTING JOURNEY

First I HATED it.

Then I FEARED it.

Later I did not do ENOUGH.

Finally too MUCH.

TEST METRICS

How can you prove the tests are solid?

And how do you make sure you got all the edges?

TEST TO CODE RATIO

I'm joking, yeah.

Let's move on.

LINE COVERAGE

a start

def cover_me(input)
  input ? :foo : :bar
end

100% covering test case :(

expect(cover_me(true)).to be(:foo)

It never specifies the else branch.
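The gap can be reproduced with Ruby's standard `coverage` library. This is a sketch added for illustration, not part of the talk: it writes `cover_me` to a temporary file (the file name prefix is arbitrary), exercises only the `true` path, and compares line hits with branch hits.

```ruby
require 'coverage'
require 'tempfile'

# The method from the slide, written to a file because Coverage only
# tracks code that is loaded after Coverage.start.
source = <<~RUBY
  def cover_me(input)
    input ? :foo : :bar
  end
RUBY

file = Tempfile.new(['cover_me', '.rb'])
file.write(source)
file.close

Coverage.start(lines: true, branches: true)
load file.path

cover_me(true) # the only "test" call: the else branch never runs

# Look the entry up by file name instead of assuming the exact key format.
_path, report = Coverage.result.find do |path, _|
  path.end_with?(File.basename(file.path))
end

line_hits   = report[:lines].compact                      # nil marks non-executable lines
branch_hits = report[:branches].values.flat_map(&:values) # hit count per branch target

puts "lines:    #{line_hits.inspect}"
puts "branches: #{branch_hits.inspect}"
```

Every executable line reports at least one hit, while one branch target reports zero. Line coverage alone cannot see that.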

BRANCH/Statement Coverage

(more sophisticated)

def cover_me
  side_effect_a # No test for this one :(
  side_effect_b
end

100% covering test case :(

 expect { cover_me }.to change { side_effect_b }.from(initial).to(other)

It never specifies side effect a.
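A minimal sketch of the same blind spot in plain Ruby, without a test framework. The `Recorder` class and its counters are hypothetical stand-ins for the slide's `side_effect_a` and `side_effect_b`. The first check alone already executes every statement; only the second would notice `side_effect_a` being deleted.

```ruby
# Hypothetical stand-in for an object with two independent side effects.
class Recorder
  attr_reader :a, :b

  def initialize
    @a = 0
    @b = 0
  end

  def cover_me
    side_effect_a
    side_effect_b
  end

  private

  def side_effect_a
    @a += 1
  end

  def side_effect_b
    @b += 1
  end
end

recorder = Recorder.new
recorder.cover_me

# Equivalent of the slide's test: every statement ran, only b is checked.
raise 'side effect b missing' unless recorder.b == 1

# The missing specification: without it, deleting side_effect_a goes unnoticed.
raise 'side effect a missing' unless recorder.a == 1
```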

MUTATION COVERAGE

My unscientific claim:

Mutation-Coverage > Statement-Coverage > Line-Coverage > Test-To-Code-Ratio

TESTING RUBY

It is even harder!

Sub 100% coverage guarantees you deploy code with bugs.

REAL STORY

def foo
end

def bar
  baz ? fooo : other # note misspelled method call "fooo"
end

Only running this code can identify the spelling error!

(Some IDEs will still try but fail)
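A runnable version of the story, as a sketch. The `Example` class, its `baz` reader, and the return values are assumptions added to make the slide's methods self-contained. Defining the class raises nothing, and the else branch works; the typo only surfaces when the then branch actually runs.

```ruby
# Hypothetical class wrapping the slide's methods.
class Example
  attr_reader :baz

  def initialize(baz)
    @baz = baz
  end

  def foo
    :foo
  end

  def other
    :other
  end

  def bar
    baz ? fooo : other # "fooo" is the misspelled call from the slide
  end
end

# The else branch runs without complaint.
result = Example.new(false).bar

# Only executing the then branch reveals the typo.
error =
  begin
    Example.new(true).bar
    nil
  rescue NameError => e
    e
  end

puts result.inspect
puts error.class
```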

DEFINITION

A mutation testing tool changes (mutates) your code strategically and expects your tests to FAIL!

Mutants with failing tests are KILLED.

Mutants without failing tests are ALIVE.

Fear ALIVE mutants. They should be dead!
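The definition can be sketched as a tiny loop. Everything here is hypothetical: the lambdas stand in for mutated versions of a method (a real tool rewrites the code itself), and `tests` stands in for a test suite that returns `true` when it passes.

```ruby
# Hypothetical original implementation.
original = ->(age) { age >= 18 }

# Hand-written variants standing in for generated mutants.
mutants = {
  'replace >= with >'      => ->(age) { age > 18 },
  'replace >= with <='     => ->(age) { age <= 18 },
  'replace body with true' => ->(_age) { true }
}

# A deliberately weak test suite: returns true when all checks pass.
tests = lambda do |adult|
  adult.call(30) == true && adult.call(5) == false
end

raise 'tests must pass on the original' unless tests.call(original)

# A mutant is killed when the suite fails against it, alive otherwise.
report = mutants.transform_values do |mutant|
  tests.call(mutant) ? :alive : :killed
end

report.each { |name, status| puts "#{status.to_s.upcase.ljust(6)} #{name}" }
```

The surviving `>` mutant points at the untested boundary: no check calls the method with exactly 18.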

What to MUTATE

CODE!

HOW?

String#gsub is NOT an option.

Transformable representations of code must be used.
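A sketch of why text replacement falls short. It uses Ripper from Ruby's standard library only as a syntax check; that is a substitution made for this example, not the parser the talk refers to. The intended mutation replaces the condition `input` with `true`, but a textual replace cannot tell the condition from the parameter.

```ruby
require 'ripper'

source = <<~RUBY
  def cover_me(input)
    input ? :foo : :bar
  end
RUBY

# Textual "mutation": rewrites the parameter name as well as the condition.
textual = source.gsub('input', 'true')

# Ripper.sexp returns nil when the source does not parse.
original_parses = !Ripper.sexp(source).nil?
textual_parses  = !Ripper.sexp(textual).nil? # `def cover_me(true)` is a syntax error

puts textual
puts "original parses: #{original_parses}"
puts "textual parses:  #{textual_parses}"
```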

AST-BASED

Some code:

def foo
  bar
end

Some AST:

(whitequark/parser)

(def :foo
  (args)
  (send nil :bar))

BYTECODE BASED

some bytecode (rbx)

============= :__script__ ==============
0000:  push_rubinius              
0001:  push_literal               :foo
0003:  push_literal               #<Rubinius::CompiledCode foo file=x.rb>
0005:  push_scope                 
0006:  push_variables             
0007:  send_stack                 :method_visibility, 0
0010:  send_stack                 :add_defn_method, 4
0013:  pop                        
0014:  push_true                  
0015:  ret
================= :foo =================
0000:  push_self                  
0001:  send_method                :bar
0003:  ret

Not used by any Ruby tool so far.
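For comparison, MRI (CRuby) can print its own bytecode for the same method. This sketch only disassembles and mutates nothing. `RubyVM` is MRI-specific, and the instruction names differ from the Rubinius listing and vary between Ruby versions.

```ruby
# Compile the slide's method and print MRI's instruction listing.
iseq = RubyVM::InstructionSequence.compile("def foo\n  bar\nend\n")

disassembly = iseq.disasm
puts disassembly
```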

MUTATION EXAMPLE

Original:

def cover_me(input)
  input ? :foo : :bar
end

Mutation:

def cover_me(input)
  true ? :foo : :bar
end

The test?

expect(cover_me(true)).to eql(:foo)

Still passes! - Mutant is ALIVE.

KILLING - Mutation

Mutation

def cover_me(input)
  true ? :foo : :bar
end

Killing Test!

expect(cover_me(true)).to eql(:foo)
expect(cover_me(false)).to eql(:bar)

He's dead, Jim!

THE BAD EXAMPLE

Nobody is perfect.

def square_root_bug(value)
  3
end

Test

expect(square_root_bug(9)).to be(3)

Mutation testing cannot expose this invalid implementation: the test pins the only value the method ever returns, so every mutant gets killed.
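A sketch of that dead end. The mutant list is hand-written and illustrative, not the output of any tool. Every mutant of the literal `3` fails the single test, so mutation coverage is perfect, yet a second example shows the implementation is still wrong.

```ruby
# The (wrong) implementation from the slide.
original = ->(_value) { 3 }

# Illustrative mutants of its only expression, the literal 3.
mutants = [
  ->(_value) { 0 },
  ->(_value) { 4 },
  ->(_value) { -3 },
  ->(_value) { nil }
]

# The test from the slide.
test = ->(square_root) { square_root.call(9) == 3 }

passes_on_original = test.call(original)
all_killed         = mutants.none? { |mutant| test.call(mutant) }

# Only an additional example exposes the bug.
second_example_passes = original.call(16) == 4

puts "test passes on original: #{passes_on_original}"
puts "all mutants killed:      #{all_killed}"
puts "second example passes:   #{second_example_passes}"
```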

Mutation Operators

literal / primitive and compound
statement deletion
conditional
binary connective replacement
argument deletion / rename / swap
unary operator exchange
bitwise
many, many more!

In the REAL WORLD

Subjects:  424       # Number of subjects (methods) being mutated
Mutations: 6760      # Number of mutations mutant generated (~16 per method)
Kills:     6664      # Number of successfully killed mutations
Runtime:   5123.13s  # Total runtime
Killtime:  5092.63s  # Time spent killing mutations (~85 min)
Overhead:  0.60%
Coverage:  98.58%    # Coverage score
Alive:     96

These numbers are outdated.

mutant-0.3.0.rc1 is 35-50% faster.
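The derived figures in the report above can be recomputed from the raw counts. The formulas are inferred from the numbers, not taken from mutant's source: coverage as kills over mutations, overhead as the share of runtime not spent killing.

```ruby
mutations = 6760
kills     = 6664
runtime   = 5123.13 # seconds
killtime  = 5092.63 # seconds

alive    = mutations - kills
coverage = (kills.to_f / mutations * 100).round(2)
overhead = ((runtime - killtime) / runtime * 100).round(2)

puts format('Alive:    %d', alive)
puts format('Coverage: %.2f%%', coverage)
puts format('Overhead: %.2f%%', overhead)
```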

REPORTING

evil:ROM::Mapper::Dumper#identity:/home/mbj/devel/rom-mapper/lib/rom/mapper/dumper.rb:18:08a61
@@ -1,6 +1,6 @@
 def identity(object)
   header.keys.map do |key|
-    object.send(key.name)
+    object.public_send(key.name)
   end
 end
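Why this surviving mutant is worth a look: `send` ignores method visibility, `public_send` respects it. If the tests still pass after the swap, nothing depends on calling private methods, and the stricter call is probably the better fit. A sketch with a hypothetical `Account` class:

```ruby
# Hypothetical class with one public and one private method.
class Account
  def name
    'alice'
  end

  private

  def secret
    'hunter2'
  end
end

account = Account.new

public_results = [account.send(:name), account.public_send(:name)] # identical
private_result = account.send(:secret)                             # send ignores visibility

private_call_rejected =
  begin
    account.public_send(:secret)
    false
  rescue NoMethodError
    true
  end

puts public_results.inspect
puts private_call_rejected
```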

SPEED

Mutation testing is slow.

For N mutants your tests get to run N times.

Write real unit tests to keep your test execution fast.

TEST SELECTION

Selecting the correct subset of your tests is the key.

Known strategies:

Brute force
(Method-)name-based mapping
Line-coverage metrics to identify the subject-to-test mapping
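A sketch of name-based mapping with a brute-force fallback. The group names (apart from the subject `ROM::Mapper::Dumper#identity`) and the `select_tests` helper are hypothetical; this is not a description of how mutant selects tests.

```ruby
# Hypothetical example group descriptions, as an RSpec-style suite might name them.
EXAMPLE_GROUPS = [
  'ROM::Mapper::Dumper#identity',
  'ROM::Mapper::Dumper#tuple',
  'ROM::Mapper::Dumper',
  'ROM::Mapper::Loader#object'
].freeze

# Prefer groups named after the method, fall back to groups named after
# the class, and finally to the whole suite (brute force).
def select_tests(subject, groups)
  class_name = subject.split('#').first

  by_method = groups.select { |group| group == subject }
  return by_method unless by_method.empty?

  by_class = groups.select { |group| group == class_name }
  return by_class unless by_class.empty?

  groups
end

puts select_tests('ROM::Mapper::Dumper#identity', EXAMPLE_GROUPS).inspect
```

The narrower the selection, the fewer tests run per mutant, which is where most of the runtime goes.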

ISOLATION

Mutations must be isolated from each other.

Strategies: fork(), sandboxing
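A sketch of fork-based isolation. The `Subject` class and `tests_pass?` helper are hypothetical. The mutation is applied inside a child process, the exit status carries the test result back, and the parent's code stays untouched. `Kernel#fork` is not available on every platform (Windows, for example).

```ruby
# Hypothetical subject and test, kept tiny on purpose.
class Subject
  def cover_me(input)
    input ? :foo : :bar
  end
end

def tests_pass?
  subject = Subject.new
  subject.cover_me(true) == :foo && subject.cover_me(false) == :bar
end

# Apply the mutation and run the tests inside a child process.
pid = fork do
  Subject.class_eval do
    def cover_me(_input)
      true ? :foo : :bar # the mutation
    end
  end
  exit!(tests_pass? ? 0 : 1) # exit status carries the test result
end

_pid, status = Process.wait2(pid)
killed = !status.success?

puts "mutant killed:    #{killed}"
puts "parent unchanged: #{tests_pass?}"
```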

EQUIVALENT MUTANTS

Original:

i = 0
while i != 10
  do_something
  i += 1
end

Equivalent mutant:

i = 0
while i < 10
  do_something
  i += 1
end

No observable change, yet reported as an alive mutation.
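A sketch showing why no test can tell the two loops apart. The `run_loop` helper is hypothetical and counts how often the body runs, with a counter standing in for `do_something`.

```ruby
# Count how often the loop body runs for a given loop condition.
def run_loop(condition)
  calls = 0
  i = 0
  while condition.call(i)
    calls += 1 # stands in for do_something
    i += 1
  end
  calls
end

original_calls = run_loop(->(i) { i != 10 })
mutant_calls   = run_loop(->(i) { i < 10 })

puts original_calls == mutant_calls
```

Because `i` starts at 0 and grows by exactly 1, both conditions agree at every step.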

INFINITE Runtime

Original

while expression
  do_something
end

Mutation

while true
  do_something
end

Only killable via a timeout or by refactoring.
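A sketch of the time-based option using Ruby's standard `timeout` library. The `run_with_budget` helper and the 0.2 second budget are hypothetical, and counting a timeout as a kill is an assumption of this sketch, not necessarily the policy of any given tool.

```ruby
require 'timeout'

# Run a test block with a time budget and classify the mutant.
def run_with_budget(seconds)
  Timeout.timeout(seconds) do
    passed = yield
    passed ? :alive : :killed
  end
rescue Timeout::Error
  :killed # assumption: running out of time counts as a kill
end

# Tests against the mutated loop from the slide never return.
looping = run_with_budget(0.2) do
  while true
    sleep 0.01 # stands in for do_something
  end
  true
end

# Tests that finish in time and pass leave the mutant alive.
finishing = run_with_budget(0.2) { true }

puts looping
puts finishing
```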

Hunting CHEAT SHEET

Refactor mutations away, avoid literals

 array[0] => array.first

Only use syntactic constructs when needed

 ::Foo => Foo

Do not pass literal defaults into methods

 string.to_i(10) => string.to_i
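A quick check that each rewrite on this cheat sheet keeps the behavior in the common case. The sample values and the `Foo` module are hypothetical. The right-hand forms leave no literal (`0`, `10`) or extra syntax (`::`) for a mutation tool to change.

```ruby
# Hypothetical constant for the ::Foo => Foo rewrite.
module Foo
end

array  = [1, 2, 3]
string = '42'

same_element  = array[0] == array.first
same_constant = ::Foo.equal?(Foo)
same_integer  = string.to_i(10) == string.to_i

puts [same_element, same_constant, same_integer].inspect
```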

To be continued...

FINALLY

Thank you for listening!

https://github.com/mbj/mutant

Contact:

Markus Schirp

https://github.com/mbj

https://twitter.com/_m_b_j

THANKS

%w(
  j-j-k dkubb whitequark
  solnic snusnu postmodern
  txus and_all_i_forgot
).shuffle

For various achievements and all the related work.

mutation-testing

By Markus Schirp
