Fuzzing test in Go

David Chou @ Golang Taipei

CC-BY-SA-3.0-TW

@ Umbo Computer Vision

@ Golang Taipei Co-organizer 🙋‍♂️


Software engineer, DevOps, and Gopher 👨‍💻

david74.chou @ gmail

david74.chou @ facebook

david74.chou @ medium

david7482 @ github

What is a fuzzing test?

Wikipedia: an automated software testing technique that provides invalid, unexpected, or random data as inputs to a computer program.

A brief history of fuzzing

  • 1950s: programs are tested with decks of punch cards taken from the trash

  • 1988: the term "fuzzing" is coined by Barton Miller

We didn't call it fuzzing back in the 1950s, but it was our standard practice to test programs by inputting decks of punch cards taken from the trash. This type of testing was so common that it had no name. - Gerald M. Weinberg

Fuzzing is the process of sending intentionally invalid data to a product in the hopes of triggering an error.
- H.D. Moore

Fuzzing test

  • Continuously manipulate inputs

  • Semi-random data from various mutations

  • Discover new code coverage based on instrumentation

  • Run more mutations quickly;
    rather than fewer mutations intelligently

What can be fuzzed?

  • deserialization (xml, json, proto, gob)

  • network protocols (HTTP, SMTP)

  • media codecs (audio, video, images, pdf)

  • crypto (boringssl, openssl)

  • compression (zip, gzip, bzip2, brotli)

  • etc

Why do we need fuzzing?

you don't know what you don't know

Why do we need fuzzing?

  • Fuzzing can reach edge cases which humans often miss

  • It is particularly valuable for finding vulnerabilities

  • Also a good choice for regression testing

  • Lots of real-world trophies

    • found 15,000+ bugs in Chrome [link]

    • found 1,500+ bugs in FFmpeg [link]

A simple example

func CountAverage(num []byte) int {
	sum := byte(0)
	for _, v := range num {
		sum += v
	}
	return int(sum) / len(num)
}

func TestCountAverage(t *testing.T) {
	tests := []struct {
		name string
		num []byte
		want int
	}{
		{
			num: []byte{1, 2, 3, 4, 5},
			want: 3,
		},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := CountAverage(tt.num)
			assert.EqualValues(t, tt.want, got)
		})
	}
}

$ go test -run TestCountAverage -cover
PASS
coverage: 100.0% of statements

A real-world example:

OpenSSL Heartbleed

Heartbleed fuzzing

150255 REDUCE cov: 485 ft: 756 corp: 38/15713b exec/s: 25042 
              rss: 402Mb L: 2891/2891 MS: 1 EraseBytes-
=================================================================
==6098==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x629000009748 at pc 0x0000005133a2 bp 0x7fffe29233c0 sp 0x7fffe2922b70
READ of size 48830 at 0x629000009748 thread T0
   #0 0x5133a1 in __asan_memcpy (/app/handshake-fuzzer+0x5133a1)
   #1 0x5630c8 in tls1_process_heartbeat /app/openssl-1.0.1f/ssl/t1_lib.c:2586:3
   #2 0x5cfa9d in ssl3_read_bytes /app/openssl-1.0.1f/ssl/s3_pkt.c:1092:4
   #3 0x5d42da in ssl3_get_message /app/openssl-1.0.1f/ssl/s3_both.c:457:7
   #4 0x59f537 in ssl3_get_client_hello /app/openssl-1.0.1f/ssl/s3_srvr.c:941:4
   #5 0x59b5a9 in ssl3_accept /app/openssl-1.0.1f/ssl/s3_srvr.c:357:9
   #6 0x551335 in LLVMFuzzerTestOneInput /app/handshake-fuzzer.cc:66:3

...

SUMMARY: AddressSanitizer: heap-buffer-overflow (/app/handshake-fuzzer+0x5133a1) in __asan_memcpy

Also works for logical bugs

  • Sanity check still works

    • the result must be within [0, 1) range

    • image decoder: 100 byte input -> 100 MB output?

    • encryption: check that decryption fails with the wrong key

    • sorting: every element is still present and the order is as expected
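
The sorting check can be written as a plain function that a fuzz function then calls on every generated input. This is a minimal sketch, not code from the talk: SortBytes and CheckSort are hypothetical names, and SortBytes stands in for whatever sorting code is under test.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// SortBytes is a stand-in for the code under test (hypothetical name).
func SortBytes(in []byte) []byte {
	out := append([]byte(nil), in...)
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

// CheckSort verifies two properties of SortBytes for an arbitrary input:
// the output is a permutation of the input, and it is in non-decreasing order.
func CheckSort(in []byte) error {
	out := SortBytes(in)
	if len(out) != len(in) {
		return errors.New("length changed")
	}
	// Count every byte value in the input, then subtract the output.
	var counts [256]int
	for _, b := range in {
		counts[b]++
	}
	for _, b := range out {
		counts[b]--
	}
	for v, c := range counts {
		if c != 0 {
			return fmt.Errorf("byte %d count differs by %d", v, c)
		}
	}
	for i := 1; i < len(out); i++ {
		if out[i-1] > out[i] {
			return fmt.Errorf("out of order at index %d", i)
		}
	}
	return nil
}

func main() {
	fmt.Println(CheckSort([]byte{3, 1, 2, 1}))
}
```

Inside a fuzz function the check would be turned into a failure, for example by calling t.Fatal when CheckSort returns a non-nil error.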

Also works for logical bugs

  • Round-trip test

    • deserialize -> serialize -> deserialize

    • decompress/compress, decrypt/encrypt

  • Check

    • serialize does not fail

    • 2nd deserialize does not fail

    • deserialize results are equal
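
The three checks can be sketched as one function. encoding/json is used here only as a convenient example format, and RoundTrip is a hypothetical name; the same shape should apply to other serializers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// RoundTrip runs deserialize -> serialize -> deserialize on data.
// Inputs that fail the first deserialize are not bugs, so they return nil.
func RoundTrip(data []byte) error {
	var first interface{}
	if err := json.Unmarshal(data, &first); err != nil {
		return nil // invalid input: nothing to check
	}
	encoded, err := json.Marshal(first)
	if err != nil {
		return fmt.Errorf("serialize failed: %w", err)
	}
	var second interface{}
	if err := json.Unmarshal(encoded, &second); err != nil {
		return fmt.Errorf("2nd deserialize failed: %w", err)
	}
	if !reflect.DeepEqual(first, second) {
		return fmt.Errorf("results differ: %#v vs %#v", first, second)
	}
	return nil
}

func main() {
	fmt.Println(RoundTrip([]byte(`{"a":[1,2,{"b":null}]}`)))
	fmt.Println(RoundTrip([]byte(`not json`)))
}
```

Skipping inputs that fail the first deserialize is a design choice: the fuzzer produces mostly invalid data, and only valid inputs can say anything about round-trip behaviour.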

Fuzzing test in Go

go-fuzz to the rescue

go-fuzz

  • Dmitry Vyukov, Google

  • A successful 3rd-party Go fuzzing solution

  • It found 200+ bugs in the Go stdlib, and thousands more elsewhere

  • Coverage-based fuzzing

Instrument program for code coverage
Collect initial corpus of inputs
for {
    Randomly mutate an input from the corpus
    Execute and collect coverage
    If the input gives new coverage, add it to corpus
}
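
The loop above can be made concrete with a toy target. This is a sketch under stated assumptions, not how go-fuzz is implemented: target reports its own "coverage" as a bitmask instead of being instrumented, and target, mutate and Fuzz are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
)

// target is a toy program under test. It reports which "edges" an input
// reached (as a bitmask) and whether it hit the bug hidden behind "FUZ".
func target(data []byte) (cov uint8, crashed bool) {
	cov |= 1
	if len(data) > 0 && data[0] == 'F' {
		cov |= 2
		if len(data) > 1 && data[1] == 'U' {
			cov |= 4
			if len(data) > 2 && data[2] == 'Z' {
				return cov | 8, true
			}
		}
	}
	return cov, false
}

// mutate returns a copy of in with one random byte overwritten or appended.
func mutate(r *rand.Rand, in []byte) []byte {
	out := append([]byte(nil), in...)
	if len(out) == 0 || r.Intn(4) == 0 {
		return append(out, byte(r.Intn(256)))
	}
	out[r.Intn(len(out))] = byte(r.Intn(256))
	return out
}

// Fuzz runs the coverage-guided loop for at most maxExecs executions and
// returns the crashing input, if one was found.
func Fuzz(seed int64, maxExecs int) ([]byte, bool) {
	r := rand.New(rand.NewSource(seed))
	corpus := [][]byte{{}} // initial corpus: one empty input
	var seen uint8         // union of all coverage observed so far
	for i := 0; i < maxExecs; i++ {
		input := mutate(r, corpus[r.Intn(len(corpus))])
		cov, crashed := target(input)
		if crashed {
			return input, true
		}
		if cov&^seen != 0 { // new coverage: keep the input
			seen |= cov
			corpus = append(corpus, input)
		}
	}
	return nil, false
}

func main() {
	input, found := Fuzz(1, 1000000)
	fmt.Printf("found=%v input=%q\n", found, input)
}
```

The corpus is what makes the search tractable: each matched prefix byte is kept as a new starting point, so the three bytes are found one at a time rather than all at once by blind guessing.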

1. Write fuzz function

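// +build gofuzz

// Fuzz is the entry point that go-fuzz calls with each generated input.
func Fuzz(data []byte) int {
  // The decode error is ignored: go-fuzz looks for panics and hangs, not returned errors.
  gob.NewDecoder(bytes.NewReader(data)).Decode(new(interface{}))
  return 0 // 0: no hint about this input's corpus priority
}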

2. Build

go get github.com/dvyukov/go-fuzz/...
go-fuzz-build github.com/dvyukov/go-fuzz-corpus/gob

3. Run

go-fuzz -bin gob-fuzz.zip -workdir ./workdir

workers: 8, corpus: 1525 (6s ago), crashers: 6, execs: 0 (0/sec), cover: 1651, uptime: 6s
workers: 8, corpus: 1525 (9s ago), crashers: 6, execs: 16787 (1860/sec), cover: 1651, uptime: 9s
workers: 8, corpus: 1525 (12s ago), crashers: 6, execs: 29840 (2482/sec), cover: 1651, uptime: 12s

go-fuzz's problems

  • Has broken multiple times because of changes to Go's internal packages.

  • It tries to do coverage instrumentation without the compiler's help.

  • More difficult to use than Go's unit testing

    • custom command-line tools,
      separate test files or build tags, etc.

Go's official fuzzing proposal

go test -fuzz

  • Official proposal [link]

  • Write a fuzz function just like a test function

    • func FuzzFoo(f *testing.F)

  • Integrated with the go command

    • go test -fuzz

  • Coverage-based fuzzing

  • Planned to land in Go 1.18

Already in beta

func FuzzCountAverage(f *testing.F) {
	f.Add([]byte{1})
	f.Fuzz(func(t *testing.T, num []byte) {
		CountAverage(num)
	})
}

  • The fuzz target is a FuzzX function

  • Each fuzz target has its own corpus input

  • testing.F

    • f.Add(): add seed corpus

    • f.Fuzz(): run the fuzz function

$ gotip test -fuzz=FuzzCountAverage -parallel=2
fuzzing, elapsed: 3.0s, execs: 40648 (13549/sec), workers: 2, interesting: 3
fuzzing, elapsed: 3.4s, execs: 44291 (13157/sec), workers: 2, interesting: 3
found a crash, minimizing...
--- FAIL: FuzzCountAverage (3.37s)
        panic: runtime error: integer divide by zero
        goroutine 21364 [running]:
        runtime/debug.Stack()
                /home/david74/sdk/gotip/src/runtime/debug/stack.go:24 +0x90
        testing.tRunner.func1.2({0x69e4c0, 0x887760})
                /home/david74/sdk/gotip/src/testing/testing.go:1281 +0x267
        testing.tRunner.func1()
                /home/david74/sdk/gotip/src/testing/testing.go:1288 +0x218
        panic({0x69e4c0, 0x887760})
                /home/david74/sdk/gotip/src/runtime/panic.go:1038 +0x215
        github.com/david7482/go-fuzzing-playground.CountAverage({0xc000246000, 0x0, 0x0})
                /home/david74/projects/go-fuzzing-playground/count_average.go:8 +0xa5
        ...
        --- FAIL: FuzzCountAverage (0.00s)
    
Crash written to 
  testdata/corpus/FuzzCountAverage/d40a98862ed393eb712e47a91bcef18e6f24cf368bb4bd248c7a7101ef8e178d
To re-run:
go test github.com/david7482/go-fuzzing-playground \
  -run=FuzzCountAverage/d40a98862ed393eb712e47a91bcef18e6f24cf368bb4bd248c7a7101ef8e178d
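
The panic reported above comes from dividing by len(num) when the slice is empty. One possible fix is sketched below; CountAverageSafe is a hypothetical name, and returning 0 for empty input is an assumption (returning an error would be just as reasonable). The sketch also sums into an int, because the byte accumulator in the original wraps around at 255, a logic bug that a crash-only fuzz target does not report.

```go
package main

import "fmt"

// CountAverageSafe is a hypothetical corrected variant of CountAverage.
// It returns 0 for an empty slice instead of dividing by zero, and it
// accumulates into an int so the sum cannot wrap around at 255.
func CountAverageSafe(num []byte) int {
	if len(num) == 0 {
		return 0
	}
	sum := 0
	for _, v := range num {
		sum += int(v)
	}
	return sum / len(num)
}

func main() {
	fmt.Println(CountAverageSafe(nil))
	fmt.Println(CountAverageSafe([]byte{1, 2, 3, 4, 5}))
	fmt.Println(CountAverageSafe([]byte{200, 200}))
}
```

With the original byte accumulator, the input {200, 200} sums to 144 after wrap-around and yields 72 instead of 200.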
func FuzzUnmarshal(f *testing.F) {
	f.Add([]byte{1})
	f.Fuzz(func(t *testing.T, input []byte) {
		var v interface{}
		// Only panics matter here, so the returned error is ignored.
		_ = yaml.Unmarshal(input, &v)
	})
}

go-yaml/yaml

$ gotip test -fuzz=FuzzUnmarshal
fuzzing, elapsed: 3.0s, execs: 62242 (20740/sec), workers: 4, interesting: 41
fuzzing, elapsed: 6.0s, execs: 127025 (21168/sec), workers: 4, interesting: 48
...
fuzzing, elapsed: 1794.0s, execs: 39365685 (21943/sec), workers: 4, interesting: 324
fuzzing, elapsed: 1796.9s, execs: 39427737 (21942/sec), workers: 4, interesting: 324
found a crash, minimizing...
--- FAIL: FuzzUnmarshal (1796.90s)
        panic: runtime error: invalid memory address or nil pointer dereference
        goroutine 9884315 [running]:
        panic({0x72d820, 0x93abe0})
                /home/ubuntu/sdk/gotip/src/runtime/panic.go:1038 +0x215
        gopkg.in/yaml%2ev3.handleErr(0xc00007f6b0)
                /home/ubuntu/go/pkg/mod/gopkg.in/yaml.v3@v3.0.0-20210107192922-496545a6307b/yaml.go:294 +0xc5
        panic({0x72d820, 0x93abe0})
                /home/ubuntu/sdk/gotip/src/runtime/panic.go:1038 +0x215
        gopkg.in/yaml%2ev3.yaml_parser_split_stem_comment(0xc00bf34c00, 0x1)
                /home/ubuntu/go/pkg/mod/gopkg.in/yaml.v3@v3.0.0-20210107192922-496545a6307b/parserc.go:789 +0x6a
        gopkg.in/yaml%2ev3.yaml_parser_parse_block_sequence_entry(0xc00bf34c00, 0xc00bf34eb0, 0x0)
                /home/ubuntu/go/pkg/mod/gopkg.in/yaml.v3@v3.0.0-20210107192922-496545a6307b/parserc.go:703 +0x293
        gopkg.in/yaml%2ev3.yaml_parser_state_machine(0xc00bf34c00, 0x40df54)
        ...
        --- FAIL: FuzzUnmarshal (0.00s)

Crash written to 
  testdata/corpus/FuzzUnmarshal/9c9e78ca4b2c797536d2fbe662c68321c5c3ab6df680664b23c913799fc7f092
To re-run:
go test gopkg.in/yaml.v2 \
  -run=FuzzUnmarshal/9c9e78ca4b2c797536d2fbe662c68321c5c3ab6df680664b23c913799fc7f092
package main

import (
	"fmt"

	"gopkg.in/yaml.v3"
)

func main() {
	in := "#\n-[["

	var n yaml.Node
	if err := yaml.Unmarshal([]byte(in), &n); err != nil {
		fmt.Println(err)
	}
}

  • It does fuzzing with multiple processes

  • Seed corpus folder: ${pkg}/testdata/corpus

  • Seed corpus = seeds in files + seeds in test

  • A good seed corpus can save the mutation engine a lot of work

  • Regression test

    • go test (no -fuzz) also runs the FuzzX functions, using the seed corpus as input

Current limitation

  • Only supports []byte and primitive types;
    no struct, slice, or array support

  • Cannot run multiple fuzzers in the same pkg

  • Cannot keep running after a crash is found 

  • Cannot convert existing files to the corpus format

go test fuzz v1
float(45.241)
int(12345)
[]byte("ABC\xa8\x8c\xb3G\xfc")
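
Since there is no converter for existing files, a small helper can produce the text format shown above by hand. This is a sketch that assumes the "go test fuzz v1" layout exactly as it appears on this slide, with a single []byte value; EncodeCorpusEntry is a hypothetical name, not part of the go tool.

```go
package main

import "fmt"

// EncodeCorpusEntry renders raw bytes in the "go test fuzz v1" text format
// shown above, with a single []byte value. The %q verb escapes the bytes
// using Go string-literal syntax.
func EncodeCorpusEntry(data []byte) []byte {
	return []byte(fmt.Sprintf("go test fuzz v1\n[]byte(%q)\n", data))
}

func main() {
	fmt.Print(string(EncodeCorpusEntry([]byte("ABC"))))
}
```

Writing the result into the seed corpus folder is left to the caller.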

How "go test -fuzz" works

show me the code

Instrument program for code coverage
Collect initial corpus of inputs
for {
    Randomly mutate an input from the corpus
    Execute and collect coverage
    If the input gives new coverage, add it to corpus
}

  • The architecture of "go test -fuzz"

  • How it collects code coverage

  • How it mutates input data

Coordinator

  • run & ping workers

  • ask workers to fuzz next input

  • write to seed corpus if crash

  • write to corpus cache if new edge

Coordinator <-> Worker

  • RPC

    • request <-> response

    • command: pipe

    • input data: shared memory (shm)

Worker (multiple)

  • mutate input

  • run fuzz function

  • collect coverage

  • return crash or new edge; otherwise continue
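
The division of labour can be sketched with goroutines and channels standing in for the worker processes, the command pipe and the shared memory. This is an illustration only: the real implementation uses separate processes, and the sketch leaves out mutation and coverage. All names (result, runOne, worker, Coordinate) are hypothetical.

```go
package main

import "fmt"

// result is what a worker sends back to the coordinator for one input.
type result struct {
	input   []byte
	crashed bool
}

// runOne executes the fuzz function and converts a panic into a crash report.
func runOne(fuzzFn func([]byte), input []byte) (res result) {
	res.input = input
	defer func() {
		if recover() != nil {
			res.crashed = true
		}
	}()
	fuzzFn(input)
	return res
}

// worker receives inputs from the coordinator until the channel is closed.
func worker(fuzzFn func([]byte), inputs <-chan []byte, results chan<- result) {
	for in := range inputs {
		results <- runOne(fuzzFn, in)
	}
}

// Coordinate starts n workers, hands them the inputs, and returns the
// inputs that crashed.
func Coordinate(n int, fuzzFn func([]byte), inputs [][]byte) [][]byte {
	in := make(chan []byte)
	out := make(chan result)
	for i := 0; i < n; i++ {
		go worker(fuzzFn, in, out)
	}
	go func() {
		for _, input := range inputs {
			in <- input
		}
		close(in)
	}()
	var crashers [][]byte
	for range inputs {
		if r := <-out; r.crashed {
			crashers = append(crashers, r.input)
		}
	}
	return crashers
}

func main() {
	fuzzFn := func(b []byte) { _ = 10 / len(b) } // panics on empty input
	crashers := Coordinate(2, fuzzFn, [][]byte{{1}, {}, {2, 3}})
	fmt.Println(len(crashers))
}
```

A goroutine can only recover from panics, whereas a separate worker process also isolates the coordinator from failures that recover cannot catch.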

Compiler instrumentation

// edge inserts coverage instrumentation for libfuzzer.
func (o *orderState) edge() {
	// Create a new uint8 counter to be allocated in section
	// __libfuzzer_extra_counters.
	counter := staticinit.StaticName(types.Types[types.TUINT8])
	counter.SetLibfuzzerExtraCounter(true)

	// counter += 1
	incr := ir.NewAssignOpStmt(base.Pos, ir.OADD, counter, ir.NewInt(1))
	o.append(incr)
}

edge() inserts coverage instrumentation

Compiler instrumentation

func (o *orderState) stmt(n ir.Node) {
    switch n.Op() {
    ...
    case ir.OFOR:
        o.edge()
    case ir.OIF:
        o.edge()
    case ir.ORANGE:
        o.edge()
    case ir.OSELECT:
        o.edge()
    case ir.OSWITCH:
        o.edge()
    ...
    }
}

the compiler inserts an edge() call at each control-flow edge

Compiler instrumentation

// _counters and _ecounters mark the start and end, respectively, of where
// the 8-bit coverage counters reside in memory. They're known to cmd/link,
// which specially assigns their addresses for this purpose.
var _counters, _ecounters [0]byte

func coverage() []byte {
	addr := unsafe.Pointer(&_counters)
	size := uintptr(unsafe.Pointer(&_ecounters)) - uintptr(addr)

	var res []byte
	*(*unsafeheader.Slice)(unsafe.Pointer(&res)) = unsafeheader.Slice{
		Data: addr,
		Len:  int(size),
		Cap:  int(size),
	}
	return res
}

coverage() returns the coverage counters
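
What the counters are used for can be shown with hand-written instrumentation. In this sketch a fixed array plays the role of the linker-allocated counter region and each branch increments its own slot, mimicking the inserted "counter += 1". The names counters, classify and newEdges are hypothetical, and the comparison against a "seen" set is an assumption about how new coverage is detected, not the actual internal code.

```go
package main

import "fmt"

// counters plays the role of the linker-allocated 8-bit counter region.
// Index i stands for one instrumented edge (hypothetical layout).
var counters [4]uint8

// classify is a toy function with hand-written instrumentation: each
// branch increments its own counter, as the inserted "counter += 1" would.
func classify(n int) string {
	counters[0]++
	if n < 0 {
		counters[1]++
		return "negative"
	}
	if n == 0 {
		counters[2]++
		return "zero"
	}
	counters[3]++
	return "positive"
}

// newEdges runs fn with cleared counters and reports how many edges were
// hit that are not yet recorded in seen; it then records them.
func newEdges(seen *[4]bool, fn func()) int {
	counters = [4]uint8{}
	fn()
	n := 0
	for i, c := range counters {
		if c > 0 && !seen[i] {
			seen[i] = true
			n++
		}
	}
	return n
}

func main() {
	var seen [4]bool
	fmt.Println(newEdges(&seen, func() { classify(5) }))
	fmt.Println(newEdges(&seen, func() { classify(7) }))
	fmt.Println(newEdges(&seen, func() { classify(-1) }))
}
```

An input is worth keeping only when newEdges returns a value above zero, which is the "new edge" condition that decides what goes into the corpus.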

The mutators

var byteSliceMutators = []byteSliceMutator{
	byteSliceRemoveBytes,
	byteSliceInsertRandomBytes,
	byteSliceDuplicateBytes,
	byteSliceOverwriteBytes,
	byteSliceBitFlip,
	byteSliceXORByte,
	byteSliceSwapByte,
	byteSliceOverwriteInterestingUint8,
	byteSliceOverwriteInterestingUint16,
	byteSliceOverwriteInterestingUint32,
	byteSliceInsertConstantBytes,
	byteSliceOverwriteConstantBytes,
	byteSliceShuffleBytes,
	byteSliceSwapBytes,
	....
}

func (m *mutator) mutateBytes(ptrB *[]byte)

func (m *mutator) mutateInt(v, maxValue int64) int64

func (m *mutator) mutateUInt(v, maxValue uint64) uint64

func (m *mutator) mutateFloat(v, maxValue float64) float64
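
Two of the listed byte-slice mutators can be sketched as follows. The bodies are simplified illustrations written for this example, not the standard library's code; the mutator type here is a hypothetical stand-in that only wraps a seeded random source.

```go
package main

import (
	"fmt"
	"math/rand"
)

// mutator wraps a seeded random source so runs are reproducible.
type mutator struct{ r *rand.Rand }

func newMutator(seed int64) *mutator {
	return &mutator{r: rand.New(rand.NewSource(seed))}
}

// bitFlip flips one randomly chosen bit of b in place.
func (m *mutator) bitFlip(b []byte) []byte {
	if len(b) == 0 {
		return b
	}
	b[m.r.Intn(len(b))] ^= 1 << uint(m.r.Intn(8))
	return b
}

// removeBytes returns b with a random non-empty run of bytes deleted.
func (m *mutator) removeBytes(b []byte) []byte {
	if len(b) == 0 {
		return b
	}
	start := m.r.Intn(len(b))
	n := 1 + m.r.Intn(len(b)-start)
	return append(b[:start], b[start+n:]...)
}

// mutate applies one randomly selected mutator to a copy of in.
func (m *mutator) mutate(in []byte) []byte {
	b := append([]byte(nil), in...)
	mutators := []func([]byte) []byte{m.bitFlip, m.removeBytes}
	return mutators[m.r.Intn(len(mutators))](b)
}

func main() {
	m := newMutator(1)
	in := []byte("fuzzing")
	for i := 0; i < 3; i++ {
		fmt.Printf("%q\n", m.mutate(in))
	}
}
```

Each mutator makes a small, cheap change, which fits the earlier point about running more mutations quickly rather than fewer mutations intelligently.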
  • fuzzing test

  • the benefit of fuzzing

  • go-fuzz project

  • Go's official fuzzing solution

  • continuous fuzzing ????

Fuzzing test in Go

By Ting-Li Chou

My talk in COSCUP 2021
