Fuzzing test in Go
David Chou @ Golang Taipei


CC-BY-SA-3.0-TW
@ Umbo Computer Vision
@ Golang Taipei Co-organizer 🙋♂️
Software engineer, DevOps, and Gopher 👨💻

david74.chou @ gmail
david74.chou @ facebook
david74.chou @ medium
david7482 @ github


What is a fuzzing test?
wiki: an automated software testing technique that provides invalid, unexpected, or random data as inputs to a computer program.

A brief history of fuzzing
-
1950s:
We didn't call it fuzzing back in the 1950s, but it was our standard practice to test programs by inputting decks of punch cards taken from the trash. This type of testing was so common that it had no name. - Gerald M. Weinberg
-
1988: the term "fuzzing" is coined by Barton Miller


Fuzzing is the process of sending intentionally invalid data to a product in the hopes of triggering an error.
- H.D. Moore
Fuzzing test
-
Continuously manipulate inputs
-
Semi-random data from various mutations
-
Discover new code coverage based on instrumentation
-
Run more mutations quickly, rather than fewer mutations intelligently

What can be fuzzed?
-
deserialization (xml, json, proto, gob)
-
network protocols (HTTP, SMTP)
-
media codecs (audio, video, images, pdf)
-
crypto (boringssl, openssl)
-
compression (zip, gzip, bzip2, brotli)
-
etc
Why do we need fuzzing?
you don't know what you don't know


Why do we need fuzzing?
A simple example
func CountAverage(num []byte) int {
	sum := byte(0)
	for _, v := range num {
		sum += v
	}
	return int(sum) / len(num)
}

func TestCountAverage(t *testing.T) {
	tests := []struct {
		name string
		num  []byte
		want int
	}{
		{
			num:  []byte{1, 2, 3, 4, 5},
			want: 3,
		},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := CountAverage(tt.num)
			assert.EqualValues(t, tt.want, got)
		})
	}
}

$ go test -run TestCountAverage -cover
PASS
coverage: 100.0% of statements

A real-world example:
OpenSSL Heartbleed


Heartbleed fuzzing
150255 REDUCE cov: 485 ft: 756 corp: 38/15713b exec/s: 25042
rss: 402Mb L: 2891/2891 MS: 1 EraseBytes-
=================================================================
==6098==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x629000009748 at pc 0x0000005133a2 bp 0x7fffe29233c0 sp 0x7fffe2922b70
READ of size 48830 at 0x629000009748 thread T0
#0 0x5133a1 in __asan_memcpy (/app/handshake-fuzzer+0x5133a1)
#1 0x5630c8 in tls1_process_heartbeat /app/openssl-1.0.1f/ssl/t1_lib.c:2586:3
#2 0x5cfa9d in ssl3_read_bytes /app/openssl-1.0.1f/ssl/s3_pkt.c:1092:4
#3 0x5d42da in ssl3_get_message /app/openssl-1.0.1f/ssl/s3_both.c:457:7
#4 0x59f537 in ssl3_get_client_hello /app/openssl-1.0.1f/ssl/s3_srvr.c:941:4
#5 0x59b5a9 in ssl3_accept /app/openssl-1.0.1f/ssl/s3_srvr.c:357:9
#6 0x551335 in LLVMFuzzerTestOneInput /app/handshake-fuzzer.cc:66:3
...
SUMMARY: AddressSanitizer: heap-buffer-overflow (/app/handshake-fuzzer+0x5133a1) in __asan_memcpy

Also works for logical bugs
-
Sanity check still works
-
the result must be within [0, 1) range
-
image decoder: 100 byte input -> 100 MB output?
-
encryption: decrypting with the wrong key must fail
-
sorting: each element exists and the order is expected
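A sanity check like the sorting one can be written as a plain property function that a fuzz target calls for every generated input. The sketch below is only an illustration: sortBytes and checkSorted are hypothetical names, and sortBytes simply wraps the standard library so there is something to check.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// sortBytes is the (hypothetical) function under test. Here it just wraps
// the standard library and returns a sorted copy of the input.
func sortBytes(in []byte) []byte {
	out := append([]byte(nil), in...)
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

// checkSorted verifies the two properties from the slide: every element of
// the input still exists in the output, and the output is in ascending order.
func checkSorted(in, out []byte) error {
	if len(in) != len(out) {
		return errors.New("length changed")
	}
	var counts [256]int
	for _, b := range in {
		counts[b]++
	}
	for _, b := range out {
		counts[b]--
	}
	for _, c := range counts {
		if c != 0 {
			return errors.New("elements differ between input and output")
		}
	}
	for i := 1; i < len(out); i++ {
		if out[i-1] > out[i] {
			return errors.New("output is not in ascending order")
		}
	}
	return nil
}

func main() {
	in := []byte{3, 1, 2, 1}
	fmt.Println(checkSorted(in, sortBytes(in)))
}
```

Inside a fuzz function, the body would call checkSorted on the generated input and fail the test when it returns an error, so the fuzzer reports logical bugs and not only panics.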
Also works for logical bugs
-
Round-trip test
-
deserialize -> serialize -> deserialize
-
decompress/compress, decrypt/encrypt
-
Check
-
serialize does not fail
-
2nd deserialize does not fail
-
deserialize results are equal
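The round-trip checks above can be sketched with encoding/json standing in for the serializer. This is an assumption for illustration only (the slide does not name a format), and roundTrip is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// roundTrip runs deserialize -> serialize -> deserialize on data and
// returns an error if one of the three checks from the slide fails.
func roundTrip(data []byte) error {
	var first interface{}
	if err := json.Unmarshal(data, &first); err != nil {
		// Invalid input is expected while fuzzing; only valid documents
		// are subject to the round-trip property.
		return nil
	}
	encoded, err := json.Marshal(first)
	if err != nil {
		return fmt.Errorf("serialize failed: %w", err)
	}
	var second interface{}
	if err := json.Unmarshal(encoded, &second); err != nil {
		return fmt.Errorf("2nd deserialize failed: %w", err)
	}
	if !reflect.DeepEqual(first, second) {
		return fmt.Errorf("results differ: %v vs %v", first, second)
	}
	return nil
}

func main() {
	fmt.Println(roundTrip([]byte(`{"a":[1,2.5,"x",null,true]}`)))
}
```

A fuzz function would pass each generated input to roundTrip and fail the test on a non-nil error.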
Fuzzing test in Go
go-fuzz to the rescue

go-fuzz
-
Dmitry Vyukov, Google
-
A successful 3rd-party Go fuzzing solution
-
It has found 200+ bugs in the Go standard library, and thousands more elsewhere
-
Coverage-based fuzzing
Instrument program for code coverage
Collect initial corpus of inputs
for {
	Randomly mutate an input from the corpus
	Execute and collect coverage
	If the input gives new coverage, add it to corpus
}

1. Write fuzz function
// +build gofuzz
func Fuzz(data []byte) int {
	gob.NewDecoder(bytes.NewReader(data)).Decode(new(interface{}))
	return 0
}

2. Build
go get github.com/dvyukov/go-fuzz/...
go-fuzz-build github.com/dvyukov/go-fuzz-corpus/gob

3. Run
go-fuzz -bin gob-fuzz.zip -workdir ./workdir
workers: 8, corpus: 1525 (6s ago), crashers: 6, execs: 0 (0/sec), cover: 1651, uptime: 6s
workers: 8, corpus: 1525 (9s ago), crashers: 6, execs: 16787 (1860/sec), cover: 1651, uptime: 9s
workers: 8, corpus: 1525 (12s ago), crashers: 6, execs: 29840 (2482/sec), cover: 1651, uptime: 12s

go-fuzz's problems
-
It can break (and has, multiple times) when Go's internal packages change.
-
It tries to do coverage instrumentation without the compiler's help.
-
More difficult to use compared to Go's unit testing
-
custom command-line tools
separate test files or build tags, etc.
Go's official fuzzing proposal
go test -fuzz

-
Official proposal [link]
-
Write fuzz function just like test function
-
func FuzzFoo(f *testing.F)
-
Integrate with go command
-
go test -fuzz
-
Coverage-based fuzzing
-
Planned to land in Go 1.18
Already in beta

func FuzzCountAverage(f *testing.F) {
	f.Add([]byte{1})
	f.Fuzz(func(t *testing.T, num []byte) {
		CountAverage(num)
	})
}
The fuzz target is a FuzzX function
Each fuzz target has its own corpus of inputs
-
testing.F
f.Add(): add seed corpus
f.Fuzz(): run the fuzz function
$ gotip test -fuzz=FuzzCountAverage -parallel=2
fuzzing, elapsed: 3.0s, execs: 40648 (13549/sec), workers: 2, interesting: 3
fuzzing, elapsed: 3.4s, execs: 44291 (13157/sec), workers: 2, interesting: 3
found a crash, minimizing...
--- FAIL: FuzzCountAverage (3.37s)
panic: runtime error: integer divide by zero
goroutine 21364 [running]:
runtime/debug.Stack()
/home/david74/sdk/gotip/src/runtime/debug/stack.go:24 +0x90
testing.tRunner.func1.2({0x69e4c0, 0x887760})
/home/david74/sdk/gotip/src/testing/testing.go:1281 +0x267
testing.tRunner.func1()
/home/david74/sdk/gotip/src/testing/testing.go:1288 +0x218
panic({0x69e4c0, 0x887760})
/home/david74/sdk/gotip/src/runtime/panic.go:1038 +0x215
github.com/david7482/go-fuzzing-playground.CountAverage({0xc000246000, 0x0, 0x0})
/home/david74/projects/go-fuzzing-playground/count_average.go:8 +0xa5
...
--- FAIL: FuzzCountAverage (0.00s)
Crash written to
testdata/corpus/FuzzCountAverage/d40a98862ed393eb712e47a91bcef18e6f24cf368bb4bd248c7a7101ef8e178d
To re-run:
go test github.com/david7482/go-fuzzing-playground \
-run=FuzzCountAverage/d40a98862ed393eb712e47a91bcef18e6f24cf368bb4bd248c7a7101ef8e178d

func FuzzUnmarshal(f *testing.F) {
	f.Add([]byte{1})
	f.Fuzz(func(t *testing.T, input []byte) {
		var v interface{}
		_ = yaml.Unmarshal(input, &v)
	})
}
go-yaml/yaml
$ gotip test -fuzz=FuzzUnmarshal
fuzzing, elapsed: 3.0s, execs: 62242 (20740/sec), workers: 4, interesting: 41
fuzzing, elapsed: 6.0s, execs: 127025 (21168/sec), workers: 4, interesting: 48
...
fuzzing, elapsed: 1794.0s, execs: 39365685 (21943/sec), workers: 4, interesting: 324
fuzzing, elapsed: 1796.9s, execs: 39427737 (21942/sec), workers: 4, interesting: 324
found a crash, minimizing...
--- FAIL: FuzzUnmarshal (1796.90s)
panic: runtime error: invalid memory address or nil pointer dereference
goroutine 9884315 [running]:
panic({0x72d820, 0x93abe0})
/home/ubuntu/sdk/gotip/src/runtime/panic.go:1038 +0x215
gopkg.in/yaml%2ev3.handleErr(0xc00007f6b0)
/home/ubuntu/go/pkg/mod/gopkg.in/yaml.v3@v3.0.0-20210107192922-496545a6307b/yaml.go:294 +0xc5
panic({0x72d820, 0x93abe0})
/home/ubuntu/sdk/gotip/src/runtime/panic.go:1038 +0x215
gopkg.in/yaml%2ev3.yaml_parser_split_stem_comment(0xc00bf34c00, 0x1)
/home/ubuntu/go/pkg/mod/gopkg.in/yaml.v3@v3.0.0-20210107192922-496545a6307b/parserc.go:789 +0x6a
gopkg.in/yaml%2ev3.yaml_parser_parse_block_sequence_entry(0xc00bf34c00, 0xc00bf34eb0, 0x0)
/home/ubuntu/go/pkg/mod/gopkg.in/yaml.v3@v3.0.0-20210107192922-496545a6307b/parserc.go:703 +0x293
gopkg.in/yaml%2ev3.yaml_parser_state_machine(0xc00bf34c00, 0x40df54)
...
--- FAIL: FuzzUnmarshal (0.00s)
Crash written to
testdata/corpus/FuzzUnmarshal/9c9e78ca4b2c797536d2fbe662c68321c5c3ab6df680664b23c913799fc7f092
To re-run:
go test gopkg.in/yaml.v2 \
-run=FuzzUnmarshal/9c9e78ca4b2c797536d2fbe662c68321c5c3ab6df680664b23c913799fc7f092
package main

import (
	"fmt"

	"gopkg.in/yaml.v3"
)

func main() {
	in := "#\n-[["
	var n yaml.Node
	if err := yaml.Unmarshal([]byte(in), &n); err != nil {
		fmt.Println(err)
	}
}

-
It does fuzzing with multiple processes
-
Seed corpus folder: ${pkg}/testdata/corpus
-
Seed corpus = seeds in files + seeds in test
-
A good seed corpus can save the mutation engine a lot of work
-
Regression test
-
go test (without -fuzz) also runs the FuzzX functions with the seed corpus as input
-
Current limitations
-
Only []byte and primitive types are supported
No support for struct, slice, or array types
-
Cannot run multiple fuzzers in the same pkg
-
Cannot keep running after a crash is found
-
Cannot convert existing files to the corpus format
go test fuzz v1
float(45.241)
int(12345)
[]byte("ABC\xa8\x8c\xb3G\xfc")

How "go test -fuzz" works
show me the code

Instrument program for code coverage
Collect initial corpus of inputs
for {
	Randomly mutate an input from the corpus
	Execute and collect coverage
	If the input gives new coverage, add it to corpus
}
-
The architecture of "go test -fuzz"
-
How it collects code coverage
-
How it mutates input data
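Before going into the three parts, the coverage-guided loop shown above can be made concrete with a toy fuzzer. This is a minimal sketch under stated assumptions, not how "go test -fuzz" is implemented: target, mutate, and fuzz are hypothetical names, coverage is reported by hand instead of by compiler instrumentation, and the only mutation is overwriting one random byte.

```go
package main

import (
	"fmt"
	"math/rand"
)

// numEdges is the number of hand-instrumented branches in the toy target.
const numEdges = 4

// target is a hypothetical function under test. It returns the edges the
// input reached (standing in for compiler-inserted counters) and whether
// the input hit the "bug" hidden behind the prefix "FUZ".
func target(data []byte) (cov [numEdges]bool, crashed bool) {
	cov[0] = true
	if len(data) > 0 && data[0] == 'F' {
		cov[1] = true
		if len(data) > 1 && data[1] == 'U' {
			cov[2] = true
			if len(data) > 2 && data[2] == 'Z' {
				cov[3] = true
				crashed = true
			}
		}
	}
	return cov, crashed
}

// mutate returns a copy of in with one randomly chosen byte overwritten.
func mutate(r *rand.Rand, in []byte) []byte {
	if len(in) == 0 {
		return []byte{byte(r.Intn(256))}
	}
	out := append([]byte(nil), in...)
	out[r.Intn(len(out))] = byte(r.Intn(256))
	return out
}

// fuzz mutates corpus entries, keeps the ones that reach a new edge, and
// stops at the first crashing input or after the given number of iterations.
func fuzz(seed []byte, iterations int) (corpus [][]byte, crasher []byte) {
	r := rand.New(rand.NewSource(1)) // fixed seed keeps the demo repeatable
	corpus = [][]byte{seed}
	seen, _ := target(seed) // coverage of the initial corpus
	for i := 0; i < iterations; i++ {
		in := mutate(r, corpus[r.Intn(len(corpus))])
		cov, crashed := target(in)
		if crashed {
			return corpus, in
		}
		isNew := false
		for e, hit := range cov {
			if hit && !seen[e] {
				seen[e] = true
				isNew = true
			}
		}
		if isNew {
			corpus = append(corpus, in)
		}
	}
	return corpus, nil
}

func main() {
	corpus, crasher := fuzz([]byte("aaaa"), 200000)
	fmt.Printf("corpus size: %d, crasher: %q\n", len(corpus), crasher)
}
```

The point of keeping inputs that add coverage is visible here: guessing the three-byte prefix at once is unlikely, but each correct byte is kept in the corpus and becomes the starting point for the next mutation.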
Coordinator
Worker
Worker
-
run & ping workers
-
ask workers to fuzz next input
-
write to seed corpus if crash
-
write to corpus cache if new edge
-
RPC
-
request <-> response
-
command: pipe
-
input data: shm
-
mutate input
-
run fuzz function
-
collect coverage
-
return crash or new edge; otherwise continue
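The division of labour between coordinator and workers can be sketched in a single process. This is only an analogy under loud assumptions: goroutines and channels stand in for the worker processes, the pipe, and the shared memory, and every name here (request, response, fuzzFunc, worker, coordinate) is hypothetical and not taken from the Go source.

```go
package main

import "fmt"

// request and response model the "request <-> response" RPC from the slide.
type request struct{ input []byte }

type response struct {
	input   []byte // the mutated input the worker actually ran
	newEdge bool
	crashed bool
}

// fuzzFunc is a hypothetical stand-in for "run fuzz function and collect
// coverage": it reports whether the input hit a new edge or crashed.
type fuzzFunc func(input []byte) (newEdge, crashed bool)

// worker mutates each input it receives, runs the fuzz function, and
// reports the outcome back to the coordinator.
func worker(mutate func([]byte) []byte, run fuzzFunc, reqs <-chan request, resps chan<- response) {
	for req := range reqs {
		in := mutate(req.input)
		newEdge, crashed := run(in)
		resps <- response{input: in, newEdge: newEdge, crashed: crashed}
	}
}

// coordinate starts the workers, hands out the inputs, and sorts the
// results into a corpus cache (new edge) and a list of crashers.
func coordinate(inputs [][]byte, workers int, mutate func([]byte) []byte, run fuzzFunc) (cache, crashers [][]byte) {
	reqs := make(chan request)
	resps := make(chan response)
	for i := 0; i < workers; i++ {
		go worker(mutate, run, reqs, resps)
	}
	go func() {
		for _, in := range inputs {
			reqs <- request{input: in}
		}
		close(reqs) // lets the workers exit once all inputs are handed out
	}()
	for range inputs {
		r := <-resps
		if r.crashed {
			crashers = append(crashers, r.input)
		} else if r.newEdge {
			cache = append(cache, r.input)
		}
	}
	return cache, crashers
}

func main() {
	mutate := func(in []byte) []byte { return append(append([]byte(nil), in...), '!') }
	run := func(in []byte) (bool, bool) { return len(in) > 3, string(in) == "boom!" }
	inputs := [][]byte{[]byte("a"), []byte("abc"), []byte("boom")}
	cache, crashers := coordinate(inputs, 2, mutate, run)
	fmt.Println(len(cache), len(crashers))
}
```

Separate worker processes matter in the real design because a crashing fuzz function takes its process down with it; the goroutines in this sketch do not provide that isolation.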
Compiler instrumentation
// edge inserts coverage instrumentation for libfuzzer.
func (o *orderState) edge() {
	// Create a new uint8 counter to be allocated in section
	// __libfuzzer_extra_counters.
	counter := staticinit.StaticName(types.Types[types.TUINT8])
	counter.SetLibfuzzerExtraCounter(true)
	// counter += 1
	incr := ir.NewAssignOpStmt(base.Pos, ir.OADD, counter, ir.NewInt(1))
	o.append(incr)
}
edge() inserts coverage instrumentation
Compiler instrumentation
func (o *orderState) stmt(n ir.Node) {
	switch n.Op() {
	...
	case ir.OFOR:
		o.edge()
	case ir.OIF:
		o.edge()
	case ir.ORANGE:
		o.edge()
	case ir.OSELECT:
		o.edge()
	case ir.OSWITCH:
		o.edge()
	...
	}
}
The compiler inserts an edge() call at each control-flow edge
Compiler instrumentation
// _counters and _ecounters mark the start and end, respectively, of where
// the 8-bit coverage counters reside in memory. They're known to cmd/link,
// which specially assigns their addresses for this purpose.
var _counters, _ecounters [0]byte

func coverage() []byte {
	addr := unsafe.Pointer(&_counters)
	size := uintptr(unsafe.Pointer(&_ecounters)) - uintptr(addr)
	var res []byte
	*(*unsafeheader.Slice)(unsafe.Pointer(&res)) = unsafeheader.Slice{
		Data: addr,
		Len:  int(size),
		Cap:  int(size),
	}
	return res
}
coverage() returns the coverage counters
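To see what the 8-bit counters buy the fuzzer, the sketch below instruments a small function by hand and compares counter snapshots between runs. It is an illustration under assumptions: counters is an ordinary array instead of a linker-assigned region, and classify, snapshot, and hasNewEdge are hypothetical names.

```go
package main

import "fmt"

// counters stands in for the linker-assigned region of 8-bit coverage
// counters; in this sketch it is an ordinary array.
var counters [3]uint8

// classify is instrumented by hand: each edge increments its own counter,
// which is roughly what the compiler-inserted "counter += 1" does.
func classify(n int) string {
	counters[0]++ // function entry
	if n < 0 {
		counters[1]++ // branch taken
		return "negative"
	}
	counters[2]++ // branch not taken
	return "non-negative"
}

// snapshot returns a copy of the counters and resets them for the next run.
func snapshot() []uint8 {
	snap := make([]uint8, len(counters))
	copy(snap, counters[:])
	counters = [3]uint8{}
	return snap
}

// hasNewEdge merges snap into seen and reports whether any edge was hit
// for the first time.
func hasNewEdge(seen []bool, snap []uint8) bool {
	found := false
	for i, c := range snap {
		if c > 0 && !seen[i] {
			seen[i] = true
			found = true
		}
	}
	return found
}

func main() {
	seen := make([]bool, len(counters))
	for _, n := range []int{1, 2, -1} {
		classify(n)
		fmt.Printf("input %d: new edge = %v\n", n, hasNewEdge(seen, snapshot()))
	}
}
```

An input that reports a new edge is the kind the coordinator would add to the corpus; the second input above exercises nothing new and would be discarded.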
The mutators
var byteSliceMutators = []byteSliceMutator{
	byteSliceRemoveBytes,
	byteSliceInsertRandomBytes,
	byteSliceDuplicateBytes,
	byteSliceOverwriteBytes,
	byteSliceBitFlip,
	byteSliceXORByte,
	byteSliceSwapByte,
	byteSliceOverwriteInterestingUint8,
	byteSliceOverwriteInterestingUint16,
	byteSliceOverwriteInterestingUint32,
	byteSliceInsertConstantBytes,
	byteSliceOverwriteConstantBytes,
	byteSliceShuffleBytes,
	byteSliceSwapBytes,
	....
}
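To make a few of these mutators concrete, here is a minimal sketch of three of them and a dispatcher that picks one at random. It is not the code from Go's internal fuzzing package: the names (newMutator, bitFlip, removeBytes, insertRandomBytes) and the signatures are simplified assumptions, and each function returns a new slice instead of mutating in place.

```go
package main

import (
	"fmt"
	"math/rand"
)

// mutator carries the random source, loosely mirroring the receiver type
// seen in the signatures on this slide.
type mutator struct{ r *rand.Rand }

// newMutator is a hypothetical constructor with a caller-chosen seed.
func newMutator(seed int64) *mutator {
	return &mutator{r: rand.New(rand.NewSource(seed))}
}

// bitFlip flips one random bit of one random byte.
func bitFlip(m *mutator, b []byte) []byte {
	if len(b) == 0 {
		return b
	}
	out := append([]byte(nil), b...)
	out[m.r.Intn(len(out))] ^= 1 << uint(m.r.Intn(8))
	return out
}

// removeBytes deletes a random run of at least one byte.
func removeBytes(m *mutator, b []byte) []byte {
	if len(b) <= 1 {
		return b
	}
	start := m.r.Intn(len(b))
	n := 1 + m.r.Intn(len(b)-start)
	out := append([]byte(nil), b[:start]...)
	return append(out, b[start+n:]...)
}

// insertRandomBytes inserts one to four random bytes at a random position.
func insertRandomBytes(m *mutator, b []byte) []byte {
	pos := m.r.Intn(len(b) + 1)
	ins := make([]byte, 1+m.r.Intn(4))
	for i := range ins {
		ins[i] = byte(m.r.Intn(256))
	}
	out := append([]byte(nil), b[:pos]...)
	out = append(out, ins...)
	return append(out, b[pos:]...)
}

// sketchMutators is the table the dispatcher chooses from.
var sketchMutators = []func(*mutator, []byte) []byte{
	bitFlip, removeBytes, insertRandomBytes,
}

// mutateBytes applies one randomly chosen mutator and returns the result.
func (m *mutator) mutateBytes(b []byte) []byte {
	return sketchMutators[m.r.Intn(len(sketchMutators))](m, b)
}

func main() {
	m := newMutator(1)
	in := []byte("hello")
	for i := 0; i < 3; i++ {
		fmt.Printf("%q\n", m.mutateBytes(in))
	}
}
```

Each mutator is deliberately cheap and unintelligent, which matches the earlier point about running more mutations quickly rather than fewer mutations intelligently.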
func (m *mutator) mutateBytes(ptrB *[]byte)
func (m *mutator) mutateInt(v, maxValue int64) int64
func (m *mutator) mutateUInt(v, maxValue uint64) uint64
func (m *mutator) mutateFloat(v, maxValue float64) float64

-
fuzzing test
-
the benefit of fuzzing
-
go-fuzz project
-
go official fuzzing solution
-
continuous fuzzing ????

Fuzzing test in Go
By Ting-Li Chou
My talk in COSCUP 2021