Garbage Collection in Go

Jalex Chang

2018.12.18

Jalex Chang

- Backend Engineer @ Umbo Computer Vision

- Taiwan Data Engineering Association Member

- Golang Taiwan Member

Contact:

- jalex.cpc @ Gmail

- jalex.chang @ Facebook

- JalexChang @ GitHub

Agenda

- Introduction to garbage collection (GC)

- Basic GC strategies

- Fundamental knowledge of Go

- Go GC - Concurrent Mark & Sweep Collector

- Discussions and future works

- Conclusion

References

[1] Go 1.11, https://github.com/golang/go/tree/dev.boringcrypto.go1.11

[2] R. L. Hudson, Getting to Go: The Journey of Go's Garbage Collector, https://blog.golang.org/ismmkeynote/

[3] H. Okada, Go GC, https://engineering.linecorp.com/en/blog/go-gc/

[4] G. Bikshandi, Garbage Collection, http://polaris.cs.uiuc.edu/~padua/cs426/cs426-15.pdf

[5] S. Ghemawat and P. Menage, TCMalloc: Thread-Caching Malloc, http://goog-perftools.sourceforge.net/doc/tcmalloc.html

What is Garbage Collection (GC)? 

In computer science, garbage collection (GC) is a form of automatic memory management. The garbage collector attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program. - Wikipedia

Why GC?

  • Manual deallocation is tedious and error-prone
    • Memory leak
    • Dangling pointer dereference
  • Other advantages
    • Memory compaction
    • Improving locality (temporal and spatial)

GC's optimization goals

  • Minimize overall execution time
  • Optimize space usage - no fragmentation
  • Minimize pause time (stop-the-world, STW) during GC
  • Improve locality for the mutator

Basic GC Strategies

Reference Counting

Pros

- Incremental GC without STW

- Reclaim immediately

- Easy to implement

Cons

- Hard to handle reference cycles

- Inefficient to maintain (counts must be updated on every reference change)

- Cache-unfriendly (high miss rate)

- No memory compaction 
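The trade-offs above can be seen in a minimal sketch of reference counting. The `object` type and the `retain`, `release`, and `link` helpers are hypothetical names made up for this illustration; Go itself does not manage memory this way. The sketch shows immediate reclamation in the acyclic case and the leak that a reference cycle causes.

```go
package main

import "fmt"

// object is a hypothetical reference-counted heap object.
type object struct {
	name  string
	count int       // number of references pointing at this object
	refs  []*object // objects this one refers to
	freed bool      // set when the object has been reclaimed
}

// retain records one more reference to o.
func retain(o *object) { o.count++ }

// release drops one reference and reclaims o as soon as its count hits zero,
// which in turn releases everything o referred to.
func release(o *object) {
	o.count--
	if o.count == 0 {
		o.freed = true
		for _, child := range o.refs {
			release(child)
		}
		o.refs = nil
	}
}

// link makes from refer to to.
func link(from, to *object) {
	retain(to)
	from.refs = append(from.refs, to)
}

func main() {
	// Acyclic case: dropping the root reference reclaims both objects at once.
	a := &object{name: "a"}
	b := &object{name: "b"}
	retain(a) // reference held by a root (e.g. a local variable)
	link(a, b)
	release(a)
	fmt.Println(a.freed, b.freed) // true true

	// Cyclic case: x and y keep each other alive after the root is dropped.
	x := &object{name: "x"}
	y := &object{name: "y"}
	retain(x)
	link(x, y)
	link(y, x)
	release(x)
	fmt.Println(x.freed, y.freed) // false false
}
```

Every `link` and `release` touches a counter, which is the maintenance cost listed in the cons.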

Mark and Sweep

Two-phase tracing method: mark the reachable objects, then sweep the rest

Pros

- Overcome reference cycles

Cons

- Need to STW during GC

- No memory compaction
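A minimal sketch of the two phases, assuming a toy heap represented as a slice of `node` values (the `node`, `mark`, `sweep`, and `collect` names are invented for this example). It is a stop-the-world version: nothing else may change the graph while `collect` runs.

```go
package main

import "fmt"

// node is a hypothetical heap object in a toy object graph.
type node struct {
	name   string
	refs   []*node
	marked bool
}

// mark flags every object reachable from n (phase 1).
func mark(n *node) {
	if n == nil || n.marked {
		return
	}
	n.marked = true
	for _, r := range n.refs {
		mark(r)
	}
}

// sweep walks the whole heap (phase 2): marked objects survive and get their
// mark bit cleared for the next cycle, unmarked objects are garbage.
func sweep(heap []*node) (live, garbage []*node) {
	for _, n := range heap {
		if n.marked {
			n.marked = false
			live = append(live, n)
		} else {
			garbage = append(garbage, n)
		}
	}
	return live, garbage
}

// collect runs one full cycle starting from the given roots.
func collect(roots, heap []*node) (live, garbage []*node) {
	for _, r := range roots {
		mark(r)
	}
	return sweep(heap)
}

func main() {
	a := &node{name: "a"}
	b := &node{name: "b"}
	c := &node{name: "c"}
	d := &node{name: "d"}
	a.refs = []*node{b}
	c.refs = []*node{d} // c and d form an unreachable cycle
	d.refs = []*node{c}

	live, garbage := collect([]*node{a}, []*node{a, b, c, d})
	fmt.Println("live:", len(live), "garbage:", len(garbage)) // live: 2 garbage: 2
}
```

The unreachable cycle between `c` and `d` is collected because reachability, not a counter, decides what survives.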

 

Tri-color marking

Invariant: black objects cannot refer to white ones

Pros

- Overcome reference cycles

- Concurrent garbage collector

- STW only for setup and re-scan

Cons

- Write barriers are needed

- No memory compaction

- Lower throughput than mark and sweep
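The marking loop can be sketched as follows. This is a sequential illustration with made-up names (`obj`, `markTriColor`), not the Go runtime's implementation: roots are shaded grey, each grey object is scanned and turned black, and whatever is still white when the worklist is empty is garbage.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not visited yet; garbage if still white at the end
	grey               // known reachable, references not scanned yet
	black              // reachable and fully scanned
)

// obj is a hypothetical heap object.
type obj struct {
	name  string
	color color
	refs  []*obj
}

// markTriColor shades everything reachable from roots.
func markTriColor(roots []*obj) {
	var worklist []*obj // the set of grey objects

	shade := func(o *obj) {
		if o != nil && o.color == white {
			o.color = grey
			worklist = append(worklist, o)
		}
	}

	for _, r := range roots {
		shade(r)
	}
	for len(worklist) > 0 {
		o := worklist[len(worklist)-1]
		worklist = worklist[:len(worklist)-1]
		for _, child := range o.refs {
			shade(child) // children become grey before the parent turns black
		}
		o.color = black
	}
}

func main() {
	a := &obj{name: "a"}
	b := &obj{name: "b"}
	c := &obj{name: "c"} // unreachable
	a.refs = []*obj{b}
	b.refs = []*obj{a} // a reachable cycle is handled without trouble

	markTriColor([]*obj{a})
	fmt.Println(a.color == black, b.color == black, c.color == white) // true true true
}
```

Because the grey set records exactly where marking has to resume, the loop can be interleaved with the running program, provided a write barrier keeps the invariant intact.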

 

Generational GC

Most objects die young

- Collect the young space more frequently

- Relocate surviving objects to an old space that is collected less frequently

Pros

- Reduce the cost for scanning all objects

- Memory compaction

Cons

- Write barriers are needed all the time

Before we talk about Go's GC, there are some things you should know...

Go is a value-oriented language

Go is a value-oriented language in the tradition of C-like systems languages, rather than a reference-oriented language like most managed-runtime languages.

Escape Analysis

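// people is the struct used by f1 and f2 (field types inferred from the literals below).
type people struct {
    name  string
    email string
}

// f1 builds the struct through a pointer but returns a copy of the value.
//go:noinline
func f1() people {
    p := &people{
        name:  "Jalex",
        email: "jalex.cpc@gmail.com",
    }

    println("V1", p)
    return *p
}

// f2 builds the struct as a local value but returns its address.
//go:noinline
func f2() *people {
    p := people{
        name:  "Jalex",
        email: "jalex.cpc@gmail.com",
    }

    println("V2", &p)
    return &p
}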

Where should p be allocated? Stack frame or heap?

- A mechanism that automatically decides, at compile time, whether a variable should be allocated on the heap.

- It tries to keep objects on the stack as much as possible.
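One way to check the compiler's decision is to build with `go build -gcflags=-m`, which prints escape-analysis diagnostics such as "moved to heap". The sketch below is an additional, runtime-side check and not part of the original slides: it uses `testing.AllocsPerRun` to count heap allocations per call. The function names `byValue` and `byPointer` are made up for the example, and the exact counts depend on the compiler version, so no output is claimed here.

```go
package main

import (
	"fmt"
	"testing"
)

type people struct {
	name  string
	email string
}

// byValue returns a copy, so the struct behind p does not need to outlive the call.
//
//go:noinline
func byValue() people {
	p := &people{name: "Jalex", email: "jalex.cpc@gmail.com"}
	return *p
}

// byPointer returns the address of a local, so the local must outlive the call.
//
//go:noinline
func byPointer() *people {
	p := people{name: "Jalex", email: "jalex.cpc@gmail.com"}
	return &p
}

func main() {
	v := testing.AllocsPerRun(1000, func() { _ = byValue() })
	q := testing.AllocsPerRun(1000, func() { _ = byPointer() })
	fmt.Println("heap allocations per call, by value:  ", v)
	fmt.Println("heap allocations per call, by pointer:", q)
}
```

The syntax used to create the value (`&people{...}` or `people{...}`) does not decide the placement; what matters is whether the address can outlive the function.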

Go's memory allocator

Based on TCMalloc (Thread-Caching Malloc) - Introduced by Sanjay Ghemawat and Paul Menage in 2007

Concepts:

- Each thread in Go has its own local thread cache (on the heap).

- The thread cache holds free lists in 70 size classes for small objects.

- Small objects are allocated from the thread cache.

- Larger objects (>32KB) are allocated from the page heap.
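The routing described above can be sketched as a size-class lookup. This is a simplification with assumed values: it uses power-of-two classes starting at 8 bytes instead of the runtime's finer-grained table of size classes, and `sizeClass`, `minChunk`, and `maxSmallSize` are names invented for the example.

```go
package main

import "fmt"

const (
	minChunk     = 8        // smallest chunk size in this simplified table (assumption)
	maxSmallSize = 32 << 10 // objects above 32KB go to the page heap
)

// sizeClass rounds a request up to the chunk size of its class.
// small is false when the request is too large for the thread cache.
func sizeClass(size int) (chunk int, small bool) {
	if size > maxSmallSize {
		return 0, false
	}
	chunk = minChunk
	for chunk < size {
		chunk *= 2
	}
	return chunk, true
}

func main() {
	for _, size := range []int{24, 100, 32 << 10, 40000} {
		chunk, small := sizeClass(size)
		if small {
			fmt.Printf("%d bytes -> %d-byte chunk from the thread cache (%d bytes of padding)\n",
				size, chunk, chunk-size)
		} else {
			fmt.Printf("%d bytes -> page heap\n", size)
		}
	}
}
```

The padding per object is the price of fixed classes; a finer table, like the one the slide mentions, keeps that padding smaller than plain powers of two would.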

Why TCMalloc?

- Allocating objects from the thread cache is lock-free
=> Improves memory allocation performance

 

- Deallocating objects in the thread cache is lightweight (no GC until the total size exceeds 2MB)

 

- A span is split into equal-sized chunks for one size class
=> Minimizes fragmentation

 

- A span is a run of contiguous pages => Improves locality

 

Garbage Collection in Go

When will GC happen?

GC Checking:

- Allocate an object larger than 32KB

- Call runtime.GC()

- Every 2 minutes

 

GC triggering condition:

- Current heap size > 2 x defaultHeapMinimum x GOGC / 100

defaultHeapMinimum: 4MB; GOGC: 100 by default (adjustable via the GOGC environment variable or runtime/debug.SetGCPercent)

- In short: 8MB by default
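As a worked example of how GOGC scales the trigger, the sketch below uses the rule from the `runtime` package documentation: a collection is triggered when freshly allocated data reaches GOGC percent of the live data left after the previous collection. The `heapGoal` helper is a made-up name, and taking 4MB as the live heap is an assumption chosen so that the default case matches the 8MB figure above.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// heapGoal returns the heap size at which the next cycle is expected:
// the live heap plus GOGC percent of it.
func heapGoal(liveBytes uint64, gogc int) uint64 {
	return liveBytes + liveBytes*uint64(gogc)/100
}

func main() {
	const mb = 1 << 20

	fmt.Println(heapGoal(4*mb, 100) / mb) // 8
	fmt.Println(heapGoal(4*mb, 200) / mb) // 12
	fmt.Println(heapGoal(4*mb, 50) / mb)  // 6

	// SetGCPercent changes GOGC at run time and returns the previous setting.
	old := debug.SetGCPercent(200)
	defer debug.SetGCPercent(old)
	fmt.Println("previous GOGC:", old)
}
```

Raising GOGC trades memory for fewer collections; lowering it does the opposite.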

Go's garbage collector

The GC runs concurrently with mutator threads, is type accurate (aka precise), allows multiple GC thread to run in parallel. It is a concurrent mark and sweep that uses a write barrier. It is non-generational and non-compacting. Allocation is done using size segregated per P allocation areas to minimize fragmentation while eliminating locks in the common case. - Go1.11

Go GC Algorithm Phases

Write Barrier Is On During GC

  • Used in all mark phases.
  • Ensures no reachable objects get lost during the tri-color operations.
    • Shade changed objects grey.
    • Shade newly allocated objects black.
    • This adds extra cost to memory allocation.
  • The write barrier is fast, but it isn't fast enough.
    • That is why Go does not consider generational GC.
    • Use value-oriented style, escape analysis, and TCMalloc to keep GC from happening frequently.
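The two shading rules above can be sketched with a simplified insertion-style barrier. This is an illustration under stated assumptions, not the barrier the Go runtime actually compiles in: `collector`, `heapObj`, `writePointer`, and `allocate` are hypothetical names, and the real barrier has more cases than shown here.

```go
package main

import "fmt"

type color int

const (
	white color = iota
	grey
	black
)

// heapObj is a hypothetical heap object with a single pointer field.
type heapObj struct {
	color color
	field *heapObj
}

// collector holds the state a concurrent marker would share with the mutator.
type collector struct {
	marking  bool       // true while a mark phase is running
	worklist []*heapObj // grey objects waiting to be scanned
}

// shade turns a white object grey and queues it for scanning.
func (c *collector) shade(o *heapObj) {
	if o != nil && o.color == white {
		o.color = grey
		c.worklist = append(c.worklist, o)
	}
}

// writePointer stands in for the pointer store holder.field = target.
// During marking it shades the target first, so a black holder never
// ends up pointing at a white object.
func (c *collector) writePointer(holder, target *heapObj) {
	if c.marking {
		c.shade(target)
	}
	holder.field = target
}

// allocate creates an object; during marking it starts out black.
func (c *collector) allocate() *heapObj {
	o := &heapObj{}
	if c.marking {
		o.color = black
	}
	return o
}

func main() {
	c := &collector{marking: true}
	holder := &heapObj{color: black} // already scanned by the marker
	target := &heapObj{}             // white: not seen by the marker yet

	c.writePointer(holder, target)
	fmt.Println(target.color == grey, len(c.worklist)) // true 1

	fresh := c.allocate()
	fmt.Println(fresh.color == black) // true
}
```

With `marking` set to false, `writePointer` is a plain store, which is why the barrier costs nothing between collections.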

A 99%ile isolated GC latency service level objective

GC statistics in Twitter's servers (18 GB heap):

The future of Go GC

Summary

Go's GC strategies

- Use value-oriented style to improve locality.

- Apply escape analysis to keep objects on the stack frame.

- Use TCMalloc to minimize fragmentation and to reduce GC effort.

- Use a concurrent mark and sweep collector with two short STW pauses.

- Only use the write barrier during GC to maintain overall performance.

 

Go's GC pause time is amazingly low now (~0.5 ms).

So, GC is not a problem in Go, but your programming style is.

Q&A Time

Thanks for listening