Diego Parra
@diegolparra
@dpsoft
Lucas Amoroso
@_lucasamoroso
@lucasamoroso
Automatic Memory Management
GC Building Blocks (Algorithms)
Generational Hypothesis
Generational Collectors
JVM Collectors
Intro to Concurrent Collectors
What's memory management?
Memory management is the process of controlling and coordinating the way a software application accesses computer memory
Safety First (Python, JavaScript, Ruby, Java, C#, Haskell and Go): Automatically freeing objects when all reachable pointers to them are gone
Control First (C, C++): Your program's memory consumption is entirely in your hands
Ownership (Rust): Memory is managed through a system of ownership with a set of rules that the compiler checks at compile time
GC Roots
Reachable Objects
Unreachable Objects
The simplest form of Garbage Collection
Count the number of references from live objects
Each object has a Reference Counter (RC)
An object is presumed live iff its RC > 0
An object can be reclaimed when its RC == 0
The RC is incremented/decremented when a reference is copied/deleted
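The counting rules above can be sketched as a toy model. This is only an illustration under assumptions: `RcObject`, `retain`, `release`, and `addField` are hypothetical names, not part of any JVM or library API, and "reclaiming" is modeled as setting a flag.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; every name here is illustrative.
final class RefCountDemo {

    static final class RcObject {
        final String name;
        int rc = 0;                       // the Reference Counter (RC)
        boolean reclaimed = false;        // stands in for "memory was freed"
        final List<RcObject> fields = new ArrayList<>();

        RcObject(String name) { this.name = name; }
    }

    // Copying a reference increments the target's RC.
    static RcObject retain(RcObject target) {
        target.rc++;
        return target;
    }

    // Deleting a reference decrements the RC. At RC == 0 the object is
    // reclaimed and every reference it holds is deleted in turn.
    static void release(RcObject target) {
        target.rc--;
        if (target.rc == 0) {
            target.reclaimed = true;
            for (RcObject child : target.fields) {
                release(child);
            }
            target.fields.clear();
        }
    }

    // Store a reference to child inside parent.
    static void addField(RcObject parent, RcObject child) {
        parent.fields.add(retain(child));
    }

    public static void main(String[] args) {
        RcObject a = retain(new RcObject("a"));   // one root reference to a
        RcObject b = new RcObject("b");
        addField(a, b);                           // a -> b
        release(a);                               // drop the root reference
        System.out.println(a.name + " reclaimed: " + a.reclaimed);
        System.out.println(b.name + " reclaimed: " + b.reclaimed);
    }
}
```

If `b` also held a reference back to `a`, dropping the root reference would leave both counters above zero, so neither object would be reclaimed in this model.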
Indirect Collection algorithm
Non-moving collector
Collection operates in two phases (mark, then sweep)
Stop-the-World mode
Heap tends to become Fragmented
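A minimal sketch of a two-phase, non-moving collection, assuming the heap is modeled as a fixed array of slots. `Node`, `mark`, and `sweep` are hypothetical names chosen for the illustration. Survivors keep their slot, so freed slots end up scattered between them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy mark-sweep over an array-backed "heap"; names are illustrative.
final class MarkSweepDemo {

    static final class Node {
        final String name;
        boolean marked = false;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Phase 1: mark every node reachable from the roots.
    static void mark(List<Node> roots) {
        Deque<Node> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            Node n = worklist.pop();
            if (n.marked) continue;
            n.marked = true;
            worklist.addAll(n.refs);
        }
    }

    // Phase 2: walk the whole heap, free unmarked slots, and clear the
    // mark bit of survivors. Nothing moves. Returns the number of freed slots.
    static int sweep(Node[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            Node n = heap[i];
            if (n == null) continue;
            if (n.marked) {
                n.marked = false;
            } else {
                heap[i] = null;
                freed++;
            }
        }
        return freed;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), x = new Node("x"), b = new Node("b"), y = new Node("y");
        a.refs.add(b);                 // a -> b is reachable from the root
        x.refs.add(y);                 // x -> y is unreachable
        Node[] heap = { a, x, b, y };

        mark(List.of(a));
        int freed = sweep(heap);

        System.out.println("freed slots: " + freed);
        for (int i = 0; i < heap.length; i++) {
            System.out.println(i + ": " + (heap[i] == null ? "<free>" : heap[i].name));
        }
    }
}
```

In this run the free slots sit between and after the survivors, which is the fragmentation the last bullet refers to.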
Indirect Collection algorithm
Moving collector
Collection needs multiple passes over live objects
Avoids fragmentation
May rearrange objects in the heap
Slower than mark-sweep
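The extra passes can be sketched with a sliding compactor over an array-backed heap. This is one possible scheme under assumptions, not necessarily the one any particular JVM uses: references are modeled as slot indices, and `Obj`, `forward`, and `collect` are hypothetical names.

```java
// Toy sliding mark-compact; references are heap slot indices.
final class MarkCompactDemo {

    static final class Obj {
        final String name;
        final int[] refs;      // slot indices of referenced objects
        boolean marked = false;
        int forward = -1;      // slot this object will move to
        Obj(String name, int... refs) { this.name = name; this.refs = refs; }
    }

    static void mark(Obj[] heap, int index) {
        Obj o = heap[index];
        if (o == null || o.marked) return;
        o.marked = true;
        for (int r : o.refs) mark(heap, r);
    }

    // Compacts the heap in place and returns the updated root indices.
    static int[] collect(Obj[] heap, int[] roots) {
        for (int r : roots) mark(heap, r);

        // Pass 1: assign each live object its new slot, in heap order.
        int free = 0;
        for (Obj o : heap) {
            if (o != null && o.marked) o.forward = free++;
        }

        // Pass 2: rewrite references (and roots) to the new slots.
        for (Obj o : heap) {
            if (o == null || !o.marked) continue;
            for (int i = 0; i < o.refs.length; i++) {
                o.refs[i] = heap[o.refs[i]].forward;
            }
        }
        int[] newRoots = new int[roots.length];
        for (int i = 0; i < roots.length; i++) {
            newRoots[i] = heap[roots[i]].forward;
        }

        // Pass 3: slide live objects down and drop dead ones.
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            heap[i] = null;
            if (o != null && o.marked) {
                o.marked = false;
                heap[o.forward] = o;   // forward <= i, so nothing live is overwritten
            }
        }
        return newRoots;
    }

    public static void main(String[] args) {
        // a (slot 0) references b (slot 2); x and y are unreachable.
        Obj[] heap = { new Obj("a", 2), new Obj("x", 3), new Obj("b"), new Obj("y") };
        int[] roots = collect(heap, new int[] { 0 });
        for (int i = 0; i < heap.length; i++) {
            System.out.println(i + ": " + (heap[i] == null ? "<free>" : heap[i].name));
        }
        System.out.println("root now points at slot " + roots[0]);
    }
}
```

After the collection the survivors occupy the first slots and the free space is one contiguous run, at the cost of three passes after marking.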
Memory is divided into two equal-size regions
Moving collector
Requires only a single pass over the live objects
Compacted Heap
Possible locality benefits on large heaps
Space Overhead
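A minimal sketch of a two-region copying collection in the style of Cheney's algorithm, assuming references are slot indices within a region. `Obj`, `evacuate`, and `collect` are hypothetical names. The scan loop visits only objects that were copied, which is the single pass over live objects, and the second region of equal size is the space overhead.

```java
// Toy semispace copying collector; references are slot indices.
final class CopyingDemo {

    static final class Obj {
        final String name;
        final int[] refs;      // slot indices in the space the object lives in
        int forward = -1;      // to-space slot once the object has been copied
        Obj(String name, int... refs) { this.name = name; this.refs = refs; }
    }

    // Copies from[index] into to[free] unless it was already copied.
    // Returns the (possibly advanced) free pointer.
    static int evacuate(Obj[] from, Obj[] to, int index, int free) {
        Obj o = from[index];
        if (o.forward >= 0) return free;
        o.forward = free;
        to[free] = new Obj(o.name, o.refs.clone());
        return free + 1;
    }

    // Copies everything reachable from roots into a fresh to-space of the
    // same size, updating roots in place. Returns the to-space.
    static Obj[] collect(Obj[] from, int[] roots) {
        Obj[] to = new Obj[from.length];
        int free = 0;   // next free to-space slot
        int scan = 0;   // next copied object whose references still point at from-space

        for (int i = 0; i < roots.length; i++) {
            free = evacuate(from, to, roots[i], free);
            roots[i] = from[roots[i]].forward;
        }
        while (scan < free) {
            Obj copy = to[scan++];
            for (int i = 0; i < copy.refs.length; i++) {
                free = evacuate(from, to, copy.refs[i], free);
                copy.refs[i] = from[copy.refs[i]].forward;
            }
        }
        return to;
    }

    public static void main(String[] args) {
        // a (slot 0) references b (slot 2); x and y are unreachable.
        Obj[] from = { new Obj("a", 2), new Obj("x", 3), new Obj("b"), new Obj("y") };
        int[] roots = { 0 };
        Obj[] to = collect(from, roots);
        for (int i = 0; i < to.length; i++) {
            System.out.println(i + ": " + (to[i] == null ? "<free>" : to[i].name));
        }
    }
}
```

Dead objects are never touched: they are simply left behind in the old region, which can then be reused as a whole.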
Segregation by age
This is called the Weak Generational Hypothesis
Most objects die young
The ones that do not usually survive for a long time
Concentrate on the young generation to reduce pause time
Collect different generations at different frequencies
Segregate objects by ages into generations
Copying Collector
Mark/Sweep
Mark/Compact
Garbage in an old generation cannot be reclaimed by a collection of a younger generation
It's based on the generational hypothesis
A minor GC, on the Young Generation, is performed when the Eden fills up
A major GC, on the Old Generation, is performed when the Tenured fills up
A full GC, on the entire heap, is performed when there is no more space to allocate new objects
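One way to watch these collections happen is the standard `java.lang.management` API. The sketch below allocates short-lived arrays and then prints how often each collector ran. `GcCountDemo`, `sink`, and the allocation sizes are assumptions made for the illustration, and the collector names and counts depend on the JVM and on the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Prints per-collector statistics after creating short-lived garbage.
final class GcCountDemo {

    // Written on every allocation so the allocations are not optimized away.
    static volatile byte[] sink;

    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 when the count is undefined
            if (count > 0) total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Each array becomes garbage as soon as the next one replaces it.
        for (int i = 0; i < 2_000; i++) {
            sink = new byte[1024 * 1024];
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
        System.out.println("total collections: " + totalCollections());
    }
}
```

With a generational collector, a workload like this would typically show most of the activity on the bean that covers the Young Generation.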
The simplest implementation of a GC algorithm
There is only one thread performing GC
When it runs it freezes all of the application threads (Stop-the-World)
It uses Mark-copy in the Young Generation
It uses Mark-sweep-compact in the Old Generation
Similar to Serial GC
There are N threads performing GC
When it runs it freezes all of the application threads (Stop-the-World)
It uses Mark-copy in the Young Generation
It uses Mark-sweep-compact in the Old Generation
Pause times are lower than with Serial GC
It scans heap memory using multiple threads
There are two Stop-the-World phases
It uses Mark-copy in the Young Generation
It uses Mark-sweep in the Old Generation
Initial mark: mark the objects in the Old Generation that are directly reachable from GC roots or referenced from an object in the Young Generation
Remark: find objects that were missed by the concurrent tracing phase
Pause times are lower than the previous ones but there is fragmentation in the Old Generation
A generational, incremental, parallel, mostly concurrent, stop-the-world, and evacuating garbage collector
The heap is divided into regions
Performs space-reclamation incrementally in steps and in parallel
Reclaims space in the most efficient areas first and mostly by using evacuation
Regionalized GC (derived from G1)
Concurrent Compaction
Single Generation
Sub-millisecond max pause times
-XX:+UseShenandoahGC
-XX:+UseZGC
Divides memory into regions (ZPages)
Concurrent Compaction
Colored pointers
Single Generation
Sub-millisecond max pause times
Like everything, memory management is about trade-offs
Safety
Throughput
Pause Time
Space overhead
...
Describes the state of objects during collection
Black: nodes that have been marked and whose children have been marked as well
White: nodes that have not yet been marked and, at the end of the mark phase, are garbage
Gray: nodes that have been marked but whose children have not been visited, and that must be visited again to be painted black
The Algorithm
Invariant*: after the marking loop, there can be no references from a black node to a white one
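The three colors and the invariant can be sketched as a worklist-based marking loop. This is a single-threaded illustration under assumptions: `Color`, `Node`, `shade`, `mark`, and `invariantHolds` are hypothetical names, and no mutator runs while marking is in progress.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy tri-color marking; names are illustrative.
final class TriColorDemo {

    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        final String name;
        Color color = Color.WHITE;          // every node starts white
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // White -> gray: the node is marked but its children are still pending.
    static void shade(Node n, Deque<Node> gray) {
        if (n.color == Color.WHITE) {
            n.color = Color.GRAY;
            gray.push(n);
        }
    }

    static void mark(List<Node> roots) {
        Deque<Node> gray = new ArrayDeque<>();
        for (Node root : roots) shade(root, gray);
        while (!gray.isEmpty()) {
            Node n = gray.pop();
            for (Node child : n.children) shade(child, gray);
            n.color = Color.BLACK;          // gray -> black: all children were shaded
        }
    }

    // Checks that no black node references a white node.
    static boolean invariantHolds(List<Node> heap) {
        for (Node n : heap) {
            if (n.color != Color.BLACK) continue;
            for (Node child : n.children) {
                if (child.color == Color.WHITE) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.children.add(b);
        b.children.add(c);
        d.children.add(a);                  // d is unreachable from the root
        List<Node> heap = List.of(a, b, c, d);

        mark(List.of(a));

        for (Node n : heap) System.out.println(n.name + ": " + n.color);
        System.out.println("invariant holds: " + invariantHolds(heap));
    }
}
```

The white node `d` pointing at the black node `a` does not break the invariant, which only forbids references from black to white.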