How the Go runtime implements maps efficiently

David Chou

We are Umbo Computer Vision
We build autonomous video security systems

Golang Taipei
Streaming Meetup

david74.chou @ facebook
david74.chou @ medium
david7482 @ github
How the Go runtime implements maps efficiently (without generics)
Dave Cheney, GoCon Spring 2018


C++
template<
    class Key,
    class T,
    class Hash = std::hash<Key>,
    class KeyEqual = std::equal_to<Key>,
    class Allocator = std::allocator< std::pair<const Key, T> >
> class unordered_map;
JAVA
Class HashMap<K,V>
java.lang.Object
    java.util.AbstractMap<K,V>
        java.util.HashMap<K,V>
Type Parameters:
K - the type of keys maintained by this map
V - the type of mapped values
Go
var m map[string]int
map(key) → value
The map function

Go uses HashMap
hash(key) → integer
The hash function
Hashmap: an array of 8 buckets (0 to 7); each bucket (for example, Bucket: 3) holds key/value entries.
Hashmap Data Structure
insert(star, "golang/go", 40260)
"golang/go" → HashFunction → 78356113 → Mask → Bucket: 3
Hashmap: 8 buckets (0 to 7). Bucket: 3 holds:
| key | value |
|---|---|
| pkg/errors | 2903 |
| spf13/cobra | 7136 |
| golang/go | 40260 |
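The hash-then-mask step in the diagram can be sketched in a few lines of Go. This is only an illustration: FNV-1a from the standard library stands in for the hash function, and `bucketIndex` is a hypothetical helper, not a runtime function. The bucket numbers it prints will not match the figure.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucketIndex hashes key and masks the hash down to a bucket number.
// numBuckets must be a power of two so that numBuckets-1 is a valid bit mask.
func bucketIndex(key string, numBuckets uint64) uint64 {
	h := fnv.New64a() // FNV-1a is a stand-in, not the hash the Go runtime uses
	h.Write([]byte(key))
	return h.Sum64() & (numBuckets - 1)
}

func main() {
	for _, repo := range []string{"pkg/errors", "spf13/cobra", "golang/go"} {
		fmt.Printf("%-12s -> bucket %d\n", repo, bucketIndex(repo, 8))
	}
}
```

Because the mask keeps only the low bits, the same key always lands in the same bucket, and different keys may share one.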
Four properties of a hash map
- A hash function for the key
- An equality function to compare keys
- Need to know the size of the key type
- Need to know the size of the value type
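As a rough illustration, the four properties above can be written down for one concrete map type, `map[string]int`. The `mapProps` struct and `stringIntProps` function are hypothetical names made up for this sketch, and FNV-1a is again only a stand-in hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"unsafe"
)

// mapProps is a hypothetical bundle of the four properties a hash map needs.
type mapProps struct {
	hash      func(key string) uint64 // hash function for the key
	equal     func(a, b string) bool  // equality function to compare keys
	keySize   uintptr                 // size of the key type in bytes
	valueSize uintptr                 // size of the value type in bytes
}

// stringIntProps fills in the properties for string keys and int values.
func stringIntProps() mapProps {
	return mapProps{
		hash: func(key string) uint64 {
			h := fnv.New64a() // stand-in hash, not the runtime's
			h.Write([]byte(key))
			return h.Sum64()
		},
		equal:     func(a, b string) bool { return a == b },
		keySize:   unsafe.Sizeof(string("")),
		valueSize: unsafe.Sizeof(int(0)),
	}
}

func main() {
	p := stringIntProps()
	fmt.Println(p.equal("golang/go", "golang/go"))
	fmt.Println(p.keySize, p.valueSize) // sizes depend on the platform
}
```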
C++
template<
    class Key,
    class T,
    class Hash = std::hash<Key>,
    class KeyEqual = std::equal_to<Key>,
    class Allocator = std::allocator< std::pair<const Key, T> >
> class unordered_map;
- class Key
- class T
- std::hash<Key>
- std::equal_to<Key>
insert(star, "golang/go", 40260)
"golang/go" → std::hash<key> → 78356113 → Mask → Bucket: 3
Keys within the bucket are compared with std::equal_to<key>
Hashmap: 8 buckets (0 to 7). Bucket: 3 holds:
| key | value |
|---|---|
| pkg/errors | 2903 |
| spf13/cobra | 7136 |
| golang/go | 40260 |
JAVA
Class HashMap<K,V>
java.lang.Object
    java.util.AbstractMap<K,V>
        java.util.HashMap<K,V>
Type Parameters:
K - the type of keys maintained by this map
V - the type of mapped values
- K and V are Object
- Object.equals()
- Object.hashCode()
- Need boxing for primitive types
insert(star, "golang/go", 40260)
"golang/go" → key.hashCode() → 78356113 → Mask → Bucket: 3
Keys within the bucket are compared with key.equals()
Hashmap: 8 buckets (0 to 7). Bucket: 3 holds a linked list:
| key | value | next |
|---|---|---|
| pkg/errors | 2903 | → spf13/cobra |
| spf13/cobra | 7136 | → golang/go |
| golang/go | 40260 | null |
C++
- Pros
  - The sizes of the key and value types are always known
  - Array implementation
  - No need for boxing or pointer chasing
- Cons
  - Larger binary size: each distinct pair of key and value types produces a different map implementation.
  - Slower compile time.
  - Larger memory footprint, because every array element has a predetermined size.
JAVA
- Pros
  - Single implementation for any subclass of Object
  - Faster compile time and smaller binary size
  - Linked list implementation: no predetermined size for each array element.
- Cons
  - Boxing increases GC pressure
  - Slower because of boxing and linked-list pointer chasing
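To make the boxing and pointer-chasing costs concrete, here is a minimal sketch of what a Java-style map would look like if written in Go with `interface{}` keys and values and a linked list per bucket. The names `entry`, `boxedMap`, and `newBoxedMap` are invented for this example, and hashing through `fmt.Sprint` is just a convenient stand-in for `hashCode()`.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// entry is one node in a bucket's linked list. Keys and values are stored
// as interface values, so primitive types end up boxed.
type entry struct {
	key   interface{}
	value interface{}
	next  *entry
}

// boxedMap is a hypothetical chained hash map with a fixed 8 buckets.
type boxedMap struct {
	buckets [8]*entry
	hash    func(interface{}) uint64 // plays the role of hashCode()
}

func newBoxedMap() *boxedMap {
	return &boxedMap{hash: func(k interface{}) uint64 {
		h := fnv.New64a()
		h.Write([]byte(fmt.Sprint(k))) // stand-in hash over the printed key
		return h.Sum64()
	}}
}

// put overwrites an existing key or prepends a new node to the bucket's list.
func (m *boxedMap) put(k, v interface{}) {
	i := m.hash(k) & 7
	for e := m.buckets[i]; e != nil; e = e.next { // pointer chasing
		if e.key == k { // plays the role of equals(); panics on uncomparable keys
			e.value = v
			return
		}
	}
	m.buckets[i] = &entry{key: k, value: v, next: m.buckets[i]}
}

// get walks the bucket's list until it finds a matching key.
func (m *boxedMap) get(k interface{}) (interface{}, bool) {
	for e := m.buckets[m.hash(k)&7]; e != nil; e = e.next {
		if e.key == k {
			return e.value, true
		}
	}
	return nil, false
}

func main() {
	m := newBoxedMap()
	m.put("golang/go", 40260)
	v, ok := m.get("golang/go")
	fmt.Println(v, ok)
}
```

Every lookup follows `next` pointers and every stored `int` lives behind an interface value, which is the overhead the Cons list describes.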
Go's hashmap implementation
Use interface{}? No
Code generation? No
Compiler + Runtime
Compile time rewriting
v := m["key"] → runtime.mapaccess1(m, "key", &v)
v, ok := m["key"] → runtime.mapaccess2(m, "key", &v, &ok)
m["key"] = 9001 → runtime.mapinsert(m, "key", 9001)
delete(m, "key") → runtime.mapdelete(m, "key")
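For reference, these are the four source-level map operations that the rewriting applies to, in an ordinary Go program. The `demo` function is a made-up name for this sketch; the comments only label which form each line is.

```go
package main

import "fmt"

// demo exercises the four map operations that the compiler rewrites.
func demo() (v, w int, ok bool, n int) {
	m := map[string]int{}

	m["key"] = 9001    // assignment
	v = m["key"]       // single-value lookup
	w, ok = m["other"] // two-value lookup; ok reports whether the key is present
	delete(m, "key")   // deletion

	return v, w, ok, len(m)
}

func main() {
	fmt.Println(demo())
}
```

A lookup of a missing key yields the zero value of the value type, which is why the two-value form exists.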
mapaccess1
func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer
Different maptype values for each unique map declaration
map[string]int → var mt1 maptype{...}
map[string]http.Header → var mt2 maptype{...}
map[structA]structB → var mt3 maptype{...}
type maptype struct {
    typ           _type
    key           *_type
    elem          *_type
    bucket        *_type // internal type representing a hash bucket
    hmap          *_type // internal type representing a hmap
    keysize       uint8  // size of key slot
    indirectkey   bool   // store ptr to key instead of key itself
    valuesize     uint8  // size of value slot
    indirectvalue bool   // store ptr to value instead of value itself
    bucketsize    uint16 // size of bucket
    reflexivekey  bool   // true if k==k for all keys
    needkeyupdate bool   // true if we need to update key on overwrite
}
type _type struct {
    size uintptr
    alg  *typeAlg
    ...
}
type typeAlg struct {
    // function for hashing objects of this type
    // (ptr to object, seed) -> hash
    hash func(unsafe.Pointer, uintptr) uintptr
    // function for comparing objects of this type
    // (ptr to object A, ptr to object B) -> ==?
    equal func(unsafe.Pointer, unsafe.Pointer) bool
}
C++
Compile time: one map<K,V> implementation per pair of key and value types
JAVA
Run time: a single map<K,V> implementation that operates on Objects
Go
Compile time: a single map<K,V> implementation plus one maptype value per map declaration
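The Go approach, one shared implementation driven by a per-type descriptor, can be sketched as a toy. Everything here is an assumption made for illustration: `typeDesc` is a heavily reduced stand-in for `maptype` and `typeAlg` (sizes and most fields are omitted), `table` is not the runtime's `hmap`, and the hash is a simple multiplicative one. Only the shape of `access`, which takes a descriptor, a table, and an `unsafe.Pointer` key, mirrors the `mapaccess1` signature shown earlier.

```go
package main

import (
	"fmt"
	"unsafe"
)

// typeDesc is a hypothetical descriptor carrying the per-type operations.
type typeDesc struct {
	hash  func(key unsafe.Pointer, seed uintptr) uintptr
	equal func(a, b unsafe.Pointer) bool
}

// slot holds pointers to one stored key and its value.
type slot struct{ key, value unsafe.Pointer }

// table is a toy map: 8 buckets, each a slice of slots.
type table struct{ buckets [8][]slot }

// access is the single lookup routine shared by every key and value type.
func access(t *typeDesc, h *table, key unsafe.Pointer) unsafe.Pointer {
	for _, s := range h.buckets[t.hash(key, 0)&7] {
		if t.equal(key, s.key) {
			return s.value
		}
	}
	return nil
}

// insert appends a slot; this toy does not handle duplicate keys.
func insert(t *typeDesc, h *table, key, value unsafe.Pointer) {
	i := t.hash(key, 0) & 7
	h.buckets[i] = append(h.buckets[i], slot{key, value})
}

// stringInt describes string keys; one such value would exist per map type.
var stringInt = typeDesc{
	hash: func(p unsafe.Pointer, seed uintptr) uintptr {
		h := seed
		for _, c := range []byte(*(*string)(p)) {
			h = h*31 + uintptr(c) // stand-in hash, not the runtime's
		}
		return h
	},
	equal: func(a, b unsafe.Pointer) bool { return *(*string)(a) == *(*string)(b) },
}

// putStars and getStars play the role of the compiler-generated call sites.
func putStars(h *table, repo string, stars int) {
	insert(&stringInt, h, unsafe.Pointer(&repo), unsafe.Pointer(&stars))
}

func getStars(h *table, repo string) (int, bool) {
	if p := access(&stringInt, h, unsafe.Pointer(&repo)); p != nil {
		return *(*int)(p), true
	}
	return 0, false
}

func main() {
	var h table
	putStars(&h, "golang/go", 40260)
	fmt.Println(getStars(&h, "golang/go"))
}
```

Supporting another key type would mean adding another `typeDesc` value, not another copy of `access` and `insert`.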
Conclusion
- A good compromise between C++ and JAVA
- Single hashmap implementation to reduce binary size
- The sizes of the key and value are already known. Array implementation for better performance.
- Can use primitive types without boxing. No extra GC pressure.
Any questions?

How the Go runtime implements maps efficiently
By Ting-Li Chou