Feedback in V8
Michael Stanton
- V8: Compiler Team/Manager
- V8: ICs and Feedback Vectors
- I like cats, climbing and old typewriters :p
Feedback
The transmission of evaluative or corrective information about an action, event, or process to the original or controlling source; also : the information so transmitted
source: Merriam-Webster
Deliver high performance
What people think we do
What we actually do
How does V8 achieve performance?
- Compilation pipeline with learning
- Hidden-Class-based object layout
- Inline Caches to maintain and observe layout
How does V8 achieve performance?
- Compilation pipeline with learning
- Hidden-Class-based object layout
- Inline Caches to maintain and observe layout
Convey what we know to the right place to use it
How does V8 achieve performance?
- Compilation pipeline with learning
- Hidden-Class-based object layout
- Inline Caches to maintain and observe layout
Make it easy to act on what we know
How does V8 achieve performance?
- Compilation pipeline with learning
- Hidden-Class-based object layout
- Inline Caches to maintain and observe layout
Move the process forward
Compilation pipeline with learning
Source Code -> Ignition -> Bytecode
Compilation pipeline with learning
Source Code -> Ignition -> Bytecode
Run for a while
Gather feedback with ICs
Slot | Type | Value |
---|---|---|
1 | LOAD | MONO |
2 | CALL | UNINIT |
... | ... | ... |
Feedback Vector
Compilation pipeline with learning
Source Code -> Ignition -> Bytecode -> Turbofan -> Optimized Code
Slot | Type | Value |
---|---|---|
1 | LOAD | MONO |
2 | CALL | UNINIT |
... | ... | ... |
Feedback Vector
Compilation pipeline with learning
Source Code -> Ignition -> Bytecode -> Turbofan -> Optimized Code
Optimized Code -> Deoptimization -> Bytecode
Slot | Type | Value |
---|---|---|
1 | LOAD | MONO |
2 | CALL | UNINIT |
... | ... | ... |
- Objects have a "hidden class" (called a Map in V8)
- The Map describes the layout in memory
- Adding/removing properties changes the Map
- The Map is the first pointer in every object
Hidden-Class-based Object layout
- Objects have a "hidden class" (called a Map in V8)
- The Map describes the layout in memory
- Adding/removing properties changes the Map
- The Map is the first pointer in every object
Hidden-Class-based Object layout
We can recognize the class of an object with one pointer comparison
- Objects have a "hidden class" (called a Map in V8)
- The Map describes the layout in memory
- Adding/removing properties changes the Map
- The Map is the first pointer in every object
Hidden-Class-based Object layout
We can recognize the class of an object with one pointer comparison and avoid vast complexity.
Map M
x: 1
let o = { x: 1 };
o.y = 2;
Evolution of o.map
Hidden-Class-based Object layout
Map M
x: 1
Map N
x: 1
y: 2
let o = { x: 1 };
o.y = 2;
Evolution of o.map
Hidden-Class-based Object layout
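The evolution of o.map can be modeled in a few lines. This is a toy sketch under stated assumptions, not V8's implementation: `ToyMap`, its `transitions` table, `makeObject` and `addProperty` are hypothetical names invented for illustration.

```javascript
// Toy model of hidden classes (Maps). All names here are hypothetical;
// V8's real Map objects carry far more than property offsets.
class ToyMap {
  constructor(offsets = {}) {
    this.offsets = offsets;        // property name -> slot index
    this.transitions = new Map();  // property name -> next ToyMap
  }

  // Adding a property follows (or creates) a transition to another map.
  transition(name) {
    if (!this.transitions.has(name)) {
      const nextIndex = Object.keys(this.offsets).length;
      const next = new ToyMap({ ...this.offsets, [name]: nextIndex });
      this.transitions.set(name, next);
    }
    return this.transitions.get(name);
  }
}

const rootMap = new ToyMap();

function makeObject() {
  return { map: rootMap, slots: [] };
}

function addProperty(obj, name, value) {
  obj.map = obj.map.transition(name);       // the object's map changes
  obj.slots[obj.map.offsets[name]] = value; // the value goes in its slot
}

const o = makeObject();
addProperty(o, "x", 1);
const mapM = o.map;           // layout: x
addProperty(o, "y", 2);
const mapN = o.map;           // layout: x, y

const p = makeObject();
addProperty(p, "x", 10);      // same shape o had before o.y = 2

console.log(mapM !== mapN);   // true: adding y moved o to a new map
console.log(p.map === mapM);  // true: same layout, same map
```

Because objects with the same layout share one map, "is this the shape I expect?" reduces to the single identity comparison on the last line.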
Inline Caches to maintain and observe layout
- An Inline Cache (IC) is a listening site placed in your code.
- We have them at LOAD, STORE and CALL locations.
- It caches the Map of objects that pass by...
Inline Caches to maintain and observe layout
- An Inline Cache (IC) is a listening site placed in your code.
- We have them at LOAD, STORE and CALL locations.
- It caches the Map of objects that pass by in the Feedback Vector for the function.
IC Slot | IC Type | Value |
---|---|---|
1 | LOAD | MONO |
2 | CALL | UNINIT |
... | ... | ... |
Every function has a Feedback Vector. It's just an array that holds state for each IC.
function processLogFile(fileName) {
this.collectEntries = true;
this.lastLogFileName_ = fileName;
var line;
while (line = readline()) {
this.processLogLine(line);
}
print();
print("Load: " + this.LoadIC);
print("Store: " + this.StoreIC);
...
}
IC Slot | IC Type | State |
---|---|---|
... | ... | ... |
26 | LOAD | MONO(M) |
27 | LOAD | UNINITIALIZED |
Inline Caches to maintain and observe layout
An Inline Cache is also...
- A state machine.
- Follows a basic pattern.
- Variations are always possible.
ICs to maintain and observe layout
Uninitialized
Premonomorphic
Monomorphic
Polymorphic
Megamorphic
Slow
Load IC states
mrale.ph/blog
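The main path through these states can be sketched as a small state machine. This is a simplified illustration, not V8's code: `makeSlot` and `observe` are hypothetical helpers, `MAX_POLYMORPHIC` is an assumed limit, and the Premonomorphic and Slow states are left out.

```javascript
// Toy LOAD IC state machine. MAX_POLYMORPHIC is an assumed limit for
// this sketch; the Premonomorphic and Slow states are omitted.
const MAX_POLYMORPHIC = 4;

function makeSlot() {
  return { state: "UNINITIALIZED", maps: [] };
}

// Record that an object with the given map passed through the IC.
function observe(slot, map) {
  if (slot.state === "MEGAMORPHIC") return slot; // gave up; stays there
  if (!slot.maps.includes(map)) slot.maps.push(map);
  if (slot.maps.length === 1) {
    slot.state = "MONOMORPHIC";
  } else if (slot.maps.length <= MAX_POLYMORPHIC) {
    slot.state = "POLYMORPHIC";
  } else {
    slot.state = "MEGAMORPHIC";
    slot.maps = []; // too many shapes: stop tracking individual maps
  }
  return slot;
}

const slot = makeSlot();
const seen = [slot.state];
for (const map of ["M", "M", "N", "P", "Q", "R", "M"]) {
  seen.push(observe(slot, map).state);
}
console.log(seen.join(" -> "));
```

Note that the transitions only move in one direction in this sketch: once the slot has given up, seeing the original map "M" again does not bring it back.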
Some facts
- We say an IC is monomorphic when it has only seen objects of one hidden class (Map).
- It's uninitialized if it's never been visited.
- It's generic if it has given up trying to provide information (too many weird things happened).
a[b]; // where b is not a number
// and changes
o[i] = x; // where o is sometimes
// a typed array and
// sometimes a normal array.
this.name = "wut"; // where name is an
// accessor property
// with no setter.
...
Some facts
- It never makes sense to talk about an object as monomorphic.
- The term only applies to the load of a particular property at a particular site.
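A small illustration of why the term belongs to the site and not the object. This is a sketch under an assumption: `shapeOf` uses the ordered list of property names as a stand-in for the hidden class, and `makeSite` is a hypothetical helper that records what a load site has seen.

```javascript
// Stand-in for a hidden class: the ordered list of property names.
// (An assumption for this sketch; real Maps carry more than names.)
const shapeOf = (obj) => Object.keys(obj).join(",");

// A hypothetical load site that remembers every shape it has seen.
function makeSite() {
  const shapes = new Set();
  const load = (obj) => {
    shapes.add(shapeOf(obj));
    return obj.key;
  };
  return { load, shapes };
}

const siteA = makeSite();
const siteB = makeSite();

const plain = { key: "usb" };
const extended = { key: "port", name: "francis" };

siteA.load(plain);    // site A has seen one shape
siteB.load(plain);    // site B has seen one shape
siteB.load(extended); // site B has now seen two shapes

console.log(siteA.shapes.size); // 1: monomorphic site
console.log(siteB.shapes.size); // 2: polymorphic site
```

The object `plain` passes through both sites unchanged; only the history of each site differs.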
How does V8 achieve performance?
- Compilation pipeline with learning
- Hidden-Class-based object layout
- ICs to maintain and observe layout
function load(a) {
return a.key;
}
// pseudo-code for the LOAD IC:
if (a.map == vector[slot].map) {
return valueAtOffset(a, vector[slot].offset);
} else {
// Hmpf. Go to the Runtime for an hour
// or three.
}
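The pseudo-code can be made runnable with a toy object model. This is a sketch, not V8's representation: objects are modeled as `{ map, slots }`, and `map.offsets`, `runtimeLoad` and the `misses` counter are hypothetical names added for illustration.

```javascript
// Runnable sketch of the LOAD IC pseudo-code. Objects are modeled as
// { map, slots }; `map.offsets` and `runtimeLoad` are hypothetical.
const vector = [{ map: null, offset: -1 }]; // one feedback slot
let misses = 0;

// Slow path: do the full lookup, then remember the result in the slot.
function runtimeLoad(a, name, slot) {
  misses++;
  const offset = a.map.offsets[name];
  vector[slot] = { map: a.map, offset };
  return offset === undefined ? undefined : a.slots[offset];
}

function loadKey(a) {
  const slot = 0;
  if (a.map === vector[slot].map) {
    return a.slots[vector[slot].offset]; // fast path: one comparison
  }
  return runtimeLoad(a, "key", slot);    // miss: go to the runtime
}

const sharedMap = { offsets: { key: 0 } };
const o = { map: sharedMap, slots: ["usb"] };
const o1 = { map: sharedMap, slots: ["port"] };

const results = [loadKey(o), loadKey(o1), loadKey(o)];
console.log(results, "misses:", misses); // only the first load misses
```

Only the first call takes the slow path; the two later calls hit the cached map because `o` and `o1` share it.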
function load(a) {
return a.key;
}
; assembly level in the IC
mov ebx, [ebx+edx*4] ; ebx = vector[slot]
cmp ebx, [a] ; is ebx == a.map?
jne &miss ; if not, MISS
mov eax, [a+0xc] ; return a.key
ret
miss: ; ah, jeesh
call ReallySlowC++Thing
REALLYSLOWC++THING
- Complete and correct answer for the property load.
- The more exotic the load is, the less chance we've optimized it.
- The more different types of loads at a single IC, the less valuable the feedback gets.
- It goes Polymorphic...
- Then Generic.
function load(a) {
return a.key;
}
; assembly level optimized code
cmp [a], 0x4830eee0 ; is a.map the one we saw?
jne &deopt ; if not, DEOPTIMIZE
mov eax, [a+0xc] ; return a.key
ret
deopt: ; ah, double jeesh
call TearDownTheWorld
TEARDOWNTHEWORLD
- Deoptimizing a function is expensive.
- You have to re-write the call stack with frames for all the functions you inlined.
- If it happens too many times, V8 becomes shy about trying again.
Feedback Workflow
Let's look at a simple property load.
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Feedback Workflow
Here is the AST from parsing:
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
// Run with --print-ast
FUNC
. NAME "load"
. PARAMS
. . VAR "a"
. RETURN
. . PROPERTY Slot(0) at 29
. . . VAR PROXY parameter[0] "a"
. . . NAME key
Feedback Workflow
Here is the AST from parsing:
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
// Run with --print-ast
FUNC
. NAME "load"
. PARAMS
. . VAR "a"
. RETURN
. . PROPERTY Slot(0) at 29
. . . VAR PROXY parameter[0] "a"
. . . NAME key
Slot | Type |
---|---|
0 | LOAD_IC |
Here is the feedback vector specification:
Feedback Workflow
Here is the bytecode:
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Slot | Type |
---|---|
0 | LOAD_IC |
Here is the feedback vector specification:
"load" -- Parameter count 2
Frame size 0
StackCheck
Nop
LdaNamedProperty a0, [0], [3] // "key"
Return
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | UNINITIALIZED |
Here is the feedback vector:
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | UNINITIALIZED |
We execute the LoadIC for "a.key"...
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | MONO(o.map) |
Return the answer & remember the Map.
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | MONO(o.map) |
Now with a different object...
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | MONO(o.map) |
Is the map the same?
o1.map == o.map?
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | MONO(o.map) |
Yes. We remain monomorphic.
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | MONO(o.map) |
The Runtime Profiler asks:
- Is the function "hot?"
- Is there enough feedback?
(Run with --nouse-osr to ensure we optimize load(), and not the whole script)
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | MONO(o.map) |
The Runtime Profiler asks:
- Is the function "hot?"
- Is there enough feedback?
// --trace-opt output
[marking <JSFunction load> for
optimized recompilation,
reason: small function,
ICs with typeinfo: 1/1 (100%),
generic ICs: 0/1 (0%)]
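The two questions can be pictured as a simple predicate. This is a toy sketch only: `shouldOptimize`, the shape of the function records, and both thresholds are made up for illustration and do not reflect V8's actual heuristics or tuning.

```javascript
// Toy version of the profiler's two questions. Both thresholds are
// made-up values for illustration, not V8's tuning.
const HOT_CALL_COUNT = 1000;
const MIN_FEEDBACK_RATIO = 0.25;

function shouldOptimize(fn) {
  const slots = fn.feedbackVector;
  // "ICs with typeinfo": slots that have learned something.
  const withInfo = slots.filter((s) => s !== "UNINITIALIZED").length;
  const ratio = slots.length === 0 ? 1 : withInfo / slots.length;
  const isHot = fn.callCount >= HOT_CALL_COUNT;
  return isHot && ratio >= MIN_FEEDBACK_RATIO;
}

// Hypothetical function records.
const load = { callCount: 200000, feedbackVector: ["MONOMORPHIC"] };
const coldFn = { callCount: 3, feedbackVector: ["MONOMORPHIC"] };
const blindFn = {
  callCount: 200000,
  feedbackVector: ["UNINITIALIZED", "UNINITIALIZED"],
};

console.log(shouldOptimize(load));    // hot, with feedback
console.log(shouldOptimize(coldFn));  // not hot
console.log(shouldOptimize(blindFn)); // hot, but nothing learned yet
```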
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | MONO(o.map) |
The Runtime Profiler asks:
- Is the function "hot?"
- Is there enough feedback?
// --trace-opt output
[marking <JSFunction load> for
optimized recompilation,
reason: small function,
ICs with typeinfo: 1/1 (100%),
generic ICs: 0/1 (0%)]
// load(a) - Turbofanned
push ebp // Build frame
mov ebp,esp //
push esi //
push edi //
mov eax,[ebp+0x8] // eax = a
test al,0x1 // is a an object?
jz DEOPT_0 // If not, deoptimize
mov ecx,[eax-0x1] // ecx = a.map
mov edx,0x37e0b4ad // edx = o.map
cmp ecx,edx // a.map == o.map?
jnz DEOPT_1 // if not same, deopt.
mov eax,[eax+0xb] // return a.key!
mov esp,ebp // Tear down frame
pop ebp //
ret 0x8 // return for realz
...
DEOPT_0: call 0x52d06000 // Sad!
DEOPT_1: call 0x52d0600a // Very sad!
// load(a) - Turbofanned
push ebp // Build frame
mov ebp,esp //
push esi //
push edi //
mov eax,[ebp+0x8] // eax = a
test al,0x1 // is a an object?
jz DEOPT_0 // If not, deoptimize
mov ecx,[eax-0x1] // ecx = a.map
mov edx,0x37e0b4ad // edx = o.map
cmp ecx,edx // a.map == o.map?
jnz DEOPT_1 // if not same, deopt.
mov eax,[eax+0xb] // return a.key!
mov esp,ebp // Tear down frame
pop ebp //
ret 0x8 // return for realz
...
DEOPT_0: call 0x52d06000 // Sad!
DEOPT_1: call 0x52d0600a // Very sad!
Boilerplate
// load(a) - Turbofanned
push ebp // Build frame
mov ebp,esp //
push esi //
push edi //
mov eax,[ebp+0x8] // eax = a
test al,0x1 // is a an object?
jz DEOPT_0 // If not, deoptimize
mov ecx,[eax-0x1] // ecx = a.map
mov edx,0x37e0b4ad // edx = o.map
cmp ecx,edx // a.map == o.map?
jnz DEOPT_1 // if not same, deopt.
mov eax,[eax+0xb] // return a.key!
mov esp,ebp // Tear down frame
pop ebp //
ret 0x8 // return for realz
...
DEOPT_0: call 0x52d06000 // Sad!
DEOPT_1: call 0x52d0600a // Very sad!
Boilerplate
Object check
// load(a) - Turbofanned
push ebp // Build frame
mov ebp,esp //
push esi //
push edi //
mov eax,[ebp+0x8] // eax = a
test al,0x1 // is a an object?
jz DEOPT_0 // If not, deoptimize
mov ecx,[eax-0x1] // ecx = a.map
mov edx,0x37e0b4ad // edx = o.map
cmp ecx,edx // a.map == o.map?
jnz DEOPT_1 // if not same, deopt.
mov eax,[eax+0xb] // return a.key!
mov esp,ebp // Tear down frame
pop ebp //
ret 0x8 // return for realz
...
DEOPT_0: call 0x52d06000 // Sad!
DEOPT_1: call 0x52d0600a // Very sad!
Boilerplate
Object check
Map check
// load(a) - Turbofanned
push ebp // Build frame
mov ebp,esp //
push esi //
push edi //
mov eax,[ebp+0x8] // eax = a
test al,0x1 // is a an object?
jz DEOPT_0 // If not, deoptimize
mov ecx,[eax-0x1] // ecx = a.map
mov edx,0x37e0b4ad // edx = o.map
cmp ecx,edx // a.map == o.map?
jnz DEOPT_1 // if not same, deopt.
mov eax,[eax+0xb] // return a.key!
mov esp,ebp // Tear down frame
pop ebp //
ret 0x8 // return for realz
...
DEOPT_0: call 0x52d06000 // Sad!
DEOPT_1: call 0x52d0600a // Very sad!
Boilerplate
Object check
Map check
The actual load!
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
The more references there are to "a" downwind of the feedback, the greater the benefit of knowing the object's class.
Downwind includes inlined functions.
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
- Polymorphic: probably use it... but what if some of the maps are stale and have not been seen recently?
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
- Polymorphic: probably use it... but what if some of the maps are stale and have not been seen recently?
cmp eax, 0x43501230 // is this the map?
je load_map_1 // yes, handle it
cmp eax, 0x99503210 // how about this one?
je load_map_2 // han'l it different-style
jmp DEOPT // Oh jeesh
load_map_1: mov eax, [edx+0xc]
ret
load_map_2: mov eax, [edx+0x10]
ret
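The same dispatch shape, sketched in JavaScript. This is an illustration under assumptions: `Deopt` is a hypothetical marker for bailing out to unoptimized code, and `map1`, `map2` and the `{ map, slots }` object model are invented for the example.

```javascript
// Sketch of optimized code for a polymorphic site: one check per map
// recorded in the feedback, then deoptimize. `Deopt` is a hypothetical
// marker for bailing out to unoptimized code.
class Deopt extends Error {}

const map1 = { name: "map1" }; // key lives in slot 0
const map2 = { name: "map2" }; // key lives in slot 1

function optimizedLoad(a) {
  if (a.map === map1) return a.slots[0]; // is this the map? handle it
  if (a.map === map2) return a.slots[1]; // how about this one?
  throw new Deopt("unexpected map");     // neither: deoptimize
}

const first = optimizedLoad({ map: map1, slots: ["usb"] });
const second = optimizedLoad({ map: map2, slots: ["francis", "port"] });

let deoptimized = false;
try {
  optimizedLoad({ map: { name: "map3" }, slots: [] });
} catch (e) {
  deoptimized = e instanceof Deopt;
}
console.log(first, second, deoptimized);
```

Every stale map in the feedback adds a comparison that never succeeds, which is why the question about unused maps matters.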
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
- Polymorphic: probably use it
- What about uninitialized IC sites?
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
- Polymorphic: probably use it
- What about uninitialized IC sites?
It's tempting to just compile generic code, and allow the IC to learn through the feedback vector.
This way, you at least won't deoptimize...
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
- Polymorphic: probably use it
- What about uninitialized IC sites?
But in practice, running the less efficient code over many iterations costs more than deoptimizing and reoptimizing.
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
- Polymorphic: probably use it
- What about uninitialized IC sites?
So we deoptimize unconditionally at uninitialized sites, and save compile time because we don't bother compiling that whole block.
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
- Polymorphic: probably use it
- Uninitialized IC sites: deoptimize
- Be careful about "hoisting" map checks out of loops!
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
- Polymorphic: probably use it
- Uninitialized IC sites: deoptimize
- Be careful about "hoisting" map checks out of loops!
You lose touch with the Feedback Vector, and have nowhere to put the new map.
Eventually V8 stops reoptimizing.
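The failure mode can be simulated with a toy model. This is a sketch under stated assumptions: `simulate` and `MAX_REOPTS` are hypothetical, and the model reduces "optimized code" to the set of maps its checks accept.

```javascript
// Toy model of the hoisting problem. MAX_REOPTS is an assumed limit on
// how often the engine is willing to reoptimize.
const MAX_REOPTS = 3;

function simulate(hoisted) {
  const feedback = new Set(["M"]); // maps the IC slot knows about
  const incoming = "N";            // a new map shows up at run time
  let deopts = 0;

  // Optimized code only accepts maps that were in the feedback when it
  // was compiled, so an unknown map fails the check and deoptimizes.
  while (!feedback.has(incoming) && deopts < MAX_REOPTS) {
    deopts++;
    // A check at the load site falls back to the IC, which records the
    // new map. A hoisted check has lost its slot and records nothing,
    // so the next optimized version is identical and fails again.
    if (!hoisted) feedback.add(incoming);
  }
  return { deopts, learned: feedback.has(incoming) };
}

const inPlace = simulate(false);
const hoistedCheck = simulate(true);
console.log(inPlace);      // one deopt, then the map is known
console.log(hoistedCheck); // repeated deopts, nothing learned
```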
Things to consider when optimizing
- How to use the feedback?
- Monomorphic is a no-brainer: use it
- Polymorphic: probably use it
- Uninitialized IC sites: deoptimize
- Be careful about "hoisting" map checks out of loops!
- How much to inline?
  - Inlining increases compilation time
  - How to decide what to inline, and to what depth?
Feedback Workflow
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | MONO(o.map) |
// load(a) - Turbofanned
...
mov ecx,[eax-0x1] // ecx = a.map
mov edx,0x37e0b4ad // edx = o.map
cmp ecx,edx // a.map == o.map?
jnz DEOPT_1 // if not same, deopt
mov eax,[eax+0xb] // return a.key
...
Feedback Vector not being used
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
load({ key: "usb", name: "francis" });
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
load({ key: "usb", name: "francis" });
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | MONO(o.map) |
// load(a) - Turbofanned
...
mov ecx,[eax-0x1] // ecx = a.map
mov edx,0x37e0b4ad // edx = o.map
cmp ecx,edx // a.map == o.map?
jnz DEOPT_1 // if not same, deopt
mov eax,[eax+0xb] // return a.key
...
The map will be different...
Compilation pipeline with learning
Source Code -> Ignition -> Bytecode -> Turbofan -> Optimized Code
Optimized Code -> Deoptimization -> Bytecode
Slot | Type | Value |
---|---|---|
1 | LOAD | MONO |
2 | CALL | UNINIT |
... | ... | ... |
Feedback Workflow
function load(a) {
return a.key;
}
var o = { key: "usb" };
var o1 = { key: "port" };
for (var i = 0; i < 100000; i++) {
load(o);
load(o1);
}
load({ key: "usb", name: "francis" });
Slot | Type | Value |
---|---|---|
0 | LOAD_IC | POLY(o.map, new_map) |
The map will be different...
Questions?
Vector (too big)
By ripsawridge
How the V8 JavaScript engine uses feedback to achieve performance.