Feedback in V8

Michael Stanton

  • V8: Compiler Team/Manager
  • V8: ICs and Feedback Vectors
  • I like cats, climbing and old typewriters :p

Feedback

The transmission of evaluative or corrective information about an action, event, or process to the original or controlling source; also: the information so transmitted

Deliver high performance

source: Merriam-Webster

What people think we do

What we actually do

How does V8 achieve performance?

  • Compilation pipeline with learning
  • Hidden-Class-based object layout
  • Inline Caches to maintain and observe layout

Convey what we know to the right place to use it

Make it easy to act on what we know

Move the process forward

Compilation pipeline with learning

Source Code → Ignition → Bytecode
Run for a while, gather feedback with ICs
Bytecode + Feedback Vector → Turbofan → Optimized Code
Optimized Code → Deoptimization → Bytecode

Feedback Vector

Slot Type Value
1 LOAD MONO
2 CALL UNINIT
... ... ...

Hidden-Class-based Object layout

  • Objects have a "hidden class" (called a Map in V8)
  • The Map describes the layout in memory
  • Adding/removing properties changes the Map
  • The Map is the first pointer in every object

We can recognize the class of an object with one pointer comparison and avoid vast complexity.

Hidden-Class-based Object layout

Evolution of o.map

let o = { x: 1 };  // o.map is Map M (x: 1)
o.y = 2;           // o.map is Map N (x: 1, y: 2)
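
A toy model of the Map evolution above, written in plain JavaScript. ToyMap and ToyObject are hypothetical names for this sketch and do not reflect how V8 implements Maps internally; the point is only that objects built the same way end up sharing one Map, so one identity comparison recognizes the layout.

```javascript
// Toy model of hidden classes. ToyMap and ToyObject are hypothetical
// names; this is not V8's internal implementation.
class ToyMap {
  constructor(keys) {
    this.keys = keys;             // property names, in insertion order
    this.transitions = new Map(); // property name -> next ToyMap
  }
  // Return the map reached by adding `key`, creating it on first use.
  transition(key) {
    if (!this.transitions.has(key)) {
      this.transitions.set(key, new ToyMap([...this.keys, key]));
    }
    return this.transitions.get(key);
  }
  // Index of `key` in the object's value storage, or -1 if absent.
  offsetOf(key) {
    return this.keys.indexOf(key);
  }
}

const ROOT_MAP = new ToyMap([]);

class ToyObject {
  constructor() {
    this.map = ROOT_MAP; // stands in for the "first pointer" in the object
    this.values = [];
  }
  set(key, value) {
    let offset = this.map.offsetOf(key);
    if (offset === -1) {
      this.map = this.map.transition(key); // adding a property changes the map
      offset = this.map.offsetOf(key);
    }
    this.values[offset] = value;
  }
  get(key) {
    const offset = this.map.offsetOf(key);
    return offset === -1 ? undefined : this.values[offset];
  }
}

const o = new ToyObject();
o.set("x", 1);
const mapM = o.map; // Map M: layout [x]
o.set("y", 2);
const mapN = o.map; // Map N: layout [x, y]

const p = new ToyObject();
p.set("x", 10);
p.set("y", 20); // same properties, same order

console.log(mapM !== mapN);  // adding y moved o to a new map
console.log(p.map === mapN); // p shares o's final map
```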

Inline Caches to maintain and observe layout

  • An Inline Cache (IC) is a listening site placed in your code.
  • We have them at LOAD, STORE and CALL locations.
  • It caches the Map of objects that pass by in the Feedback Vector for the function.

Every function has a Feedback Vector. It's just an array that holds state for each IC.

IC Slot IC Type Value
1 LOAD MONO
2 CALL UNINIT
... ... ...

function processLogFile(fileName) {
  this.collectEntries = true;
  this.lastLogFileName_ = fileName;
  var line;
  while (line = readline()) {
    this.processLogLine(line);
  }
  print();
  print("Load: " + this.LoadIC);
  print("Store: " + this.StoreIC);
  ...
}
IC Slot IC Type State
... ... ...
26 LOAD MONO(M)
27 LOAD UNINITIALIZED

Inline Caches to maintain and observe layout

An Inline Cache is also...

  • A state machine.
  • Follows a basic pattern.
  • Variations are always possible.

ICs to maintain and observe layout

Uninitialized

Premonomorphic

Monomorphic

Polymorphic

Megamorphic

Slow

Load IC states

mrale.ph/blog
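
The state progression can be sketched as a small state machine. This is a simplified toy: MAX_POLYMORPHIC is an assumed limit chosen for the sketch, the Premonomorphic and Slow states are left out, and createSlot and recordMap are hypothetical helper names.

```javascript
// Toy load-IC state machine. MAX_POLYMORPHIC is an assumed limit for
// this sketch; the Premonomorphic and Slow states are omitted.
const MAX_POLYMORPHIC = 4;

function createSlot() {
  return { state: "UNINITIALIZED", maps: [] };
}

// Record that an object with `map` passed through the IC site.
function recordMap(slot, map) {
  if (slot.state === "MEGAMORPHIC") return slot; // gave up, no more learning
  if (slot.maps.includes(map)) return slot;      // already known, state unchanged
  slot.maps.push(map);
  if (slot.maps.length === 1) {
    slot.state = "MONOMORPHIC";
  } else if (slot.maps.length <= MAX_POLYMORPHIC) {
    slot.state = "POLYMORPHIC";
  } else {
    slot.state = "MEGAMORPHIC";
    slot.maps = []; // individual maps are no longer tracked
  }
  return slot;
}

const slot = createSlot();
const states = [slot.state];
for (const map of ["M", "M", "N", "O", "P", "Q"]) {
  states.push(recordMap(slot, map).state);
}
console.log(states.join(" -> "));
```

Seeing the same map twice leaves the slot monomorphic; only a new map moves it along.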

Some facts

  • An IC is monomorphic when it has only seen objects of one class (Map).
  • It's uninitialized if it has never been visited.
  • It's generic if it has given up trying to provide information (too many weird things happened), for example:
a[b];  // where b is not a number
       // and changes
o[i] = x;  // where o is sometimes 
           // a typed array and 
           // sometimes a normal array.
this.name = "wut"; // where name is an 
                   // accessor property 
                   // with no setter.
...

Some facts

  • It never makes sense to talk about an object as monomorphic.
  • The term only applies to an operation at a particular site, such as the load of a particular property.

How does V8 achieve performance?

  • Compilation pipeline with learning
  • Hidden-Class-based object layout
  • ICs to maintain and observe layout
function load(a) {
  return a.key;
}
// pseudo-code for the LOAD IC:
if (a.map == vector[slot].map) {
  return valueAtOffset(a, vector[slot].offset);
} else {
  // Hmpf. Go to the Runtime for an hour
  // or three.
}
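
A runnable version of the pseudo-code above, as a rough sketch. The object representation (a map field plus a values array), the vector layout, and the names loadIC and loadMiss are assumptions made for illustration. Unlike a real IC, this toy simply overwrites the slot on a miss instead of going polymorphic.

```javascript
// Toy objects: `map` stands in for the hidden class, `values` for the
// in-object storage. All names here are hypothetical.
const mapM = { offsets: { key: 0 } };
const mapN = { offsets: { name: 0, key: 1 } };

const vector = [{ map: null, offset: -1 }]; // one slot for the a.key load
let misses = 0;

// Slow path: do the full lookup, then cache what we learned.
// (Simplification: the slot is overwritten rather than made polymorphic.)
function loadMiss(a, name, slot) {
  misses++;
  const offset = a.map.offsets[name];
  vector[slot] = { map: a.map, offset };
  return a.values[offset];
}

function loadIC(a, name, slot) {
  const feedback = vector[slot];
  if (a.map === feedback.map) {
    return a.values[feedback.offset]; // fast path: one comparison, one load
  }
  return loadMiss(a, name, slot);
}

const o = { map: mapM, values: ["usb"] };
const o1 = { map: mapM, values: ["port"] };
const o2 = { map: mapN, values: ["francis", "usb"] };

const results = [loadIC(o, "key", 0), loadIC(o1, "key", 0), loadIC(o2, "key", 0)];
console.log(results, "misses:", misses);
```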
function load(a) {
  return a.key;
}
; assembly level in the IC

mov ebx, [ebx+edx*4]  ; ebx = vector[slot]
cmp ebx, [a]          ; is ebx == a.map?
jne &miss             ; if not, MISS
mov eax, [a+0xc]      ; return a.key
ret

miss:                 ; ah, jeesh
call ReallySlowC++Thing

REALLYSLOWC++THING

  • Complete and correct answer for the property load.
  • The more exotic the load is, the less chance we've optimized it.
  • The more different types of loads at a single IC, the less valuable the feedback gets.
    • It goes Polymorphic...
    • Then Generic.
function load(a) {
  return a.key;
}
; assembly level optimized code

cmp [a], 0x4830eee0   ; is a.map the one we saw?
jne &deopt            ; if not, DEOPTIMIZE
mov eax, [a+0xc]      ; return a.key
ret

deopt:                 ; ah, double jeesh
call TearDownTheWorld

TEARDOWNTHEWORLD

  • Deoptimizing a function is expensive.
  • You have to re-write the call stack with frames for all the functions you inlined.
  • If it happens too many times, V8 becomes shy about trying again.

Feedback Workflow

Let's look at a simple property load.

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}

Feedback Workflow

Here is the AST from parsing:

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}
// Run with --print-ast
FUNC
. NAME "load"
. PARAMS
. . VAR "a"
. RETURN
. . PROPERTY Slot(0) at 29
. . . VAR PROXY parameter[0] "a"
. . . NAME key
Slot Type
0 LOAD_IC

Here is the feedback vector specification

Feedback Workflow

Here is the bytecode:

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}
Slot Type
0 LOAD_IC

Here is the feedback vector specification:

"load" -- Parameter count 2
Frame size 0
   StackCheck
   Nop
   LdaNamedProperty a0, [0], [3]  // "key"
   Return

Feedback Workflow

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}
Slot Type Value
0 LOAD_IC

Here is the feedback vector:

Feedback Workflow

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}
Slot Type Value
0 LOAD_IC

We execute the LoadIC for "a.key"...

Feedback Workflow

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}
Slot Type Value
0 LOAD_IC MONO(o.map)

Return the answer & remember the Map.

Feedback Workflow

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}
Slot Type Value
0 LOAD_IC MONO(o.map)

Now with a different object...

Feedback Workflow

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}
Slot Type Value
0 LOAD_IC MONO(o.map)

Is the map the same?

o1.map == o.map?

Feedback Workflow

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}
Slot Type Value
0 LOAD_IC MONO(o.map)

Yes. We remain monomorphic.

Feedback Workflow

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}
Slot Type Value
0 LOAD_IC MONO(o.map)

The Runtime Profiler asks:

  • Is the function "hot?"
  • Is there enough feedback?

(Run with --nouse-osr to ensure we optimize load(), and not the whole script)

Feedback Workflow

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}
Slot Type Value
0 LOAD_IC MONO(o.map)

The Runtime Profiler asks:

  • Is the function "hot?"
  • Is there enough feedback?
// --trace-opt output
[marking <JSFunction load> for 
    optimized recompilation, 
  reason: small function, 
  ICs with typeinfo: 1/1 (100%), 
  generic ICs: 0/1 (0%)]
// load(a) - Turbofanned

push ebp             // Build frame             
mov  ebp,esp         //
push esi             //
push edi             //

mov  eax,[ebp+0x8]   // eax = a
test al,0x1          // is a an object?
jz   DEOPT_0         // If not, deoptimize

mov  ecx,[eax-0x1]   // ecx = a.map
mov  edx,0x37e0b4ad  // edx = o.map
cmp  ecx,edx         // a.map == o.map?
jnz  DEOPT_1         // if not same, deopt.
mov  eax,[eax+0xb]   // return a.key!

mov  esp,ebp         // Tear down frame
pop  ebp             //
ret  0x8             // return for realz

...

DEOPT_0: call 0x52d06000  // Sad!
DEOPT_1: call 0x52d0600a  // Very sad!

Boilerplate

Object check

Map check

The actual load!

Things to consider when optimizing

  • How to use the feedback?
    • Monomorphic is a no-brainer: use it

The more references there are to "a" downwind of the feedback, the greater the benefit of knowing the object's class.

Downwind includes inlined functions.

Things to consider when optimizing

  • How to use the feedback?
    • Monomorphic is a no-brainer: use it
    • Polymorphic: probably use it... but what if some of the maps are stale and have not been seen recently?
cmp eax, 0x43501230  // is this the map?
je  load_map_1       // yes, handle it
cmp eax, 0x99503210  // how about this one?
je  load_map_2       // han'l it different-style
jmp DEOPT            // Oh jeesh

load_map_1: mov eax, [edx+0xc]
            ret
load_map_2: mov eax, [edx+0x10]
            ret
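
The same compare-and-jump chain, modeled in JavaScript as a rough sketch. MAP_1, MAP_2, MAP_3, DeoptError, and the offsets are hypothetical; a thrown exception stands in for the jump to DEOPT, and genericLoad stands in for the unoptimized code we fall back to.

```javascript
// Toy model of optimized code specialized for two maps. MAP_1, MAP_2,
// MAP_3, DeoptError and the offsets are hypothetical.
const MAP_1 = { keyOffset: 0 };
const MAP_2 = { keyOffset: 1 };
const MAP_3 = { keyOffset: 2 }; // never seen while gathering feedback

class DeoptError extends Error {}

// Mirrors the compare-and-jump chain: known maps get a hard-coded offset.
function optimizedLoad(a) {
  if (a.map === MAP_1) return a.values[0]; // load_map_1
  if (a.map === MAP_2) return a.values[1]; // load_map_2
  throw new DeoptError("unexpected map");  // DEOPT
}

// Generic fallback, standing in for the unoptimized code we return to.
function genericLoad(a) {
  return a.values[a.map.keyOffset];
}

let deopts = 0;
function load(a) {
  try {
    return optimizedLoad(a);
  } catch (e) {
    if (!(e instanceof DeoptError)) throw e;
    deopts++;
    return genericLoad(a);
  }
}

const a1 = { map: MAP_1, values: ["usb"] };
const a2 = { map: MAP_2, values: ["francis", "port"] };
const a3 = { map: MAP_3, values: [1, 2, "hub"] };

console.log(load(a1), load(a2), load(a3), "deopts:", deopts);
```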

Things to consider when optimizing

  • How to use the feedback?
    • Monomorphic is a no-brainer: use it
    • Polymorphic: probably use it
    • What about uninitialized IC sites?

It's tempting to just compile generic code, and allow the IC to learn through the feedback vector.

 

This way, you at least won't deoptimize...

Things to consider when optimizing

  • How to use the feedback?
    • Monomorphic is a no-brainer: use it
    • Polymorphic: probably use it
    • What about uninitialized IC sites?

But in practice, running less efficient code over many iterations costs more than deoptimizing and reoptimizing.

Things to consider when optimizing

  • How to use the feedback?
    • Monomorphic is a no-brainer: use it
    • Polymorphic: probably use it
    • What about uninitialized IC sites?

So we deoptimize unconditionally, and save compile time because we don't bother compiling that whole block.

Things to consider when optimizing

  • How to use the feedback?
    • Monomorphic is a no-brainer: use it
    • Polymorphic: probably use it
    • Uninitialized IC sites: deoptimize
    • Be careful about "hoisting" map checks out of loops!

You lose touch with the Feedback Vector, and have nowhere to put the new map.

Eventually V8 stops reoptimizing.

Things to consider when optimizing

  • How to use the feedback?
    • Monomorphic is a no-brainer: use it
    • Polymorphic: probably use it
    • Uninitialized IC sites: deoptimize
    • Be careful about "hoisting" map checks out of loops!
  • How much to inline?
    • Increases compilation time
    • How to decide what to inline, and to what depth?

Feedback Workflow

Slot Type Value
0 LOAD_IC MONO(o.map)
// load(a) - Turbofanned
...
mov  ecx,[eax-0x1]  // ecx = a.map
mov  edx,0x37e0b4ad // edx = o.map
cmp  ecx,edx        // a.map == o.map?
jnz  DEOPT_1        // if not same, deopt
mov  eax,[eax+0xb]  // return a.key
...

Feedback Vector not being used

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}

load({ key: "usb", name: "francis" });

Feedback Workflow

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}

load({ key: "usb", name: "francis" });
Slot Type Value
0 LOAD_IC MONO(o.map)
// load(a) - Turbofanned
...
mov  ecx,[eax-0x1]  // ecx = a.map
mov  edx,0x37e0b4ad // edx = o.map
cmp  ecx,edx        // a.map == o.map?
jnz  DEOPT_1        // if not same, deopt
mov  eax,[eax+0xb]  // return a.key
...

The map will be different...

Compilation pipeline with learning

Source Code → Ignition → Bytecode + Feedback Vector → Turbofan → Optimized Code
Optimized Code → Deoptimization → Bytecode

Feedback Vector

Slot Type Value
1 LOAD MONO
2 CALL UNINIT
... ... ...

Feedback Workflow

function load(a) {
  return a.key;
}

var o = { key: "usb" };
var o1 = { key: "port" };

for (var i = 0; i < 100000; i++) {
  load(o);
  load(o1);
}

load({ key: "usb", name: "francis" });
Slot Type Value
0 LOAD_IC POLY(o.map, new_map)

The map will be different...

Questions?
