V7 memory optimization techniques

@CesantaHQ

by Sergey Lyubka, Cesanta Software, for Dublin C/C++ User Group

What is V7

  • V7 is an embedded JavaScript engine written in C
  • Compliant with ISO C90 and ISO C++98
  • Targets embedded systems
  • Or C/C++ programs that need a scripting engine
  • Conforms to the ECMAScript 5.1 standard
  • Easy to embed: only two files, v7.c and v7.h
  • Easy-to-use embedding API
  • Open Source, under GPLv2/commercial license
  • GitHub repo: https://github.com/cesanta/v7/

Memory Footprint

  • Since V7 is developed for embedded systems, minimizing resource usage is critical
  • For memory, both the static and the run-time footprint are considered
    • Typical embedded board: 0.5-8k RAM, 16-64k flash
  • Several techniques are used to bring V7 memory usage to a minimum

JavaScript values

  • In JavaScript, the following types are defined: undefined, null, boolean, string, number, object
  • Only the object type is compound: it is a collection of name/value pairs, where a value can be of any type
  • Other types are scalar:
    • undefined and null hold no value
    • boolean holds either true or false
    • string holds a piece of Unicode text
    • number holds an IEEE 754 double-precision floating-point value, known to C/C++ developers as the double type

Packing values

  • In a naive implementation, a value is represented as a structure (or C++ class) that holds a type and a value placeholder
  • So any value is actually a pointer to a structure (or class instance)
/* Rest of the types are omitted for clarity */
enum value_type { TYPE_NUMBER, TYPE_STRING, TYPE_OBJECT };

struct value {
  enum value_type type;
  union {
    char boolean;
    double number;
    struct string *string;
    struct object *object;
  } placeholder;
};

Packing values (cont)

  • On a 64-bit architecture, this structure occupies 16 bytes
  • This holds even if type is moved to the end of the member list
  • At least 7 bytes of padding are wasted for every value
  • For boolean values, 15 bytes are wasted
/* Rest of the types are omitted for clarity */
enum value_type { TYPE_NUMBER, TYPE_STRING, TYPE_OBJECT };

struct value {
  enum value_type type;    /* Could be 1 byte, but will pad to 8 bytes */
  union {
    double number;
    struct string *string;
    struct object *object;
  } placeholder;           /* The largest type is double - 8 bytes*/
};
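The size claim above can be checked on a given compiler with sizeof and offsetof. The following is a minimal sketch, assuming the struct from the slide with the string and object pointers replaced by void * so that it compiles on its own; the helper names value_padding_bytes and print_value_layout are hypothetical. On a typical 64-bit ABI the enum takes 4 bytes and the compiler inserts 4 more before the union, so the exact split of the wasted bytes may differ from the 1-byte type assumed on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Rest of the types are omitted for clarity */
enum value_type { TYPE_NUMBER, TYPE_STRING, TYPE_OBJECT };

struct value {
  enum value_type type;
  union {
    double number;
    void *string; /* stand-in for struct string * */
    void *object; /* stand-in for struct object * */
  } placeholder;
};

/* Bytes the compiler inserts between type and placeholder (hypothetical helper) */
static size_t value_padding_bytes(void) {
  return offsetof(struct value, placeholder) - sizeof(enum value_type);
}

/* Prints the layout seen by the current compiler (hypothetical helper) */
static void print_value_layout(void) {
  /* The struct can never be smaller than its largest member plus the tag */
  assert(sizeof(struct value) > sizeof(double));
  printf("sizeof(struct value)  = %lu\n", (unsigned long) sizeof(struct value));
  printf("offset of placeholder = %lu\n",
         (unsigned long) offsetof(struct value, placeholder));
  printf("padding after type    = %lu\n", (unsigned long) value_padding_bytes());
}
```

Calling print_value_layout() from main() shows the numbers for the platform at hand.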

IEEE 754 double type

  • The largest JavaScript scalar type is Number, which is an IEEE 754 double-precision value (double type in C/C++)
  • 1 sign bit, 11 exponent bits, 52 mantissa bits
  • If the exponent bits are all 1 and the mantissa is non-zero, the number is NaN
//  Double-precision floating-point number, IEEE 754
//
//  64 bit (8 bytes) in total
//  1  bit sign
//  11 bits exponent
//  52 bits mantissa
// 
//      7         6        5        4        3        2        1        0
//  seeeeeee|eeeemmmm|mmmmmmmm|mmmmmmmm|mmmmmmmm|mmmmmmmm|mmmmmmmm|mmmmmmmm
//
//  11111111|11110000|00000000|00000000|00000000|00000000|00000000|00000001  NaN
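The NaN rule above can be tested directly on the bit pattern. This is a small sketch, not V7 code: the helper names double_from_bits and is_nan_bits are made up for illustration, and the code assumes that double is a 64-bit IEEE 754 type on the target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterprets a 64-bit pattern as a double (hypothetical helper) */
static double double_from_bits(uint64_t bits) {
  double d;
  assert(sizeof(d) == sizeof(bits));
  memcpy(&d, &bits, sizeof(d));
  return d;
}

/* Returns 1 if all 11 exponent bits are set and the mantissa is non-zero */
static int is_nan_bits(double d) {
  uint64_t bits, exponent, mantissa;
  memcpy(&bits, &d, sizeof(bits));
  exponent = (bits >> 52) & 0x7ff;              /* 11 exponent bits */
  mantissa = bits & (((uint64_t) 1 << 52) - 1); /* low 52 bits */
  return exponent == 0x7ff && mantissa != 0;
}
```

With an all-ones exponent and a zero mantissa the value is an infinity rather than NaN, which is why the mantissa check is needed.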

NaN packing

  • NaN packing is a technique for packing values into the unused bits of IEEE 754 NaN numbers
  • V7 uses 4 bits of the mantissa to specify the type
  • The remaining 48 bits hold the actual value
//  V7 NaN-packing:
//    sign and exponent bits are all 1 (0xfff)
//    4 bits specify type
//    48 bits specify value
//
//  11111111|1111xxxx|00000000|00000000|00000000|00000000|00000000|00000000  NaN
//   NaN marker |type|  48-bit placeholder for values: pointers, strings
//

typedef uint64_t val_t;

#define V7_TAG_BOOLEAN ((uint64_t) 0xFFFC << 48)
#define V7_TAG_FUNCTION ((uint64_t) 0xFFF5 << 48)  /* JavaScript function */
#define V7_TAG_OBJECT ((uint64_t) 0xFFFF << 48)    /* JavaScript object */
#define V7_TAG_REGEXP ((uint64_t) 0xFFF2 << 48)    /* RegExp */
...
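Packing and unpacking with these tags can be sketched as follows. Only val_t, V7_TAG_BOOLEAN and V7_TAG_OBJECT come from the slide; TAG_MASK, PAYLOAD_MASK and the four helper functions are hypothetical names, and the code assumes pointers fit into 48 bits, which holds for user-space addresses on common 64-bit platforms but is not guaranteed everywhere.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t val_t;

#define V7_TAG_BOOLEAN ((uint64_t) 0xFFFC << 48)
#define V7_TAG_OBJECT ((uint64_t) 0xFFFF << 48)

/* Hypothetical masks: top 16 bits are NaN marker + type, low 48 bits are payload */
#define TAG_MASK ((uint64_t) 0xFFFF << 48)
#define PAYLOAD_MASK (~TAG_MASK)

static val_t make_boolean(int b) {
  return V7_TAG_BOOLEAN | (uint64_t) (b ? 1 : 0);
}

static int is_boolean(val_t v) {
  return (v & TAG_MASK) == V7_TAG_BOOLEAN;
}

static int to_boolean(val_t v) {
  assert(is_boolean(v));
  return (int) (v & 1);
}

static val_t make_object(void *p) {
  uint64_t bits = (uint64_t) (uintptr_t) p;
  assert((bits & TAG_MASK) == 0); /* assumption: pointer fits in 48 bits */
  return V7_TAG_OBJECT | bits;
}

static void *to_object(val_t v) {
  assert((v & TAG_MASK) == V7_TAG_OBJECT);
  return (void *) (uintptr_t) (v & PAYLOAD_MASK);
}
```

Every value, whatever its type, then travels through the engine as a plain 8-byte val_t with no separate allocation for the tag.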

Strings: naive approach

  • JavaScript strings are chunks of memory holding Unicode text
  • The naive way of representing a string is a structure (or class) that describes a vector
// On 64-bit system, struct v7_string occupies 16 bytes
struct v7_string {
  unsigned char *data;  // sizeof(pointer)
  size_t length;        // sizeof(size_t): at least 2 bytes (SIZE_MAX >= 0xffff)
};

struct v7_string *str = malloc(sizeof(*str));
  • 16 bytes per structure, plus malloc housekeeping overhead (padding, block length, etc.) of circa 8 bytes per allocation
    • paid twice: once for struct v7_string and once for data
  • For a 4-byte string: 16 + 2 × 8 = 32 bytes of overhead

Strings in V7

  • Short strings are packed directly into NaN
  • All strings are guaranteed to be 0-terminated to be suitable for C/C++ string API (e.g. strcmp())
//  11111111|1111xxxx|00000000|00000000|00000000|00000000|00000000|00000000
//   NaN marker |type|  48-bit placeholder for values: pointers, strings
//


#define V7_TAG_STRING_I ((uint64_t) 0xFFFA << 48)  /* Inlined string len < 5 */
#define V7_TAG_STRING_5 ((uint64_t) 0xFFF9 << 48)  /* Inlined string len 5 */
#define V7_TAG_STRING_O ((uint64_t) 0xFFF8 << 48)  /* Owned string */
#define V7_TAG_STRING_F ((uint64_t) 0xFFF7 << 48)  /* Foreign string */


//  Inlined string:
//
//  11111111|11111010|00000011|01101000|01101001|00100001|00000000|00000000
//   NaN marker |0xa |
//       0xfffa      |    3   |   h    |   i    |   !    |        |
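The inlined layout drawn above can be reproduced with shifts. This is a sketch that follows the diagram most-significant byte first; it is not taken from V7, whose in-memory byte order may differ, and the function names pack_inline_string and unpack_inline_string are hypothetical. Only strings shorter than 5 bytes are handled, so the lowest byte always stays 0 and acts as the terminator.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t val_t;

#define V7_TAG_STRING_I ((uint64_t) 0xFFFA << 48) /* Inlined string len < 5 */

/* Packs tag, 1-byte length and up to 4 characters into one 64-bit value */
static val_t pack_inline_string(const char *s) {
  size_t i, len = strlen(s);
  val_t v = V7_TAG_STRING_I;
  assert(len < 5);
  v |= (uint64_t) len << 40; /* length byte sits right after the tag */
  for (i = 0; i < len; i++) {
    v |= (uint64_t) (unsigned char) s[i] << (32 - 8 * i);
  }
  return v;
}

/* Copies the characters back out; out must have room for 5 bytes */
static size_t unpack_inline_string(val_t v, char *out) {
  size_t i, len = (size_t) ((v >> 40) & 0xff);
  assert(len < 5);
  for (i = 0; i < len; i++) {
    out[i] = (char) ((v >> (32 - 8 * i)) & 0xff);
  }
  out[len] = '\0';
  return len;
}
```

Packing "hi!" this way gives 0xFFFA036869210000, the same bit pattern as in the diagram.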

Strings in V7 (cont)

  • Long strings are stored in a resizable buffer
  • Each string is represented as a (length, data) tuple
  • The length is varint-encoded
  • Overhead: 16 bytes, plus 1-2 bytes for the encoded string length
//  Owned string:
//
//  11111111|11111000|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx
//   NaN marker |0x8 |           pointer to string data
//
//
//  Resizable memory buffer that holds long strings
//   |13|long string 1|12|hello world!|-----grows as new strings added --->
//
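The buffer layout above can be sketched as an append-only byte vector. The slides do not spell out the varint format, so this sketch assumes the common scheme of 7 payload bits per byte with the high bit as a continuation flag; struct strbuf and both function names are hypothetical. The sketch returns offsets rather than pointers because realloc() may move the buffer, and it omits the 0-terminator for brevity, matching the diagram.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer holding (varint length, data) records */
struct strbuf {
  char *data;
  size_t len; /* bytes used */
  size_t cap; /* bytes allocated */
};

/* Encodes n, 7 bits per byte, low bits first; returns bytes written */
static size_t encode_varint(size_t n, unsigned char *out) {
  size_t i = 0;
  do {
    unsigned char b = (unsigned char) (n & 0x7f);
    n >>= 7;
    if (n != 0) b |= 0x80; /* more bytes follow */
    out[i++] = b;
  } while (n != 0);
  return i;
}

/* Appends one record; returns its offset, or (size_t) -1 if out of memory */
static size_t strbuf_append(struct strbuf *b, const char *s, size_t n) {
  unsigned char hdr[sizeof(size_t) * 8 / 7 + 1];
  size_t hdr_len, need, off;
  assert(b != NULL && s != NULL);
  hdr_len = encode_varint(n, hdr);
  off = b->len;
  need = b->len + hdr_len + n;
  if (need > b->cap) {
    size_t cap = need * 2;
    char *p = (char *) realloc(b->data, cap);
    if (p == NULL) return (size_t) -1;
    b->data = p;
    b->cap = cap;
  }
  memcpy(b->data + off, hdr, hdr_len);
  memcpy(b->data + off + hdr_len, s, n);
  b->len = need;
  return off;
}
```

Lengths up to 127 take one header byte and lengths up to 16383 take two, which matches the 1-2 bytes quoted above for typical strings.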

Garbage Collector

  • During execution, the VM creates many values
  • They need to be garbage collected
  • V7 uses a combination of mark-and-sweep and mark-compact algorithms
  • During a garbage collection run,
    • mark phase: all values that are in use by the VM are marked
    • sweep phase: those that are not marked are reclaimed

Garbage Collector (cont)

  • V7 entities, such as objects, object properties and functions, are described by structures of fixed size
  • The most memory-efficient way to store them is a contiguous array (pool) of such entities
struct v7_object {
  /* Fixed-size data structure */
};

struct v7_property {
  /* Fixed-size data structure */
};

Using pools for storing objects

  • For each entity type (e.g. object, function) V7 has a respective pool that holds live and dead entities (nodes)
  • Dead (free) nodes are linked together into a free list
  • The free list is a simple singly-linked list
  • The first sizeof(pointer) bytes of a node are used for the linkage in the free list (a "next" pointer)

Object pool

struct pool {
  char *base;
  size_t size;
  char *free_list_head;
  size_t node_size;
};
  • Initially, when a pool is created, all nodes are added to the free list
  • Each node in the free list is marked by setting the LSB of its first pointer-sized field to 1
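Pool creation as described above can be sketched like this. struct pool is copied from the slide; pool_init and pool_count_free are hypothetical helpers, and the sketch assumes that size counts nodes (not bytes) and that node_size is a multiple of the pointer size, so that the LSB of every link is free to carry the mark.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct pool {
  char *base;
  size_t size; /* assumption: number of nodes */
  char *free_list_head;
  size_t node_size;
};

/* Allocates the pool and threads every node onto the free list */
static int pool_init(struct pool *p, size_t num_nodes, size_t node_size) {
  size_t i;
  assert(node_size >= sizeof(uintptr_t));
  assert(node_size % sizeof(uintptr_t) == 0);
  p->base = (char *) malloc(num_nodes * node_size);
  if (p->base == NULL) return 0;
  p->size = num_nodes;
  p->node_size = node_size;
  p->free_list_head = NULL;
  /* Walk backwards so that the list head ends up at the first node */
  for (i = num_nodes; i > 0; i--) {
    char *node = p->base + (i - 1) * node_size;
    /* "next" pointer with the LSB set to mark the node as free */
    *(uintptr_t *) node = (uintptr_t) p->free_list_head | 1;
    p->free_list_head = node;
  }
  return 1;
}

/* Walks the free list, masking off the mark bit to follow each link */
static size_t pool_count_free(const struct pool *p) {
  size_t n = 0;
  const char *node = p->free_list_head;
  while (node != NULL) {
    uintptr_t word = *(const uintptr_t *) node;
    node = (const char *) (word & ~(uintptr_t) 1);
    n++;
  }
  return n;
}
```

Because nodes are pointer-aligned, a real link never has its lowest bit set, so the mark and the link can share one word.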

Node allocation

  • Pool with one node allocated (note LSB 0)
  • Allocated node is taken from the free list head
  // Take a node from the head of the free list
  void *r = (void *) p->free_list_head;
  (* (uintptr_t *) r)--;  // unmark node: clear the LSB, leaving the plain "next" pointer
  p->free_list_head = * (char **) r;
  return r;

Node allocation (cont)

Pool with two nodes allocated (note LSB 0)

Node allocation (cont)

Pool with three nodes allocated (note LSB 0)

  • When all nodes are allocated, free_list_head becomes NULL
  • The next allocation triggers a pool resize
  • The resize is done by realloc()-ing base

Node allocation (cont)

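// Node allocation
void *v7_alloc_cell(struct pool *p) {
  void *r;
  if (p->free_list_head == NULL) {
    v7_pool_grow(p, p->size * 1.51);  // free list is empty: grow the pool by about 50%
    if (p->free_list_head == NULL) return NULL;  // growth failed
  }
  r = (void *) p->free_list_head;
  (* (uintptr_t *) r)--; // unmark: clear the LSB
  p->free_list_head = * (char **) r;
  return r;
}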

GC mark phase

  • The V7 object hierarchy is simple
  • All objects are traversed and marked, starting from the current activation frame and a set of root objects
  • Marking is done by setting the LSB to 1 in the node's first pointer-sized field
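The marking step can be illustrated on a toy object graph. The struct node layout below is invented for the sketch (real V7 objects are laid out differently); the only property carried over from the slides is that the mark lives in the LSB of the first pointer-sized field. The recursion stops at nodes that are already marked, which also makes cycles safe.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node: first field is pointer-sized, its LSB is the mark bit */
struct node {
  uintptr_t head;
  struct node *child[2];
};

static int is_marked(const struct node *n) {
  return (int) (n->head & 1);
}

/* Marks n and everything reachable from it */
static void mark(struct node *n) {
  size_t i;
  if (n == NULL || is_marked(n)) return; /* nothing to do, or already visited */
  n->head |= 1;
  for (i = 0; i < 2; i++) {
    mark(n->child[i]);
  }
}
```

Calling mark() once per root leaves exactly the reachable nodes with the LSB set; everything else is garbage for the sweep phase.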

GC sweep phase

  • After all live nodes are marked, each pool is scanned for unmarked nodes
  • Unmarked nodes (LSB 0) are marked and added to the free list
  • Live nodes (node3) are unmarked (LSB reset to 0)
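A sweep over one pool can be sketched as a single linear scan that rebuilds the free list. struct pool is taken from the earlier slide; pool_sweep is a hypothetical name. The sketch assumes that, when the sweep starts, the LSB is set only on live nodes, meaning that nodes already on the free list had their mark cleared before the mark phase; the slides do not state how V7 tells the two apart.

```c
#include <stddef.h>
#include <stdint.h>

struct pool {
  char *base;
  size_t size; /* assumption: number of nodes */
  char *free_list_head;
  size_t node_size;
};

/* Rebuilds the free list from scratch; returns the number of live nodes */
static size_t pool_sweep(struct pool *p) {
  size_t i, live = 0;
  p->free_list_head = NULL;
  for (i = p->size; i > 0; i--) {
    char *node = p->base + (i - 1) * p->node_size;
    uintptr_t *word = (uintptr_t *) node;
    if (*word & 1) {
      /* Live node: clear the mark so the next GC run starts clean */
      *word &= ~(uintptr_t) 1;
      live++;
    } else {
      /* Dead node: link it into the free list and mark it as free */
      *word = (uintptr_t) p->free_list_head | 1;
      p->free_list_head = node;
    }
  }
  return live;
}
```

Each node is visited exactly once and the list is threaded through the dead nodes themselves, so the pass needs no memory beyond the pool.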

GC summary

  • GC complexity is linear
    •  O(N) , N is the number of nodes in a pool
  • GC takes no extra memory

Thank you!

contact me at

sergey.lyubka@cesanta.com

This presentation describes memory optimization and garbage collection techniques used by V7 embedded JavaScript engine
