Data Oriented Programming

by Gendo (aka Daniele Maccioni)

Moore's Law

Hello! I'm important too!

Memory Gap

1980

RAM latency = ~1-2 cycles

2015

RAM latency = ~200 cycles

Cache Hit :)

Fetch instruction or data
(L1-I vs L1-D)
Search for a cache entry with correct tag
Load instruction

Cache Miss D:

Every layer is an order of magnitude worse than the previous one
L1 -> L2 -> RAM
If everything fails and we need to access RAM, we can spend HUNDREDS of cycles just waiting

Spatial Locality

If you reference a memory location, you are likely to reference nearby locations too

Temporal Locality

If you reference a memory location, you are likely to reference it again in the near future

When a byte of code or data is loaded into cache from RAM, a whole chunk of contiguous memory is fetched along with it: a cache line
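
To make spatial locality and cache lines a bit more concrete, here is a minimal sketch that sums the same matrix in two traversal orders. The names `Matrix`, `sumRowMajor` and `sumColumnMajor` are hypothetical and not part of the original slides. Both functions return the same value; on typical hardware the row-major walk tends to be faster for large matrices because consecutive accesses land in the same cache line, though the actual gap depends on the machine and the compiler.

```cpp
#include <cstddef>
#include <vector>

// A rows x cols matrix stored contiguously in row-major order.
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<int> cells; // element (r, c) lives at cells[r * cols + c]
};

// Walks memory sequentially: neighbouring elements share a cache line.
long long sumRowMajor(const Matrix &m) {
    long long total = 0;
    for (std::size_t r = 0; r < m.rows; ++r)
        for (std::size_t c = 0; c < m.cols; ++c)
            total += m.cells[r * m.cols + c];
    return total;
}

// Jumps `cols` elements on every step: for wide matrices, each access
// is likely to touch a different cache line.
long long sumColumnMajor(const Matrix &m) {
    long long total = 0;
    for (std::size_t c = 0; c < m.cols; ++c)
        for (std::size_t r = 0; r < m.rows; ++r)
            total += m.cells[r * m.cols + c];
    return total;
}
```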

Memory is the bottleneck...

...and if you care about performance...

...hardware, not software, is the platform!

Intel Haswell i7-4770

32 KB L1
256 KB L2
8 MB L3
64 B cache line

L1 latency ~4-5 cycles
L2 latency ~12 cycles
L3 latency ~43 cycles
RAM latency ~230 cycles
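
As a back-of-the-envelope aid, the latencies above can be combined into an expected cost per memory access. The sketch below is an assumption-laden model, not a measurement: it treats each quoted latency as the total cost of an access that hits at that level, takes the low end of the L1 range, and expects the caller to supply hit rates, which are not given in the slides. The function name `expectedLatency` is hypothetical.

```cpp
// Latencies in cycles, taken from the slide above (L1 uses the low end of ~4-5).
constexpr double kL1 = 4.0;
constexpr double kL2 = 12.0;
constexpr double kL3 = 43.0;
constexpr double kRam = 230.0;

// Expected cycles per access. Each hit rate is "local": it is the fraction
// of the accesses that reached that level and were served by it.
double expectedLatency(double hitL1, double hitL2, double hitL3) {
    const double reachL2 = 1.0 - hitL1;               // missed L1
    const double reachL3 = reachL2 * (1.0 - hitL2);   // missed L1 and L2
    const double reachRam = reachL3 * (1.0 - hitL3);  // missed every cache
    return hitL1 * kL1
         + reachL2 * hitL2 * kL2
         + reachL3 * hitL3 * kL3
         + reachRam * kRam;
}
```

Under this model, if half of the accesses miss L1 and are all served by L2, the expected cost doubles from 4 to 8 cycles; if every access goes to RAM, it is the full 230 cycles.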

OOP Is Not So Great...

Code following a model of the world
Creating independent "reusable" objects
Hiding complexity
Code and data mixed together
Code is more important than data
Array of structures or...
...array of pointers to structure

class Object {
public:
    void addChild(Object *child);
    void removeChild(Object *child);
    virtual void update();
private:
    int ID;
    int width;
    int height;
    std::vector<Object *> children;
};

class MovableObject: public Object {
public:
    void translate(int x, int y);
private:
    int x;
    int y;
};

Side Effects

Memory is very fragmented: code mixed with data, polymorphism, encapsulation, templates...
High complexity
Very difficult to understand what's going on under the hood
Tons of cache misses
Memory unfriendly
The cache will hate you
...

"General" Solutions

class ObjectManager : public Manager {
public:
    // ...
    
    void initObject(Object *);
    void updateObject(Object *);
    void removeObject(Object *);
    

    // ...
};

Do we really have only one object?

The common case for data is not considered

Branch Mispredictions

class SystemNetwork : public SystemSocket {
public:
    // ...

    void sendMessage(Message message) {
        int message_type = message.type;

        if (message_type == Message::Type::Text) {
            // ...
        } else if (message_type == Message::Type::Binary) {
            // ...
        }

        if (inactive) {
            // ...
        }
    }

    // ...
};

Difficult to predict the code path
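
One possible way to make the code path more predictable is to group messages by type up front, so that each processing loop runs a single path. The sketch below is only an illustration under that assumption; `MessageQueues`, `enqueue` and `processAll` are hypothetical names, and `Message` is only loosely modelled on the slide.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical message, loosely modelled on the slide's Message.
struct Message {
    enum class Type { Text, Binary };
    Type type;
    std::string payload;
};

// One queue per message type: the type check happens once, when enqueueing.
struct MessageQueues {
    std::vector<Message> text;
    std::vector<Message> binary;
};

void enqueue(MessageQueues &queues, const Message &message) {
    if (message.type == Message::Type::Text)
        queues.text.push_back(message);
    else
        queues.binary.push_back(message);
}

// Each loop handles a single message type, so its body has no type branch.
// Returns the number of payload bytes seen, as a stand-in for real work.
std::size_t processAll(const MessageQueues &queues) {
    std::size_t bytes = 0;
    for (const Message &m : queues.text)
        bytes += m.payload.size(); // text-only handling would go here
    for (const Message &m : queues.binary)
        bytes += m.payload.size(); // binary-only handling would go here
    return bytes;
}
```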

Data Oriented Principles

Guidelines to create simpler code...
...and cleaner code paths...
...that are also cache-friendly and more efficient

It's All About Data

Software is a sequence of data transformations
Problems are always about data
Computers are data processing machines

Code Designed Around a Model of Data

Data flow is the focus
How data is read, how it is processed, how it is stored in memory
Follow the nature of the problem data: minimize transformations

Implement the Common Case

What's the common case for the data I'm dealing with?
Implement the common case, not the "general" solution

// The Common Case
void updateObjects(Object *objects, int count) {
    // ...
}

class Object {
    // The 0.01% case: I will always have multiple objects!
    void update() {}
};

Separate Code From Data

Make data emerge from the code
Simpler code
Pipeline of data transformations

class Object {
    // ...
    int x;
    int y;
    void move(int x, int y);
    // ...
};

Point2D positions[COUNT_OBJS];
Point2D movements[COUNT_OBJS];

void moveObjects() {
    for (int i = 0; i < COUNT_OBJS; ++i) {
        positions[i] += movements[i];
    }
}
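
The array-based version above leaves `Point2D` and `COUNT_OBJS` undefined. A minimal self-contained sketch could look like the following, assuming that `Point2D` is a plain pair of integers with a component-wise `+=` and that `COUNT_OBJS` is a small compile-time constant; both are guesses made for illustration, not definitions from the slides.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t COUNT_OBJS = 4; // hypothetical object count

// Assumed layout: a plain pair of ints with component-wise addition.
struct Point2D {
    int x;
    int y;
    Point2D &operator+=(const Point2D &other) {
        x += other.x;
        y += other.y;
        return *this;
    }
};

// Data lives in flat arrays, separate from the code that transforms it.
std::array<Point2D, COUNT_OBJS> positions{};
std::array<Point2D, COUNT_OBJS> movements{};

// One tight loop over contiguous memory: no objects, no virtual calls.
void moveObjects() {
    for (std::size_t i = 0; i < COUNT_OBJS; ++i)
        positions[i] += movements[i];
}
```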

Packing Data

Avoid branching
Avoid complex code paths

class Object {
    // ...
    void update() {
        if (active) {
            // ...
        } else {
            // ...
        }
    }
    // ...
};

void updateObjects(Object *objects) {

    // Active objects are packed at the front, so the loop needs no branch
    int numActives = sortByActive(objects);

    for (int i = 0; i < numActives; ++i) {
        // ...
        Object *obj = &objects[i];
        // ...
    }
}
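
The slides do not show how `sortByActive` works. One plausible sketch, assuming objects are stored by value in a `std::vector` and that their relative order does not matter, uses `std::partition` to move the active objects to the front. The `Object` fields here are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical object: only the fields needed for the example.
struct Object {
    int id;
    bool active;
    int ticks; // how many times this object has been updated
};

// Moves active objects to the front and returns how many there are.
std::size_t sortByActive(std::vector<Object> &objects) {
    auto firstInactive = std::partition(
        objects.begin(), objects.end(),
        [](const Object &o) { return o.active; });
    return static_cast<std::size_t>(firstInactive - objects.begin());
}

// The loop body runs only on active objects, with no per-object branch.
void updateObjects(std::vector<Object> &objects) {
    const std::size_t numActives = sortByActive(objects);
    for (std::size_t i = 0; i < numActives; ++i)
        ++objects[i].ticks; // stand-in for the real per-object update
}
```

`std::partition` does not preserve relative order; if order matters, `std::stable_partition` is a drop-in alternative at some extra cost.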

Hot/Cold Splitting

Split frequently used (hot) data from rarely used (cold) data
Reduce the size of objects and structs in memory
Make data flow more explicit
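
As an illustration of hot/cold splitting, the sketch below keeps the fields touched on every update in one small struct and moves rarely read metadata into a parallel array. All names (`ParticleHot`, `ParticleCold`, `Particles`, `integrate`) are hypothetical, and which fields count as hot is an assumption that depends on the actual access pattern.

```cpp
#include <string>
#include <vector>

// Hot data: touched on every update, kept small so more fit per cache line.
struct ParticleHot {
    float x, y;
    float vx, vy;
};

// Cold data: read rarely (for example by tools or debugging code).
struct ParticleCold {
    std::string name;
    std::string createdBy;
};

// Parallel arrays: element i of each array describes the same particle.
struct Particles {
    std::vector<ParticleHot> hot;
    std::vector<ParticleCold> cold;
};

// Walks only the hot array; the cold data never enters the cache here.
void integrate(Particles &particles, float dt) {
    for (ParticleHot &h : particles.hot) {
        h.x += h.vx * dt;
        h.y += h.vy * dt;
    }
}
```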

Avoid Polymorphism

Simpler structure
Easy memory layout
Arrays of simple homogeneous data are better than complex hierarchies
Avoid vtables
Avoid memory fragmentation
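
A small sketch of what replacing a polymorphic hierarchy with homogeneous arrays could look like: instead of a `std::vector` of base-class pointers with a virtual `area()`, each concrete type gets its own array and its own loop. `Rect`, `Triangle`, `Shapes` and `totalArea` are hypothetical names chosen for the example.

```cpp
#include <vector>

// Plain data types: no base class, no virtual functions, no vtable pointer.
struct Rect {
    float width;
    float height;
};

struct Triangle {
    float base;
    float height;
};

// One homogeneous, contiguous array per concrete type.
struct Shapes {
    std::vector<Rect> rects;
    std::vector<Triangle> triangles;
};

// Each loop runs the same code over the same kind of data.
float totalArea(const Shapes &shapes) {
    float area = 0.0f;
    for (const Rect &r : shapes.rects)
        area += r.width * r.height;
    for (const Triangle &t : shapes.triangles)
        area += 0.5f * t.base * t.height;
    return area;
}
```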

Happy Cache

Data is gathered together in homogeneous chunks of memory
Memory layout is simpler
More predictable code paths
One array for each type of data

Logic grouped together to use what's already in cache
No virtual methods and tables that make you jump around in memory

What can we do?

Flatten the hierarchy
Avoid array of pointers
Extract data from code
Identify the transformation flow
Pre-allocate memory
Group similar operations together
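
Several of these points can be combined in one small sketch: a parent/child hierarchy stored as flat arrays, with parents referenced by index rather than by pointer, and storage reserved up front. This is one possible layout under stated assumptions, not the approach prescribed by the slides; `Scene`, `makeScene`, `addNode` and `updateWorld` are hypothetical names, and the code assumes parents are always added before their children.

```cpp
#include <cstddef>
#include <vector>

// Flat scene: one array per field, parents stored as indices.
struct Scene {
    std::vector<int> parent; // index of the parent, or -1 for a root
    std::vector<int> localX; // position relative to the parent
    std::vector<int> worldX; // absolute position, filled by updateWorld
};

// Pre-allocates storage so adding nodes does not reallocate.
Scene makeScene(std::size_t capacity) {
    Scene scene;
    scene.parent.reserve(capacity);
    scene.localX.reserve(capacity);
    scene.worldX.reserve(capacity);
    return scene;
}

// Assumes the parent was added earlier, so its index is smaller.
int addNode(Scene &scene, int parent, int localX) {
    scene.parent.push_back(parent);
    scene.localX.push_back(localX);
    scene.worldX.push_back(0);
    return static_cast<int>(scene.parent.size()) - 1;
}

// Single linear pass: a parent's worldX is final before its children are reached.
void updateWorld(Scene &scene) {
    for (std::size_t i = 0; i < scene.parent.size(); ++i) {
        const int p = scene.parent[i];
        const int base = (p < 0) ? 0 : scene.worldX[static_cast<std::size_t>(p)];
        scene.worldX[i] = base + scene.localX[i];
    }
}
```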

Questions?
