Data-Oriented Design

or How I Learned to Stop Worrying and Love the Cache

Robert Rouhani

What is it?

A paradigm that focuses on data, not objects

Why?

Cache misses are SLOW
Classes have overhead

Massive performance gap

http://gameprogrammingpatterns.com/images/data-locality-chart.png

Distance

What Is Cache?

But WHY?

You need to process a LOT of data
Code is time-critical
Examples:

Particle systems in games
Data analysis
Anything embedded

An example

class Particle {
    Vec3 position;
    Vec3 velocity;
    Color color;
    
    void update() {
        position.x += velocity.x;
        position.y += velocity.y;
        position.z += velocity.z;
    }

    void render(GraphicsContext& ctx) {
        //...
    }
};

What's wrong, Robert?

class Particle {
    Vec3 position;
    Vec3 velocity;
    Color color;                // <-- Cached but unused in update()

    void update() {             // <-- Potential i-cache miss per particle
        position.x += velocity.x;   // <--
        position.y += velocity.y;   // <-- Potential data misses
        position.z += velocity.z;   // <--
    }

    void render(GraphicsContext& ctx) {
        //...
    }
};

Worst case

void update() {
    position.x += velocity.x;
    position.y += velocity.y;
    position.z += velocity.z;
}

For (only) 4 particles:

one.update() -> i-cache miss (~600 cycles)

position.x -> data cache miss (~600 cycles)

velocity.x -> data cache miss (~600 cycles)

vector addition -> (~6 cycles)

600 + 600 + 600 + 6 = 1806 cycles

1806 * 4 = 7224 cycles

For only about 24 cycles of meaningful processing
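The worst-case arithmetic above can be checked at compile time. This is only a sketch: the cycle counts are the illustrative figures from the slide, not measurements, and the constant names and `useful_percent` are made up for this example.

```cpp
// Illustrative costs taken from the slide; real latencies depend on the hardware.
constexpr int kMissCycles = 600;  // one cache miss (instruction or data)
constexpr int kAddCycles = 6;     // the vector addition itself
constexpr int kParticles = 4;

// Per particle: one i-cache miss, two data cache misses, one addition.
constexpr int kPerParticle = 3 * kMissCycles + kAddCycles;
constexpr int kTotal = kPerParticle * kParticles;
constexpr int kUseful = kAddCycles * kParticles;

static_assert(kPerParticle == 1806, "per-particle worst case");
static_assert(kTotal == 7224, "worst case for 4 particles");
static_assert(kUseful == 24, "cycles spent on real work");

// Share of the total cycles that does meaningful work, in percent.
constexpr double useful_percent() {
    return 100.0 * kUseful / kTotal;
}
```

Under these assumed numbers, well under 1% of the cycles go to the addition itself.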

The Solution (Part 1)

class ParticleManager {
    std::vector<Vec3> positions;     // <-- Data is stored sequentially,
    std::vector<Vec3> velocities;    // <-- not in bits and pieces all
    std::vector<Color> colors;       // <-- over the heap

    void update() {     // <-- Reduce number of i-cache misses to at most 1
        for (std::size_t i = 0; i < positions.size(); i++) {
            positions[i].x += velocities[i].x;  // <-- Read data sequentially
            positions[i].y += velocities[i].y;  // <-- to minimize the number
            positions[i].z += velocities[i].z;  // <-- of data cache misses
        }
    }
};

This is still sub-optimal: a particle's position and velocity are now very far apart in memory.

The Remaining Problem

Position and velocity vectors are stored separately
Causes 2 cache misses (one per vector) each time we finish a cache line
We can still reduce this!

The Solution (Part 2)

struct ParticleMotionData {
    Vec3 position;
    Vec3 velocity;
};

class ParticleManager {
    std::vector<ParticleMotionData> motion;  // <-- Stored together now
    std::vector<Color> colors;

    void update() {
        for (std::size_t i = 0; i < motion.size(); i++) {
            motion[i].position.x += motion[i].velocity.x;
            motion[i].position.y += motion[i].velocity.y;
            motion[i].position.z += motion[i].velocity.z;
        }
    }
};

Data that is commonly used together should be stored together, sequentially.
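For anyone who wants to compile the Part 2 layout, here is a minimal self-contained sketch. It assumes `Vec3` is three floats (the slides never define it), and the `add` and `position` helpers are hypothetical additions for this example, not part of the original design.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: Vec3 is three packed floats.
struct Vec3 {
    float x, y, z;
};

struct ParticleMotionData {
    Vec3 position;
    Vec3 velocity;
};

// Position and velocity of one particle sit side by side with no padding.
static_assert(sizeof(ParticleMotionData) == 6 * sizeof(float),
              "expected a tightly packed motion record");

class ParticleManager {
public:
    // Hypothetical helper: append one particle to the contiguous array.
    void add(Vec3 position, Vec3 velocity) {
        motion_.push_back({position, velocity});
    }

    // One linear pass over contiguous memory.
    void update() {
        for (ParticleMotionData& m : motion_) {
            m.position.x += m.velocity.x;
            m.position.y += m.velocity.y;
            m.position.z += m.velocity.z;
        }
    }

    // Hypothetical accessor, bounds-checked in debug builds.
    const Vec3& position(std::size_t i) const {
        assert(i < motion_.size());
        return motion_[i].position;
    }

private:
    std::vector<ParticleMotionData> motion_;
};
```

Colors are left out here because `update()` never touches them; in the slide's design they live in their own vector.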

Performance

No hard numbers on the example (sorry)
Other presentations show 2x-4x performance
This Sony presentation:

Reduction from 19.6 ms to 4.8 ms
Only moving data around in memory!

Sony Presentation

Better design

Easier to isolate actions
Easier to serialize
Easier to send over a network
Easier to make parallel
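A rough sketch of why serialization gets easier: when the records are plain data in one contiguous array, the whole array can be copied to or from a byte buffer in a single call. `Vec3` as three floats and the `serialize`/`deserialize` names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Assumption: Vec3 is three packed floats.
struct Vec3 {
    float x, y, z;
};

struct ParticleMotionData {
    Vec3 position;
    Vec3 velocity;
};

// A raw byte copy is only valid for trivially copyable types.
static_assert(std::is_trivially_copyable<ParticleMotionData>::value,
              "motion records must be plain data");

// Copy the whole array into a byte buffer with one memcpy.
std::vector<unsigned char> serialize(const std::vector<ParticleMotionData>& motion) {
    std::vector<unsigned char> bytes(motion.size() * sizeof(ParticleMotionData));
    if (!bytes.empty()) {
        std::memcpy(bytes.data(), motion.data(), bytes.size());
    }
    return bytes;
}

// Rebuild the array from a buffer produced by serialize().
std::vector<ParticleMotionData> deserialize(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() % sizeof(ParticleMotionData) == 0);
    std::vector<ParticleMotionData> motion(bytes.size() / sizeof(ParticleMotionData));
    if (!bytes.empty()) {
        std::memcpy(motion.data(), bytes.data(), bytes.size());
    }
    return motion;
}
```

Raw bytes like this only round-trip between builds with the same endianness and struct layout, so a real network format would need more care.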

Multi-threading

Create thread pool
Divide your array of data into chunks
Assign threads to chunks of data

It's THAT simple!
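The chunking idea can be sketched as follows. To stay short, this version starts one `std::thread` per chunk instead of reusing a real thread pool. `Vec3` as three floats and the names `update_range` and `parallel_update` are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Assumption: Vec3 is three packed floats.
struct Vec3 {
    float x, y, z;
};

struct ParticleMotionData {
    Vec3 position;
    Vec3 velocity;
};

// Update the half-open range [begin, end). Ranges handed to different
// threads never overlap, so no locking is needed.
void update_range(std::vector<ParticleMotionData>& motion,
                  std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        motion[i].position.x += motion[i].velocity.x;
        motion[i].position.y += motion[i].velocity.y;
        motion[i].position.z += motion[i].velocity.z;
    }
}

// Split the array into at most num_threads contiguous chunks and
// update each chunk on its own thread.
void parallel_update(std::vector<ParticleMotionData>& motion,
                     std::size_t num_threads) {
    assert(num_threads > 0);
    // Chunk size rounded up so every element is covered.
    const std::size_t chunk = (motion.size() + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(t * chunk, motion.size());
        const std::size_t end = std::min(begin + chunk, motion.size());
        if (begin == end) {
            break;  // no elements left to hand out
        }
        workers.emplace_back([&motion, begin, end] {
            update_range(motion, begin, end);
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
}
```

Because each thread owns a contiguous slice, the sequential access pattern from the single-threaded version is preserved inside every chunk.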

Thanks

Professor Goldschmidt

Professor Moorthy

Sean O'Sullivan

RCOS

Questions?
