Data Oriented design

or How I Learned to Stop Worrying and Love the Cache









Robert Rouhani

What is it?

A paradigm that focuses on data , not objects

Why?


  • Cache misses are SLOW
  • Classes have overhead

Massive performance gap

Distance

What Is Cache?


But WHY?


  • You need to process a LOT of data
  • Code is time-critical
  • Examples:
    • Particle systems in games
    • Data analysis
    • Anything embedded

An example


class Particle {
    Vec3 position;
    Vec3 velocity;
    Color color;
    
    void update() {
        position.x += velocity.x;
        position.y += velocity.y;
        position.z += velocity.z;
    }

    void render(GraphicsContext& ctx) {
        //...
    }
};

What's wrong, robert?


class Particle {
    Vec3 position;
    Vec3 velocity;
    Color color;                <---- Cached but unused in update()
    
    void update() {             <---- Potential i-cache miss per particle
        position.x += velocity.x;   <---
        position.y += velocity.y;   <--- Potential data misses
        position.z += velocity.z;   <---
    }

    void render(GraphicsContext& ctx) {
        //...
    }
};

Worst case

void update() {
    position.x += velocity.x;
    position.y += velocity.y;
    position.z += velocity.z;
}
For (only) 4 particles:
one.update() -> i-cache miss (~600 cycles)
position.x -> data cache miss (~600 cycles)
velocity.x -> data cache miss (~600 cycles)
vector addition -> (~6 cycles)

600 + 600 + 600 + 6 = 1806 cycles
1806 * 4 = 7224 cycles
For only about 24 cycles of meaningful processing

The Solution (Part 1)


class ParticleManager {
    std::vector<Vec3> positions;     <--- Data is stored sequentially,
    std::vector<Vec3> velocities;    <--- not in bits and pieces all
    std::vector<Color> colors;       <--- over the heap

    void update() {     <---- Reduce number of i-cache misses to at most 1
        for (int i = 0; i < positions.size(); i++){
             positions[i].x += velocities[i].x;  <-- Read data sequentially
             positions[i].y += velocities[i].y;  <-- to minimize the number
             positions[i].z += velocities[i].z;  <-- of data cache misses
        }
    }
};

This is still sub-optimal, a particle's position and velocity are now very far apart in memory.

The Remaining problem


  • Position and Velocity vectors separate
  • Causes 2 cache misses when done with row
  • We can still reduce this!


The Solution (Part 2)


struct ParticleMotionData {
    Vec3 position;
    Vec3 velocity;
};
class ParticleManager {
    std::vector<ParticleMotionData> motion;  <-- Stored together now
    std::vector<Color> colors;

    void update() {
        for (int i = 0; i < motion.size(); i++) {
            motion[i].position.x += motion[i].velocity.x;
            motion[i].position.y += motion[i].velocity.y;
            motion[i].position.z += motion[i].velocity.z;
        }
    }
};
Data that is commonly used together should be stored together, sequentially.

performance


  • No hard numbers on the example (sorry)
  • Other presentations show 2x-4x performance
  • This Sony presentation
    • Reduction from 19.6ms to 4.8ms
    • Only moving data around in memory!

Sony Presentation

better design


  • Easier to isolate actions
  • Easier to serialize
  • Easier to send over a network
  • Easier to make parallel

Multi-threading


  • Create thread pool
  • Divide your array of data into chunks
  • Assign threads to chunks of data



It's THAT simple!

Further reading

Thanks


Professor Goldschmidt
Professor Moorthy
Sean O' Sullivan
RCOS




Questions?

Data Oriented design

By Robert Rouhani

Data Oriented design

  • 3,157