Data-Oriented Software Design 101
How conventional object-oriented techniques kill performance and what you can do about it.
by Christopher Myburgh
Innocent looking object-oriented code

    namespace ObjectOrientedPhysicsEngine
    {
        class Body
        {
            public Vector3 Position;
            public Vector3 Velocity;
            // members for other data: mass, net force, damping, angular motion etc.
        }

        public class World
        {
            private List<Body> _bodies;

            public void Step(float dt)
            {
                // ...
                foreach (Body body in _bodies)
                {
                    body.Position += body.Velocity * dt;
                }
                // ...
            }
        }
    }

Body is a reference type. Body objects could be scattered all over heap memory. Every iteration of the loop can incur a cache-miss.
What's a cache-miss?

• Cache is small, fast memory located on the CPU.
• Programmers never work with cache directly, but data must be present in cache before the CPU can process it.
• A “cache-miss” is when the data required for an operation is not in cache and must first be retrieved from main memory.
• Data is copied from main memory to cache in chunks. So when a cache-miss occurs, memory adjacent to the desired data is copied as well.
• Frequent cache-misses are bad because main memory is much slower than the CPU in most computer systems. So when data needs to be copied to cache, the CPU must idle, wasting cycles until the data arrives from main memory.
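The point about data being copied in chunks can be made concrete with a rough estimate. The sketch below counts how many distinct chunks (cache lines) a loop touches for a given memory layout. It is only an illustration: the class and method names are hypothetical, the 64-byte line size is an assumption that varies by hardware, and real caches add effects such as prefetching that this ignores.

```csharp
using System.Collections.Generic;

public static class CacheLineEstimate
{
    // Assumption: 64-byte cache lines. This is common but hardware-dependent.
    public const int CacheLineSize = 64;

    // Counts the distinct cache lines touched when reading `bytesRead` bytes
    // at the start of each of `count` elements that are laid out
    // `strideBytes` apart, starting from a line-aligned base address.
    public static int LinesTouched(int count, int strideBytes, int bytesRead)
    {
        var lines = new HashSet<long>();
        for (long i = 0; i < count; ++i)
        {
            long firstLine = (i * strideBytes) / CacheLineSize;
            long lastLine = (i * strideBytes + bytesRead - 1) / CacheLineSize;
            for (long line = firstLine; line <= lastLine; ++line)
            {
                lines.Add(line);
            }
        }
        return lines.Count;
    }
}
```

Under these assumptions, reading 1000 tightly packed 12-byte values (`LinesTouched(1000, 12, 12)`) touches 188 lines, while reading the same 12 bytes from 1000 elements spaced 64 bytes apart (`LinesTouched(1000, 64, 12)`) touches 1000 lines. The fewer lines a loop touches, the fewer cache-misses it can incur.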
Improvement attempt no. 1

    namespace ObjectOrientedPhysicsEngine2
    {
        struct Body
        {
            public Vector3 Position;
            public Vector3 Velocity;
            // members for other data: mass, net force, damping, angular motion etc.
        }

        public class World
        {
            private Body[] _bodies;
            private int _bodyCount;

            public void Step(float dt)
            {
                // ...
                for (int i = 0; i < _bodyCount; ++i)
                {
                    _bodies[i].Position += _bodies[i].Velocity * dt;
                }
                // ...
            }
        }
    }

Body is now a value type. Bodies are now allocated together in one contiguous block of memory, so many bodies will be read into cache at a time. Cache-misses are reduced, but cache memory is still being wasted with data not relevant to the operation.
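One detail of the struct version is worth spelling out: indexing an array of structs yields the element itself, whereas assigning an element to a local variable yields a copy. The sketch below illustrates the difference. It assumes `System.Numerics.Vector3` as a stand-in for the slide's `Vector3` (the slides do not say which library it comes from), and `StructArrayDemo` with its two methods is a hypothetical name used only for this illustration.

```csharp
using System.Numerics;

public struct Body
{
    public Vector3 Position;
    public Vector3 Velocity;
}

public static class StructArrayDemo
{
    // Updates the bodies in place: bodies[i] on an array refers to the
    // element itself, so the write lands in the contiguous block.
    public static void StepInPlace(Body[] bodies, int count, float dt)
    {
        for (int i = 0; i < count; ++i)
        {
            bodies[i].Position += bodies[i].Velocity * dt;
        }
    }

    // Pitfall: a struct assigned to a local is a copy, so this loop
    // computes new positions and then discards them.
    public static void StepOnCopies(Body[] bodies, int count, float dt)
    {
        for (int i = 0; i < count; ++i)
        {
            Body copy = bodies[i];
            copy.Position += copy.Velocity * dt; // the array element is unchanged
        }
    }
}
```

This is also a reason to prefer a plain array over `List<Body>` here: the indexer of `List<T>` returns a copy of a struct element, so the compiler rejects `_bodies[i].Position += ...` on a list of structs.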
Data-oriented programming to the rescue

    namespace DataOrientedPhysicsEngine
    {
        public class World
        {
            private Vector3[] _bodyPositions;
            private Vector3[] _bodyVelocities;
            // arrays for other body data: mass, net force, damping, angular motion etc.
            private int _bodyCount;

            public void Step(float dt)
            {
                // ...
                for (int i = 0; i < _bodyCount; ++i)
                {
                    _bodyPositions[i] += _bodyVelocities[i] * dt;
                }
                // ...
            }
        }
    }

Flatten the Body type into arrays for each member. Minimal cache-misses! Only the data relevant to the operation is now read into cache, and it's all in contiguous memory!
Another example: scene graph

    namespace ObjectOrientedSceneGraph
    {
        class SceneNode
        {
            public SceneNode Parent;
            public List<SceneNode> Children;
            public Matrix LocalTransform;
            public Matrix WorldTransform;
        }

        public class Scene
        {
            private SceneNode _rootNode;

            private void UpdateChildTransforms(SceneNode node)
            {
                foreach (SceneNode childNode in node.Children)
                {
                    childNode.WorldTransform = childNode.LocalTransform * node.WorldTransform;
                    UpdateChildTransforms(childNode);
                }
            }

            public void Draw()
            {
                // update world transforms
                _rootNode.WorldTransform = _rootNode.LocalTransform;
                UpdateChildTransforms(_rootNode);
                // ...
            }
        }
    }

Multiple heap allocations per scene node! Recursive updates require jumping all over heap memory! Cache-misses galore!
The data-oriented take

    namespace DataOrientedSceneGraph
    {
        public class Scene
        {
            private int[][] _parentNodeIndices;
            private Matrix[][] _localTransforms;
            private Matrix[][] _worldTransforms;
            private int[] _sceneNodeCounts;
            private int _graphHeight;

            public void Draw()
            {
                // update world transforms
                _worldTransforms[0][0] = _localTransforms[0][0];
                for (int i = 1; i < _graphHeight; ++i)
                {
                    for (int j = 0; j < _sceneNodeCounts[i]; ++j)
                    {
                        int parentNodeIndex = _parentNodeIndices[i][j];
                        _worldTransforms[i][j] = _localTransforms[i][j] * _worldTransforms[i - 1][parentNodeIndex];
                    }
                }
                // ...
            }
        }
    }

An array for each level of the graph, sorted by parent node index. No more recursion. The graph is updated one level at a time, processing data in the same order as it is laid out in memory. Super cache-friendly win!
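The slide does not show how the per-level arrays get filled. One possible approach, sketched below, is a breadth-first walk over an ordinary node tree: visiting each level's nodes in order and appending their children produces the next level already sorted by parent node index. This is an assumption about how such a layout could be built, not the author's method. `FlatScene`, `FromTree` and `UpdateWorldTransforms` are hypothetical names, and `System.Numerics.Matrix4x4` stands in for the slide's `Matrix` type.

```csharp
using System.Collections.Generic;
using System.Numerics;

public sealed class SceneNode
{
    public List<SceneNode> Children = new List<SceneNode>();
    public Matrix4x4 LocalTransform = Matrix4x4.Identity;
}

public sealed class FlatScene
{
    public readonly int[][] ParentNodeIndices;
    public readonly Matrix4x4[][] LocalTransforms;
    public readonly Matrix4x4[][] WorldTransforms;
    public readonly int[] SceneNodeCounts;
    public readonly int GraphHeight;

    private FlatScene(List<int[]> parents, List<Matrix4x4[]> locals)
    {
        GraphHeight = locals.Count;
        ParentNodeIndices = parents.ToArray();
        LocalTransforms = locals.ToArray();
        WorldTransforms = new Matrix4x4[GraphHeight][];
        SceneNodeCounts = new int[GraphHeight];
        for (int i = 0; i < GraphHeight; ++i)
        {
            SceneNodeCounts[i] = locals[i].Length;
            WorldTransforms[i] = new Matrix4x4[locals[i].Length];
        }
    }

    // Hypothetical helper: flattens a node tree into one array per level.
    public static FlatScene FromTree(SceneNode root)
    {
        var parents = new List<int[]>();
        var locals = new List<Matrix4x4[]>();
        var level = new List<SceneNode> { root };
        var levelParents = new List<int> { -1 }; // the root has no parent

        while (level.Count > 0)
        {
            parents.Add(levelParents.ToArray());
            locals.Add(level.ConvertAll(n => n.LocalTransform).ToArray());

            // Parents are visited in index order, so the next level comes
            // out sorted by parent node index without a separate sort.
            var next = new List<SceneNode>();
            var nextParents = new List<int>();
            for (int p = 0; p < level.Count; ++p)
            {
                foreach (SceneNode child in level[p].Children)
                {
                    next.Add(child);
                    nextParents.Add(p);
                }
            }
            level = next;
            levelParents = nextParents;
        }
        return new FlatScene(parents, locals);
    }

    // Same level-by-level update as on the slide.
    public void UpdateWorldTransforms()
    {
        WorldTransforms[0][0] = LocalTransforms[0][0];
        for (int i = 1; i < GraphHeight; ++i)
        {
            for (int j = 0; j < SceneNodeCounts[i]; ++j)
            {
                int parentNodeIndex = ParentNodeIndices[i][j];
                WorldTransforms[i][j] = LocalTransforms[i][j] * WorldTransforms[i - 1][parentNodeIndex];
            }
        }
    }
}
```

A scene built this way has to be rebuilt, or carefully patched, whenever nodes are added, removed or reparented, which is part of the cost of the flat layout.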
Cons of data-oriented design

• A system's public interface can become more restrictive and less elegant for clients to use.
• The vast majority of a system's data and logic tends to end up in a single, massive class.
• Inserting and removing data from the system can become far more complex, and thus prone to bugs that can corrupt the state of the entire system.
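The last point can be illustrated with a small sketch of insertion and removal on parallel arrays. It uses the common "swap-remove" technique, which is swapped in here for illustration and is not taken from the slides. `BodyStore` and its members are hypothetical names, and `System.Numerics.Vector3` is assumed as the vector type.

```csharp
using System;
using System.Numerics;

public sealed class BodyStore
{
    private Vector3[] _positions = new Vector3[4];
    private Vector3[] _velocities = new Vector3[4];
    private int _count;

    public int Count => _count;

    // Appends a body and returns its index. Every parallel array must be
    // grown together, or the indices stop lining up.
    public int Add(Vector3 position, Vector3 velocity)
    {
        if (_count == _positions.Length)
        {
            int newCapacity = _positions.Length * 2;
            Array.Resize(ref _positions, newCapacity);
            Array.Resize(ref _velocities, newCapacity);
        }
        _positions[_count] = position;
        _velocities[_count] = velocity;
        return _count++;
    }

    // Swap-remove: overwrite the removed slot with the last body so the
    // arrays stay densely packed. The moved body's index changes.
    public void RemoveAt(int index)
    {
        if (index < 0 || index >= _count)
        {
            throw new ArgumentOutOfRangeException(nameof(index));
        }
        int last = _count - 1;
        _positions[index] = _positions[last];
        _velocities[index] = _velocities[last];
        _count = last;
    }

    public Vector3 GetPosition(int index)
    {
        if (index < 0 || index >= _count)
        {
            throw new ArgumentOutOfRangeException(nameof(index));
        }
        return _positions[index];
    }
}
```

Even in this tiny version the hazards are visible: removing a body silently invalidates any index a client was holding for the last body, and forgetting to update one of the arrays in `Add` or `RemoveAt` would leave positions paired with the wrong velocities.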