1 / 42

A step towards data orientation

A step towards data orientation. JOHAN TORP <JOHAN.torp@DICE.SE>. STHLM GAME DEVELOPER FORUM 5/5 2011. MY BACKGROUND. M.Sc. Computer Science. OO (Java) and functional programming (Haskell)

isabel
Download Presentation

A step towards data orientation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A step towards data orientation • JOHAN TORP <JOHAN.torp@DICE.SE> • STHLM GAME DEVELOPER FORUM 5/5 2011

  2. MY BACKGROUND • M.Sc. Computer Science. OO (Java) and functional programming (Haskell) • Worked ~5 years outside game industry. C++, generic programming & boost, DbC • AI coder at DICE ~2½ years Optimal game dev credentials? 

  3. NEW TRADE-OFFS • OOP • GP • FP • DbC • TMP PC & normal sized apps = cache schmache Games on consoles 5000 L2 misses = ~1ms - data-oriented design ftw!

  4. THIS TALK • A lot of OO code and knowledge out there • Incrementally moving from OO to cache-friendlier code

  5. AGENDA • Facts needed before looking at code • Cache-friendly pathfinding • Async vs sync code • Questions

  6. Frostbite • Visual domain specific scripting language • Gameplay / AI code in C++ • NavPower pathfinding middleware • EASTL containers • We love to blow up parts of our game worlds – and call this destruction

  7. PS3 (PPU) / XBOX 360 CACHE economy • PS3 has 1 core: 32KB data and 32KB instruction L1 cache • 512KB L2 I+D cache • 360 has 3 cores: 32KBdata and 32KBinstruction L1 cache for each core, • 1Mb L2 I+D shared by all cores • 1 L1 cache miss ~= 40 cycles “You miss L1 so much that you cry yourself to sleep every night with a picture of it under your pillow” @okonomiyonda • 1 L2 cache miss ~= 600 cycles • 1 L2 cache miss ~= 20 matrix multiplications • Other than heavy calculations: CPU performance ~= cache misses

  8. KEEP HOT DATA NEARBY Keep copy of common data nearby … in a compactrepresentation Pointer chasing thrashes both I-cache and D-cache Often better to copy frequently accessed data once each frame, access copy instead getBot()->getPlayer()->getControllable()->getWorldPosition()

  9. EXAMPLE DATA-ORIENTED INPUT /// Temporary struct containing information about a single sensor. /// Never stored between updates. structVisionInfo { VisionInfo(const AiSettings& settings, EntryComponent& owner, ...); Vec3 eyePos; Vec2 eyeForwardXz; uint playerId; // Extracted from settings float centralAngle; float peripheralAngle; float seeingDistance; bool seeThroughTerrain; };

  10. TemporarIES – STACK SPACE • Temporary data structures common in Data-Oriented Design • Stack variables or alloca() • Not suited for large amounts or large edge cases

  11. TemporarIES – SCRATCH PAD • Put aside 8x128kb blocks for ”scratch pad calculations” • Linear allocator – doesn’t free within block • Return whole block when done – zero fragmentation

  12. NEW / MALLOC Find a good slot in fragmented memory space Expensive! Container of new:ed objects scattered in memory Poor cache locality! Mix short/long lived allocations -> fragmention Lose memory over time! You should prefer pre-allocated flat vectors and try to minimize new/malloc

  13. Time to look at pathfinding 

  14. IN an OO WORLD AI DECISION MAKING OO PATHFINDING NAVPOWER OO ANIMATION OO

  15. DECISION TO MOVEMENT

  16. DECISION TO MOVEMENT

  17. DECISION TO MOVEMENT

  18. DECISION TO MOVEMENT

  19. DECISION TO MOVEMENT

  20. NAVPOWER OPERATIONS NEEDED • Find path • Load / unload nav mesh section • Add / remove obstacles • Path invalidation detection • Can go-tests • Line- / can go straight-tests, circle tests, triangle tests

  21. NAVPOWER OPERATIONS NEEDED • Find path • Load / unload nav mesh section • Add / remove obstacles • Path invalidation detection • Can go-tests • Line- / can go straight-tests, circle tests, triangle tests Collect and batch process for good cache locality

  22. ABSTRACTIONS • Pathfinder - find path, path invalidation, circle/line tests • Random position generator - can go-tests • Manager - load nav mesh, obstacles, destruction, updates Let some line tests in AI decision making remain synchronous

  23. Pathfinder INTERFACE classPathfinder { virtualPathHandle* findPath(constPathfindingPosition& start, constPathfindingPosition& end, floatcorridorRadius, PathHandle::StateListener* listener) = 0; virtualvoidreleasePath(PathHandle* path) = 0; virtualboolcanGoStraight(Vec3Refstart, Vec3Refend, Vec3* collision = nullptr) const = 0; };

  24. PATH HANDLE typedef eastl::fixed_vector<Vec3, 8> WaypointVector; typedef eastl::fixed_vector<float, 8> WaypointRadiusVector; structPathHandle { enumState {ComputingPath, ValidPath, NoPathAvailable, RepathingRequired}; class StateListener { virtualvoidonStateChanged(PathHandle* handle) = 0; }; PathHandle():waypoints(pathfindingArena()), radii(pathfindingArena()) {} WaypointVectorwaypoints; WaypointRadiusVectorradii; Statestate; };

  25. PATH HANDLE typedef eastl::fixed_vector<Vec3, 8> WaypointVector; typedef eastl::fixed_vector<float, 8> WaypointRadiusVector; struct PathHandle { enum State {ComputingPath, ValidPath, NoPathAvailable, RepathingRequired}; class StateListener { virtual void onStateChanged(PathHandle* handle) = 0; }; PathHandle():waypoints(pathfindingArena()), radii(pathfindingArena()) {} WaypointVector waypoints; WaypointRadiusVector radii; State state; };

  26. NAVPOWER PATHFINDER • classNavPowerPathfinder : publicPathfinder { • public: virtualPathHandle* findPath(...) override; • virtualPathHandle* findPathFromDestination(...) override; • virtualvoidreleasePath(...) override; • virtualboolcanGoStraight(...) constoverride; voidupdatePaths(); • voidnotifyPathListeners(); • private: • bfx::PolylinePathRCPtrm_paths[MaxPaths]; PathHandlem_pathHandles[MaxPaths]; • PathHandle::StateListener* m_pathHandleListeners[MaxPaths]; • u64m_usedPaths, m_updatedPaths, m_updatedValidPaths; };

  27. CORRIDOR STEP Copy all new NavPower paths -> temporary representation Drop unnecessary points for all paths Corridor adjust all paths Copy temporaries -> PathHandles

  28. CORRIDOR STEP Copy all new NavPower paths -> temporary representation Drop unnecessary points for all paths Corridor adjust all paths Copy temporaries -> PathHandles typedef eastl::vector<CorridorNode> Corridor; ScratchPadArena scratch; Corridor corridor(scratch); corridor.resize(navPowerPath.size()); // Will allocate memory using scratch pad

  29. CORRIDOR STEP 2-4 for (...) { // Loop through all paths in their corridor representation dropUnnecessaryPoints(it->corridor, scratchPad); for (...) shrinkEndPoints(it->corridor); for (...) calculateCornerDisplacements(it->corridor); for (...) displaceCorners(it->corridor); for (...) shrinkSections(it->corridor); for (...) copyCorridorToHandle(it->corridor, it->pathHandle); }

  30. NavPOWER MANAGEr voidNavPowerManager::update(floatframeTime) { m_streamingManager.update(); m_destructionManager.update(); m_obstacleManager.update(); for (PositionGeneratorVector::const_iteratorit= ...) (**it).update(); bfx::SystemSimulate(frameTime); for (PathfinderVector::const_iteratorit=m_pathfinders.begin(), ...) (**it).updatePaths(); for(PathfinderVector::const_iteratorit=m_pathfinders.begin(), ...) (**it).notifyPathListeners(); }

  31. NAIVE OO CALL PATTERN AI Decision Making Code Pathfinding Runtime Code NavPower Code Animation Code EXECUTION

  32. CURRENT CALL PATTERN AI Decision Making Code AI Decision Making Code Pathfinding Runtime Code HOT HOT HOT!!!!! NavPower Code Animation Code Animation Code EXECUTION

  33. BATCHING BENEFITS • Keep pathfinding code/data cache hot • Avoid call sites cache running cold • Easier to jobify / SPUify • Easy to schedule and avoid spikes

  34. SIMPLIFIED ARCHITECTURE AI DECISION MAKING OO PATHFINDING NAVPOWER OO ANIMATION OO

  35. LESS SIMPLIFIED ARCHITECTURE SCRIPTING SERVER CLIENT AI DECISION MAKING PATH FOLLOWING PATHFINDING NAVPOWER LOCOMOTION DRIVING LOCOMOTION Waypoint Positions VEHICLE INPUT Waypoint Data ANIMATION CorridorRadii

  36. My precious latency Each server update 1. Each AI decision making 2. Pathfinding manager update All pathfinding requests All corridor adjustments All PathHandle notifications -> path following -> server locomotion 3. Network pulse. Server locomotion -> client locomotion 4. ...rest of update No extra latency added

  37. DELAYING/BATCHING • Callbacks. Delay? Fire in batch? • Handle+poll instead of callbacks. Poll in batch? • Record messages/events, act on them later.. in batch? • Assume success, recover from failure next update

  38. DELAYING/BATCHING - PROS & CONS + Cache friendly & parallelizable + Easy to profile & schedule + Avoid bugs with long synchronous callback chains + Modular - More glue code managers, handles, polling update calls, multiple representations of the same data - More bugs index fiddling, life time handling, latency, representations drifting out of sync - Callstack won’t tell you everything break point in sync code gives easy-to-debug vertical slice... ...but can we afford vertical deep dives?

  39. INCREMENTAL GAINS • Do not have to abandon OO nor rewrite the world • Start small, batch a bit, cut worst pointer chasing, avoid deep dives, grow from there • Much easer to rewrite a system in a DO fashion afterwards Existing code is crystallized knowledge, refactor incrementally to learn!

  40. SUMMARY • Background: Console caches, heap allocations expensive, temporary memory • AI decision making – pathfinding – animation • Code: Async abstractions, handles, scratch pad, fixed_vector, batch processing • Latency analysis, pros&cons sync vs async Think about depth/width of calls, try stay within your system, keep hot data nearby

  41. WHEN BATTLING AN OO OCTUPUS Avoid rewritis You can retract your synchronous tentacles slowly 

  42. QUESTIONS? email johan.torp@dice.se twitter semanticspeed slides www.johantorp.com

More Related