
Using GPUView to Understand your DirectX 11 Game. Jon Story, Developer Relations Engineer, AMD


Presentation Transcript


  1. Using GPUView to Understand your DirectX 11 Game. Jon Story, Developer Relations Engineer, AMD

  2. Agenda • Windows Display Driver Model (WDDM) • What is GPUView? • CPU & GPU Queues • Threads & Events • Case Studies • Summary

  3. Windows Display Driver Model (WDDM)

  4. Graphics & WDDM [Diagram labels: User Mode: Application Process (Application, D3D Runtime, User Mode Driver (UMD)), DWM Process (DWM); Kernel Mode: Win32 kernel, Dxgkrnl, Kernel Mode Driver (KMD); Session Space]

  5. Feeding the GPU… [Diagram labels: User Mode: Application #1 and Application #2, each with a Command Buffer, D3D Runtime and UMD; Kernel Mode: Win32k & dxgkrnl, KMD; GPU Scheduler Database; 1, 2; DMA Buffer; Wait; GPU]

  6. What is GPUView?

  7. What is GPUView? • An additional Microsoft performance tool • Complements existing tools • Part of the Windows 7 SDK • Built on Event Tracing for Windows • Perfect for monitoring CPU/GPU interaction (even for multiple GPU setups) • Allows you to see how well the GPU is being fed • Supports DX9, DX10 & DX11 on Win7

  8. Capturing Data • Run an elevated command prompt • \Program Files\Microsoft Windows Performance Toolkit\GPUView • Start your game in windowed mode • For fullscreen mode perhaps use PsExec from a remote machine • Start capturing with log.cmd • Capture 10-15 seconds of your game • Stop logging with log.cmd • Open merged.etl file with GPUView.exe
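
The capture steps on this slide could be scripted. The following is only a minimal sketch, assuming Python is available on the capture machine and that it is launched from an elevated prompt. The helper names `log_cmd_path` and `capture` are hypothetical, and `gpuview_dir` is left as a parameter because the slide's path carries no drive letter. The sketch relies only on the slide's statement that running log.cmd once starts logging and running it again stops it.

```python
import subprocess
import time
from pathlib import PureWindowsPath


def log_cmd_path(gpuview_dir: str) -> str:
    """Return the full path of log.cmd inside the GPUView directory."""
    return str(PureWindowsPath(gpuview_dir) / "log.cmd")


def capture(gpuview_dir: str, seconds: float = 10.0) -> None:
    """Start logging, wait while the game runs, then stop logging.

    Assumption: the caller is already running elevated, as the slide requires.
    """
    log_cmd = log_cmd_path(gpuview_dir)
    # First invocation starts the trace.
    subprocess.run(["cmd", "/c", log_cmd], cwd=gpuview_dir, check=True)
    # The slide suggests capturing 10-15 seconds of the game.
    time.sleep(seconds)
    # Second invocation stops the trace; the slide then opens merged.etl in GPUView.exe.
    subprocess.run(["cmd", "/c", log_cmd], cwd=gpuview_dir, check=True)
```

Keeping the capture window short, as the slide suggests, also keeps the resulting trace small enough to navigate comfortably.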

  9. Was this tool created for driver programmers?

  10. Navigating the Data • Use the mouse to select a region • Ctrl+Z zooms in to a selection • Z zooms out • Use +/- to see more or less detail • Ctrl+E opens the event menu • Click on objects for additional details • More on this later…

  11. Zooming in…

  12. DMA Packet Color Coding • Various types of DMA packets may be submitted to the GPU: • Red: Paging packet • Black: Preemption packet • Brown: DWM packet • Other Color: Standard packet • Other Color + Cross-Hatch: Present packet

  13. What does a Standard DMA Packet Represent? • Graphics system state objects • Draw commands • References to resource allocations • Textures • Vertex & Index Buffers • Render Targets • Constant Buffers

  14. CPU & GPU Queues

  15. SW Context CPU Queues (1) • D3D app stacking up 3 frames of packets • Desktop Window Manager packet

  16. SW Context CPU Queue (2) • CPU queue depth is 6 • Task submitted to HW queue • CPU queue is empty! • New task submitted to CPU queue

  17. SW Context CPU Queues (3) • Objects represent work submitted to a GPU context • Queue is represented through time as a stack • Stack grows on submission of work by the UMD • Stack shrinks as work is completed by the GPU
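
The stack behaviour described on this slide can be modelled in a few lines. This is an illustrative sketch only: the event tuples and the function name `queue_depth_over_time` are hypothetical and do not correspond to GPUView's own trace format. It simply shows the depth growing on each submission and shrinking on each completion.

```python
from typing import Iterable, List, Tuple

# Each event is (timestamp, kind), where kind is "submit" or "complete".
Event = Tuple[float, str]


def queue_depth_over_time(events: Iterable[Event]) -> List[Tuple[float, int]]:
    """Return (timestamp, depth) after every event, in time order."""
    depth = 0
    timeline: List[Tuple[float, int]] = []
    for timestamp, kind in sorted(events, key=lambda event: event[0]):
        if kind == "submit":
            depth += 1  # stack grows when work is submitted
        elif kind == "complete":
            depth = max(depth - 1, 0)  # stack shrinks when the GPU finishes work
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
        timeline.append((timestamp, depth))
    return timeline


events = [(0, "submit"), (1, "submit"), (2, "complete"),
          (3, "submit"), (4, "complete"), (5, "complete")]
depths = [depth for _, depth in queue_depth_over_time(events)]
```

A depth that repeatedly drops to zero in such a timeline corresponds to the "CPU queue is empty!" situation called out on the previous slide.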

  18. GPU HW Context Queue (1) • Present packet • Preemption packet • Queued DMA packet • GPU processing DMA packet • DWM

  19. GPU HW Context Queue (2) • GPU starts working on packet • GPU finishes working on packet • GPU has no work to do

  20. GPU HW Context Queue (3) • Queue is represented through time as a stack • Stack grows on submission of work by the KMD • Stack shrinks as work is completed by the GPU • Gaps indicate a CPU side bottleneck
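
To make the "gaps" bullet concrete, here is a small sketch that finds idle gaps in a list of busy intervals. The interval data and the names `idle_gaps` and `idle_fraction` are hypothetical; GPUView shows this visually, and the sketch only mirrors the idea that time between processed packets is time the GPU had nothing to do.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (start, end) of the GPU processing one packet


def idle_gaps(busy: Sequence[Interval], min_gap: float = 0.0) -> List[Interval]:
    """Return the gaps between busy intervals that are longer than min_gap."""
    gaps: List[Interval] = []
    covered_until = None
    for start, end in sorted(busy):
        if covered_until is not None and start - covered_until > min_gap:
            gaps.append((covered_until, start))
        covered_until = end if covered_until is None else max(covered_until, end)
    return gaps


def idle_fraction(busy: Sequence[Interval]) -> float:
    """Fraction of the traced span during which the GPU had no work."""
    ordered = sorted(busy)
    if not ordered:
        return 0.0
    span = max(end for _, end in ordered) - ordered[0][0]
    if span <= 0:
        return 0.0
    idle = sum(end - start for start, end in idle_gaps(ordered))
    return idle / span


busy = [(0.0, 4.0), (6.0, 10.0), (10.0, 12.0), (15.0, 16.0)]
gaps = idle_gaps(busy)
fraction = idle_fraction(busy)
```

In this made-up trace the GPU is idle for 5 of 16 time units, which is the kind of pattern the slide attributes to a CPU-side bottleneck.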

  21. Object Selection Represents latency

  22. Object Details (1) • Packet type & timing information • Allocation references in DMA packet

  23. Object Details (2) • Preferred memory segment: P0 = Preferred, P1 = Less, P2 = Least • (w) = Writable by GPU

  24. Object Viewer • Segment numbers: 1 = Vid Mem (CPU visible), 2 = Vid Mem (Non visible), 3 = PCI Express Mem • Clearly the depth buffer

  25. Paging Buffer Packet • Submitted as the result of a paging operation (perhaps a large texture) • Usually caused by preparing a DMA buffer • Look at the DMA packet that follows the paging operation

  26. Threads & Events

  27. HW Threads • Colored bars represent work • Gaps represent idle time

  28. Thread Execution • Thread segments are color coded: • Light blue: Kernel mode • Dark blue: dxgkrnl • Red: KMD (Kernel Mode Driver)

  29. Charts: FPS / Latency / Memory

  30. Viewing Events • Ctrl+E opens the Event View window • Can track whatever events take your interest • DX- Create / Destroy Allocation • DX Block • Suggests possible resource contention • Perhaps trying to lock an in use buffer

  31. V-Sync Event

  32. Case Studies

  33. DrawPredicated SDK Sample • GPU is busy, no gaps • CPU queue is buffering up nicely • App thread not saturated

  34. DrawPredicated SDK Sample: + blocking occlusion queries • App thread fully saturated • Not enough being queued up • GPU is going idle

  35. Getting Occlusion Queries Right • Delay picking up results by N frames • Where N = Number of GPUs • May need to artificially inflate occlusion volumes to avoid popping
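
The "delay by N frames" advice can be sketched as a small queue of pending queries. This is a language-neutral model, not D3D11 code: the class `DelayedQueryReader` and its method are hypothetical, and the query objects are plain placeholders. In a real renderer the object handed back would be the query whose result is read with `ID3D11DeviceContext::GetData`.

```python
from collections import deque
from typing import Any, Deque, Optional, Tuple


class DelayedQueryReader:
    """Hands back each frame's query only after num_gpus later frames were issued."""

    def __init__(self, num_gpus: int) -> None:
        if num_gpus < 1:
            raise ValueError("num_gpus must be at least 1")
        self.delay = num_gpus  # N = number of GPUs, as on the slide
        self._pending: Deque[Tuple[int, Any]] = deque()

    def end_frame(self, frame_index: int, query: Any) -> Optional[Tuple[int, Any]]:
        """Queue this frame's query; return the (frame, query) that is now old enough."""
        self._pending.append((frame_index, query))
        if len(self._pending) > self.delay:
            return self._pending.popleft()
        return None  # nothing is old enough to read without risking a stall


reader = DelayedQueryReader(num_gpus=2)
results = [reader.end_frame(i, f"q{i}") for i in range(4)]
```

With two GPUs, frame 2 is the first frame that reads a result, and it reads the query issued in frame 0. Because the visibility result is then stale, the slide's suggestion to inflate occlusion volumes applies.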

  36. What else could cause this problem? • Locking a Render Target • Use CopyResource & Staging Textures • This is a queued operation

  37. ContentStreaming SDK Sample (1) • Paging packets • GPU is going idle

  38. ContentStreaming SDK Sample (2) • Large resources not getting preferred segments

  39. Avoiding Paging • Keep your video memory usage under control • Especially in MSAA modes • Drop texture resolution for lower end HW • Avoid excessively large amounts of dynamic data • Textures & Vertex Buffers • If not sure – talk to us!
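
A rough back-of-the-envelope estimate shows why MSAA modes are singled out on this slide. The sketch below is a simplification under stated assumptions: it multiplies width, height, bytes per pixel and sample count, and ignores alignment, padding and any driver-side overhead or compression. The function names are hypothetical.

```python
def render_target_bytes(width: int, height: int, bytes_per_pixel: int,
                        msaa_samples: int = 1) -> int:
    """Approximate size of one render target, ignoring alignment and overhead."""
    return width * height * bytes_per_pixel * msaa_samples


def to_mib(num_bytes: int) -> float:
    """Convert bytes to mebibytes."""
    return num_bytes / (1024 * 1024)


# A 1920x1080 target at 4 bytes per pixel, without and with 4x MSAA.
no_msaa = render_target_bytes(1920, 1080, 4)
msaa_4x = render_target_bytes(1920, 1080, 4, msaa_samples=4)
```

Under these assumptions the single-sample target is about 7.9 MiB and the 4x MSAA target about 31.6 MiB, before counting a matching multisampled depth buffer.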

  40. MultithreadedRendering11 SDK Sample • Additional threads preparing packets • But there is a lot of D3D runtime / driver overhead

  41. Multi-Threaded Rendering and Deferred Contexts • It is a complex issue • Don't expect it to be a magic bullet • Strongly recommend you talk to developer relations from AMD & NVIDIA

  42. Summary

  43. Summary • Make sure you're keeping the ever hungry GPU fed • Keep track of CPU/GPU interaction • Keep track of your threads • Monitor multi-GPU interaction • Add GPUView to your toolbox

  44. Acknowledgments • Microsoft for creating GPUView • Microsoft for providing background content

  45. Questions?
