Tools for Investigating Graphics System Performance

Matthew Fisher, Steve Pronovost


Presentation Transcript


  1. Tools for Investigating Graphics System Performance
     Matthew Fisher, Steve Pronovost

  2. Goal
     • A video game runs slowly, skips frames, has high latency, etc., and the developers want to fix the problem
     • The problem is almost always a cascade of bottlenecks at the application, CPU, and GPU levels that is very challenging to investigate locally
     • We want tools that let programmers solve these problems faster

  3. Approaches
     • Profiling
       • Rig the game events with logging or use an automatic profiler
     • PIX (for Windows and Xbox 360)
       • All calls by the game to the graphics API are logged
     • GPUView
       • OS logs all CPU, graphics kernel, and graphics driver events

  4. Profiling
     • Manual profiling requires a significant amount of development effort
     • Polling-based automatic profiling can work reasonably well for CPU applications but doesn’t capture graphics or memory transfer events well
     • Percentage-based statistics (“you spent 45% of the time in function X”) can sometimes be useful and sometimes extremely misleading
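As an illustration of what "rigging the game events with logging" might look like, here is a minimal sketch in Python. The `timed` helper, the `timings` log, and the section names are hypothetical and not taken from the presentation; the loop bodies are placeholders for real game work. The summary reports the worst single measurement next to the total, since a section with a small share of total time can still be the one that spikes on a skipped frame.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical event log: section name -> list of durations in seconds.
timings = defaultdict(list)

@contextmanager
def timed(section):
    """Record how long the enclosed block takes under the given name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[section].append(time.perf_counter() - start)

def summarize(log):
    """Return {section: (total, worst)} so spikes show up next to totals."""
    return {name: (sum(durations), max(durations))
            for name, durations in log.items()}

# Usage: wrap the parts of a stand-in frame loop.
for frame in range(3):
    with timed("update"):
        sum(range(1000))   # placeholder for game logic
    with timed("render"):
        sum(range(5000))   # placeholder for issuing draw calls

summary = summarize(timings)
```

Even in this toy form, every section of interest has to be wrapped by hand, which hints at the development effort the slide mentions.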

  5. PIX
     • Released by Microsoft as part of the DirectX SDK
     • Multiple modes for investigating performance, targeted at game developers
       • Interactive mode
       • Frame logging
       • Frame capture and playback

  6. PIX – Interactive Mode
     • Various counters stream by as the game runs
     • You can change which counters are shown; the hope is to find that the observed problem correlates with one of them

  7. PIX – Interactive Mode

  8. Commonly Used Counter Types
     • Number, type, and size of draw primitive calls
     • Number of texture, vertex/index buffer locks, and what memory pool was locked
     • Object creation and destruction events
     • Allocated system and video memory
     • Frame latency, seconds per frame
     • Page faults
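The idea of looking for a counter that correlates with the observed problem can be sketched offline once counter values have been logged per frame. The example below is only an illustration: the sample numbers are made up, the variable names are hypothetical, and PIX itself is not involved. It computes a plain Pearson correlation between frame time and each counter.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / math.sqrt(var_x * var_y)

# Made-up per-frame samples (assumption, for illustration only).
frame_ms     = [16.0, 17.0, 16.0, 33.0, 16.0, 34.0]
buffer_locks = [2, 2, 2, 9, 2, 10]
draw_calls   = [500, 510, 495, 505, 500, 498]

lock_corr = pearson(frame_ms, buffer_locks)
draw_corr = pearson(frame_ms, draw_calls)
```

With these samples the lock count tracks the slow frames closely while the draw-call count does not, which would point the investigation at buffer locks. Correlation alone does not establish the cause, so a result like this is a lead to follow up rather than a diagnosis.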

  9. PIX – Frame Capture Mode

  10. PIX – Debug Pixel

  11. Questions PIX is good at
      • Are object locks causing the frame skipping problem users are experiencing?
      • Are we allocating too many resources we don’t use?
      • What are the API calls that are taking the longest time to execute?
      • Why was this pixel in the sky green?

  12. GPUView

  13. Windows Display Driver Model
      • The XP Display Driver Model required applications to cede control of the graphics infrastructure and was largely designed assuming a single 3D application would be running
      • The Vista Display Driver Model added standard scheduling principles, forcing applications to share control of graphics memory and compute resources

  14. GPUView
      • The graphics model switch induced a variety of constraints on graphics applications and forced highly optimized graphics drivers to be restructured
      • Many games were running more slowly on Vista than they did on XP (~5% - 30% slower)
      • GPUView was designed to help investigate these problems and see what stage was causing the speed difference

  15. Event Tracing
      • The GPUView logger enables logging of a vast set of events in the OS, such as
        • All calls to the Windows graphics kernel
        • All resource creation, lock, destruction, etc. events
        • All command buffer submissions
        • Context switches (w/ stack trace and reason)
        • Kernel mode enter/exits (w/ stack trace)
      • World of Warcraft generates approximately 1 GB every 3 seconds
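To put the World of Warcraft figure in perspective, the arithmetic can be written out as a small helper. This is only a back-of-the-envelope sketch: the function name is hypothetical, and it assumes the quoted rate stays constant for the whole capture.

```python
def trace_size_gb(seconds, gb_per_interval=1.0, interval_seconds=3.0):
    """Estimated trace size for a capture of the given length.

    Defaults reflect the "approximately 1 GB every 3 seconds" figure
    quoted for World of Warcraft; a constant rate is an assumption.
    """
    return seconds * gb_per_interval / interval_seconds

one_minute = trace_size_gb(60)   # 20.0 GB
```

At that rate a one-minute capture would be roughly 20 GB, which suggests keeping captures to the few seconds around the problem being investigated.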

  16. GPUView Without Any Graphics

  17. Windows Display Driver Model
      • Applications build up local command buffers
      • When these command buffers get big enough, they are submitted to the application’s local graphics queue for processing
      • The graphics scheduler selects which application should be running on which graphics card and submits work to the corresponding hardware queue
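The flow described on this slide can be pictured with a toy model. The sketch below is not how the Windows graphics scheduler is implemented: the class and function names are hypothetical, it assumes a single graphics card, and it substitutes a simple round-robin policy for the real scheduling decisions. It only shows the three stages: a local command buffer, a per-application queue, and one shared hardware queue.

```python
from collections import deque

class AppContext:
    """Toy stand-in for one application's graphics context."""

    def __init__(self, name, flush_threshold):
        self.name = name
        self.flush_threshold = flush_threshold  # "big enough" size, in commands
        self.command_buffer = []                # local buffer being built up
        self.queue = deque()                    # the app's local graphics queue

    def record(self, command):
        """Append a command; submit the buffer once it is big enough."""
        self.command_buffer.append(command)
        if len(self.command_buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Submit the current buffer to the app's local queue."""
        if self.command_buffer:
            self.queue.append((self.name, tuple(self.command_buffer)))
            self.command_buffer = []

def schedule_round_robin(apps, hardware_queue):
    """Assumed policy: take one submitted buffer per app per pass."""
    while any(app.queue for app in apps):
        for app in apps:
            if app.queue:
                hardware_queue.append(app.queue.popleft())

game = AppContext("game", flush_threshold=2)
video = AppContext("video", flush_threshold=1)
for command in ["draw A", "draw B", "draw C", "draw D"]:
    game.record(command)
video.record("decode frame")

hardware_queue = deque()
schedule_round_robin([game, video], hardware_queue)
order = [name for name, _ in hardware_queue]
```

In this run the hardware queue ends up holding the game's first buffer, then the video player's buffer, then the game's second buffer, which is the kind of interleaving between applications that the shared scheduling model makes possible.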

  18. One Second of a Game

  19. Setup

  20. Multiple Applications Fighting

  21. Simple Problems

  22. Relatively Normal Execution

  23. GPU Starvation

  24. GPU Idle

  25. Sleepy App

  26. Huge Render Times (GPU Bound)

  27. GPU and CPU Starvation

  28. Answering Questions

  29. Why Did Our Thread Context Switch?

  30. Does Surface Allocation Cause Frame Stuttering?

  31. Thoughts
      • Surprisingly, the overhead of GPUView logging is pretty minimal and the traces often reflect the underlying problem well
      • The biggest advantage of GPUView over PIX is that PIX can’t tell you crucial things like when the GPU is blocked on the CPU
      • GPUView is excellent for telling you what part of the application needs optimization
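One way to think about "the GPU is blocked on the CPU" is as gaps in the timeline where the hardware queue has no work. The sketch below assumes a hypothetical list of GPU busy intervals (start and end times in milliseconds) has already been extracted from a trace; it does not read GPUView's actual trace format, and the function name is made up for illustration.

```python
def idle_gaps(busy, window_start, window_end):
    """Return the (start, end) gaps in a window not covered by busy intervals."""
    gaps = []
    cursor = window_start
    for start, end in sorted(busy):
        if start > cursor:
            # Nothing was running between the cursor and this interval.
            gaps.append((cursor, min(start, window_end)))
        cursor = max(cursor, end)
        if cursor >= window_end:
            break
    if cursor < window_end:
        gaps.append((cursor, window_end))
    return gaps

# Made-up busy intervals in milliseconds (assumption, for illustration only).
busy = [(0, 5), (9, 14), (13, 16)]
gaps = idle_gaps(busy, 0, 20)
idle_fraction = sum(end - start for start, end in gaps) / 20
```

Here the GPU is idle from 5 to 9 ms and from 16 to 20 ms, or 40% of the window. Whether a gap means the GPU was starved by the CPU or the application was simply sleeping still has to be judged from what the CPU threads were doing at the same time.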

  32. Driver Perspective
      • GPUView provides enough detail to let display driver writers and DirectX graphics kernel developers diagnose problems with task submission, the command buffer submission threads, GPU preemption, video skipping, etc.
