1 / 16

An Analytical Model for a GPU

An Analytical Model for a GPU. Overview. SVM Kernel Behavior: Need for other metrics. Degree of Parallelism. GPU Architecture Each SM executes multiple warps in a time-sharing fashion while one or more are waiting for memory values

bessie
Download Presentation

An Analytical Model for a GPU

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. An Analytical Model for a GPU

  2. Overview

  3. SVM Kernel Behavior: Need for other metrics

  4. Degree of Parallelism • GPU Architecture • Each SM executes multiple warps in a time-sharing fashion while one or more are waiting for memory values • Hiding the execution cost of warps that are executed concurrently. • How many memory requests can be serviced and how many warps can be executed together while one warp is waiting for memory values.

  5. MWP and CWP • Memory Warp: • The warp that is waiting for memory values • Memory Warp waiting period: • The time period from right after one warp sent memory requests until all the memory requests from the same warp are serviced. • CWP (Computation Warp Parallelism) • represents the number of warps that the SM processor can execute during one memory warp waiting period plus one. • MWP (Memory Warp Parallelism) • represents the maximum number of warps per SM that can access the memory simultaneously during memory warp waiting period

  6. Relationship between MWP and CWP: CWP > MWP Total execution time (a) 8 warps (b) 4 warps What is getting hidden?

  7. What is going on here?

  8. Relationship between MWP and CWP: MWP > CWP Total execution time (a) 8 warps (b) 4 warps

  9. Relationship between MWP and CWP: MWP > CWP Total execution time (8 warps)

  10. Not Enough Warps Running

  11. CPI PTX instruction set

  12. Model

  13. Example Blocks: 80 Threads: 128 5 blocks per SM

More Related