
Texture Memory

Texture Memory in CUDA Perspective, by Vinay Manchiraju. Texture memory is read-only memory used by CUDA programs; in general-purpose computing it is used for accuracy and efficiency.


Presentation Transcript


  1. TEXTURE MEMORY IN CUDA PERSPECTIVE • VINAY MANCHIRAJU

  2. TEXTURE MEMORY • Read-only memory used by CUDA programs. • Used in general-purpose computing for accuracy and efficiency. • Originally designed for the DirectX and OpenGL rendering pipelines.

  3. WHY USE TEXTURES? • Texture caches can serve accesses to memory locations that are not consecutive, which typical CPU caching schemes would not cache together. • Designed to accelerate access patterns with spatial locality.

  4. Texture memory is cached on chip. • Can provide higher effective bandwidth. • Reduces memory requests to off-chip DRAM. • Improves performance of graphics applications whose memory access patterns exhibit a great deal of spatial locality.

  5. PARALLELIZING PHYSICAL SIMULATIONS • Results are more accurate, with reduced computational complexity and less time to solve. • Textures play a significant role in simulation problems.

  6. HEAT SIMULATION EXAMPLE • A rectangular room is divided into a grid of cells. • Heaters with fixed temperatures are scattered among the cells of the grid.

  7. FLOW OF HEAT • Warmer cells tend to cool as their heat dissipates to cooler regions, and cooler cells tend to warm up in turn.

  8. AS A FUNCTION OF HEAT LOSS/GAIN • Assume that each cell has 4 neighbors. • k is the rate of heat flow from one cell to another. • A large value of k drives the system to a constant temperature quickly, while a small value lets the solution retain large temperature gradients for longer.
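The update equation itself did not survive in this transcript. A plausible reconstruction, assuming the four-neighbor model and the rate constant k from this slide (the subscript names are assumptions, not the author's notation):

```latex
T_{\text{new}} = T_{\text{old}} + k \sum_{n \in \text{neighbors}} \left( T_{n} - T_{\text{old}} \right)
             = T_{\text{old}} + k \left( T_{\text{top}} + T_{\text{bottom}} + T_{\text{left}} + T_{\text{right}} - 4\,T_{\text{old}} \right)
```

With k = 0.25 the old temperature cancels and the new value is simply the average of the four neighbors, which is why a larger k flattens gradients faster.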

  9. THREE STEPS TO COMPUTE TEMPERATURE UPDATES • Step 1, copy_const_kernel(): copy the heater temperatures into the corresponding cells of the input grid. This enforces the restriction that the temperatures of cells with heaters stay constant. • Step 2, blend_kernel(): calculate the output temperatures from the input temperatures of the grid using the update equation. • Step 3: swap the input and output buffers so that the output becomes the input of the next step.

  10. copy_const_kernel() • Convert threadIdx and blockIdx into an x and a y coordinate. • Compute a linear offset into the constant and input buffers. • If the cell in the constant grid is nonzero, copy the heater temperature from cptr[] to the input grid in iptr[].
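The three bullets above could be sketched as follows. This is a minimal illustration rather than the presenter's code: the grid side DIM, the 16 x 16 launch configuration, and the host helper run_copy_const_demo() are assumptions introduced here.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define DIM 256  // assumed grid side: 16 x 16 blocks of 16 x 16 threads

// Re-impose heater temperatures: nonzero cells of cptr overwrite iptr.
__global__ void copy_const_kernel(float *iptr, const float *cptr) {
    // Convert threadIdx/blockIdx into x and y, then into a linear offset.
    int x = threadIdx.x + blockIdx.x * blockDim.x;
    int y = threadIdx.y + blockIdx.y * blockDim.y;
    int offset = x + y * blockDim.x * gridDim.x;
    if (cptr[offset] != 0.0f) iptr[offset] = cptr[offset];
}

// Hypothetical host helper: returns the input-grid value at the heater cell.
float run_copy_const_demo() {
    const size_t bytes = DIM * DIM * sizeof(float);
    std::vector<float> h_const(DIM * DIM, 0.0f), h_in(DIM * DIM, 1.0f);
    const int heater = 10 + 20 * DIM;  // arbitrary heater position
    h_const[heater] = 100.0f;

    float *d_in, *d_const;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_const, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_const, h_const.data(), bytes, cudaMemcpyHostToDevice);

    dim3 blocks(DIM / 16, DIM / 16), threads(16, 16);
    copy_const_kernel<<<blocks, threads>>>(d_in, d_const);

    cudaMemcpy(h_in.data(), d_in, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_const);
    return h_in[heater];
}

int main() {
    float t = run_copy_const_demo();
    assert(cudaGetLastError() == cudaSuccess);  // no runtime error recorded
    printf("heater cell = %.1f\n", t);
    return 0;
}
```

Cells without a heater keep whatever temperature the input grid already held, since a zero in the constant grid means "no heater here".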

  11. blend_kernel() • One thread for every cell. • The offsets of the neighbors in all 4 directions are computed in order to read the temperatures of those cells. • Each thread reads its cell's temperature and the temperatures of its neighboring cells, performs the update computation described earlier, and then writes the new value for its cell. • The updated temperature is the old temperature plus the scaled differences between it and the neighboring cell temperatures.
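A global-memory version of this kernel might look like the sketch below. DIM, the value of SPEED (standing in for k), the edge handling, and the helper run_blend_demo() are assumptions made for illustration; at a grid edge the cell itself stands in for the missing neighbor.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define DIM   256    // assumed grid side: 16 x 16 blocks of 16 x 16 threads
#define SPEED 0.25f  // assumed value for the rate constant k

__global__ void blend_kernel(float *outSrc, const float *inSrc) {
    // One thread per cell.
    int x = threadIdx.x + blockIdx.x * blockDim.x;
    int y = threadIdx.y + blockIdx.y * blockDim.y;
    int offset = x + y * blockDim.x * gridDim.x;

    // Neighbor offsets; on an edge, point back at the cell itself.
    int left = offset - 1, right = offset + 1;
    if (x == 0) left++;
    if (x == DIM - 1) right--;
    int top = offset - DIM, bottom = offset + DIM;
    if (y == 0) top += DIM;
    if (y == DIM - 1) bottom -= DIM;

    // Old temperature plus the scaled differences to the four neighbors.
    outSrc[offset] = inSrc[offset] +
        SPEED * (inSrc[top] + inSrc[bottom] + inSrc[left] + inSrc[right] -
                 4.0f * inSrc[offset]);
}

// Hypothetical host helper: one step on a grid with a single hot cell;
// returns the new temperature of the cell to its right.
float run_blend_demo() {
    const size_t bytes = DIM * DIM * sizeof(float);
    std::vector<float> h(DIM * DIM, 0.0f);
    const int hot = 128 + 128 * DIM;
    h[hot] = 100.0f;

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h.data(), bytes, cudaMemcpyHostToDevice);

    dim3 blocks(DIM / 16, DIM / 16), threads(16, 16);
    blend_kernel<<<blocks, threads>>>(d_out, d_in);

    cudaMemcpy(h.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h[hot + 1];
}

int main() {
    float t = run_blend_demo();
    assert(cudaGetLastError() == cudaSuccess);  // no runtime error recorded
    printf("neighbor after one step = %.2f\n", t);
    return 0;
}
```

Writing to a separate output buffer matters here: every thread must read the old temperatures of its neighbors, so the update cannot be done in place.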

  12. anim_kernel() • A DataBlock structure contains the constant buffer of heaters and the updated temperatures. • Arguments: a pointer to the DataBlock and the number of ticks of animation that have elapsed (not used). • We use a 16 x 16 grid and blocks of 256 threads.

  13. anim_kernel() • After each iteration we swap the input and output buffers; after the last iteration the input buffer holds the final temperatures. • The temperatures are converted into colors and the bitmap image is transferred from the GPU to the CPU.
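The time-stepping loop described in the two anim_kernel() slides could be sketched as below, under stated assumptions: the DataBlock structure is replaced by three plain device pointers, the color conversion and bitmap transfer are omitted, and DIM, SPEED, and run_steps_demo() are names chosen for this illustration. The launch uses a 16 x 16 grid of blocks with 256 threads each, as on the slide.

```cuda
#include <cassert>
#include <cstdio>
#include <utility>
#include <vector>
#include <cuda_runtime.h>

#define DIM   256    // assumed grid side: 16 x 16 blocks of 16 x 16 threads
#define SPEED 0.25f  // assumed value for the rate constant k

__global__ void copy_const_kernel(float *iptr, const float *cptr) {
    int x = threadIdx.x + blockIdx.x * blockDim.x;
    int y = threadIdx.y + blockIdx.y * blockDim.y;
    int offset = x + y * blockDim.x * gridDim.x;
    if (cptr[offset] != 0.0f) iptr[offset] = cptr[offset];
}

__global__ void blend_kernel(float *outSrc, const float *inSrc) {
    int x = threadIdx.x + blockIdx.x * blockDim.x;
    int y = threadIdx.y + blockIdx.y * blockDim.y;
    int offset = x + y * blockDim.x * gridDim.x;
    int left = offset - 1, right = offset + 1;
    if (x == 0) left++;
    if (x == DIM - 1) right--;
    int top = offset - DIM, bottom = offset + DIM;
    if (y == 0) top += DIM;
    if (y == DIM - 1) bottom -= DIM;
    outSrc[offset] = inSrc[offset] +
        SPEED * (inSrc[top] + inSrc[bottom] + inSrc[left] + inSrc[right] -
                 4.0f * inSrc[offset]);
}

// Hypothetical host helper: runs `steps` time steps around one heater and
// returns the final temperature of the cell to the heater's right.
float run_steps_demo(int steps) {
    const size_t bytes = DIM * DIM * sizeof(float);
    std::vector<float> h_const(DIM * DIM, 0.0f), h(DIM * DIM, 0.0f);
    const int heater = 128 + 128 * DIM;
    h_const[heater] = 100.0f;

    float *d_in, *d_out, *d_const;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMalloc(&d_const, bytes);
    cudaMemcpy(d_in, h.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_const, h_const.data(), bytes, cudaMemcpyHostToDevice);

    dim3 blocks(DIM / 16, DIM / 16), threads(16, 16);
    for (int i = 0; i < steps; i++) {
        copy_const_kernel<<<blocks, threads>>>(d_in, d_const);  // step 1
        blend_kernel<<<blocks, threads>>>(d_out, d_in);         // step 2
        std::swap(d_in, d_out);                                 // step 3
    }

    // After the final swap, d_in points at the most recent temperatures.
    cudaMemcpy(h.data(), d_in, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    cudaFree(d_const);
    return h[heater + 1];
}

int main() {
    float t = run_steps_demo(10);
    assert(cudaGetLastError() == cudaSuccess);  // no runtime error recorded
    printf("cell next to heater after 10 steps = %.3f\n", t);
    return 0;
}
```

Swapping the two pointers is cheap because no data moves; only the roles of the buffers change between steps.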

  14. USING TEXTURES • Declare the inputs as texture references. • Use references to floating-point textures. • Allocate GPU memory for the buffers and then bind the references to them using cudaBindTexture().

  15. cudaBindTexture() • Tells the CUDA runtime to use the specified buffer as a texture and the texture reference as the texture's name. • See the cudaBindTexture() documentation for details.
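A minimal round trip through a bound texture might look like this sketch. It uses the legacy texture reference API that the slides describe, which was deprecated and then removed in CUDA 12, so it assumes an older toolkit. The buffer length N, the kernel read_texture(), and the helper run_bind_demo() are hypothetical names introduced here.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define N 1024  // assumed buffer length

// Texture references must be declared at file scope.
texture<float> texIn;

// Copy each element out of the texture so the host can inspect it.
__global__ void read_texture(float *dst) {
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    if (i < N) dst[i] = tex1Dfetch(texIn, i);
}

// Hypothetical host helper: binds a buffer holding 0, 1, 2, ... to texIn,
// reads it back through the texture, and returns element 42.
float run_bind_demo() {
    const size_t bytes = N * sizeof(float);
    std::vector<float> h(N), out(N);
    for (int i = 0; i < N; i++) h[i] = (float)i;

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h.data(), bytes, cudaMemcpyHostToDevice);

    // Use d_in as a texture, with texIn as its name.
    cudaBindTexture(NULL, texIn, d_in, bytes);

    read_texture<<<N / 256, 256>>>(d_out);
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    // Release the binding before freeing the underlying buffer.
    cudaUnbindTexture(texIn);
    cudaFree(d_in);
    cudaFree(d_out);
    return out[42];
}

int main() {
    float v = run_bind_demo();
    assert(cudaGetLastError() == cudaSuccess);  // no runtime error recorded
    printf("element 42 read through the texture = %.1f\n", v);
    return 0;
}
```

The kernel never receives the input pointer as an argument; it reaches the data only through the name texIn.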

  16. tex1Dfetch() • A compiler intrinsic function. • Used inside the kernels to read from the texIn, texOut, and texConstSrc textures. • Fetches the texture value at a given index into a floating-point variable.
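As a sketch of how blend_kernel() might read through tex1Dfetch(): because texture references are file-scope names rather than kernel arguments, the kernel below takes a flag that selects which texture holds the current input. This again assumes the legacy texture reference API (removed in CUDA 12); the flag name dstOut, DIM, SPEED, and run_texture_blend_demo() are assumptions for illustration.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define DIM   256    // assumed grid side: 16 x 16 blocks of 16 x 16 threads
#define SPEED 0.25f  // assumed value for the rate constant k

texture<float> texIn;   // bound to the input buffer
texture<float> texOut;  // bound to the output buffer

// dstOut == true: read from texIn; false: read from texOut.
__global__ void blend_kernel(float *dst, bool dstOut) {
    int x = threadIdx.x + blockIdx.x * blockDim.x;
    int y = threadIdx.y + blockIdx.y * blockDim.y;
    int offset = x + y * blockDim.x * gridDim.x;
    int left = offset - 1, right = offset + 1;
    if (x == 0) left++;
    if (x == DIM - 1) right--;
    int top = offset - DIM, bottom = offset + DIM;
    if (y == 0) top += DIM;
    if (y == DIM - 1) bottom -= DIM;

    float t, l, c, r, b;
    if (dstOut) {
        t = tex1Dfetch(texIn, top);    l = tex1Dfetch(texIn, left);
        c = tex1Dfetch(texIn, offset); r = tex1Dfetch(texIn, right);
        b = tex1Dfetch(texIn, bottom);
    } else {
        t = tex1Dfetch(texOut, top);    l = tex1Dfetch(texOut, left);
        c = tex1Dfetch(texOut, offset); r = tex1Dfetch(texOut, right);
        b = tex1Dfetch(texOut, bottom);
    }
    dst[offset] = c + SPEED * (t + b + r + l - 4.0f * c);
}

// Hypothetical host helper: one step from a single hot cell; returns the
// new temperature of the cell to its right.
float run_texture_blend_demo() {
    const size_t bytes = DIM * DIM * sizeof(float);
    std::vector<float> h(DIM * DIM, 0.0f);
    const int hot = 128 + 128 * DIM;
    h[hot] = 100.0f;

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h.data(), bytes, cudaMemcpyHostToDevice);
    cudaBindTexture(NULL, texIn, d_in, bytes);
    cudaBindTexture(NULL, texOut, d_out, bytes);

    dim3 blocks(DIM / 16, DIM / 16), threads(16, 16);
    blend_kernel<<<blocks, threads>>>(d_out, true);  // read texIn, write d_out

    cudaMemcpy(h.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaUnbindTexture(texIn);
    cudaUnbindTexture(texOut);
    cudaFree(d_in);
    cudaFree(d_out);
    return h[hot + 1];
}

int main() {
    float t = run_texture_blend_demo();
    assert(cudaGetLastError() == cudaSuccess);  // no runtime error recorded
    printf("neighbor after one step = %.2f\n", t);
    return 0;
}
```

In a multi-step loop the host would toggle dstOut and exchange the destination pointer on every iteration instead of rebinding the textures.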

  17. copy_const_kernel()

  18. cudaUnbindTexture()

  19. USING 2D TEXTURES • The texture references are declared with a dimensionality of 2. • Instead of using an offset to calculate left, right, top, and bottom, we use x and y directly to access the texture.

  20. USING 2D TEXTURES • Out-of-bounds accesses at the edges of the grid are taken care of. • If x or y is less than zero, tex2D() will return the value at zero. • If one of these values is greater than the width, tex2D() will return the value at width minus 1.
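The clamping behavior can be illustrated with a short sketch that places a hot cell in the corner of the grid, so that two of its neighbor reads fall outside the texture. It assumes the legacy texture reference API (removed in CUDA 12) with its default clamp addressing, and a row pitch of DIM * sizeof(float) bytes that satisfies the device's texture pitch alignment. DIM, SPEED, blend_kernel_2d(), and run_tex2d_demo() are names chosen for this illustration.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define DIM   256    // assumed grid side; rows are 1024 bytes long
#define SPEED 0.25f  // assumed value for the rate constant k

// A two-dimensional texture reference, declared at file scope.
texture<float, 2> texIn;

__global__ void blend_kernel_2d(float *dst) {
    int x = threadIdx.x + blockIdx.x * blockDim.x;
    int y = threadIdx.y + blockIdx.y * blockDim.y;
    int offset = x + y * blockDim.x * gridDim.x;

    // No edge tests: out-of-range coordinates are clamped by the hardware.
    float t = tex2D(texIn, x, y - 1);
    float l = tex2D(texIn, x - 1, y);
    float c = tex2D(texIn, x, y);
    float r = tex2D(texIn, x + 1, y);
    float b = tex2D(texIn, x, y + 1);
    dst[offset] = c + SPEED * (t + b + r + l - 4.0f * c);
}

// Hypothetical host helper: one step with a hot cell at (0, 0); returns the
// corner's new temperature.
float run_tex2d_demo() {
    const size_t bytes = DIM * DIM * sizeof(float);
    std::vector<float> h(DIM * DIM, 0.0f);
    h[0] = 100.0f;

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h.data(), bytes, cudaMemcpyHostToDevice);

    // Describe the element type, then bind with width, height, and pitch.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaBindTexture2D(NULL, texIn, d_in, desc, DIM, DIM, sizeof(float) * DIM);

    dim3 blocks(DIM / 16, DIM / 16), threads(16, 16);
    blend_kernel_2d<<<blocks, threads>>>(d_out);

    cudaMemcpy(h.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaUnbindTexture(texIn);
    cudaFree(d_in);
    cudaFree(d_out);
    return h[0];
}

int main() {
    float t = run_tex2d_demo();
    assert(cudaGetLastError() == cudaSuccess);  // no runtime error recorded
    printf("corner cell after one step = %.2f\n", t);
    return 0;
}
```

For the corner cell, the reads at x - 1 and y - 1 are clamped back to the cell itself, which matches the edge handling that the one-dimensional version had to code by hand.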

  21. tex2D

  22. cudaBindTexture2D()

  23. Tradeoffs 1D vs 2D • From a performance standpoint, the decision between one- and two-dimensional textures is likely to be inconsequential. • For our particular application, the code is a little simpler when using two-dimensional textures because we happen to be simulating a two-dimensional domain. Since this is not always the case, we suggest you make the decision between one- and two-dimensional textures on a case-by-case basis.

  24. REFERENCES • http://http.developer.nvidia.com/Cg/tex1Dfetch.html • http://developer.download.nvidia.com/compute/cuda/4_1/rel/toolkit/docs/online/group__CUDART__HIGHLEVEL_g2aeb95eab6b9d90bb00b26406a27c515.html • http://developer.download.nvidia.com/compute/cuda/4_1/rel/toolkit/docs/online/group__CUDART__HIGHLEVEL_g67660ae3e9a1ff520575394f78087bea.html#g67660ae3e9a1ff520575394f78087bea

  25. THANK YOU…
