Texture Memory

TEXTURE MEMORY IN CUDA PERSPECTIVE • VINAY MANCHIRAJU

TEXTURE MEMORY • Read-only memory (from the kernel's point of view) available to CUDA programs. • Originally designed for the DirectX and OpenGL rendering pipelines. • Used in general-purpose computing for accuracy and efficiency.

WHY USE TEXTURES? • Texture caches handle accesses to non-consecutive memory locations well, unlike typical CPU caching schemes. • Designed to accelerate access patterns with spatial locality.

Texture memory is cached on chip. • Provides higher effective bandwidth. • Reduces memory requests to the off-chip DRAM. • Improves the performance of graphics applications whose memory access patterns exhibit a great deal of spatial locality.

PARALLELIZING PHYSICAL SIMULATIONS • Parallelization makes it practical to obtain more accurate results in less time. • Textures play a significant role in simulation problems.

HEAT SIMULATION EXAMPLE • A rectangular room divided into a grid of cells. • Heaters with fixed temperatures are scattered among the cells of the grid.

FLOW OF HEAT • Warmer cells tend to cool as heat dissipates to cooler regions, and cooler cells tend to warm up.

AS A FUNCTION OF HEAT LOSS/GAIN • Imagine that each cell has 4 neighbors. • k: the rate of heat flow from one cell to another. • A large value of k drives the system to a constant temperature quickly, while a small value lets the solution retain large temperature gradients longer.
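
The update equation itself did not survive in this transcript. A form consistent with the description above (four neighbors, rate constant k) would be the following; the subscript names are assumptions chosen for illustration:

```latex
T_{\text{new}} = T_{\text{old}} + k \sum_{n \in \{\text{top},\,\text{bottom},\,\text{left},\,\text{right}\}} \left( T_{n} - T_{\text{old}} \right)
```

With k = 0 the temperature never changes. For four neighbors the sum equals the total of the neighbor temperatures minus four times the old temperature.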

THREE STEPS TO COMPUTE TEMPERATURE UPDATES • Step 1, copy_const_kernel(): copy the heater temperatures into the corresponding cells of the input grid. This enforces the restriction that the temperatures of cells with heaters stay constant. • Step 2, blend_kernel(): calculate the output temperatures from the input temperatures of the grid using the update equation. • Step 3: swap the input and output buffers so that the output becomes the input of the next time step.

copy_const_kernel() • Convert threadIdx and blockIdx into an x and a y coordinate. • Compute a linear offset into the constant and input buffers. • If the cell in the constant grid is nonzero, copy the heater temperature in cptr[] to the input grid in iptr[].
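
The kernel source is not part of this transcript, so the steps above are sketched here as a plain C++ CPU stand-in, in which one loop iteration plays the role of one CUDA thread. The name copy_const_cpu and the row-major dim x dim layout are assumptions made for illustration; only cptr and iptr come from the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU stand-in for copy_const_kernel() (hypothetical name, not the slide's code).
// cptr holds the heater temperatures (0 means "no heater here"),
// iptr is the input temperature grid. Both are dim x dim, row-major (assumed).
void copy_const_cpu(std::vector<float>& iptr,
                    const std::vector<float>& cptr, int dim) {
    assert(iptr.size() == cptr.size());
    assert(iptr.size() == static_cast<std::size_t>(dim) * dim);
    for (int y = 0; y < dim; ++y) {
        for (int x = 0; x < dim; ++x) {
            // In a kernel, x and y would be derived from threadIdx and blockIdx.
            std::size_t offset = static_cast<std::size_t>(y) * dim + x;
            // Only cells that contain a heater are overwritten.
            if (cptr[offset] != 0.0f) {
                iptr[offset] = cptr[offset];
            }
        }
    }
}
```

Running this before every update step keeps heater cells pinned at their fixed temperature, whatever the previous step wrote into them.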

blend_kernel() • One thread for every cell. • The offsets of the neighbors in all 4 directions are computed to read the temperatures of those cells. • Each thread reads its cell's temperature and the temperatures of its neighboring cells, performs the update computation described earlier, and then updates its temperature with the new value. • The updated temperature is the old temperature plus the scaled differences between the neighboring cells' temperatures and the cell's own.
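
As a rough CPU illustration of the per-cell update, the sketch below loops over the grid where the kernel would launch one thread per cell. The name blend_cpu is hypothetical. Two things are assumptions rather than slide content: the exact form of the update (old temperature plus k times the sum of neighbor differences), and the edge handling, where a cell on the border uses itself in place of a missing neighbor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU stand-in for blend_kernel() (hypothetical name, not the slide's code).
// in and out are dim x dim, row-major; k is the heat-flow rate.
void blend_cpu(std::vector<float>& out, const std::vector<float>& in,
               int dim, float k) {
    assert(in.size() == static_cast<std::size_t>(dim) * dim);
    assert(out.size() == in.size());
    for (int y = 0; y < dim; ++y) {
        for (int x = 0; x < dim; ++x) {
            int offset = x + y * dim;
            // Neighbor offsets; at an edge the cell stands in for the
            // missing neighbor (assumed boundary treatment).
            int left   = (x == 0)       ? offset : offset - 1;
            int right  = (x == dim - 1) ? offset : offset + 1;
            int top    = (y == 0)       ? offset : offset - dim;
            int bottom = (y == dim - 1) ? offset : offset + dim;
            // Old temperature plus the scaled sum of neighbor differences.
            out[offset] = in[offset] +
                k * (in[top] + in[bottom] + in[left] + in[right]
                     - 4.0f * in[offset]);
        }
    }
}
```

Reading from one buffer and writing to another matters here: updating in place would let some cells see neighbors that have already advanced to the next time step.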

anim_kernel() • We use a DataBlock, which contains the constant buffer of heaters and the updated temperatures. • Arguments: a pointer to the DataBlock and the number of animation ticks that have elapsed (not used). • We use a 16 x 16 grid and blocks of 256 threads.

anim_kernel() • After each iteration we swap the input and output buffers, so that the output of one step becomes the input of the next and the final temperatures end up in the input buffer. • The temperatures are converted into colors and the bitmap image is transferred from the GPU to the CPU. • The program.
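
The buffer swap can be sketched on the CPU as follows. The helper run_steps and its step callback are hypothetical names introduced for illustration; on the GPU the same idea would exchange two device pointers instead of two vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Ping-pong iteration (hypothetical helper, not the slide's code).
// step(out, in) must write the next state into out while reading from in.
template <typename Step>
void run_steps(std::vector<float>& in, std::vector<float>& out,
               int steps, Step step) {
    assert(in.size() == out.size());
    for (int i = 0; i < steps; ++i) {
        step(out, in);       // compute the new temperatures
        std::swap(in, out);  // this step's output is the next step's input
    }
    // After the loop, the most recent result is in `in`.
}
```

Swapping two std::vector objects exchanges their internal pointers rather than copying elements, which mirrors the cheap pointer swap done between kernel launches.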

USING TEXTURES • Declare the inputs as texture references. • Use references to floating-point textures. • Allocate GPU memory for these textures and then bind the references to that memory using cudaBindTexture().

cudaBindTexture() • Tells the runtime to use the specified buffer as a texture and the texture reference as the texture's name. • See the cudaBindTexture() documentation listed in the references.

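tex1Dfetch() • A compiler intrinsic rather than an ordinary function. • Used in the kernels to read from the texIn, texOut, and texConstSrc textures. Texture references are declared globally, so they are not passed as kernel arguments. • Fetches the texture value at a given offset into a floating-point variable.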

copy_const_kernel()

cudaUnbindTexture()

USING 2D TEXTURES • Reference declaration: • Instead of using an offset to calculate left, right, top, and bottom, we use x and y directly to access the texture.

USING 2D TEXTURES • Out-of-bounds accesses at the edges of the grid are taken care of. • If x or y is less than zero, tex2D() returns the value at zero. • If x or y is greater than the width, tex2D() returns the value at width - 1.
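
The clamping behavior described above can be emulated on the CPU as in the sketch below. The function read_clamped is a hypothetical stand-in, not part of CUDA, and it assumes integer coordinates and a row-major grid.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// CPU emulation of a clamped 2D read (hypothetical helper, not a CUDA API).
// Coordinates outside the grid are pulled back to the nearest valid cell.
float read_clamped(const std::vector<float>& grid, int width, int height,
                   int x, int y) {
    assert(width > 0 && height > 0);
    assert(grid.size() == static_cast<std::size_t>(width) * height);
    int cx = std::min(std::max(x, 0), width - 1);   // clamp x to [0, width - 1]
    int cy = std::min(std::max(y, 0), height - 1);  // clamp y to [0, height - 1]
    return grid[static_cast<std::size_t>(cy) * width + cx];
}
```

With reads like this, the neighbor lookups at (x - 1, y), (x + 1, y), (x, y - 1), and (x, y + 1) need no explicit edge tests, which is what makes the 2D version of the kernel shorter.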

tex2D

cudaBindTexture2D()

TRADEOFFS: 1D VS 2D • From a performance standpoint, the decision between one- and two-dimensional textures is likely to be inconsequential. • For our particular application, the code is a little simpler when using two-dimensional textures because we happen to be simulating a two-dimensional domain. In general, since this is not always the case, we suggest making the decision between one- and two-dimensional textures on a case-by-case basis.

REFERENCES • http://http.developer.nvidia.com/Cg/tex1Dfetch.html • http://developer.download.nvidia.com/compute/cuda/4_1/rel/toolkit/docs/online/group__CUDART__HIGHLEVEL_g2aeb95eab6b9d90bb00b26406a27c515.html • http://developer.download.nvidia.com/compute/cuda/4_1/rel/toolkit/docs/online/group__CUDART__HIGHLEVEL_g67660ae3e9a1ff520575394f78087bea.html#g67660ae3e9a1ff520575394f78087bea

THANK YOU…
