Addressing 2-D grids with 3-D blocks Class Discussion Notes ITCS 6/8010 CUDA Programming, UNC-Charlotte, B. Wilkinson, March 5, 2011, 3-DBlocks.ppt
General Approach
Given two-dimensional addressing (row, column), use the general 2-D to 1-D flattening equation:
index = col + row * Ncol
where Ncol is the total number of columns in a row. Apply to 2-D addressing structures repeatedly if necessary.
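The flattening equation can be sketched as a small host-side C++ function. The name flatten2d and the use of std::size_t are assumptions made for illustration; they do not appear in the notes.

```cpp
#include <cassert>
#include <cstddef>

// Flatten a 2-D (row, col) address into a 1-D index.
// ncol is the total number of columns in a row (Ncol in the notes).
// flatten2d is a hypothetical helper name, not part of CUDA.
std::size_t flatten2d(std::size_t row, std::size_t col, std::size_t ncol) {
    assert(col < ncol);  // the column must lie inside the row
    return col + row * ncol;
}
```

For a 3 x 4 array (Ncol = 4), element (row 2, col 1) maps to 1 + 2 * 4 = 9. Applying the equation repeatedly means passing the result of one flatten2d call as the col or row argument of another.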
We have already considered 2-D grids and 2-D blocks
[Figure: a 2-D grid of blocks, indexed by blockIdx.x and blockIdx.y; each block is a 2-D array of threads, indexed by threadIdx.x and threadIdx.y.]
Global thread ID – one approach
Applicable when mapping a 2-D data array onto the grid. Determine the column and row of the chosen thread within the whole grid:
col = blockIdx.x*blockDim.x + threadIdx.x
row = blockIdx.y*blockDim.y + threadIdx.y
Then use:
ThreadID = col + row * N
where N is the number of columns of threads in the grid:
N = blockDim.x * gridDim.x
Substituting:
ThreadID = (blockIdx.x*blockDim.x + threadIdx.x) + (blockIdx.y*blockDim.y + threadIdx.y) * (blockDim.x * gridDim.x)
= blockIdx.x*blockDim.x + threadIdx.x + blockIdx.y*blockDim.y*blockDim.x*gridDim.x + threadIdx.y*blockDim.x*gridDim.x
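The row/column approach can be checked on the host without a GPU. The sketch below is plain C++ rather than a kernel: Dim2 is a hypothetical stand-in for CUDA's built-in index variables, and the function name is an assumption. Inside a real kernel the same arithmetic would use the built-in blockIdx, threadIdx, blockDim and gridDim directly.

```cpp
#include <cassert>

// Hypothetical host-side stand-in for the x and y fields of CUDA's
// built-in blockIdx / threadIdx / blockDim / gridDim variables.
struct Dim2 { unsigned x, y; };

// Global thread ID by locating the thread's column and row in the whole grid.
unsigned globalThreadIdRowCol(Dim2 blockIdx, Dim2 threadIdx,
                              Dim2 blockDim, Dim2 gridDim) {
    assert(threadIdx.x < blockDim.x && threadIdx.y < blockDim.y);
    assert(blockIdx.x < gridDim.x && blockIdx.y < gridDim.y);
    unsigned col = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned row = blockIdx.y * blockDim.y + threadIdx.y;
    unsigned n = blockDim.x * gridDim.x;  // columns of threads in the grid
    return col + row * n;
}
```

With a 2 x 2 grid of 3 x 2 blocks, N = 6. Thread (2, 1) in block (1, 1) has col = 5 and row = 3, so its ID is 5 + 3 * 6 = 23, the last of the 24 threads.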
Global thread ID - Another approach
Using the general 2-D to 1-D flattening equation:
index = column + row * Ncolumn
Block ID within grid:
blockID = blockIdx.x + blockIdx.y * gridDim.x
Thread ID within block:
BlockthreadID = threadIdx.x + threadIdx.y * blockDim.x
Then substitute BlockthreadID (as the column) and blockID (as the row) into the flattening equation again to get threadID, with the number of threads per block as Ncolumn:
threadID = BlockthreadID + blockID * (blockDim.x * blockDim.y)
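A host-side sketch of this block-then-thread approach follows. It is plain C++, not a kernel: Dim2 is a hypothetical stand-in for CUDA's built-in variables, and the final step assumes that the second use of the flattening equation takes the thread ID within the block as the column, the block ID as the row, and the number of threads per block as the column count.

```cpp
#include <cassert>

// Hypothetical host-side stand-in for the x and y fields of CUDA's
// built-in blockIdx / threadIdx / blockDim / gridDim variables.
struct Dim2 { unsigned x, y; };

// Global thread ID by flattening the block ID and the in-block thread ID.
unsigned globalThreadIdBlockMajor(Dim2 blockIdx, Dim2 threadIdx,
                                  Dim2 blockDim, Dim2 gridDim) {
    assert(threadIdx.x < blockDim.x && threadIdx.y < blockDim.y);
    assert(blockIdx.x < gridDim.x && blockIdx.y < gridDim.y);
    unsigned blockID = blockIdx.x + blockIdx.y * gridDim.x;
    unsigned blockThreadID = threadIdx.x + threadIdx.y * blockDim.x;
    unsigned threadsPerBlock = blockDim.x * blockDim.y;  // assumed Ncolumn
    return blockThreadID + blockID * threadsPerBlock;
}
```

Under this reading, IDs are contiguous within a block. With a 2 x 2 grid of 3 x 2 blocks, thread (0, 0) of block (1, 0) gets ID 6, whereas numbering threads by their row and column in the whole grid would give it ID 3. Both numberings are unique; they simply order the threads differently.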
2-D Grids and 3-D blocks
[Figure: a 2-D grid of blocks, indexed by blockIdx.x and blockIdx.y; each block is a 3-D array of threads, indexed by threadIdx.x, threadIdx.y and threadIdx.z.]
Global thread ID - One approach
From the previous case, we have the thread ID without considering the z direction. Call it threadIDxy. Using the general 2-D to 1-D flattening equation:
index = col + row * Ncol
with threadIdx.z as the column, threadIDxy as the row, and blockDim.z as Ncol:
threadID = threadIdx.z + threadIDxy * blockDim.z
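The 3-D block case can be sketched on the host in the same way. This is plain C++, not a kernel: Dim3 is a hypothetical stand-in for CUDA's built-in variables, the function name is an assumption, the z term is read as threadIdx.z, and threadIDxy is computed from the thread's row and column in the whole grid.

```cpp
#include <cassert>

// Hypothetical host-side stand-in for CUDA's built-in
// blockIdx / threadIdx / blockDim / gridDim variables.
struct Dim3 { unsigned x, y, z; };

// Global thread ID for a 2-D grid of 3-D blocks.
unsigned globalThreadId3dBlocks(Dim3 blockIdx, Dim3 threadIdx,
                                Dim3 blockDim, Dim3 gridDim) {
    assert(threadIdx.x < blockDim.x && threadIdx.y < blockDim.y);
    assert(threadIdx.z < blockDim.z);
    // Thread ID ignoring z: column and row of the thread in the grid.
    unsigned col = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned row = blockIdx.y * blockDim.y + threadIdx.y;
    unsigned threadIDxy = col + row * (blockDim.x * gridDim.x);
    // Flatten again: z is the column, threadIDxy the row, blockDim.z is Ncol.
    return threadIdx.z + threadIDxy * blockDim.z;
}
```

With a 2 x 2 grid of 3 x 2 x 4 blocks there are 96 threads. Thread (2, 1, 3) in block (1, 1) has threadIDxy = 23, so its ID is 3 + 23 * 4 = 95, the last one.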