Matrix Multiplication (i,j,k)

magar

Presentation Transcript


  1. Matrix Multiplication (i,j,k)

         for i = 1 to n do
           for j = 1 to n do
             for k = 1 to n do
               C[i,j] = C[i,j] + A[i,k] x B[k,j]
             endfor
           endfor
         endfor
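The (i,j,k) loop nest above can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: the function name `matmul_ijk` is hypothetical, the matrices are assumed to be square lists of rows, indices are 0-based rather than 1-based, and `C` is assumed to start at zero.

```python
def matmul_ijk(A, B):
    """Multiply square matrices A and B (lists of rows) in (i,j,k) loop order."""
    n = len(A)
    # Assumption: C starts at zero, so the accumulation yields the plain product.
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # The inner loop walks along row i of A and down column j of B.
                C[i][j] += A[i][k] * B[k][j]
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_ijk(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```

With `k` innermost, `B[k][j]` moves down a column of `B` on every iteration, which is the access pattern the later slides set out to improve.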

  2. (i,j,k) Memory Map (figure: C = A x B; C labelled with i and j, A with i and k, B with k and j)

  3. Scalar Architecture (figure labels: functional units, registers, cache memory, memory bus, main memory)

  4. Cache lines: matrix stored by rows, so the column index is the stride-1 dimension
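To make the stride-1 idea concrete, the sketch below computes where element (i, j) sits when a matrix is stored row by row. The helper name `row_major_offset` and the 0-based indexing are assumptions made for illustration only.

```python
def row_major_offset(i, j, n_cols):
    """Offset of element (i, j) in a matrix stored row by row (0-based indices)."""
    return i * n_cols + j


n = 4
# Moving along a row (j -> j + 1) changes the offset by 1: the stride-1 dimension.
stride_j = row_major_offset(0, 1, n) - row_major_offset(0, 0, n)
# Moving down a column (i -> i + 1) skips a whole row: stride n.
stride_i = row_major_offset(1, 0, n) - row_major_offset(0, 0, n)
print(stride_j, stride_i)  # 1 4
```

Consecutive elements of a row therefore share a cache line, while consecutive elements of a column are n elements apart.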

  5. Matrix Multiplication (i,k,j): Improve Spatial Locality

         for i = 1 to n do
           for k = 1 to n do
             for j = 1 to n do
               C[i,j] = C[i,j] + A[i,k] x B[k,j]
             endfor
           endfor
         endfor
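A possible Python rendering of the (i,k,j) order is shown below. It is a sketch under the same assumptions as before (hypothetical function name `matmul_ikj`, square lists of rows, 0-based indices, `C` starting at zero). Pure-Python lists do not expose cache lines directly, so the code illustrates the access order rather than a measured speedup.

```python
def matmul_ikj(A, B):
    """Multiply square matrices A and B (lists of rows) in (i,k,j) loop order."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]  # assumption: C starts at zero
    for i in range(n):
        row_c = C[i]
        for k in range(n):
            a_ik = A[i][k]  # constant for the whole inner loop
            row_b = B[k]
            for j in range(n):
                # Both row_c and row_b are traversed along the stride-1 dimension.
                row_c[j] += a_ik * row_b[j]
    return C


print(matmul_ikj([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Only the loop order differs from the (i,j,k) version; each `C[i][j]` still accumulates its products in increasing `k`, so the result is the same.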

  6. (i,k,j) Memory Map (figure: C = A x B; C labelled with i and j, A with i and k, B with k and j)

  7. Matrix Multiplication (i,k,j): Improve Temporal Locality

         C11 C12 C13     A11 A12 A13     B11 B12 B13
         C21 C22 C23  =  A21 A22 A23  x  B21 B22 B23
         C31 C32 C33     A31 A32 A33     B31 B32 B33

         C11 = A11 x B11 + A12 x B21 + A13 x B31

  8. Submatrix Multiplication (i,k,j)

         for it = 1 to n by s do
           for kt = 1 to n by s do
             for jt = 1 to n by s do
               for i = it to min(it+s-1,n) do
                 for k = kt to min(kt+s-1,n) do
                   for j = jt to min(jt+s-1,n) do
                     C[i,j] = C[i,j] + A[i,k] x B[k,j]
                   endfor
                 endfor
               endfor
             endfor
           endfor
         endfor
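The submatrix (tiled) loop nest above might look like this in Python. This is a sketch, with the function name `matmul_blocked` chosen for illustration; `s` is the tile size from the slide, indices are 0-based, and `C` is assumed to start at zero. The `min(...)` bounds handle the case where `n` is not a multiple of `s`.

```python
def matmul_blocked(A, B, s):
    """Multiply square matrices A and B using s-by-s tiles in (i,k,j) order."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]  # assumption: C starts at zero
    for it in range(0, n, s):          # tile row of C and A
        for kt in range(0, n, s):      # tile column of A, tile row of B
            for jt in range(0, n, s):  # tile column of C and B
                # Multiply one tile of A by one tile of B into one tile of C.
                for i in range(it, min(it + s, n)):
                    for k in range(kt, min(kt + s, n)):
                        a_ik = A[i][k]
                        for j in range(jt, min(jt + s, n)):
                            C[i][j] += a_ik * B[k][j]
    return C


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]
print(matmul_blocked(A, B, 2))  # n = 3 is not a multiple of s = 2
```

The three inner loops touch at most three s-by-s tiles, so choosing `s` so that those tiles fit in cache together is what lets the data be reused before it is evicted.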

  9. (i,k,j) Memory Map (figure: C = A x B with tiles of size s; C labelled with it and jt, A with it and kt, B with kt and jt)

  10. Multiprocessor Architecture (figure labels: CPU (x2), cache memory (x2), memory bus, main memory)

  11. Parallel (i,k,j): Inner loop

          for i = 1 to n do
            for k = 1 to n do
              parfor j = 1 to n do
                C[i,j] = C[i,j] + A[i,k] x B[k,j]
              endparfor
            endfor
          endfor
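One way to mimic the inner-loop `parfor` in Python is with a thread pool, as sketched below. The function name `matmul_parallel_inner`, the `workers` parameter, and the split of the `j` range into contiguous chunks are all assumptions for illustration. In standard CPython the global interpreter lock keeps these threads from running the arithmetic simultaneously, so the code shows the decomposition and its synchronisation points, not a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_parallel_inner(A, B, workers=2):
    """(i,k,j) product where only the innermost j loop is split across workers."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]  # assumption: C starts at zero
    chunk = max(1, (n + workers - 1) // workers)
    col_ranges = [(lo, min(lo + chunk, n)) for lo in range(0, n, chunk)]

    def update(i, k, lo, hi):
        # Each task owns a disjoint range of columns of row i of C.
        a_ik = A[i][k]
        for j in range(lo, hi):
            C[i][j] += a_ik * B[k][j]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for i in range(n):
            for k in range(n):
                # One parallel region per (i, k) pair: n * n fork/join steps.
                futures = [pool.submit(update, i, k, lo, hi) for lo, hi in col_ranges]
                for f in futures:
                    f.result()  # join before the next k updates the same row
    return C


print(matmul_parallel_inner([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The join inside the `k` loop is required because successive `k` iterations accumulate into the same elements of `C`, so this variant pays the fork/join cost n x n times.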

  12. Parallel (i,k,j): Inner loop memory mapping (figure: C = A x B, with labels i and k)

  13. Parallel (i,k,j): Outer loop

          parfor i = 1 to n do
            for k = 1 to n do
              for j = 1 to n do
                C[i,j] = C[i,j] + A[i,k] x B[k,j]
              endfor
            endfor
          endparfor
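The outer-loop `parfor` can be sketched in Python by giving each worker whole rows of `C`. The function name `matmul_parallel_outer` and the `workers` parameter are hypothetical, and the usual caveat applies: in standard CPython the global interpreter lock limits real parallelism for pure-Python loops, so this illustrates the work division only.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_parallel_outer(A, B, workers=2):
    """(i,k,j) product where the outermost i loop is split across workers."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]  # assumption: C starts at zero

    def compute_row(i):
        # Row i of C is written by exactly one task, so no locking is needed.
        row_c = C[i]
        for k in range(n):
            a_ik = A[i][k]
            row_b = B[k]
            for j in range(n):
                row_c[j] += a_ik * row_b[j]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every row and re-raises any worker exception.
        list(pool.map(compute_row, range(n)))
    return C


print(matmul_parallel_outer([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Here there is a single parallel region for the whole product: each task reads one row of `A` and all of `B`, and writes one row of `C`.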

  14. Parallel (i,k,j): Outer loop memory mapping (figure: C = A x B)

  15. Parallel (i,k,j): Submatrix

          parfor it = 1 to n by s do
            for kt = 1 to n by s do
              for jt = 1 to n by s do
                for i = it to min(it+s-1,n) do
                  for k = kt to min(kt+s-1,n) do
                    for j = jt to min(jt+s-1,n) do
                      C[i,j] = C[i,j] + A[i,k] x B[k,j]
                    endfor
                  endfor
                endfor
              endfor
            endfor
          endparfor
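Combining tiling with the outer `parfor` could be sketched as below, with one task per band of `s` rows. The function name `matmul_parallel_blocked` and the `workers` parameter are assumptions for illustration, and in standard CPython the global interpreter lock means the threads demonstrate the decomposition rather than a measured speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_parallel_blocked(A, B, s, workers=2):
    """Tiled (i,k,j) product where the outermost tile loop (it) runs in parallel."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]  # assumption: C starts at zero

    def compute_row_band(it):
        # This task owns rows it .. it+s-1 of C; no other task writes them.
        for kt in range(0, n, s):
            for jt in range(0, n, s):
                for i in range(it, min(it + s, n)):
                    for k in range(kt, min(kt + s, n)):
                        a_ik = A[i][k]
                        for j in range(jt, min(jt + s, n)):
                            C[i][j] += a_ik * B[k][j]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per row band; list() waits for all of them to finish.
        list(pool.map(compute_row_band, range(0, n, s)))
    return C


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]
print(matmul_parallel_blocked(A, B, 2))
```

The number of independent tasks is the number of row bands, roughly n / s, so a larger tile size improves reuse within a task but leaves fewer tasks to distribute.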
