Consider the matrix_add function shown below:

    void matrix_add(int a[128][128], int b[128][128], int c[128][128])
    {
        int i, j;
        for (i = 0; i < 128; i++)
            for (j = 0; j < 128; j++)
                c[i][j] = a[i][j] + b[i][j];
    }

In each execution of the loop body, three memory operations occur in the order a[i][j], b[i][j], c[i][j]. The processor has a 64 KB, 4-way set-associative L1 data cache with 64 B blocks, which uses write-back, write-allocate, and LRU replacement. The cache is initially empty. Arrays a, b, and c start at addresses 0x10000, 0x20000, and 0x30000, and the size of an int variable is 4 B.

Compute the L1 data cache miss rate of the matrix_add function. Compute the AMAT if the L1 data cache and memory access times are 2 and 100 cycles, respectively.
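One way to check a hand calculation for this problem is to replay the access stream through a small cache model. The sketch below is only an illustration under stated assumptions: the names cache_t, cache_access, matrix_add_miss_rate, and amat are hypothetical helpers, loads and stores are treated alike for hit/miss purposes (write-allocate), the 100-cycle memory access time is taken as the miss penalty added on top of the 2-cycle hit time, and the cost of writing back dirty blocks is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N          128    /* matrix dimension from the problem */
#define INT_SIZE   4u     /* bytes per int (given) */
#define BLOCK_SIZE 64u    /* bytes per cache block (given) */
#define WAYS       4u     /* associativity (given) */
#define SETS       256u   /* 64 KB / (64 B * 4 ways) */

/* Hypothetical minimal cache model: tags, valid bits, LRU timestamps. */
typedef struct {
    uint32_t tag[SETS][WAYS];
    uint64_t last_used[SETS][WAYS];
    int      valid[SETS][WAYS];
    uint64_t clock, accesses, misses;
} cache_t;

/* Simulate one access; with write-allocate a store miss fills a block
 * exactly like a load miss, so reads and writes share this path. */
void cache_access(cache_t *c, uint32_t addr)
{
    uint32_t block = addr / BLOCK_SIZE;
    uint32_t set   = block % SETS;
    uint32_t tag   = block / SETS;
    unsigned victim = 0;

    c->accesses++;
    c->clock++;
    for (unsigned w = 0; w < WAYS; w++) {
        if (c->valid[set][w] && c->tag[set][w] == tag) {
            c->last_used[set][w] = c->clock;   /* hit: refresh LRU age */
            return;
        }
    }
    c->misses++;
    /* Prefer an invalid way; otherwise evict the least recently used. */
    for (unsigned w = 0; w < WAYS; w++) {
        if (!c->valid[set][w]) { victim = w; break; }
        if (c->last_used[set][w] < c->last_used[set][victim]) victim = w;
    }
    c->valid[set][victim]     = 1;
    c->tag[set][victim]       = tag;
    c->last_used[set][victim] = c->clock;
}

/* Replay the a, b, c access order of matrix_add on an empty cache. */
double matrix_add_miss_rate(void)
{
    const uint32_t a = 0x10000, b = 0x20000, c_base = 0x30000;
    cache_t c;
    memset(&c, 0, sizeof c);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            uint32_t off = (uint32_t)(i * N + j) * INT_SIZE;
            cache_access(&c, a + off);
            cache_access(&c, b + off);
            cache_access(&c, c_base + off);
        }
    }
    assert(c.accesses == 3u * N * N);
    return (double)c.misses / (double)c.accesses;
}

/* AMAT = hit time + miss rate * miss penalty (assumed interpretation). */
double amat(double hit_time, double miss_rate, double miss_penalty)
{
    return hit_time + miss_rate * miss_penalty;
}
```

Under these assumptions, matrix_add_miss_rate() returns 0.0625 and amat(2.0, 0.0625, 100.0) returns 8.25 cycles. This agrees with the hand argument: a 64 B block holds 16 ints, the blocks of a, b, and c for the same (i, j) map to the same set but fit together in its 4 ways, so each group of 48 accesses incurs only the 3 compulsory misses. A different reading of "memory access time" (for example, counting write-backs) would change the AMAT figure.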
Consider the following instruction sequence:

    I0: Target: ADD R4, R1, R0
    I1:         SUB R9, R3, R4
    I2:         ADD R4, R5, R6
    I3:         LW  R2, 100(R3)
    I4:         LW  R2, 0(R2)
    I5:         SW  R2, 100(R4)
    I6:         AND R2, R2, R1
    I7:         BEQ R9, R1, Target
    I8:         AND R9, R9, R1

Case 1) No forwarding logic. Draw the pipeline diagram cycle by cycle. What is the final execution time of the code?

Case 2) Forwarding logic. Draw the pipeline diagram cycle by cycle. What is the final execution time of the code?
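The problem does not state the pipeline organization, so any cycle count depends on assumptions. The sketch below counts cycles for one pass through I0 to I8 under the following assumed model, none of which comes from the original: a classic 5-stage pipeline (IF, ID, EX, MEM, WB), stalls inserted in ID, a register file that writes in the first half of a cycle and reads in the second half, a branch that is not taken and costs no extra cycles, and, with forwarding, store data needed at EX (no MEM-to-MEM forwarding). The names instr_t, program, and total_cycles are hypothetical.

```c
#include <assert.h>

#define NONE      (-1)
#define MAX_INSTR 64

/* Hypothetical instruction record: only what hazard detection needs. */
typedef struct {
    int dst;          /* register written, or NONE */
    int src1, src2;   /* registers read, or NONE */
    int is_load;      /* 1 for LW, 0 otherwise */
} instr_t;

static const instr_t program[] = {
    {    4, 1,    0, 0 },   /* I0: ADD R4, R1, R0      */
    {    9, 3,    4, 0 },   /* I1: SUB R9, R3, R4      */
    {    4, 5,    6, 0 },   /* I2: ADD R4, R5, R6      */
    {    2, 3, NONE, 1 },   /* I3: LW  R2, 100(R3)     */
    {    2, 2, NONE, 1 },   /* I4: LW  R2, 0(R2)       */
    { NONE, 2,    4, 0 },   /* I5: SW  R2, 100(R4)     */
    {    2, 2,    1, 0 },   /* I6: AND R2, R2, R1      */
    { NONE, 9,    1, 0 },   /* I7: BEQ R9, R1, Target  */
    {    9, 9,    1, 0 },   /* I8: AND R9, R9, R1      */
};

/* Return the cycle in which the last instruction completes WB.
 * Cycle 1 is the IF of the first instruction, so its ID is cycle 2. */
int total_cycles(const instr_t *p, int n, int forwarding)
{
    int id[MAX_INSTR];   /* cycle in which each instruction occupies ID */

    assert(n > 0 && n <= MAX_INSTR);
    for (int k = 0; k < n; k++) {
        int earliest = (k == 0) ? 2 : id[k - 1] + 1;   /* in-order issue */
        for (int q = 0; q < k; q++) {
            int ready;
            if (p[q].dst == NONE)
                continue;
            if (p[q].dst != p[k].src1 && p[q].dst != p[k].src2)
                continue;                       /* no RAW dependence */
            if (!forwarding)
                ready = id[q] + 3;   /* consumer ID overlaps producer WB */
            else if (p[q].is_load)
                ready = id[q] + 2;   /* consumer EX follows producer MEM */
            else
                ready = id[q] + 1;   /* consumer EX follows producer EX */
            if (ready > earliest)
                earliest = ready;
        }
        id[k] = earliest;
    }
    return id[n - 1] + 3;   /* EX, MEM, WB follow ID without further stalls */
}
```

With this model, total_cycles(program, 9, 0) gives 19 cycles (two stall cycles each for I0 to I1, I3 to I4, and I4 to I5) and total_cycles(program, 9, 1) gives 15 cycles (one load-use stall each for I3 to I4 and I4 to I5). Changing any assumption shifts these numbers: for instance, a register file without split-cycle access adds a stall per dependence in Case 1, and a taken branch adds the branch penalty and a second pass through the loop.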