COSC 3330/6308 Solutions to Second Problem Set. Jehan-François Pâris, October 2012.
First problem • For each of the following four MIPS instructions, detail which actions are taken at each of its five steps. • Do not forget to mention how and during which steps each instruction updates the program counter. (4×10 points)
jalr $s0, $s1 • Fetch instruction and add 4 to PC • Read $s0 and save "somewhere" current value of PC • Transmit value of $s0 to PC either directly or through adder • This data path does not exist in the toy MIPS architecture we studied • Write saved PC value into register $s1 • This data path does not exist in the toy MIPS architecture we studied
jalr $s0, $s1 (other good answer) • This instruction cannot be implemented on the toy MIPS architecture because • There is no data path going from a read register line to the PC • Cannot set new value of PC to contents of $s0 • There is no data path going from the PC to the write register line • Cannot save "old" value of PC into register $s1
sw $s1, 24($t0) • Fetch instruction and add 4 to PC • Read registers $s1 and $t0 and sign-extend contents of displacement field of instruction • Compute memory address a by adding contents of register $t0 to sign-extended displacement • Store contents of register $s1 into memory address a
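The address computation in the sw steps above can be sketched in Python. This is only an illustration: the helper names sign_extend_16 and sw_effective_address and the sample value placed in $t0 are assumptions, not part of the problem set.

```python
def sign_extend_16(field: int) -> int:
    """Interpret the low 16 bits of an instruction field as a signed value."""
    field &= 0xFFFF
    return field - 0x10000 if field & 0x8000 else field


def sw_effective_address(base: int, displacement_field: int) -> int:
    """Address a = contents of base register + sign-extended displacement (32-bit wrap)."""
    return (base + sign_extend_16(displacement_field)) & 0xFFFFFFFF


# Hypothetical contents of $t0, chosen only for illustration.
t0 = 0x10008000
address = sw_effective_address(t0, 24)  # where sw $s1, 24($t0) would store $s1
```

A negative displacement works the same way: the field 0xFFFC sign-extends to -4, so the computed address lies four bytes below the base.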
slt $t0, $s3, $s4 • Fetch instruction and add 4 to PC • Read registers $s3 and $s4 • Compare values of $s3 and $s4 using ALU • Store comparison result into register $t0
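The comparison performed by the ALU for slt can be sketched as follows. The function names are illustrative assumptions; the sketch relies only on the fact that slt compares its operands as signed 32-bit integers and produces 1 or 0.

```python
def to_signed_32(value: int) -> int:
    """Interpret a 32-bit register pattern as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def slt(rs: int, rt: int) -> int:
    """Value written to the destination register: 1 if rs < rt (signed), else 0."""
    return 1 if to_signed_32(rs) < to_signed_32(rt) else 0
```

With this sketch, slt(0xFFFFFFFF, 0) yields 1 because the bit pattern 0xFFFFFFFF represents -1.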
jal 1048576 • Fetch instruction and add 4 to PC • Sign extend contents of displacement field of instruction • Save "somewhere" contents of PC • Multiply by four sign-extended contents of displacement field of instruction and replace 28 LSB of PC with new value • Write saved PC value into register $31 • This data path does not exist in the toy MIPS architecture we studied
jal 1048576 (other good answer) • This instruction cannot be implemented on the toy MIPS architecture because • There is no data path going from the PC to the write register line • Cannot save "old" value of PC into register $31
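The target computation described in the first jal answer (multiply the field by four and replace the 28 least significant bits of the PC) can be sketched as below. The function name, the sample PC value, and the choice to treat the field as an unsigned 26-bit value are assumptions made for illustration.

```python
def jal_target(pc_plus_4: int, address_field: int) -> int:
    """Replace the 28 least significant bits of PC+4 with the field times four."""
    low_28 = ((address_field & 0x3FFFFFF) << 2) & 0x0FFFFFFF
    return (pc_plus_4 & 0xF0000000) | low_28


# Hypothetical address of the jal instruction, chosen only for illustration.
pc = 0x00400020
return_address = pc + 4                # value the instruction must save in $31
target = jal_target(pc + 4, 1048576)   # new PC
```

Since 1048576 is 2 to the power 20, multiplying by four gives 2 to the power 22, so the new PC is 0x00400000 under these assumptions.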
Second problem • Consider these two potential additions to the MIPS instruction set and explain how they would restrict pipelining. (2×5 easy points) • cp d1(r1), d2(r2) • incr d2(r2)
cp d1(r1), d2(r2) • Copies the word at the address given by the contents of r2 plus offset d2 to the address given by the contents of r1 plus offset d1.
Answer (I) • Let us look at the steps the instruction will have to take: • Instruction fetch • Instruction decode and read register r2 • Use arithmetic unit to compute d2+[r2] • Access memory to read word at address d2+[r2]
Answer (II) • And it continues: • Save somewhere the value v that was just read • Read register r1 • Use arithmetic unit to compute d1+[r1] • Access memory to write value v at address d1+[r1] • The instruction reads a register twice, uses the ALU twice and accesses memory twice
incr d2(r2) • Adds one to the word at the address given by the contents of r2 plus offset d2
Answer (I) • Let us look at the steps the instruction will have to take: • Instruction fetch • Instruction decode and read register r2 • Use arithmetic unit to compute d2+[r2] • Store the address somewhere • Access memory to read word at address d2+[r2]
Answer (II) • And it continues: • Use arithmetic unit to increment by 1 the value that was just read • Access memory to write the incremented value at the previously saved address d2+[r2] • The instruction uses the ALU twice and accesses memory twice
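The reasoning in both answers amounts to counting how often each instruction needs the same hardware unit. The sketch below tallies this; the resource labels, the step wording, and the treatment of instruction memory and data memory as separate units are assumptions made for illustration, and the "save somewhere" steps are left out because they only use temporary storage.

```python
from collections import Counter

# (step, hardware unit used) pairs; the order of the two address computations
# in cp does not affect the tally.
CP_STEPS = [
    ("instruction fetch", "instruction memory"),
    ("decode and read first base register", "register file"),
    ("compute first address", "ALU"),
    ("read word from memory", "data memory"),
    ("read second base register", "register file"),
    ("compute second address", "ALU"),
    ("write word to memory", "data memory"),
]

INCR_STEPS = [
    ("instruction fetch", "instruction memory"),
    ("decode and read r2", "register file"),
    ("compute d2+[r2]", "ALU"),
    ("read word from memory", "data memory"),
    ("increment the value by 1", "ALU"),
    ("write word back to memory", "data memory"),
]


def repeated_resources(steps):
    """Return the units an instruction uses more than once, with their counts."""
    counts = Counter(resource for _, resource in steps)
    return {resource: n for resource, n in counts.items() if n > 1}


cp_conflicts = repeated_resources(CP_STEPS)
incr_conflicts = repeated_resources(INCR_STEPS)
```

Any unit that appears in these results would be needed by one instruction in two different cycles, which is what prevents a simple five-stage pipeline from overlapping it with its neighbors.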
Third problem • Explain how you would pipeline the four following pairs of statements. (4×5 points)
Part A • add $t0, $s0, $s1 • beq $s1, $s2, 300 • No data hazard!
Part A (with special unit) • It can be done as this step uses a different path than the previous instruction • Both solutions will get full credit
Part B • add $t2, $t0, $t1 • sw $t3, 36($t2) • Data hazard is avoided thanks to forwarding unit
Part B (without forwarding unit) • add $t2, $t0, $t1 • sw $t3, 36($t2) • Two cycles are lost (60% CREDIT)
Part C • add $t0, $s0, $s1 • beq $t0, $s2, 300 • Data hazard is avoided thanks to forwarding unit
Part C (with special unit) • add $t0, $s0, $s1 • beq $t0, $s2, 300 • It can be done as this step uses a different path than the previous instruction
Part C (without forwarding unit) • add $t0, $s0, $s1 • beq $t0, $s2, 300 • Two cycles are lost (60% CREDIT)
Part D • lw $t0, 24($t1) • sub $s2, $t0, $t1 • Data hazard is reduced by forwarding unit
Part D (without forwarding unit) • lw $t0, 24($t1) • sub $s2, $t0, $t1 • Three cycles are lost (STILL 60% CREDIT)
Fourth problem • A computer system has a two-level memory cache hierarchy. • L1 cache has a zero hit penalty, a miss penalty of 5 ns and a hit rate of 95 percent • L2 cache has a miss penalty of 100 ns and a hit rate of 90 percent.
Part A • How many cycles are lost by each instruction accessing the memory if the CPU clock rate is 2 GHz?
Answer (I) • Let P1 and P2 be respectively the miss penalties of caches L1 and L2 • Duration of clock cycle: 1/(2 GHz) = 0.5×10⁻⁹ s = 0.5 ns • Cache miss penalties: P1 = 2×5 = 10 cycles and P2 = 2×100 = 200 cycles
Answer (II) • Recall P1 = 10 cycles and P2 = 200 cycles • Let M1 and M2 be respectively the miss rates of caches L1 and L2 • We have M1 = 0.05 and M2 = 0.10 • Number of lost cycles/instruction: M1×P1 + M1×M2×P2 = 0.05×10 + 0.05×0.10×200 = 0.5 + 1 = 1.5
Hint • Use fractions to reduce risk of error
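Following the hint, the Part A computation can be checked with exact fractions. This is a minimal sketch using Python's standard fractions module; the function and variable names are illustrative assumptions.

```python
from fractions import Fraction


def lost_cycles(m1, p1, m2, p2):
    """Lost cycles per memory-accessing instruction: M1*P1 + M1*M2*P2."""
    return m1 * p1 + m1 * m2 * p2


cycle_ns = Fraction(1, 2)          # 2 GHz clock, so one cycle lasts 0.5 ns
p1 = Fraction(5) / cycle_ns        # L1 miss penalty in cycles
p2 = Fraction(100) / cycle_ns      # L2 miss penalty in cycles
m1 = 1 - Fraction(95, 100)         # L1 miss rate
m2 = 1 - Fraction(90, 100)         # L2 miss rate

baseline = lost_cycles(m1, p1, m2, p2)
```

Working with Fraction avoids the rounding noise that binary floating point would introduce in values such as 0.05 and 0.10.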
Part B • We can either increase the hit rate of the topmost cache to 98 percent or increase the hit rate of the second cache to 95 percent. • Which improvement would have more impact? (10 points)
Better L1 cache • Recall P1 = 10 cycles and P2 = 200 cycles • We now have M1 = 0.02 and M2 = 0.10 • Number of lost cycles/instruction: M1×P1 + M1×M2×P2 = 0.02×10 + 0.02×0.10×200 = 0.2 + 0.4 = 0.6
Better L2 cache • Recall P1 = 10 cycles and P2 = 200 cycles • We now have M1 = 0.05 and M2 = 0.05 • Number of lost cycles/instruction: M1×P1 + M1×M2×P2 = 0.05×10 + 0.05×0.05×200 = 0.5 + 0.5 = 1
Answer • For the values of M1, P1, M2 and P2 we considered • Improving the hit ratio of the L1 cache provides the best speedup
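The comparison between the two improvements can be reproduced with a short script. It is a sketch that reuses the penalties and hit rates stated in the problem; the names are illustrative assumptions.

```python
from fractions import Fraction

P1, P2 = 10, 200   # miss penalties in cycles, from Part A


def lost_cycles(l1_hit_rate, l2_hit_rate):
    """Lost cycles per memory-accessing instruction for the given hit rates."""
    m1 = 1 - l1_hit_rate
    m2 = 1 - l2_hit_rate
    return m1 * P1 + m1 * m2 * P2


baseline = lost_cycles(Fraction(95, 100), Fraction(90, 100))
better_l1 = lost_cycles(Fraction(98, 100), Fraction(90, 100))
better_l2 = lost_cycles(Fraction(95, 100), Fraction(95, 100))
```

The L1 miss rate multiplies both terms of the formula, which is why lowering it has the larger effect for these parameter values.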
Fifth problem • A virtual memory system has • A virtual address space of 4 Gigabytes • a page size of 8 Kilobytes. • Each page table entry occupies 4 bytes.
Part A • How many bits remain unchanged during the address translation? (5 points)
Answer • Page size is 8 KB = 2³ × 2¹⁰ = 2¹³ bytes • The last 13 bits of each address will remain unchanged during the address translation
Part B • How many bits are used for the page number? (5 points)
Answer • Address space is 4 GB = 2² × 2³⁰ = 2³² bytes • Will have 32-bit addresses • The page number will occupy the 32 – 13 = 19 most significant bits of the address
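Reminder • [Figure: a virtual address (32 or 64 bits) is split into a page number and an offset. The page number is used to find the right page frame number; the offset is copied unmodified. The result is the page frame number followed by the offset.]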
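The translation recalled on this slide can be sketched for the 13-bit offset of this problem. The page table contents and the function name are hypothetical and serve only to illustrate the mechanism.

```python
OFFSET_BITS = 13                       # 8 KB pages
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def translate(virtual_address: int, page_table: dict) -> int:
    """Replace the page number by its page frame number; keep the offset."""
    page_number = virtual_address >> OFFSET_BITS
    offset = virtual_address & OFFSET_MASK
    frame_number = page_table[page_number]   # lookup indexed by page number
    return (frame_number << OFFSET_BITS) | offset


# Hypothetical mapping: virtual page 3 resides in page frame 7.
page_table = {3: 7}
physical = translate((3 << OFFSET_BITS) | 0x123, page_table)
```

The low 13 bits of the result are the same 0x123 that appeared in the virtual address, which is the sense in which the offset is copied unmodified.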
Part C • What is the maximum number of page table entries in a page table? (5 points) • Page number occupies 19 bits • Can have 2¹⁹ pages in a process • Page tables will have 2¹⁹ entries
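The bit counts for Parts A to C follow mechanically from the two sizes given in the problem. The sketch below recomputes them; the helper log2_exact and the variable names are illustrative assumptions.

```python
def log2_exact(n: int) -> int:
    """Return log2(n) for an exact power of two, otherwise raise ValueError."""
    if n <= 0 or n & (n - 1):
        raise ValueError("not a power of two")
    return n.bit_length() - 1


PAGE_SIZE = 8 * 1024            # 8 KB
ADDRESS_SPACE = 4 * 1024 ** 3   # 4 GB

offset_bits = log2_exact(PAGE_SIZE)              # Part A: bits left unchanged
address_bits = log2_exact(ADDRESS_SPACE)         # width of a virtual address
page_number_bits = address_bits - offset_bits    # Part B
max_entries = 1 << page_number_bits              # Part C
```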