10/13: Lecture Topics
• Data Hazards
• Control Hazards

Grading Disputes
• Bring grading disputes to me up to one week after an assignment is handed back
• We try to be very fair when grading
• Please don’t beg for one point here or one point there
  - each hw counts 5% of your grade
  - hw’s are out of ~70 points
  - each hw point is only 0.07% of your final grade (or 0.0028 grade points)
• I will add 0.028 to everyone’s final grade if you don’t dispute 1 or 2 point grading issues
• Exams are worth more and I will be more tolerant of begging

Pipelined Throughput and Latency
[Figure: pipeline diagram over cycles 1 to 9; inst 1 through inst 5 each pass through IF, ID, EX, MEM, WB, each instruction starting one cycle after the previous one]
• What’s the throughput of this implementation?
• What’s the latency of this implementation?

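The two questions on this slide come down to simple arithmetic. The sketch below assumes an ideal pipeline with one cycle per stage and no stalls; the function name `pipeline_timing` is made up for illustration and does not come from the lecture.

```python
def pipeline_timing(num_instructions, num_stages=5):
    """Timing of an ideal pipeline (assumption: no stalls, one cycle per stage).

    Returns (total_cycles, latency_cycles, instructions_per_cycle).
    """
    # The first instruction needs num_stages cycles; each later one
    # finishes exactly one cycle after its predecessor.
    total_cycles = num_stages + (num_instructions - 1)
    latency_cycles = num_stages  # time for any single instruction
    throughput = num_instructions / total_cycles
    return total_cycles, latency_cycles, throughput


total, latency, ipc = pipeline_timing(5)
print(total, latency, round(ipc, 3))  # 9 5 0.556
```

With five instructions the total matches the nine cycles in the diagram. As the instruction count grows, the throughput of this ideal model approaches one instruction per cycle, while the latency of each instruction stays at five cycles.
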
Data Hazards
• What happens in the following code?
      add $s0, $s1, $s2
      add $s4, $s3, $s0
[Figure: pipeline diagram of the two add instructions (IF ID EX MEM WB each), annotated "$s0 is read here" on the second instruction and "$s0 is written here" on the first]
• This is called a data dependency
• When it causes a pipeline stall it is called a data hazard

Solution: Forwarding
      add $s0, $s1, $s2
      add $s4, $s3, $s0
[Figure: pipeline diagram of the two add instructions, IF ID EX MEM WB each]
• The value of $s0 is known after cycle 3 (after the first instruction’s EX stage)
• The value of $s0 isn’t needed until cycle 4 (before the second instruction’s EX stage)
• If we forward the result there isn’t a stall

Another data hazard
      lw  $s0, 0($s2)
      add $s4, $s3, $s0
[Figure: pipeline diagram of the lw and add instructions, IF ID EX MEM WB each]
• What if the first instruction is lw?
• $s0 isn’t known until after the MEM stage
• We can’t forward back into the past
• Either stall or reorder instructions

Solutions to the lw hazard
• We can stall for one cycle, but we hate to stall
[Figure: pipeline diagram of lw $s0,0($s2) followed by add $s4,$s3,$s0, with a one-cycle stall before the add]
• Try to execute an unrelated instruction between the two instructions
      lw  $s0, 0($s2)
      sub $t4, $t2, $t3
      add $s4, $s3, $s0
[Figure: pipeline diagram of the reordered sequence, with sub $t4,$t2,$t3 moved between the lw and the add]

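The effect of reordering can be sketched in a few lines. This is a simplified model, not the lecture's own code: it assumes full forwarding, so the only remaining data stall is a lw whose result is used by the very next instruction. The tuple layout and the name `count_load_use_stalls` are invented here, and the "original" order (lw, add, sub) is an assumption about where the sub started out.

```python
def count_load_use_stalls(program):
    """Count one-cycle stalls in a 5-stage pipeline with forwarding.

    program is a list of (opcode, destination, sources) tuples.
    Assumption: the only stall is a load followed immediately by an
    instruction that reads the loaded register.
    """
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, curr_sources = curr
        if prev_op == "lw" and prev_dest in curr_sources:
            stalls += 1
    return stalls


original = [
    ("lw", "$s0", ["$s2"]),
    ("add", "$s4", ["$s3", "$s0"]),
    ("sub", "$t4", ["$t2", "$t3"]),
]
# Move the unrelated sub between the lw and the dependent add.
reordered = [original[0], original[2], original[1]]

print(count_load_use_stalls(original))   # 1
print(count_load_use_stalls(reordered))  # 0
```

The reordering is legal here only because the sub neither reads nor writes any register used by the lw or the add.
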
Reordering Instructions
• Reordering instructions is a common technique for avoiding pipeline stalls
• Sometimes the compiler does the reordering statically
• Almost all modern processors do this reordering dynamically
  - they can see several instructions ahead and execute any one that has no outstanding dependency
  - this is known as out-of-order execution and is very complicated to implement

Structural Hazards
• Instructions in different stages want to use the same resource
• Suppose a lw instruction is in stage four (memory access)
• Meanwhile, an add instruction is in stage one (instruction fetch)
• Both of these actions require access to memory; they could collide
• Add more hardware to eliminate the problem
• Or stall (cheaper & easier), not usually done

Control Hazards
      bne $s0, $s1, next
      add $s4, $s3, $s0
      ...
next: sub $s4, $s3, $s0
[Figure: pipeline diagram of the bne and the instruction after it, annotated "do we fetch add or sub?" and "we don’t know until here"]
• Branch instructions cause control hazards (aka branch hazards) because we don’t know which instruction to execute next

Solution: Stall
• We can stall to see which instruction to execute next
[Figure: pipeline diagram of bne $s0, $s1, next, then a stall, then sub $s4, $s3, $s0]
• But we hate to stall

Solution: Move Branch to ID
• Move the branch hardware to the ID stage
• Hardware to compare two registers is simpler than hardware to add them (i.e. EX stage hardware)
[Figure: pipeline diagram of bne $s0, $s1, next, then a one-cycle stall, then sub $s4, $s3, $s0]
• We still have to stall for one cycle
• But we can’t move the branch up any further

Branch Delay Slot
• A branch now causes a stall of one cycle
• Try to execute an instruction instead of stalling
• The compiler must find an instruction to fill the branch delay slot
• 50% of the instructions in delay slots are useful
• 50% are nop’s (no ops) which don’t do anything
• Might have been a good idea originally but not any more

Branch Delay Slot Example
• “addi $t0,$t0,1” will always execute, because it sits in the branch delay slot
      move $t0,$zero
      bne  $s0,$zero,Done
      addi $t0,$t0,1
      addi $t0,$t0,3
Done: move $t1,$t0
• Instructions executed when the branch is not taken:
      move $t0,$zero
      bne  $s0,$zero,Done
      addi $t0,$t0,1
      addi $t0,$t0,3
      move $t1,$t0
• Instructions executed when the branch is taken:
      move $t0,$zero
      bne  $s0,$zero,Done
      addi $t0,$t0,1
      move $t1,$t0

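To see what the delay slot does to the result, the example can be traced by hand. The sketch below is a direct Python transliteration of the five instructions, assuming the one-instruction delay slot described on the slide; the function name `run_with_delay_slot` is hypothetical.

```python
def run_with_delay_slot(s0):
    """Trace the slide's example and return the final value of $t1."""
    t0 = 0              # move $t0,$zero
    taken = s0 != 0     # bne $s0,$zero,Done (outcome decided here)
    t0 += 1             # addi $t0,$t0,1 : delay slot, always executes
    if not taken:
        t0 += 3         # addi $t0,$t0,3 : skipped when the branch is taken
    t1 = t0             # Done: move $t1,$t0
    return t1


print(run_with_delay_slot(0))  # 4  (branch not taken)
print(run_with_delay_slot(7))  # 1  (branch taken)
```

Without a delay slot, the taken path would leave 0 in $t1 instead of 1, which is why the compiler has to be careful about what it places after a branch.
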
Solution: Speculate
• Execute the following instructions, assuming the branch is taken (or not taken)
• If we guessed right, then let the instructions proceed
• If we guessed wrong, then squash the partially completed instructions
• This is called flushing the pipeline
• These instructions were wasted, but we would have stalled otherwise
• Never let a speculating instruction write to memory or a register until we’re sure it should execute
• This is known as speculative execution

Speculate Never Taken
• Assume the branch isn’t taken and fetch the next instruction
      bne  $s0,$zero,Done
      addi $t0,$t0,1
      addi $t0,$t0,3
Done: move $t1,$t0
[Figure: two pipeline diagrams. Branch not taken: bne, addi, addi. Branch taken: bne, addi (SQUASH), move]
• Predicting taken is actually better, but still not good enough

Static Branch Prediction
• Most backward branches are taken (80%)
  - they are part of loops
• Forward branches are taken about half the time
  - if statements
• A common static branch prediction scheme is to predict
  - backward branches are taken
  - forward branches are not taken
• Some architectures allow the compiler to specify in the branch instruction whether to predict taken or not taken
• This does okay (70-80% accurate), but still not good enough

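The backward-taken, forward-not-taken rule needs nothing but the branch address and its target. A minimal sketch follows; the function name and the example addresses are made up for illustration.

```python
def predict_taken_static(branch_pc, target_pc):
    """Static prediction: backward branches taken, forward branches not taken.

    A branch is "backward" when its target address is lower than the
    address of the branch itself (typical of a loop's closing branch).
    """
    return target_pc < branch_pc


# Hypothetical addresses: a loop branch jumping back 32 bytes,
# and an if-statement branch jumping forward 16 bytes.
print(predict_taken_static(0x00400120, 0x00400100))  # True
print(predict_taken_static(0x00400120, 0x00400130))  # False
```
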
Dynamic Branch Prediction
• In most programs you execute the same instructions over and over
• You encounter the same branch instructions over and over
• The same branch instruction is usually
  - taken if it was taken last time
  - not taken if it was not taken last time
• If we keep a history of each branch instruction, then we can predict much better

Dynamic Branch Prediction
• A table kept on the CPU holds this branch history
• There is not room for an entry for every instruction
  - the last few bits of the instruction’s address index this table
  - some instructions collide, as in a hash table
  - usually store 2 bits per entry
• Dynamic branch prediction is 92-98% accurate

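One common way to use 2 bits per entry is a saturating counter, sketched below. Several details are assumptions rather than anything stated on the slide: the table is indexed by low bits of the branch address, counters start at "weakly not taken", and the class and function names are invented for this example.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters (0-1 predict not taken, 2-3 taken)."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # Assumption: every entry starts at 1 (weakly not taken).
        self.counters = [1] * (1 << index_bits)

    def _index(self, pc):
        # Drop the two low bits (4-byte aligned instructions), keep a few
        # more. Different branches can map to the same entry, as in a
        # hash table.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes predicted correctly for one branch."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)


# A loop branch: taken 9 times, then not taken once, repeated 10 times.
outcomes = ([True] * 9 + [False]) * 10
print(accuracy(TwoBitPredictor(), 0x00400100, outcomes))  # 0.89
```

The two-bit counter mispredicts only the loop exit on each pass (plus one warm-up miss), because a single not-taken outcome is not enough to flip the prediction.
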
Importance of Branch Prediction
• Branches occur every five instructions
• Today’s processors execute up to 4 instructions per cycle
• A branch occurs every 2 cycles
• Pipelines are longer than in MIPS (8, 9, 11, 13 cycles)
  - branch mispredict penalty is 3-5 cycles instead of 1 cycle
• Must predict accurately or you execute < 0.5 instructions per cycle instead of 4 instructions

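A back-of-the-envelope model shows how strongly throughput depends on prediction accuracy. This is a deliberately simplified sketch and is not meant to reproduce the exact figures on the slide: it assumes the only lost cycles are misprediction penalties, and the function name and parameters are invented here.

```python
def effective_ipc(accuracy, peak_ipc=4, branch_every=5, penalty_cycles=5):
    """Estimate instructions per cycle under a simple misprediction model.

    Assumptions: the machine sustains peak_ipc when nothing goes wrong,
    one instruction in branch_every is a branch, and each mispredicted
    branch costs penalty_cycles extra cycles.
    """
    base_cpi = 1 / peak_ipc
    mispredicts_per_instruction = (1 - accuracy) / branch_every
    cpi = base_cpi + mispredicts_per_instruction * penalty_cycles
    return 1 / cpi


for acc in (1.0, 0.95, 0.75):
    print(acc, round(effective_ipc(acc), 2))
# 1.0 4.0
# 0.95 3.33
# 0.75 2.0
```

Even in this optimistic model, dropping from 95% to 75% accuracy costs about 40% of the achieved throughput; longer penalties and wider machines make the gap larger.
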
Exceptions and Interrupts
• So far, we’ve assumed that the assembled code can always be executed
• Lots of ways for unexpected things to happen:
  - Undefined instruction
  - Arithmetic overflow
  - System call
  - I/O device request

Exceptions
• An exception is an internal event
• The unexpected condition was caused by something the program did
• Undefined instructions and arithmetic overflows are examples
• If you ran the program again, the exception would (probably) happen again at the same point in the program’s execution

Interrupts
• An interrupt is an external event
• The unexpected condition was not caused by the program
• An I/O device request is an example
• If you ran the program again, the interrupt would probably not happen at the same point

What should happen?
• These events result in an unnatural change in the flow of control
• Normally, the next instruction executed is ________
• When one of these events takes place, something else happens
• The system must respond to the event
• The response depends on the type of event

Exception Handling
• Loosely, the following steps are taken:
  1. Save the address of the offending instruction in a register
  2. Make the reason for the exception known
     - Set the value of the status register, or
     - Use vectored interrupts to do step 3
  3. Transfer control to the operating system
  4. Operating system decides what to do:
     - May report the error to the user
     - May terminate the program

Exception/Pipelining Interface
• Suppose an add instruction overflows, causing an exception
• Instructions after the add are already in the pipeline
• The partially computed instructions must be flushed
• Exception must be caught before register contents have changed
• Pipeline designers must be wary of exception handling