1. COMP 206: Computer Architecture and Implementation
Montek Singh
Wed, Sep 28, 2005
Topic: Pipelining -- Intermediate Concepts
(Multicycle Operations; Exceptions)
2. Outline
Multi-cycle operations
Floating-point operations
Structural and data hazards
Interrupts, Faults and Exceptions
Precise exceptions
Complications in pipelines
READING: Appendix A
3. Pipelining Multicycle Operations
Assume a five-stage pipeline
Third stage (execution) has two functional units E1 and E2
Instruction goes through either E1 or E2, but not both
E1 and E2 are not pipelined
Stage delay of E1 = 2 cycles
Stage delay of E2 = 4 cycles
No buffering on inputs of E1 and E2
Stage delay of other stages = 1 cycle
Consider an instruction sequence of five instructions
Instructions 1, 3, 5 need E1
Instructions 2, 4 need E2
4. Space-Time Diagram: Multicycle Operations
Out-of-order completion
Instruction 3 finishes before instruction 2, and instruction 5 finishes before instruction 4
Instructions may be delayed after entering the pipeline because of structural hazards
Instructions 2 and 4 both want to use E2 unit at same time
Instruction 4 stalls in ID unit
This causes instruction 5 to stall in IF unit
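The space-time diagram for this sequence can be reproduced with a short scheduling sketch. It assumes in-order issue, one instruction per stage, no buffering in front of E1 and E2, and no contention for MEM or WB (none arises for this particular sequence). The function and field names are illustrative, not taken from the slides.

```python
# Assumed model: in-order issue, one instruction per stage, unpipelined
# E1/E2 with no input buffering, and no MEM/WB conflicts.
LATENCY = {"E1": 2, "E2": 4}  # stage delay of each execution unit, in cycles

def schedule(units):
    """Return stage timing for each instruction, given in program order."""
    unit_free = {u: 1 for u in LATENCY}  # first cycle each unit is free
    if_free = id_free = 1                # first cycle IF / ID is free
    rows = []
    for unit in units:
        if_start = if_free
        id_start = max(if_start + 1, id_free)           # held in IF while ID is occupied
        ex_start = max(id_start + 1, unit_free[unit])   # held in ID while the unit is busy
        ex_end = ex_start + LATENCY[unit] - 1
        rows.append({
            "IF": if_start, "ID": id_start, "EX": (ex_start, ex_end),
            "MEM": ex_end + 1, "WB": ex_end + 2,
            "if_stall": id_start - if_start - 1,
            "id_stall": ex_start - id_start - 1,
        })
        if_free, id_free = id_start, ex_start
        unit_free[unit] = ex_end + 1
    return rows

# Instructions 1, 3, 5 use E1; instructions 2, 4 use E2.
rows = schedule(["E1", "E2", "E1", "E2", "E1"])
for n, r in enumerate(rows, start=1):
    print(n, r)
```

Under these assumptions the WB cycles come out as 6, 9, 8, 13, 12: instruction 3 completes before 2 and instruction 5 before 4, with instruction 4 held two cycles in ID and instruction 5 held two cycles in IF.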
5. Floating-Point Operations in MIPS
6. Structural Hazard on WB Unit
This is the worst-case scenario: the maximum steady-state number of write ports needed is 1
Don’t replicate resources; detect and serialize access as needed
Early resolution
Track use of WB in the ID stage with a shift register (a reservation register) and stall conflicting instructions there
Simplifies pipeline control; all stalls occur in ID
Adds the shift register and write-conflict logic
Late resolution
Stall instructions at entry to MEM or WB stage
Complicates pipeline control (two stall locations)
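The early-resolution scheme can be sketched as a small shift register. This is a minimal model under assumed conventions: bit k means the write port is reserved k cycles from now, and the class name, method names, and depth of 8 are hypothetical.

```python
class WriteReservation:
    """Shift register: bit k set means the WB port is taken k cycles from now."""

    def __init__(self, depth=8):          # depth is an assumed maximum latency
        self.bits = [False] * depth

    def tick(self):
        # One clock: every reservation moves one cycle closer to WB.
        self.bits = self.bits[1:] + [False]

    def try_issue(self, cycles_to_wb):
        # Called in ID; returns False when the instruction must stall there.
        if self.bits[cycles_to_wb]:
            return False
        self.bits[cycles_to_wb] = True
        return True

port = WriteReservation()
port.try_issue(7)                # long FP operation reserves WB seven cycles ahead
for _ in range(3):
    port.tick()
first_try = port.try_issue(4)    # would reach WB in the same cycle, so it stalls
port.tick()
second_try = port.try_issue(4)   # one cycle later the slot is free
print(first_try, second_try)
```

Because the check happens before the instruction leaves ID, the only stall point is ID, which is the simplification the early-resolution bullets describe.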
7. WAW Hazards
WAW hazard arises only when no instruction between ADD.D and L.D uses the result computed by ADD.D
Adding an instruction like “ADD.D F8,F2,F4” before L.D would stall pipeline enough for RAW hazard to avoid WAW hazard
Can happen through a branch/trap (example in HP3, Section A.9)
Rare situation, but must still handle correctly
Hazard resolution (two alternatives)
Delay the issue of L.D until ADD.D enters MEM
Or cancel the write of ADD.D
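A WAW hazard of this kind can be detected by comparing the cycles in which two instructions write the same register. The sketch below uses assumed latencies (counted from ID to the register write) and computes the minimum delay that puts the writes back in program order; it is not the slide's "wait until ADD.D enters MEM" rule, and the names are hypothetical.

```python
# Assumed latencies, counted from the ID stage to the register write.
CYCLES_TO_WRITE = {"ADD.D": 6, "L.D": 3}

def write_cycle(op, issue_cycle):
    return issue_cycle + CYCLES_TO_WRITE[op]

def waw_stall(earlier, later):
    """Stall cycles needed so that `later` writes after `earlier`.

    Each argument is (op, dest_register, issue_cycle)."""
    (op1, dest1, t1), (op2, dest2, t2) = earlier, later
    if dest1 != dest2:
        return 0                      # different destinations: no WAW hazard
    gap = write_cycle(op1, t1) - write_cycle(op2, t2)
    return max(0, gap + 1)            # later write must land strictly afterwards

stall = waw_stall(("ADD.D", "F2", 1), ("L.D", "F2", 2))
print(stall)
```

With these numbers ADD.D would write F2 in cycle 7 and L.D in cycle 5, so L.D has to be held back until its write falls after cycle 7.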
8. RAW Hazards
Longer delays of FP operations increase the number of stalls in response to RAW hazards
Two methods for reducing stalls
Compiler could have moved instruction D between instructions M and A, which would allow D to complete earlier; or hardware could detect this possibility and issue instruction D out of order
ID stage is a bottleneck because instructions wait there for their operands to be available; could add buffers (reservation stations) to functional units and let instructions await their operands there
9. Responsibilities of ID (all stalls in ID)
Three sets of checks
Structural hazards
Check for availability of FP unit
Ensure WB unit will be available when needed
RAW hazards
Stall the current instruction while any of its source registers is listed as a pending destination in a pipeline register whose result will not be available when the current instruction needs it
WAW hazards
If any instruction in adder, divider, or multiplier has same register destination as current instruction, stall current instruction
Hazards between FP and integer instructions
Integer and FP instructions use disjoint sets of registers, except for FP-integer register moves
FP load-stores can conflict with integer load-stores in MEM stage
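The three sets of checks can be written as one issue predicate evaluated in ID. This is a simplified sketch: the `InFlight` record, the `cycles_left` field, and the `wb_reserved` set are hypothetical bookkeeping, and the RAW test treats any unfinished producer as unavailable rather than modelling forwarding timing exactly.

```python
from dataclasses import dataclass

@dataclass
class InFlight:
    unit: str              # functional unit in use, e.g. "adder"
    dest: str              # destination register of the in-flight instruction
    cycles_left: int       # cycles until its result can be used
    pipelined: bool = True

def can_issue(unit, srcs, dest, cycles_to_wb, in_flight, wb_reserved):
    """Return (ok, reason) for the instruction currently in ID."""
    # 1. Structural: an unpipelined unit must be idle, and the WB port free.
    if any(f.unit == unit and not f.pipelined for f in in_flight):
        return False, "structural: unit busy"
    if cycles_to_wb in wb_reserved:
        return False, "structural: WB port"
    # 2. RAW: a source register is still being produced.
    if any(f.dest in srcs and f.cycles_left > 0 for f in in_flight):
        return False, "RAW"
    # 3. WAW: an in-flight operation targets the same destination.
    if any(f.dest == dest for f in in_flight):
        return False, "WAW"
    return True, "issue"

busy = [InFlight("divider", "F0", cycles_left=10, pipelined=False)]
print(can_issue("adder", ["F0", "F2"], "F4", 4, busy, set()))
print(can_issue("adder", ["F2", "F6"], "F0", 4, busy, set()))
print(can_issue("adder", ["F2", "F6"], "F8", 4, busy, set()))
```

Keeping every check in one place mirrors the "all stalls in ID" design: later stages never need to hold an instruction.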
10. MIPS R4000 Floating-Point Pipeline
11. Instruction Mixes in FP Pipeline: Adds Only
12. FP Pipeline: Multiplies Only
13. FP Pipeline: Adds and Multiplies
14. Interrupts, Faults, or Exceptions
The hardest interrupts to implement are those that are synchronous, coerced, occur within instructions, and require execution to resume afterwards
See Figure A.27 in HP3
15. Precise Interrupts (Sequential Processor)
When an interrupt occurs, the state of the interrupted process is saved, including the PC (= u), registers, and memory
Interrupt is precise if the following three conditions hold
All instructions preceding u have been executed, and have modified the state correctly
All instructions following u are unexecuted, and have not modified the state
If the interrupt was caused by an instruction, it was caused by instruction u, which is either completely executed (overflow) or completely unexecuted (VM page fault)
Precise interrupts are desirable if software is to fix up error that caused interrupt and execution has to be resumed
Easy for external interrupts, could be complex and costly for internal
Imperative for some interrupts (VM page faults, IEEE FP standard)
16. Problems on Sequential Processors
Instruction modifies state early, then causes an interrupt
State change must be undone
Example: First operand of VAX instruction uses autodecrement addressing mode, which writes a register. Trying to access second operand causes a page fault. Since instruction execution cannot be completed, we must restore the register written by autodecrement to its original value
Long-running instructions
Not enough to be able to restore state, must make progress from interrupt to interrupt
Example: MVC on IBM 360 copies 256 bytes
No virtual memory, so interrupts not allowed to stop MVC
Example: MVC on IBM 370 copies 256 bytes
Has virtual memory, so first access all pages involved; after that, no interrupts allowed
Example: MVCL on IBM 370 copies up to 2^24 bytes
Has VM; two addresses and length are in registers
Registers saved and restored on interrupts (making progress)
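The MVCL approach, where the operand registers themselves record progress, can be illustrated with a toy restartable copy. Everything here is an assumption made for illustration: the 4-byte page size, the `resident` set standing in for the page table, and the `PageFault` exception standing in for the interrupt.

```python
class PageFault(Exception):
    pass

PAGE = 4  # tiny page size so the example faults quickly (assumption)

def mvcl(mem, regs, resident):
    """Copy regs['len'] bytes from regs['src'] to regs['dst'].

    Registers are updated after every byte, so the instruction can be
    re-executed after an interrupt without repeating finished work."""
    while regs["len"] > 0:
        for addr in (regs["src"], regs["dst"]):
            if addr // PAGE not in resident:
                raise PageFault(addr // PAGE)
        mem[regs["dst"]] = mem[regs["src"]]
        regs["src"] += 1
        regs["dst"] += 1
        regs["len"] -= 1

mem = list(range(8)) + [0] * 8
regs = {"src": 0, "dst": 8, "len": 8}
resident = {0, 2}                      # pages 1 and 3 are not yet in memory
faults = 0
while True:
    try:
        mvcl(mem, regs, resident)
        break
    except PageFault as fault:
        faults += 1
        resident.add(fault.args[0])    # handler pages in, then MVCL is restarted
print(mem[8:], faults)
```

Each restart resumes from the saved register values, so the copy makes progress from interrupt to interrupt instead of starting over.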
17. Interrupts in MIPS Pipeline
How do we stop and restart execution on an interrupt to keep it precise?
What problems do delayed branches cause?
What happens if multiple exceptions occur in the pipeline?
Can exceptions occur out-of-order?
What problems do multi-cycle instructions cause?
18. MIPS Integer Pipeline, Single Interrupt
Force TRAP instruction in pipeline on next IF
Turn off all writes for faulting instruction and subsequent instructions
After exception-handling routine in OS receives control, save PC of faulting instruction
When exception has been handled, the RFE instruction reloads PC and restarts sequential instruction execution
19. Complications with Delayed Branches
Suppose instruction 2 causes an exception (e.g., a page fault) after the taken branch completes (determining that the branch outcome is true)
Instruction 2 cannot complete
Neither can instruction u
On restart, we do not have sequential execution
We must remember two PC values: 2 and u
20. Complications with Multiple Exceptions
In the same cycle, LW takes a data page fault and ADD takes an arithmetic exception
On an unpipelined machine, LW’s exception would occur first
Handle the page fault
Restart execution
ADD will cause arithmetic exception to occur; handle it then
21. Complications with Out-of-order Exceptions
LW takes data page fault, ADD takes instruction page fault
Relative timing differs between unpipelined and pipelined machines
To maintain precise interrupts, we need to consider both when they occur and the instructions that caused them
Post exceptions in exception status vector, turn off state modifications, and check vector in WB unit
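The status-vector idea can be simulated for the LW/ADD case. The sketch assumes a plain five-stage pipeline with one instruction fetched per cycle and no stalls. It records when each exception is detected and which one is actually taken at WB; squashing of later state changes is not modelled. The function name and data layout are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def run(program):
    """program: list of (name, {stage: exception}) in program order.

    Returns (raised, taken): exceptions in detection order, and the one
    exception accepted when its instruction reaches WB."""
    raised, taken = [], None
    status = {name: [] for name, _ in program}     # exception status vectors
    last_cycle = len(program) + len(STAGES) - 1
    for cycle in range(1, last_cycle + 1):
        for i, (name, faults) in enumerate(program):
            s = cycle - 1 - i                      # stage index of instruction i
            if not 0 <= s < len(STAGES):
                continue
            stage = STAGES[s]
            if stage in faults:
                status[name].append(faults[stage]) # post it, do not trap yet
                raised.append((cycle, name, faults[stage]))
            if stage == "WB" and status[name] and taken is None:
                taken = (cycle, name, status[name][0])
    return raised, taken

raised, taken = run([
    ("LW", {"MEM": "data page fault"}),
    ("ADD", {"IF": "instruction page fault"}),
])
print(raised)
print(taken)
```

ADD's fault is detected first (cycle 2), but LW reaches WB first (cycle 5), so LW's data page fault is the one taken, matching the order an unpipelined machine would see.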
22. Complications with Multicycle Operations
Instructions are independent (no hazards) and therefore issue immediately
Differences in running times cause out-of-order termination
DIVF throws arithmetic exception late in its execution
At that point, ADDF and SUBF have both completed execution and destroyed one of their operands
Can we maintain precise interrupts under these conditions?
23. FP Pipeline Exceptions: Solutions 1 and 2
Settle for imprecise interrupts (CRAY, with checkpointing)
Done on Alpha 21064 and 21164, IBM Power-1 and Power-2, MIPS R8000 by supporting a fast imprecise mode and a slow precise mode
Not an option if you have to support virtual memory or IEEE floating point standard
Software finishes certain instructions (SPARC)
Keep enough state around for trap handler to create a precise sequence for exception and finish work for some instruction stages
Only FP instructions cause this problem
24. FP Pipeline Exceptions: Solutions 3 and 4
Stalling (MIPS R2000/3000, MIPS R4000, Pentium)
An instruction is allowed to issue only if it is certain that all the instructions before the issuing instruction will complete without causing an exception
To prevent excessive stalling, FP units must decide on possibility of exceptions early in pipeline
General methods (PowerPC 620, MIPS R10000)
Reorder buffer, history file, future file
An instruction is allowed to finalize its writes only when all previously issued instructions are complete
More naturally used in connection with ILP (Chapter 4)
Significant complexity (to be discussed later)
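The in-order finalization rule behind a reorder buffer can be sketched for the DIVF/ADDF/SUBF sequence from the multicycle slide. The latencies, the one-issue-per-cycle timing, and the `simulate` function are assumptions for illustration; a real reorder buffer also holds result values and destination tags. Instruction names are assumed to be unique.

```python
from collections import deque

def simulate(instrs):
    """instrs: list of (name, latency, raises_exception), issued one per cycle.

    Returns (finish_order, committed, trapped)."""
    done_at = {name: i + lat for i, (name, lat, _) in enumerate(instrs)}
    finish_order = sorted(done_at, key=done_at.get)
    rob = deque(instrs)                 # reorder buffer, oldest entry at the head
    committed, trapped = [], None
    cycle = 0
    while rob and trapped is None:
        cycle += 1
        # Retire from the head only; completed younger entries must wait.
        while rob and done_at[rob[0][0]] <= cycle:
            name, _, raises = rob.popleft()
            if raises:
                trapped = name
                rob.clear()             # younger results are discarded, never written
                break
            committed.append(name)
    return finish_order, committed, trapped

finish_order, committed, trapped = simulate([
    ("DIVF", 10, True),    # throws an arithmetic exception late in execution
    ("ADDF", 3, False),
    ("SUBF", 3, False),
])
print(finish_order, committed, trapped)
```

ADDF and SUBF finish execution first, yet neither is committed when DIVF traps, so the architectural state still reflects only instructions before DIVF and the interrupt stays precise.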