CSC 4250 Computer Architectures. September 19, 2006. Appendix A. Pipelining
Three Classes of Pipeline Hazards
• Structural hazards: arise from resource conflicts when the hardware cannot support the overlapped execution of all possible combinations of instructions
• Data hazards: arise when an instruction depends on the result of a previous instruction in a way that is exposed by the overlapping of instructions in the pipeline
• Control hazards: arise from the pipelining of branches and other instructions that change the PC (the program counter)
Structural Hazards
• A functional unit is not pipelined, e.g., FP divide
• One register write port, but two writes in a cycle; when can this happen?
• A single memory shared by data and instructions: an instruction fetch conflicts with the data memory reference of an earlier instruction
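The shared-memory case can be made concrete with a small sketch. It assumes an ideal 5-stage schedule (IF, ID, EX, MEM, WB) with one instruction issued per cycle and no stalls; the opcode set `MEMORY_OPS` and the function name `memory_port_conflicts` are illustrative choices, not part of the slides.

```python
# Sketch under stated assumptions: ideal 5-stage pipeline, one issue per cycle,
# a single memory port shared by instruction fetch (IF) and data access (MEM).
STAGE_OFFSET = {"IF": 0, "ID": 1, "EX": 2, "MEM": 3, "WB": 4}
MEMORY_OPS = {"LD", "SD"}  # assumed set of opcodes that touch data memory


def memory_port_conflicts(opcodes):
    """Return (memory_instr_index, fetch_instr_index, cycle) for every clash.

    Instruction i (0-based) is fetched in cycle i + 1, so its MEM stage falls
    in cycle i + 4, the same cycle in which instruction i + 3 is fetched.
    """
    conflicts = []
    for i, op in enumerate(opcodes):
        if op not in MEMORY_OPS:
            continue
        mem_cycle = i + 1 + STAGE_OFFSET["MEM"]
        fetch_index = mem_cycle - 1  # instruction whose IF falls in mem_cycle
        if fetch_index < len(opcodes):
            conflicts.append((i, fetch_index, mem_cycle))
    return conflicts


example = ["LD", "DADD", "DSUB", "AND", "OR"]
print(memory_port_conflicts(example))
```

In this example the load's data access in cycle 4 collides with the fetch of the fourth instruction, which is the instruction a real pipeline would have to stall.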
Why Allow Structural Hazards?
• Reduce cost
• Pipelining (or duplicating) all functional units is expensive (e.g., fully pipelining the FP multiplier)
• Processors that support both an instruction and a data cache access every cycle require twice as much memory bandwidth
Data Hazards
• Pipelining changes the order of read/write accesses:
    DADD R1,R2,R3
    DSUB R4,R1,R5
    AND  R6,R1,R7
    OR   R8,R1,R9
    XOR  R10,R1,R11
• DADD writes R1 in its WB stage (5th cycle)
• DSUB reads R1 in its ID stage (3rd cycle) → data hazard
• Same problem for the AND instruction
• What about OR? OR reads R1 in the 5th cycle, the same cycle in which DADD writes R1
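The cycle counts on this slide can be checked with a short sketch. It assumes a 5-stage pipeline with one instruction issued per cycle, no forwarding, and no stalls; the helper names `stage_cycle` and `classify_reads` are hypothetical. The "same cycle" case is safe only if the register file writes in the first half of the cycle and reads in the second half.

```python
# Sketch under stated assumptions: 5 stages, one issue per cycle, no forwarding,
# no stalls. Registers are read in ID and written in WB.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stage_cycle(index, stage):
    """Cycle (1-based) in which instruction number `index` (0-based) is in `stage`."""
    return index + 1 + STAGES.index(stage)


# (mnemonic, destination, sources) for the sequence on the slide
program = [
    ("DADD", "R1", ("R2", "R3")),
    ("DSUB", "R4", ("R1", "R5")),
    ("AND", "R6", ("R1", "R7")),
    ("OR", "R8", ("R1", "R9")),
    ("XOR", "R10", ("R1", "R11")),
]


def classify_reads(program, producer=0):
    """Compare each consumer's read cycle (ID) with the producer's write cycle (WB)."""
    _, dest, _ = program[producer]
    write_cycle = stage_cycle(producer, "WB")
    result = {}
    for i in range(producer + 1, len(program)):
        mnemonic, _, sources = program[i]
        if dest not in sources:
            continue
        read_cycle = stage_cycle(i, "ID")
        if read_cycle < write_cycle:
            result[mnemonic] = "hazard"
        elif read_cycle == write_cycle:
            result[mnemonic] = "same cycle"
        else:
            result[mnemonic] = "safe"
    return result


print(classify_reads(program))
```

DSUB and AND read too early, OR reads in the write cycle, and XOR reads one cycle after the write.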
Minimize Data Hazard Stalls by Forwarding
• The ALU result from both the EX/MEM and MEM/WB pipeline registers is always fed back to the ALU inputs
• If the forwarding hardware detects that a previous ALU operation has written the register corresponding to a source of the current ALU operation, the control logic selects the forwarded result as the ALU input rather than the value read from the register file
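The selection made by the control logic can be sketched as a simple priority check. This is an illustrative model, not the actual MIPS control equations: the function name `forward_source` is hypothetical, register names are plain strings, and `None` stands for "this pipeline register holds no register write".

```python
# Sketch under stated assumptions: each ALU input is chosen among the EX/MEM
# pipeline register, the MEM/WB pipeline register, and the register file.
def forward_source(src_reg, ex_mem_dest, mem_wb_dest):
    """Return which value feeds an ALU input for source register `src_reg`.

    ex_mem_dest: destination of the instruction one ahead (None if it writes nothing)
    mem_wb_dest: destination of the instruction two ahead (None if it writes nothing)
    The newer result in EX/MEM takes priority when both match.
    """
    if ex_mem_dest is not None and src_reg == ex_mem_dest:
        return "EX/MEM"
    if mem_wb_dest is not None and src_reg == mem_wb_dest:
        return "MEM/WB"
    return "register file"


# DADD R1,R2,R3 followed by DSUB R4,R1,R5: DSUB takes R1 from EX/MEM.
print(forward_source("R1", ex_mem_dest="R1", mem_wb_dest=None))
# Two instructions later (AND R6,R1,R7): DADD's result is now in MEM/WB.
print(forward_source("R1", ex_mem_dest="R4", mem_wb_dest="R1"))
```

Giving EX/MEM priority matters when two earlier instructions write the same register: the consumer must see the most recent value.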
Forwarding
• Generalized forwarding: a result is forwarded from the pipeline register at the output of one unit to the input of another unit
• Forwarding fails: a load causes a delay that forwarding cannot remove
• Pipeline interlock: hardware detects a hazard and stalls the pipeline until the hazard is cleared
• MIPS: Microprocessor without Interlocked Pipeline Stages
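The load case can be sketched as a check that an interlock would perform. The sketch assumes full forwarding, so the only remaining stall is a load immediately followed by an instruction that needs the loaded value in its EX stage; the opcode name `LD` and the function name `count_load_stalls` are illustrative assumptions.

```python
# Sketch under stated assumptions: full forwarding is available, loaded data
# appears at the end of MEM, and every consumer needs its sources in EX.
def count_load_stalls(program):
    """program: list of (opcode, dest, sources). Returns the number of stall cycles."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, cur_sources = cur
        # The load's data arrives one cycle too late for the very next instruction.
        if prev_op == "LD" and prev_dest in cur_sources:
            stalls += 1
    return stalls


load_use = [
    ("LD", "R1", ("R2",)),
    ("DSUB", "R4", ("R1", "R5")),
    ("AND", "R6", ("R1", "R7")),
]
print(count_load_stalls(load_use))
```

Only the instruction directly after the load is delayed; by the time AND reaches EX, the loaded value can be forwarded normally.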
Control Hazard
• A branch may change the value of the PC
• A branch is either taken or untaken
• Three cycles of delay on MIPS
MIPS Branch Delay
                  Clock Number
Instr. #          1      2      3      4      5      6      7      8      9
Branch instr.     IF     ID     EX     ME     WB
Instr. i+1               IF     stall  stall  stall  stall
Branch target                                 IF     ID     EX     ME     WB
Branch target+1                                      IF     ID     EX     ME
Branch target+2                                             IF     ID     EX
How MIPS Reduces Branch Delay
• Consider only BEQZ and BNEZ
• Move the zero test into the ID stage (from the EX stage)
• Compute both PCs (taken and untaken) early
• Additional adder in the ID stage (old: use the ALU)
• Only a one-cycle stall on branches
• A branch on the result of the immediately preceding ALU instruction causes a data hazard
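The benefit of cutting the branch delay from three cycles to one can be estimated with the usual stall formula, CPI = base CPI + branch frequency × branch penalty. The sketch below assumes every branch pays the full penalty (no prediction) and a base CPI of 1; the 20% branch frequency is purely an illustrative number, not a figure from the slides.

```python
# Sketch under stated assumptions: base CPI of 1, every branch incurs the full
# penalty, and no other stalls are counted.
def effective_cpi(branch_frequency, branch_penalty, base_cpi=1.0):
    """Average cycles per instruction when each branch costs `branch_penalty` stall cycles."""
    return base_cpi + branch_frequency * branch_penalty


def speedup_from_early_resolution(branch_frequency, old_penalty=3, new_penalty=1):
    """Ratio of CPIs before and after moving branch resolution into ID."""
    return effective_cpi(branch_frequency, old_penalty) / effective_cpi(
        branch_frequency, new_penalty
    )


assumed_frequency = 0.2  # hypothetical: one instruction in five is a branch
print(effective_cpi(assumed_frequency, 3))
print(effective_cpi(assumed_frequency, 1))
print(speedup_from_early_resolution(assumed_frequency))
```

With these assumed numbers the CPI drops from about 1.6 to about 1.2, so the extra adder in ID buys roughly a one-third speedup.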
Data Hazard in ALU Instr. followed by Branch
                    Clock Number
Instruction #       1      2      3      4      5      6      7
ALU instruction     IF     ID     EX     ME     WB
Branch instruction         IF     ID     ID     EX     ME     WB
Example:
    ADD  R1,R2,R3
    BEQZ R1,name
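The repeated ID in the branch row can be reproduced with a small timing sketch. It assumes the ALU result becomes usable by the ID-stage zero test in the cycle after the ALU instruction's EX stage (forwarding into ID is an assumption here, not something the slide states); the function names are hypothetical.

```python
# Sketch under stated assumptions: instruction k (0-based) is fetched in cycle
# k + 1, the branch tests its register in ID, and an ALU result can be used by
# that test one cycle after the producing instruction's EX stage.
def branch_id_stalls(alu_index, branch_index):
    """Stall cycles the branch spends repeating ID while waiting for the ALU result."""
    alu_ex_cycle = alu_index + 3        # IF, ID, EX
    branch_id_cycle = branch_index + 2  # IF, ID
    return max(0, alu_ex_cycle + 1 - branch_id_cycle)


def timeline(first_cycle, id_stalls):
    """Map clock cycle to stage for one instruction, repeating ID once per stall."""
    stages = ["IF", "ID"] + ["ID"] * id_stalls + ["EX", "ME", "WB"]
    return {first_cycle + offset: stage for offset, stage in enumerate(stages)}


stalls = branch_id_stalls(alu_index=0, branch_index=1)  # ADD then BEQZ
print(stalls)
print(timeline(first_cycle=2, id_stalls=stalls))
```

Placing one independent instruction between the ALU instruction and the branch makes `branch_id_stalls` return zero under the same assumptions.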
Delayed Branch
• Heavily used in early RISC processors
• Works well with a branch delay of one cycle
• The sequential successor is in the branch delay slot. This instruction is executed whether or not the branch is taken. Execution order:
    Branch instruction
    Sequential successor
    Branch target, if taken
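The execution order of a delayed branch can be illustrated with a toy interpreter. The two-instruction mini ISA (`ADDI`, `BEQZ` with an absolute target index) and the function name `run_delayed_branch` are hypothetical simplifications, and the sketch assumes the delay slot never contains another branch.

```python
# Sketch under stated assumptions: one delay slot, absolute branch targets given
# as instruction indices, and no branch placed in a delay slot.
def run_delayed_branch(program, regs, max_steps=100):
    """Execute `program` with delayed-branch semantics; return the list of executed indices."""
    pc = 0
    pending_target = None  # set by a taken branch, applied after the delay slot
    trace = []
    steps = 0
    while pc < len(program) and steps < max_steps:
        instr = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            next_pc = pending_target  # this instruction was the delay slot
            pending_target = None
        if instr[0] == "ADDI":
            _, dest, src, imm = instr
            regs[dest] = regs[src] + imm
        elif instr[0] == "BEQZ":
            _, reg, target = instr
            if regs[reg] == 0:
                pending_target = target
        pc = next_pc
        steps += 1
    return trace


program = [
    ("BEQZ", "R1", 3),        # 0: branch instruction
    ("ADDI", "R2", "R2", 1),  # 1: sequential successor (delay slot), always executed
    ("ADDI", "R3", "R3", 1),  # 2: skipped when the branch is taken
    ("ADDI", "R4", "R4", 1),  # 3: branch target
]
print(run_delayed_branch(program, {"R1": 0, "R2": 0, "R3": 0, "R4": 0}))  # taken
print(run_delayed_branch(program, {"R1": 1, "R2": 0, "R3": 0, "R4": 0}))  # untaken
```

In both runs instruction 1 executes; only instruction 2 depends on the branch outcome.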