Lecture 11: Pipelining Computer Engineering 585 Fall 2001
Three Generic Data Hazards
Instr I followed by Instr J
• Read After Write (RAW): Instr J tries to read an operand before Instr I writes it (also known as data dependence).
Three Generic Data Hazards
Instr I followed by Instr J
• Write After Read (WAR): Instr J tries to write an operand before Instr I reads it
• Instr I gets the wrong operand
• Can't happen in the DLX 5-stage pipeline because:
  • All instructions take 5 stages, and
  • Reads are always in stage 2, and
  • Writes are always in stage 5
• Also known as antidependence
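The timing argument on this slide can be checked with a short sketch. It assumes stages are numbered 1 to 5 and that an instruction issued in cycle c is in stage s during cycle c + s - 1, with no stalls; the helper names `stage_cycle` and `war_possible` are hypothetical and not part of the lecture.

```python
# Sketch: why WAR cannot occur in a 5-stage in-order pipeline.
# Assumption: an instruction issued in cycle c is in stage s during cycle c + s - 1.
READ_STAGE = 2   # register read happens in stage 2 (ID)
WRITE_STAGE = 5  # register write happens in stage 5 (WB)


def stage_cycle(issue_cycle: int, stage: int) -> int:
    """Clock cycle in which an instruction issued at issue_cycle occupies `stage`."""
    return issue_cycle + stage - 1


def war_possible(issue_i: int, issue_j: int) -> bool:
    """True if J would write a register before the earlier instruction I reads it."""
    return stage_cycle(issue_j, WRITE_STAGE) < stage_cycle(issue_i, READ_STAGE)


# J issues at least one cycle after I, so J's write always follows I's read.
print(any(war_possible(1, 1 + gap) for gap in range(1, 10)))  # False
```

With I issued in cycle 1 and J in cycle 2, I reads in cycle 2 and J writes in cycle 6, so the write can never overtake the read.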
Three Generic Data Hazards
Instr I followed by Instr J
• Write After Write (WAW): Instr J tries to write an operand before Instr I writes it
• Leaves the wrong result (Instr I's, not Instr J's)
• Can't happen in the DLX 5-stage pipeline because:
  • All instructions take 5 stages, and
  • Writes are always in stage 5
• Also known as output dependence
• Will see WAR and WAW later in more complicated pipelines
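The three cases can be told apart by comparing the register names of two instructions. The following is a minimal sketch, assuming each instruction has at most one destination register; the `Instr` type and the `dependences` function are hypothetical names used only for illustration. It reports name-level dependences; whether one becomes an actual hazard depends on the pipeline, as the slides note.

```python
from typing import NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: one destination, any number of sources."""
    dest: Optional[str]
    srcs: Tuple[str, ...]


def dependences(i: Instr, j: Instr) -> Set[str]:
    """Classify the dependences of J on an earlier instruction I."""
    found = set()
    if i.dest is not None and i.dest in j.srcs:
        found.add("RAW")  # J reads what I writes
    if j.dest is not None and j.dest in i.srcs:
        found.add("WAR")  # J writes what I reads
    if i.dest is not None and i.dest == j.dest:
        found.add("WAW")  # J writes what I writes
    return found


add = Instr(dest="R1", srcs=("R2", "R3"))  # ADD R1,R2,R3
sub = Instr(dest="R4", srcs=("R1", "R5"))  # SUB R4,R1,R5
print(dependences(add, sub))  # {'RAW'}
```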
Branches in DLX sequential implementation
[Figure: DLX datapath with the stages Instruction fetch, Instruction decode/register fetch, Execute/address calculation, Memory access, and Write back.]
FIGURE 3.1 The implementation of the DLX datapath allows every instruction to be executed in four or five clock cycles.
Control Hazard: 3 cycle stall
[Figure: pipeline timing diagram. Horizontal axis: time in clock cycles, CC 1 to CC 6. Vertical axis: program execution order (in instructions).]
100: BEQZ R1, +40
104: ADD R2,R3,R4
108: SUB R6,R7,R8
112: ANDI R12,R10,0xAA
140: ADD R3,R4,R5
Control Hazard: 1 cycle stall
[Figure: pipelined DLX datapath with the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB, showing the Zero? test and an additional adder for the branch target.]
1 Cycle Stall Pipeline RTL Description
Pipe stage   Branch instruction
IF    IF/ID.IR ← Mem[PC];
      IF/ID.NPC, PC ← (if ID/EX.cond {ID/EX.NPC} else {PC+4});
ID    ID/EX.A ← Regs[IF/ID.IR_6..10]; ID/EX.B ← Regs[IF/ID.IR_11..15];
      ID/EX.NPC ← IF/ID.NPC + (IR_16)^16 ## IR_16..31;
      ID/EX.IR ← IF/ID.IR;
      ID/EX.cond ← (Regs[IF/ID.IR_6..10] op 0);
      ID/EX.Imm ← (IR_16)^16 ## IR_16..31
EX
MEM
WB
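The branch-related transfers in the IF and ID stages can be sketched in software. This is an illustration only, assuming a 32-bit PC, a 16-bit immediate in IR bits 16..31, and a BEQZ-style comparison against zero for "op"; the function names and the example values are hypothetical and are not taken from the slides.

```python
from typing import Tuple


def sign_extend16(imm16: int) -> int:
    """(IR_16)^16 ## IR_16..31: replicate the sign bit of a 16-bit field."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def id_stage_branch(npc: int, imm16: int, reg_value: int) -> Tuple[int, bool]:
    """ID stage: compute ID/EX.NPC (branch target) and ID/EX.cond.

    Assumption: "op" is the equality test used by BEQZ.
    """
    target = (npc + sign_extend16(imm16)) & 0xFFFFFFFF
    cond = reg_value == 0
    return target, cond


def next_pc(pc: int, cond: bool, target: int) -> int:
    """IF stage: PC <- (if ID/EX.cond {ID/EX.NPC} else {PC+4})."""
    return target if cond else pc + 4


# Hypothetical values: NPC = 204, immediate = +16, tested register holds 0.
target, cond = id_stage_branch(npc=204, imm16=16, reg_value=0)
print(target, cond)                # 220 True
print(next_pc(208, cond, target))  # 220
```

Because both the target and the condition are available at the end of ID, the fetch that follows the branch is the only one that may need to be discarded.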
Branch Stall Impact
• If CPI = 1, 30% of instructions are branches, and each branch stalls 3 cycles => new CPI = 1.9!
• Two-part solution:
  • Determine whether the branch is taken or not sooner, AND
  • Compute the taken-branch address earlier
• DLX branch tests if register = 0
• DLX solution:
  • Move the Zero test to the ID/RF stage
  • Add an adder to calculate the new PC in the ID/RF stage
  • 1 clock cycle penalty for a branch versus 3
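The CPI figure on this slide follows from adding the average stall cycles per instruction to the base CPI. A minimal sketch, assuming every branch pays the full penalty and there are no other stalls (the function name is hypothetical):

```python
def cpi_with_branch_stalls(base_cpi: float, branch_fraction: float,
                           stall_cycles: int) -> float:
    """Effective CPI = base CPI + branch frequency * stall cycles per branch."""
    return base_cpi + branch_fraction * stall_cycles


# Slide values: base CPI 1, 30% branches, 3-cycle stall.
print(round(cpi_with_branch_stalls(1.0, 0.30, 3), 2))  # 1.9
# Same assumptions with the 1-cycle penalty of the DLX solution.
print(round(cpi_with_branch_stalls(1.0, 0.30, 1), 2))  # 1.3
```

Under these assumptions, moving the branch decision to ID/RF would bring the CPI from 1.9 down to 1.3.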
Branch Behavior Statistics
[Figure: bar chart, percentage of instructions executed (0% to 25%) per benchmark.]
Benchmark   Forward conditional   Backward conditional   Unconditional
compress    11%                   3%                     3%
eqntott     22%                   2%                     2%
espresso    11%                   4%                     1%
gcc         12%                   3%                     4%
li          11%                   4%                     8%
doduc       6%                    2%                     2%
ear         6%                    4%                     4%
hydro2d     10%                   2%                     0%
mdljdp      9%                    0%                     0%
su2cor      2%                    1%                     1%
Int: 13% forward cond., 3% backward cond., 4% unconditional
FP: 7% forward cond., 2% backward cond., 1% unconditional
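The Int and FP summary figures can be reproduced by averaging the per-benchmark values. The sketch below reads each benchmark's three chart values as forward, backward, and unconditional, and assumes the first five benchmarks form the integer group and the last five the floating-point group; that grouping is an assumption, chosen because it matches the stated averages.

```python
# Per-benchmark branch frequencies in percent:
# (forward conditional, backward conditional, unconditional).
# Assumption: first five benchmarks are integer, last five are floating point.
INT = {
    "compress": (11, 3, 3),
    "eqntott": (22, 2, 2),
    "espresso": (11, 4, 1),
    "gcc": (12, 3, 4),
    "li": (11, 4, 8),
}
FP = {
    "doduc": (6, 2, 2),
    "ear": (6, 4, 4),
    "hydro2d": (10, 2, 0),
    "mdljdp": (9, 0, 0),
    "su2cor": (2, 1, 1),
}


def averages(group):
    """Arithmetic mean of each branch category, rounded to a whole percent."""
    n = len(group)
    return tuple(round(sum(v[k] for v in group.values()) / n) for k in range(3))


print(averages(INT))  # (13, 3, 4)
print(averages(FP))   # (7, 2, 1)
```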