430 likes | 560 Views
Intro to Pipelining. Chapter 6 P&H. Pipelining. Multiple instructions are overlapped in execution Key to making modern processors fast Example – laundry wash load of clothes in machine Dry load of washing in dryer Fold dry load of washing Flatmate puts clothes away. 1. 6. P. M. 7. 8.
E N D
Intro to Pipelining Chapter 6 P&H
Pipelining • Multiple instructions are overlapped in execution • Key to making modern processors fast • Example – laundry • wash load of clothes in machine • Dry load of washing in dryer • Fold dry load of washing • Flatmate puts clothes away
1 6 P M 7 8 9 1 0 1 1 1 2 2 A M T i m e T a s k o r d e r A B C D Example – laundry
1 6 P M 7 8 9 1 0 1 1 1 2 2 A M T i m e T a s k o r d e r A B C D Example – pipelined laundry
Pipelined CPU • Can apply same principles to CPU Design • MIPS instructions classically take 5 steps: • Fetch instructions from memory • Decode instruction and read registers • Execute the operation or calculate address • Access operand in data memory • Write result to a register
P r o g r a m 2 4 6 8 1 0 1 2 1 4 1 6 1 8 e x e c u t i o n T i m e o r d e r ( i n i n s t r u c t i o n s ) I n s t r u c t i o n D a t a l w $ 1 , 1 0 0 ( $ 0 ) R e g A L U R e g f e t c h a c c e s s I n s t r u c t i o n D a t a l w $ 2 , 2 0 0 ( $ 0 ) R e g A L U R e g 8 n s f e t c h a c c e s s I n s t r u c t i o n l w $ 3 , 3 0 0 ( $ 0 ) 8 n s f e t c h . . . 8 n s 1 4 2 4 6 8 1 0 1 2 T i m e I n s t r u c t i o n D a t a l w $ 1 , 1 0 0 ( $ 0 ) R e g A L U R e g f e t c h a c c e s s I n s t r u c t i o n D a t a l w $ 2 , 2 0 0 ( $ 0 ) 2 n s R e g A L U R e g f e t c h a c c e s s l w $ 3 , 3 0 0 ( $ 0 ) Example MIPS CPU P r o g r a m e x e c u t i o n o r d e r ( i n i n s t r u c t i o n s ) I n s t r u c t i o n D a t a 2 n s R e g A L U R e g f e t c h a c c e s s 2 n s 2 n s 2 n s 2 n s 2 n s
Designing Instruction sets for pipelining • All MIPS instructions are the same length • Only a few instruction formats • Source register fields in same place • Memory operands only appear in loads and stores • Can calculate address in execute stage • Operands must be aligned in memory
I F : I n s t r u c t i o n f e t c h I D : I n s t r u c t i o n d e c o d e / E X : E x e c u t e / M E M : M e m o r y a c c e s s W B : W r i t e b a c k r e g i s t e r f i l e r e a d a d d r e s s c a l c u l a t i o n 0 M u x 1 A d d A d d 4 A d d r e s u l t S h i f t l e f t 2 R e a d r e g i s t e r 1 A d d r e s s P C R e a d d a t a 1 R e a d Z e r o r e g i s t e r 2 I n s t r u c t i o n R e g i s t e r s A L U R e a d A L U 0 R e a d W r i t e d a t a 2 r e s u l t A d d r e s s 1 d a t a r e g i s t e r M I n s t r u c t i o n M u D a t a u m e m o r y W r i t e x m e m o r y x d a t a 1 0 W r i t e d a t a 1 6 3 2 S i g n e x t e n d Single Cycle Datapath
Pipeline Reg naming conventions • Named by two stages separated by that register e.g. ID/EX • Work trough stages of executing a store instruction • Instruction Fetch • PC + 4 -> IF/ID • PC + 4 -> PC • Imem[PC] -> IF/ID
LW instruction • Instruction Decode • IF/ID supplies 16 bit Immed which is signed extended -> ID/EX • Two regs are read -> ID/EX • PC + 4 from IF/ID -> ID/EX • Execute or address calculation • Contents reg and signed extended immediate added -> Ex/Mem • Memory Access • Address in EX/Mem used to get data value from data mem -> Mem/WB • Write back • Mem/Wb data val -> reg file
LW instruction • Any information that might be needed in a later stage must be passed via the pipeline registers • Each resource can only be used in a single pipeline stage (otherwise have structural hazard) • Bug in previous data-path • Which instruction provides address for destination register in lw • Dest reg numb needs to be passed through pipeline registers
0 M u x 1 I F / I D I D / E X E X / M E M M E M / W B A d d A d d 4 A d d r e s u l t S h i f t l e f t 2 n R e a d o i t r e g i s t e r 1 A d d r e s s P C c R e a d u r d a t a 1 t R e a d s n Z e r o I r e g i s t e r 2 I n s t r u c t i o n R e g i s t e r s A L U R e a d A L U m e m o r y 0 R e a d W r i t e A d d r e s s d a t a 2 r e s u l t 1 d a t a r e g i s t e r M M D a t a u u W r i t e x m e m o r y x d a t a 1 0 W r i t e d a t a 1 6 3 2 S i g n e x t e n d Updated Datapath
Pipelined Control • Start off by ignoring hazards
P C S r c 0 M u x 1 I F / I D I D / E X E X / M E M M E M / W B A d d A d d A d d 4 r e s u l t B r a n c h S h i f t R e g W r i t e l e f t 2 n R e a d M e m W r i t e o i r e g i s t e r 1 t P C A d d r e s s R e a d c u d a t a 1 r t R e a d A L U S r c s M e m t o R e g n Z e Z r e o r o r e g i s t e r 2 I I n s t r u c t i o n R e g i s t e r s A L U R e a d A L U m e m o r y 0 R e a d W r i t e d a t a 2 A d d r e s s r e s u l t 1 r e g i s t e r M d a t a M u D a t a u W r i t e x m e m o r y x d a t a 1 0 W r i t e d a t a I n s t r u c t i o n 6 1 6 [ 1 5 – 0 ] 3 2 S i g n A L U e x t e n d M e m R e a d c o n t r o l I n s t r u c t i o n [ 2 0 – 1 6 ] 0 M A L U O p I n s t r u c t i o n u [ 1 5 – 1 1 ] x 1 R e g D s t Datapath showing control signals
P C S r c I D / E X 0 M W B u E X / M E M x 1 C o n t r o l M W B M E M / W B E X M W B I F / I D A d d A d d 4 A d d e r e s u l t t i r B r a n c h W S h i f t g e e l e f t 2 t i R r W A L U S r c m g n R e a d e e o i M R t r e g i s t e r 1 A d d r e s s P C c o R e a d t u r m d a t a 1 t s R e a d e n Z e r o M I r e g i s t e r 2 I n s t r u c t i o n R e g i s t e r s A L U R e a d A L U m e m o r y 0 R e a d W r i t e d a t a 2 A d d r e s s r e s u l t 1 d a t a r e g i s t e r M M D a t a u u m e m o r y W r i t e x x d a t a 1 0 W r i t e d a t a I n s t r u c t i o n 1 6 3 2 6 [ 1 5 – 0 ] S i g n A L U M e m R e a d e x t e n d c o n t r o l I n s t r u c t i o n [ 2 0 – 1 6 ] 0 A L U O p M u I n s t r u c t i o n x [ 1 5 – 1 1 ] 1 R e g D s t Updated Datapath
Pipeline Hazards • Situations in pipelining where next instruction cannot execute in following clock cycle
Structural Hazards • Hardware cannot support combination of instructions we want to execute in same cycle • E.g. If used a washer/dyer combination or flatmate doing something else • E.g. MIPS Only a single memory available
Control Hazards • Need to make a decision based on the results of one instruction while it is still executing • Example – The branch instruction
Control Hazards • Can either: • Stall • Bubble(s) inserted into pipeline • Predict • Predict outcome of branch • Stall if prediction wrong • Delay branch • Execute instruction after branch regardless of whether branch is taken or not
Branch Prediction • Assume all branches will fail • Assume backwards branches always succeed while forward branches always fail • Use a dynamic branch predictor • Uses a table to record the outcomes of previous time branch instructions were executed • Gets about 90% accuracy • Longer pipelines amplify problem
P r o g r a m e x e c u t i o n 1 4 2 4 6 8 1 0 1 2 o r d e r T i m e ( i n i n s t r u c t i o n s ) I n s t r u c t i o n D a t a b e q $ 1 , $ 2 , 4 0 R e g A L U R e g f e t c h a c c e s s I n s t r u c t i o n D a t a a d d $ 4 , $ 5 , $ 6 R e g A L U R e g f e t c h a c c e s s 2 n s ( D e l a y e d b r a n c h s l o t ) I n s t r u c t i o n D a t a l w $ 3 , 3 0 0 ( $ 0 ) R e g A L U R e g f e t c h a c c e s s 2 n s 2 n s Delayed Branches • Used on MIPS R3000 processors
Data Hazards • Instruction depends on outcome of a previous instruction still in the pipeline • E.g.
P r o g r a m e x e c u t i o n 2 4 6 8 1 0 o r d e r T i m e ( i n i n s t r u c t i o n s ) a d d $ s 0 , $ t 0 , $ t 1 I F I D E X M E M W B s u b $ t 2 , $ s 0 , $ t 3 M E M I F I D E X M E M W B Forwarding
2 4 6 8 1 0 1 2 1 4 T i m e P r o g r a m e x e c u t i o n o r d e r ( i n i n s t r u c t i o n s ) I F I D l w $ s 0 , 2 0 ( $ t 1 ) E X M E M W B b u b b l e b u b b l e b u b b l e b u b b l e b u b b l e s u b $ t 2 , $ s 0 , $ t 3 I F I D W B E X M E M Forwarding
Data Hazards and Forwarding • Consider execution on following sequence of instructions: Sub $2, $1, $3 And $12, $2, $5 Or $13, $6, $2 Add $14, $2, $2 Sw $15, 100($2)
T i m e ( i n c l o c k c y c l e s ) C C 1 C C 2 C C 3 C C 4 C C 5 C C 6 C C 7 C C 8 C C 9 V a l u e o f r e g i s t e r $ 2 : 1 0 1 0 1 0 1 0 1 0 / – 2 0 – 2 0 – 2 0 – 2 0 – 2 0 P r o g r a m e x e c u t i o n o r d e r ( i n i n s t r u c t i o n s ) R e g s u b $ 2 , $ 1 , $ 3 I M R e g D M a n d $ 1 2 , $ 2 , $ 5 I M D M R e g R e g I M D M R e g o r $ 1 3 , $ 6 , $ 2 R e g a d d $ 1 4 , $ 2 , $ 2 I M D M R e g R e g s w $ 1 5 , 1 0 0 ( $ 2 ) I M D M R e g R e g Pipelined Dependencies
T i m e ( i n c l o c k c y c l e s ) C C 1 C C 2 C C 3 C C 4 C C 5 C C 6 C C 7 C C 8 C C 9 V a l u e o f r e g i s t e r $ 2 : 1 0 1 0 1 0 1 0 1 0 / – 2 0 – 2 0 – 2 0 – 2 0 – 2 0 V a l u e o f E X / M E M : X X X – 2 0 X X X X X V a l u e o f M E M / W B : X X X X – 2 0 X X X X P r o g r a m e x e c u t i o n o r d e r ( i n i n s t r u c t i o n s ) s u b $ 2 , $ 1 , $ 3 I M R e g D M R e g a n d $ 1 2 , $ 2 , $ 5 I M R e g D M R e g o r $ 1 3 , $ 6 , $ 2 I M R e g D M R e g a d d $ 1 4 , $ 2 , $ 2 I M R e g D M R e g s w $ 1 5 , 1 0 0 ( $ 2 ) I M R e g D M R e g
Hazard notation • IN our previous example the two pairs of hazard conditions are • 1a. EX/Mem.RegisterRd = ID/EX.RegisterRs • 1b. EX/Mem.RegisterRd = ID/EX.RegisterRt • 2a. Mem/WB.RegisterRd = ID/EX.RegisterRs • 2b. Mem/WB.RegisterRd = ID/EX.RegisterRs • Hazard on $2 between the sub and and instructions is: • EX/Mem.RegisterRd = ID/EX.RegisterRs = $2
Hazards • Above notation will do forwarding unnecessarily as some instruction do not write registers • Soln: can check WB control to check if register writes back • What about $0 • Do not forward as should always be zero
I D / E X E X / M E M M E M / W B R e g i s t e r s A L U D a t a m e m o r y M u x a . N o f o r w a r d i n g No Forwarding
I D / E X E X / M E M M E M / W B M u x R e g i s t e r s F o r w a r d A A L U M D a t a u m e m o r y M x u x R s F o r w a r d B R t R t M E X / M E M . R e g i s t e r R d u R d x F o r w a r d i n g M E M / W B . R e g i s t e r R d u n i t b . W i t h f o r w a r d i n g Forwarding
EX Hazard • If (EX/Mem.RegWrite) and (EX/Mem.RegRd != 0) and (EM/Mem.RegRd = ID/EX.RegRs) thenForwardA = 10 • If (EX/Mem.RegWrite) and (EX/Mem.RegRd != 0) and (EM/Mem.RegRd = ID/EX.RegRt) thenForwardB = 10
Mem Hazard • If (Mem/WB.RegWrite) and (Mem/WB.RegRd != 0) and (Mem/WB.RegRd = ID/EX.RegRs) thenForwardA = 01If (Mem/WB.RegWrite) and (Mem/WB.RegRd != 0) and (Mem/WB.RegRd = ID/EX.RegRt) thenForwardB = 01
What about: Add $1, $1, $2 Add $1, $1, $3 Add $1, $1, $4 • If (Mem/WB.RegWrite) and (Mem/WB.RegRd != 0) and (EX/Mem.RegRd != ID/EX.RegRs) and (Mem/WB.RegRd = ID/EX.RegRs) thenForwardA = 01If (Mem/WB.RegWrite) and (Mem/WB.RegRd != 0) and (EX/Mem.RegRd != ID/EX.RegRt) and (Mem/WB.RegRd = ID/EX.RegRt) thenForwardB = 01
I D / E X W B E X / M E M M W B C o n t r o l M E M / W B E X M W B I F / I D M n o i u t c x u r t s R e g i s t e r s n I D a t a I n s t r u c t i o n A L U P C m e m o r y M m e m o r y u x M u x I F / I D . R e g i s t e r R s R s I F / I D . R e g i s t e r R t R t I F / I D . R e g i s t e r R t R t M E X / M E M . R e g i s t e r R d u I F / I D . R e g i s t e r R d R d x F o r w a r d i n g M E M / W B . R e g i s t e r R d u n i t Updated datapath to resolve data hazards
Data Hazards and Stalls • Forwarding will not work where an instruction tries to read a register following a load instruction that writes the same register
Stalls • Hazard detection unit required at the ID stage • If (ID/EX.MemRead) and (ID/EX.RegRt = IF/ID.RegRs) or (ID/EX.RegRt = IF/ID.RegRt thenstall the pipeline