Dive into the world of Minimal MIPS and the single-cycle pipeline model. Learn about control signals, datapath, clocking, and pipelining concepts in CS/COE 1541 course term 2174. Discover how basic instructions like lw, sw, add, sub, and or function within this architecture.
Pipelining - Implementation CS/COE 1541 (term 2174) Jarrett Billingsley
Class Announcements
• Rough quiz/test schedule established
  • Idea is that we'll have one after each major segment of the class
  • Quizzes won't be a whole class... I don't think
  • First quiz will be pretty small
• Grading breakdown updated... again
  • 20% homework
  • 20% projects
  • 24% quizzes (3 × 8%)
  • 36% exams (3 × 12%)
• First homework will be assigned Wednesday
CS/COE 1541 term 2174
"Minimal MIPS" CS/COE 1541 term 2174
It ain't real MIPS
• For pedagogical (teaching) purposes...
• Instructions we care about:
  • lw, sw, add, sub, and, or, slt, beq, and j
  • Other instructions are mostly variations on these
• Let's just review the parts of the Minimal MIPS CPU and the control signals we'll need to make it work.
The Minimal MIPS single-cycle pipeline
• A more detailed view of the abstract pipeline shown last time!
[Diagram: PC, Instruction Memory, Ins. Decoder (src1, src2, dst, imm field), Register File, sign extender (sxt), Data Memory, and the PC adders (+ 4, + imm field), with control signals PCSrc, RegWrite, MemWrite, ALUOp, ALUSrc, and RegDataSrc.]
Control signals
• Registers
  • RegDataSrc ("MemToReg") controls what data is written to them.
  • RegWrite says whether or not a register is being written.
  • src1, src2, and dst come from the instruction.
• ALU
  • ALUSrc says what the second operand is (register/immediate).
  • ALUOp controls what the ALU will do (add, sub, and, or, etc.).
• Memory
  • MemWrite says whether or not we're writing to memory.
• PC
  • PCSrc controls where the PC will get the next instruction from.
How add/sub/and/or/slt work
• Example: add t0, t3, s0
• dst = t0, src1 = t3, src2 = s0
• PCSrc: next instruction
• RegWrite: enable
• MemWrite: disable
• ALUSrc: from reg
• ALUOp: add
• RegDataSrc: from ALU
How an lw works
• Example: lw s4, 12(s0)
• dst = s4, src1 = s0, src2 = x (don't care)
• imm field: 12, sign-extended (sxt)
• PCSrc: next instruction
• RegWrite: enable
• MemWrite: disable
• ALUSrc: from imm
• ALUOp: add
• RegDataSrc: from Mem
How an sw works
• Example: sw t3, 8(sp)
• dst = x (don't care), src1 = sp, src2 = t3
• imm field: 8, sign-extended (sxt)
• PCSrc: next instruction
• RegWrite: disable
• MemWrite: enable
• ALUSrc: from imm
• ALUOp: add
• RegDataSrc: x (don't care)
What about beq?
• We compare numbers by subtracting.
• But then we have to see if the result is 0.
• If the current instruction is beq, AND the result is 0, we set PCSrc to use the branch target.
• Otherwise, we set PCSrc to PC + 4.
• Our instruction decoder outputs 1 for isBEQ when the instruction is beq and 0 otherwise.
• When PCSrc is 1, we use the branch target. When it's 0, we go to the next instruction.
[Diagram: isBEQ (from the decoder) and isZero (from the ALU) feed an AND gate whose output is PCSrc.]
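The beq rule above boils down to one AND gate. A minimal Python sketch of that logic; the function name `pc_src` and its boolean parameters are illustrative and not from the slides:

```python
def pc_src(is_beq: bool, is_zero: bool) -> int:
    """Return 1 to select the branch target, 0 to select PC + 4."""
    # PCSrc is 1 only when the instruction is beq AND the ALU result is zero.
    return int(is_beq and is_zero)


# Try all four input combinations.
for is_beq in (False, True):
    for is_zero in (False, True):
        print(f"isBEQ={int(is_beq)} isZero={int(is_zero)} -> PCSrc={pc_src(is_beq, is_zero)}")
```

Only the isBEQ=1, isZero=1 row selects the branch target; a non-branch instruction whose ALU result happens to be 0 still falls through to PC + 4.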
How a beq works
• Example: beq t0, t1, top
• dst = x (don't care), src1 = t0, src2 = t1
• imm field: top
• PCSrc: take the branch-target path (green in the diagram) when t0 == t1; take the PC + 4 path (red) when t0 != t1
• RegWrite: disable
• MemWrite: disable
• ALUSrc: from reg
• ALUOp: sub
• RegDataSrc: x (don't care)
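The control settings from the add, lw, sw, and beq walkthroughs can be gathered into one lookup table. This is only a sketch: the dictionary layout, the lowercase string values, and the use of `None` for "x" (don't care) are assumptions made for illustration, not the course's encoding. PCSrc is left out because for beq it depends on the ALU result.

```python
# Control signals per instruction, as read off the walkthrough slides.
# None stands for "x" (don't care); this representation is an assumption.
CONTROL = {
    "add": {"RegWrite": True,  "MemWrite": False, "ALUSrc": "reg", "ALUOp": "add", "RegDataSrc": "alu"},
    "lw":  {"RegWrite": True,  "MemWrite": False, "ALUSrc": "imm", "ALUOp": "add", "RegDataSrc": "mem"},
    "sw":  {"RegWrite": False, "MemWrite": True,  "ALUSrc": "imm", "ALUOp": "add", "RegDataSrc": None},
    "beq": {"RegWrite": False, "MemWrite": False, "ALUSrc": "reg", "ALUOp": "sub", "RegDataSrc": None},
}


def writes_state(op: str) -> bool:
    """True if the instruction writes a register or memory."""
    signals = CONTROL[op]
    return signals["RegWrite"] or signals["MemWrite"]


for op, signals in CONTROL.items():
    print(op, signals, "writes state" if writes_state(op) else "writes nothing")
```

Laid out this way, it is easy to see that lw and sw both use the ALU as an adder for the address, and that beq is the only one of the four that writes neither a register nor memory.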
What about j?
• We have to add another input to the PCSrc mux.
• The mux inputs are now PC+4, PC+4+imm, and the jump target, so PCSrc is now 2 bits.
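A small sketch of the three-input PCSrc mux. The select encoding (0, 1, 2) is an assumption, since the slides only say PCSrc grows to 2 bits, and the branch target is computed as PC+4+imm exactly as labeled on the slide. The function and parameter names are illustrative.

```python
def next_pc(pc: int, imm: int, jump_target: int, pc_src: int) -> int:
    """Model the PCSrc mux; the 0/1/2 select encoding is assumed."""
    inputs = {
        0: pc + 4,        # next sequential instruction
        1: pc + 4 + imm,  # branch target, as labeled on the slide
        2: jump_target,   # jump target
    }
    return inputs[pc_src]


print(hex(next_pc(0x100, 0x20, 0x400, 0)))  # 0x104
print(hex(next_pc(0x100, 0x20, 0x400, 1)))  # 0x124
print(hex(next_pc(0x100, 0x20, 0x400, 2)))  # 0x400
```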
Clocking
The clock signal
• The clock is like an orchestra conductor. It keeps everything in sync.
• The clock is a signal that alternates regularly between 0 and 1:
[Diagram: a square wave alternating between 0 and 1 over time]
• There are many different ways of synchronizing things to the clock.
• But really, we don't care. The important thing is that the clock event, whatever it is, stores data into registers/memories.
Clock speed
• Propagation delay is the amount of time it takes for an electrical signal to pass from a circuit's inputs to its outputs.
• There are lots of factors which go into this...
• If you've got a circuit like this:
[Diagram: two registers, A and B, each with D, Q, and EN pins, and a "+ 1" adder, with IN and OUT labels]
• Let's say registers take 2 ns to propagate their value from D to Q after being clocked, and the adder takes 6 ns to compute the sum.
• How long does it take for data to flow from A's D to B's D, assuming A is clocked at time 0?
Clock speed
• If the propagation delay between A's D and B's D is 8 ns, we can't run the clock any faster than that.
• When the outputs are changing, we say they're invalid.
• If we clocked faster, we'd store invalid values into registers.
• So the fastest we can clock this circuit is 8 ns (8 × 10⁻⁹ s) between pulses, or... how many Hz (1/s)?
• The critical path is the part of a circuit that has the longest propagation delay, and determines the overall clock speed.
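The arithmetic behind the two Clock speed slides, as a short sketch. The delays are the ones given on the slides (2 ns register, 6 ns adder); the variable names are just for illustration.

```python
reg_delay_ns = 2    # register D-to-Q delay after being clocked
adder_delay_ns = 6  # time for the adder to compute the sum

# Data leaves A's Q after 2 ns, then needs 6 ns more to get through the adder.
critical_path_ns = reg_delay_ns + adder_delay_ns

# Frequency is the reciprocal of the clock period (converted to seconds).
max_clock_hz = 1 / (critical_path_ns * 1e-9)

print(f"critical path: {critical_path_ns} ns")
print(f"max clock: {max_clock_hz / 1e6:.0f} MHz")
```

With an 8 ns period the clock tops out at 125 MHz.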
Pipelining (finally!)
Multi-cycle instruction execution
• Let's watch how an instruction flows through the datapath.
[Animation: with each "Clock!", an add advances through the IF, ID, EX, MEM, and WB stages (Memory, Ins. Decoder, Register File, ALU, Memory). Captions: "Set all control signals...", "Add...", "Data flows back to registers..."]
Pipelined instruction execution
• Pipelining is just an extension of that idea!
[Diagram: the same datapath (Memory, Ins. Decoder, Register File, ALU, Memory) with add, sw, and sub each occupying a different stage among IF, ID, EX, MEM, and WB at the same time]
What did we gain?
• Did we make the individual instructions faster?
• Is performance better? Why?
• How often are we completing instructions?
• If the single-cycle datapath had an 8 ns clock cycle and this one has a 2 ns clock cycle:
  • How long does each instruction take? (in general)
  • How many instructions can we complete in 1 second?
• Assuming everything was perfect, would we be able to go faster if we had 10 pipeline stages and a 1 ns clock cycle? By how much?
• We gotta be able to compare performance to know that we're making things faster!
• If you're not writing it down, you're just screwing around.
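One way to write these numbers down, as a hedged sketch. It assumes the 2 ns design has the five stages shown earlier (IF, ID, EX, MEM, WB), that the pipeline is always full, and that there are no hazards or stalls. The function names are illustrative.

```python
def latency_ns(stages: int, cycle_ns: float) -> float:
    """Time for one instruction to pass through every stage."""
    return stages * cycle_ns


def throughput_per_s(cycle_ns: float) -> float:
    """Instructions completed per second with one completing every cycle."""
    return 1e9 / cycle_ns


# (number of stages, clock cycle in ns); the 5-stage count is an assumption.
designs = {"single-cycle": (1, 8), "5-stage": (5, 2), "10-stage": (10, 1)}

for name, (stages, cycle_ns) in designs.items():
    print(f"{name}: latency {latency_ns(stages, cycle_ns)} ns, "
          f"throughput {throughput_per_s(cycle_ns):,.0f} instr/s")
```

Under these assumptions each instruction actually takes longer in the pipelined designs (10 ns versus 8 ns), but throughput rises from 125 million to 500 million instructions per second, and the idealized 10-stage design doubles that again.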
Ugly architectures (COUGH x86 COUGH)
• Here's some garbage.
• Well, it's just the purple and green parts that are awful.
• x86 is gross and wasn't designed to be pipelinable.
• Problems:
  • Instructions are variable length and difficult to decode
  • Memory accesses can be unaligned
  • Many instructions can access memory
Microprocessor without Interlocked Pipeline Stages
• That's what MIPS meant! And it's designed to be pipelined.
• All instructions are the same length (32 bits).
• Instruction formats are simple and regular.
• Only load/store instructions can access memory, and they can only do one memory access. (inc [rax] in x86 does two.)
• Memory accesses must be aligned, which means each load/store only needs ONE memory transfer.
  • This is because memory hardware can usually only access larger-than-byte blocks.
• But we'll still run into issues.
Timelining pipelining (ha)
• It's useful to show how instructions will move through the CPU.
add t0,t1,t2    IF   ID   EX   MEM  WB
add t3,t4,t5         IF   ID   EX   MEM  WB
add s0,s1,s2              IF   ID   EX   MEM  WB
add s3,s4,s5                   IF   ID   EX   MEM  WB
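A timeline like this is easy to generate. The sketch below assumes an ideal five-stage pipeline where instruction i enters IF in cycle i and nothing ever stalls; `timeline` and `STAGES` are illustrative names.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def timeline(instructions):
    """Return one row per instruction; row[c] is its stage in cycle c, or ''."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i in range(len(instructions)):
        row = [""] * total_cycles
        # Instruction i starts one cycle after instruction i - 1.
        for s, stage in enumerate(STAGES):
            row[i + s] = stage
        rows.append(row)
    return rows


program = ["add t0,t1,t2", "add t3,t4,t5", "add s0,s1,s2", "add s3,s4,s5"]
for ins, row in zip(program, timeline(program)):
    print(f"{ins:<16}" + "".join(f"{stage:<5}" for stage in row))
```

Four instructions finish in 4 + 5 - 1 = 8 cycles here, instead of the 20 cycles they would need if each one ran all five stages before the next started.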
Wow that was easy!!!
• Now we're done with pipelining!................
• NO
• WE'RE NOT
• What if they were all lw instructions instead?
Structural hazards
• Two instructions need to use the same hardware at the same time.
lw t0,0($0)     IF   ID   EX   MEM  WB
lw t1,4($0)          IF   ID   EX   MEM  WB
lw t2,8($0)               IF   ID   EX   MEM  WB
lw t3,12($0)                   IF   ID   EX   MEM  WB
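A sketch of how such a conflict could be spotted mechanically. It rests on a loudly stated assumption that is not in the slides: that instruction fetch (IF) and data access (MEM) share a single memory, unlike the Minimal MIPS diagram, which draws separate instruction and data memories. All names are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]
MEMORY_STAGES = ("IF", "MEM")  # assumption: both stages use one shared memory


def memory_conflicts(n_instructions):
    """Return (cycle, users) pairs where two instructions need memory at once."""
    conflicts = []
    for cycle in range(n_instructions + len(STAGES) - 1):
        users = []
        for i in range(n_instructions):
            s = cycle - i  # stage index of instruction i in this cycle
            if 0 <= s < len(STAGES) and STAGES[s] in MEMORY_STAGES:
                users.append((i, STAGES[s]))
        if len(users) > 1:
            conflicts.append((cycle + 1, users))  # report cycles starting at 1
    return conflicts


print(memory_conflicts(4))
```

For the four lw instructions, the first lw reaches MEM in the same cycle (cycle 4) that the fourth lw is in IF, so under the shared-memory assumption both want the memory at once.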
Data hazards
• An instruction depends on the output of a previous one.
add t0,t1,t2    IF   ID   EX   MEM  WB
sub s0,t0,t1         IF   ID   EX   MEM  WB
• The sub has to wait until the add's WB phase is over before it can do its EX phase
• Or does it...?
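The dependence itself can be checked with a few lines of code. This sketch only handles three-register instructions written as `op dst,src1,src2`; the helper names `parse` and `raw_hazard` are illustrative.

```python
def parse(ins):
    """Split 'op dst,src1,src2' into (op, dst, [src1, src2])."""
    op, operands = ins.split(maxsplit=1)
    dst, *srcs = [reg.strip() for reg in operands.split(",")]
    return op, dst, srcs


def raw_hazard(producer, consumer):
    """True if consumer reads the register that producer writes."""
    _, dst, _ = parse(producer)
    _, _, srcs = parse(consumer)
    return dst in srcs


print(raw_hazard("add t0,t1,t2", "sub s0,t0,t1"))  # True
print(raw_hazard("add t0,t1,t2", "sub s0,t3,t1"))  # False
```

The first pair is the case from the slide: sub reads t0, which add has not yet written back.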
Control hazards
• You don't know the outcome of a conditional branch.
beq t0,$0,end   IF   ID   EX   MEM  WB
add t0,t1,t2         IF   ID   EX   MEM  WB
• Uh oh, turns out we SHOULD have taken the branch...
• What happens to the add instruction?
• What could we have done instead?
All this and more...
• On the next episode of CS 1541!