
COMP541 Datapaths II & Single-Cycle MIPS

COMP541 Datapaths II & Single-Cycle MIPS. Montek Singh, Apr 2, 2012. Topics: complete the datapath; add control to it; create a full single-cycle MIPS. Reading: Chapter 7. Review MIPS assembly language: Chapter 6 of the course textbook, or Patterson & Hennessy (inside flap).


Presentation Transcript


  1. COMP541 Datapaths II & Single-Cycle MIPS • Montek Singh • Apr 2, 2012

  2. Topics • Complete the datapath • Add control to it • Create a full single-cycle MIPS! • Reading • Chapter 7 • Review MIPS assembly language • Chapter 6 of course textbook • Or, Patterson Hennessy (inside flap)

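  3. Top-Level CPU (MIPS) [Block diagram: the MIPS processor (inputs clk, reset) sends pc[31:2] to the Instr Memory and receives instr; it drives memwrite, dataadr, and writedata to the Data Memory (clocked by clk) and receives readdata.]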

  4. Top-Level CPU: Verilog

         module top(input clk, reset, output … );  // add signals here for debugging

           wire [31:0] pc, instr, readdata, writedata, dataadr;
           wire        memwrite;

           mips mips(clk, reset, pc, instr, memwrite, dataadr, writedata, readdata);  // processor
           imem imem(pc[31:2], instr);                                                // instr memory
           dmem dmem(clk, memwrite, dataadr, writedata, readdata);                    // data memory
         endmodule
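
     The imem and dmem modules instantiated in top are not shown on this slide. A minimal sketch of what they might look like follows. The 64-word depth, the port names (a, rd, we, wd), and the file name memfile.dat are all assumptions made for illustration; only the port order is taken from the positional connections in top.

     ```verilog
     // Sketch only: memory depth, port names, and the file name are assumptions.
     module imem(input  [29:0] a,       // word address, i.e. pc[31:2]
                 output [31:0] rd);     // instruction stored at that address
       reg [31:0] rom[63:0];            // 64 words (assumed size)

       initial $readmemh("memfile.dat", rom);  // hypothetical program image

       assign rd = rom[a[5:0]];         // combinational read, low 6 address bits
     endmodule

     module dmem(input         clk, we,
                 input  [31:0] a, wd,   // byte address and write data
                 output [31:0] rd);     // word read from memory
       reg [31:0] ram[63:0];            // 64 words (assumed size)

       assign rd = ram[a[7:2]];         // combinational read, word aligned

       always @(posedge clk)
         if (we) ram[a[7:2]] <= wd;     // synchronous write when we (memwrite) is high
     endmodule
     ```

     Both reads are combinational, which is what a single-cycle design needs: the instruction fetch and the data read must complete within the same clock cycle.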

  5. Top Level Schematic (ISE) [Schematic showing the imem, MIPS, and dmem blocks]

  6. One level down: Inside MIPS

         module mips(input         clk, reset,
                     output [31:0] pc,
                     input  [31:0] instr,
                     output        memwrite,
                     output [31:0] aluout, writedata,
                     input  [31:0] readdata);

           wire       memtoreg, branch, pcsrc, alusrc, regdst, regwrite, jump;
           wire [4:0] alucontrol;  // depends on your ALU
           wire [3:0] flags;       // flags = {Z, V, C, N}

           controller c(instr[31:26], instr[5:0], flags,
                        memtoreg, memwrite, pcsrc, alusrc,
                        regdst, regwrite, jump, alucontrol);
           datapath dp(clk, reset, memtoreg, pcsrc, alusrc,
                       regdst, regwrite, jump, alucontrol, flags,
                       pc, instr, aluout, writedata, readdata);
         endmodule

  7. A Note on Flags • Book’s design only uses Z (zero) • simple version of MIPS • allows beq, bne, slt type of tests • Our design uses { Z, V, C, N } flags • Z = zero • V = overflow • C = carry out • N = negative • Allows richer variety of instructions • see next slide • wherever you see “zero” in these slides, it should probably read “flags”

  8. A Note on Flags • 4 flags produced by ALU: • Z (zero): result is = 0 (big NOR gate) • N (negative): result is < 0 (S[N-1], the most significant bit of the result) • C (carry): indicates that the most significant position produced a carry, e.g., “1 + (-1)” (carry from the last FA) • V (overflow): indicates answer doesn’t fit • precisely: [formula not preserved in this transcript] • To compare A and B, perform A - B and use condition codes (⊕ = XOR, + = OR, ~ = NOT):

         Signed comparison        Unsigned comparison
         LT   N⊕V                 LTU  C
         LE   Z+(N⊕V)             LEU  C+Z
         EQ   Z                   GEU  ~C
         NE   ~Z                  GTU  ~(C+Z)
         GE   ~(N⊕V)
         GT   ~(Z+(N⊕V))
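
     The signed conditions can be sketched in Verilog, reading the LT condition as N XOR V (the usual signed less-than test after a subtraction). The module name signedcompare and its output names are hypothetical; the flag ordering {Z, V, C, N} is the one used in these slides. The unsigned conditions are left out here because their polarity depends on whether C is treated as a carry or as a borrow during subtraction.

     ```verilog
     // Hypothetical helper: decodes signed comparison results from the ALU
     // flags produced by computing A - B. Flag order follows the slides.
     module signedcompare(input  [3:0] flags,   // {Z, V, C, N}
                          output       lt, le, eq, ne, ge, gt);
       wire z = flags[3];      // zero
       wire v = flags[2];      // overflow
       wire n = flags[0];      // negative

       wire less = n ^ v;      // A < B (signed): sign of result, corrected for overflow

       assign lt = less;
       assign le = z | less;
       assign eq = z;
       assign ne = ~z;
       assign ge = ~less;
       assign gt = ~(z | less);
     endmodule
     ```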

  9. Datapath flags(3:0)

  10. MIPS State Elements • We’ll fill out the datapath and control logic for basic single cycle MIPS • first the datapath • then the control logic

  11. Single-Cycle Datapath: lw • Let’s start by implementing lw instruction

  12. Single-Cycle Datapath: lw • First consider executing lw • How does lw work? • STEP 1: Fetch instruction

  13. Single-Cycle Datapath: lw • STEP 2: Read source operands from register file

  14. Single-Cycle Datapath: lw • STEP 3: Sign-extend the immediate
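
     Sign extension copies bit 15 of the 16-bit immediate into the upper 16 bits of a 32-bit word. A minimal sketch, with module and port names chosen for illustration:

     ```verilog
     // Sketch: sign-extend a 16-bit immediate to 32 bits.
     module signext(input  [15:0] a,    // immediate field, instr[15:0]
                    output [31:0] y);   // sign-extended value
       assign y = {{16{a[15]}}, a};     // replicate the sign bit 16 times
     endmodule
     ```

     For example, an immediate of 16'hFFFC (-4) becomes 32'hFFFFFFFC, and 16'h0004 becomes 32'h00000004.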

  15. Single-Cycle Datapath: lw • STEP 4: Compute the memory address • Note the control signals

  16. Single-Cycle Datapath: lw • STEP 5: Read data from memory and write it back to register file

  17. Single-Cycle Datapath: lw • STEP 6: Determine the address of the next instruction
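
      For lw, the next instruction is simply the one at PC + 4. A minimal sketch of the PC register and its adder follows; the module name nextpc and the reset value of 0 are assumptions. Branches and jumps, which need a mux in front of the PC register, are not handled here.

      ```verilog
      // Sketch: program counter for straight-line execution only.
      module nextpc(input             clk, reset,
                    output reg [31:0] pc,        // address of the current instruction
                    output     [31:0] pcplus4);  // address of the next instruction
        assign pcplus4 = pc + 32'd4;             // instructions are 4 bytes wide

        always @(posedge clk, posedge reset)
          if (reset) pc <= 32'b0;                // assumed reset address
          else       pc <= pcplus4;              // advance once per clock cycle
      endmodule
      ```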

  18. Let’s be Clear: CPU is Single-Cycle! • Although the slides said “STEP” … • … all that stuff is executed in one cycle!!! • Let’s look at sw next … • … and then R-type instructions

  19. Single-Cycle Datapath: sw • Write data in rt to memory • nothing is written back into the register file

  20. Single-Cycle Datapath: R-type instr • R-Type instructions: • Read from rs and rt • Write ALUResult to register file • Write to rd (instead of rt)
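
      Supporting both lw and R-type instructions takes two multiplexers on the write-back path: one chooses the destination register (rt or rd) and one chooses the value written (memory data or ALU result). A sketch under the assumption that regdst = 1 selects rd and memtoreg = 1 selects memory data, which matches the main decoder encodings given later in these slides; the module name writeback is hypothetical.

      ```verilog
      // Sketch: write-back selection for lw and R-type instructions.
      module writeback(input  [31:0] instr,      // current instruction
                       input  [31:0] aluout,     // ALU result
                       input  [31:0] readdata,   // data read from memory
                       input         regdst,     // 1: write rd, 0: write rt
                       input         memtoreg,   // 1: write memory data, 0: write ALU result
                       output [4:0]  writereg,   // destination register number
                       output [31:0] result);    // value written to the register file
        assign writereg = regdst   ? instr[15:11] : instr[20:16];
        assign result   = memtoreg ? readdata     : aluout;
      endmodule
      ```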

  21. Single-Cycle Datapath: beq • Determine whether values in rs and rt are equal • Calculate branch target address: BTA = (sign-extended immediate << 2) + (PC+4)
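
      The branch target formula maps directly onto a shift and an adder. A minimal sketch, with hypothetical module and port names:

      ```verilog
      // Sketch: BTA = (sign-extended immediate << 2) + (PC + 4).
      module branchtarget(input  [31:0] pcplus4,   // PC + 4
                          input  [31:0] signimm,   // sign-extended immediate
                          output [31:0] bta);      // branch target address
        // Shifting left by 2 converts a word offset into a byte offset.
        assign bta = pcplus4 + {signimm[29:0], 2'b00};
      endmodule
      ```

      With PC + 4 = 32'h00000010 and an immediate of -1 (32'hFFFFFFFF after sign extension), the target is 32'h0000000C.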

  22. Complete Single-Cycle Processor (w/control)

  23. Note: Difference due to Flags • Our Control Unit will be slightly different • … because of the extra flags • All flags (Z, V, C, N) are inputs to the control unit • Signals such as PCSrc are produced inside the control unit

  24. Control Unit • Generally as shown below • but some differences because our ALU is more sophisticated [Figure annotations: flags[3:0] and PCSrc labels; “Note: This will be different for our full-feature ALU!”; “Note: This will be 5 bits for our full-feature ALU!”]

  25. Review: Lightweight ALU from book

  26. Review: Lightweight ALU from book

  27. Review: Our “full feature” ALU • Full-feature ALU from COMP411 [Block diagram: inputs A and B feed the Add/Sub unit, the Bidirectional Barrel Shifter, and the Boolean unit; the 5-bit ALUFN (Sub, Bool, Shft, Math) controls the units and the output muxes that select result R; flag outputs V, C, N, and Z.]

          Sub  Bool  Shft  Math  OP
          0    XX    0     1     A+B
          1    XX    0     1     A-B
          X    X0    1     1     0
          X    X1    1     1     1
          X    00    1     0     B<<A
          X    10    1     0     B>>A
          X    11    1     0     B>>>A
          X    00    0     0     A & B
          X    01    0     0     A | B
          X    10    0     0     A ^ B
          X    11    0     0     A | B

  28. Review: R-Type instructions • Register-type • 3 register operands: • rs, rt: source registers • rd: destination register • Other fields: • op: the operation code or opcode (0 for R-type instructions) • funct: the function • together, op and funct tell the computer which operation to perform • shamt: the shift amount for shift instructions, otherwise it is 0
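
      The R-type fields occupy fixed bit positions in the 32-bit instruction word, so extracting them is pure wiring. A sketch, with a hypothetical module name:

      ```verilog
      // Sketch: split an R-type instruction into its six fields.
      module rtypefields(input  [31:0] instr,
                         output [5:0]  op,      // opcode (0 for R-type)
                         output [4:0]  rs, rt,  // source registers
                         output [4:0]  rd,      // destination register
                         output [4:0]  shamt,   // shift amount
                         output [5:0]  funct);  // function code
        assign op    = instr[31:26];
        assign rs    = instr[25:21];
        assign rt    = instr[20:16];
        assign rd    = instr[15:11];
        assign shamt = instr[10:6];
        assign funct = instr[5:0];
      endmodule
      ```

      This is why the controller shown in these slides is connected to instr[31:26] and instr[5:0]: those are the op and funct fields.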

  29. Controller (2 modules)

          module controller(input  [5:0] op, funct,
                            input  [3:0] flags,
                            output       memtoreg, memwrite,
                            output       pcsrc, alusrc,
                            output       regdst, regwrite,
                            output       jump,
                            output [2:0] alucontrol);  // 5 bits for our ALU!!

            wire [1:0] aluop;   // This will be different for our ALU
            wire       branch;

            maindec md(op, memtoreg, memwrite, branch, alusrc,
                       regdst, regwrite, jump, aluop);
            aludec  ad(funct, aluop, alucontrol);

            assign pcsrc = branch & flags[3];  // flags = {Z, V, C, N}, so flags[3] is Z
          endmodule

  30. Main Decoder

          module maindec(input  [5:0] op,
                         output       memtoreg, memwrite, branch, alusrc,
                         output       regdst, regwrite, jump,
                         output [1:0] aluop);  // different for our ALU

            reg [8:0] controls;

            assign {regwrite, regdst, alusrc, branch, memwrite,
                    memtoreg, jump, aluop} = controls;

            always @(*)  // combinational logic, so blocking assignments are used
              case(op)
                6'b000000: controls = 9'b110000010; // Rtype
                6'b100011: controls = 9'b101001000; // LW
                6'b101011: controls = 9'b001010000; // SW
                6'b000100: controls = 9'b000100001; // BEQ
                6'b001000: controls = 9'b101000000; // ADDI
                6'b000010: controls = 9'b000000100; // J
                default:   controls = 9'bxxxxxxxxx; // ???
              endcase
          endmodule

      • Why do this? • This entire coding may be different in our design

  31. ALU Decoder

          module aludec(input      [5:0] funct,
                        input      [1:0] aluop,
                        output reg [2:0] alucontrol);  // 5 bits for our ALU!!

            always @(*)  // combinational logic, so blocking assignments are used
              case(aluop)
                2'b00: alucontrol = 3'b010;  // add
                2'b01: alucontrol = 3'b110;  // sub
                default: case(funct)         // RTYPE
                    6'b100000: alucontrol = 3'b010; // ADD
                    6'b100010: alucontrol = 3'b110; // SUB
                    6'b100100: alucontrol = 3'b000; // AND
                    6'b100101: alucontrol = 3'b001; // OR
                    6'b101010: alucontrol = 3'b111; // SLT
                    default:   alucontrol = 3'bxxx; // ???
                  endcase
              endcase
          endmodule

      • This entire coding will be different in our design

  32. Control Unit: ALU Decoder This entire coding will be different in our design

  33. Control Unit: Main Decoder

  34. Note on controller • The actual number and names of control signals may be somewhat different in our/your design • compared to the one given in the book • because we are implementing more features/instructions • SO BE VERY CAREFUL WHEN YOU DESIGN YOUR CPU!

  35. Single-Cycle Datapath Example: or

  36. Extended Functionality: addi • No change to datapath

  37. Control Unit: addi

  38. Adding Jumps: j

  39. Control Unit: Main Decoder

  40. Review: Processor Performance • Program Execution Time = (# instructions)(cycles/instruction)(seconds/cycle) = # instructions × CPI × T_c

  41. Single-Cycle Performance • T_c is limited by the critical path (lw)

  42. Single-Cycle Performance • Single-cycle critical path: • T_c = t_pcq_PC + t_mem + max(t_RFread, t_sext + t_mux) + t_ALU + t_mem + t_mux + t_RFsetup • In most implementations, limiting paths are: • memory, ALU, register file. • T_c = t_pcq_PC + 2 t_mem + t_RFread + t_ALU + t_RFsetup + t_mux

  43. Single-Cycle Performance Example • T_c = t_pcq_PC + 2 t_mem + t_RFread + t_mux + t_ALU + t_RFsetup = [30 + 2(250) + 150 + 25 + 200 + 20] ps = 925 ps • What’s the max clock frequency?
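
      The question on this slide can be answered directly from the cycle time, since the maximum clock frequency is the reciprocal of the critical-path delay:

      ```latex
      f_{\max} = \frac{1}{T_c}
               = \frac{1}{925\ \text{ps}}
               = \frac{1}{925 \times 10^{-12}\ \text{s}}
               \approx 1.08\ \text{GHz}
      ```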

  44. Single-Cycle Performance Example • For a program with 100 billion instructions executing on a single-cycle MIPS processor, • Execution Time = # instructions × CPI × T_c = (100 × 10^9)(1)(925 × 10^-12 s) = 92.5 seconds

  45. Next Time • Next class: • We’ll look at multi-cycle MIPS • Adding functionality to our design • Next lab: • Implement single-cycle CPU!
