160 likes | 257 Views
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 26 – Single Cycle CPU Datapath II. Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia. 98,389 UC Identity thefts?! .
E N D
inst.eecs.berkeley.edu/~cs61cCS61C : Machine Structures Lecture 26 – Single Cycle CPU Datapath II Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia 98,389 UC Identity thefts?! A laptop was stolen from Graddivision with names, SS #, birth dates of almost 106 former grad students at UC Berkeley. The thief may not know what they have! Sensitive data allowed on portables? Good idea…NOT! newscenter.berkeley.edu/security/grad/
How to Design a Processor: step-by-step • 1. Analyze instruction set architecture (ISA) => datapath requirements • meaning of each instruction is given by the register transfers • datapath must include storage element for ISA registers • datapath must support each register transfer • 2. Select set of datapath components and establish clocking methodology • 3. Assemble datapath meeting requirements • 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. • 5. Assemble the control logic (hard part!)
Step 3: Assemble DataPath meeting requirements • Register Transfer Requirements Datapath Assembly • Instruction Fetch • Read Operands and Execute Operation
Next Address Logic Address Instruction Memory 3a: Overview of the Instruction Fetch Unit • The common RTL operations • Fetch the Instruction: mem[PC] • Update the program counter: • Sequential Code: PC = PC + 4 • Branch and Jump: PC = “something else” Clk PC Instruction Word 32
3b: Add & Subtract • R[rd] = R[rs] op R[rt] Ex.: addU rd,rs,rt • Ra, Rb, and Rw come from instruction’s Rs, Rt, and Rd fields • ALUctr and RegWr: control logic after decoding the instruction 31 26 21 16 11 6 0 op rs rt rd shamt funct 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits Rd Rs Rt ALUctr RegWr 5 5 5 busA Rw Ra Rb busW 32 Result 32 32-bit Registers ALU 32 32 busB Clk 32 • We’ve already defined register file, ALU
. . . . . . . . . . . . Clocking Methodology Clk • Storage elements clocked by same edge • Being physical devices, flip-flops (FF) and combinational logic have some delays • Gates: delay from input change to output change • Signals at FF D input must be stable before active clock edge to allow signal to travel within the FF, and we have the usual clock-to-Q delay • “Critical path” (longest path through logic) determines length of clock period
Register-Register Timing: One complete cycle Clk New Value Old Value PC Instruction Memory Access Time Rs, Rt, Rd, Op, Func Old Value New Value Delay through Control Logic ALUctr Old Value New Value RegWr Old Value New Value Register File Access Time busA, B Old Value New Value ALU Delay busW Old Value New Value Rd Rs Rt ALUctr Register Write Occurs Here RegWr 5 5 5 busA Rw Ra Rb busW 32 Result 32 32-bit Registers ALU 32 32 busB Clk 32
11 31 26 21 16 0 op rs rt immediate 6 bits 5 bits 5 bits 16 bits rd? 31 16 15 0 immediate 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16 bits 16 bits Rd Rt RegDst Mux 3c: Logical Operations with Immediate • R[rt] = R[rs] op ZeroExt[imm16] ] What about Rt register read?? Rt? Rs ALUctr RegWr 5 5 5 busA Rw Ra Rb busW 32 Result 32 32-bit Registers ALU 32 32 busB Clk 32 Mux ZeroExt imm16 32 16 ALUSrc • Already defined 32-bit MUX; Zero Ext?
31 26 21 16 0 op rs rt immediate 6 bits 5 bits 5 bits 16 bits 3d: Load Operations • R[rt] = Mem[R[rs] + SignExt[imm16]] Example: lw rt,rs,imm16 Rd Rt RegDst Mux Rt Rs ALUctr RegWr 5 5 5 busA W_Src Rw Ra Rb busW 32 32 32-bit Registers ALU 32 32 busB Clk MemWr 32 Mux Mux WrEn Adr Data In 32 ?? Data Memory Extender 32 imm16 32 16 Clk ALUSrc ExtOp
31 26 21 16 0 op rs rt immediate 6 bits 5 bits 5 bits 16 bits 3e: Store Operations • Mem[ R[rs] + SignExt[imm16] ] = R[rt] Ex.: sw rt, rs, imm16 Rd Rt ALUctr MemWr W_Src RegDst Mux Rs Rt RegWr 5 5 5 busA Rw Ra Rb busW 32 32 32-bit Registers ALU 32 32 busB Clk 32 Mux Mux WrEn Adr Data In 32 32 Data Memory imm16 Extender 32 16 Clk ALUSrc ExtOp
31 26 21 16 0 op rs rt immediate 6 bits 5 bits 5 bits 16 bits 3f: The Branch Instruction • beq rs, rt, imm16 • mem[PC] Fetch the instruction from memory • Equal = R[rs] == R[rt] Calculate branch condition • if (Equal) Calculate the next instruction’s address • PC = PC + 4 + ( SignExt(imm16) x 4 ) else • PC = PC + 4
31 26 21 16 0 op rs rt immediate 6 bits 5 bits 5 bits 16 bits 4 Adder Mux PC Adder Clk Datapath for Branch Operations • beq rs, rt, imm16 Datapath generates condition (equal) Inst Address Cond nPC_sel Rs Rt RegWr 5 5 5 busA 32 Rw Ra Rb 00 busW 32 32 32-bit Registers Equal? busB Clk 32 imm16 PC Ext • Already MUX, adder, sign extend, zero
Inst Memory Adr Adder Mux Adder Putting it All Together:A Single Cycle Datapath Instruction<31:0> <0:15> <21:25> <16:20> <11:15> Rs Rt Rd Imm16 RegDst nPC_sel ALUctr MemWr MemtoReg Equal Rt Rd 0 1 Rs Rt 4 RegWr 5 5 5 busA Rw Ra Rb = 00 busW 32 32 32-bit Registers ALU 0 32 busB 32 0 PC 32 Mux Mux Clk 32 WrEn Adr 1 1 Data In Data Memory imm16 Extender 32 PC Ext Clk 16 imm16 Clk ExtOp ALUSrc
PC ALU Clk An Abstract View of the Implementation Control Ideal Instruction Memory Control Signals Conditions Instruction Rd Rs Rt 5 5 5 Instruction Address A Data Address Data Out 32 Rw Ra Rb 32 Ideal Data Memory 32 32 32-bit Registers Next Address Data In B Clk Clk 32 Datapath
Peer Instruction • Our ALU is a synchronous device • We could have used tri-state devices instead of a MUX to feed busW, the register write data line • The ALU is inactive for memory reads or writes. ABC 1: FFF 2: FFT 3: FTF 4: FTT 5: TFF 6: TFT 7: TTF 8: TTT
Summary: Single cycle datapath • 5 steps to design a processor • 1. Analyze instruction set => datapath requirements • 2. Select set of datapath components & establish clock methodology • 3. Assemble datapath meeting the requirements • 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. • 5. Assemble the control logic • Control is the hard part • Next time! Processor Input Control Memory Datapath Output