CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Single-Cycle Processor Design
Instructors: Randy H. Katz, David A. Patterson
http://inst.eecs.Berkeley.edu/~cs61c/fa10
Fall 2010 -- Lecture #26
Agenda
• MIPS-lite Datapath
• Administrivia
• Technology Break
• CPU Timing
The MIPS-lite Subset
• ADDU and SUBU
  • addu rd,rs,rt
  • subu rd,rs,rt
• OR Immediate:
  • ori rt,rs,imm16
• LOAD and STORE Word
  • lw rt,rs,imm16
  • sw rt,rs,imm16
• BRANCH:
  • beq rs,rt,imm16

Instruction formats (bit positions in brackets):
• addu, subu: op [31:26] 6 bits | rs [25:21] 5 bits | rt [20:16] 5 bits | rd [15:11] 5 bits | shamt [10:6] 5 bits | funct [5:0] 6 bits
• ori, lw, sw, beq: op [31:26] 6 bits | rs [25:21] 5 bits | rt [20:16] 5 bits | immediate [15:0] 16 bits
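The field boundaries above map directly onto shifts and masks. The following is a minimal sketch, not part of the lecture; the function name `decode` is hypothetical, and the sample word is the standard MIPS encoding of `addu $3, $1, $2`.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields.

    Both formats are extracted at once; the caller picks the fields
    that apply (rd/shamt/funct for addu and subu, imm16 for the rest).
    """
    return {
        "op":    (word >> 26) & 0x3F,   # bits 31:26
        "rs":    (word >> 21) & 0x1F,   # bits 25:21
        "rt":    (word >> 16) & 0x1F,   # bits 20:16
        "rd":    (word >> 11) & 0x1F,   # bits 15:11
        "shamt": (word >> 6) & 0x1F,    # bits 10:6
        "funct": word & 0x3F,           # bits 5:0
        "imm16": word & 0xFFFF,         # bits 15:0
    }

# addu $3, $1, $2 encodes as op=0, rs=1, rt=2, rd=3, shamt=0, funct=0x21
fields = decode(0x00221821)
```

Decoding every field unconditionally mirrors the hardware: the wires for rd and imm16 always carry bits, and control decides which ones matter.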
Register Transfer Language (RTL)
• RTL gives the meaning of the instructions
• All start by fetching the instruction
  {op, rs, rt, rd, shamt, funct} ← MEM[PC]
  {op, rs, rt, Imm16} ← MEM[PC]

Inst     Register Transfers
ADDU     R[rd] ← R[rs] + R[rt]; PC ← PC + 4
SUBU     R[rd] ← R[rs] - R[rt]; PC ← PC + 4
ORI      R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
LOAD     R[rt] ← MEM[R[rs] + sign_ext(Imm16)]; PC ← PC + 4
STORE    MEM[R[rs] + sign_ext(Imm16)] ← R[rt]; PC ← PC + 4
BEQ      if (R[rs] == R[rt]) then PC ← PC + 4 + (sign_ext(Imm16) || 00) else PC ← PC + 4
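The register transfers can be read as an executable specification. Below is a rough sketch under stated assumptions: instructions are given as already-decoded tuples rather than fetched from MEM[PC], memory is a Python dict keyed by byte address, and unwritten memory reads as 0. The names `step`, `sign_ext`, and `zero_ext` are illustrative only.

```python
MASK = 0xFFFFFFFF  # keep every value within 32 bits

def sign_ext(imm16):
    """Interpret a 16-bit field as a signed Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def zero_ext(imm16):
    """Interpret a 16-bit field as an unsigned Python integer."""
    return imm16 & 0xFFFF

def step(state, instr):
    """Apply the register transfers of one instruction.

    state is {"R": list of 32 ints, "MEM": dict, "PC": int};
    instr is (name, rs, rt, rd, imm16), with unused fields set to 0.
    """
    name, rs, rt, rd, imm = instr
    R, MEM = state["R"], state["MEM"]
    next_pc = state["PC"] + 4
    if name == "addu":
        R[rd] = (R[rs] + R[rt]) & MASK
    elif name == "subu":
        R[rd] = (R[rs] - R[rt]) & MASK
    elif name == "ori":
        R[rt] = R[rs] | zero_ext(imm)
    elif name == "lw":
        # Assumption: memory that was never written reads as 0.
        R[rt] = MEM.get((R[rs] + sign_ext(imm)) & MASK, 0)
    elif name == "sw":
        MEM[(R[rs] + sign_ext(imm)) & MASK] = R[rt]
    elif name == "beq":
        if R[rs] == R[rt]:
            next_pc += sign_ext(imm) << 2   # (sign_ext(Imm16) || 00)
    else:
        raise ValueError("not in the MIPS-lite subset: " + name)
    state["PC"] = next_pc & MASK
```

Every branch of `step` ends with the same PC update, which is why the fetch unit can be designed once and shared by all instructions.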
Processor Design Process
• Five steps to design a processor:
  Step 1: Analyze instruction set to determine datapath requirements
  Step 2: Select set of datapath components & establish clock methodology
  Step 3: Assemble datapath components that meet the requirements
  Step 4: Analyze implementation of each instruction to determine setting of control points that realizes the register transfer
  Step 5: Assemble the control logic
Step 1: Requirements of the Instruction Set
• Memory (MEM)
  • Instructions & data (will use one for each)
• Registers (R: 32 x 32)
  • Read RS
  • Read RT
  • Write RT or RD
• PC
• Extender (sign/zero extend)
• Add/Sub/OR unit for operation on register(s) or extended immediate
• Add 4 (+ maybe extended immediate) to PC
• Compare registers?
Generic Steps of Datapath
1. Instruction Fetch
2. Decode/Register Read
3. Execute
4. Memory
5. Register Write
[Figure: datapath with PC, +4, instruction memory, registers (rs, rt, rd, imm), ALU, and data memory]
Step 2: Components of the Datapath
• Combinational Elements
• Storage Elements + Clocking Methodology
• Building Blocks
  • Adder: 32-bit inputs A and B, CarryIn; 32-bit Sum, CarryOut
  • Multiplexer (MUX): 32-bit inputs A and B, Select; 32-bit output Y
  • ALU: 32-bit inputs A and B, OP; 32-bit Result
ALU Needs for MIPS-lite + Rest of MIPS
• Addition, subtraction, logical OR, ==:
  ADDU R[rd] = R[rs] + R[rt]; ...
  SUBU R[rd] = R[rs] - R[rt]; ...
  ORI R[rt] = R[rs] | zero_ext(Imm16) ...
  BEQ if (R[rs] == R[rt]) ...
• Test to see if output == 0 for any ALU operation gives == test. How?
• P&H also adds AND, Set Less Than (1 if A < B, 0 otherwise)
• ALU follows Chapter 5
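One way to answer the "How?" above: A - B is zero exactly when A equals B, so a zero test on the SUB result doubles as the == test. The sketch below assumes the three ALUctr settings named in this lecture ("ADD", "SUB", "OR"); the function name `alu` and the tuple return are illustrative choices.

```python
MASK = 0xFFFFFFFF  # results wrap to 32 bits

def alu(a, b, alu_ctr):
    """Return (result, zero) for 32-bit inputs a and b.

    zero is 1 when the result is all zeros, for any operation.
    """
    if alu_ctr == "ADD":
        result = (a + b) & MASK
    elif alu_ctr == "SUB":
        result = (a - b) & MASK
    elif alu_ctr == "OR":
        result = (a | b) & MASK
    else:
        raise ValueError("unsupported ALUctr: " + alu_ctr)
    return result, int(result == 0)

def registers_equal(a, b):
    """The beq comparison: subtract and look only at the zero output."""
    _, zero = alu(a, b, "SUB")
    return zero == 1
```

Reusing the subtractor this way means no separate comparator is needed for beq.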
Storage Element: Idealized Memory
• Memory (idealized)
  • One input bus: Data In (32 bits)
  • One output bus: Data Out (32 bits)
• Memory word is found by:
  • Address selects the word to put on Data Out
  • Write Enable = 1: address selects the memory word to be written via the Data In bus
• Clock input (CLK)
  • CLK input is a factor ONLY during write operation
  • During read operation, behaves as a combinational logic block: Address valid → Data Out valid after "access time"
Storage Element: Register (Building Block)
• Similar to D Flip Flop except
  • N-bit input and output
  • Write Enable input
• Write Enable:
  • Negated (or deasserted) (0): Data Out will not change
  • Asserted (1): Data Out will become Data In on positive edge of clock
[Figure: register with N-bit Data In, N-bit Data Out, Write Enable, and clk]
Storage Element: Register File
• Register File consists of 32 registers:
  • Two 32-bit output busses: busA and busB
  • One 32-bit input bus: busW
• Register is selected by:
  • RA (number) selects the register to put on busA (data)
  • RB (number) selects the register to put on busB (data)
  • RW (number) selects the register to be written via busW (data) when Write Enable is 1
• Clock input (clk)
  • Clk input is a factor ONLY during write operation
  • During read operation, behaves as a combinational logic block: RA or RB valid → busA or busB valid after "access time"
[Figure: 32 x 32-bit register file with 5-bit RW, RA, RB inputs, Write Enable, Clk, 32-bit busW in, and 32-bit busA and busB out]
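The split between combinational reads and clocked writes can be modeled in a few lines. This is only a behavioral sketch: the class name `RegFile` and the method names are hypothetical, and access time is not modeled.

```python
class RegFile:
    """Behavioral model of a 32 x 32-bit register file."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: busA and busB follow RA and RB, no clock involved."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, write_enable):
        """Call once per active clock edge; writes only when Write Enable is 1."""
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF

rf = RegFile()
rf.clock_edge(rw=3, bus_w=12, write_enable=1)   # register 3 becomes 12
rf.clock_edge(rw=4, bus_w=99, write_enable=0)   # ignored: Write Enable is 0
bus_a, bus_b = rf.read(3, 4)
```

Keeping the write in a separate `clock_edge` method makes it explicit that busW is sampled only at the edge, while reads can happen at any time within the cycle.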
Step 3: Assemble Datapath Meeting Requirements
• Register Transfer Requirements → Datapath Assembly
• Instruction Fetch
• Read Operands and Execute Operation
• Common RTL operations
  • Fetch the Instruction: mem[PC]
  • Update the program counter:
    • Sequential Code: PC ← PC + 4
    • Branch and Jump: PC ← "something else"
[Figure: PC (clocked by clk) supplies the Address to Instruction Memory, which outputs the 32-bit Instruction Word; Next Address Logic feeds PC]
Step 3: Add & Subtract
• R[rd] = R[rs] op R[rt] (addu rd,rs,rt)
• Ra, Rb, and Rw come from instruction's Rs, Rt, and Rd fields
• ALUctr and RegWr: control logic after decoding the instruction
• … Already defined the register file & ALU
Format: op [31:26] 6 bits | rs [25:21] 5 bits | rt [20:16] 5 bits | rd [15:11] 5 bits | shamt [10:6] 5 bits | funct [5:0] 6 bits
[Figure: register file (Rw = Rd, Ra = Rs, Rb = Rt, RegWr, clk) drives busA and busB into the ALU (ALUctr); the 32-bit Result returns on busW]
Agenda
• MIPS-lite
• Administrivia
• Technology Break
• CPU Timing
Clocking Methodology
• Storage elements clocked by same edge
• Flip-flops (FFs) and combinational logic have some delays
  • Gates: delay from input change to output change
  • Signals at FF D input must be stable before active clock edge to allow signal to travel within the FF (set-up time), and we have the usual clock-to-Q delay
• "Critical path" (longest path through logic) determines length of clock period
Register-Register Timing: One Complete Cycle
Signals change from Old Value to New Value in this order after the clock edge:
• PC
• Rs, Rt, Rd, Op, Func: after Instruction Memory Access Time
• ALUctr, RegWr: after Delay through Control Logic
• busA, busB: after Register File Access Time
• busW: after ALU Delay
• Register Write Occurs Here: at the clock edge that ends the cycle
[Figure: timing diagram for Clk and the signals above, next to the register file (Rw, Ra, Rb, RegWr, clk) and ALU (ALUctr)]
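The timing chain above suggests how a minimum clock period could be estimated. The sketch below is illustrative only: all delay values are made-up picosecond figures, not numbers from the lecture, and it assumes control decode and register read proceed in parallel once the instruction fields are valid, and that mux delays are negligible.

```python
def min_clock_period(d):
    """Sum the delays along the register-register critical path.

    Path assumed: clock-to-Q of PC, instruction memory access, the slower
    of control decode and register file read, ALU delay, then register
    file set-up before the next edge.
    """
    return (d["clk_to_q"]
            + d["imem_access"]
            + max(d["control"], d["regfile_access"])
            + d["alu"]
            + d["setup"])

# Hypothetical delays in picoseconds, chosen only for illustration.
delays_ps = {
    "clk_to_q": 30,
    "imem_access": 200,
    "control": 50,
    "regfile_access": 100,
    "alu": 120,
    "setup": 20,
}
period_ps = min_clock_period(delays_ps)   # 30 + 200 + 100 + 120 + 20 = 470
```

In a single-cycle design the clock must accommodate the slowest instruction, so a load, which also passes through data memory, would lengthen this estimate.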
Logical Operations with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
Format: op [31:26] 6 bits | rs [25:21] 5 bits | rt [20:16] 5 bits | immediate [15:0] 16 bits
ZeroExt[imm16]: bits [31:16] are sixteen 0s, bits [15:0] are the immediate
• But we're writing to Rt register??
• What about Rt register read??
• Already defined 32-bit MUX; Zero Ext?
[Figure: a mux controlled by RegDst selects Rd (1) or Rt (0) as Rw; a mux controlled by ALUSrc selects busB (0) or ZeroExt(imm16) (1) as the second ALU input; register file with RegWr and clk; ALU with ALUctr]
Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]
  Example: lw rt,rs,imm16
Format: op [31:26] 6 bits | rs [25:21] 5 bits | rt [20:16] 5 bits | immediate [15:0] 16 bits
[Figure: the immediate datapath with the ZeroExt block replaced by an Extender (ExtOp), plus a Data Memory (WrEn driven by MemWr, Adr, Data In, clk) and a mux controlled by MemtoReg that selects the ALU result (0) or the memory output (1) for busW]
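The difference between the two ExtOp settings is easiest to see on an immediate whose top bit is set. A minimal sketch follows; the function names `extend` and `load_address` are hypothetical, and the "zero"/"sign" strings simply reuse the ExtOp labels from this lecture.

```python
def extend(imm16, ext_op):
    """Widen a 16-bit immediate to 32 bits; ext_op is "zero" or "sign"."""
    imm16 &= 0xFFFF
    if ext_op == "sign" and imm16 & 0x8000:
        return imm16 | 0xFFFF0000   # replicate bit 15 into bits 31:16
    return imm16                    # bits 31:16 are zero

def load_address(r_rs, imm16):
    """Effective address for lw: R[rs] + SignExt[imm16], wrapped to 32 bits."""
    return (r_rs + extend(imm16, "sign")) & 0xFFFFFFFF

zero_version = extend(0xFFFC, "zero")   # 0x0000FFFC
sign_version = extend(0xFFFC, "sign")   # 0xFFFFFFFC, i.e. -4
```

Sign extension lets a load reach addresses below the base register, which is why lw needs ExtOp=sign while ori uses ExtOp=zero.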
RTL: The Add Instruction
add rd, rs, rt
• MEM[PC]: Fetch the instruction from memory
• R[rd] = R[rs] + R[rt]: The actual operation
• PC = PC + 4: Calculate the next instruction's address
Format: op [31:26] 6 bits | rs [25:21] 5 bits | rt [20:16] 5 bits | rd [15:11] 5 bits | shamt [10:6] 5 bits | funct [5:0] 6 bits
Instruction Fetch Unit at the Beginning of Add
• Fetch the instruction from Instruction memory: Instruction = MEM[PC]
  • Same for all instructions
[Figure: instruction fetch unit with PC (clk), Inst Memory (Inst Address in, Instruction<31:0> out), two adders (+4 and PC Ext of imm16), and a mux selected by nPC_sel]
Single Cycle Datapath during Add
• R[rd] = R[rs] + R[rt]
Format: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
Instruction<31:0> fields: Rs = <21:25>, Rt = <16:20>, Rd = <11:15>, Imm16 = <0:15>
Control signals: nPC_sel=+4, RegDst=1, ALUctr=ADD, RegWr=1, MemtoReg=0, MemWr=0, ALUSrc=0, ExtOp=x
[Figure: full single-cycle datapath: instr fetch unit, register file, Extender, ALU with zero output, Data Memory, and the RegDst, ALUSrc, and MemtoReg muxes]
Instruction Fetch Unit at End of Add
• PC = PC + 4
  • Same for all instructions except: Branch and Jump
[Figure: instruction fetch unit with nPC_sel=+4: PC (clk), Inst Memory, two adders (+4 and PC Ext of imm16), and a mux selected by nPC_sel]
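The next-address choice made by the fetch unit can be sketched as a two-input mux. This is an approximation, not the lecture's circuit: the boolean `take_branch` is a hypothetical stand-in for the nPC_sel mux select, and the branch target follows the BEQ register transfer, PC + 4 + (sign_ext(Imm16) || 00).

```python
MASK = 0xFFFFFFFF

def sign_ext(imm16):
    """Sign-extend a 16-bit field to a 32-bit pattern."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16

def next_pc(pc, imm16, take_branch):
    """Mux between PC + 4 and the branch target."""
    pc_plus_4 = (pc + 4) & MASK
    offset = (sign_ext(imm16) << 2) & MASK          # append 00: word offset to bytes
    branch_target = (pc_plus_4 + offset) & MASK
    return branch_target if take_branch else pc_plus_4

sequential = next_pc(0x00400000, 0, take_branch=False)       # 0x00400004
backward = next_pc(0x00400010, 0xFFFE, take_branch=True)     # 0x0040000C
```

Both candidates are computed every cycle, as in the hardware, and the select signal only decides which one is latched into PC.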
Single Cycle Datapath during Or Immediate
• R[rt] = R[rs] OR ZeroExt[Imm16]
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
Instruction<31:0> fields: Rs = <21:25>, Rt = <16:20>, Rd = <11:15>, Imm16 = <0:15>
Control signals: nPC_sel=+4, RegDst=0, ALUctr=OR, RegWr=1, MemtoReg=0, MemWr=0, ALUSrc=1, ExtOp=zero
[Figure: full single-cycle datapath: instr fetch unit, register file, Extender, ALU with zero output, Data Memory, and the RegDst, ALUSrc, and MemtoReg muxes]
Single Cycle Datapath during Load
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
Instruction<31:0> fields: Rs = <21:25>, Rt = <16:20>, Rd = <11:15>, Imm16 = <0:15>
Control signals: nPC_sel=+4, RegDst=0, ALUctr=ADD, RegWr=1, MemtoReg=1, MemWr=0, ALUSrc=1, ExtOp=sign
[Figure: full single-cycle datapath: instr fetch unit, register file, Extender, ALU with zero output, Data Memory, and the RegDst, ALUSrc, and MemtoReg muxes]
Summary: Datapath's Control Signals
• ExtOp: "zero", "sign"
• ALUsrc: 0 → regB; 1 → immed
• ALUctr: "ADD", "SUB", "OR"
• MemWr: 1 → write memory
• MemtoReg: 0 → ALU; 1 → Mem
• RegDst: 0 → "rt"; 1 → "rd"
• RegWr: 1 → write register
[Figure: complete datapath with the instruction fetch unit (PC, Inst Address, +4 adder, PC Ext adder, mux selected by nPC_sel) and the execution path (register file, Extender, ALU, Data Memory, and the RegDst, ALUSrc, and MemtoReg muxes)]
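The per-instruction settings shown on the add, or immediate, and load slides can be collected into one lookup table. The sketch below records only those three instructions, since they are the ones whose settings appear in this section; the names `CONTROL` and `write_register` are hypothetical, and `None` stands for the don't-care value written as x.

```python
# Control settings copied from the add, ori, and lw datapath slides.
CONTROL = {
    "add": {"nPC_sel": "+4", "RegDst": 1, "ALUSrc": 0, "ALUctr": "ADD",
            "ExtOp": None, "MemWr": 0, "MemtoReg": 0, "RegWr": 1},
    "ori": {"nPC_sel": "+4", "RegDst": 0, "ALUSrc": 1, "ALUctr": "OR",
            "ExtOp": "zero", "MemWr": 0, "MemtoReg": 0, "RegWr": 1},
    "lw":  {"nPC_sel": "+4", "RegDst": 0, "ALUSrc": 1, "ALUctr": "ADD",
            "ExtOp": "sign", "MemWr": 0, "MemtoReg": 1, "RegWr": 1},
}

def write_register(instr, rt, rd):
    """RegDst mux: 1 selects the rd field, 0 selects the rt field."""
    return rd if CONTROL[instr]["RegDst"] else rt

dest_for_add = write_register("add", rt=2, rd=3)   # rd field, 3
dest_for_lw = write_register("lw", rt=2, rd=3)     # rt field, 2
```

Laying the signals out as a table is the starting point for Step 5, where each column becomes a logic function of the opcode.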