1.3k likes | 1.31k Views
This review covers the design principles of MIPS processors, focusing on simplicity, fixed-size instructions, limited instruction set, and the implementation of combinatorial and sequential circuits.
E N D
Review: MIPS (RISC) Design Principles • Simplicity favors regularity • fixed size instructions • small number of instruction formats • opcode always the first 6 bits • Smaller is faster • limited instruction set • limited number of registers in register file • limited number of addressing modes • Make the common case fast • arithmetic operands from the register file (load-store machine) • allow instructions to contain immediate operands • Good design demands good compromises • three instruction formats
Sequential vs Combinational Circuits • Combinational logic circuits • output is a function of the present value of the inputs only. • When inputs are changed, the information about the previous inputs is lost • memoryless • E.g., • Sequential logic circuits • outputs are also dependent upon past inputs • has memory • flip flops/latches
Sequential vs Combinational Circuits • Combinational logic circuits • output is a function of the present value of the inputs only. • When inputs are changed, the information about the previous inputs is lost • memoryless • e.g., multiplexors. • Sequential logic circuits • outputs are also dependent upon past inputs • has memory • basically combinational circuits with the additional properties of storage (to remember past inputs) and feedback
RS Latches • An RS latch is a memory element with 2 inputs: - Reset (R) - Set (S) - 2 outputs: Q and Q Note: if inputs don’t change, outputs are held indefinitely.
RS Latches - Hold 0 1 0 0 0 0 0 1
RS Latches - Set 0 0 1 1 1 0 Reset 1 1 0 0 0 1
Clocks and Synchronous Circuits • Asynchronous operation : - the output state of RS latches changes occur directly in response to changes in the inputs. • Virtually all sequential circuits currently employ the notion of synchronous operation the output of a sequential circuit is constrained to change only at a time specified by a global enabling signal. This signal is generally known as the system clock
Transparent D Latches • modify the RS Latch such that its output state is only permitted to change when a valid enable signal (system clock) is present • Add a couple of AND gates in cascade with the R and S inputs that are controlled by an additional input known as the enable (EN) input
Race Around Condition only when J = K= clk = 1 Reset to 0 when input = 1 1, 0,1 = 1,0,1 1 1,0 1 1 0,1 1 = 0,1,0 1,0 Clk J K Qn+1 (Qn+1)’ 0 x xQn (Qn)’ 1 0 0 Qn (Qn)’ 1 0 1 0 1 (reset) 1 1 0 1 0 (set) 1 1 1 ? ?
Master-Slave Flip Flops • Easy to design sequential circuits if outputs change on: - rising (positive trending) - falling (negative trending) edges of a clock (i.e., enable) signal Can be done by combining two transparent D latches in a Master-Slave configuration.
Fetch PC = PC+4 Decode Exec The Processor: Datapath & Control • Our implementation of the MIPS is simplified • memory-reference instructions: lw, sw • arithmetic-logical instructions: add, sub, and, or, slt • control flow instructions: beq, j • Generic implementation • use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC) • decode the instruction (and read registers) • execute the instruction • All instructions (except j) use the ALU after reading the registers How? memory-reference? arithmetic? control flow?
Aside: Clocking Methodologies • The clocking methodology defines when data in a state element is valid and stable relative to the clock • State elements - a memory element such as a register • Edge-triggered – all state changes occur on a clock edge • Typical execution • read contents of state elements -> send values through combinational logic -> write results to one or more state elements State element 1 State element 2 Combinational logic clock one clock cycle • Assumes state elements are written on every clock cycle; if not, need explicit write control signal • write occurs only when both the write control is asserted and the clock edge occurs
Add 4 Fetch PC = PC+4 Instruction Memory Exec Decode Read Address clock PC Instruction Fetch Phase • Fetching instructions involves • reading the instruction from the Instruction Memory • updating the PC value to be the address of the next (sequential) instruction • PC is updated every clock cycle, so it does not need an explicit write control signal just a clock signal • Reading from the Instruction Memory is a combinational activity, so it doesn’t need an explicit read control signal
Fetch PC = PC+4 Exec Decode Read Addr 1 Read Data 1 Register File Read Addr 2 Write Addr Read Data 2 Write Data Decoding Instructions • Decoding instructions involves • sending the fetched instruction’s opcode and function field bits to the control unit Control Unit Instruction • reading two values from the Register File • Register addresses (Read Addr 1 & Read Addr 2) • are contained in the instruction
31 25 20 15 10 5 0 R-type: op rs rt rd shamt funct RegWrite ALU control (3 bit code) Fetch PC = PC+4 Read Addr 1 Read Data 1 Register File Read Addr 2 overflow Instruction zero Exec Decode ALU Write Addr Read Data 2 Write Data Executing R Format Operations • R format operations (add, sub, slt, and, or) • perform operation (op andfunct) on values in rs and rt • store the result back into the Register File (into location rd) • Register File is not written every cycle (e.g., sw) • need an explicit write control signal (RegWrite) for the it.
RegWrite ALU control MemWrite overflow zero Read Addr 1 Read Data 1 Address Register File Read Addr 2 Instruction Data Memory Read Data ALU Write Addr Read Data 2 Write Data Write Data MemRead Sign Extend 16 32 Executing Load and Store Operations • Load and store operations involves • compute memory address by adding the base register (read from the Register File during decode) to the 16-bit signed-extended offset field in the instruction, e.g., sw $s3 4($t5) • store value (read from the Register File during decode) written to the Data Memory, • loadvalue, read from the Data Memory, written to the Register File $t5 4
op rs rt constant or address 6 bits 5 bits 5 bits 16 bits Branch Addressing • Branch instructions specify • opcode, two registers, target address • Most branch targets are near branch • - Forward or backward • PC-relative addressing • Target address = PC + offset × 4 • PC already incremented by 4 by this time
Executing Branch Operations • Branch operations involves • compare the operands read from the Register File during decode for equality (zeroALU output) • compute the branch target address by adding the updated PC to the 16-bit signed-extended offset field in the instr Why << 2? Branch target address Add Add 4 Shift left 2 ALU control (3 bit code) PC zero (to branch control logic) Read Addr 1 Read Data 1 Register File Read Addr 2 Instruction ALU Write Addr Read Data 2 Write Data Sign Extend 16 32
op address 26 bits 6 bits Jump Addressing • Jump (j and jal) targets could be anywhere in text segment • Encode full address in instruction • (Pseudo)Direct jump addressing • Target address = PC31…28 : (address × 4) Chapter 2 — Instructions: Language of the Computer — 31
Executing Jump Operations • Jump operation involves • replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits • Target address = PC31…28 : (address × 4) Add 4 4 Jump address Instruction Memory Shift left 2 28 Read Address PC Instruction 26
Creating a Single Datapath from the Parts • Assemble the datapath segments and add control lines and multiplexors as needed • Single cycle design – fetch, decode and execute each instructions in one clock cycle • no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., separate Instruction Memory and Data Memory, several adders) • multiplexors needed at the input of shared elements with control lines to do the selection • write signals to control writing to the Register File and Data Memory • Cycle time is determined by length of the longest path
Multiplexors 2 Input 1 Bit Selector Device (2x1 MUX) • Here is a truth table definition of a “function” we wish to implement: • When S = 0, A is “selected” for output • When S = 1, B is “selected” for output
A B output S (control signal) • 2x1 MUX (Multiplexor) • What is the Boolean expression fora 2x1 MUX? Output = S • B + S • A • How do you implement this using gates?
Multiplexors (MUX) and ALUs - To select a source input for ALU From Register M U X A L U From instruction field Control signal
Add RegWrite ALUSrc ALU control MemWrite MemtoReg 4 ovf zero Read Addr 1 Instruction Memory Read Data 1 Address Register File Read Addr 2 Data Memory Read Address PC Instruction Read Data ALU Write Addr Read Data 2 Write Data Write Data MemRead Sign Extend 16 32 Fetch, R, and Memory Access Portions - multiplexor (mux) - ALU (adder)
31 25 0 J-type: op target address Adding the Control • Selecting the operations to perform (ALU, Register File and Memory read/write) • Controlling the flow of data (multiplexor inputs) 31 25 20 15 10 5 0 R-type: op rs rt rd shamt funct 31 25 20 15 0 • Observations • op field always in bits 31-26 • addr of registers to be read are always specified by the rs field (bits 25-21) and rt field (bits 20-16); for lw and swrs is the base register • addr. of register to be written is in one of two places – in rt (bits 20-16) for lw; in rd (bits 15-11) for R-type instructions • offset for beq, lw, and swalways in bits 15-0 I-Type: address offset op rs rt
Control The control unit is responsible for setting all the control signals so that each instruction is executed properly. — The control unit’s input is the 32-bit instruction word. — The outputs are values for the control signals in the datapath. Most of the signals can be generated from the instruction opcode alone, and not the entire 32-bit word. To illustrate the relevant control signals, we will show the route that is taken through the datapath by R-type, lw, swand beq instructions.
ALU Control Unit 0 Add Add 1 4 Shift left 2 PCSrc ALUOp Branch MemRead Instr[31-26] Control Unit MemtoReg MemWrite ALUSrc RegWrite RegDst ovf Instr[25-21] Read Addr 1 Instruction Memory Read Data 1 Address Register File Instr[20-16] zero Read Addr 2 Data Memory Read Address PC Instr[31-0] 0 Read Data 1 ALU Write Addr Read Data 2 0 1 Write Data 0 Instr[15 -11] Write Data 1 4 Instr[15-0] Sign Extend ALU control 16 32 Instr[5-0] 2
Bit I/O for ALU Control Unit 0 Add Add 1 4 Shift left 2 PCSrc ALUOp Branch MemRead Instr[31-26] Control Unit MemtoReg MemWrite ALUSrc RegWrite RegDst ovf Instr[25-21] Read Addr 1 Instruction Memory Read Data 1 Address Register File Instr[20-16] zero Read Addr 2 Data Memory Read Address PC Instr[31-0] 0 Read Data 1 ALU Write Addr Read Data 2 0 1 Write Data 0 Instr[15 -11] Write Data 1 4 Instr[15-0] Sign Extend ALU control 16 32 Instr[5-0] 2
R-type Instruction 31 25 20 15 10 5 0 R-type: op rs rt rd shamt funct 31 25 20 15 0
R-type Dataflow 0 Add Add 1 4 Shift left 2 PCSrc ALUOp Branch MemRead Instr[31-26] Control Unit MemtoReg ALUSrc MemWrite RegWrite RegDst ovf Instr[25-21][rs] Read Addr 1 Instruction Memory Instr[20-16][rt] Read Data 1 Address Register File zero Read Addr 2 Data Memory Read Address PC Instr[31-0] 0 Read Data 1 ALU Write Addr Read Data 2 0 1 Write Data 0 Instr[15 -11][rd] 1 Write Data ALUOp Instr[15-0] Sign Extend ALU control 16 32 Instr[5-0]
R type - Control Lines 0 0 0 0 1