CHAPTER 5 THE PROCESSOR: DATAPATH AND CONTROL
Goals
• Understand how the various implementation strategies affect the clock rate and CPI of a machine
• See how the instruction set architecture determines many aspects of the hardware implementation
• Present the multicycle implementation of an architecture that implements a subset of the MIPS instruction set
• Present the implementation of a multicycle microprogrammed control unit
S. Barua – CPSC 440 sbarua@fullerton.edu http://sbarua.ecs.fullerton.edu
IMPLEMENTATION VS. PERFORMANCE
The performance of a processor is determined by
• Instruction count of a program
• CPI (clock cycles per instruction)
• Clock cycle time (clock rate)
The compiler and the instruction set of the processor determine the instruction count. The implementation of the processor determines the CPI and the clock cycle time.
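The three factors above combine multiplicatively into execution time. A minimal sketch of that relationship follows; the function names and the sample numbers are illustrative assumptions, not values from the slides.

```python
def cpu_time(instruction_count: int, cpi: float, clock_cycle_time_s: float) -> float:
    """Execution time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


def cpu_time_from_rate(instruction_count: int, cpi: float, clock_rate_hz: float) -> float:
    """Same relationship expressed with clock rate (cycle time = 1 / rate)."""
    return instruction_count * cpi / clock_rate_hz


if __name__ == "__main__":
    # Hypothetical program: 1 million instructions, CPI of 2, 1 GHz clock.
    print(cpu_time(1_000_000, 2.0, 1e-9))
    print(cpu_time_from_rate(1_000_000, 2.0, 1e9))
```

Lowering any one factor reduces execution time only if the other two do not grow to compensate, which is why implementation choices that trade CPI against cycle time have to be evaluated together.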
A MULTICYCLE IMPLEMENTATION OF THE MIPS INSTRUCTION SET
The focus is on designing an architecture that implements a subset of the MIPS instruction set. The instructions considered are:
• Memory reference instructions (load and store): lw $s2, 64($s5) and sw $s5, 32($s6)
• Arithmetic and logical instructions (add, sub, and, or, etc.): add $s2, $s1, $s2
• Branch equal (beq) instruction: beq $s1, $s2, L1
• Jump (j) instruction: j L2
MIPS Instruction Formats: R-Format
Bit 31 (leftmost) to bit 0 (rightmost)
op      rs      rt      rd      shamt   funct
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
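As a rough illustration of how the six R-format fields pack into one 32-bit word, the sketch below encodes add $s2, $s1, $s2. The helper name encode_r_format is hypothetical. The register numbers ($s1 = 17, $s2 = 18) and the add encoding (op = 0, funct = 0x20) are the standard MIPS values rather than something stated on the slide.

```python
def encode_r_format(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the R-format fields, most significant field first, into a 32-bit word."""
    fields = ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value  # append this field below the earlier ones
    return word


if __name__ == "__main__":
    # add $s2, $s1, $s2  ->  rd = $s2 (18), rs = $s1 (17), rt = $s2 (18)
    word = encode_r_format(op=0, rs=17, rt=18, rd=18, shamt=0, funct=0x20)
    print(f"{word:#010x}")  # 0x02329020
```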
MIPS Instruction Formats: I-Format
Bit 31 (leftmost) to bit 0 (rightmost)
op      rs      rt      address/immediate
6 bits  5 bits  5 bits  16 bits
For load and store instructions:
Memory address = (index register pointed to by 'rs') + (sign-extended offset address)
For branch instructions:
Target address = PC + (sign-extended offset address << 2), where PC is the already-incremented program counter (the address of the branch instruction + 4)
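The two address calculations above can be sketched as follows. This is a minimal sketch that assumes 32-bit wraparound arithmetic and assumes that PC in the branch formula is the already-incremented program counter (passed here as pc_plus_4). All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def load_store_address(rs_value: int, offset16: int) -> int:
    """Memory address = register rs + sign-extended 16-bit offset."""
    return (rs_value + sign_extend16(offset16)) & MASK32


def branch_target(pc_plus_4: int, offset16: int) -> int:
    """Branch target = incremented PC + (sign-extended offset << 2)."""
    return (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32


if __name__ == "__main__":
    # lw $s2, 64($s5) with a hypothetical $s5 value of 0x1000
    print(hex(load_store_address(0x1000, 64)))
    # A branch offset of -1 word lands on the branch instruction itself
    print(hex(branch_target(0x00400004, 0xFFFF)))
```

The shift by 2 converts a word offset into a byte offset, so a 16-bit field reaches roughly 32K instructions in either direction.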
MIPS Instruction Formats: J-Format
Bits 31-26   Bits 25-0
op           offset address
6 bits       26 bits
For the j instruction:
Target address = PC[31-28] || (offset address << 2), where || denotes concatenation: the upper 4 bits of the PC followed by the 26-bit offset address shifted left by 2
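The jump target is formed by joining the upper 4 bits of the PC with the 26-bit offset address shifted left by 2. A minimal sketch of that bit manipulation follows; the function name and the sample addresses are illustrative assumptions.

```python
def jump_target(pc: int, offset26: int) -> int:
    """Upper 4 bits of the PC joined with the 26-bit offset shifted left by 2."""
    upper = pc & 0xF0000000                # PC[31-28]
    lower = (offset26 & 0x03FFFFFF) << 2   # 28-bit byte address within the region
    return upper | lower


if __name__ == "__main__":
    print(hex(jump_target(0x00400004, 0x0100000)))  # 0x400000
    print(hex(jump_target(0x90000008, 0x0000001)))  # 0x90000004
```

Because the upper 4 bits come from the PC, a single j instruction can only reach addresses inside the current 256 MB region.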
Multicycle Implementation
The following figures are used to discuss the multicycle implementation of the MIPS architecture:
• Figure 5.25 The high-level overview of the multicycle datapath
• Figure 5.26 The multicycle datapath for the MIPS architecture
• Figure 5.27 The multicycle datapath with control lines
• Figure 5.28 The complete datapath and control lines for the multicycle implementation of the MIPS architecture
• Figure 5.13 Truth table for ALU control bits (inputs are ALUop and the function codes)
Figure 5.25 The high-level overview of the multicycle datapath
Figure 5.26 The multicycle datapath for the MIPS architecture
Figure 5.27 The multicycle datapath with control lines
Figure 5.28 The complete datapath and control lines for the multicycle implementation of the MIPS architecture
ALUop & ALU Control
Operation                                       Desired ALU Action               ALUop  Function Code               ALU Control
Increment PC during fetch cycle                 Addition                         00     xxxxxx                      0010
Compute memory address during load execution    Addition                         00     xxxxxx                      0010
Compute memory address during store execution   Addition                         00     xxxxxx                      0010
Compute target address for beq                  Addition                         00     xxxxxx                      0010
Compare during beq execution                    Subtraction                      01     xxxxxx                      0110
R-type instruction (9 instructions)             Specified by instruction format  10     Given by the function code  9 remaining combinations except 0010 & 0110
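The table can be read as a small decoding function: ALUop alone fixes the ALU control for fetch, address calculation, and beq, while R-type instructions defer to the function code. The sketch below takes the R-type mapping as a caller-supplied table, because the slide does not list the nine encodings. The three sample entries (and, or, slt) are the textbook's Figure 5.13 values and are an assumption about this course's design.

```python
from typing import Mapping

ADD = 0b0010       # ALU control for addition (ALUop = 00)
SUBTRACT = 0b0110  # ALU control for subtraction (ALUop = 01)

# Assumed sample entries: funct code -> ALU control, from the textbook truth table.
SAMPLE_R_TYPE_TABLE = {
    0b100100: 0b0000,  # and
    0b100101: 0b0001,  # or
    0b101010: 0b0111,  # slt
}


def alu_control(alu_op: int, funct: int = 0,
                r_type_table: Mapping[int, int] = SAMPLE_R_TYPE_TABLE) -> int:
    """Return the 4-bit ALU control value for a given ALUop and function code."""
    if alu_op == 0b00:   # PC increment, load/store address, beq target address
        return ADD       # function code is a don't-care
    if alu_op == 0b01:   # beq comparison
        return SUBTRACT  # function code is a don't-care
    if alu_op == 0b10:   # R-type: the function code selects the operation
        return r_type_table[funct]
    raise ValueError(f"unsupported ALUop {alu_op:02b}")


if __name__ == "__main__":
    print(f"{alu_control(0b00):04b}")
    print(f"{alu_control(0b10, 0b100101):04b}")
```

Splitting the decode this way keeps the main control unit small: it only produces the 2-bit ALUop, and the ALU control logic handles the function code.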
Instruction Execution Clock Cycles
• Instruction Fetch
• Instruction Decode and Register Read
Instruction Execution Clock Cycles (Continued)
Execution of the Memory Reference Instructions
• Load Instruction Execution
• Store Instruction Execution
Instruction Execution Clock Cycles (Continued)
• Arithmetic-Logical Instruction
Instruction Execution Clock Cycles (Continued)
• Branch Instruction
• Jump Instruction
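Because each instruction class finishes after a different number of clock cycles, the CPI of a multicycle machine depends on the instruction mix. The sketch below assumes the per-class cycle counts of the textbook's multicycle design (load 5, store 4, R-type 4, branch 3, jump 3), which are not stated on these slides. The instruction mix is purely illustrative.

```python
# Assumed cycle counts per instruction class for the textbook multicycle design.
CYCLES_PER_CLASS = {"load": 5, "store": 4, "r_type": 4, "branch": 3, "jump": 3}

# Illustrative instruction mix (fractions of executed instructions); hypothetical.
EXAMPLE_MIX = {"load": 0.25, "store": 0.10, "r_type": 0.52, "branch": 0.11, "jump": 0.02}


def average_cpi(mix: dict) -> float:
    """Weighted average of cycles per instruction over the given mix."""
    total = sum(mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"mix fractions sum to {total}, expected 1.0")
    return sum(fraction * CYCLES_PER_CLASS[name] for name, fraction in mix.items())


if __name__ == "__main__":
    print(round(average_cpi(EXAMPLE_MIX), 2))  # about 4.12
```

Under these assumptions the average sits below the 5-cycle worst case, which is the advantage of letting short instructions finish early.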
MICROPROGRAMMED CONTROL UNIT
Note: The discussion of the microprogrammed control unit is based on Section 5.8 and Appendix C, available on the CD.
The microprogramming technique enables one to design the control unit as a microprogram. The microprogram uses a sequence of microinstructions to implement the machine instructions.
The microinstruction format used for our design has seven fields:
ALU control | SRC1 | SRC2 | Register control | Memory | PCWrite control | Sequencing
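One way to picture the seven-field microinstruction format is as a record with one slot per field, where an empty slot means no control signal from that field is asserted. The sketch below is an illustration only. The class name, the attribute names, and the sample symbolic values for the first fetch microinstruction follow the textbook's Appendix C as an assumption and are not taken from this slide.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Microinstruction:
    """One microinstruction; None means the field asserts no control signals."""
    label: Optional[str] = None
    alu_control: Optional[str] = None
    src1: Optional[str] = None
    src2: Optional[str] = None
    register_control: Optional[str] = None
    memory: Optional[str] = None
    pcwrite_control: Optional[str] = None
    sequencing: Optional[str] = None


# Assumed symbolic values for the first fetch step: compute PC + 4 in the ALU,
# read the instruction at the PC, write the ALU result into the PC, and
# continue with the next microinstruction in sequence.
FETCH = Microinstruction(
    label="Fetch",
    alu_control="Add",
    src1="PC",
    src2="4",
    memory="Read PC",
    pcwrite_control="ALU",
    sequencing="Seq",
)

if __name__ == "__main__":
    for f in fields(FETCH):
        print(f"{f.name:17} {getattr(FETCH, f.name)}")
```

A microprogram is then an ordered list of such records, and the Sequencing field decides whether control moves to the next record, dispatches on the opcode, or returns to fetch.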
MICROPROGRAM FOR OUR MIPS DESIGN
• Figure C.5.1 Values for each microinstruction field and the corresponding signals that are activated (From CD, Appendix C, Page 27)
• Figure C.5.2 Dispatch tables showing the contents in symbolic form and using the labels in the microprogram (From CD, Appendix C, Page 28)
Figure C.5.1 – Microinstruction Fields and Values
Figure C.5.2 Dispatch Tables
MICROPROGRAM FOR OUR MIPS DESIGN
(Only the R-format, load, store, beq, and j instructions are included)
Label   ALU   SRC1   SRC2   Register   Memory   PCWrite   Sequencing
CHALLENGES IN IMPLEMENTING MORE COMPLEX ARCHITECTURES
A high-performance implementation should ensure that
• Simple instructions execute quickly
• The burden of the instruction set's complexity falls primarily on the complex, less frequently used instructions
To accomplish this goal, Intel has employed a combination of hardwired control (to handle simple instructions) and microcoded control (to handle more complex instructions) in its architectures since the 486.
CHALLENGES IN IMPLEMENTING MORE COMPLEX ARCHITECTURES
The Pentium executes up to two instructions per clock and the Pentium Pro executes up to four instructions per clock using an advanced pipelining technique called "superscalar" (more about this in Chapter 6).
The Pentium Pro employs a microinstruction format that is 72 bits wide. Hardwired control is used for instructions that require fewer than four microinstructions. The control dispatches to microcoded control if an instruction takes more than four microinstructions.