CS1104: Computer Organisation http://www.comp.nus.edu.sg/~cs1104. School of Computing National University of Singapore. PII Lecture 6: Processor: Datapath and Control. Datapath: Single-bus Organization Multiple-bus Organization MIPS: Multicycle Datapath and Control Stages of Instructions
CS1104: Computer Organisation http://www.comp.nus.edu.sg/~cs1104 School of Computing National University of Singapore
PII Lecture 6: Processor: Datapath and Control • Datapath: • Single-bus Organization • Multiple-bus Organization • MIPS: Multicycle Datapath and Control • Stages of Instructions • Datapath Walkthroughs • Processor and Logic Design Processor: Datapath and Control
PII Lecture 6: Processor: Datapath and Control • Reading: • Chapter 9 of textbook, which is Chapter 7 in “Computer Organization” by Hamacher, Vranesic and Zaky. • Optional reading: Chapter 5 in “Computer Organization & Design” by Patterson and Hennessy.
Datapath
Recap: Organisation [Figure: block diagram of a computer; labels: Processor, Control, Datapath, Registers, Cache, Memory, Devices, Input, Output, Bus]
Fundamental Concepts • Processor (CPU): the active part of the computer, which does all the work (data manipulation and decision-making). • Datapath: portion of the processor which contains hardware necessary to perform all operations required by the computer (the brawn). • Control: portion of the processor (also in hardware) which tells the datapath what needs to be done (the brain).
Fundamental Concepts (2) • Instruction execution cycle: fetch, decode, execute. • Fetch: fetch next instruction (using PC) from memory into IR. • Decode: decode the instruction. • Execute: execute instruction. [Figure: instruction cycle: Instruction Fetch, Instruction Decode, Operand Fetch, Execute, Result Store, Next Instruction]
Fundamental Concepts (3) • Fetch: Fetch next instruction into IR (Instruction Register). • Assume each word is 4 bytes and each instruction is stored in a word, and that the memory is byte addressable. • PC (Program Counter) contains address of next instruction. • IR ← [[PC]] • PC ← [PC] + 4
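The fetch step can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions (4-byte words, one instruction per word, byte-addressable memory); the function name `fetch`, the dictionary-based memory and the instruction words are made up for the example.

```python
WORD_BYTES = 4  # one instruction per 4-byte word, as the slide assumes

def fetch(memory, pc):
    """Return (IR, new PC), modelling IR <- [[PC]] and PC <- [PC] + 4."""
    ir = memory[pc]              # contents of the memory word addressed by PC
    return ir, pc + WORD_BYTES   # PC now points to the next instruction

# Hypothetical memory image: byte address -> instruction word (arbitrary values).
memory = {0x100: 0xAAAA0001, 0x104: 0xAAAA0002}
ir, pc = fetch(memory, 0x100)
```

After the call, `ir` holds the word stored at address 0x100 and `pc` has advanced by one word to 0x104.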
Single-bus Organization [Figure: single-bus datapath. Labels: internal processor bus; PC; MAR (address lines) and MDR (data lines) on the memory bus; IR; instruction decoder and control logic (control signals); registers R0 to R(n–1); TEMP; Y; Constant 4; MUX (Select); ALU with inputs A and B, carry-in and ALU control lines (Add, Sub, ..., XOR); Z]
Instruction Execution • An instruction can be executed by performing one or more of the following operations in some specified sequence: • Transfer a word of data from one register to another or to the ALU (Arithmetic Logic Unit). • Perform an arithmetic or a logic operation and store the result in a register. • Fetch the contents of a given memory location and load them into a register. • Store a word of data from a register into a given memory location.
Register Transfer • Register to register transfer: • For each register Ri, two control signals: • Riin used to load the data on the bus into the register. • Riout to place the register’s contents on the bus. • Example: To transfer contents of R1 to R4: • Set R1out to 1. This places contents of R1 on the bus. • Set R4in to 1. This loads data from the processor bus into R4.
Register Transfer (2) [Figure: input and output gating on the internal processor bus. Labels: Ri with Riin and Riout; Y with Yin; Constant 4; MUX (Select); ALU with inputs A and B; Z with Zin and Zout]
Arithmetic/Logic Operation • ALU: Performs arithmetic and logic operations on its A and B inputs. • To perform R3 ← [R1] + [R2]: • Step 1: R1out, Yin • Step 2: R2out, SelectY, Add, Zin • Step 3: Zout, R3in
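As a rough illustration, the three control steps for R3 ← [R1] + [R2] can be traced in Python. This is a sketch only: the function name `run_add`, the dictionary model of the registers and the 32-bit register width are assumptions, not part of the lecture.

```python
def run_add(regs):
    """Trace the three control steps for R3 <- [R1] + [R2] on a single bus."""
    state = dict(regs, Y=0, Z=0)   # Y and Z are the ALU's input and output latches
    # Step 1: R1out, Yin -> the bus carries [R1]; Y latches it.
    bus = state["R1"]
    state["Y"] = bus
    # Step 2: R2out, SelectY, Add, Zin -> the ALU adds Y and the bus value.
    bus = state["R2"]
    state["Z"] = (state["Y"] + bus) & 0xFFFFFFFF   # 32-bit width is assumed
    # Step 3: Zout, R3in -> the bus carries Z; R3 latches it.
    bus = state["Z"]
    state["R3"] = bus
    return state

result = run_add({"R1": 7, "R2": 35, "R3": 0})
```

Only one value is on the bus in each step, which is why the addition needs three steps even though the ALU itself works in one.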
Arithmetic/Logic Operation (2) • If there are n operations, do we need n ALU control lines? • We could use encoding, which requires only ⌈log2 n⌉ control lines for n operations. However, the encoded lines must be decoded again, so this adds hardware (a decoder) and complexity. [Figure: ALU with inputs A and B, carry-in, and ALU control lines Add, Sub, ..., XOR]
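The trade-off can be made concrete with a small Python sketch. The helper names `encoded_lines` and `decode` are hypothetical; `decode` stands in for the extra decoder hardware that turns the encoded value back into one line per operation.

```python
def encoded_lines(n_ops):
    """Number of control lines needed to encode n_ops operations: ceil(log2(n_ops))."""
    return (n_ops - 1).bit_length()

def decode(code, n_ops):
    """Model of the decoder: turn an encoded value into n_ops one-hot lines."""
    return [1 if i == code else 0 for i in range(n_ops)]

lines_for_8 = encoded_lines(8)    # 3 lines instead of 8
lines_for_9 = encoded_lines(9)    # 4 lines instead of 9
one_hot = decode(2, 4)            # operation number 2 out of 4
```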
Reading a Word from Memory • Move (R1), R2 /* R2 ← [[R1]] */ • MAR ← [R1] • Start a Read operation on the memory bus • Wait for the MFC response from the memory • Load MDR from the memory bus • R2 ← [MDR] • MDR has four control signals: MDRin, MDRout, MDRinE and MDRoutE. [Figure: MDR connected to the internal processor bus (MDRin, MDRout) and to the memory-bus data lines (MDRinE, MDRoutE)]
Reading a Word from Memory (2) • Move (R1), R2 /* R2 ← [[R1]] */ • Sequence of control steps: • Step 1: R1out, MARin, Read • Step 2: MDRinE, WMFC • Step 3: MDRout, R2in • WMFC: Wait for arrival of MFC (Memory-Function-Completed) signal. • MFC: To accommodate variability in response time, the processor waits until it receives an indication that the Read/Write operation has been completed. The addressed device sets MFC to 1 to indicate this.
Storing a Word in Memory • Move R2, (R1) /* [R1] ← [R2] */ (the contents of R2 are written to the memory location whose address is in R1) • Sequence of control steps: • Step 1: R1out, MARin • Step 2: R2out, MDRin, Write • Step 3: MDRoutE, WMFC
Executing a Complete Instruction • Add (R3), R1 /* R1 ← [R1] + [[R3]] */ • Adds the contents of the memory location pointed to by R3 to register R1. • Sequence of control steps: • Step 1: PCout, MARin, Read, Select4, Add, Zin • Step 2: Zout, PCin, Yin, WMFC • Step 3: MDRout, IRin • Step 4: R3out, MARin, Read • Step 5: R1out, Yin, WMFC • Step 6: MDRout, SelectY, Add, Zin • Step 7: Zout, R1in, End • Steps 1 – 3: Instruction fetch
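The seven control steps can be traced with a short Python sketch. It is a simplification under stated assumptions: memory is a dictionary that always answers by the WMFC step, the instruction word `0xADD` is a made-up placeholder encoding, and the function name `execute_add_indirect` is hypothetical.

```python
def execute_add_indirect(state, memory):
    """Trace the seven control steps of Add (R3), R1 on the single-bus datapath."""
    s = dict(state)
    # Step 1: PCout, MARin, Read, Select4, Add, Zin
    bus = s["PC"]
    s["MAR"] = bus
    s["Z"] = bus + 4                 # ALU adds the constant 4 to the PC value
    # Step 2: Zout, PCin, Yin, WMFC  (memory read started in step 1 completes)
    bus = s["Z"]
    s["PC"] = bus
    s["Y"] = bus
    s["MDR"] = memory[s["MAR"]]
    # Step 3: MDRout, IRin           (end of instruction fetch)
    s["IR"] = s["MDR"]
    # Step 4: R3out, MARin, Read     (start fetching the memory operand)
    s["MAR"] = s["R3"]
    # Step 5: R1out, Yin, WMFC
    s["Y"] = s["R1"]
    s["MDR"] = memory[s["MAR"]]
    # Step 6: MDRout, SelectY, Add, Zin
    s["Z"] = s["Y"] + s["MDR"]
    # Step 7: Zout, R1in, End
    s["R1"] = s["Z"]
    return s

memory = {0x100: 0xADD, 0x200: 30}   # placeholder instruction word and one operand
final = execute_add_indirect({"PC": 0x100, "R1": 12, "R3": 0x200}, memory)
```

With these values R1 ends up holding 12 + 30, and the PC has already been advanced past the instruction by the end of step 2.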
Multiple-Bus Organization • Single-bus structure: Control sequences are long as only one data item can be transferred over the bus in a clock cycle. • Figure on next slide shows a three-bus structure. • All registers are combined into a single block called register file with three ports: 2 outputs allowing 2 registers to be accessed simultaneously and have their contents put on buses A and B, and 1 input allowing data on bus C to be loaded into a third register. • Buses A and B are used to transfer source operands to the A and B inputs of ALU, and result transferred to destination over bus C.
Multiple-Bus Organization (2) [Figure: three-bus datapath. Labels: Bus A, Bus B, Bus C; register file; PC with incrementer; Constant 4; MUX; ALU with inputs A and B and output R; instruction decoder; IR; MDR (memory bus data lines); MAR (address lines)]
Multiple-Bus Organization (3) • For the ALU, R=A (or R=B) means that its A (or B) input is passed unmodified to bus C. • Add R4, R5, R6 /* R6 ← [R4] + [R5] */ • Adds the contents of R4 and R5 and places the sum in R6. • Sequence of control steps: • Step 1: PCout, R=B, MARin, Read, IncPC • Step 2: WMFC • Step 3: MDRoutB, R=B, IRin • Step 4: R4outA, R5outB, SelectA, Add, R6in, End
Control • Hardwired control or microprogrammed control. • Hardwired control: [Figure: control unit organisation. A clock (CLK) drives a control step counter; the counter output, IR, external inputs and condition codes feed a decoder/encoder, which generates the control signals]
Control (2) • Microprogrammed control: • Control signals generated by a program. • Control word (CW) is a microinstruction that contains individual bits that represent the various control signals. • Vertical organization: highly encoded schemes that use compact codes to specify only a small number of control functions in each microinstruction. • Horizontal organization: minimally encoded scheme in which many resources can be controlled with a single microinstruction. • Popular in Complex Instruction Set Architectures (CISC) because complex instruction sets require complex controllers that can more easily be implemented as microprograms.
Control (3) • Example of a horizontal organization scheme: • Control sequence: • Step 1: PCout, MARin, Read, Select4, Add, Zin • Step 2: Zout, PCin, Yin, WMFC • Step 3: MDRout, IRin • Step 4: R3out, MARin, Read • Step 5: R1out, Yin, WMFC • Step 6: MDRout, SelectY, Add, Zin • Step 7: Zout, R1in, End • Each micro-instruction has one bit per control signal: MDRout, PCin, PCout, MARin, Read, IRin, Yin, Select, Add, Zin, Zout, R1out, R1in, R3out, WMFC, End, ... • Select=0: SelectY; Select=1: Select4
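A horizontal control word can be modelled as one bit per control signal. The sketch below is illustrative only: the bit order in `SIGNALS` and the helper name `control_word` are assumptions chosen for the example, and only the sixteen signals named on the slide are included.

```python
# Assumed bit order, left to right; a real control store may order the bits differently.
SIGNALS = ["PCin", "PCout", "MARin", "Read", "MDRout", "IRin", "Yin", "Select",
           "Add", "Zin", "Zout", "R1out", "R1in", "R3out", "WMFC", "End"]

def control_word(active):
    """Build the bit string for one micro-instruction from its active signals."""
    bits = {name: 0 for name in SIGNALS}
    for signal in active:
        if signal == "Select4":
            bits["Select"] = 1      # Select=1 selects the constant 4
        elif signal == "SelectY":
            bits["Select"] = 0      # Select=0 selects register Y
        else:
            bits[signal] = 1        # raises KeyError for an unknown signal name
    return "".join(str(bits[name]) for name in SIGNALS)

step1 = control_word(["PCout", "MARin", "Read", "Select4", "Add", "Zin"])
step7 = control_word(["Zout", "R1in", "End"])
```

Most bits in each word are 0, which illustrates why horizontal microinstructions are wide but need no decoding.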
MIPS: Multicycle Datapath and Control Adapted from D. Patterson’s CS61C http://www.cs.berkeley.edu/~pattrsn/61CF00 Copyright 2000 UCB
Stages of a Datapath • Problem: a single, atomic block which “executes an instruction” (performs all necessary operations beginning with fetching the instruction) would be too bulky and inefficient. • Solution: break up the process of “executing an instruction” into stages, and then connect the stages to create the whole datapath. • Smaller stages are easier to design. • Easy to optimize (change) one stage without touching the others.
Stages of a Datapath (2) • There is a wide variety of MIPS instructions: so what general steps do they have in common? • Stages • Instruction Fetch • Instruction Decode • ALU • Memory Access • Register Write
Stages of a Datapath (3) • Stage 1: Instruction Fetch. • No matter what the instruction is, the 32-bit instruction word must first be fetched from memory (the cache-memory hierarchy). • Also, this is where we increment PC (that is, PC = PC + 4, so that it points to the next instruction; + 4 because memory is byte addressed and each instruction is 4 bytes).
Stages of a Datapath (4) • Stage 2: Instruction Decode • Upon fetching the instruction, we next gather data from the fields (decode all necessary instruction data). • First, read the opcode to determine instruction type and field lengths. • Second, read in data from all necessary registers. • For add, read two registers. • For addi, read one register. • For jal, no read necessary.
Stages of a Datapath (5) • Stage 3: ALU (Arithmetic-Logic Unit) • The real work of most instructions is done here: arithmetic (+, -, *, /), shifting, logic (&, |), comparisons (slt). • What about loads and stores? • lw $t0, 40($t1) • The address we are accessing in memory = the value in $t1 plus the value 40. • We do this addition at this stage.
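The address calculation for a load or store can be sketched as follows. This assumes the usual MIPS convention of a sign-extended 16-bit offset and 32-bit addresses; the function name `effective_address` and the sample base values are made up for the example.

```python
def effective_address(base, offset):
    """Add a sign-extended 16-bit offset to a base register value (32-bit wrap)."""
    offset &= 0xFFFF
    if offset & 0x8000:          # top bit set: the 16-bit field is negative
        offset -= 0x10000
    return (base + offset) & 0xFFFFFFFF

# lw $t0, 40($t1) with a hypothetical value in $t1:
addr = effective_address(0x10000000, 40)
# A negative offset, as in lw $t0, -4($t1):
addr_neg = effective_address(0x1000, -4)
```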
Stages of a Datapath (6) • Stage 4: Memory Access • Only the load and store instructions do anything during this stage; all other instructions remain idle. • Since loads and stores have this unique step, we need an extra stage to account for them. • As a result of the cache system, this stage is expected to be just as fast (on average) as the others.
Stages of a Datapath (7) • Stage 5: Register Write • Most instructions write the result of some computation into a register. • Examples: arithmetic, logical, shifts, loads, slt • What about stores, branches, jumps? • They do not write anything into a register at the end. • These remain idle during this fifth stage.
Datapath: Generic Steps [Figure: generic datapath in five steps: 1. Instruction Fetch (PC, +4, instruction memory); 2. Decode/Register Read (registers; fields rs, rt, rd, imm); 3. Execute (ALU); 4. Memory (data memory); 5. Reg. Write]
Datapath Walkthroughs: add • add $r3,$r1,$r2 # r3 = r1+r2 • Stage 1: Fetch this instruction, increment PC. • Stage 2: Decode to find that it is an add instruction, then read registers $r1 and $r2. • Stage 3: Add the two values retrieved in stage 2. • Stage 4: Idle (nothing to write to memory). • Stage 5: Write result of stage 3 into register $r3.
Datapath Walkthroughs: add (2) [Figure: generic datapath annotated for add r3, r1, r2. Labels: register fields 3, 1, 2; register outputs reg[1] and reg[2]; ALU output reg[1]+reg[2]]
Datapath Walkthroughs: slti • slti $r3,$r1,17 • Stage 1: Fetch this instruction, increment PC. • Stage 2: Decode to find it is an slti, then read register $r1. • Stage 3: Compare value retrieved in stage 2 with the integer 17. • Stage 4: Go idle. • Stage 5: Write the result of stage 3 in register $r3.
Datapath Walkthroughs: slti (2) [Figure: generic datapath annotated for slti r3, r1, 17. Labels: register fields x, 1, 3; register output reg[1]; imm = 17; ALU output reg[1]-17]
Datapath Walkthroughs: sw • sw $r3, 20($r1) • Stage 1: Fetch this instruction, increment PC. • Stage 2: Decode to find it is an sw, then read registers $r1 and $r3. • Stage 3: Add 20 to value in register $r1 (retrieved in stage 2). • Stage 4: Write value in register $r3 (retrieved in stage 2) into memory address computed in stage 3. • Stage 5: Go idle (nothing to write into a register).
Datapath Walkthroughs: sw (2) [Figure: generic datapath annotated for sw r3, 20(r1). Labels: register fields x, 1, 3; register outputs reg[1] and reg[3]; imm = 20; ALU output reg[1]+20; data memory: MEM[r1+20] <- r3]
Why Five Stages? • Could we have a different number of stages? • Yes, and other architectures do. • So why does MIPS have five stages, if instructions tend to go idle for at least one stage? • There is one instruction that uses all five stages: the load.
Datapath Walkthroughs: lw • lw $r3, 40($r1) • Stage 1: Fetch this instruction, increment PC. • Stage 2: Decode to find it is a lw, then read register $r1. • Stage 3: Add 40 to value in register $r1 (retrieved in stage 2). • Stage 4: Read value from memory address computed in stage 3. • Stage 5: Write value found in stage 4 into register $r3.
Datapath Walkthroughs: lw (2) [Figure: generic datapath annotated for lw r3, 40(r1). Labels: register fields x, 1, 3; register output reg[1]; imm = 40; ALU output reg[1]+40; data memory: r3 <- MEM[r1+40]; value written back to reg[3]]
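The four walkthroughs (add, slti, sw, lw) can be replayed with one small Python function. This is a sketch, not a model of real MIPS hardware: instructions are passed in already decoded as hypothetical tuples `(op, target, rs, last)`, memory is a dictionary keyed by address, and overflow and alignment are ignored.

```python
def step(instr, regs, mem, pc):
    """Run one pre-decoded instruction through the five stages; return the new PC."""
    op, target, rs, last = instr
    # Stage 1: instruction fetch (instr is passed in already fetched), increment PC.
    pc += 4
    # Stage 2: decode and register read.
    a = regs[rs]
    b = regs[last] if op == "add" else last              # second register or immediate
    store_value = regs[target] if op == "sw" else None   # sw also reads the register to store
    # Stage 3: ALU (comparison for slti, addition otherwise).
    alu = (1 if a < b else 0) if op == "slti" else a + b
    # Stage 4: memory access (idle for add and slti).
    loaded = None
    if op == "lw":
        loaded = mem[alu]
    elif op == "sw":
        mem[alu] = store_value
    # Stage 5: register write (idle for sw).
    if op == "lw":
        regs[target] = loaded
    elif op in ("add", "slti"):
        regs[target] = alu
    return pc

regs = [0] * 32
regs[1], regs[2] = 100, 5
mem = {140: 77}
pc = 0
pc = step(("add", 3, 1, 2), regs, mem, pc)    # add  $r3, $r1, $r2
after_add = regs[3]
pc = step(("sw", 3, 1, 20), regs, mem, pc)    # sw   $r3, 20($r1)
pc = step(("lw", 3, 1, 40), regs, mem, pc)    # lw   $r3, 40($r1)
after_lw = regs[3]
pc = step(("slti", 3, 1, 17), regs, mem, pc)  # slti $r3, $r1, 17
```

Only the lw call does useful work in every stage; each of the other three instructions skips stage 4 or stage 5.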
What Hardware Is Needed? • PC: a register which keeps track of address of the next instruction. • General Purpose Registers • Used in stages 2 (read) and 5 (write). • We are currently working with 32 of these. • Memory • Used in stages 1 (fetch) and 4 (R/W). • Cache system makes these two stages as fast as the others, on average.
Datapath: Summary • Construct datapath based on register transfers required to perform instructions. • Control part causes the right transfers to happen. [Figure: generic datapath (PC, +4, instruction memory, registers with rs, rt, rd and imm fields, ALU, data memory) with a Controller driven by opcode and funct]
Where is Logic Design Used? • Combinational circuits for ALU and other parts of the datapath. • Sequential circuits for control: different control signals are needed in different clock cycles and for different instructions, for the ALU, registers and other parts of the datapath. [Figure: ALU and ALU Control]
Where is Logic Design Used? (2) • High-level view of finite state machine control. • Sequential logic design can be used to assert the correct control signals at the correct times. [Figure: state diagram: Start; instruction fetch/decode and register fetch; then one of memory access instructions, R-type instructions, branch instruction, jump instruction]
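The high-level state diagram can be sketched as a tiny state machine in Python. The state names, the opcode-to-class table `INSTRUCTION_CLASS` and the function `control_path` are hypothetical; a real controller would split each instruction class into several states and assert specific control signals in each.

```python
# Assumed mapping from a few sample opcodes to the instruction classes in the diagram.
INSTRUCTION_CLASS = {
    "lw": "memory_access", "sw": "memory_access",
    "add": "r_type", "slt": "r_type",
    "beq": "branch", "j": "jump",
}

def control_path(opcode):
    """Return the sequence of controller states visited for one instruction."""
    state, path = "start", []
    while True:
        path.append(state)
        if state == "start":
            state = "fetch_decode_register_fetch"
        elif state == "fetch_decode_register_fetch":
            state = INSTRUCTION_CLASS[opcode]   # branch on the decoded instruction class
        else:
            return path                         # the next instruction begins at start again

lw_path = control_path("lw")
add_path = control_path("add")
```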
Summary • Datapath is the hardware that performs operations necessary to execute programs. • Control instructs datapath on what to do next. • Datapath needs: • access to storage (general purpose registers and memory) • computational ability (ALU) • helper hardware (local registers and PC)
Summary (2) • Five stages of datapath (executing an instruction): • 1: Instruction Fetch (Increment PC) • 2: Instruction Decode (Read Registers) • 3: ALU (Computation) • 4: Memory Access • 5: Write to Registers • ALL instructions must go through ALL five stages. • Datapath designed in hardware.