CS/COE0447 Computer Organization & Assembly Language. Chapter 5 Part 3 In-Class Exercises. For Reference. The following slides contain a subset of Chapter 5 Part 3 – the essentials, without the animations, discussion, and so on. You will get a copy of Figure 5.28 on Exam3 and the Final
CS/COE0447 Computer Organization & Assembly Language Chapter 5 Part 3 In-Class Exercises
For Reference • The following slides contain a subset of Chapter 5 Part 3 – the essentials, without the animations, discussion, and so on. • You will get a copy of Figure 5.28 on Exam3 and the Final • Rather than trying to memorize the other slides, try to reconstruct them while looking at Figure 5.28 and thinking about how the instructions are executed
Multi-Cycle Execution: R-type • Instruction fetch • IR <= Memory[PC]; sub $t0,$t1,$t2 • PC <= PC + 4; • Decode instruction/register read • A <= Reg[IR[25:21]]; rs • B <= Reg[IR[20:16]]; rt • ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution • ALUOut <= A op B; op = add, sub, and, or,… • Completion • Reg[IR[15:11]] <= ALUOut; $t0 <=ALU result
Multi-cycle Execution: lw • Instruction fetch • IR <= Memory[PC]; lw $t0,-12($t1) • PC <= PC + 4; • Instruction Decode/register read • A <= Reg[IR[25:21]]; rs • B <= Reg[IR[20:16]]; rt • ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution • ALUOut <= A + sign-extend(IR[15:0]); $t1 + -12 (sign extended) • Memory Access • MDR <= Memory[ALUOut]; M[$t1 + -12] • Write-back • Load: Reg[IR[20:16]] <= MDR; $t0 <= M[$t1 + -12]
Multi-cycle Execution: sw • Instruction fetch • IR <= Memory[PC]; sw $t0,-12($t1) • PC <= PC + 4; • Decode/register read • A <= Reg[IR[25:21]]; rs • B <= Reg[IR[20:16]]; rt • ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution • ALUOut <= A + sign-extend(IR[15:0]); $t1 + -12 (sign extended) • Memory Access • Memory[ALUOut] <= B; M[$t1 + -12] <= $t0
Multi-cycle execution: beq • Instruction fetch • IR <= Memory[PC]; beq $t0,$t1,label • PC <= PC + 4; • Decode/register read • A <= Reg[IR[25:21]]; rs • B <= Reg[IR[20:16]]; rt • ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution • if (A == B) then PC <= ALUOut; • if $t0 == $t1 perform branch
Multi-cycle execution: j • Instruction fetch • IR <= Memory[PC]; j label • PC <= PC + 4; • Decode/register read • A <= Reg[IR[25:21]]; • B <= Reg[IR[20:16]]; • ALUOut <= PC + (sign-extend(IR[15:0])<<2); • Execution • PC <= {PC[31:28], IR[25:0], "00"};
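The five step lists above can be collected into one table, which makes the cycle counts easy to check. This is only a study sketch: the names COMMON, LATER, steps, and cycle_count are made up for illustration and do not appear in Figure 5.28.

```python
# Register-transfer steps per instruction class, copied from the slides.
# Cycles 1 and 2 are identical for every instruction.
COMMON = [
    ["IR <= Memory[PC]", "PC <= PC + 4"],
    ["A <= Reg[IR[25:21]]",
     "B <= Reg[IR[20:16]]",
     "ALUOut <= PC + (sign-extend(IR[15:0]) << 2)"],
]

# Steps from cycle 3 onward depend on the instruction class.
LATER = {
    "R-type": [["ALUOut <= A op B"],
               ["Reg[IR[15:11]] <= ALUOut"]],
    "lw": [["ALUOut <= A + sign-extend(IR[15:0])"],
           ["MDR <= Memory[ALUOut]"],
           ["Reg[IR[20:16]] <= MDR"]],
    "sw": [["ALUOut <= A + sign-extend(IR[15:0])"],
           ["Memory[ALUOut] <= B"]],
    "beq": [["if (A == B) then PC <= ALUOut"]],
    "j": [['PC <= {PC[31:28], IR[25:0], "00"}']],
}


def steps(kind):
    """Return the list of cycles (each a list of transfers) for one class."""
    return COMMON + LATER[kind]


def cycle_count(kind):
    """Number of clock cycles the instruction class takes."""
    return len(steps(kind))


if __name__ == "__main__":
    for kind in LATER:
        print(kind, cycle_count(kind))
```

Try covering the LATER table and rebuilding it from Figure 5.28 before checking your answer.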
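A Finite State Machine to generate the control signals • [Figure: FSM state diagram] • Correction noted on the figure: the values shown there are wrong; they should be RegDst = 0; MemtoReg = 1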
Question • In terms of the datapath in Figure 5.28, how is the branch target address calculated, and where is it stored? • In other words, fill in the following: ______ <= ______ + ______________
Answer • In terms of the datapath in Figure 5.28, how is the branch target address calculated, and where is it stored? • ALUOut <= PC + (sign-extend(IR[15:0])<<2) • Recall: beq is I-format: Opcode[31:26] rs[25:21] rt[20:16] imm[15:0] • Quiz yourself until it’s easy: what is imm[15:0] for a beq instruction? Why is the above the branch target address?
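To check your answer to the quiz question, here is a small sketch of the same calculation. The function names and the example addresses are made up for illustration. Note that by cycle 2 the PC already holds PC + 4, so the offset is counted in words from the instruction after the branch.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc_plus_4, ir):
    """ALUOut <= PC + (sign-extend(IR[15:0]) << 2), with PC already incremented."""
    imm = sign_extend16(ir & 0xFFFF)
    return (pc_plus_4 + (imm << 2)) & 0xFFFFFFFF


if __name__ == "__main__":
    # beq $t0,$t1 with imm = 3: opcode 4, rs = 8 ($t0), rt = 9 ($t1)
    ir = (0x04 << 26) | (8 << 21) | (9 << 16) | 3
    # Hypothetical address: the beq sits at 0x00400000, so PC + 4 = 0x00400004
    print(hex(branch_target(0x00400004, ir)))
```

A negative offset works the same way: imm = 0xFFFF is -1, which gives a target 4 bytes before PC + 4, that is, the branch instruction itself.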
Question • For which instruction(s) is the branch target address calculated? During which cycle(s)? • Ans: It is calculated during cycle 2 for all instructions. If the current instruction turns out to not be a branch, then the value was calculated for nothing. But, doing this means that branch instructions take 3 rather than 4 cycles. • Go back and look at the instruction execution steps for beq (slide 6). The ALU is not being used for anything else that cycle, so we can use it for this. The things added are the PC and information from the IR, both of which are available during this cycle. • Note: the hardware is decoding the instruction during this cycle, but the various fields ARE available, in the IR. The hardware HAS the current instruction; it just doesn’t “understand” it yet.
Question • What happens during the 3rd cycle for a memory access instruction? • Well, which are the memory access instructions? lw, sw in the subset covered in this chapter. [Others include lbu, lhu, sb, sh] • _____ <= _______ + _______________
Answer • What happens during the 3rd cycle for a memory access instruction? • ALUOut <= A + sign-extend(IR[15:0]) • What does A contain? • The value of register rs, which was read into A during the 2nd cycle • What does ALUOut now contain? • The effective address for the memory access: the memory location we are reading from or writing to
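The same calculation for the running example lw $t0,-12($t1) can be sketched as follows. The function names and the value chosen for $t1 are made up for illustration. Unlike the branch target, the offset is not shifted: it is a byte offset.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(a, ir):
    """ALUOut <= A + sign-extend(IR[15:0]); no shift, the offset is in bytes."""
    return (a + sign_extend16(ir & 0xFFFF)) & 0xFFFFFFFF


if __name__ == "__main__":
    # lw $t0,-12($t1): opcode 0x23, rs = 9 ($t1), rt = 8 ($t0), imm = -12
    ir = (0x23 << 26) | (9 << 21) | (8 << 16) | (-12 & 0xFFFF)
    t1 = 0x10010020  # hypothetical contents of $t1, read into A in cycle 2
    print(hex(effective_address(t1, ir)))
```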
Question • What happens during the first cycle for all instructions? • Ans: • IR <= Memory[PC] • The next instruction is read from memory, and stored in the Instruction Register (IR) • PC <= PC + 4 • The PC is incremented to point to the next instruction
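Question • Is the ALU needed to execute a j instruction? • No, not for the jump itself: its execution step is • PC <= {PC[31:28], IR[25:0], "00"} • This only concatenates bits; no arithmetic is involved. (The ALU is still used during cycles 1 and 2, as for every instruction.)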
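The concatenation can be written out with masks and shifts to confirm that no addition is needed. The function name and the example addresses are made up for illustration; the PC value used is the one already incremented in cycle 1.

```python
def jump_target(pc_plus_4, ir):
    """PC <= {PC[31:28], IR[25:0], "00"}: pure bit concatenation, no ALU."""
    upper = pc_plus_4 & 0xF0000000      # PC[31:28], kept in place
    field = ir & 0x03FFFFFF             # IR[25:0], the 26-bit target field
    return upper | (field << 2)         # appending "00" is a shift left by 2


if __name__ == "__main__":
    # j with target field 0x0100004 (opcode 0x02 in IR[31:26])
    ir = (0x02 << 26) | 0x0100004
    print(hex(jump_target(0x00400008, ir)))
```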
Question • Why does the j instruction require 3 cycles, since it doesn’t require rs, rt, or rd and does not require the ALU? • Cycle 1: instruction fetch • Cycle 2: decode instruction • Cycle 3: It is only by here that the instruction has been decoded, and the hardware “knows” it is a j (see slide 7)
Question • For a LW, what is ALUSrcB on each cycle? (Figure 5.28 is on slide 8) • Cycle 1: PC <= PC + 4, so ans = 1 (0b01) • Cycle 2: Compute branch target address, so ans = 3 (0b11) • Cycle 3: Compute effective address, so ans = 2 (0b10) • Cycle 4: No use of ALU, so X • Cycle 5: No use of ALU, so X
Question • For a BEQ, what is ALUSrcB on each cycle? (Figure 5.28 is on slide 8) • Cycle 1: PC <= PC + 4, so ans = 1 (0b01) • Cycle 2: Compute branch target address, so ans = 3 (0b11) • Cycle 3: Perform A - B, so ans = 0 (0b00)
Question • For which instructions, during which cycles, is ALUSrcA = 0? • Well, it is 0 whenever something is added to the PC • All instructions, cycle 1: PC <= PC + 4 • All instructions, cycle 2: ALUOut <= PC + (sign-extend(IR[15:0])<<2) • That’s it!
Question • For which instructions, during which cycles, is ALUSrcA = 1? (not X) • Whenever the ALU is used to compute something and the top input is A (rs) • R-type, cycle 3 • ALUOut <= A op B • Memory access (lw, sw), cycle 3 • ALUOut <= A + sign-extend(IR[15:0]) • beq, cycle 3 • If (A == B): if A - B == 0
Question • Is there any instruction, cycle where one of ALUSrcA and ALUSrcB is X and the other one isn’t? • Nope. If the ALU is needed for something, it needs 2 operands. These two control signals choose the operands.
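The answers to the ALUSrcA and ALUSrcB questions above fit in one small lookup, which also lets you check the last claim mechanically. This is a study sketch: the names X, TOTAL_CYCLES, CYCLE3, and alu_sources are made up, and None stands in for a don't-care.

```python
X = None  # don't care: the ALU is not used in that cycle

TOTAL_CYCLES = {"R-type": 4, "lw": 5, "sw": 4, "beq": 3, "j": 3}

# (ALUSrcA, ALUSrcB) in cycle 3, per instruction class
CYCLE3 = {
    "R-type": (1, 0b00),  # A op B
    "lw": (1, 0b10),      # A + sign-extend(IR[15:0])
    "sw": (1, 0b10),      # A + sign-extend(IR[15:0])
    "beq": (1, 0b00),     # A - B
    "j": (X, X),          # the jump itself does not use the ALU
}


def alu_sources(kind, cycle):
    """Return (ALUSrcA, ALUSrcB) for one instruction class and cycle."""
    if not 1 <= cycle <= TOTAL_CYCLES[kind]:
        raise ValueError("no such cycle for this instruction")
    if cycle == 1:
        return (0, 0b01)  # PC + 4
    if cycle == 2:
        return (0, 0b11)  # PC + (sign-extend(IR[15:0]) << 2)
    if cycle == 3:
        return CYCLE3[kind]
    return (X, X)         # cycles 4 and 5 never use the ALU


if __name__ == "__main__":
    for kind, total in TOTAL_CYCLES.items():
        print(kind, [alu_sources(kind, c) for c in range(1, total + 1)])
```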
Question • In the FSM for generating the control signals, not all of the signals are shown in each state. Look at the machine, and figure out when a signal is not shown. • Hint: there are two cases • Case 1: a signal that is X (MUXs and ALU Op) • Case 2: a signal that is essentially an on/off switch whose value is 0 (and cannot be X) • PCWriteCond,PCWrite,MemWrite,IRWrite, RegWrite (can’t be X: incorrect writes!) • MemRead (can’t be X: we don’t want unneeded memory reads)
Question • In the FSM, which instruction, cycle is state 8? • beq, cycle 3 • How about state 1? • All instructions, cycle 2 • How about state 7? • R-type, cycle 4
Question • In the FSM, when is the transition made from state 4 to state 0? • After a lw instruction finishes: we start over again in state 0 for the next instruction. • Note that this is an unconditional transition: After state 4 finishes, we always transition to state 0.
Question • In the FSM, where do we transition from state 1? • It depends on the opcode: • 2 if opcode == 0x23 (lw) or 0x2b (sw) • 6 if opcode == 0x0 (R-type) • 8 if opcode == 0x4 (beq) • 9 if opcode == 0x2 (j)
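The state transitions discussed in the last few questions can be sketched as a next-state function. The transitions out of states 0, 1, and 4 and the meaning of states 7 and 8 come from these slides; the numbering of states 3 (memory read) and 5 (memory write) is an assumption taken from the usual textbook FSM diagram, so check it against your copy of the figure. The function names are made up.

```python
LW, SW, RTYPE, BEQ, J = 0x23, 0x2B, 0x00, 0x04, 0x02


def next_state(state, opcode):
    """Next FSM state; only state 1 and state 2 look at the opcode."""
    if state == 0:
        return 1                      # fetch -> decode, always
    if state == 1:
        return {LW: 2, SW: 2, RTYPE: 6, BEQ: 8, J: 9}[opcode]
    if state == 2:
        return 3 if opcode == LW else 5   # assumed: 3 = read, 5 = write
    if state == 3:
        return 4                      # memory read -> write-back
    if state == 6:
        return 7                      # R-type execute -> completion
    return 0                          # states 4, 5, 7, 8, 9 end the instruction


def trace(opcode):
    """List the states visited while executing one instruction."""
    states = [0]
    while True:
        nxt = next_state(states[-1], opcode)
        if nxt == 0:
            return states
        states.append(nxt)


if __name__ == "__main__":
    for name, op in [("lw", LW), ("sw", SW), ("R-type", RTYPE),
                     ("beq", BEQ), ("j", J)]:
        print(name, trace(op))
```

The length of each trace is the cycle count for that instruction, which is a quick way to cross-check the FSM against the step lists.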
ALU Control Unit (from the single-cycle datapath) • This can stay the same. The current values of ALUOp1 and ALUOp0 determine what the ALU does. • Before, ALUOp0 and ALUOp1 depended only on the opcode. Now, as with the other control signals, they also need to depend on which cycle it is. • Control Unit: and this (from the single-cycle datapath)? • This needs to change. The signals depend on the opcode plus which cycle it is, and the unit is under the control of the clock. We won’t look at further details of the FSM implementation.
(Table from the single-cycle datapath) • What we do want to do is figure out what values ALUOp1 and ALUOp0 have for each instruction, and for each cycle • Note: they are X if the ALU is not used during a cycle.
(Table from the single-cycle datapath) • The ALU is used during cycles 1 and 2 for all instructions • PC <= PC + 4 (c1); Calculate the branch target address (c2) • It is used to calculate the effective address for lw, sw on cycle 3 • It is used to perform the op for R-type instructions on cycle 3 • It is used to perform the comparison for beq instructions on cycle 3
Remember this from the single-cycle datapath? • It is used to perform the op for R-type instructions on cycle 3. So, what should ALUOp1 and ALUOp0 be for R instructions on cycle 3? They should be 10, so that the funct field determines what the ALU does.
(Table from the single-cycle datapath) • The ALU is used to calculate the effective address for lw, sw on cycle 3. So, on cycle 3, ALUOp1 and ALUOp0 should be … 00
(Table from the single-cycle datapath) • What should ALUOp1 and ALUOp0 be for a beq instruction (in the multicycle datapath)? • Bad question! You should ask me, during which cycle? I’ll turn things around: during which cycle should they have the 01 value listed in the table above for BEQ? During cycle 3! That is when the subtraction is performed to do the comparison.
(Table from the single-cycle datapath) • The ALU is used during cycles 1 and 2 for all instructions • PC <= PC + 4 (c1); Calculate the branch target address (c2) • Which value should ALUOp have during cycles 1 and 2? In both cases: 00. Then, the ALU will perform addition.
(Table from the single-cycle datapath) • The ALU is used during cycles 1 and 2 for all instructions. In both cases ALUOp is 00, so the ALU performs addition. • Is the R-type ADD (ALUOp = 10) an option instead? • No! With ALUOp = 10 the funct field selects the operation, and the funct field is just the bottom 6 bits of the IR, whatever they happen to be!
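The ALUOp answers above can be sketched in code, including why ALUOp = 10 is unsafe in cycles 1 and 2. The function names are made up, and the funct table lists only the standard MIPS encodings for add, sub, and, or, and slt; any other bit pattern is reported as "undefined" here, which is a simplification for illustration.

```python
# Standard MIPS funct codes for the R-type operations in this subset
FUNCT = {0x20: "add", 0x22: "sub", 0x24: "and", 0x25: "or", 0x2A: "slt"}


def alu_operation(alu_op, funct):
    """What the ALU control unit selects for a given ALUOp and IR[5:0]."""
    if alu_op == 0b00:
        return "add"                          # funct is ignored
    if alu_op == 0b01:
        return "sub"                          # funct is ignored
    if alu_op == 0b10:
        return FUNCT.get(funct, "undefined")  # funct decides
    raise ValueError("ALUOp 11 is not used")


def alu_op_for(kind, cycle):
    """ALUOp for an instruction class and cycle; None means X (ALU unused)."""
    if cycle in (1, 2):
        return 0b00                           # PC + 4, branch target
    if cycle == 3:
        return {"R-type": 0b10, "lw": 0b00, "sw": 0b00, "beq": 0b01}.get(kind)
    return None


if __name__ == "__main__":
    # lw $t0,-12($t1) is 0x8D28FFF4; its bottom 6 bits are 0x34, not a funct
    funct_bits = 0x8D28FFF4 & 0x3F
    print(alu_operation(0b00, funct_bits))    # what cycles 1 and 2 need
    print(alu_operation(0b10, funct_bits))    # what ALUOp = 10 would give
```

With ALUOp = 00 the result is an addition no matter what the IR holds, which is exactly what cycles 1 and 2 need.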