
Midterm Review 2

CprE 381 Computer Organization and Assembly Level Programming, Fall 2013. Midterm Review 2. Dr. Zhao Zhang, Iowa State University. Announcement: no quiz today; no homework this Friday; exam on Monday 9:00-9:50; HW9 deadline extended to next Friday; HW8 solutions will be posted today.


Presentation Transcript


  1. CprE 381 Computer Organization and Assembly Level Programming, Fall 2013. Midterm Review 2. Dr. Zhao Zhang, Iowa State University

  2. Announcement • No quiz today • No homework this Friday • Exam on Monday 9:00-9:50 • HW9 deadline extended to next Friday • HW8 solutions will be posted today Chapter 1 — Computer Abstractions and Technology — 2

  3. Exam 2 Coverage • Coverage: Ch. 4, The Processor • Datapath and control • Simple MIPS pipeline • Data hazards and forwarding • Load-use hazard and pipeline stall • Control hazards • Arithmetic will NOT be covered • Will be covered in the final exam • Final exam is comprehensive

  4. Question Styles and Coverage • Short answer • True/False or multiple-choice • Design and Analysis • Signal values in the datapath and control • Identify the critical path • Support a new MIPS instruction • Performance analysis and optimization • Identify pipeline bubbles in program execution • Reorder instructions to improve performance • And others

  5. Nine-Instruction MIPS • Memory reference: LW and SW • Arithmetic/logic: ADD, SUB, AND, OR, SLT • Branch and jump: BEQ, J • These nine instructions are enough to illustrate most aspects of CPU design, particularly datapath and control design • Some questions will use this subset as the baseline design

  6. Datapath With Jumps Added Chapter 4 — The Processor — 6

  7. The Control • Control signals for the nine-instruction implementation Note: “R-” means R-format

  8. ALU Control • Truth table for ALU Control • Extend it as a secondary control unit in projects B & C, with more control signal outputs

  9. Extend the Single-Cycle Processor • For each new instruction, do we need: • Any new or revised datapath element(s)? • Any new control signal(s)? • Then revise, if necessary: • Datapath: Add new elements or revise existing ones, add new connections • Control Unit: Add/extend control signals, extend the truth table • ALU Control: Extend the truth table

  10. Support JAL
      jal target
      Encoding: opcode 000011 in bits 31:26, address in bits 25:0
      PC_plus_4 = PC + 4
      JumpAddr = PC_plus_4[31:28] & Inst[25:0] & “00” (here & denotes bit concatenation)
      R[31] = PC_plus_4
      PC = JumpAddr
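
The register-transfer description of jal on this slide can be checked with a short Python sketch. The function name `jal` and the example PC and instruction word are illustrative assumptions, not values from the slides; the sketch only models the PC and $ra updates, not the rest of the datapath.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF

def jal(pc: int, inst: int) -> Tuple[int, int]:
    """Return (new_pc, value_written_to_r31) for a jal instruction word.

    Hypothetical helper: models only PC = JumpAddr and R[31] = PC + 4.
    """
    opcode = (inst >> 26) & 0x3F           # Inst[31:26]
    assert opcode == 0b000011, "not a jal opcode"
    pc_plus_4 = (pc + 4) & MASK32
    target_field = inst & 0x03FFFFFF       # Inst[25:0]
    # Concatenate PC_plus_4[31:28], Inst[25:0], and "00".
    jump_addr = (pc_plus_4 & 0xF0000000) | (target_field << 2)
    return jump_addr, pc_plus_4

# Example values are assumptions chosen for illustration.
new_pc, ra = jal(0x00400010, 0x0C100020)
print(hex(new_pc), hex(ra))
```

Shifting the 26-bit field left by two bits is the same as appending “00”, which is why the jump target is always word-aligned.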

  11. Support JAL • What changes must be made to the datapath?

  12. Support JAL • Analyze the instruction execution • Writes register $ra ($31) • Updates PC with the jump target • This part is already done for supporting J • Analyze the datapath • Needs another input, fixed at 31, to the “Write register” port of the register file • Needs another input, PC+4, to the “Write data” port of the register file • Revise the control • Add a “link” signal • The (main) control unit can generate it by decoding the opcode

  13. SCPv1 + JAL • Revise the two muxes • Add another input • Extend the select signals • Alternatively, use an extra mux

  14. Control Signals • Control signals for the nine-instruction implementation • Add a new row for jal • Extend RegDst • Add a control line link

  15. Control Signals • Control signals for the nine-instruction implementation • Extend control input to RegDst Mux: RegDst & Link • Extend control input to MemtoReg Mux: MemtoReg & Link
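
As a rough illustration of the two extended muxes, the Python sketch below treats Link as an extra select bit. The slides do not give the exact select encoding, so the choice that Link = 1 overrides RegDst and MemtoReg is an assumption, and the function names are hypothetical.

```python
def select_write_register(reg_dst: int, link: int, rt: int, rd: int) -> int:
    """Extended RegDst mux: the new third input is the constant 31 ($ra).

    Assumed encoding: link = 1 selects register 31 regardless of reg_dst.
    """
    if link:
        return 31
    return rd if reg_dst else rt

def select_write_data(mem_to_reg: int, link: int,
                      alu_result: int, mem_data: int, pc_plus_4: int) -> int:
    """Extended MemtoReg mux: the new third input is PC + 4.

    Assumed encoding: link = 1 selects PC + 4 regardless of mem_to_reg.
    """
    if link:
        return pc_plus_4
    return mem_data if mem_to_reg else alu_result

# jal: Link = 1, so $31 receives PC + 4 (operand values are illustrative).
print(select_write_register(0, 1, rt=8, rd=9))
print(hex(select_write_data(0, 1, alu_result=0, mem_data=0, pc_plus_4=0x00400014)))
```

With Link = 0 both functions reduce to the original two-input muxes, so the nine baseline instructions are unaffected.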

  16. Simple Pipeline • Add pipeline registers to hold information produced in each cycle

  17. Pipelined Control

  18. Hazards • Situations that prevent starting the next instruction safely in the next cycle • The simple pipeline won’t work correctly • Structural hazard • A required resource is busy • Data hazard • Need to wait for a previous instruction to complete its data read/write • Control hazard • Deciding on the control action depends on a previous instruction

  19. Data Hazards
      Program with data dependence:
          sub $2, $1, $3
          and $12, $2, $5
          or  $13, $6, $2
          add $14, $2, $2
          sw  $15, 100($2)
      Program with control dependence:
          beq  $1, $3, +4
          addi $2, $2, 1
          addi $4, $4, 1

  20. Data Forwarding
          sub $2, $1, $3
          and $12, $2, $5      # MEM=>EX forwarding
          or  $13, $6, $2      # WB=>EX forwarding
          add $14, $2, $2
          sw  $15, 100($2)
      Instruction in each stage over three consecutive cycles:
          IF    ID    EX    MEM   WB
          or    and   sub   …     …
          add   or    and   sub   …      and gets the forwarded new $2 value (from sub in MEM)
          sw    add   or    and   sub    or gets the forwarded new $2 value (from sub in WB)

  21. Data Forwarding Paths

  22. Detecting the Need to Forward • Input • rs and rt from EX • rd and RegWrite from MEM • rd and RegWrite from WB • Output • FwdA, FwdB • Caveats • Check RegWrite • Check whether rd = 0 (never forward a write to $zero) • Forwarding from MEM wins over WB • Review slides and textbook for details
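
The inputs, outputs, and caveats listed on this slide can be sketched as a small Python function. The 2-bit encoding (0b10 = forward from MEM, 0b01 = forward from WB, 0b00 = use the register file value) follows the usual textbook convention and is an assumption here, as are the function and parameter names.

```python
from typing import Tuple

def forward_select(src: int, mem_rd: int, mem_reg_write: bool,
                   wb_rd: int, wb_reg_write: bool) -> int:
    """Select the source for one ALU operand in EX (assumed encoding)."""
    # MEM is checked first, so the most recent result wins over WB.
    if mem_reg_write and mem_rd != 0 and mem_rd == src:
        return 0b10
    if wb_reg_write and wb_rd != 0 and wb_rd == src:
        return 0b01
    return 0b00

def forwarding_unit(ex_rs: int, ex_rt: int, mem_rd: int, mem_reg_write: bool,
                    wb_rd: int, wb_reg_write: bool) -> Tuple[int, int]:
    """Return (FwdA, FwdB) for the instruction currently in EX."""
    fwd_a = forward_select(ex_rs, mem_rd, mem_reg_write, wb_rd, wb_reg_write)
    fwd_b = forward_select(ex_rt, mem_rd, mem_reg_write, wb_rd, wb_reg_write)
    return fwd_a, fwd_b

# "and $12,$2,$5" in EX while "sub $2,$1,$3" is in MEM: rs is forwarded from MEM.
print(forwarding_unit(2, 5, mem_rd=2, mem_reg_write=True, wb_rd=0, wb_reg_write=False))
```

Checking MEM before WB is what implements "forwarding from MEM wins over WB" when both stages hold a pending write to the same register.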

  23. Load-Use Data Hazard
          lw  $s0, 20($t1)
          sub $t2, $s0, $t3
      • Can’t always avoid stalls by forwarding: the loaded value is not available until the end of lw’s MEM stage, which is too late for the next instruction’s EX stage
      • Must stall pipeline by one cycle

  24. Datapath with Hazard Detection

  25. Hazard Detection Unit • Input • rs and rt from ID • rt and MemRead from EX • Output • PCWrite, IF/IDWrite (0 for holding instructions) • Select signal to a MUX to insert bubble in EX • Read slides/textbook for details
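
A minimal Python sketch of the load-use check, using the inputs and outputs named on this slide. The dictionary keys and the function name are illustrative assumptions; the condition itself is the standard one (a load in EX whose destination rt matches a source register of the instruction in ID).

```python
from typing import Dict

def hazard_detection(id_rs: int, id_rt: int,
                     ex_rt: int, ex_mem_read: bool) -> Dict[str, int]:
    """Return control outputs for the load-use hazard check (names assumed)."""
    stall = bool(ex_mem_read) and ex_rt in (id_rs, id_rt)
    return {
        "PCWrite": 0 if stall else 1,       # 0 holds the PC
        "IFIDWrite": 0 if stall else 1,     # 0 holds the IF/ID register
        "InsertBubble": 1 if stall else 0,  # 1 zeroes the control signals entering EX
    }

# lw $s0, 20($t1) in EX; sub $t2, $s0, $t3 in ID ($s0 = 16, $t3 = 11).
print(hazard_detection(id_rs=16, id_rt=11, ex_rt=16, ex_mem_read=True))
```

When the stall condition holds, the PC and IF/ID register keep their values for one cycle while a bubble moves into EX.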

  26. Pipeline Stall • The nop has all control signals set to zero • It does nothing at EX, MEM and WB • Prevent update of PC and IF/ID register • Using instruction is decoded again (OK) • Following instruction is fetched again (OK) • 1-cycle stall allows MEM to read data for lw • Can subsequently forward from WB to EX

  27. Code Scheduling to Avoid Stalls
      • Reorder code to avoid use of load result in the next instruction
      • C code for A = B + E; C = B + F;
      Original order (13 cycles):
          lw  $t1, 0($t0)
          lw  $t2, 4($t0)
          add $t3, $t1, $t2      # stall
          sw  $t3, 12($t0)
          lw  $t4, 8($t0)
          add $t5, $t1, $t4      # stall
          sw  $t5, 16($t0)
      Reordered (11 cycles):
          lw  $t1, 0($t0)
          lw  $t2, 4($t0)
          lw  $t4, 8($t0)
          add $t3, $t1, $t2
          sw  $t3, 12($t0)
          add $t5, $t1, $t4
          sw  $t5, 16($t0)
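
The 13-cycle and 11-cycle figures can be reproduced with a small Python sketch. It assumes a five-stage pipeline with full forwarding, where the only stalls are one-cycle load-use stalls; the tuple representation of instructions and the function names are illustrative, not part of the slides.

```python
PIPELINE_STAGES = 5

def count_load_use_stalls(program):
    """Count one stall for each lw whose result is read by the very next instruction."""
    stalls = 0
    for (op, dest, _), (_, _, next_sources) in zip(program, program[1:]):
        if op == "lw" and dest in next_sources:
            stalls += 1
    return stalls

def total_cycles(program):
    """Cycles to drain the pipeline: n instructions + 4 fill cycles + stalls."""
    return len(program) + (PIPELINE_STAGES - 1) + count_load_use_stalls(program)

# Each instruction is (opcode, destination register or None, source registers).
original = [
    ("lw", "t1", ["t0"]), ("lw", "t2", ["t0"]), ("add", "t3", ["t1", "t2"]),
    ("sw", None, ["t3", "t0"]), ("lw", "t4", ["t0"]),
    ("add", "t5", ["t1", "t4"]), ("sw", None, ["t5", "t0"]),
]
reordered = [
    ("lw", "t1", ["t0"]), ("lw", "t2", ["t0"]), ("lw", "t4", ["t0"]),
    ("add", "t3", ["t1", "t2"]), ("sw", None, ["t3", "t0"]),
    ("add", "t5", ["t1", "t4"]), ("sw", None, ["t5", "t0"]),
]

print(total_cycles(original))   # 13
print(total_cycles(reordered))  # 11
```

Moving the third lw up separates each load from its first use by at least one instruction, so both stalls disappear without changing the result.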

  28. Control Hazards • Branch determines flow of control • Two branch outcomes: Taken or Not-Taken • The CPU doesn’t recognize a branch until it reaches the end of the ID stage • Every cycle, the CPU has to fetch one instruction

  29. Control Hazards • The MIPS pipeline in the textbook always predicts “not-taken” • Pipeline flush on every taken branch • OK to flush because mis-fetched instructions don’t write to register/memory • But this incurs pipeline bubbles (performance penalty) • The revised MIPS pipeline moves branch comparison to the ID stage • Doable for BEQ and BNE • Reduces pipeline bubbles from 3 to 1 per taken branch • Complicates data forwarding and hazard detection
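
To see what the reduction from 3 bubbles to 1 per taken branch means for run time, here is a small Python sketch. The instruction and taken-branch counts are made-up illustrative numbers, not figures from the slides, and the model ignores data-hazard stalls.

```python
def cycles_with_branch_penalty(instructions: int, taken_branches: int,
                               bubbles_per_taken: int, stages: int = 5) -> int:
    """Total cycles for a predict-not-taken pipeline, ignoring other stalls."""
    return instructions + (stages - 1) + taken_branches * bubbles_per_taken

# Hypothetical workload: 100 instructions, 10 of them taken branches.
print(cycles_with_branch_penalty(100, 10, bubbles_per_taken=3))  # textbook pipeline
print(cycles_with_branch_penalty(100, 10, bubbles_per_taken=1))  # branch resolved in ID
```

Under these assumptions the penalty term scales linearly with the number of taken branches, so loops with frequently taken branches benefit most from resolving the branch in ID.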

  30. Revised MIPS Pipeline

  31. Revised MIPS Pipeline Note: Branch does nothing in EX, MEM and WB

  32. Performance Penalty
      • Any pipeline bubbles?
      Example 1:
          loop: addi $1, $1, -1
                add  $4, $5, $6
                beq  $1, $zero, loop
      Example 2:
          lw  $1, addr
          add $4, $5, $6
          beq $1, $4, target

  33. Delayed Branch
      • Delayed branch may remove the one-cycle stall
      • The instruction right after the beq is executed whether or not the branch is taken (the sub instruction in the example)
      • In other words, the effect of beq is delayed by one cycle
      Before:
          sub $10, $4, $8
          beq $1, $3, 7
          and $12, $2, $5
      After (sub moved to right after beq):
          beq $1, $3, 7
          sub $10, $4, $8
          and $12, $2, $5
      • Must find an independent instruction; otherwise
        • May have to fill in a nop instruction, or
        • Need two variants of beq, delayed and not delayed

  34. Other Topics • Exception handling • Multi-issue pipeline • These topics will be covered in the final exam • Exam 2 will NOT cover them
