
CS 230: Computer Organization and Assembly Language

CS 230: Computer Organization and Assembly Language. Aviral Shrivastava, Department of Computer Science and Engineering, School of Computing and Informatics, Arizona State University. Slides courtesy: Prof. Yann Hang Lee, ASU; Prof. Mary Jane Irwin, PSU; Ande Carle, UCB.


Presentation Transcript


  1. CS 230: Computer Organization and Assembly Language Aviral Shrivastava Department of Computer Science and Engineering School of Computing and Informatics Arizona State University Slides courtesy: Prof. Yann Hang Lee, ASU, Prof. Mary Jane Irwin, PSU, Ande Carle, UCB

  2. Announcements • Alternate Project • Submit Nov 24 • Quiz 5 • Thursday, Nov 19, 2009 • Pipelining • Finals • Tuesday, Dec 08, 2009 • Please come on time (You’ll need all the time) • Open book, notes, and internet • No communication with any other human

  3. Benefits of Pipelining • Pipeline latches: pass the status and result of the current instruction to the next stage • [Figure: pipelined execution over cycles 1 to 10, with overlapping Ifetch, Dec/Reg, Exec, Mem, Wr stages, compared against single-cycle execution of lw and sw]

  4. Branch Hazards • So far, we’ve limited discussion of hazards to: • Arithmetic/logic operations • Data transfers • Also need to consider hazards involving branches: • Example: • 40: beq $1, $3, 28 • 44: and $12, $2, $5 • 48: or $13, $6, $2 • 52: add $14, $2, $2 • 72: lw $4, 50($7) • How long will it take before the branch decision takes effect? • What happens in the meantime?

  5. Branch signal determined in MEM stage • [Figure: pipelined datapath with IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers; the PCSrc branch signal is generated in the MEM stage]

  6. Pipeline impact on branch • If branch condition true, must skip 44, 48, 52 • But, these have already started down the pipeline • They will complete unless we do something about it • How do we deal with this? • We’ll consider 2 possibilities

  7. Dealing w/branch hazards: always stall • Branch taken • Wait 3 cycles • No proper instructions in the pipeline • Same delay as without stalls (no time lost)

  8. Dealing w/branch hazards: always stall • Branch not taken • Still must wait 3 cycles • Time lost • Could have spent cycles fetching and decoding next instructions

  9. Assume branch not taken • On average, branches are taken ½ the time • If branch not taken… • Continue normal processing • Else, if branch is taken… • Need to flush improper instruction from pipeline • Cuts overall time for branch processing in ½
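The claim that assuming "not taken" halves the branch cost can be checked with a little arithmetic. The sketch below is illustrative only: the helper name `expected_branch_penalty` is made up here, and the 3-cycle penalty and ½ taken rate are the figures quoted on the slides.

```python
def expected_branch_penalty(taken_fraction, penalty_cycles=3):
    """Average lost cycles per branch under two policies (hypothetical helper).

    always_stall:      every branch waits the full penalty.
    predict_not_taken: only taken branches pay the penalty (flush).
    """
    always_stall = float(penalty_cycles)
    predict_not_taken = taken_fraction * penalty_cycles
    return always_stall, predict_not_taken


stall, predict = expected_branch_penalty(0.5)
print(stall, predict)  # 3.0 1.5
```

With branches taken half the time, the average cost drops from 3 cycles to 1.5 cycles per branch.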

  10. Flushing unwanted instructions from pipeline • Useful to compare w/stalling pipeline: • Simple stall: inject bubble into pipe at ID stage only • Change control to 0 in the ID stage • Let “bubbles” percolate to the right • Flushing pipe: must change inst. in IF, ID, and EX • IF Stage: • Zero instruction field of IF/ID pipeline register • Use new control signal IF.Flush • ID Stage: • Use existing “bubble injection” mux that zeros control for stalls • Signal ID.Flush is ORed w/stall signal from hazard detection unit • EX Stage: • Add new muxes to zero EX pipeline register control lines • Both muxes controlled by single EX.Flush signal • Control determines when to flush: • Depends on Opcode and value of branch condition
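The flush control described above can be summarized as a small truth function. This is a rough sketch, not the actual control logic from the course: the function name, its arguments, and the `"ID.Bubble"` key (the output of the OR feeding the bubble-injection mux) are assumptions for illustration.

```python
def flush_controls(is_branch, condition_true, hazard_stall=False):
    """Return flush/bubble control values for one cycle (hypothetical helper).

    A flush is needed only when the instruction is a branch and its
    condition is true, i.e. the "not taken" assumption was wrong.
    """
    flush = is_branch and condition_true
    return {
        "IF.Flush": flush,                   # zero instruction field of IF/ID
        "ID.Bubble": flush or hazard_stall,  # ID.Flush ORed with the stall signal
        "EX.Flush": flush,                   # zero the EX control lines
    }


print(flush_controls(True, True))
print(flush_controls(False, False, hazard_stall=True))
```

A taken branch raises all three signals, while an ordinary hazard stall only injects a bubble at the ID stage.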

  11. Flushing Pipeline • [Figure: pipelined datapath in which the hazard detection unit and branch decision drive the IF.Flush, ID.Flush, and EX.Flush signals; muxes select 0 for the control lines entering ID/EX and EX/MEM]

  12. Assume “branch not taken”…and branch is not taken… • Execution proceeds normally – no penalty

  13. Assume “branch not taken”…and branch is taken… • Bubbles injected into 3 stages during cycle 5

  14. Reservation Table Picture • Another way of looking at it… • 40: beq $1, $3, 72 • 44: and $12, $2, $5 • 48: or $13, $6, $2 • 52: add $14, $2, $2 • … • 72: lw $4, 50($7) • Assume branch not taken and correct: no penalty • Assume branch not taken and NOT correct: 3 cycle penalty • (FYI, branch freq ~ 20%; & 3 cycle penalty 50% of time)

  15. Branch Penalty Impact • Assume 16% of all instructions are branches • 4% unconditional branches: 3 cycle penalty • 12% conditional: 50% taken • For a sequence of N instructions (assume N is large) • N cycles to initiate each • 3 * 0.04 * N delays due to unconditional branches • 0.5 * 3 * 0.12 * N delays due to conditional taken • Also, an extra 4 cycles for pipeline to empty • Total: • 1.3*N + 4 total cycles (or 1.3 cycles/instruction) (CPI) • 30% Performance Hit!!! (Bad thing)
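The cycle count above can be reproduced with a few lines of arithmetic. This is a minimal sketch using the slide's numbers as defaults; the function name `total_cycles` and its parameter names are invented for illustration.

```python
def total_cycles(n, uncond_frac=0.04, cond_frac=0.12, taken_frac=0.5,
                 penalty=3, drain=4):
    """Total cycles for n instructions with branch penalties (hypothetical helper)."""
    uncond_delay = penalty * uncond_frac * n             # every unconditional branch pays
    cond_delay = taken_frac * penalty * cond_frac * n    # only taken conditionals pay
    return n + uncond_delay + cond_delay + drain         # drain: cycles to empty the pipe


n = 1_000_000
print(round(total_cycles(n) / n, 3))  # 1.3
```

For large N the 4 drain cycles become negligible, which is why the slide quotes 1.3 cycles per instruction.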

  16. Branch Penalty Impact • Some solutions: • In ISA: branches always execute next 1 or 2 instructions • Instruction so executed said to be in delay slot • See SPARC ISA • (example – loop counter update) • In organization: move comparator to ID stage and decide in the ID stage • Reduces branch delay by 2 cycles • Increases the cycle time
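To see what resolving branches in the ID stage could buy, the earlier CPI estimate can be rerun with a 1-cycle penalty instead of 3. This sketch reuses the instruction mix from the previous slide and assumes the reduced penalty applies to both conditional and unconditional branches; it does not model the longer cycle time the slide warns about. The function name `branch_cpi` is made up here.

```python
def branch_cpi(penalty, uncond_frac=0.04, cond_frac=0.12, taken_frac=0.5):
    """Steady-state CPI with branch penalties only (hypothetical helper)."""
    return 1 + penalty * uncond_frac + taken_frac * penalty * cond_frac


print(round(branch_cpi(3), 2))  # 1.3  (decision in MEM stage)
print(round(branch_cpi(1), 2))  # 1.1  (decision in ID stage)
```

Under these assumptions the CPI drops from 1.3 to 1.1, so the change pays off only if the cycle time grows by less than the CPI shrinks.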

  17. Branch Prediction • Prior solutions are “ugly” • Better (& more common): guess in IF stage • Technique is called “branch predicting”; needs 2 parts: • “Predictor” to guess where/if instruction will branch (and to where) • “Recovery Mechanism”: i.e. a way to fix your mistake • Prior strategy: • Predictor: always guess branch never taken • Recovery: flush instructions if branch taken • Alternative: accumulate info. in IF stage as to… • Whether or not for any particular PC value a branch was taken next • To where it is taken • How to update with information from later stages

  18. A Branch Predictor

  19. Branch History Table

  20. Branch Prediction Information • One bit predictor: • Use result from last time we saw this instruction • Problem: • Even if branch is almost always taken, we will be wrong at least twice • 1st time we see the instruction • 1st time the branch is not taken • Also, 1st time branch is taken again after that • And if branch alternates b/t taken, not taken… • We get 0% accuracy • Can we do better? Yep.
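The one-bit predictor's weaknesses are easy to reproduce in a short simulation. The sketch below tracks a single branch; the function name, the initial "not taken" prediction, and the two test patterns (a loop branch taken 9 times then not taken, and a strictly alternating branch) are assumptions chosen for illustration.

```python
def one_bit_accuracy(outcomes, initial=False):
    """Fraction of correct predictions for a 1-bit predictor on one branch."""
    prediction = initial          # True means "predict taken"
    correct = 0
    for taken in outcomes:
        if prediction == taken:
            correct += 1
        prediction = taken        # remember only the most recent outcome
    return correct / len(outcomes)


loop = ([True] * 9 + [False]) * 10    # loop branch, executed 10 times over
alternating = [True, False] * 50
print(one_bit_accuracy(loop))         # 0.8
print(one_bit_accuracy(alternating))  # 0.0
```

The loop branch is taken 90% of the time, yet the predictor is right only 80% of the time because each loop exit causes two misses: one on the exit itself and one on re-entry.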

  21. Branch Prediction Information • How to do better? • Keep a “counter” in each entry of the number of times taken in the last N times executed • Keep information about the “pattern” of previous branches • Book’s scheme: a “2-bit saturating counter” • Increment when branch is taken • Decrement when branch is not taken • Don’t increment or decrement above or below a max/min count • Use sign of count as predictor
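A 2-bit saturating counter can be simulated the same way. This is a sketch under stated assumptions, not necessarily the book's exact state machine: the counter is encoded as 0 to 3, values 2 and 3 predict "taken" (the unsigned equivalent of using the sign of the count), and it starts at 0. The function name is hypothetical.

```python
def two_bit_accuracy(outcomes, initial=0):
    """Fraction of correct predictions for a 2-bit saturating counter."""
    counter = initial                         # 0..3; assumed encoding
    correct = 0
    for taken in outcomes:
        prediction = counter >= 2             # upper half predicts taken
        if prediction == taken:
            correct += 1
        if taken:
            counter = min(counter + 1, 3)     # saturate at the max count
        else:
            counter = max(counter - 1, 0)     # saturate at the min count
    return correct / len(outcomes)


loop = ([True] * 9 + [False]) * 10
alternating = [True, False] * 50
print(two_bit_accuracy(loop))         # 0.88
print(two_bit_accuracy(alternating))  # 0.5
```

After warm-up, a single not-taken outcome no longer flips the prediction, so the loop branch misses once per loop exit rather than twice. The alternating result depends on the starting state: with `initial=1` this pattern is mispredicted every time.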

  22. Book’s 2 Bit Branch Counter

  23. Computing Performance • Program assumptions: • 23% loads and in ½ of cases, next instruction uses load value • 13% stores • 19% conditional branches • 2% unconditional branches • 43% other • Machine Assumptions: • 5 stage pipe with all forwarding • Only penalty is 1 cycle on use of load value immediately after a load • Jumps are totally resolved in ID stage for a 1 cycle branch penalty • 75% branch prediction accuracy • 1 cycle delay on misprediction

  24. The Answer: • CPI penalty calculation: • Loads: • 50% of the 23% of loads have 1 cycle penalty: .5*.23=0.115 • Jumps: • All of the 2% of jumps have 1 cycle penalty: 0.02*1 = 0.02 • Conditional Branches: • 25% of the 19% are mispredicted for a 1 cycle penalty: 0.25*0.19*1 = 0.0475 • Total Penalty: 0.115 + 0.02 + 0.0475 = 0.1825 • Average CPI: 1 + 0.1825 = 1.1825
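The penalty terms above can be checked mechanically. This is a small sketch that simply restates the slide's assumptions as arithmetic; the dictionary keys and variable names are made up for readability.

```python
# Each entry: fraction of all instructions affected * penalty in cycles.
penalties = {
    "load-use": 0.23 * 0.5 * 1,                    # half of loads stall 1 cycle
    "jump": 0.02 * 1,                              # every jump costs 1 cycle
    "mispredicted branch": 0.19 * (1 - 0.75) * 1,  # 25% of branches mispredicted
}

total_penalty = sum(penalties.values())
cpi = 1 + total_penalty
print(round(total_penalty, 4))  # 0.1825
print(round(cpi, 4))            # 1.1825
```

Stores and "other" instructions contribute no penalty under the stated machine assumptions, so they do not appear in the sum.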

  25. Yoda says… Death is a natural part of life. Rejoice for those around you who transform into the Force. Mourn them do not. Miss them do not
