CS433: Computer System Organization

Luddy Harrison. Lecture 15: Branch Prediction.


Presentation Transcript


  1. CS433: Computer System Organization • Luddy Harrison • Lecture 15: Branch Prediction

  2. 1-bit Branch Prediction Buffer • Predict: if BPB entry is 0, fetch PC+1; if BPB entry is 1, fetch L • Update: if branch is taken, BPB := 1; if branch is not taken, BPB := 0
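The predict and update rules above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the class name `OneBitPredictor`, the 4-bit index width, and the choice to index the buffer with low-order PC bits are all assumptions made for the example.

```python
class OneBitPredictor:
    """1-bit branch prediction buffer (BPB), indexed by low-order PC bits.

    The index width is a hypothetical choice for illustration.
    """

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        self.table = [0] * (1 << index_bits)  # 0 = predict not taken

    def predict(self, pc):
        # True  -> fetch the branch target L
        # False -> fetch the fall-through instruction (PC+1)
        return self.table[pc & self.mask] == 1

    def update(self, pc, taken):
        # The entry simply remembers the most recent outcome.
        self.table[pc & self.mask] = 1 if taken else 0


bpb = OneBitPredictor()
first = bpb.predict(0x40)    # entry starts at 0: predict not taken
bpb.update(0x40, True)       # the branch was actually taken
second = bpb.predict(0x40)   # entry is now 1: predict taken
print(first, second)         # False True
```

Because only some PC bits form the index, two branches can share an entry; in this sketch 0x40 and 0x50 both map to entry 0.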

  3. State Diagram of 1-bit Predictor

  4. Twice Mispredicted Loop Branches
       M: ADD R1, R2, R3
       L: ADD R4, R5, R6
          MUL R7, R8, R9
          SUB R11, R11, #1
          BNE L
          SUB R10, R10, #1
          BNE M
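To see why the inner branch (BNE L) is mispredicted twice per execution of the inner loop, one can replay its outcomes through a 1-bit predictor. The sketch below assumes hypothetical trip counts (3 outer iterations, 4 inner iterations) and an entry that starts at 0; the helper names are made up for the example.

```python
def inner_branch_outcomes(outer_iters, inner_iters):
    """Outcomes of BNE L: taken on every inner iteration except the last."""
    outcomes = []
    for _ in range(outer_iters):
        outcomes.extend([True] * (inner_iters - 1) + [False])
    return outcomes


def count_mispredictions_1bit(outcomes, initial_bit=0):
    """Replay outcomes through a single 1-bit BPB entry."""
    bit = initial_bit
    misses = 0
    for taken in outcomes:
        if (bit == 1) != taken:
            misses += 1
        bit = 1 if taken else 0  # remember only the last outcome
    return misses


misses = count_mispredictions_1bit(inner_branch_outcomes(3, 4))
print(misses)  # 6: two mispredictions for each of the 3 inner-loop executions
```

The loop exit flips the bit to 0, so the first iteration of the next inner-loop execution is mispredicted as well as the exit itself.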

  5. Sequence of Predictions

  6. 2-bit Predictor • Add some “stickiness” or “memory” to the predictor • This causes it to move more slowly from one prediction state to the other (predict taken vs. predict not taken) • One additional bit of state does quite a lot

  7. 2-Bit Predictor

  8. State Diagram of 2-Bit Predictor
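One common realization of a 2-bit predictor is a saturating counter, and the sketch below assumes that scheme; the lecture's state diagram may use different transitions. States 0 and 1 predict not taken, states 2 and 3 predict taken. The loop trip counts are hypothetical.

```python
def count_mispredictions_2bit(outcomes, state=0):
    """Replay outcomes through one 2-bit saturating counter.

    Assumed encoding: 0, 1 predict not taken; 2, 3 predict taken.
    """
    misses = 0
    for taken in outcomes:
        if (state >= 2) != taken:
            misses += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


# Inner-loop branch: taken 3 times, then not taken, repeated 3 times.
loop_outcomes = ([True] * 3 + [False]) * 3
misses_2bit = count_mispredictions_2bit(loop_outcomes)
print(misses_2bit)  # 5: two warm-up misses, then one miss per loop exit
```

After warm-up, a single not-taken outcome only moves the counter from 3 to 2, so the next loop entry is still predicted taken and only the exit is mispredicted.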

  9. Prediction Accuracy: 12-bit index + 2-bit state

  10. 12-bit index + 2-bit state vs infinite buffer (2-bit state)

  11. More State Bits? • Increasing the number of state bits beyond 2 does not seem to help much. • Increasing the number of state bits too much will cause the predictor to be stuck in an incorrect state for branches that change their tendency to branch during execution • Example outcome sequence (Y = taken, N = not taken): YYYYYYYNNNNNNNNNYYYYYYYYNNNNNN
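The "stuck" behavior can be illustrated by replaying a phase-changing outcome pattern through saturating counters of different widths. This is a sketch under stated assumptions: the counter is a plain n-bit saturating counter starting at 0, and the pattern is built from run lengths (7, 9, 8, 6) chosen to resemble the slide's example rather than copied from it.

```python
def mispredictions(pattern, bits):
    """Count misses of an n-bit saturating counter on a Y/N outcome string."""
    top = (1 << bits) - 1          # highest counter value
    threshold = 1 << (bits - 1)    # at or above this value: predict taken
    state = 0
    misses = 0
    for ch in pattern:
        taken = ch == "Y"
        if (state >= threshold) != taken:
            misses += 1
        state = min(state + 1, top) if taken else max(state - 1, 0)
    return misses


# Hypothetical run lengths, shaped like the slide's example.
pattern = "Y" * 7 + "N" * 9 + "Y" * 8 + "N" * 6
results = {bits: mispredictions(pattern, bits) for bits in (1, 2, 4)}
print(results)  # {1: 4, 2: 8, 4: 16}
```

On this particular pattern the 4-bit counter needs eight consecutive taken outcomes before it predicts taken, so it spends most of each taken phase mispredicting.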

  12. Applying the Prediction • The earliest time we can begin using the prediction is when both the prediction bits and the branch target are available • The earliest time we can know whether we have predicted correctly is when the branch condition is resolved • The difference between these times is roughly what is saved by a correct prediction • If the branch target is available late, the window of savings is reduced
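The timing argument above reduces to a small piece of arithmetic. The sketch below uses made-up cycle numbers purely for illustration; the function name and the pipeline timings are assumptions, not figures from the lecture.

```python
def cycles_saved(prediction_ready, target_ready, condition_resolved):
    """Approximate savings from a correct prediction, in cycles.

    The prediction can only be used once both the prediction bits and the
    branch target are available; it is confirmed when the condition resolves.
    """
    usable_at = max(prediction_ready, target_ready)
    return max(condition_resolved - usable_at, 0)


early_target = cycles_saved(prediction_ready=1, target_ready=1, condition_resolved=3)
late_target = cycles_saved(prediction_ready=1, target_ready=2, condition_resolved=3)
print(early_target, late_target)  # 2 1
```

A target that arrives one cycle later shrinks the window by one cycle, which is the motivation for obtaining the target address early.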

  13. Correlating Predictors • The prediction is a function of the last k branch outcomes • The branch history buffer is indexed by m bits taken from the address of the branch plus k bits of branch history, i.e., m + k bits all told • The branch history buffer has 2^(m+k) entries

  14. Correlating Predictors • The prediction is a function of the last k branch outcomes • The branch history buffer is indexed by m bits taken from the address of the branch plus k bits of branch history, i.e., m + k bits all told • Each entry in the branch history buffer has q bits (i.e., is a q-bit predictor) • The branch history buffer has 2^(m+k) × q bits of storage
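A correlating predictor with these parameters can be sketched as follows. The class name, the use of a single global history register, the concatenation order of address and history bits, and the saturating-counter entries are all assumptions for illustration; the slide only fixes m, k, q and the table size.

```python
class CorrelatingPredictor:
    """Table of 2^(m+k) q-bit saturating counters (illustrative sketch)."""

    def __init__(self, m, k, q=2):
        self.m, self.k, self.q = m, k, q
        self.history = 0                    # last k outcomes, newest in bit 0
        self.top = (1 << q) - 1
        self.threshold = 1 << (q - 1)
        self.table = [0] * (1 << (m + k))

    def storage_bits(self):
        return (1 << (self.m + self.k)) * self.q

    def _index(self, pc):
        addr_bits = pc & ((1 << self.m) - 1)
        return (addr_bits << self.k) | self.history   # m + k index bits

    def predict(self, pc):
        return self.table[self._index(pc)] >= self.threshold

    def update(self, pc, taken):
        i = self._index(pc)
        c = self.table[i]
        self.table[i] = min(c + 1, self.top) if taken else max(c - 1, 0)
        mask = (1 << self.k) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


# A branch that alternates taken / not taken defeats a plain 2-bit counter,
# but its outcome is fully determined by the previous outcomes.
predictor = CorrelatingPredictor(m=4, k=2, q=2)
late_misses = 0
for i in range(40):
    taken = (i % 2 == 0)
    if i >= 20 and predictor.predict(0x10) != taken:
        late_misses += 1
    predictor.update(0x10, taken)

print(predictor.storage_bits(), late_misses)  # 128 0
```

With m = 4, k = 2, q = 2 the table holds 2^6 entries of 2 bits each, 128 bits in total, and after warm-up the alternating branch is predicted without error.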

  15. Correlating predictor with 2 history bits and 2 state bits (2,2)

  16. Comparison of 2-bit predictors

  17. Local versus Global

  18. Hashing Correlation • For the same amount of table storage, we can get better associativity in the case of fewer branches but highly correlated behavior.
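One way to hash address and history into a single index is to XOR them, and the sketch below assumes that scheme; the slide does not spell out the hash function. It compares the XOR index with a concatenated index of the same total width. Function names, addresses, and bit widths are hypothetical.

```python
def concatenated_index(pc, history, m, k):
    """m address bits followed by k history bits (m + k index bits)."""
    return ((pc & ((1 << m) - 1)) << k) | (history & ((1 << k) - 1))


def hashed_index(pc, history, index_bits):
    """XOR of address and history, truncated to index_bits (assumed hash)."""
    return (pc ^ history) & ((1 << index_bits) - 1)


history = 0b101
# Two branches whose addresses differ only above the low 6 bits.
a, b = 0x040, 0x080

same_slot = concatenated_index(a, history, 6, 6) == concatenated_index(b, history, 6, 6)
distinct = hashed_index(a, history, 12) != hashed_index(b, history, 12)
print(same_slot, distinct)  # True True
```

With a 12-bit table, the concatenated index can afford only 6 address bits, so the two branches collide, while the XOR index lets all 12 address bits and up to 12 history bits influence the slot.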

  19. Tournament Predictor • Move “toward” the other predictor when I am wrong and he is right • Stay put when I am right and he is right, or I am wrong and he is wrong.
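The selection rule can be sketched as a small saturating counter that moves only when the two component predictors disagree in correctness. The 2-bit width, the state encoding, and the names `Chooser`, `a_correct`, and `b_correct` are assumptions for the example.

```python
class Chooser:
    """2-bit selector: states 0, 1 favor predictor A; 2, 3 favor predictor B."""

    def __init__(self):
        self.state = 0

    def use_b(self):
        return self.state >= 2

    def update(self, a_correct, b_correct):
        if a_correct == b_correct:
            return  # both right or both wrong: stay put
        if b_correct:
            self.state = min(self.state + 1, 3)  # move toward B
        else:
            self.state = max(self.state - 1, 0)  # move toward A


chooser = Chooser()
chooser.update(a_correct=False, b_correct=True)
chooser.update(a_correct=False, b_correct=True)
switched = chooser.use_b()          # two wins for B flip the selection
chooser.update(a_correct=True, b_correct=True)
still_b = chooser.use_b()           # agreement leaves the state unchanged
print(switched, still_b)            # True True
```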

  20. Tournament predictor local vs global

  21. Local 2-bit vs. Correlating vs. Tournament

  22. Alpha 21264 Branch Predictor • Tournament predictor (4K x 2) chooses between global and local • Global has 4K 2-bit entries indexed by last 12 branch outcomes XORed with address • Local is also a two-level predictor: a 1K x 10 branch history buffer (last 10 outcomes for the indexed branch), indexed by address; the selected 10-bit history is XORed with address to index a table of 3-bit entries
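The table sizes on this slide can be added up to estimate the predictor's total storage. The arithmetic below takes the sizes from the slide, with one assumption labeled in the code: the table of 3-bit entries is taken to have 1K entries because it is indexed by a 10-bit value.

```python
# Sizes in bits, taken from the slide except where noted.
tables = {
    "chooser (4K x 2)": 4096 * 2,
    "global counters (4K x 2)": 4096 * 2,
    "local history (1K x 10)": 1024 * 10,
    # Assumption: a 10-bit index implies 1K entries of 3 bits each.
    "local counters (1K x 3, assumed)": 1024 * 3,
}

total_bits = sum(tables.values())
for name, bits in tables.items():
    print(f"{name}: {bits} bits")
print(f"total: {total_bits} bits ({total_bits // 1024} Kbit)")  # 29696 bits (29 Kbit)
```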

  23. Alpha 21264 Predictor

  24. Branch Target Buffer • Contains an entry for each branch that is predicted taken • Indexed by PC of (potential) branch • If the PC is not in the table, it is taken to mean either not a branch or not predicted taken; in either case, continue fetching from PC + k • BTB gets us the branch target address early
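The lookup behavior can be sketched with a dictionary keyed by PC. This is a simplified model under stated assumptions: an unbounded table, an entry removed as soon as its branch is not taken, and a fall-through step of 4 standing in for the slide's k. The class and method names are hypothetical.

```python
class BranchTargetBuffer:
    """Maps the PC of a predicted-taken branch to its target address."""

    def __init__(self, fallthrough_step=4):
        self.entries = {}
        self.step = fallthrough_step   # the slide's "k" (assumed to be 4 here)

    def next_fetch(self, pc):
        # Hit: fetch from the stored target.
        # Miss: not a branch, or not predicted taken; fetch from PC + k.
        return self.entries.get(pc, pc + self.step)

    def update(self, pc, taken, target):
        if taken:
            self.entries[pc] = target
        else:
            self.entries.pop(pc, None)  # assumed policy: drop on not taken


btb = BranchTargetBuffer()
before = btb.next_fetch(0x100)        # miss: 0x104
btb.update(0x100, taken=True, target=0x80)
after = btb.next_fetch(0x100)         # hit: 0x80
print(hex(before), hex(after))        # 0x104 0x80
```

Because the lookup uses only the fetch PC, the target is available before the instruction has been decoded as a branch.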

  25. Branch Target Buffer

  26. BTB Handling State Chart

  27. Questions Concerning BTBs • Can BTB be combined with branch prediction machinery introduced earlier in this lecture? How? • What kind of branches can a BTB accelerate that are out of the reach of ordinary branch predictors?
