1 / 24

Computer Architecture Processing of control transfer instructions, part II

Computer Architecture Processing of control transfer instructions, part II. Ola Flygt Växjö University http://w3.msi.vxu.se/users/ofl/ Ola.Flygt@msi.vxu.se +46 470 70 86 49. Outline. 8.4.4 Extent of speculativeness Recocery from misprediction 8.4.5 Branch penalty

vala
Download Presentation

Computer Architecture Processing of control transfer instructions, part II

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Computer ArchitectureProcessing of control transfer instructions, part II Ola Flygt Växjö University http://w3.msi.vxu.se/users/ofl/ Ola.Flygt@msi.vxu.se +46 470 70 86 49

  2. Outline • 8.4.4 • Extent of speculativeness • Recocery from misprediction • 8.4.5 Branch penalty • 8.5 Multiway branching • 8.6 Guarded Execution CH01

  3. Extent of speculative processing

  4. Extent of speculative processing

  5. Recovery from a misprediction: Basic Tasks

  6. Necessary activities to allow or to shorten recovery from a misprediction

  7. Frequently employed schemes for shortening recovery from a misprediction

  8. shortening recovery from a misprediction: needs

  9. Using two instruction buffers in the supersparc to shorten recovery from a misprediction:

  10. Using three instruction buffers in the Nx586 to shorten recovery from a misprediction:

  11. 8.4.5 Branch penalty for taken guesses depends onbranch target accessing schemes

  12. Compute/fetch scheme for accessing branch targets {IFAR vs. PC}

  13. BTAC scheme for accessing branch targets {associative search for BA, if found get BTA} {0-cycle branch: BA=BA-4}

  14. BTIC scheme: store next BTA

  15. BTIC scheme: calculate next BTA

  16. Successor index in the I-cache scheme to access the branch target path {index: next I, or target I}

  17. Successor index in the I-cache scheme: e.g. The microachitecture of the UltraSparc

  18. Predecode unit: detects branches, BTA, make predictions (based on compiler’s hint bit), set up I-cache Next address

  19. Branch target accessing trends

  20. 8.5 Multiway branching: {two IFA’s or PC’s}

  21. Threefold multiway branching: only one correct path!

  22. 8.6 Guarded Execution • a means to eliminate branches • by conditional operate instructions • IF the condition associated with the instruction is met, • THEN perform the specified operation • ELSE do not perform the operation • e.g. original • beg r1, label // if (r1) = 0 branch to label • move r2, r3 // move (r2) into r3 • label: … • e.g. guarded • cmovne r1, r2, r3 // if (r1) != 0, move (r2) into r3 • … • Convert control dependencies into data dependencies

  23. Eliminated branches by full and restricted guarding {full: all instruction guarded, restricted: ALU inst guarded}

  24. Guarded Execution: Disadvantages • guarding transforms instructions from both the taken and the not-taken paths into guard instruction • increase number of instructions • by 33% for full guarding • by 8% for restricted guarding • {more instructions more time and space} • guarding requires additional hardware resourcesif an increase in processing time is to be avoided • VLIW

More Related