EE8365/CS8203 ADVANCED COMPUTER ARCHITECTURE A Survey on BRANCH PREDICTION METHODOLOGY By, Baris Mustafa Kazar Resit Sendag
Outlines • Introduction • Static Branch Prediction Schemes: A Brief View • Dynamic Branch Prediction • One-Bit Scheme: BHT • To provide target instructions quickly: BTBs • Bimodal Branch Prediction (Two-Bit Prediction Scheme) • Two-level Branch Prediction • Global History Schemes: GAg, GAs, GAp • Per-Address History Schemes: PAg, PAs, PAp • Per-Set Address Schemes: SAg, SAs, SAp • Correlation Branch Prediction • More on Global Branch Prediction: Gselect and Gshare • Hybrid Branch Predictors
Outlines (cont.) • Hybrid Branch Predictors • Where did the idea come from? • Branch Classification • An alternative Selection Mechanism • Reducing PHT Interference • Bi-Mode Branch Predictor • The Agree Predictor • The Skewed Branch Predictor • The YAGS Branch Prediction Scheme • Concluding Remarks
Introduction • Pipeline flushes due to branch mispredictions are one of the most serious problems facing the designer of a deeply pipelined, superscalar processor. Many branch predictors have been proposed to alleviate this problem. • Compile-time schemes (static) • Focus here is on hardware-based prediction schemes (dynamic)
Static Branch Prediction Schemes: A Brief View • Prediction is fixed before execution, based on expected program behavior (i.e. branch direction) • Profile information may be collected to guide it • Some schemes: • Always Not Taken, • Always Taken, • Opcode-Based, • Backward Taken and Forward Not Taken.
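As an illustration of the last scheme in the list, here is a minimal sketch of Backward Taken, Forward Not Taken. The function name and the example addresses are hypothetical; the only idea taken from the slide is that the prediction depends on the direction of the jump.

```python
def predict_btfn(branch_pc: int, target_pc: int) -> bool:
    """Backward Taken, Forward Not Taken: predict taken iff the target precedes the branch."""
    return target_pc < branch_pc


# A loop-closing branch jumps backward, so it is predicted taken.
loop_branch = predict_btfn(branch_pc=0x4010, target_pc=0x4000)

# A forward skip (for example over an error path) is predicted not taken.
forward_branch = predict_btfn(branch_pc=0x4010, target_pc=0x4040)
```

The rule needs no run-time state at all, which is why it counts as a static scheme.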
Dynamic Prediction Schemes (DPS) • One-Bit Scheme • Branch Prediction Buffer or Branch History Table (BHT) • Indexed by the lower bits of the branch instruction address • Each entry holds one prediction bit: the last outcome of the branch • To provide target instructions quickly: BTBs • Lee and Smith (1984) • Special instruction cache designed to store the target instructions
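The one-bit scheme can be sketched as follows. The class name, the table size, the 4-byte instruction alignment, and the initial "not taken" state are all assumptions made for this example.

```python
class OneBitBHT:
    """One prediction bit per entry, indexed by the low bits of the branch address."""

    def __init__(self, index_bits: int = 10):
        self.mask = (1 << index_bits) - 1
        self.table = [False] * (1 << index_bits)  # assumption: start as "not taken"

    def _index(self, pc: int) -> int:
        return (pc >> 2) & self.mask  # assumption: 4-byte aligned instructions

    def predict(self, pc: int) -> bool:
        return self.table[self._index(pc)]

    def update(self, pc: int, taken: bool) -> None:
        self.table[self._index(pc)] = taken  # remember only the last outcome


def count_mispredictions(predictor, pc, outcomes):
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# A loop branch: taken three times, then not taken at loop exit; the loop runs twice.
outcomes = [True, True, True, False] * 2
misses = count_mispredictions(OneBitBHT(), 0x4000, outcomes)
```

With a single bit, the loop branch is mispredicted twice per loop execution: once at the exit and once again on re-entry.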
DPS (cont.) • Two-Bit Prediction Scheme • Bimodal (usually taken or usually not taken) • 2-bit saturating counter (Smith, 1981) • Two-level Branch Predictors • Yeh and Patt, 1991 • Correlating predictors • The first level is the history of the last k branches encountered. • The second level is the branch behavior for the last j occurrences of that specific pattern of the last k branches. • Branch History Register and Pattern History Table (PHT) • Run-time collection of the history • Performs better than the schemes given previously (up to 97% prediction accuracy)
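A minimal sketch of the two-bit (bimodal) scheme, under the assumption that counter values 0 and 1 mean "predict not taken", 2 and 3 mean "predict taken", and every counter starts at 1. The class name and table size are made up for the example.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters indexed by the low bits of the branch address."""

    def __init__(self, index_bits: int = 10):
        self.mask = (1 << index_bits) - 1
        self.counters = [1] * (1 << index_bits)  # assumption: start weakly not taken

    def _index(self, pc: int) -> int:
        return (pc >> 2) & self.mask  # assumption: 4-byte aligned instructions

    def predict(self, pc: int) -> bool:
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)  # saturate at 3
        else:
            self.counters[i] = max(0, self.counters[i] - 1)  # saturate at 0


# Same loop branch as in a one-bit experiment: taken three times, then not taken, twice.
predictor = BimodalPredictor()
misses = 0
for taken in [True, True, True, False] * 2:
    if predictor.predict(0x4000) != taken:
        misses += 1
    predictor.update(0x4000, taken)
```

Because a single loop exit only moves the counter from 3 to 2, the branch is still predicted taken on re-entry, so only the exit itself is mispredicted once the counter has warmed up.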
Two-Level Branch Predictors (cont.) • Two-level predictors are classified into 3 classes (Yeh and Patt, 1993) • Global History Schemes (GAg, GAs, GAp) • The first-level branch history is the actual last k branches. • Only one global history register (GHR) • Updated with the results from all branches
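A sketch of the simplest global scheme, GAg: one global history register selects an entry in one global pattern history table. The class name, the 4-bit history length, and the initial counter value are assumptions for the example.

```python
class GAgPredictor:
    """One global history register (first level) indexing one global PHT (second level)."""

    def __init__(self, history_bits: int = 4):
        self.history_mask = (1 << history_bits) - 1
        self.history = 0                       # outcomes of the last k branches
        self.pht = [1] * (1 << history_bits)   # 2-bit counters, one per history pattern

    def predict(self) -> bool:
        return self.pht[self.history] >= 2

    def update(self, taken: bool) -> None:
        c = self.pht[self.history]
        self.pht[self.history] = min(3, c + 1) if taken else max(0, c - 1)
        # Shift the new outcome into the global history register.
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


def run(predictor, outcomes):
    misses = 0
    for taken in outcomes:
        misses += predictor.predict() != taken
        predictor.update(taken)
    return misses


pattern = [True, True, True, False]   # a loop branch with a fixed trip count
predictor = GAgPredictor(history_bits=4)
run(predictor, pattern * 10)          # warm-up
steady_state_misses = run(predictor, pattern * 10)
```

Once the pattern table is trained, the history "taken, taken, taken" uniquely identifies the loop exit, so even the exit is predicted correctly. A per-branch 2-bit counter cannot do this.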
Two-Level Branch Predictors (cont.) • Per-address History Schemes • Local Branch Prediction • The first-level history refers to the last k occurrences of the same branch instruction • The branch prediction is independent of other branches' execution history
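A sketch of a per-address scheme in the PAg style: each branch has its own history register, and all registers index one shared pattern table. Table sizes, the class name, and the two example branch addresses are assumptions.

```python
class PAgPredictor:
    """Per-address history registers (first level) indexing one shared PHT (second level)."""

    def __init__(self, history_bits: int = 2, bht_index_bits: int = 6):
        self.history_mask = (1 << history_bits) - 1
        self.bht_mask = (1 << bht_index_bits) - 1
        self.bht = [0] * (1 << bht_index_bits)   # one history register per table slot
        self.pht = [1] * (1 << history_bits)     # shared 2-bit counters

    def _slot(self, pc: int) -> int:
        return (pc >> 2) & self.bht_mask  # assumption: 4-byte aligned instructions

    def predict(self, pc: int) -> bool:
        return self.pht[self.bht[self._slot(pc)]] >= 2

    def update(self, pc: int, taken: bool) -> None:
        slot = self._slot(pc)
        history = self.bht[slot]
        c = self.pht[history]
        self.pht[history] = min(3, c + 1) if taken else max(0, c - 1)
        # Only this branch's own history register is updated.
        self.bht[slot] = ((history << 1) | int(taken)) & self.history_mask


# Branch A alternates taken / not taken; branch B is always taken.
predictor = PAgPredictor()
misses = 0
for step in range(40):
    for pc, taken in ((0x4000, step % 2 == 0), (0x4008, True)):
        if step >= 20 and predictor.predict(pc) != taken:  # skip the warm-up phase
            misses += 1
        predictor.update(pc, taken)
```

Branch A's alternation is learned from its own history alone; the outcomes of branch B never enter A's history register.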
Two-Level Branch Predictors (cont.) • Per-set History Schemes • The first-level history means the last k occurrences of the branch instructions from the same subset. • The set attribute • The prediction is influenced by the other branches in the same set
DPS (cont.) • Comparison results for Two-level Branch Prediction Classes • Comparison is based on performance and cost effectiveness • Global history schemes perform better on integer programs • Per-address history schemes perform better on floating-point programs • PAs is the most cost effective among low-cost schemes • Correlation Branch Prediction • Pan, So, and Rahmeh, 1992 • GAp and GAs
DPS (cont.) • More on Global Branch Prediction • Local Branch Prediction: history of each branch independently • Global Branch Prediction: combined history of all recent branches • Gselect: Global Predictor with Index Selection, Pan, So, and Rahmeh, 1992. • PHT is indexed by the concatenation of global history and branch address bits • Performs better than either bimodal or global prediction • Gshare: Global Predictor with Index Sharing, McFarling, 1993. • PHT is indexed by the XOR of global history and branch address
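The two index functions can be sketched side by side. Function names, bit widths, and the example address and history values are assumptions; both functions below produce an 8-bit index.

```python
def gselect_index(pc: int, history: int, pc_bits: int, history_bits: int) -> int:
    """Concatenate low branch-address bits with low global-history bits."""
    pc_part = (pc >> 2) & ((1 << pc_bits) - 1)       # assumption: 4-byte aligned instructions
    hist_part = history & ((1 << history_bits) - 1)
    return (pc_part << history_bits) | hist_part


def gshare_index(pc: int, history: int, index_bits: int) -> int:
    """XOR branch-address bits with global-history bits."""
    mask = (1 << index_bits) - 1
    return ((pc >> 2) ^ history) & mask


pc = 0x40A4
history_a = 0b1011_0110
history_b = 0b0011_0110   # differs from history_a only in the oldest bit

select_a = gselect_index(pc, history_a, pc_bits=4, history_bits=4)
select_b = gselect_index(pc, history_b, pc_bits=4, history_bits=4)
share_a = gshare_index(pc, history_a, index_bits=8)
share_b = gshare_index(pc, history_b, index_bits=8)
```

For the same table size, gselect can only use four history bits here, so the two histories map to the same entry, while gshare folds all eight history bits into the index and keeps them apart.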
Hybrid Predictors: Combining Branch Predictors • McFarling, 1993 • Different prediction schemes have different advantages • Combined Predictor: Bimodal and Gshare • A 2-bit counter is used to select one of the predictors • Reported to always perform better than either predictor alone • 98.1% vs 97.1% • The idea of combining predictors was introduced here for the first time.
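A sketch of a combined bimodal/gshare predictor with a 2-bit selector per entry. The class and helper names, table sizes, and initial counter values are assumptions; the chooser is moved only when exactly one component was right, which is the usual way such a selector is trained.

```python
def bump(counter: int, up: bool) -> int:
    """Move a 2-bit saturating counter up or down."""
    return min(3, counter + 1) if up else max(0, counter - 1)


class CombinedPredictor:
    def __init__(self, index_bits: int = 10):
        size = 1 << index_bits
        self.mask = size - 1
        self.history = 0
        self.bimodal = [1] * size
        self.gshare = [1] * size
        self.chooser = [1] * size   # >= 2 means "trust gshare" (assumed encoding)

    def _indices(self, pc: int):
        base = (pc >> 2) & self.mask
        return base, (base ^ self.history) & self.mask

    def predict(self, pc: int) -> bool:
        base, shared = self._indices(pc)
        if self.chooser[base] >= 2:
            return self.gshare[shared] >= 2
        return self.bimodal[base] >= 2

    def update(self, pc: int, taken: bool) -> None:
        base, shared = self._indices(pc)
        bimodal_ok = (self.bimodal[base] >= 2) == taken
        gshare_ok = (self.gshare[shared] >= 2) == taken
        if bimodal_ok != gshare_ok:          # train the selector only on disagreement
            self.chooser[base] = bump(self.chooser[base], gshare_ok)
        self.bimodal[base] = bump(self.bimodal[base], taken)
        self.gshare[shared] = bump(self.gshare[shared], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask


# An alternating branch defeats the bimodal component but not gshare.
predictor = CombinedPredictor()
misses = 0
for step in range(200):
    taken = step % 2 == 0
    if step >= 100 and predictor.predict(0x4000) != taken:  # skip the warm-up phase
        misses += 1
    predictor.update(0x4000, taken)
```

After warm-up the selector for this branch saturates toward gshare, so the combined predictor follows whichever component is currently the better one for that branch.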
Hybrid Predictors : Branch Classification • Chang, Hao, Yeh, and Patt, 1994 • Partitions a program’s branches into sets or branch classes • Classes are based on run-time and compile-time info. • Associates each branch class with the most suitable predictor • 2-bit counter is used to select the branch predictor
An Alternative Selection Mechanism • Single-scheme predictors and selection mechanisms • 2-level Selector, Chang, Hao, and Patt, 1995 • The concept of Two-level Branch Prediction is applied to the selector itself. • The 2-level BPS (Branch Predictor Selector) is shown to perform better than McFarling's 2-bit counter BPS mechanism
Reducing Pattern History Table Interference • The main problem that reduces the prediction rate in the global schemes is aliasing. • Neutral aliasing: no mispredictions • Destructive aliasing: mispredictions
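The difference between the two kinds of aliasing can be shown with a single 2-bit counter shared by two branches. The function name and the outcome streams are invented for the example.

```python
def run_shared_counter(outcome_pairs):
    """Two branches alias to one 2-bit counter; return the total mispredictions."""
    counter = 1   # assumption: start weakly not taken
    misses = 0
    for pair in outcome_pairs:
        for taken in pair:   # first branch, then second branch, on the same counter
            misses += (counter >= 2) != taken
            counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return misses


# Neutral aliasing: both branches are always taken, so sharing does no harm.
neutral = run_shared_counter([(True, True)] * 50)

# Destructive aliasing: one branch is always taken, the other never is.
destructive = run_shared_counter([(True, False)] * 50)
```

In the destructive case each branch keeps undoing the other's training, so every single prediction is wrong, even though each branch on its own is perfectly predictable.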
Reducing PHT Interference: The Agree Predictor • Sprangle, Chappell, Alsup, Patt, 1997 • Assigns a biasing bit to each branch in the BTB • PHT entries are interpreted relative to the bias bit: they record whether the branch agrees with its bias, not whether it is taken. • Relies on the highly biased behavior of a branch being seen the first time the branch is entered into the BTB. • Turns destructive aliasing into neutral aliasing
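A sketch of the agree idea, with a Python dict standing in for the bias bit stored in each BTB entry. The class name, table sizes, initial counter values, and the default prediction for a branch not yet seen are assumptions.

```python
def bump(counter: int, up: bool) -> int:
    return min(3, counter + 1) if up else max(0, counter - 1)


class AgreePredictor:
    def __init__(self, index_bits: int = 2):
        size = 1 << index_bits
        self.mask = size - 1
        self.history = 0
        self.pht = [2] * size   # counters mean "agrees with bias"; assumed to start weakly agreeing
        self.bias = {}          # stand-in for the per-branch bias bit in the BTB

    def _index(self, pc: int) -> int:
        return ((pc >> 2) ^ self.history) & self.mask   # gshare-style index

    def predict(self, pc: int) -> bool:
        bias = self.bias.get(pc, False)   # assumption: unseen branches default to not taken
        agree = self.pht[self._index(pc)] >= 2
        return bias if agree else not bias

    def update(self, pc: int, taken: bool) -> None:
        bias = self.bias.setdefault(pc, taken)   # bias bit is set from the first outcome
        i = self._index(pc)
        self.pht[i] = bump(self.pht[i], taken == bias)
        self.history = ((self.history << 1) | int(taken)) & self.mask


# Two branches with opposite behavior share a deliberately tiny 4-entry PHT.
predictor = AgreePredictor(index_bits=2)
misses = 0
for _ in range(50):
    for pc, taken in ((0x4000, True), (0x4004, False)):
        misses += predictor.predict(pc) != taken
        predictor.update(pc, taken)
```

Both branches push the shared counters toward "agree", so their collisions in the PHT are neutral; the only misprediction is the first encounter of the taken branch, before its bias bit exists.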
Reducing PHT Interference: Bi-Mode Predictors • Lee, Chen, Mudge, 1997 • Tries to replace destructive aliasing with neutral aliasing • Splits the PHT into equal parts • Choice PHT • Direction PHTs: Taken and Not Taken • XORed (gshare-style) indexing of the direction PHTs
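A sketch of the bi-mode organization: a choice PHT indexed by branch address picks one of two direction PHTs, both indexed by address XOR history. Names, sizes, and initial values are assumptions, and the update policy shown (only the selected direction PHT is trained; the choice PHT is left alone when it disagreed with the outcome but the final prediction was still right) is a simplified reading of the scheme.

```python
def bump(counter: int, up: bool) -> int:
    return min(3, counter + 1) if up else max(0, counter - 1)


class BiModePredictor:
    def __init__(self, index_bits: int = 4):
        size = 1 << index_bits
        self.mask = size - 1
        self.history = 0
        self.choice = [1] * size   # indexed by branch address only
        # Direction PHTs: the "taken" bank starts weakly taken, the other weakly not taken.
        self.direction = {True: [2] * size, False: [1] * size}

    def _lookup(self, pc: int):
        base = (pc >> 2) & self.mask
        shared = (base ^ self.history) & self.mask
        bank = self.choice[base] >= 2   # which direction PHT to consult
        return base, shared, bank

    def predict(self, pc: int) -> bool:
        _, shared, bank = self._lookup(pc)
        return self.direction[bank][shared] >= 2

    def update(self, pc: int, taken: bool) -> None:
        base, shared, bank = self._lookup(pc)
        final = self.direction[bank][shared] >= 2
        # Keep the choice counter when it was "wrong" but the selected bank got it right.
        if not (bank != taken and final == taken):
            self.choice[base] = bump(self.choice[base], taken)
        self.direction[bank][shared] = bump(self.direction[bank][shared], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask


# One always-taken and one never-taken branch, interleaved.
predictor = BiModePredictor()
misses = 0
for step in range(50):
    for pc, taken in ((0x4000, True), (0x4004, False)):
        if step >= 10 and predictor.predict(pc) != taken:  # skip the warm-up phase
            misses += 1
        predictor.update(pc, taken)
```

After warm-up the two branches are steered into different direction PHTs, so any index collision happens between branches of the same bias and is therefore neutral.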
Reducing PHT Interference: The Skewed Branch Predictor • Michaud, Seznec, Uhlig, 1997 • Lack of Associativity in PHT • conflict aliasing, rather than capacity • set associative PHT? (tags, etc) • Skewing Function • splits PHT into 3 banks • uses unique hashing function per bank • majority vote • partial updating of the banks
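A sketch of the three-bank organization with majority vote and partial update. The three index hashes below are ad hoc stand-ins chosen for the example and are NOT the skewing functions from the paper; the class name, sizes, and initial values are likewise assumptions.

```python
def bump(counter: int, up: bool) -> int:
    return min(3, counter + 1) if up else max(0, counter - 1)


class SkewedPredictor:
    def __init__(self, index_bits: int = 6):
        size = 1 << index_bits
        self.mask = size - 1
        self.history = 0
        self.banks = [[1] * size for _ in range(3)]   # three banks of 2-bit counters

    def _indices(self, pc: int):
        a, h = pc >> 2, self.history
        # Assumption: simple, different hashes per bank (placeholders for real skewing functions).
        return ((a ^ h) & self.mask,
                (a + 3 * h) & self.mask,
                ((a >> 1) ^ (5 * h)) & self.mask)

    def _votes(self, pc: int):
        return [bank[i] >= 2 for bank, i in zip(self.banks, self._indices(pc))]

    def predict(self, pc: int) -> bool:
        return sum(self._votes(pc)) >= 2   # majority of the three banks

    def update(self, pc: int, taken: bool) -> None:
        votes = self._votes(pc)
        correct = (sum(votes) >= 2) == taken
        for bank, i, vote in zip(self.banks, self._indices(pc), votes):
            # Partial update: after a correct prediction, a dissenting bank is left
            # untouched, since its entry probably belongs to another branch.
            if correct and vote != taken:
                continue
            bank[i] = bump(bank[i], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask


pattern = [True, True, True, False]
predictor = SkewedPredictor()
misses = 0
for n, taken in enumerate(pattern * 20):
    if n >= 40 and predictor.predict(0x4000) != taken:  # skip the warm-up phase
        misses += 1
    predictor.update(0x4000, taken)
```

Because each bank uses a different hash, two branches that collide in one bank are unlikely to collide in the other two, and the majority vote masks the single conflicting bank without the cost of tags.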
Reducing PHT Interference: The YAGS Branch Predictor • Eden, Mudge, 1998 • Yet Another Global Scheme (YAGS) • Combines the strong points of several previous schemes • Introduces tags into the PHT that allow it to be reduced without sacrificing key branch outcome information. The size reduction more than offsets the cost of the tags. • Gives better prediction accuracy on the SPEC95 benchmark suite than several leading prediction schemes, for the same cost.
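A sketch of the tagged organization: a choice PHT gives each branch's bias, and two small tagged caches hold only the cases where a branch deviates from that bias. Names, sizes, tag width, initial values, and the simplified allocation and update policy are assumptions for the example.

```python
def bump(counter: int, up: bool) -> int:
    return min(3, counter + 1) if up else max(0, counter - 1)


class YagsPredictor:
    def __init__(self, index_bits: int = 4, tag_bits: int = 6):
        size = 1 << index_bits
        self.mask = size - 1
        self.tag_mask = (1 << tag_bits) - 1
        self.history = 0
        self.choice = [1] * size   # per-branch bias, indexed by address
        # caches[d] holds exceptions whose direction is d; an entry is [tag, counter] or None.
        self.caches = {True: [None] * size, False: [None] * size}

    def _lookup(self, pc: int):
        addr = pc >> 2
        base = addr & self.mask
        bias = self.choice[base] >= 2
        shared = (addr ^ self.history) & self.mask
        tag = addr & self.tag_mask
        entry = self.caches[not bias][shared]   # consult the cache of exceptions to the bias
        hit = entry is not None and entry[0] == tag
        return base, bias, shared, tag, hit

    def predict(self, pc: int) -> bool:
        _, bias, shared, _, hit = self._lookup(pc)
        return self.caches[not bias][shared][1] >= 2 if hit else bias

    def update(self, pc: int, taken: bool) -> None:
        base, bias, shared, tag, hit = self._lookup(pc)
        cache = self.caches[not bias]
        final = cache[shared][1] >= 2 if hit else bias
        if hit:
            cache[shared][1] = bump(cache[shared][1], taken)
        elif bias != taken:
            cache[shared] = [tag, 2 if taken else 1]   # record a new exception
        # Keep the bias when it was "wrong" but the exception cache got it right.
        if not (bias != taken and final == taken):
            self.choice[base] = bump(self.choice[base], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask


pattern = [True, True, True, False]   # mostly taken, with a regular exception
predictor = YagsPredictor()
misses = 0
for n, taken in enumerate(pattern * 10):
    if n >= 20 and predictor.predict(0x4000) != taken:  # skip the warm-up phase
        misses += 1
    predictor.update(0x4000, taken)
```

Only the loop exit needs a cache entry; the three taken iterations are covered by the bias in the choice PHT, which is why the direction storage can be made much smaller than a full PHT.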
Conclusions • The Branch Prediction Methodology is studied. • 2-level Branch Prediction was the most important step on the topic. • Hybrid predictors, which combine the advantages of the single predictors, are the most effective ones in branch prediction. • The selection of the predictors in a hybrid predictor requires a good study of branch behavior and depends to a great extent upon the programs. • Branch classification could be a promising method for hybrid predictors. • Using the 2-level BPS gives better performance than the 2-bit counter BPS
Conclusions (cont.) • The Bi-Mode and Agree predictors, which separate the PHT contents into two branch streams, do a good job of reducing aliasing in global schemes. • The YAGS scheme further reduces aliasing by combining the strong points of previous schemes