TAGE-SC-L Branch Predictors
André Seznec, INRIA/IRISA
The TAGE-SC-L branch predictor
Sorry, nothing really new...
• TAGE, JILP 2006
• Considered as the state-of-the-art global history predictor
• Can be augmented with small adjunct predictors:
  • Loop predictor: CBP-2 (2006)
  • Statistical Corrector + Loop Predictor, global history: CBP-3 (2011)
  • Local history: MICRO 2011
Optimized all parameters
• Number, size, width of the tables
• Types of the histories for the statistical components
All that to decrease the number of mispredictions by 3%!!
[Figure: TAGE-SC-L block diagram. Labels: (Main) TAGE Predictor, fed with PC + global history; Prediction + Confidence; Stat. Cor., fed with global, local, and skeleton histories; Loop Predictor]
TAGE: multiple tables, global history predictor
• The set of history lengths forms a geometric series: {0, 2, 4, 8, 16, 32, 64, 128}
• Captures correlation on very long histories
• Most of the storage goes to short histories!!
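The geometric series of history lengths can be sketched as below. This is a minimal illustration, not the championship code: the function name and its parameters (`num_tagged`, `l_min`, `l_max`) are hypothetical, and the rounding to the nearest integer is an assumption about how non-integer lengths would be handled.

```python
def geometric_history_lengths(num_tagged, l_min, l_max):
    """Return history lengths: 0 for the base table, then a geometric series.

    num_tagged: number of tagged tables (hypothetical parameter name)
    l_min, l_max: shortest and longest non-zero history lengths
    """
    if num_tagged == 1:
        return [0, l_min]
    # Common ratio chosen so that the last table uses l_max bits of history.
    ratio = (l_max / l_min) ** (1.0 / (num_tagged - 1))
    lengths = [0]
    for i in range(num_tagged):
        # Round to the nearest integer (assumed rounding rule).
        lengths.append(int(l_min * ratio ** i + 0.5))
    return lengths


lengths = geometric_history_lengths(7, 2, 128)
```

With 7 tagged tables between 2 and 128 bits, the ratio is 2 and the result matches the set shown on the slide.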
TAGE: tagged tables, prediction by the longest-history matching entry
[Figure: a tagless base predictor indexed with pc, and tagged tables indexed with pc and h[0:L1], h[0:L2], h[0:L3]; each tagged entry holds ctr, tag, u; tag comparisons (=?) select the prediction]
[Figure: tag-match example (Miss, Hit, Hit) showing the Pred and Altpred outputs]
Prediction computation
• General case:
  • The longest matching component provides the prediction
• Special case:
  • Many mispredictions on newly allocated entries (weak Ctr)
  • On many applications, Altpred is more accurate than Pred
  • Property dynamically monitored through 4-bit counters
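The general and special cases above can be sketched as follows. This is a simplified model under stated assumptions: `Entry`, `is_weak`, `tage_predict`, and `use_alt_on_na` are illustrative names, the 3-bit counter is modelled as a signed value in [-4, 3], and a single signed 4-bit monitor counter is assumed (non-negative meaning "prefer Altpred on weak entries").

```python
from dataclasses import dataclass


@dataclass
class Entry:
    tag: int = 0
    ctr: int = 0  # 3-bit signed prediction counter in [-4, 3]; >= 0 means taken
    u: int = 0    # 2-bit useful counter


def is_weak(ctr):
    """A counter is weak when it sits next to the taken/not-taken boundary."""
    return ctr in (-1, 0)


def tage_predict(base_taken, hits, use_alt_on_na):
    """Compute the final prediction.

    base_taken: prediction of the tagless base predictor
    hits: matching tagged entries, ordered from shortest to longest history
    use_alt_on_na: assumed 4-bit signed monitor counter in [-8, 7]
    Returns (final, pred, altpred).
    """
    if not hits:
        return base_taken, base_taken, base_taken
    provider = hits[-1]  # longest matching history
    pred = provider.ctr >= 0
    # Altpred is the next-longest match, or the base predictor if there is none.
    altpred = hits[-2].ctr >= 0 if len(hits) > 1 else base_taken
    if is_weak(provider.ctr) and use_alt_on_na >= 0:
        return altpred, pred, altpred
    return pred, pred, altpred
```

In this sketch the monitor counter would be trained elsewhere, for example by moving it up when Altpred was right and Pred was wrong on a weak provider entry.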
A tagged table entry: Tag | U | Ctr
• Ctr: 3-bit prediction counter
• U: 2-bit counter
  • Was the entry recently useful?
• Tag: partial tag
Allocate entries on mispredictions
• Allocate entries in tables with longer history lengths
  • In tables where U is unset
  • Set Ctr to Weak and U to 0
• Limited storage budget:
  • Allocate 2 entries for 256 Kbits
  • Allocate 1 or 2 for 32 Kbits
• Unlimited storage budget:
  • Multiple entries allocated in different tables
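A minimal sketch of the allocation policy above, assuming hypothetical names (`Entry`, `allocate_on_misprediction`, `max_alloc`) and a plain left-to-right scan of the longer-history tables. A real implementation may randomize the starting table or skip tables between allocations; that is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    tag: int = 0
    ctr: int = 0  # signed counter; 0 = weak taken, -1 = weak not-taken
    u: int = 0    # useful counter; 0 means "unset"


def allocate_on_misprediction(tables, provider, indices, tags, taken, max_alloc=1):
    """Allocate up to max_alloc entries in tables with longer history.

    tables: tagged tables ordered by increasing history length
    provider: index of the table that provided the prediction (-1 for base)
    indices, tags: per-table index and partial tag for the current branch
    taken: resolved branch outcome
    Returns the list of table numbers where an entry was allocated.
    """
    allocated = []
    for t in range(provider + 1, len(tables)):
        entry = tables[t][indices[t]]
        if entry.u == 0:  # only replace entries not marked useful
            entry.tag = tags[t]
            entry.ctr = 0 if taken else -1  # weak counter in the outcome direction
            entry.u = 0
            allocated.append(t)
            if len(allocated) == max_alloc:
                break
    return allocated
```

Here `max_alloc=2` would correspond to the 256 Kbits setting on the slide, and a large value to the unlimited-budget case.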
Managing the (U)seful counter
• Increment when the entry avoids a misprediction
  • (Pred = taken) & (Alt ≠ taken)
• 256K: global decrement if allocation is « difficult »
• 32K: probabilistic decrement on conflict
• Unlimited: don’t care
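The useful-counter rules can be illustrated with the sketch below. It reads "taken" in the increment condition as the resolved branch outcome, which is an interpretation of the slide. The `tick` monitor, its threshold `tick_max`, and all function names are assumptions made for illustration; the slide only says that a global decrement happens when allocation is « difficult ».

```python
U_MAX = 3  # a 2-bit useful counter saturates at 3


def update_useful(u, pred, altpred, outcome):
    """Increment u when the provider was right and the alternate was wrong."""
    if pred == outcome and altpred != outcome:
        return min(u + 1, U_MAX)
    return u


def track_allocation(tick, failures, successes, tick_max=16):
    """Hypothetical difficulty monitor.

    Counts failed allocations against successful ones and signals a global
    decrement when the balance reaches tick_max (assumed threshold).
    Returns (new_tick, decrement_now).
    """
    tick = max(0, tick + failures - successes)
    if tick >= tick_max:
        return 0, True
    return tick, False


def global_decrement(useful_counters):
    """Age every useful counter by one, saturating at 0."""
    return [max(0, u - 1) for u in useful_counters]
```

The probabilistic decrement mentioned for the 32K configuration is not modelled here.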
Adjunct predictors
• TAGE tracks strong correlation with the global branch history
• Small adjunct predictors to capture some missed correlation:
  • Loop predictor
  • Statistical Corrector
The loop predictor
• Predicts loops with a constant number of iterations:
  • 16/32 entries
  • Less than 5 bytes per entry
• Captures loops with long bodies and/or irregular internal branches
S: 1.2 %, M: 1 %, U: 0.4 %
Good tradeoff for the Championship
Implementation: not that great
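A single loop-predictor entry might behave like the sketch below. It is a simplified model, assuming the loop branch is taken on every iteration except the exit and that the prediction is trusted only after the same trip count has been observed `CONF_MAX` times in a row. The field names and the confidence threshold are illustrative, and tags, ageing, and replacement are left out.

```python
from dataclasses import dataclass

CONF_MAX = 3  # assumed number of confirmations before trusting the entry


@dataclass
class LoopEntry:
    trip_count: int = 0  # learned number of taken outcomes before the exit
    current: int = 0     # taken outcomes seen in the current loop execution
    confidence: int = 0

    def predict(self):
        """Return (valid, taken); valid is False until the entry is confident."""
        if self.confidence < CONF_MAX:
            return False, True
        return True, self.current < self.trip_count

    def update(self, taken):
        if taken:
            self.current += 1
            return
        # Loop exit: check whether the trip count repeated.
        if self.current == self.trip_count:
            self.confidence = min(self.confidence + 1, CONF_MAX)
        else:
            self.trip_count = self.current
            self.confidence = 0
        self.current = 0


# Train on a loop whose branch is taken 5 times, then exits.
entry = LoopEntry()
pattern = [True] * 5 + [False]
for _ in range(4):
    for outcome in pattern:
        entry.update(outcome)

# Once confident, the entry predicts the whole loop, including the exit.
predictions = []
for outcome in pattern:
    valid, taken = entry.predict()
    predictions.append((valid, taken))
    entry.update(outcome)
```

When `valid` is False, the TAGE prediction would be used unchanged.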
The Statistical Corrector predictor
• Branches with poor correlation with global history:
  • Sometimes better predicted by a single wide PC-indexed counter than by TAGE
• More generally, track cases such that:
  • « In this case (PC, history, prediction), TAGE is likely (>50 %) to mispredict »
Small predictor: very limited budget for the SC predictor
• Just track the statistically PC-biased branches
  • « TAGE predicts this direction on this branch, but in most cases this was wrong »
• The corrector filter:
  • A small partially tagged associative table
1.5 % misp. reduction: much simpler than a loop predictor
Medium predictor
• « Statistically » correlated branches:
  • Not strongly correlated with the global history, but exhibit a bias
  • Better predicted by averaging than tags
  • neural tags
• Branches correlated with local history,
  • but irregular global history pattern (on other branches)
  • TAGE does not learn the pattern
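MultiGehl Statistical Corrector Predictor
[Figure: block diagram. TAGE (inputs H, PC) passes "Prediction + ctr value" to the Stat. Corr., a Gehl-like predictor with inputs H + LH, PC, and Pred; a local history input is also shown]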
Why does it work
• The bias table is indexed with PC + TAGE output:
  • Correct (most of the time)
    • High counter value
    • Dominates, not many updates
  • Wrong
    • Other counters can be trained
    • Correlation (if it exists) can be captured
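The mechanism can be illustrated with a small GEHL-like corrector: a bias table indexed with PC and the TAGE output, plus a few tables indexed with PC and global history, whose counters are summed. This is a sketch under assumptions, not the championship predictor: the class name, table sizes, history lengths, index hash, 6-bit counter width, and fixed threshold are all hypothetical, and local or skeleton histories are omitted.

```python
class StatCorrector:
    CTR_MIN, CTR_MAX = -32, 31  # assumed 6-bit signed counters

    def __init__(self, log_size=10, hist_lengths=(4, 8, 16), threshold=6):
        self.size = 1 << log_size
        self.hist_lengths = hist_lengths
        self.threshold = threshold  # assumed fixed threshold
        self.bias = [0] * (2 * self.size)
        self.tables = [[0] * self.size for _ in hist_lengths]

    def _counters(self, pc, ghist, tage_taken):
        """Yield (table, index) pairs selected for this branch."""
        # Bias table: indexed with PC + TAGE output.
        yield self.bias, ((pc << 1) | int(tage_taken)) % len(self.bias)
        # History tables: indexed with PC hashed with a slice of global history.
        for table, length in zip(self.tables, self.hist_lengths):
            hist = ghist & ((1 << length) - 1)
            yield table, (pc ^ hist ^ (length << 2)) % self.size

    def _sum(self, pc, ghist, tage_taken):
        # Centered counters (2*c + 1), so the sum is never exactly ambiguous.
        return sum(2 * t[i] + 1 for t, i in self._counters(pc, ghist, tage_taken))

    def predict(self, pc, ghist, tage_taken):
        """Revert the TAGE prediction only when the sum disagrees confidently."""
        total = self._sum(pc, ghist, tage_taken)
        sc_taken = total >= 0
        if sc_taken != tage_taken and abs(total) >= self.threshold:
            return sc_taken
        return tage_taken

    def update(self, pc, ghist, tage_taken, outcome):
        """Train on a wrong sum sign or a low-magnitude sum."""
        total = self._sum(pc, ghist, tage_taken)
        if (total >= 0) != outcome or abs(total) < self.threshold:
            step = 1 if outcome else -1
            for table, i in self._counters(pc, ghist, tage_taken):
                table[i] = max(self.CTR_MIN, min(self.CTR_MAX, table[i] + step))
```

In this model, a branch on which TAGE is usually right quickly stops triggering updates because the sum stays above the threshold, while a branch on which TAGE is often wrong keeps training the counters until the corrector reverts the prediction.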
MultiGehl Statistical Corrector Predictor for the Championship
+ RAS associated history
+ 2 different local histories
+ simple chooser
6.8 % misp. reduction
[Figure: block diagram. TAGE (inputs H, PC) passes "Prediction + ctr value" to the Stat. Corr., which also takes local history]
« Realistic » 256 Kbits TAGE-SC-L
• « Only »
  • 12 equal-size TAGE tables +
  • (local hist., global hist.) 4-table SC
  • + loop predictor
• No history tuning
Only 2.8 % extra mispredictions
SC for unlimited predictor
• GEHL-based SC predictor:
  • Use any form of history information
    • Very long global
    • Multiple local
    • « Skeleton » global history: ignore some branches
• Recycle old ideas from the MAC-RHSP predictor (2004)
SC for unlimited predictor
• 460 predictor tables + 10 chooser tables
• Globally about 20 % fewer mispredictions than TAGE alone
• If one removes only:
  • The bias: 1.6 % for a single table
  • All global history components: 3.7 %
  • All local history components: 3.9 %
  • The chooser: 3.2 %
Conclusion
• TAGE-SC-L fits (nearly) all storage sizes
  • 32 Kbits ≈ 64 Kbits CBP1 champion on CBP1 traces
  • 256 Kbits ≈ 512 Kbits CBP3 champion on CBP4 traces
• Unlimited predictor:
  • poTAGE-SC does better