
TAGE-SC-L Again MTAGE-SC

TAGE-SC-L Again MTAGE-SC. André Seznec, INRIA/IRISA. Where do these predictors come from? GEHL: CBP 2004, ISCA 2005. TAGE: JILP 2006, CBP 2006. Statistical correlation: CBP 2011. Combining more info: Micro 2011, CBP 2014, Micro 2015. Optimizing everything: CBP 2016.


Presentation Transcript


  1. TAGE-SC-L Again MTAGE-SC. André Seznec, INRIA/IRISA

  2. Where do these predictors come from? • GEHL: CBP 2004, ISCA 2005 • TAGE: JILP 2006, CBP 2006 • Statistical correlation: CBP 2011 • Combining more info: Micro 2011, CBP 2014, Micro 2015 • Optimizing everything: CBP 2016 • Unlimited: CBP 2014, CBP 2016

  3. Around 2002 • Introduction of the perceptron predictor (Jimenez01) • State of the art: EV8 predictor • Lagging behind the perceptron on a few benchmarks • + with EV8-like: • some applications would benefit from 100+ history bits • Both able to handle « long » global histories: 30+ branches

  4. CBP 2004 GEOMETRIC HISTORY LENGTH PREDICTOR

  5. A multiple-length global history predictor [Figure: tables T0 to T4, indexed with history lengths L(0) to L(4), feed an adder Σ] With a limited number of tables

  6. Underlying idea • H and H' are two history vectors equal on N bits, but differing on bit N+1 • e.g. L(1) ≤ N < L(2) • Branches (A,H) and (A,H') are biased in opposite directions • Table T2 should make it possible to discriminate between (A,H) and (A,H')

  7. GEometric History Length predictor • The set of history lengths forms a geometric series: {0, 2, 4, 8, 16, 32, 64, 128} • What is important: L(i) - L(i-1) is drastically increasing • Spends most of the storage on short histories !!
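The series on this slide can be generated with a few lines. This is a minimal sketch, not the code of any CBP submission: the function name `geometric_lengths` and its parameters are hypothetical, and rounding to the nearest integer is an assumption.

```python
def geometric_lengths(n_tables, l_min, l_max):
    """Return n_tables history lengths: 0 for the first table, then a
    geometric series running from l_min to l_max (rounded to integers).
    Names and rounding are illustrative assumptions."""
    if n_tables < 3:
        raise ValueError("need at least 3 tables")
    # Common ratio such that the last table uses exactly l_max bits.
    ratio = (l_max / l_min) ** (1.0 / (n_tables - 2))
    lengths = [0]
    for i in range(n_tables - 1):
        lengths.append(int(round(l_min * ratio ** i)))
    return lengths


lengths = geometric_lengths(8, 2, 128)
# Gap between consecutive lengths: grows quickly with the table rank.
gaps = [b - a for a, b in zip(lengths, lengths[1:])]
```

With 8 tables, a shortest non-zero length of 2 and a longest of 128, `lengths` reproduces the set shown on the slide, and `gaps` shows how fast L(i) - L(i-1) grows.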

  8. GEHL (CBP 2004) • Neural inspired • Use of 200+ bits of global history • Narrow counters • Dynamic threshold update
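The ingredients listed above (an adder over narrow counters and a dynamically adjusted update threshold) can be sketched as a toy predictor. Everything here is an assumption made for illustration: the class name `TinyGEHL`, the index hash, the table sizes and the threshold-adaptation constants are not taken from the CBP 2004 submission.

```python
class TinyGEHL:
    """Toy GEHL-style predictor: one table per history length, the
    prediction is the sign of the sum of the selected narrow counters."""

    def __init__(self, lengths, log_size=10, ctr_bits=4):
        self.lengths = lengths
        self.log_size = log_size
        self.mask = (1 << log_size) - 1
        self.ctr_max = (1 << (ctr_bits - 1)) - 1
        self.ctr_min = -(1 << (ctr_bits - 1))
        self.tables = [[0] * (1 << log_size) for _ in lengths]
        self.theta = len(lengths)  # update threshold (initial value assumed)
        self.tc = 0                # counter steering the threshold
        self.hist = []             # global history, most recent outcome first

    def _indices(self, pc):
        out = []
        for length in self.lengths:
            h = 0
            for bit in self.hist[:length]:
                h = (h << 1) | bit
            # Simplistic hash of PC and history (an assumption).
            out.append((pc ^ h ^ (h >> self.log_size)) & self.mask)
        return out

    def predict(self, pc):
        idx = self._indices(pc)
        return sum(t[i] for t, i in zip(self.tables, idx)) >= 0

    def update(self, pc, taken):
        """Train on the outcome; return True if the branch was mispredicted."""
        idx = self._indices(pc)
        s = sum(t[i] for t, i in zip(self.tables, idx))
        mispredicted = (s >= 0) != taken
        if mispredicted or abs(s) <= self.theta:
            for t, i in zip(self.tables, idx):
                if taken:
                    t[i] = min(t[i] + 1, self.ctr_max)
                else:
                    t[i] = max(t[i] - 1, self.ctr_min)
            # Dynamic threshold: raise it after many mispredictions,
            # lower it after many low-confidence correct predictions.
            self.tc += 1 if mispredicted else -1
            if self.tc >= 64:
                self.theta += 1
                self.tc = 0
            elif self.tc <= -64:
                self.theta = max(1, self.theta - 1)
                self.tc = 0
        self.hist.insert(0, int(taken))
        del self.hist[max(self.lengths):]
        return mispredicted


gehl = TinyGEHL([0, 2, 4, 8])
# A single branch alternating taken / not taken.
misses = [gehl.update(0x40, i % 2 == 0) for i in range(200)]
late_misses = sum(misses[100:])
```

On the alternating branch, the tables indexed with history learn the pattern after a short warm-up, so no misprediction remains in the second half of the run.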

  9. TAgged GEometric history length predictor JILP 2006 TAGE

  10. At CBP 2004, only neural predictors, apart from the PPM-like predictor (Michaud 2004), but .. its update policy was poor

  11. TAGE (JILP 2006) • Partial tag match • almost .. • Geometric history length • Very effective update policy

  12. TAGE: Tagged and prediction by the longest history matching entry [Figure: a tagless base predictor indexed with pc, and tagged tables indexed with pc and h[0:L1], h[0:L2], h[0:L3]; each tagged entry holds ctr, tag and u; tag comparators (=?) select the prediction]

  13. [Figure: tag comparisons (=?) on the tagged tables; labels: Miss, Hit, Pred, Hit, Altpred]

  14. Prediction computation • General case: • Longest matching component provides the prediction • Special case: • Many mispredictions on newly allocated entries: weak Ctr • On many applications, Altpred is more accurate than Pred • Property dynamically monitored through 4-bit counters
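The general and special cases can be written as one small selection function. This is a hedged sketch: the names `tage_predict`, `update_use_alt` and `use_alt_on_na`, the signed 3-bit counter encoding (taken when `ctr >= 0`) and the single monitoring counter are assumptions, not the author's code.

```python
def tage_predict(base_taken, hit_ctrs, use_alt_on_na):
    """Return (final, pred, altpred).

    base_taken    -- prediction of the tagless base predictor
    hit_ctrs      -- signed 3-bit counters (-4..3) of the tagged entries
                     that matched, ordered from shortest to longest history
    use_alt_on_na -- signed 4-bit monitor (-8..7); >= 0 means Altpred has
                     recently been more accurate on weak provider entries
    """
    if not hit_ctrs:
        return base_taken, base_taken, base_taken
    provider = hit_ctrs[-1]              # longest matching component
    pred = provider >= 0
    altpred = hit_ctrs[-2] >= 0 if len(hit_ctrs) > 1 else base_taken
    weak = provider in (-1, 0)           # typical of a newly allocated entry
    final = altpred if (weak and use_alt_on_na >= 0) else pred
    return final, pred, altpred


def update_use_alt(use_alt_on_na, provider_ctr, pred, altpred, taken):
    """Train the monitor only when the provider is weak and the two
    predictions disagree."""
    if provider_ctr in (-1, 0) and pred != altpred:
        if altpred == taken:
            return min(use_alt_on_na + 1, 7)
        return max(use_alt_on_na - 1, -8)
    return use_alt_on_na
```

For example, `tage_predict(False, [-3, 0], 0)` falls back to Altpred because the provider counter is weak, while the same call with `use_alt_on_na = -1` trusts the provider.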

  15. A tagged table entry Tag U Ctr • Ctr: 3-bit prediction counter • U: 1 or 2-bit counters • Was the entry recently useful ? • Tag: partial tag

  16. Allocate entries on mispredictions • Allocate entries in longer history length tables • On tables with U unset • Set Ctr to Weak and U to 0 • Limited storage budget: • Allocate 2 entries (when 15 to 20 different history lengths)
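The allocation rule can be sketched as follows. This is an illustrative reading of the slide: the `Entry` layout, the function name `allocate_on_mispredict`, the scan order (shortest eligible history first) and the weak-counter encoding are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    tag: int = 0
    ctr: int = 0  # signed 3-bit counter, taken when ctr >= 0
    u: int = 0    # useful counter


def allocate_on_mispredict(candidates, tags, provider, taken, max_alloc=2):
    """candidates[t] is the entry table t selects for this branch and
    tags[t] the tag computed for it. provider is the rank of the table that
    gave the prediction (-1 for the base predictor). Returns the ranks of
    the tables in which an entry was allocated."""
    allocated = []
    for t in range(provider + 1, len(candidates)):  # longer histories only
        if len(allocated) == max_alloc:
            break
        entry = candidates[t]
        if entry.u == 0:                      # only steal non-useful entries
            entry.tag = tags[t]
            entry.ctr = 0 if taken else -1    # weak, in the outcome direction
            entry.u = 0
            allocated.append(t)
    return allocated


example = [Entry(u=0), Entry(u=1), Entry(u=0), Entry(u=0), Entry(u=0)]
example_ranks = allocate_on_mispredict(example, [10, 11, 12, 13, 14], 0, True)
```

In the example, the provider is table 0, table 1 is protected by its U counter, and the two allocations land in tables 2 and 3.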

  17. Managing the (U)seful counter • Increment when the entry avoids a misprediction • (Pred = taken) & (Altpred ≠ taken): becomes « useful » • Global decrement when it becomes « difficult » to allocate: • Many possible heuristics (« difficult » ≈ 2/3 of the entries useful) • CBP 2016 heuristics: ≈ 0.5 % MPKI
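One of the many possible heuristics can be sketched like this. It is not the CBP 2016 heuristic: the failure-counting `AllocationMonitor`, its `limit` and the function names are assumptions chosen to keep the example short, and `taken` stands for the branch outcome.

```python
def update_useful(u, pred, altpred, taken, u_max=3):
    """Increment U when the provider was right and Altpred was wrong,
    i.e. when the entry avoided a misprediction."""
    if pred == taken and altpred != taken:
        return min(u + 1, u_max)
    return u


class AllocationMonitor:
    """Counts failed allocations; signals a global decrement once
    allocating has become 'difficult' (threshold is an assumption)."""

    def __init__(self, limit=8):
        self.limit = limit
        self.tick = 0

    def record(self, allocated):
        """Record one allocation attempt; return True when all U counters
        should be decremented."""
        if allocated:
            self.tick = max(self.tick - 1, 0)
        else:
            self.tick += 1
        if self.tick >= self.limit:
            self.tick = 0
            return True
        return False


def global_decrement(u_values):
    """Age every useful counter by one step."""
    return [max(u - 1, 0) for u in u_values]


monitor = AllocationMonitor(limit=3)
signals = [monitor.record(False) for _ in range(3)]
```

After three failed allocations in a row, the monitor asks for a global decrement, which frees entries whose U counter drops back to 0.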

  18. TAGE vs GEHL: • At equal sizes: ≈ 10 % MPKI reduction May vary with individual benchmarks !

  19. Optimizations for CBP 2016 • Sharing storage space • Small hist. sharing a bank-interleaved table • Small tag (8 bits) • Long hist. sharing a bank-interleaved table • Longer tag (12 bits) • Partial associativity • 2 banks for medium hist. lengths • ≈ 2 % MPKI reduction

  20. Statistical Corrector (Global history) CBP2011 TAGE + (G)SC

  21. From CBP 2011, « the Statistical Corrector targets » • Branches with poor correlation with history: • Sometimes better predicted by a single wide PC-indexed counter than by TAGE • More generally, track cases such that: • « For this (PC, history, prediction), TAGE is likely (> 50 %) to mispredict » statistically

  22. TAGE-GSC (CBP 2011; was named a posteriori in Micro 2015) ≈ 3-5 % MPKI reduction [Figure: PC + global history feed the (main) TAGE predictor; its prediction + confidence, together with PC + global history, feed the Stat. Cor.] Just a global history neural predictor + tables indexed with PC, TAGE pred. and confidence

  23. Confidence for TAGE (HPCA 2011) • The value of the counter providing the prediction: • Saturated = high confidence • Intermediate = medium confidence • Weak = low confidence
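The three confidence classes map directly onto the value of the providing counter. A minimal sketch, assuming a signed 3-bit counter in the range -4..3 (the function name and the encoding are illustrative):

```python
def tage_confidence(ctr, ctr_bits=3):
    """Classify a signed prediction counter as 'high', 'medium' or 'low'."""
    ctr_max = (1 << (ctr_bits - 1)) - 1
    ctr_min = -(1 << (ctr_bits - 1))
    if not ctr_min <= ctr <= ctr_max:
        raise ValueError("counter out of range")
    if ctr in (ctr_min, ctr_max):
        return "high"      # saturated counter
    if ctr in (-1, 0):
        return "low"       # weak counter
    return "medium"        # everything in between


classes = [tage_confidence(c) for c in range(-4, 4)]
```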

  24. Why does it work • The bias tables indexed with PC + TAGE outputs: • Correct (most of the time) • High counter value • Dominates, not many updates • Wrong • Other counters can be trained • (Statistical) correlation (if it exists) can be captured
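The role of the bias table can be illustrated with a toy corrector that contains only that table. This is a strong simplification under stated assumptions: a real Statistical Corrector sums several tables, and the class name `TinySC`, the index function, the counter width and both thresholds are hypothetical.

```python
class TinySC:
    """Toy statistical corrector made of a single bias table indexed with
    PC, the TAGE prediction and the TAGE confidence level."""

    def __init__(self, log_size=10, revert_threshold=5, update_threshold=12):
        self.mask = (1 << log_size) - 1
        self.bias = [0] * (1 << log_size)  # signed 6-bit counters
        self.revert_threshold = revert_threshold
        self.update_threshold = update_threshold

    def _index(self, pc, tage_pred, conf):
        # conf: 0 = low, 1 = medium, 2 = high (encoding is an assumption)
        return ((pc << 3) | (conf << 1) | int(tage_pred)) & self.mask

    def predict(self, pc, tage_pred, conf):
        s = 2 * self.bias[self._index(pc, tage_pred, conf)] + 1
        sc_pred = s >= 0
        # Revert TAGE only when the corrector disagrees strongly enough.
        if sc_pred != tage_pred and abs(s) >= self.revert_threshold:
            return sc_pred
        return tage_pred

    def update(self, pc, tage_pred, conf, taken):
        i = self._index(pc, tage_pred, conf)
        s = 2 * self.bias[i] + 1
        if (s >= 0) != taken or abs(s) < self.update_threshold:
            if taken:
                self.bias[i] = min(self.bias[i] + 1, 31)
            else:
                self.bias[i] = max(self.bias[i] - 1, -32)


sc = TinySC()
# A context in which TAGE says "taken" with low confidence but the branch
# is in fact never taken.
for _ in range(10):
    sc.update(0x40, True, 0, False)
reverted = sc.predict(0x40, True, 0)
untouched = sc.predict(0x40, True, 2)
```

The counter reached through (PC, taken, low confidence) learns that TAGE is statistically wrong in that context and reverts it, while the entry reached with high confidence is left alone and lets the TAGE prediction through.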

  25. Optimizations for CBP 2016 • Use TAGE confidence for indexing SC ≈ 1 % MPKI red. • On (very) low SC confidence: • May use TAGE prediction (if high conf, ..) ≈ 0.4 % MPKI red.

  26. The beauty of neural predictors Micro 2011, CBP 2014, Micro 2015 TAGE-SC

  27. From Compaq in 1999 (OK, I cheated with loops) • I learnt: • Use global history • Avoid local history • Did manage to submit only global history at CBP 2004, 2006 and 2011

  28. Speculative history must be managed !? • Local history: • table of histories (unspeculatively updated) • must maintain a speculative history per inflight branch: • Associative search, etc ?!? • Global history: • Append a bit to a single history register • Use of a circular buffer and just a pointer to speculatively manage the history
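The circular buffer plus pointer scheme for the global history can be sketched as below. The class and method names are hypothetical, and the sketch assumes the buffer is large enough to hold the longest history plus all in-flight speculative bits.

```python
class SpeculativeHistory:
    """Global history kept in a circular buffer; recovering from a
    misprediction only requires restoring one pointer."""

    def __init__(self, log_size=10):
        self.size = 1 << log_size
        self.buf = [0] * self.size
        self.head = 0  # number of bits pushed so far (the pointer)

    def push(self, taken):
        """Speculatively append a direction; return a checkpoint that the
        branch keeps until it resolves."""
        checkpoint = self.head
        self.buf[self.head % self.size] = int(taken)
        self.head += 1
        return checkpoint

    def recent(self, n):
        """The n most recent bits, most recent first."""
        n = min(n, self.head, self.size)
        return [self.buf[(self.head - 1 - k) % self.size] for k in range(n)]

    def recover(self, checkpoint, actual_taken):
        """Squash everything younger than the mispredicted branch and
        insert its real outcome."""
        self.head = checkpoint
        self.push(actual_taken)


history = SpeculativeHistory()
for outcome in (True, False, True):
    history.push(outcome)
cp = history.push(True)        # predicted taken (will turn out wrong)
history.push(False)            # wrong-path branches
history.push(False)
history.recover(cp, False)     # the branch was actually not taken
after_recovery = history.recent(4)
```

No per-branch copy of the history is needed: the wrong-path bits simply become unreachable once the pointer is restored, and they are overwritten later.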

  29. Would not have won CBP 2014 without using local history

  30. How to use local histories with TAGE+(G)SC • Add the local history tables in the neural SC • as in the perceptron [Jimenez2002] ≈ 0.9 % MPKI reduction with 2Kbits on the 8KB predictor ≈ 2.5 % MPKI reduction with 28Kbits on the 64KB predictor I DO NOT ADVOCATE FOR LOCAL HISTORIES IN REAL HARDWARE PROCESSORS

  31. The beauty of neural predictors • TAGE-SC: • Just the right framework to test information vectors • Add extra tables: some benefit ! continue to explore

  32. Can add extra components in SC • IMLI-based components (Micro 2015) • Capture correlation in multidimensional loops • Very disappointing results: essentially no benefit on CBP5 traces • Other forms of history: • E.g. only backward branches

  33. + a loop predictor (just in case) TAGE-SC-L

  34. Loop predictor • Can predict loop exit • for loops with large iteration numbers • regular number of iterations • Limited storage budget (a few entries) • But marginal benefit I DO NOT ADVOCATE FOR LOCAL HISTORIES IN REAL HARDWARE PROCESSORS
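A single loop-predictor entry can be sketched as follows. This is a minimal illustration, assuming a loop branch that is taken a fixed number of times and then not taken once; the field names and the confidence limit are hypothetical, and tags, ageing and replacement are left out.

```python
class LoopEntry:
    """One entry of a toy loop predictor for a branch that is taken `trip`
    times in a row and then falls through (loop exit)."""

    CONF_MAX = 3

    def __init__(self):
        self.trip = 0   # learned number of taken iterations before the exit
        self.count = 0  # taken iterations seen in the current execution
        self.conf = 0   # consecutive executions confirming `trip`

    def predict(self):
        """Return (valid, taken). The prediction is only meant to be used
        when valid is True."""
        if self.conf < self.CONF_MAX:
            return False, True
        return True, self.count != self.trip  # predict the exit at `trip`

    def update(self, taken):
        if taken:
            self.count += 1
            if self.count > self.trip:
                self.conf = 0        # loop ran longer than learned
        else:
            if self.count == self.trip:
                self.conf = min(self.conf + 1, self.CONF_MAX)
            else:
                self.trip = self.count  # relearn the iteration count
                self.conf = 0
            self.count = 0


loop = LoopEntry()
one_execution = [True] * 10 + [False]
for _ in range(6):                   # training executions
    for outcome in one_execution:
        loop.update(outcome)
results = []
for outcome in one_execution:        # one more execution, now predicted
    results.append(loop.predict() == (True, outcome))
    loop.update(outcome)
```

Once the same iteration count has been seen several times in a row, the entry predicts every iteration of the next execution, including the exit.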

  35. TAGE-SC-L summary for CBP-5 Most of the budget on global hist. correlation: -TAGE with ≈ 1200 br. for 64 KB and ≈ 400 br. for 8KB -optimize the storage sharing -optimize the allocation Track the statistical correlation with a neural component: -use TAGE prediction AND confidence -incorporate other forms of history (even local history if you are trying to win CBP-5)

  36. TAGE-SC-L is still far from the predictability limits MTAGE-SC

  37. poTAGE-SC: the previous champion poTAGE+COLT (Michaud2014) and TAGE-SC-L

  38. poTAGE + COLT (Michaud2014) [Figure: five TAGE predictors (global history, local history 1, local history 2, local history 3, frequency) feed the COLT selection, a table indexed with (PC + 5 pred)] Use the TAGE concept on other forms of history

  39. Unlimited TAGE-SC [Figure: a TAGE predictor (global history) feeds the Statistical Corrector (bias, GEHL, RHSP, other GEHL and perceptrons, ...), followed by the final chooser]

  40. poTAGE-SC [Figure: TAGE predictors (global history, local history 1, local history 2, local history 3, frequency) with COLT selection feed the Statistical Corrector (bias, GEHL, RHSP, other GEHL and perceptrons, ...), followed by the final chooser]

  41. MTAGE-SC [Figure: TAGE predictors (global history, local history 1, local history 2, local history 3, frequency, global backward history) with a TAGE prediction combiner feed the Statistical Corrector (bias, GEHL, RHSP, other GEHL and perceptrons, ...), followed by the final chooser]

  42. MTAGE-SC [Figure: same organization as the previous slide] • ≈ 5 % MPKI reduction over poTAGE-SC • Final chooser: leverages confidence from SC and TAGE pred. combiner • TAGE prediction combiner: COLT pred + neural combination of outputs pred + confidence • Global backward history: to capture long path correlation, but eliminate intermediate branches • A few extra history forms: IMLI, ..

  43. Seems that I am not making progress !! • CBP 2006 misp. rate: • 32KB L-TAGE ≈ 1.22 GTL • CBP 2014 misp. rate: • 32KB TAGE-SC-L ≈ 1.40 poTAGE-SC • CBP 2016 misp. rate: • 64KB TAGE-SC-L ≈ 1.55 MTAGE-SC • Not the same traces, but ..

  44. Conclusion • TAGE-SC-L fits limited storage sizes: • Most significant optimizations over CBP 2014 • Use of TAGE confidence as index for SC • Sharing and partial associativity • MTAGE-SC: • Predictability limits are even (a little bit) further than previously expected
