Looking for limits in branch prediction with the GTL predictor André Seznec IRISA/INRIA/HIPEAC
Motivations • Geometric history length predictors were introduced in 2004-2006 • O-GEHL, CBP-1, Dec. 2004 • TAGE, JILP ’06, Feb. 2006 • Storage-effective • Exploit very long global histories • Were defined with possible implementation in mind • What are the limits of accuracy that can be captured with these schemes? • How do they compare with unconstrained prediction schemes?
Geometric history length predictors: global history + multiple lengths [Figure: predictor tables T0 to T4, indexed with global history lengths L(0) to L(4); the function combining their outputs is shown as "?"]
GEometric History Length predictor • The set of history lengths forms a geometric series, e.g. {0, 2, 4, 8, 16, 32, 64, 128} • Captures correlation on very long histories • Most of the storage is devoted to short histories !! • What is important: L(i) - L(i-1) increases drastically
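The geometric series on this slide can be sketched as follows. This is a minimal illustration, assuming L(0) = 0 and L(i) = int(alpha^(i-1) * L(1) + 0.5) for i >= 1, with alpha chosen so that the last table uses the maximum length; the function name and its parameters are hypothetical and are not taken from the slides.

```python
def geometric_history_lengths(num_tables, l1, l_max):
    """Return [L(0), ..., L(num_tables - 1)].

    Assumption: L(0) = 0 and the other lengths form a geometric series
    running from l1 to l_max.
    """
    if num_tables < 3:
        raise ValueError("need at least three tables")
    # Common ratio such that the last table gets (about) l_max.
    alpha = (l_max / l1) ** (1.0 / (num_tables - 2))
    lengths = [0]
    for i in range(1, num_tables):
        lengths.append(int(alpha ** (i - 1) * l1 + 0.5))
    return lengths


lengths = geometric_history_lengths(8, 2, 128)
# Gap between consecutive lengths: this is what grows drastically.
gaps = [b - a for a, b in zip(lengths, lengths[1:])]
print(lengths)
print(gaps)
```

With 8 tables, L(1) = 2 and a maximum of 128, the function reproduces the series shown on the slide; most tables cover short histories while the gaps between the longest ones become large.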
Combining multiple predictions • Neural-inspired predictors: use a (multiply-)add tree (O-GEHL, CBP-1) • Partial matching: use tagged tables and the longest matching history (TAGE, JILP ’06)
CBP-1 (2004): O-GEHL • Final computation through a sum: Prediction = Sign(∑) • 256 Kbits, 12 components: 3.670 misp/KI [Figure: tables T0 to T4, indexed with history lengths L(0) to L(4), feeding an adder ∑]
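The sum-and-sign computation can be sketched as below. This is a toy model under stated assumptions, not the CBP-1 predictor: the class name, the use of Python's `hash` for indexing, the 4-bit counters and the threshold-based update rule are all illustrative choices.

```python
class GehlSketch:
    """Toy GEHL-like predictor: one signed counter per table, summed."""

    def __init__(self, history_lengths, log_size=10, counter_bits=4):
        self.lengths = list(history_lengths)
        self.size = 1 << log_size
        self.cmax = (1 << (counter_bits - 1)) - 1
        self.cmin = -(1 << (counter_bits - 1))
        self.tables = [[0] * self.size for _ in self.lengths]
        self.threshold = len(self.lengths)  # assumed update threshold
        self.history = []                   # most recent outcome last

    def _indices(self, pc):
        indices = []
        for t, length in enumerate(self.lengths):
            recent = tuple(self.history[-length:]) if length else ()
            # Assumption: any hash of (PC, L(t) most recent outcomes) will do.
            indices.append(hash((pc, t, recent)) % self.size)
        return indices

    def _sum(self, pc):
        indices = self._indices(pc)
        total = sum(self.tables[t][i] for t, i in enumerate(indices))
        return total, indices

    def predict(self, pc):
        total, _ = self._sum(pc)
        return total >= 0                   # prediction = sign of the sum

    def update(self, pc, taken):
        total, indices = self._sum(pc)
        # Train on a misprediction or when the sum is not confident enough.
        if (total >= 0) != taken or abs(total) <= self.threshold:
            for t, i in enumerate(indices):
                c = self.tables[t][i] + (1 if taken else -1)
                self.tables[t][i] = max(self.cmin, min(self.cmax, c))
        self.history.append(taken)
        keep = max(self.lengths)
        self.history = self.history[-keep:] if keep else []


def run_alternating(predictor, pc=0x400, steps=200):
    """Feed a taken/not-taken alternating branch; return per-step correctness."""
    correct = []
    for step in range(steps):
        taken = step % 2 == 0
        correct.append(predictor.predict(pc) == taken)
        predictor.update(pc, taken)
    return correct


results = run_alternating(GehlSketch([0, 2, 4, 8]))
```

The table with history length 0 cannot learn the alternating pattern on its own; the tables indexed with history carry the sum once their counters have been trained.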
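JILP ’06: TAGE • Longest matching history • 256 Kbits: 3.358 misp/KI [Figure: tagged tables with tag comparators (=?); the matching component with the longest history provides the prediction]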
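Selection by longest matching history can be sketched as follows. To keep the example deterministic, each tagged component is modeled as a dictionary keyed by the exact (PC, recent history) pair, which amounts to assuming unlimited storage and full tags; a real table would use hashed indices and partial tags. The names `TaggedComponent` and `tage_predict` are hypothetical.

```python
class TaggedComponent:
    """Idealized tagged table: exact match on (PC, last `length` outcomes)."""

    def __init__(self, length):
        self.length = length
        self.entries = {}  # key -> signed prediction counter

    def key(self, pc, history):
        return (pc, tuple(history[-self.length:]))


def tage_predict(components, base_prediction, pc, history):
    """Return (prediction, history length of the providing component).

    The matching component with the longest history wins; on a miss
    everywhere, the base prediction is returned with length 0.
    """
    for comp in sorted(components, key=lambda c: c.length, reverse=True):
        if len(history) < comp.length:
            continue
        ctr = comp.entries.get(comp.key(pc, history))
        if ctr is not None:
            return ctr >= 0, comp.length
    return base_prediction, 0


short, long_ = TaggedComponent(4), TaggedComponent(16)
hist = [True, False] * 8                     # 16 past outcomes
short.entries[short.key(0x40, hist)] = -2    # short history says not taken
long_.entries[long_.key(0x40, hist)] = 3     # long history says taken

both_match = tage_predict([short, long_], True, 0x40, hist)
# Same 4 most recent outcomes, different older ones: only `short` matches.
other = [False] * 12 + hist[-4:]
short_only = tage_predict([short, long_], True, 0x40, other)
miss = tage_predict([short, long_], True, 0x44, hist)
```

When both components match, the 16-outcome component overrides the 4-outcome one; when only the short one matches, it provides the prediction.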
What is global history? • Conditional branch history: path confusion on short histories • Path history: direct hashing leads to path confusion • Represent all branches in the branch history • Use path AND direction history
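One possible way of keeping both a direction history and a path history, updated on every branch, is sketched below. The slides do not say how many address bits are recorded or which ones; taking one bit per branch at position 2 of the PC (which assumes 4-byte-aligned instructions) is an assumption made only for this example, as is the class name.

```python
class GlobalHistory:
    """Direction history and path history kept as two shift registers."""

    def __init__(self, max_bits):
        self.mask = (1 << max_bits) - 1
        self.direction = 0
        self.path = 0

    def update(self, pc, taken):
        # Called for every branch, not only conditional ones; an
        # unconditional branch is recorded as taken.
        self.direction = ((self.direction << 1) | int(taken)) & self.mask
        # Assumption: one address bit per branch, taken from bit 2 of the PC.
        self.path = ((self.path << 1) | ((pc >> 2) & 1)) & self.mask


# Two different paths with identical branch directions.
h1, h2 = GlobalHistory(16), GlobalHistory(16)
for pc in (0x100, 0x104):
    h1.update(pc, True)
for pc in (0x100, 0x108):
    h2.update(pc, True)
```

Here the direction histories of the two paths are identical, so a predictor using direction alone would confuse them, while the path histories differ.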
Using a kernel history and a user history • Traces mix user and kernel activities: • Kernel activity after exception • Global history pollution • Solution: use two separate global histories • User history is updated only in user mode • Kernel history is updated in both modes
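The two update rules above can be sketched as follows. The update policy comes from the slide; the rule used in `select` (kernel history in kernel mode, user history otherwise) is an assumption, since the slide does not state which history indexes the predictor in each mode. The class name is hypothetical.

```python
class DualHistory:
    """Separate user and kernel global histories (most recent outcome last)."""

    def __init__(self, max_len):
        self.max_len = max_len
        self.user = []
        self.kernel = []

    def _push(self, history, taken):
        history.append(taken)
        if len(history) > self.max_len:
            del history[0]

    def update(self, taken, in_kernel):
        # Kernel history is updated in both modes.
        self._push(self.kernel, taken)
        # User history is updated only in user mode, so kernel activity
        # after an exception does not pollute it.
        if not in_kernel:
            self._push(self.user, taken)

    def select(self, in_kernel):
        # Assumption: each mode predicts with its own history.
        return self.kernel if in_kernel else self.user


h = DualHistory(max_len=8)
for taken, in_kernel in [(True, False), (False, True), (False, True), (True, False)]:
    h.update(taken, in_kernel)
```

After the exception returns, the user history contains only the two user-mode outcomes, as if the kernel activity had never happened.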
Accuracy limits for TAGE • Varying the predictor size, the number of components, the tag width, the history length • Allowing multiple allocations • The best accuracy on distributed traces: 3.054 misp/KI • History length around 1,000 • 15-20 components • No need for tags wider than 16 bits
Accuracy limits for GEHL • Varying the predictor size, the number of components, the history length, the counter width • (Slightly) improving the update policy • Fitting in the two-hour simulation rule • The best accuracy on distributed traces: 2.842 misp/KI • 97 components • 8-bit counters • 2,000 bits of global history
GEHL vs TAGE • Realistic implementation parameters (storage budget, number of components) • TAGE is more accurate than (O-)GEHL • Unlimited budget, huge number of components • GEHL is more accurate than TAGE
Will it be sufficient to win the Championship? • GEHL history length: 2,000 • 97 components • 2.842 misp/KI
A step further: hybrid GEHL-TAGE • On a few benchmarks, TAGE is more accurate than GEHL • Let us try a hybrid GEHL-TAGE predictor
Hybrid GEHL-TAGE • Inherits from: agree/bimode, YAGS, 2bcgskew [Figure: branch/path history + PC feed GEHL and TAGE; a mux controlled by Meta (= egskew) selects between the two predictions]
GEHL+TAGE • GEHL provides the main prediction: • also used as the base predictor for TAGE (YAGS inspired) • TAGE records when GEHL fails: {prediction, address, history} (agree/bimode, YAGS inspired) • Meta selects between GEHL and TAGE (2bcgskew inspired)
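The selection step can be sketched as below. The slides say the meta predictor is an egskew; this example swaps in a single table of 2-bit counters indexed by PC as a plain stand-in, and the class name, table size and update rule (train only when a TAGE hit disagrees with GEHL) are assumptions.

```python
class MetaSelector:
    """Stand-in meta predictor: one 2-bit counter per entry, indexed by PC."""

    def __init__(self, log_size=10):
        self.size = 1 << log_size
        self.ctr = [0] * self.size   # range -2..1; >= 0 means "trust TAGE"

    def choose(self, pc, gehl_pred, tage_hit, tage_pred):
        # GEHL provides the main prediction; TAGE can override it only
        # when it has a matching entry and the meta predictor agrees.
        if tage_hit and self.ctr[pc % self.size] >= 0:
            return tage_pred
        return gehl_pred

    def update(self, pc, gehl_pred, tage_hit, tage_pred, taken):
        # Train only when the two predictors disagree.
        if not tage_hit or gehl_pred == tage_pred:
            return
        i = pc % self.size
        step = 1 if tage_pred == taken else -1
        self.ctr[i] = max(-2, min(1, self.ctr[i] + step))


meta = MetaSelector()
first = meta.choose(0x40, gehl_pred=True, tage_hit=True, tage_pred=False)
# GEHL turns out to be right twice while TAGE is wrong.
for _ in range(2):
    meta.update(0x40, gehl_pred=True, tage_hit=True, tage_pred=False, taken=True)
later = meta.choose(0x40, gehl_pred=True, tage_hit=True, tage_pred=False)
no_hit = meta.choose(0x80, gehl_pred=True, tage_hit=False, tage_pred=False)
```

In a fuller model, a TAGE entry would be allocated whenever GEHL mispredicts, which is the "TAGE records when GEHL fails" behavior; that part is left out here.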
Let us have fun !! • GEHL history length: 400 • TAGE history length: 100,000 • 2.774 misp/KI
Might still be insufficient • GEHL history length: 400 • TAGE history length: 100,000 • 2.774 misp/KI
Adding a loop predictor • The loop predictor captures the number of iterations of a loop • When the same number of iterations has been encountered 8 times in succession, the loop predictor provides the prediction • Advantage: • Very reliable
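The confidence rule can be sketched as follows. The threshold of 8 comes from the slide; everything else is an assumption made for illustration: the loop branch is taken while iterating and not taken on exit, entries are kept in an unbounded dictionary keyed by PC, and the class and field names are hypothetical.

```python
class LoopEntry:
    def __init__(self):
        self.trip_count = None  # iterations of the last completed execution
        self.current = 0        # iterations seen in the execution in progress
        self.confidence = 0     # successive executions with the same count


class LoopPredictor:
    CONFIDENT = 8

    def __init__(self):
        self.entries = {}

    def predict(self, pc):
        """Return True/False when confident, None when not providing."""
        e = self.entries.get(pc)
        if e is None or e.confidence < self.CONFIDENT:
            return None
        return e.current < e.trip_count  # taken until the count is reached

    def update(self, pc, taken):
        e = self.entries.setdefault(pc, LoopEntry())
        if taken:
            e.current += 1
            return
        # Loop exit: compare with the previously recorded iteration count.
        if e.current == e.trip_count:
            e.confidence += 1
        else:
            e.trip_count = e.current
            e.confidence = 1
        e.current = 0


def run_loop(predictor, pc, iterations):
    """Execute one loop instance; return the predictions made along the way."""
    predictions = []
    for taken in [True] * iterations + [False]:
        predictions.append(predictor.predict(pc))
        predictor.update(pc, taken)
    return predictions


lp = LoopPredictor()
warmup = [run_loop(lp, 0x200, 5) for _ in range(8)]
confident = run_loop(lp, 0x200, 5)
after_change = run_loop(lp, 0x200, 3)
```

During the first 8 executions the predictor stays silent and another component must provide the prediction; from the ninth execution on, it predicts every iteration and the exit. A change in the iteration count resets the confidence.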
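GTL predictor [Figure: branch/path history + PC feed GEHL, TAGE and the loop predictor; a first mux controlled by Meta (= egskew) selects between GEHL and TAGE, and a second mux controlled by the loop predictor confidence selects the final prediction] + static prediction on first occurrence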
Hope this will be sufficient to win the Championship !! • GTL = GEHL (97 components, history length 400) + TAGE (19 components, history length 100,000) + loop predictor • 2.717 misp/KI
Geometric History Length predictors and limits on branch prediction • Unlimited budget, huge number of components • GEHL is more accurate than TAGE • Very old correlation can be captured: • On two benchmarks, using a history length of 10,000 really helps • There does not seem to be much potential extra benefit from local history • We did not find any interesting extra scheme apart from loop prediction • Loop prediction brings a very marginal benefit, except on gzip