The Predictability of Computations that Produce Unpredictable Outcomes
Tor Aamodt (aamodt@eecg.utoronto.ca), Andreas Moshovos, Paul Chow
Electrical and Computer Engineering, University of Toronto, Canada
Outcome-Based Prediction
• History of outcomes leading up to branch “X”: TNTTNTT ... NTN ... TNTTNTT (in each run, “TNTTNT” is the history and the final “T” is the outcome of branch X)
• Next time we encounter X after “TNTTNT” we can predict “T”.
• Why this works: locality in the outcome stream.
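The idea on this slide can be sketched as a small pattern table. This is an illustrative sketch only, not the baseline hardware evaluated later in the talk: the class name `HistoryPredictor`, the six-outcome history length, the single shared history string, and the default prediction for unseen patterns are all assumptions made for the example.

```python
from collections import Counter


class HistoryPredictor:
    """Toy outcome-based predictor (hypothetical, for illustration only)."""

    def __init__(self, history_length=6):
        self.history_length = history_length  # assumed; matches the length of "TNTTNT"
        self.history = ""                     # most recent outcomes, oldest first
        self.table = {}                       # (branch, history) -> Counter of next outcomes

    def predict(self, branch):
        counts = self.table.get((branch, self.history))
        if not counts:
            return "T"  # assumed default when this history has never been seen
        return counts.most_common(1)[0][0]

    def update(self, branch, outcome):
        key = (branch, self.history)
        self.table.setdefault(key, Counter())[outcome] += 1
        # Shift the new outcome into the history, keeping only the newest entries.
        self.history = (self.history + outcome)[-self.history_length:]


predictor = HistoryPredictor()
# First run "TNTTNTT", some unrelated outcomes, then the history "TNTTNT" again.
for outcome in "TNTTNTT" + "NNN" + "TNTTNT":
    predictor.update("X", outcome)
prediction = predictor.predict("X")  # looked up under the history "TNTTNT"
```

A table like this only helps when the same history tends to be followed by the same outcome, which is exactly the locality the slide refers to.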
Problem
• Unpredictable branches are THE problem.
• No outcome locality.
Operation-Based Prediction
• Find locality in the computations that produce the outcome.
[Figure: example sequence of operations producing a branch outcome: add, ld, slt, bne]
This Work
• First work that looks at the fundamental program behaviour that would facilitate operation-based prediction.
• Related work…
  • Characterization of slices
  • Prefetching loads / pre-execution of branches
Ideally...
• The slice (i.e., slice trace) will always be the same.
• The slice will contain very few operations spanning a large portion of the original program.
• The slice will be easy (fast) to pre-compute.
Terminology
• Lead: earliest instruction in the slice
• Target: branch we want to precompute
[Figure: example slice: add, ld, slt, bne]
What Should a Slice Be?
[Figure: instruction window from FETCH to COMMIT, with older instructions toward COMMIT]
• Committed instructions
  • window of 32, 64, 128, or 256 instructions
• Ignore control flow
  • retain side-effect of JAL on $r31
• Memory dependence
  • follow resolved load-store dependence: M
• Restrict # instructions
  • R = max 1/4, U = “no restriction”
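The slice definition above can be read as a backward dependence walk over the window of committed instructions. The following is a minimal sketch under stated assumptions, not the authors' tool: the `Inst` record, the `extract_slice` function, and the choice to simply stop the walk when the instruction limit is reached are all hypothetical. The `max_ops` parameter stands in for the "R" restriction, whose exact definition the slide does not spell out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Inst:
    """Hypothetical committed-instruction record."""
    pc: int
    op: str
    dests: Tuple[str, ...] = ()   # registers written
    srcs: Tuple[str, ...] = ()    # registers read
    addr: Optional[int] = None    # resolved address for "ld" and "st"


def extract_slice(window, follow_memory=True, max_ops=None):
    """Return the slice of the last instruction in `window` (oldest first).

    Control flow is ignored: instructions that write no live register or
    live memory location are skipped, while a JAL modelled with
    dests=("r31",) is still picked up through its register side effect.
    """
    target = window[-1]
    live_regs = set(target.srcs)
    live_addrs = set()
    selected = [target]
    for inst in reversed(window[:-1]):
        if max_ops is not None and len(selected) >= max_ops:
            break  # assumed handling of the restriction: truncate the slice
        writes_reg = any(d in live_regs for d in inst.dests)
        writes_mem = inst.op == "st" and inst.addr in live_addrs
        if not (writes_reg or writes_mem):
            continue
        selected.append(inst)
        live_regs.difference_update(inst.dests)
        if writes_mem:
            live_addrs.discard(inst.addr)
        live_regs.update(inst.srcs)
        if follow_memory and inst.op == "ld":
            live_addrs.add(inst.addr)  # the "M" option: follow load-store dependence
    selected.reverse()
    return selected  # selected[0] is the lead, selected[-1] is the target


window = [
    Inst(0x100, "add", ("r2",), ("r1", "r3")),
    Inst(0x104, "sub", ("r9",), ("r8", "r8")),        # unrelated to the branch
    Inst(0x108, "ld", ("r4",), ("r2",), addr=0x2000),
    Inst(0x10C, "slt", ("r5",), ("r4", "r6")),
    Inst(0x110, "bne", (), ("r5", "r0")),
]
slice_ops = [inst.op for inst in extract_slice(window)]
```

In this example the unrelated `sub` is dropped and the remaining operations form the add, ld, slt, bne chain used as the running example in the slides.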
Methodology
• 12 programs from SPEC2000
• Baseline outcome prediction hardware
  • 64K gshare + 64K bimodal w/ 64K selector
  • 64-entry RAS
• sim-outorder (SimpleScalar 3.0):
  • 8-way, 128-entry RUU, 64-entry fetch buffer
  • 64K dual L1, 256K unified L2
  • 64-entry LSQ
  • Perfect memory disambiguation
Measuring Slice Locality
• locality(1) = probability that the same slice was seen last time. A high value of locality(1) indicates that last-operation-based slice prediction would work well.
• locality(N) = probability that the same slice was seen among the last N unique slices.
Measuring Slice Locality
• Save the FOUR unique, most recent slice traces per static branch (only on misprediction).
• Each time a mispredicted branch is encountered, check whether the slice trace was the most recent, 2nd most recent, etc.
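The bookkeeping described on this slide can be sketched as follows. This is a minimal sketch, not the authors' measurement code: the class name `SliceLocality` is hypothetical, a slice trace is assumed to be any hashable value (for example a tuple of PCs), and counting the first encounter of a branch as a miss is also an assumption.

```python
from collections import defaultdict


class SliceLocality:
    """Hypothetical tracker for locality(N) over mispredicted branches."""

    def __init__(self, depth=4):
        self.depth = depth                 # FOUR unique slice traces per branch
        self.recent = defaultdict(list)    # branch pc -> traces, most recent first
        self.hits = [0] * depth            # hits[k]: matched the (k+1)-th most recent
        self.total = 0                     # mispredicted branch instances seen

    def record_mispredict(self, branch_pc, slice_trace):
        recent = self.recent[branch_pc]
        self.total += 1
        if slice_trace in recent:
            self.hits[recent.index(slice_trace)] += 1
            recent.remove(slice_trace)
        recent.insert(0, slice_trace)      # this trace is now the most recent
        del recent[self.depth:]            # keep only the newest unique traces

    def locality(self, n):
        """Fraction of mispredictions whose slice was among the last n unique slices."""
        if self.total == 0:
            return 0.0
        return sum(self.hits[:n]) / self.total


tracker = SliceLocality()
A, B = (0x100, 0x108, 0x10C), (0x0F0, 0x108, 0x10C)   # two made-up slice traces
for trace in [A, A, B, A, B, B]:
    tracker.record_mispredict(0x110, trace)
```

With this made-up sequence, two of the six mispredictions repeat the most recent slice and two more repeat the second most recent one, so `tracker.locality(2)` is larger than `tracker.locality(1)`.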
Measuring Slice Locality
• All results are weighted averages.
• The result for each static branch is weighted proportionally to the number of times the operation-based predictor mispredicted it.
• This emphasizes the characteristics of the branches that cause the most mispredictions.
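The weighting amounts to a misprediction-weighted mean over static branches. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def weighted_locality(per_branch):
    """Misprediction-weighted average of per-branch locality.

    per_branch: iterable of (locality, mispredictions) pairs, one per static
    branch. The pair layout is an assumption made for this example.
    """
    pairs = list(per_branch)
    total = sum(count for _, count in pairs)
    if total == 0:
        return 0.0
    return sum(loc * count for loc, count in pairs) / total


# Made-up numbers: one branch with many mispredictions and high locality,
# one branch with few mispredictions and low locality.
example = [(0.9, 900), (0.1, 100)]
weighted = weighted_locality(example)
unweighted = sum(loc for loc, _ in example) / len(example)
```

In this example the weighted result sits close to the locality of the frequently mispredicted branch, whereas the plain mean treats both branches equally.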
Unrestricted Slices: 32UM
[Figure: slice locality by benchmark for 32UM (gcc, equake, ammp and bzip labelled), with an arrow marking the direction of better locality]
Saving ONE slice captures most of the locality.
Restricted vs. Unrestricted
[Figure: slice locality by benchmark, 32UM vs. 32RM (gcc, equake, ammp and bzip labelled), with an arrow marking the direction of better locality]
Most slices have few instructions.
Effect of Memory Dependence
[Figure: slice locality by benchmark, 64R vs. 64RM (gcc, equake, ammp and bzip labelled), with an arrow marking the direction of better locality]
Tracking memory dependence does not affect locality much.
Window Size
[Figure: slice locality by benchmark for 32RM, 64RM, 128RM and 256RM (gcc, equake, ammp and bzip labelled), with an arrow marking the direction of better locality]
Locality is good even for large windows.
Effect of Selection Context
[Figure: slice locality by benchmark for 128RM, slices selected "On Mispredict" vs. "Always" (gcc, equake, ammp and bzip labelled), with an arrow marking the direction of better locality]
Focusing on mispredictions improves locality.
Idealized Predictor
[Figure label: Lead PC]
• Spawn and execute instantaneously when the lead operation is encountered.
• Store up to 4 slice traces per lead operation.
Idealized Predictor
• Match operations and register dependencies as instructions are fetched.
• After matching there is usually only one prediction per target, if any (>80% of the time).
• Tie-breaker #1: longest lead-target distance.
• Tie-breaker #2: most recently detected slice.
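The two tie-breakers can be expressed as a single ordered comparison. The sketch below is illustrative only: the `Candidate` record, its field names, and the use of an integer timestamp for "most recently detected" are assumptions, not details given in the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical matched slice offering a prediction for one target branch."""
    prediction: bool            # predicted outcome (True means taken)
    lead_target_distance: int   # instructions between lead and target
    last_detected: int          # larger means detected more recently


def choose_prediction(candidates: List[Candidate]) -> Optional[bool]:
    """Pick one prediction per target, or None if no slice matched."""
    if not candidates:
        return None
    # Tie-breaker #1: longest lead-target distance.
    # Tie-breaker #2: most recently detected slice.
    best = max(candidates, key=lambda c: (c.lead_target_distance, c.last_detected))
    return best.prediction


matched = [
    Candidate(prediction=False, lead_target_distance=12, last_detected=40),
    Candidate(prediction=True, lead_target_distance=20, last_detected=10),
    Candidate(prediction=False, lead_target_distance=20, last_detected=5),
]
chosen = choose_prediction(matched)
```

Here the two candidates with distance 20 beat the one with distance 12, and the more recently detected of the two supplies the prediction.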
Correcting Mispredictions
[Figure: mispredicted branches corrected, by benchmark, for 32RM, 64RM and 128RM (gcc, equake, ammp and bzip labelled)]
High coverage of mispredicted branches.
Interaction with Outcome-Based Predictor
[Figure: interaction with the outcome-based predictor, by benchmark, for 32RM, 64RM and 128RM (gcc, equake, ammp and bzip labelled)]
Very little destructive interference.
Summary
• Slice locality for mispredicted branches
  • average of 70% for restricted slices on a 64-entry window following load-store dependencies (12 SPEC2000 benchmarks)
• Accuracy of idealized predictor
  • 74% of mispredicted branches eliminated
Conclusion
• First work that looks at the fundamental program behaviour, slice locality, that would facilitate predicting slice traces to pre-execute outcomes.
• SPEC2000 benchmarks show very high slice locality for mispredicted branches.