Cache-Miss Prediction Mostly No Machine (MNM) Robert Kenney, Kai Ting, Ezra Harrington
Just Say No: Benefits of Early Cache Miss Determination • Memik, Reinman, and Mangione-Smith • 2003 HPCA
Outline • Motivation • MNM Overview • Details and Analysis of MNMs • Replacement MNM (RMNM) • Common-Address MNM (CMNM) • Simulation Environment • Simulation Results • Additional Experiments
Motivation for Cache Miss Prediction • Clock speed increases => Memory latency more harmful • Levels of cache increasing • Predicting misses results in fewer cache accesses
Our Motivation • Verify MNM results presented in paper • Miss coverage • CPI reductions • Study benefits of MNM on 3 levels of cache • Study nature of cache miss prediction
MNM Operation • Stores information about current or previous cache contents • Produces a “Miss” or “Maybe” for each cache level • Never produces a false “Miss”, since skipping a cache that holds the data would force an unnecessary access to a slower cache
MNM Operation (cont) • Separate MNM for each level of cache contained in one module • No MNM for L1 • MNM accessed in parallel with L1, or • MNM accessed after miss in L1 (saves power) • MNM produces information about which caches to access
MNM Operation (cont) • When a cache level is predicted “Miss”, it is skipped • The next level marked “Maybe” is searched • If that access misses, the next “Maybe” level is searched, etc. • Retrieved data is written to the bypassed caches
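The lookup flow on the MNM Operation slides can be sketched in a few lines of Python. This is a toy model, not the simulator code: `CacheLevel`, `access`, and `predict_miss` are hypothetical names, each cache is an unbounded set of block addresses, the MNM's own access latency is ignored, and the L3 and memory latencies are assumed values.

```python
class CacheLevel:
    """Toy cache level: an unbounded set of resident block addresses."""
    def __init__(self, name, latency):
        self.name = name
        self.latency = latency   # hit/probe latency in cycles
        self.blocks = set()


def access(addr, levels, predict_miss, memory_latency=100):
    """Walk the hierarchy, skipping levels the MNM flags as a sure miss.

    predict_miss(i, addr) returns True for "Miss" and False for "Maybe".
    Level 0 (L1) is always probed because it has no MNM.
    Returns (total_cycles, names_of_probed_levels).
    """
    cycles = 0
    probed = []
    to_fill = []                       # levels that missed or were bypassed
    for i, level in enumerate(levels):
        if i > 0 and predict_miss(i, addr):
            to_fill.append(level)      # "Miss": skip this level entirely
            continue
        cycles += level.latency        # "Maybe": pay for the probe
        probed.append(level.name)
        if addr in level.blocks:
            break                      # hit, stop walking
        to_fill.append(level)
    else:
        cycles += memory_latency       # no level hit, go to main memory
        probed.append("memory")
    for level in to_fill:
        level.blocks.add(addr)         # write data into missed/bypassed caches
    return cycles, probed


# Usage: latencies for L1 and L2 follow the 2- and 8-cycle figures quoted
# later in the slides; the L3 latency is an assumption.
levels = [CacheLevel("L1", 2), CacheLevel("L2", 8), CacheLevel("L3", 20)]
perfect = lambda i, a: a not in levels[i].blocks   # idealised predictor
print(access(0x40, levels, perfect))
```

With the idealised predictor, a cold access probes only L1 before going to memory; with a predictor that always answers “Maybe”, every level is probed in turn.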
Replacement MNM Operation • Contains information about previous cache contents • Cache of addresses • Address of replaced block cached in RMNM • Incoming block is invalidated in RMNM, if necessary
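A minimal sketch of the RMNM bookkeeping described above, for one cache level. The class and method names are hypothetical, and the structure is simplified to a fully associative table with oldest-first replacement; the slides do not specify the RMNM's internal organisation or replacement policy.

```python
from collections import OrderedDict


class ReplacementMNM:
    """Remembers recently replaced block addresses for one cache level."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.replaced = OrderedDict()   # block address -> True, oldest first

    def on_evict(self, block_addr):
        """The cache replaced this block: record its address."""
        self.replaced[block_addr] = True
        self.replaced.move_to_end(block_addr)
        if len(self.replaced) > self.capacity:
            # Dropping an entry only loses coverage; it cannot cause a
            # false "Miss", because unknown addresses answer "Maybe".
            self.replaced.popitem(last=False)

    def on_fill(self, block_addr):
        """The block re-entered the cache: invalidate its RMNM entry."""
        self.replaced.pop(block_addr, None)

    def predict(self, block_addr):
        return "Miss" if block_addr in self.replaced else "Maybe"
```

Because an address is only ever reported as “Miss” while it sits in the table, and it is removed as soon as the block is brought back, this scheme cannot cover compulsory misses: a block that has never been in the cache has never been replaced.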
Common-Address MNM • Uses spatial locality of accesses to improve miss prediction. • Two-level prediction scheme • Virtual-tag finder • Virtual tag registers with masks • Table of saturating counters • Indexed by {index of VT reg | N index bits} • Counters reset to zero on cache flushes
Common-Address MNM (cont) • On access to CMNM: • Two ways to predict a “miss” • No match in virtual-tag finder • Table entry is “000” • On update to cache and CMNM: • Masks reduced until match found • Counter incremented when data added to cache • Counter decremented when data evicted from cache
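The two-level CMNM lookup can be illustrated with a deliberately simplified sketch. Several things here are assumptions rather than the paper's design: the names are hypothetical, virtual tags are fixed-width upper address bits (mask reduction is not modelled), a full register file switches unmatched addresses to “Maybe”, and saturated counters are made sticky so that a “Miss” is never reported for a resident block.

```python
class CommonAddressMNM:
    """Simplified CMNM: virtual-tag registers plus a counter table."""

    COUNTER_MAX = 7   # 3-bit saturating counter; 0 corresponds to "000"

    def __init__(self, num_regs=8, index_bits=10, tag_shift=16):
        self.num_regs = num_regs
        self.index_bits = index_bits
        self.tag_shift = tag_shift      # bits below this are not in the tag
        self.flush()

    def flush(self):
        """Cache flush: forget all tags and reset every counter to zero."""
        self.tags = []                  # virtual-tag registers
        self.overflowed = False         # True once a tag could not be stored
        self.counters = [0] * (self.num_regs << self.index_bits)

    def _slot(self, block_addr):
        """Return the counter index {register | N index bits}, or None."""
        tag = block_addr >> self.tag_shift
        if tag not in self.tags:
            return None
        low = block_addr & ((1 << self.index_bits) - 1)
        return (self.tags.index(tag) << self.index_bits) | low

    def predict(self, block_addr):
        slot = self._slot(block_addr)
        if slot is None:                # no match in the virtual-tag finder
            return "Maybe" if self.overflowed else "Miss"
        return "Miss" if self.counters[slot] == 0 else "Maybe"

    def on_fill(self, block_addr):
        tag = block_addr >> self.tag_shift
        if tag not in self.tags:
            if len(self.tags) == self.num_regs:
                self.overflowed = True  # untracked region: stay conservative
                return
            self.tags.append(tag)
        slot = self._slot(block_addr)
        if self.counters[slot] < self.COUNTER_MAX:
            self.counters[slot] += 1

    def on_evict(self, block_addr):
        slot = self._slot(block_addr)
        # Saturated counters are sticky: the true count is unknown, so
        # decrementing could eventually produce a false "Miss".
        if slot is not None and 0 < self.counters[slot] < self.COUNTER_MAX:
            self.counters[slot] -= 1
```

Unlike the RMNM, this structure answers “Miss” for regions the cache has never touched, which is consistent with the later observation that CMNM copes better with compulsory misses.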
Perfect MNM • 100% coverage • Ideal performance gain obtainable by cache miss prediction
Simulation Environment • Used SimpleScalar to simulate MNMs • Modified sim-cache and sim-outorder to handle up to 5 levels of cache • Implemented three MNM modules • Recreated the simulation environment used in the paper as closely as we could • Six benchmarks run • Four integer • Two floating point
Quantifying MNM Benefits • Coverage • Misses predicted / Total misses • Cycle Savings
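The coverage metric is simple enough to state as code. The helper below is a hypothetical sketch: it takes the MNM's per-access answers and the cache's actual outcomes, and it also checks the safety property from the MNM Operation slides (no “Miss” prediction on an access that actually hits).

```python
def coverage(predictions, actual_hits):
    """Fraction of real misses that the MNM predicted as "Miss".

    predictions: sequence of "Miss" / "Maybe" strings, one per access.
    actual_hits: sequence of bools, True where the cache actually hit.
    """
    total_misses = 0
    predicted_misses = 0
    for prediction, hit in zip(predictions, actual_hits):
        if prediction == "Miss" and hit:
            raise ValueError("false Miss: predictor skipped a resident block")
        if not hit:
            total_misses += 1
            if prediction == "Miss":
                predicted_misses += 1
    # With no misses at all there is nothing to cover; report 0.0.
    return predicted_misses / total_misses if total_misses else 0.0
```

A predictor that always answers “Maybe” scores 0, and the Perfect MNM scores 1 by definition.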
Paper Critiques • Not enough information in paper to reproduce results • Fast-forward info not stated • Latency of MNM access was uniform • RMNM invalidations with updates • CMNM update can be cumbersome and variable • Cache update latency info not stated • Little in-depth analysis of nature and characteristics of MNM • Left out cases when performance degraded
Saturation Study • Ran gcc and mesa with CMNM_8_10 and RMNM_2048_4 for varying numbers of simulated instructions (no flushes) • CMNM handles compulsory and capacity cache misses better than RMNM
Performance gain vs. MNM access latency • Measured benefits of miss prediction vs. cost of predicting for equake • DL1 hit latency: 2 cycles, DL2 hit latency: 8 cycles • PMNM: 100% coverage
Coverage vs. Cache Size & Associativity • One MNM for DL2 only • Varied properties of DL2 • MNM and DL1 remained constant • Sims run on equake • RMNM works best for low associativity (Conflict Cache Misses) • CMNM handles different associativities better
Summary • Mostly No Machine predicts cache misses • Replacement MNM • Common-Address MNM • Installation in SimpleScalar • Results • Analyses