
Cache-Miss Prediction

Cache-Miss Prediction. Mostly No Machine (MNM). Robert Kenney, Kai Ting, Ezra Harrington. Just Say No: Benefits of Early Cache Miss Determination. Memik, Reinman, and Mangione-Smith 2003 HPCA. Outline. Motivation MNM Overview Details and Analysis of MNMs Replacement MNM (RMNM)



Presentation Transcript


  1. Cache-Miss Prediction Mostly No Machine (MNM) Robert Kenney, Kai Ting, Ezra Harrington

  2. Just Say No: Benefits of Early Cache Miss Determination • Memik, Reinman, and Mangione-Smith • 2003 HPCA

  3. Outline • Motivation • MNM Overview • Details and Analysis of MNMs • Replacement MNM (RMNM) • Common-Address MNM (CMNM) • Simulation Environment • Simulation Results • Additional Experiments

  4. Motivation for Cache Miss Prediction • Clock speed increases => Memory latency more harmful • Levels of cache increasing • Predicting misses results in fewer cache accesses

  5. Our Motivation • Verify MNM results presented in paper • Miss coverage • CPI reductions • Study benefits of MNM on 3 levels of cache • Study nature of cache miss prediction

  6. MNM Operation • Stores information about current or previous cache contents • Produces a “Miss” or “Maybe” for each cache level • Never produces a false “Miss” • A false “Miss” would skip a cache that holds the data, resulting in an unnecessary access to a slower cache

  7. MNM Operation (cont) • Separate MNM for each level of cache contained in one module • No MNM for L1 • MNM accessed in parallel with L1, or • MNM accessed after miss in L1 (saves power) • MNM produces information about which caches to access

  8. MNM Operation (cont) • When a cache level is skipped • The next level marked “Maybe” is searched • If that search misses, the next “Maybe” level is searched, etc. • Retrieved data is written to the bypassed cache
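The lookup flow on slides 6 to 8 can be sketched roughly as follows. This is a minimal illustration, not the SimpleScalar code used in the project: the function name `lookup`, the use of plain Python sets as caches, and the absence of any eviction are all assumptions made for brevity.

```python
def lookup(addr, caches, mnm_says_miss):
    """Walk the hierarchy, skipping levels the MNM marks as a definite miss.

    caches        : list of sets of block addresses, index 0 = L1 (hypothetical model)
    mnm_says_miss : list of booleans, one per level; True = "Miss", False = "Maybe".
                    Entry 0 should always be False, since there is no MNM for L1.
    Returns (hit_level, probed), where hit_level is None if main memory was used.
    """
    probed = []
    hit_level = None
    for level, cache in enumerate(caches):
        if mnm_says_miss[level]:
            continue  # definite miss: bypass this level entirely
        probed.append(level)
        if addr in cache:
            hit_level = level
            break
    # Write the retrieved block into every level above the one that supplied it,
    # including levels that were bypassed (no eviction is modelled here).
    upper = range(len(caches)) if hit_level is None else range(hit_level)
    for level in upper:
        caches[level].add(addr)
    return hit_level, probed


caches = [set(), {0x40}, {0x40, 0x80}]
print(lookup(0x80, caches, [False, True, False]))
```

In this example the second level is never probed because its verdict is "Miss", yet it still receives the block once the third level supplies it.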

  9. Replacement MNM Operation • Contains information about previous cache contents • Cache of addresses • Address of replaced block cached in RMNM • Incoming block is invalidated in RMNM, if necessary
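A rough sketch of the RMNM bookkeeping described above. The class name `ReplacementMNM`, the fully associative FIFO organisation, and the method names are assumptions for illustration only; a configuration name such as RMNM_2048_4 suggests the real structure is set-associative.

```python
from collections import OrderedDict


class ReplacementMNM:
    """Hypothetical RMNM: remembers block addresses recently replaced in one cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.replaced = OrderedDict()  # block address -> True, oldest first

    def on_evict(self, block):
        # The cache just replaced this block, so it is known to be absent.
        self.replaced.pop(block, None)
        self.replaced[block] = True
        if len(self.replaced) > self.capacity:
            # Dropping an entry is safe: it only turns a "Miss" into a "Maybe".
            self.replaced.popitem(last=False)

    def on_fill(self, block):
        # The block is entering the cache again, so invalidate it here if present.
        self.replaced.pop(block, None)

    def predict(self, block):
        return "Miss" if block in self.replaced else "Maybe"
```

Because an entry exists only between a block's eviction and its next fill, a "Miss" verdict from this sketch is never false. Blocks that were never cached get "Maybe", so compulsory misses are not covered.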

  10. Common-Address MNM • Uses spatial locality of accesses to improve miss prediction. • Two-level prediction scheme • Virtual-tag finder • Virtual tag registers with masks • Table of saturating counters • Indexed by {index of VT reg | N index bits} • Counters reset to zero on cache flushes

  11. Common-Address MNM (cont)

  12. Common-Address MNM (cont) • On access to CMNM: • Two ways to predict a “miss” • No match in VT finder • Table entry is “000” • On update to cache and CMNM: • Masks reduced until match found • Counter incremented when data added to cache • Counter decremented when data evicted from cache
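The counter-table half of the CMNM can be sketched as below. This is a simplified illustration under several assumptions: the virtual-tag finder and its masks are omitted, the table is indexed by the low bits of the block address alone, counters are 3 bits wide (suggested by the “000” entry), and a saturated counter is treated as sticky so that a “Miss” is never reported falsely. The class and method names are hypothetical.

```python
class CounterTable:
    """Hypothetical CMNM counter table (virtual-tag finder not modelled)."""

    def __init__(self, index_bits, counter_bits=3):
        self.index_mask = (1 << index_bits) - 1
        self.max_count = (1 << counter_bits) - 1
        self.counters = [0] * (1 << index_bits)

    def _index(self, block):
        # Assumption: index with the low address bits only.
        return block & self.index_mask

    def on_fill(self, block):
        i = self._index(block)
        if self.counters[i] < self.max_count:
            self.counters[i] += 1

    def on_evict(self, block):
        i = self._index(block)
        # Assumption: once saturated, the true count is unknown, so the
        # counter stays saturated until the next flush.
        if 0 < self.counters[i] < self.max_count:
            self.counters[i] -= 1

    def on_flush(self):
        # Counters reset to zero on cache flushes.
        self.counters = [0] * len(self.counters)

    def predict(self, block):
        return "Miss" if self.counters[self._index(block)] == 0 else "Maybe"
```

A zero counter means no cached block maps to that entry, so the access must miss. Unlike a table of replaced addresses, this also covers blocks that have never been cached.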

  13. Perfect MNM • 100% coverage • Ideal performance gain obtainable by cache miss prediction

  14. Simulation Environment • Used SimpleScalar to simulate MNMs • Modified sim-cache and sim-outorder to handle up to 5 levels of cache • Implemented three MNM modules • Recreated exact simulation environment used in paper…as best we could • Six benchmarks run • Four integer • Two floating point

  15. Quantifying MNM Benefits • Coverage • Misses predicted / Total misses • Cycle Savings
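The coverage metric is a simple ratio. A minimal sketch follows; the function name and the choice to return 0.0 when there are no misses are assumptions.

```python
def coverage(predicted_misses, total_misses):
    """Fraction of a cache's misses that the MNM identified in advance."""
    if total_misses == 0:
        return 0.0  # assumption: report zero coverage when nothing missed
    return predicted_misses / total_misses


# Illustrative numbers only, not results from the study.
print(coverage(30, 120))
```

A perfect MNM, which predicts every miss, would score 1.0 on this metric.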

  16. RMNM Results

  17. CMNM Results

  18. Paper Critiques • Not enough information in paper to reproduce results • Fast-forward info not stated • Latency of MNM access was uniform • RMNM invalidations with updates • CMNM update can be cumbersome and variable • Cache update latency info not stated • Little in-depth analysis of nature and characteristics of MNM • Left out cases when performance degraded

  19. Did we match his results?

  20. Did we match his results?

  21. Did we match his results?

  22. Saturation Study • Ran gcc and mesa with CMNM_8_10 and RMNM_2048_4 for varying instruction lengths (no flushes) • CMNM handles compulsory and capacity cache misses better than RMNM

  23. Performance gain vs. MNM access latency • Measured benefits of miss prediction vs. cost of predicting for equake • DL1 hit latency: 2 cycles, DL2 hit latency: 8 cycles • PMNM: 100% coverage

  24. Coverage vs. Cache Size & Associativity • One MNM for DL2 only • Varied properties of DL2 • MNM and DL1 remained constant • Sims run on equake • RMNM works best for low associativity (Conflict Cache Misses) • CMNM handles different associativities better

  25. Summary • Mostly No Machine predicts cache misses • Replacement MNM • Common-Address MNM • Installation in SimpleScalar • Results • Analyses

  26. Questions?
