
Exploiting Spatial Locality in Data Caches using Spatial Footprints



Presentation Transcript


  1. Exploiting Spatial Locality in Data Caches using Spatial Footprints
     • Sanjeev Kumar, Princeton University
     • Christopher Wilkerson, MRL, Intel

  2. Spatial Locality in Caches
     • Current approach: exploit spatial locality within a cache line
     • Small cache line
        • Lower bandwidth
        • Less pollution
     • Big cache line
        • Exploits more locality
        • Fewer tags
     • Current caches use 32-byte lines
     • (Diagram: a memory access falling within a cache line)

  3. Spatial Locality in Caches, contd.
     • The spatial locality exhibited varies
        • Across applications
        • Within an application
     • Using a fixed-size line to exploit locality
        • Inefficient use of cache resources
        • Less than half the data fetched is used
        • Wastes bandwidth, pollutes the cache
        • Limited benefit of spatial locality

  4. Outline
     • Introduction
     • Spatial Footprint Predictors
     • Practical Considerations
     • Future Work
     • Related Work
     • Conclusions

  5. Spatial Footprint Predictor (SFP)
     • Exploit more spatial locality
     • Need to reduce pollution
        • Fetch words selectively
     • Requires accurately predicting the spatial footprint of a block
        • Bit vector, e.g. 0100 1011 0010 0010
     • (Diagram: an access to memory, annotated with the footprint 0100101100100010)
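
One way to make the bit-vector idea concrete is the minimal Python sketch below. It assumes a 16-entry footprint (matching the 16-digit vector on the slide) and assumes that the leftmost digit corresponds to offset 0; the slide does not state the bit order, and the function names are illustrative only.

```python
FOOTPRINT_BITS = 16  # assumption: one bit per fetchable unit, as in the slide's 16-digit vector


def footprint_from_offsets(offsets):
    """Return an integer with bit i set for every offset i that was used."""
    footprint = 0
    for offset in offsets:
        if not 0 <= offset < FOOTPRINT_BITS:
            raise ValueError(f"offset {offset} is outside the block")
        footprint |= 1 << offset
    return footprint


def format_footprint(footprint):
    """Render the footprint as a digit string; leftmost digit is offset 0 (assumed order)."""
    return "".join("1" if (footprint >> i) & 1 else "0" for i in range(FOOTPRINT_BITS))


def offsets_in_footprint(footprint):
    """List the offsets whose bits are set, i.e. the units worth fetching."""
    return [i for i in range(FOOTPRINT_BITS) if (footprint >> i) & 1]
```

Under that assumed ordering, the slide's vector 0100 1011 0010 0010 corresponds to offsets 1, 4, 6, 7, 10 and 14.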

  6. Spatial Footprint Predictor, contd.
     • Record spatial footprints
     • Use footprint history to make predictions
     • Lookup table based on
        • Nominating data address
        • Nominating instruction address
        • Combination
     • L1 data caches
     • (Diagram: a nominating access (NA) to memory, annotated with the footprint 0100101100100010)
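
The three lookup options can be sketched as key functions for a history table. This is a hedged illustration, not the authors' implementation: the function names are hypothetical, the 8-byte line and 128-byte sector sizes are borrowed from the "Our Approach" slide, and the way the "combination" key joins the instruction address with the line's offset inside its sector is an assumption.

```python
LINE_BYTES = 8      # from the "Our Approach" slide
SECTOR_BYTES = 128  # 16 lines of 8 bytes


def key_data_address(data_addr, instr_addr):
    """Key on the nominating data address, at line granularity."""
    return data_addr // LINE_BYTES


def key_instruction_address(data_addr, instr_addr):
    """Key on the address of the instruction that made the nominating access."""
    return instr_addr


def key_combination(data_addr, instr_addr):
    """Assumed combination: instruction address plus the line's offset in its sector."""
    line_offset = (data_addr % SECTOR_BYTES) // LINE_BYTES
    return (instr_addr, line_offset)


# A history table is then just a mapping from key to the last recorded footprint.
history = {}
history[key_combination(0x1048, 0x400)] = 0b0100101100100010  # arbitrary example value
```

With the combination key, an access from the same instruction to the same line offset in a different sector (for example data address 0x10C8) maps to the same table entry, which is what lets history from one sector predict another.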

  7. Simple Approach
     • Use large cache lines
     • Fetch specific words
        • Leave holes for words that were not fetched
     • Might decrease bandwidth
     • Increases miss ratio
        • Missed lines
        • Under-utilization of the cache
     • (Diagram: cache line, cache, memory)

  8. Our Approach
     • Regular cache with small lines
        • 8 bytes, i.e. one word
     • Exploit spatial locality at sector granularity
        • 16 lines, i.e. 128 bytes
     • Spatial Footprint Predictor
        • Fetch 1 to 16 lines in a sector on a miss
     • (Diagram: lines within a sector; cache and memory)
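
The line and sector sizes above translate into simple address arithmetic. The sketch below is illustrative only: the helper names are hypothetical, bit i of the footprint is assumed to stand for line i of the sector, and the sketch assumes the line that missed is always fetched even if the prediction omits it.

```python
LINE_BYTES = 8
LINES_PER_SECTOR = 16
SECTOR_BYTES = LINE_BYTES * LINES_PER_SECTOR  # 128 bytes


def split_address(addr):
    """Split a byte address into (sector number, line within sector, byte within line)."""
    sector = addr // SECTOR_BYTES
    line = (addr % SECTOR_BYTES) // LINE_BYTES
    byte = addr % LINE_BYTES
    return sector, line, byte


def lines_to_fetch(miss_addr, predicted_footprint):
    """Return the starting addresses of the 1 to 16 lines to fetch for a miss."""
    sector, miss_line, _ = split_address(miss_addr)
    # Assumption: the missed line is fetched regardless of the prediction.
    footprint = predicted_footprint | (1 << miss_line)
    base = sector * SECTOR_BYTES
    return [
        base + i * LINE_BYTES
        for i in range(LINES_PER_SECTOR)
        if (footprint >> i) & 1
    ]
```

For example, a miss at address 0x1234 falls in sector 36, line 6; with an empty prediction only that one line (starting at address 4656) is fetched, and with a full footprint all 16 lines of the sector are fetched.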

  9. When to Record/Predict Footprints?
     • Sectors in memory are active or inactive
     • Active: record footprints
        • Entered on a cache miss in an inactive sector
     • Inactive: use history to predict
        • Entered on a cache miss on a line that is marked used (in the footprint) in an active sector
     • Use infinite-size tables

  10. Recording Footprints
      • (Diagram: an access to memory; the cache; the Active Sector Table, which records footprints; and the Spatial Footprint History Table, which stores footprints. Table columns are labeled NA and SF; the example footprint is 0100101100100010.)

  11. Predicting Footprints
      • (Diagram: on an access, the Spatial Footprint History Table supplies a predicted footprint, e.g. 0100101100100010, and the cache fetches the corresponding lines from memory.)
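
Slides 9 to 11 can be read together as a small state machine per sector. The Python sketch below is one possible reading, not the authors' design: the class and method names are hypothetical, both tables are unbounded dictionaries (matching "infinite size tables"), the caller is assumed to report whether each access missed in the cache, and the choice to treat the miss that ends a recording as the nominating access of the next recording is an assumption.

```python
class SpatialFootprintSketch:
    """Toy model of an Active Sector Table plus a Spatial Footprint History Table."""

    def __init__(self):
        self.active = {}   # sector -> (nominating key, footprint recorded so far)
        self.history = {}  # key -> most recently completed footprint

    def access(self, sector, line, key, miss):
        """Process one access.

        Returns the predicted footprint when this access is a miss in an
        inactive sector and history exists for `key`; otherwise returns None.
        """
        bit = 1 << line
        entry = self.active.get(sector)
        if entry is not None:
            nominating_key, footprint = entry
            if miss and footprint & bit:
                # A line already marked used missed again: recording is done.
                self.history[nominating_key] = footprint
                del self.active[sector]
            else:
                # Sector stays active: keep recording which lines are used.
                self.active[sector] = (nominating_key, footprint | bit)
                return None
        if miss:
            # Miss in an inactive sector: predict from history, start recording.
            prediction = self.history.get(key)
            self.active[sector] = (key, bit)
            return prediction
        return None
```

The sketch does not model what is fetched for a miss on an unmarked line of an active sector, since the slides do not say; it only records the line as used.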

  12. The Default Footprint Predictor
      • Used when the SFP has no prediction
         • No history
         • Evicted from the Spatial Footprint Table
      • Picks a single line size for the application
         • Based on the footprints observed

  13. Experimental Setup
      • Cache parameters
         • 16 KB L1
         • 4-way associative
         • 8 bytes per line
         • 16 lines per sector
      • Cache simulator
         • Miss ratios
         • Fetch bandwidth
      • 12 Intel MRL traces
         • gcc and go (SPEC)
         • Transaction processing
         • Web server
         • PC applications: word processors, spreadsheets
      • Normalized results
         • Relative to a 16 KB conventional cache with 32-byte lines
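
The stated parameters imply the cache sizes below. This is a small arithmetic sketch with hypothetical helper names; it assumes conventional set-associative organization (sets = lines / ways), and `normalize` simply expresses a measurement relative to the baseline cache, which is how "normalized results" is read here.

```python
def cache_geometry(cache_bytes, line_bytes, ways):
    """Return the number of lines and sets for a set-associative cache."""
    lines = cache_bytes // line_bytes
    return {"lines": lines, "sets": lines // ways}


def normalize(value, baseline):
    """Express a miss ratio or bandwidth figure relative to the baseline cache."""
    return value / baseline


sfp_cache = cache_geometry(16 * 1024, line_bytes=8, ways=4)   # simulated SFP cache
baseline = cache_geometry(16 * 1024, line_bytes=32, ways=4)   # conventional baseline
sectors_of_data = sfp_cache["lines"] // 16                    # 16 lines per sector
```

Under these assumptions the SFP cache holds 2048 eight-byte lines (128 sectors' worth of data), while the baseline holds 512 thirty-two-byte lines. A normalized value below 1.0 means the SFP cache did better than the baseline on that metric.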

  14. Experimental Evaluation
      • (Chart: normalized miss ratios)

  15. GCC Comparison
      • (Chart not included in the transcript)

  16. GCC Comparison, contd.
      • Comparing the SFP cache to
         • Conventional caches with varying line sizes
            • Comparable to the best miss ratio (using 64-byte lines)
            • Close to the lowest bandwidth (using 8-byte lines)
         • A bigger conventional cache
            • Comparable to a 32 KB cache

  17. Outline
      • Introduction
      • Spatial Footprint Predictors
      • Practical Considerations
      • Future Work
      • Related Work
      • Conclusions

  18. Decoupled Sectored Cache
      • Seznec et al.
         • Proposed to improve sectored L2 caches
      • Decouple the tag array from the data array
         • Dynamic mapping: no longer one-to-one
         • Multiple lines from the same sector share tags
      • Flexible: the data and tag arrays can have different sizes and associativities

  19. Practical Considerations
      • Reasonable Spatial Footprint History Table
         • 1024 entries
      • Reduce tag storage
         • Use a Decoupled Sectored Cache
         • Same number of tags as a conventional cache with 32-byte lines
         • Both the data and tag arrays are 4-way associative
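
A quick calculation shows why tag storage matters here. The sketch below is arithmetic only, with hypothetical names: it assumes that without decoupling every 8-byte line would need its own tag, and that each history-table entry holds one 16-bit footprint. It does not account for the history table's own tags or other state, so it is not a breakdown of any total cost figure.

```python
CACHE_BYTES = 16 * 1024


def tag_count(cache_bytes, bytes_covered_per_tag):
    """Number of tags needed when each tag covers the given number of data bytes."""
    return cache_bytes // bytes_covered_per_tag


tags_one_per_small_line = tag_count(CACHE_BYTES, 8)   # assumed: one tag per 8-byte line
tags_decoupled = tag_count(CACHE_BYTES, 32)           # same as a 32-byte-line cache
lines_per_tag = tags_one_per_small_line // tags_decoupled

HISTORY_ENTRIES = 1024
FOOTPRINT_BITS = 16                                   # assumed: one bit per line in a sector
footprint_storage_bytes = HISTORY_ENTRIES * FOOTPRINT_BITS // 8
```

Under these assumptions the decoupled design uses 512 tags instead of 2048, so on average four 8-byte lines share each tag, and the footprints in a 1024-entry table occupy 2048 bytes.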

  20. Experimental Evaluation
      • (Chart: normalized miss ratios)

  21. Cost
      • Additional space
         • 9 KB
         • Can be reduced by
            • Using partial tags
            • Compressing footprints
      • Time
         • Most predictor actions are off the critical path
         • Little impact on cache access latency

  22. Future Work
      • Improve miss ratios further
         • Infinite tables: 30%
         • Practical implementation: 18%
      • Reduce the additional memory required
      • Better coarse-grained predictor
      • L2 caches

  23. Related Work
      • Przybylski et al., Smith et al.
         • Statically best line size, fetch size
      • Gonzalez et al.
         • Dual data cache: temporal and spatial locality
         • Numeric codes
      • Johnson et al.
         • Dynamically pick line size per block (1 KB)

  24. Conclusions
      • Spatial Footprint Predictors
         • Decrease miss ratio (18% on average)
         • Reduce bandwidth usage
         • Little impact on cache access latency
      • Can use a fine-grain predictor for data caches
