Increasing the Cache Efficiency by Eliminating Noise Philip A. Marshall
Outline • Background • Motivation for Noise Prediction • Concepts of Noise Prediction • Implementation of Noise Prediction • Related Work • Prefetching • Data Profiling • Conclusion
Background • Cache Fetch • On a cache miss (demand fetch) • Prefetch • Exploiting Spatial Locality • Cache words are fetched in blocks • Neighboring block(s) are fetched on a cache miss • Results in fewer cache misses • But also fetches words that aren’t needed
Background • Cache noise • Words that are fetched into the cache but never used • Cache utilization • The fraction of words in the cache that are used • Represents how efficiently the cache is used
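As a toy illustration of the utilization metric (the numbers below are made up, not from the paper), utilization is simply the fraction of fetched words whose usage bits are set at eviction:

```python
# Toy illustration of cache utilization: the fraction of words fetched
# into the cache that are actually accessed before eviction.
# The usage bitmaps below are invented for illustration.

def utilization(usage_bits_per_block):
    """usage_bits_per_block: one list of 0/1 usage bits per cached block."""
    used = sum(sum(bits) for bits in usage_bits_per_block)
    fetched = sum(len(bits) for bits in usage_bits_per_block)
    return used / fetched

# Two 8-word blocks: 5 words used in the first, 2 in the second.
blocks = [[1, 1, 0, 1, 1, 1, 0, 0], [0, 1, 0, 0, 1, 0, 0, 0]]
print(utilization(blocks))  # 7 of 16 words used -> 0.4375
```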
Motivation for Noise Prediction • Level 1 data cache utilization is ~57% for SPEC2K benchmarks [2] • Fetching unused words: • Increases bandwidth requirements between cache levels • Increases hardware and power requirements • Wastes valuable cache space [2] D. Burger et al., “Memory Bandwidth Limitations of Future Microprocessors,” Proc. ISCA-23, 1996
Motivation for Noise Prediction • Cache block size • Larger blocks • Exploit spatial locality better • Reduce cache tag overhead • Increase bandwidth requirements • Smaller blocks • Reduced cache noise • Any block size results in suboptimal performance
Motivation for Noise Prediction • Sub-blocking • Only portions of each cache block are fetched • Decreases tag overhead by associating one tag with many sub-blocks • Fetched words must lie in contiguous, fixed-size sub-blocks • High miss rate and cache noise for non-contiguous access patterns
Motivation for Noise Prediction • By predicting which words will actually be used, cache noise can be reduced • But: • Fetching fewer words could increase the number of cache misses
Concepts of Noise Prediction • Selective fetching • For each block, fetch only the words that are predicted to be accessed • If no prediction is available, fetch the entire block • Uses a valid bit and a usage bit for each word to track which words have been fetched and which have been used
Concepts of Noise Prediction • Cache Noise Predictors • Phase Context Predictor (PCP) • Based on the usage pattern of the most recently evicted block • Memory Context Predictor (MCP) • Based on the MSBs of the memory address • Code Context Predictor (CCP) • Based on the MSBs of the PC
Concepts of Noise Prediction • Prediction table size • Larger tables decrease the probability of “no predictions” • Smaller tables use less power • A prediction is considered successful if all the needed words are fetched • If extra words are fetched, still considered a success
Concepts of Noise Prediction • Improving Prediction • Miss Initiator Based History (MIBH) • Keep separate histories according to which word in the block caused the miss • Improves predictability if relative position of words accessed is fixed • Example: looping through a struct and accessing only one field
Concepts of Noise Prediction • Improving Prediction • OR-ing Previous Two Histories (OPTH) • Increases predictability by looking at more than the most recent access • Reduces cache utilization • OR-ing more than two accesses reduces utilization substantially
Results • Empirically, CCP provides the best results • MIBH greatly increases predictability • OPTH improves predictability only marginally while increasing cache noise • Cache utilization increased from 57% to 92%
Related Work • Existing work focuses on reducing cache misses, not on improving utilization • Sub-blocked caches are used mainly to decrease tag overhead • Some existing work predicts which sub-blocks to load in a sub-blocked cache • No existing techniques predict and fetch non-contiguous words
Prefetching • Prefetching improves the cache miss rate • Commonly, prefetching is implemented by also fetching the next block on a cache miss • Prefetching increases noise and increases bandwidth requirements
Prefetching • Noise prediction leads to more intelligent prefetching but requires extra hardware • On average, prefetching with noise prediction leads to less energy consumption • In the worst case, energy requirements increase
Data Profiling • For some benchmarks, few predictions are made • The predictor table is too small to hold all the word usage histories • Instead of increasing the table size, profile the data • Profiling increases the prediction rate by ~7% • Gains aren’t as high as expected
Analysis of Noise Prediction • Pros • Small increase in miss rate (0.1%) • Decreased power requirements in most cases • Decreased bandwidth requirements between cache levels • Adapts effective block size to access patterns • Dynamic technique, but profiling can be used • Scalable to different predictor sizes
Analysis of Noise Prediction • Cons • Increased hardware overhead • Increases power in the worst case • Not all programs benefit • Profiling provides limited improvement
Other Thoughts • How were the benchmarks chosen? • 6 of 12 integer and 8 of 14 floating-point SPEC2K benchmarks were used • Not all predictors were examined equally • A 22-bit MCP performed slightly worse than a 28-bit CCP • What about a 28-bit MCP? • How can the efficiency of the prediction table be increased?