Performance Evaluation of Cache Replacement Policies for the SPEC CPU2000 Benchmark Suite Hussein Al-Zoubi
Overview • Introduction • Common cache replacement policies • Experimental methodology • Evaluating cache replacement policies: questions and answers • Conclusion
Introduction • Increasing speed gap between processor and memory • Modern processors include multiple levels of cache, and cache associativity is increasing • Replacement policy: which block to discard when the cache is full
Introduction...cont. • Optimal Replacement (OPT) algorithm: replace the cache block whose next reference is farthest away in the future; infeasible in practice because it requires knowledge of future references • State-of-the-art processors employ various policies
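The OPT rule can be sketched offline, when the whole reference trace is known in advance. The following is a minimal illustration for a single fully associative set; the function name `opt_misses` and the trace format (a list of block identifiers) are assumptions made for this sketch, not part of the original study.

```python
def opt_misses(trace, ways):
    """Count misses under OPT for one fully associative set (illustrative sketch)."""
    never = float("inf")
    # next_use[i] = position of the next reference to trace[i], or infinity.
    next_use = [never] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], never)
        last_seen[trace[i]] = i

    cache = {}  # resident block -> position of its next reference
    misses = 0
    for i, block in enumerate(trace):
        if block not in cache:
            misses += 1
            if len(cache) >= ways:
                # Evict the block whose next reference is farthest in the future.
                victim = max(cache, key=cache.get)
                del cache[victim]
        cache[block] = next_use[i]
    return misses
```

Because the loop needs `next_use` for every reference, the policy can only serve as a lower bound in simulation, not as a hardware mechanism.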
Introduction...cont. • Random • LRU (Least Recently Used) • Round-robin (FIFO – First-In-First-Out) • PLRU (Pseudo Least Recently Used): reduces hardware cost by approximating the LRU mechanism
Introduction...cont. • Our goal: explore and evaluate common cache replacement policies • how existing policies relate to OPT • effect on instruction and data caches • how well pseudo-LRU techniques approximate true LRU
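As a point of reference for the policies listed above, true LRU can be sketched in a few lines for a single set. This is a software illustration only; the name `lru_misses` and the list-of-block-identifiers trace format are assumptions for this sketch, and real hardware tracks recency with per-set state bits rather than an ordered map.

```python
from collections import OrderedDict


def lru_misses(trace, ways):
    """Count misses under true LRU for one set (illustrative sketch)."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)  # hit: mark as most recently used
            continue
        misses += 1
        if len(cache) >= ways:
            cache.popitem(last=False)  # evict the least recently used block
        cache[block] = True
    return misses
```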
Common cache replacement policies…cont. • Random policy: simpler, but at the expense of performance. Implemented with a Linear Feedback Shift Register (LFSR) • Round Robin (or FIFO) replacement: replaces the oldest block in the cache. Implemented with a circular counter
Experimental methodology • sim-cache and sim-cheetah simulators • Alpha version of the SimpleScalar tool set • Original simulators modified to support additional pseudo-LRU replacement policies • sim-cache simulator modified to print interval statistics per specified number of instructions
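The idea behind interval statistics can be illustrated independently of the simulator: misses are bucketed by the instruction count at which they occur. The sketch below is not sim-cache code; the function name `interval_misses` and the `(instruction_number, is_miss)` input format are hypothetical.

```python
def interval_misses(accesses, interval):
    """Bucket misses by instruction interval (illustrative sketch).

    accesses: iterable of (instruction_number, is_miss) pairs, in program order.
    interval: number of instructions per reporting interval.
    """
    counts = []
    for instr, is_miss in accesses:
        bucket = instr // interval
        while len(counts) <= bucket:
            counts.append(0)  # pad intervals that had no recorded accesses
        counts[bucket] += int(is_miss)
    return counts
```

Comparing such per-interval counts for two policies shows whether one is better throughout the run or only in some program phases.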
Evaluating cache replacement policies: questions and answers • Q: How much associativity is enough for state-of-the-art benchmarks? • A: For the data cache, there is a performance gain in the transition from a direct-mapped to a two-way set-associative cache • For the instruction cache, the OPT replacement policy benefits from increased associativity. Realistic policies don't exploit more than 8 ways, or in some cases even more than 2 ways
Evaluating cache replacement policies: questions and answers…cont. • Q: How much space is there for improvement for each specific benchmark and cache configuration?
Evaluating cache replacement policies: questions and answers…cont. • Q: Do replacement policies behave differently for different types of memory references, such as instruction and data? • A: In general, LRU policy has better performance than FIFO and Random with some exceptions
Evaluating cache replacement policies: questions and answers…cont. • Q: Can dynamic change of replacement policy reduce the total number of cache misses? • A: If one policy is better than the other, it stays consistently better
Evaluating cache replacement policies: questions and answers…cont. • Q: Can we use most recently used information for cache way prediction?
Evaluating cache replacement policies: questions and answers…cont. • Q: How good are pseudo-LRU techniques at approximating true LRU? • A: PLRUm and PLRUt are very efficient in approximating the LRU policy and stay close to LRU during the whole program execution
Conclusion • Eliminating cache misses is extremely important for improving overall processor performance • Cache replacement policies gain more significance in set-associative caches • The gap between the LRU and OPT replacement policies is up to 50%; new research is needed to close it
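The slides do not define the two pseudo-LRU schemes, so the sketch below assumes the usual meaning of the names: PLRUm keeps one MRU bit per way, and PLRUt keeps a binary tree of direction bits per set. The class names, the `access` interface, and the requirement that the way count be a power of two (at least 2) are assumptions for this illustration.

```python
class PLRUmSet:
    """MRU-bit pseudo-LRU for one set (assumed scheme; needs ways >= 2)."""

    def __init__(self, ways):
        self.tags = [None] * ways
        self.mru = [0] * ways  # 1 = recently used

    def _touch(self, way):
        self.mru[way] = 1
        if all(self.mru):
            # All bits set: clear everything except the way just used.
            self.mru = [0] * len(self.mru)
            self.mru[way] = 1

    def access(self, tag):
        """Return True on a hit, False on a miss (the block is then filled)."""
        if tag in self.tags:
            self._touch(self.tags.index(tag))
            return True
        victim = self.mru.index(0)  # first way not marked recently used
        self.tags[victim] = tag
        self._touch(victim)
        return False


class PLRUtSet:
    """Tree-based pseudo-LRU for one set (assumed scheme; ways is a power of two)."""

    def __init__(self, ways):
        self.tags = [None] * ways
        # Heap-ordered tree of ways - 1 bits; 0 = victim is in the left subtree.
        self.bits = [0] * (ways - 1)

    def _victim(self):
        node = 0
        while node < len(self.bits):
            node = 2 * node + 1 + self.bits[node]
        return node - len(self.bits)

    def _touch(self, way):
        node = way + len(self.bits)
        while node > 0:
            parent = (node - 1) // 2
            # Point each ancestor away from the subtree that was just used.
            self.bits[parent] = 1 if node == 2 * parent + 1 else 0
            node = parent

    def access(self, tag):
        """Return True on a hit, False on a miss (the block is then filled)."""
        if tag in self.tags:
            self._touch(self.tags.index(tag))
            return True
        victim = self._victim()
        self.tags[victim] = tag
        self._touch(victim)
        return False
```

A 4-way set needs only 4 bits under PLRUm and 3 bits under PLRUt, compared with the full recency ordering that true LRU has to maintain.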