Learn how to improve cache performance by exploiting critical instructions and data placement techniques. This research explores energy-delay trade-offs and the benefits of classifying instructions as critical or non-critical. Discover the Hot-and-Cold microarchitecture approach and its impact on cache efficiency.
Hot-and-Cold: Using Criticality in the Design of Energy-Efficient Caches
Rajeev Balasubramonian, University of Utah
Viji Srinivasan, IBM T.J. Watson
Sandhya Dwarkadas, University of Rochester
Alper Buyuktosunoglu, IBM T.J. Watson
All Instructions are not Created Equal
• Critical instructions – lie on the program critical path
• Non-critical instructions – can be slowed without increasing execution time
• Potential to improve cache performance (?) [Srinivasan ’01] [Fisk ’99]
• Prioritization policies [Fields ’01] [Tune ’01]
• Energy-efficient ALUs [Seng ’01]
Energy-Delay Trade-Offs
• Example energy-delay trade-off techniques:
• Voltage scaling, transistor sizing, way prediction, serial-access
• Gated-ground cells, high Vt
[Figure panels: Transistor sizing; Variable threshold voltage]
Exploiting Criticality
• Design two static banks –
  hot bank: fast and high power
  cold bank: slow and low power
• Instructions have to be classified as critical or not, and data has to be placed in one of two banks
• Energy-efficient ALUs are easier to handle as there is no associated storage
Criticality Metric
• Oldest-N: The N oldest instructions in the queue are critical
• Younger instructions are likely to be on mispredicted paths or can tolerate latencies
• N can be varied based on program needs
• Minimal hardware overhead
• Behavior comparable to more complex metrics
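The Oldest-N rule on this slide can be sketched in a few lines. This is an illustrative sketch only, not the hardware described in the talk: the function name `classify_oldest_n`, the list-of-strings queue, and the sample instructions are all assumptions made for the example. The queue is assumed to be ordered oldest first.

```python
from collections import deque


def classify_oldest_n(issue_queue, n):
    """Label each queued instruction as critical (True) or not (False).

    Hypothetical helper: `issue_queue` is assumed to be ordered oldest
    first, and the n oldest entries are treated as critical.
    """
    return {instr: (pos < n) for pos, instr in enumerate(issue_queue)}


# Made-up queue contents, oldest entry first.
queue = deque(["ld r1", "add r2", "ld r3", "mul r4", "st r5"])
labels = classify_oldest_n(queue, n=2)

for instr, critical in labels.items():
    print(f"{instr}: {'critical' if critical else 'non-critical'}")
```

Raising or lowering `n` corresponds to the slide's point that N can be varied based on program needs.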
Data Classification
[Figure labels: Exclusively critical; Exclusively non-critical]
Hot-and-Cold Microarchitecture
[Block diagram components: Dispatch, Bank Predictor, Issue Queue, Hot bank, Cold bank, Criticality Counters, Placement Predictor, L2]
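The slide names criticality counters and a placement predictor but does not give their details. The sketch below shows one plausible way such components could interact, under loudly stated assumptions: a per-block saturating counter, a 2-bit width, a threshold of 2, and a default of the cold bank for unseen blocks are all hypothetical choices, not values from the talk.

```python
class CriticalityCounter:
    """Saturating counter per cache block (width is a hypothetical choice)."""

    def __init__(self, bits=2):
        self.max_value = (1 << bits) - 1
        self.value = 0

    def update(self, critical_access):
        # Count up on accesses by critical instructions, down otherwise.
        if critical_access:
            self.value = min(self.value + 1, self.max_value)
        else:
            self.value = max(self.value - 1, 0)

    def prefers_hot(self, threshold=2):
        # Threshold is an assumed tuning parameter.
        return self.value >= threshold


def choose_bank(counters, block_addr, threshold=2):
    """Pick 'hot' or 'cold' for a block; unseen blocks default to cold (assumption)."""
    counter = counters.get(block_addr)
    if counter is None:
        return "cold"
    return "hot" if counter.prefers_hot(threshold) else "cold"


# Made-up access trace: (block address, accessed by a critical instruction?)
counters = {}
for addr, critical in [(0x40, True), (0x40, True), (0x80, False), (0x40, True)]:
    counters.setdefault(addr, CriticalityCounter()).update(critical)

placement = {hex(a): choose_bank(counters, a) for a in (0x40, 0x80, 0xC0)}
print(placement)
```

In this sketch a block that is mostly touched by critical instructions drifts toward the hot bank, while blocks touched by non-critical instructions stay in the cold bank.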
Results Summary
• Bank mispredict rate of 9.5%
• Criticality mismatch rate of 26%
• Performance loss = 2.7% (data reorganization) + (0.8 × slowdown)
• L1 cache energy savings of 37%
Related Work
• Recent split-cache organization by Abella and Gonzalez [ICCD’03]
• Data allocation based on criticality of accessing instruction
[Figure labels: Base, Slow, Fast]
Conclusions
• Data and instruction classification is reasonably accurate
• Overhead from contention is non-trivial
• Results are worthwhile in limited settings
• The use of criticality for data cache reorganization yields little benefit