Reducing Cache Traffic and Energy with Macro Data Load
Lei Jin and Sangyeun Cho*
Dept. of Computer Science, University of Pittsburgh
Motivation
• Data cache access is a frequent event
  • 20~40% of all instructions access the data cache
• Data cache energy can be significant (~16% in the StrongARM chip [Montanaro et al. 1997])
• Reducing cache traffic leads to energy savings
• Existing approaches
  • Store-to-load forwarding
  • Load-to-load forwarding
  • Use available resources to keep data for reuse
    • LSQ [Nicolaescu et al. 2003]
    • Reorder buffer [Önder and Gupta 2001]
Macro Data Load (ML)
• Previous works are limited by exact data matching
  • Same address and same data type
• ML instead exploits spatial locality in cache-port-wide data (see the sketch below)
  • Accessing port-wide data is free
  • Naturally fits the datapath and LSQ width
  • Recent processors support 64 bits
  • Many accesses are narrower than 64 bits
[Figure: data forwarding without ML vs. with ML]
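A minimal C sketch of this matching rule, assuming little-endian byte order and loads that do not cross a doubleword boundary; the names (lsq_entry_t, ml_forward) are illustrative, not taken from the paper:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical LSQ entry holding a cache-port-wide (64-bit) datum.
 * The entry records the doubleword-aligned address, so any later
 * load that falls inside the same 8-byte block can be served from
 * the LSQ instead of the data cache. */
typedef struct {
    uint64_t dw_addr;  /* address with the low 3 bits cleared */
    uint64_t data;     /* full 64-bit, port-wide value        */
    bool     valid;
} lsq_entry_t;

/* Try to satisfy a narrow load (size = 1, 2, 4, or 8 bytes) from an
 * LSQ entry. With exact matching, addr and size must equal those of
 * the buffered access; with macro data load, only the enclosing
 * doubleword must match, and alignment logic extracts the sub-word.
 * (Sign extension for signed loads is omitted for brevity.) */
static bool ml_forward(const lsq_entry_t *e, uint64_t addr,
                       unsigned size, uint64_t *out)
{
    if (!e->valid || (addr & ~7ULL) != e->dw_addr)
        return false;                        /* different doubleword */

    unsigned offset  = (unsigned)(addr & 7ULL);   /* byte offset    */
    uint64_t shifted = e->data >> (offset * 8);   /* align to LSB   */
    uint64_t mask    = (size == 8) ? ~0ULL
                                   : ((1ULL << (size * 8)) - 1);
    *out = shifted & mask;                   /* zero-extended result */
    return true;
}
```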
ML Potential
[Charts: forwarding coverage for CINT2k, CFP2k, and MiBench]
• ML uncovers more forwarding opportunities
• ML is especially effective with limited resources
ML Implementation
• Architectural changes
  • Relocated data alignment logic
  • Sequential LSQ-cache access
• Net impact
  • The LSQ becomes a small fully associative cache with FIFO replacement (see the sketch after this list)
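Building on the sketch above, the net effect can be modeled as a tiny FIFO-replaced buffer probed before the data cache; LSQ_SIZE and dcache_read64 are assumed stand-ins, not the paper's parameters:

```c
#define LSQ_SIZE 16   /* assumed entry count; the actual LSQ size may differ */

typedef struct {
    lsq_entry_t entries[LSQ_SIZE];
    unsigned    head;              /* next FIFO slot to replace */
} lsq_t;

/* Hypothetical 64-bit read; stands in for the D-cache port. */
extern uint64_t dcache_read64(uint64_t dw_addr);

/* Sequential LSQ-then-cache access: probe every entry (fully
 * associative search); only on a miss read the data cache, then
 * install the fetched doubleword with FIFO replacement. */
static uint64_t load(lsq_t *q, uint64_t addr, unsigned size)
{
    uint64_t val;
    for (unsigned i = 0; i < LSQ_SIZE; i++)
        if (ml_forward(&q->entries[i], addr, size, &val))
            return val;                  /* LSQ hit: cache untouched */

    uint64_t dw = dcache_read64(addr & ~7ULL);
    lsq_entry_t *e = &q->entries[q->head];   /* FIFO victim */
    e->dw_addr = addr & ~7ULL;
    e->data    = dw;
    e->valid   = true;
    q->head    = (q->head + 1) % LSQ_SIZE;

    ml_forward(e, addr, size, &val);     /* extract the sub-word */
    return val;
}
```

Because the LSQ is probed first (the sequential access above), every hit avoids a data cache read entirely, which is where the traffic and energy savings come from.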
Results: Energy Reduction
[Charts: energy reduction for CINT, CFP, and MiBench]
• Up to 35% energy reduction (MiBench)!
• More effective than previous techniques