Improving Energy Efficiency by Making DRAM Less Randomly Accessed
Hai Huang, Kang G. Shin, Charles Lefurgy, Tom Keller
University of Michigan, IBM Austin Research Lab
Overview
• The power budget allocated to main memory (i.e., DRAM) keeps increasing
  • E.g., in a mid-range IBM eServer system, 40% of the total system energy is consumed by the main memory subsystem
• Existing power management techniques only passively monitor memory traffic, so they do not fully exploit the deeper power-saving states
• => Actively shape memory traffic so that existing techniques can save more energy
Passively Monitoring Memory Traffic
• Why is passively monitoring memory traffic inefficient?
  • Memory accesses are random: good for performance, bad for energy consumption!
  • Idle time between consecutive memory accesses is often too short to use the deeper power-saving states
  • Randomness is mostly due to the OS's arbitrary virtual-to-physical mapping
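The idle-time argument above can be illustrated with a minimal sketch. It assumes each power state has a break-even idle time below which entering it does not pay off. The state names follow the slides, but the `PowerState` class, the power and break-even numbers, and the access timestamps are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerState:
    name: str
    power_mw: float       # hypothetical power drawn while in this state
    break_even_ns: float  # hypothetical minimum idle time that justifies entering it


# Illustrative values only; not taken from the slides or from any datasheet.
STATES = [
    PowerState("high-power", 300.0, 0.0),
    PowerState("low-power", 150.0, 100.0),
    PowerState("ultra-low-power", 10.0, 10_000.0),
]


def deepest_usable_state(idle_ns):
    """Return the lowest-power state whose break-even time fits in the idle gap."""
    usable = [s for s in STATES if s.break_even_ns <= idle_ns]
    return min(usable, key=lambda s: s.power_mw)


def idle_gaps(access_times_ns):
    """Idle gaps between consecutive accesses to one rank."""
    ts = sorted(access_times_ns)
    return [later - earlier for earlier, later in zip(ts, ts[1:])]


# Five accesses spread evenly over time versus the same number clustered together.
scattered = [0, 5_000, 10_000, 15_000, 20_000]
clustered = [0, 50, 100, 150, 20_000]

scattered_states = [deepest_usable_state(g).name for g in idle_gaps(scattered)]
clustered_states = [deepest_usable_state(g).name for g in idle_gaps(clustered)]
```

Under these assumed numbers, the evenly scattered accesses never leave a gap long enough for the deepest state, while the clustered accesses leave one long gap that does.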
Example: Active vs. Passive
[Figure: power state over time (high-power, low-power, ultra-low-power) of Rank 0 and Rank 1 under passive memory traffic management and under active memory traffic management]
How to Shape Memory Traffic
• Essentially, we need to artificially create a disparity in access frequency among memory ranks
  • Hot ranks and cold ranks
• The disparity can be created by finding frequently accessed pages and migrating them to a subset of memory ranks
  • Hot ranks: contain frequently accessed pages
  • Cold ranks: contain infrequently accessed and unmapped pages
• Page migration can be done by system software
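One way to picture the hot/cold split is the sketch below. It is not the authors' implementation: it assumes per-page access counts are already available, that the hot ranks are fully populated, and that each migration is a swap between a hot page on a cold rank and a cold page on a hot rank. The names `plan_swaps` and `apply_swaps` and the example data are hypothetical.

```python
def plan_swaps(access_counts, page_rank, hot_ranks):
    """Pair each hot page on a cold rank with a cold page on a hot rank.

    access_counts: page -> number of accesses in the last interval
    page_rank:     page -> rank currently holding the page
    hot_ranks:     set of ranks chosen to hold the frequently accessed pages
    """
    on_hot = [p for p, r in page_rank.items() if r in hot_ranks]
    # Hottest pages first; pages with no recorded accesses count as zero.
    ranked = sorted(page_rank, key=lambda p: access_counts.get(p, 0), reverse=True)
    hottest = ranked[:len(on_hot)]  # as many pages as the hot ranks currently hold
    hot_pages = set(hottest)
    incoming = [p for p in hottest if page_rank[p] not in hot_ranks]
    outgoing = [p for p in on_hot if p not in hot_pages]
    return list(zip(incoming, outgoing))


def apply_swaps(page_rank, swaps):
    """Return a new page -> rank mapping with each planned pair exchanged."""
    new_mapping = dict(page_rank)
    for hot_page, cold_page in swaps:
        new_mapping[hot_page] = page_rank[cold_page]
        new_mapping[cold_page] = page_rank[hot_page]
    return new_mapping


# Hypothetical example: rank 0 is the hot rank, rank 1 the cold rank.
page_rank = {"a": 0, "b": 0, "c": 1, "d": 1}
access_counts = {"a": 90, "b": 2, "c": 70, "d": 1}
swaps = plan_swaps(access_counts, page_rank, hot_ranks={0})
new_mapping = apply_swaps(page_rank, swaps)
```

Here page `c` is hot but sits on the cold rank, so it is exchanged with the rarely accessed page `b`. Afterwards rank 1 holds only infrequently accessed pages and can stay in a low-power state longer.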
Implementation
[Figure: a process with first-level and second-level page tables; in the operating system, time triggers start a migration thread that calls Migrate(old_page, new_page) and modifies the page table (PT); an MC page counter; Ranks 0 to 3 divided into hot ranks and cold ranks]
Issues with Page Migration
• There is a cost associated with each page migration
• Memory access frequency is often highly skewed!
  • 6% of pages cause 75% of accesses
  • 14% of pages cause 90% of accesses
• Not all pages need to be migrated
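The skew observation suggests a simple check: given per-page access counts, how few pages cover a target share of all accesses? The sketch below is an illustration under that assumption. The function name `pages_needed` and the sample counts are hypothetical and do not reproduce the measurements quoted above.

```python
def pages_needed(access_counts, target_fraction):
    """Smallest number of pages, hottest first, covering target_fraction of accesses."""
    total = sum(access_counts)
    if total == 0:
        return 0
    covered = 0
    for n, count in enumerate(sorted(access_counts, reverse=True), start=1):
        covered += count
        if covered >= target_fraction * total:
            return n
    return len(access_counts)


# Hypothetical access counts for five pages.
counts = [50, 30, 10, 5, 5]
for_75_percent = pages_needed(counts, 0.75)
for_85_percent = pages_needed(counts, 0.85)
```

Only the pages returned by such a check would be candidates for migration, which keeps the total migration cost small when the access distribution is skewed.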
Evaluation
• Simulators
  • Mambo [IBM]: a full-machine, cycle-accurate simulator that supports the PowerPC architecture
  • Memsim [IBM]: a detailed trace-driven main memory simulator, written in CSIM
• Workloads
  • Low memory-intensive workload: SPECjbb + bzip + crafty
  • High memory-intensive workload: SPECjbb + art + mcf
  • SPECjbb: simulating 8 warehouses
  • SPEC2000 benchmarks: using the Reference input set
Summary of Results
• Energy:
  • Actively shaping memory traffic saves 35% more energy than passively monitoring it
• Performance:
  • Low memory-intensive workload: small impact on performance
  • High memory-intensive workload: performance degrades significantly due to more contention on hot ranks
• Cost:
  • Use hardware counters, or
  • Software page faults
Conclusion
• Actively shaping memory traffic allows existing power management techniques to save power more effectively
• Highly skewed page accesses are observed
• Alternative main memory design:
  • Use high-performance/highly-parallel ranks as hot ranks
  • Use low-performance/low-power ranks as cold ranks
  • Allows frequently accessed pages to be accessed faster
  • Allows memory ranks that hold infrequently accessed and unmapped pages to consume less energy