IBM Research – Austin. Heather Hanson, Karthick Rajamani. What computer architects need to know about memory throttling. WEED 2010, June 20, 2010.
Outline • Memory throttling overview • Experimental platform • System configuration • Memory throttling implementation • Memory throttling characterization • Bandwidth • Power • Performance • Summary
Memory throttling in a nutshell • Memory throttling is a power-performance knob that: impacts memory reference rates of both instruction and data streams; controls power; can be used for safety or optimization (regulate DIMM temperatures, enforce memory power budgets) • Memory throttling restricts read & write traffic: directly controls memory power; indirectly affects processors and other components • Several implementation styles in commercial systems: insert periodic idle cycles; allow an arbitrary number of transactions up to an (estimated) power threshold; run + hold windows; enforce read & write quotas [this paper], where the first N transactions in a time window proceed and any further requests wait until the next time period
Quota-style memory throttling • Reads & writes proceed as requested, up to N requests per period • Example: N = 6; up to 6 transactions are serviced per period, regardless of request timing • After the Nth request in a period, additional requests would be queued for later service • Comparison to run-hold clock throttling: regular frequency during the run portion; clock halted during the hold portion
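The quota behavior described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the POWER6 memory controller logic: the function name `quota_throttle`, its parameters, the zero service latency, and the FIFO queueing order are all assumptions made for the sketch.

```python
def quota_throttle(arrivals, n_per_period, period):
    """Return the time at which each request is allowed to proceed.

    arrivals:      sorted request arrival times, in memory cycles
    n_per_period:  quota N, requests allowed per period (assumed >= 1)
    period:        period length in memory cycles (hypothetical name)

    Assumptions of this sketch: requests are served in FIFO order and
    take zero time once they are allowed to proceed.
    """
    if n_per_period < 1 or period < 1:
        raise ValueError("n_per_period and period must be >= 1")

    service_times = []
    window = 0   # index of the time window currently being filled
    used = 0     # requests already served in that window
    for t in arrivals:
        arrival_window = t // period
        if arrival_window > window:
            # The request arrives in a later window, so the quota is fresh.
            window, used = arrival_window, 0
        if used == n_per_period:
            # Quota exhausted: the request waits for the next window.
            window, used = window + 1, 0
        service_times.append(max(t, window * period))
        used += 1
    return service_times


# Example from the slide: N = 6 requests per period (period of 10 cycles
# is an arbitrary choice for illustration).
spread_out = quota_throttle([0, 1, 2, 3, 4, 5, 6, 7, 12], 6, 10)
burst = quota_throttle([0] * 8, 6, 10)
```

In both examples the first six requests proceed at their arrival times regardless of how they are spaced, and the seventh and eighth are held until cycle 10, the start of the next period.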
POWER6 Memory Throttling • IBM JS12 blade system • Processor: POWER6; 1 socket x 2 cores per processor socket; 3.8 GHz frequency (fixed in these experiments) • SLES10 Linux • Memory: 16 GB capacity; 8 DIMMs x 2 GB each; DDR2; 667 MHz bus • Quota-style memory throttling: N transactions per M memory cycles • 100% throttle level == unthrottled • Time period is faster than thermal and power supply timescales
Memory throttle characterization methodology • Sweep throttle settings: set throttle; run a steady-behavior benchmark; record sensor data at 256 ms per sample (memory power, memory reads & writes, instruction throughput, and other sensors not shown here); decrement throttle; repeat for the full range of throttle settings • Steady-behavior benchmarks: DAXPY (double-precision A * X plus Y); FPMAC (floating-point multiply accumulate); RandomMemory (generate random addresses); SPECPower_ssj2008 calibration phase (peak throughput for warehouse transactions) • Repeat the throttle sweep for multiple benchmarks and memory footprints: microbenchmarks with L1-cache-contained and main-memory footprints; SPECPower_ssj2008 behaves as nearly contained in on-chip caches • Calculate median sensor data for each permutation {benchmark, footprint, throttle}
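The final step of the methodology, taking the median of the sensor samples for each {benchmark, footprint, throttle} permutation, could look roughly like the following. The record layout, the field names (`benchmark`, `footprint`, `throttle`, `memory_power_w`), and the sample values are hypothetical placeholders, not data or tooling from the study.

```python
from collections import defaultdict
from statistics import median


def median_by_permutation(samples, sensor):
    """Group samples by (benchmark, footprint, throttle) and return the
    median of one sensor field per group.

    samples: iterable of dicts, one per 256 ms sensor sample
             (the dict layout is an assumption of this sketch)
    sensor:  name of the sensor field to summarize
    """
    groups = defaultdict(list)
    for s in samples:
        key = (s["benchmark"], s["footprint"], s["throttle"])
        groups[key].append(s[sensor])
    return {key: median(values) for key, values in groups.items()}


# Made-up placeholder samples, only to show the shape of the data.
samples = [
    {"benchmark": "DAXPY", "footprint": "DIMM", "throttle": 100, "memory_power_w": 10.0},
    {"benchmark": "DAXPY", "footprint": "DIMM", "throttle": 100, "memory_power_w": 12.0},
    {"benchmark": "DAXPY", "footprint": "DIMM", "throttle": 100, "memory_power_w": 11.0},
    {"benchmark": "DAXPY", "footprint": "L1", "throttle": 100, "memory_power_w": 5.0},
]
medians = median_by_permutation(samples, "memory_power_w")
```

Using the median rather than the mean keeps an occasional outlier sample, for example one taken while the benchmark was starting up, from skewing the value reported for a throttle setting.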
Memory throttle effect on bandwidth • [Figure with labeled regions: linear, saturated, and the transition between linear & saturated regions]
A closer look at RandomMemory-DIMM • Uses less bandwidth than other benchmarks at the same throttle levels, and also less bandwidth than its own saturation level • Simply measuring bandwidth at a single (current) throttle level is not enough to identify the region of operation: a bandwidth below the maximum could be in the saturated region or in the transition region • …so a controller will not be able to accurately predict the effect of a throttle level change on bandwidth • …or predict the effect on power or performance • Subtle but very important point about the transition region: bandwidth restrictions → pipeline starvation → reduced request rate → actual bandwidth < max bandwidth
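To make the point about single-level measurements concrete, here is a small sketch with two hypothetical workloads that show the same bandwidth at one throttle level but respond differently when the throttle changes. The smooth-minimum bandwidth model, the `sharpness` parameter, and all numbers are assumptions chosen for illustration; the slides do not give a model for the transition region.

```python
def bandwidth(throttle, demand, sharpness, peak=1.0):
    """Hypothetical smooth-minimum model of delivered bandwidth.

    throttle:  throttle level as a fraction of unthrottled (0 < throttle <= 1)
    demand:    bandwidth the workload would use if unthrottled (fraction of peak)
    sharpness: larger values give an abrupt linear-to-saturated corner,
               smaller values give a wide transition region
    All three are assumptions of this sketch, not measured quantities.
    """
    cap = throttle * peak
    return (cap ** -sharpness + demand ** -sharpness) ** (-1.0 / sharpness)


# Workload A: abrupt corner, demand well below the cap (saturated region).
# Workload B: wide transition region, somewhat higher demand.
workload_a = {"demand": 0.30, "sharpness": 20}
workload_b = {"demand": 0.375, "sharpness": 2}

# At a 50% throttle level both workloads deliver nearly the same bandwidth...
a_at_50 = bandwidth(0.5, **workload_a)
b_at_50 = bandwidth(0.5, **workload_b)

# ...but tightening the throttle to 40% affects them very differently.
a_at_40 = bandwidth(0.4, **workload_a)
b_at_40 = bandwidth(0.4, **workload_b)

print(f"A: {a_at_50:.3f} -> {a_at_40:.3f}")
print(f"B: {b_at_50:.3f} -> {b_at_40:.3f}")
```

Under this model a controller that only sees the bandwidth at the 50% level cannot tell the two workloads apart, yet workload A barely reacts to the tighter throttle while workload B loses a noticeable share of its bandwidth.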
Memory power is basically linear with bandwidth, so this chart looks familiar…
Throttling effects relative to each benchmark (performance, power) • Generally more performance reduction than power reduction (in %) • Throttling alone doesn't affect the static portion of memory power • Leveraging idle low-power modes of memory can improve the power-performance curve for memory request rate throttling • Possible to waste energy from longer execution time • Larger bandwidth demand means a larger effect from throttling • Conversely, power is reduced only when performance is impacted • L1-contained DAXPY: throttling has no effect • DIMM-sized DAXPY: drastic effect
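A back-of-the-envelope sketch shows why throttling can reduce power yet waste energy. It assumes memory power is a static term plus a term linear in bandwidth (in line with the linear power-bandwidth relationship noted earlier) and that the workload is fully memory-bound, so runtime scales inversely with delivered bandwidth. The function name and all wattage and runtime values are made-up placeholders, not measurements from the study.

```python
def memory_energy(bandwidth_fraction, static_w, dynamic_w_at_full, runtime_s_at_full):
    """Return (power in W, runtime in s, energy in J) for a memory-bound workload.

    bandwidth_fraction: delivered bandwidth relative to unthrottled (0 < f <= 1)
    static_w:           static memory power, unaffected by throttling
    dynamic_w_at_full:  bandwidth-dependent power when unthrottled
    runtime_s_at_full:  execution time when unthrottled
    Assumption: runtime is inversely proportional to bandwidth.
    """
    power = static_w + dynamic_w_at_full * bandwidth_fraction
    runtime = runtime_s_at_full / bandwidth_fraction
    return power, runtime, power * runtime


# Placeholder numbers: 10 W static, 20 W dynamic at full bandwidth, 100 s runtime.
p_full, t_full, e_full = memory_energy(1.0, 10.0, 20.0, 100.0)
p_half, t_half, e_half = memory_energy(0.5, 10.0, 20.0, 100.0)

print(f"unthrottled:   {p_full:.0f} W, {t_full:.0f} s, {e_full:.0f} J")
print(f"50% bandwidth: {p_half:.0f} W, {t_half:.0f} s, {e_half:.0f} J")
```

With these placeholder values, halving the bandwidth cuts power by a third but doubles the runtime, so total energy rises: the bandwidth-dependent energy stays the same while the static power is paid for twice as long.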
Summary • Memory throttling is a power-performance knob available in commercial systems • Memory controller restricts read & write bandwidth: caps memory power; controls DIMM temperature • Mileage may vary: power and performance management depend on bandwidth demand; throttling a low-bandwidth workload doesn't reduce much power; potential to use more energy due to increased execution time; use highly throttled settings with caution • Effective tool for power capping: power-constrained configurations; thermal safety; power shifting
Acknowledgements • IBM Research – Austin • IBM Systems & Technology Group • Memory characterization: Joab Henderson, Kenneth Wright • EnergyScale firmware: Guillermo Silva, Andrew Geissler