Disk Drive Roadmap from the Thermal Perspective A Case for Dynamic Thermal Management. Sudhanva Gurumurthi Anand Sivasubramaniam, Vivek Natarajan Computer Systems Lab Pennsylvania State University.
Power Demands of Data Centers • “What matters most to the computer designers at Google is not speed but power – low-power – because data centers can consume as much electricity as a city”, Eric Schmidt, CEO, Google • Data centers consume several Megawatts of power • Electricity bill: $4 billion/year • Disks account for 27% of computing-load costs • Difficult to cool at high power-densities Sources: 1. “Intel’s Huge Bet Turns Iffy”, New York Times article, September 29, 2002 2. “Power, Heat, and Sledgehammer”, Apr. 2002. 3. “Heat Density Trends in Data Processing, Computer Systems, and Telecommunications Equipment”, 2000.
Data Center Cooling Costs • Data center of a large financial institution in New York City • Power consumption ~ 4.8 MW Source: “Energy Benchmarking and Case Study – NY Data Center No. 2”, Lawrence Berkeley National Lab, July 2003.
Temperature Affects Disk Drive Reliability • Heat-Related Problems • Thermal-tilt of disk stack and actuator arms • Out-gassing of spindle/voice-coil motor lubricants • Wear-out of bearings • A hard disk operating 5 °C above normal temperature is 10-15% more likely to fail • Disk drive design is constrained by the thermal-envelope
Source: Hitachi GST Technology Overview Charts, http://www.hitachigst.com/hdd/technolo/overview/storagetechchart.html
Thermal-Constrained Design • Data Rate $\propto$ (Linear-Density) $\times$ (RPM) $\times$ (Diameter) • Power $\propto$ (# Platters) $\times$ (RPM)$^{2.8}$ $\times$ (Diameter)$^{4.6}$ • Target: 40% annual IDR growth • [Diagram labels: Data-Rate, Capacity, Increase RPM, Temperature, Shrink Platter, 1 platter, Lower Capacity, Lower Data Rate] • Can we stay on this roadmap?
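The two proportionalities on this slide can be explored numerically. The following is a minimal sketch that treats them as exact scaling laws and works only with ratios relative to a baseline drive. The function names and the example ratios are illustrative assumptions, not values from the talk.

```python
def relative_power(platters_ratio: float, rpm_ratio: float, diameter_ratio: float) -> float:
    """Scale factor for power, assuming Power ~ platters * RPM^2.8 * diameter^4.6."""
    return platters_ratio * rpm_ratio ** 2.8 * diameter_ratio ** 4.6


def relative_data_rate(linear_density_ratio: float, rpm_ratio: float, diameter_ratio: float) -> float:
    """Scale factor for data rate, assuming Data Rate ~ linear density * RPM * diameter."""
    return linear_density_ratio * rpm_ratio * diameter_ratio


if __name__ == "__main__":
    # Hypothetical change: 1.5x RPM, same platter count and diameter.
    print("power x", relative_power(1.0, 1.5, 1.0))
    print("data rate x", relative_data_rate(1.0, 1.5, 1.0))
    # Hypothetical change: shrink the platter from 2.6 to 2.1 inches at fixed RPM.
    print("power x", relative_power(1.0, 1.0, 2.1 / 2.6))
    print("data rate x", relative_data_rate(1.0, 1.0, 2.1 / 2.6))
```

The exponents show the trade-off: raising RPM buys data rate linearly but costs power super-linearly, while shrinking the platter sheds power quickly at the price of data rate and capacity.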
Outline • Introduction • Modeling • The Roadmap • Dynamic Thermal Management • Conclusions
Modeling • Baseline input parameters • Linear-Density (BPI) • Track-Density (TPI) • Characteristics Modeled • Capacity • Performance • Temperature
Capacity Model • $C_{max} = \eta \times n_{surf} \times \pi (r_o^2 - r_i^2) \times (BPI \times TPI)$ • Stroke-Efficiency: $\eta < 1$ • Spare tracks, recalibration tracks etc. • Assumed $\eta = 2/3$ [CMRR] • User-accessible capacity needs to be derated due to: • Zoned-Bit Recording (ZBR) • Servo Overheads • ECC Overheads
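As a rough illustration of the capacity formula, the sketch below evaluates the raw maximum capacity in bits, before the ZBR, servo, and ECC derating listed on the slide. It assumes radii in inches, BPI in bits per inch, and TPI in tracks per inch. The example drive parameters are hypothetical and chosen only to exercise the function.

```python
import math


def max_capacity_bits(n_surf: int, r_outer_in: float, r_inner_in: float,
                      bpi: float, tpi: float, stroke_eff: float = 2.0 / 3.0) -> float:
    """Raw capacity C_max = eta * n_surf * pi * (r_o^2 - r_i^2) * (BPI * TPI).

    Radii are in inches, so the annulus area is in square inches and
    BPI * TPI is the areal density in bits per square inch.
    """
    if r_inner_in >= r_outer_in:
        raise ValueError("inner radius must be smaller than outer radius")
    recording_area = math.pi * (r_outer_in ** 2 - r_inner_in ** 2)
    return stroke_eff * n_surf * recording_area * bpi * tpi


if __name__ == "__main__":
    # Hypothetical single-platter drive (2 surfaces), 2.6-inch platter,
    # inner radius assumed to be half the outer radius.
    bits = max_capacity_bits(n_surf=2, r_outer_in=1.3, r_inner_in=0.65,
                             bpi=500_000, tpi=60_000)
    print(f"{bits / 8 / 1e9:.1f} GB before derating")
```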
Performance Models • Parameters Modeled: internal data rate (IDR) and seek-time • IDR • IDR experienced by outermost zone • Seek-time • Uses linear interpolation based on track-to-track, average, and full-stroke times [Worthington’95] • Accurate for seeks longer than 10 cylinders
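One way to read the seek-time bullet is as a piecewise-linear curve through three published points. The sketch below makes that reading concrete, and it is an assumption rather than the exact [Worthington’95] formulation. In particular, it anchors the average seek time at one third of the cylinder count (the mean distance between two uniformly random cylinders), which the slide does not state. All timing values in the example are hypothetical.

```python
def seek_time_ms(distance: int, n_cyl: int, t_track: float, t_avg: float, t_full: float) -> float:
    """Piecewise-linear seek time through three anchor points.

    Anchors (assumed): 1 cylinder -> t_track, n_cyl/3 cylinders -> t_avg,
    n_cyl - 1 cylinders -> t_full. Requires n_cyl > 3 so the anchors are distinct.
    """
    if n_cyl <= 3:
        raise ValueError("n_cyl must exceed 3")
    if distance <= 0:
        return 0.0  # no arm movement
    d_avg = n_cyl / 3.0
    d_full = n_cyl - 1
    if distance <= d_avg:
        x0, y0, x1, y1 = 1.0, t_track, d_avg, t_avg
    else:
        x0, y0, x1, y1 = d_avg, t_avg, float(d_full), t_full
    return y0 + (y1 - y0) * (distance - x0) / (x1 - x0)


if __name__ == "__main__":
    # Hypothetical drive: 30,000 cylinders, 0.5 ms / 4.5 ms / 10 ms seek times.
    for d in (1, 100, 10_000, 29_999):
        print(d, seek_time_ms(d, 30_000, 0.5, 4.5, 10.0))
```

As the slide notes, a linear fit of this kind is only expected to be accurate for seeks longer than about 10 cylinders.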
Validation • Compared modeled vs. actual capacity and IDR using 13 disks from 4 different manufacturers from 1999-2002 • Inputs: BPI, TPI, RPM, Platter-size, Number of platters • Assumed all disks have 30 zones.
Source: Hitachi GST Technology Overview Charts, http://www.hitachigst.com/hdd/technolo/overview/storagetechchart.html
Change in BPI and TPI Trends • Slowdown in BPI • Difficult to lower fly-height • Requires higher recording media coercivity • Smaller grain sizes suffer from superparamagnetic effects • Slowdown in TPI • Narrower tracks more susceptible to media noise • Inter-track interference • Increase in track edge-effects with narrower tracks • Bit-Aspect Ratios (BPI/TPI) dropping • Larger slowdown in BPI • Long-term areal density growth expected to slow down to 40-50% • 1 Tb/in² disk expected to be available in 2010 [DS2]
Capturing BPI and TPI Trends • Studied published work on designing Terabit areal-density disks • Chose the design with the most conservative assumptions about BPI • Scaled BPI and TPI compound growth rates (CGRs) to achieve 1 Tb/in² areal density in 2010 • BPI CGR = 14% • TPI CGR = 28% • Areal-density CGR = 46%
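The 46% figure follows from compounding the two rates, since areal density is the product BPI × TPI. A small sketch of that arithmetic is shown here. The `project` helper and its starting value are illustrative additions, not part of the talk.

```python
def areal_density_cgr(bpi_cgr: float, tpi_cgr: float) -> float:
    """Combined growth rate of BPI * TPI when each factor grows at its own rate."""
    return (1.0 + bpi_cgr) * (1.0 + tpi_cgr) - 1.0


def project(value: float, cgr: float, years: int) -> float:
    """Value after compounding at `cgr` for `years` years."""
    return value * (1.0 + cgr) ** years


if __name__ == "__main__":
    cgr = areal_density_cgr(0.14, 0.28)
    print(f"areal-density CGR = {cgr:.4f}")  # 1.14 * 1.28 - 1 = 0.4592, i.e. about 46%
    # Hypothetical starting density of 100 (arbitrary units), compounded for 5 years.
    print(project(100.0, cgr, 5))
```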
Thermal Model • Extension of work by Eibeck et al. at the University of California • Components Modeled: • Internal air • Spindle-assembly • Arm-assembly • Drive base and cover • Drive completely enclosed • External temperature maintained constant
Modeling the Heat-Transfer • Newton’s Law of Cooling: dQ/dt = hAΔT • Internal Air Heat = Heat convected by solid components + viscous dissipation – heat lost through drive cover
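The internal-air heat balance on this slide can be sketched as a lumped-parameter calculation. This is a simplified illustration and not the authors' thermal model: it assumes each solid component is described by a single convection coefficient, area, and temperature, and that the air temperature is advanced with an explicit Euler step. Every coefficient in the example is a hypothetical placeholder.

```python
def convective_rate_w(h: float, area_m2: float, delta_t: float) -> float:
    """Newton's law of cooling: dQ/dt = h * A * deltaT (watts)."""
    return h * area_m2 * delta_t


def air_net_heat_rate_w(t_air, solids, viscous_w, cover_h, cover_area_m2, t_cover):
    """Net heat rate into the internal air.

    solids: iterable of (h, area_m2, temperature) for spindle, arm, base, etc.
    Net = heat convected from solids + viscous dissipation - heat lost to the cover.
    """
    gained = sum(convective_rate_w(h, a, t_s - t_air) for (h, a, t_s) in solids)
    lost = convective_rate_w(cover_h, cover_area_m2, t_air - t_cover)
    return gained + viscous_w - lost


def step_air_temperature(t_air, net_rate_w, heat_capacity_j_per_k, dt_s):
    """Explicit Euler update of the air temperature over one time step."""
    return t_air + net_rate_w * dt_s / heat_capacity_j_per_k


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    solids = [(20.0, 0.010, 48.0), (15.0, 0.004, 46.0)]
    net = air_net_heat_rate_w(t_air=42.0, solids=solids, viscous_w=1.0,
                              cover_h=10.0, cover_area_m2=0.02, t_cover=35.0)
    print(net, step_air_temperature(42.0, net, heat_capacity_j_per_k=0.5, dt_s=0.01))
```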
Drive Parameters • Materials • Proprietary data • Assumed platters, arms, and spindle-hub composed of Aluminum • Geometry • Modeling and measurement • Voice-coil motor (VCM) power • Used published data from IBM [Sri-Jayantha’95] • External air temperature • Assumed 28 °C for single-platter configuration
The Thermal-Envelope • [Figure: drive temperature with the thermal envelope marked]
Outline • Introduction • Modeling • Formulating a Disk-Drive Roadmap • The Roadmap • Dynamic Thermal Management • Conclusions
Drive RPM • [Figure annotations: BPI CGR = 30%, TPI CGR = 50%; BPI CGR = 14%, TPI CGR = 28%; Areal Density ≥ 1 Tb/in²]
Drive Temperature • [Figure: drive temperature with the thermal-envelope marked]
Dynamic Thermal Management (DTM) • Goal: boost performance while staying within the thermal-envelope, using dynamic activity control • How much do higher RPMs benefit application I/O performance?
Applications Studied • Five commercial I/O traces • Openmail (HP Labs) • OLTP Application (UMass Repository) • Web Search-Engine (UMass Repository) • TPC-C (Penn State) • TPC-H (IBM Research) • Attempted to re-create, in DiskSim, the disk system on which each trace was collected
30-60% Performance Boost for 10,000 RPM Increase
DTM Solution 1: Exploiting Thermal Slack • [Figure: temperature vs. time, showing the thermal-envelope and the thermal slack; labels: SPM+VCM On, VCM Off, RPM]
DTM Solution 2:Activity Throttling • Thermal-design assuming an average-case operation • Basic idea • Disk services requests at its peak-performance configuration • Throttle disk activities if thermal-envelope may be exceeded
Approach 1: Seek Throttling • [Figure: temperature vs. time against the thermal-envelope; phases labeled VCM On and VCM Off]
Approach 2: (Seek+RPM) Throttling • [Figure: temperature vs. time against the thermal-envelope; phases labeled VCM On, VCM Off, and VCM Off + Low RPM]
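The seek-throttling idea can be illustrated with a toy control loop: serve requests until the temperature reaches the envelope, then stop seeking until the drive has cooled to a resume threshold. The sketch below assumes a first-order thermal response toward a mode-dependent steady-state temperature. That model, the resume threshold, and all numeric values are hypothetical and are not taken from the talk.

```python
def simulate_seek_throttling(envelope_c, resume_c, t_ss_on_c, t_ss_off_c,
                             tau_s, t0_c, dt_s, steps):
    """Return (time with VCM on, time with VCM off, peak temperature).

    Temperature relaxes exponentially (time constant tau_s) toward t_ss_on_c
    while seeking and toward t_ss_off_c while throttled.
    """
    temp = t0_c
    vcm_on = True
    heat_time = cool_time = 0.0
    peak = temp
    for _ in range(steps):
        target = t_ss_on_c if vcm_on else t_ss_off_c
        temp += (target - temp) * dt_s / tau_s  # explicit Euler step
        peak = max(peak, temp)
        if vcm_on:
            heat_time += dt_s
            if temp >= envelope_c:   # envelope reached: stop seeking
                vcm_on = False
        else:
            cool_time += dt_s
            if temp <= resume_c:     # cooled enough: resume seeking
                vcm_on = True
    return heat_time, cool_time, peak


if __name__ == "__main__":
    heat, cool, peak = simulate_seek_throttling(
        envelope_c=45.0, resume_c=44.5, t_ss_on_c=46.0, t_ss_off_c=44.0,
        tau_s=60.0, t0_c=44.6, dt_s=0.1, steps=20_000)
    print(heat, cool, peak)
```

Because the check happens after each step, this loop can overshoot the envelope by at most one step's worth of heating; a real controller would throttle slightly early. Adding a low-RPM mode, as in Approach 2, would amount to a third, cooler steady-state target.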
Throttling-Ratio • [Figure panels: 2.6” platter, 40% IDR growth to 2005; 2.6” platter, 40% IDR growth to 2007] • $t_{cool}$: time the disk spends undergoing throttling • $t_{heat}$: time the disk operates at its maximal performance configuration • Throttling-Ratio = $t_{heat} / t_{cool}$
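The ratio defined on this slide is simple to compute. A short sketch is given below, together with one derived quantity, the fraction of time spent at full performance, which is an illustrative addition that follows algebraically from the definition. The example durations are hypothetical.

```python
def throttling_ratio(t_heat_s: float, t_cool_s: float) -> float:
    """Throttling-Ratio = t_heat / t_cool."""
    if t_cool_s <= 0:
        raise ValueError("t_cool must be positive")
    return t_heat_s / t_cool_s


def full_performance_fraction(ratio: float) -> float:
    """Share of each heat/cool cycle spent at peak performance: r / (1 + r)."""
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    # Hypothetical cycle: 30 s at full performance, 10 s throttled.
    r = throttling_ratio(30.0, 10.0)
    print(r, full_performance_fraction(r))
```

A ratio above 1 means the disk spends more time at its peak configuration than it spends cooling.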
Summary • Need aggressive RPM increases to sustain IDR growth • Scaling BPI and TPI is becoming more difficult • Lower signal-to-noise ratios at higher densities increase ECC overheads • IDR growth will be limited by heat dissipation • 40% growth rate cannot be maintained beyond 2007, even for a 1.6” platter-size • Expected to slow down to 14% • Possible to buy back performance with Dynamic Thermal Management (DTM).