Conserving Energy in RAID Systems with Conventional Disks

Dong Li, Jun Wang, Dept. of Computer Science & Engineering, University of Nebraska-Lincoln
Peter Varman, Dept. of Electrical and Computer Engineering, Rice University

References
[1] S. Gurumurthi, A. Sivasubramaniam, M. Kandemir, and H. Franke, “DRPM: dynamic speed control for power management in server class disks,” ISCA’03.
[2] D. Colarelli and D. Grunwald, “Massive arrays of idle disks for storage archives,” in Proceedings of Supercomputing 2002.
[3] E. Pinheiro and R. Bianchini, “Energy conservation techniques for disk array-based servers,” in Proceedings of the 18th International Conference on Supercomputing, 2004.
[4] E. Varki, A. Merchant, J. Z. Xu, and X. Z. Qiu, “Issues and challenges in the performance analysis of real disk arrays,” IEEE Transactions on Parallel and Distributed Systems, 2004.
[5] D. Li and J. Wang, “EERAID: Energy-efficient redundant and inexpensive disk array,” in Proceedings of the 11th ACM SIGOPS European Workshop, 2004.
[6] D. Li, H. Cai, X. Yao, and J. Wang, “Exploiting redundancy to construct energy-efficient, high-performance RAIDs,” Tech. Rep. TR-05-07-04, Computer Science and Engineering Department, University of Nebraska-Lincoln, 2005.

Outline
• Introduction
• Motivation
• eRAID Design
• Evaluation
• Leveraging eRAID
• Conclusions

Introduction
• Energy-efficient storage systems, total cost of ownership (TCO), …
• Short request inter-arrival times
• Long power-state switch times of conventional disks
• Current solutions: multi-speed disks [1]
• Create long idle periods for conventional disks
  • unbalance the workload across disks
• Two approaches
  • Relocating data: MAID [2], PDC [3]
  • Redirecting requests: EERAID [5]

Motivation
• Major limitations of the state of the art
  • few workable solutions for systems built on conventional disks
  • a single performance metric
  • no differentiation of workloads by time criticality
• Three observations
  • redundant information in RAID systems
  • spare service capacity
  • queueing models

eRAID Design
• Main idea
  • spin down some or all of the mirror disks to standby
  • read, write
• Features
  • software-only solution: no hardware changes
  • considers two performance metrics
• Research issue
  • maximize energy saving
  • without violating predefined performance degradation limits for both throughput and response time
  • assume the workload changes little between two consecutive time windows

Solving for Performance Degradation
• Our approach: use queueing models for prediction
  • model the RAID-1 system and obtain its performance measures
  • examine how the input parameters change
  • obtain new performance measures with the changed input parameters
  • compare the two results
• Four workloads: synchronous read (SR), asynchronous read (AR), synchronous write (SW) and asynchronous write (AW)
• Real system: HP SureStore E Disk Array FC60

Read Load Models

Read Load Performance Computing
• Possible changes to the input parameters:
  • disk access probability
  • disk service time (change is negligible)
• Synchronous read load:
  • Mean Value Analysis (MVA) technique
  • eRAID: double the access probabilities of the corresponding primary disks
• Asynchronous read load:
  • no throughput degradation for stable systems
  • eRAID: double the workloads of the corresponding primary disks
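The synchronous-read prediction can be illustrated with a small exact MVA calculation. This is only a sketch under stated assumptions: the slides do not give the network structure, so the closed single-class model (a fixed number of processes, a think time between requests, one FCFS queue per disk), the function names, and all numeric values below are hypothetical. The one element taken from the slide is that a primary disk whose mirror is in standby sees its access probability doubled.

```python
def mva(demands, customers, think_time):
    """Exact single-class MVA for a closed network of FCFS queues.

    demands[k] is the service demand of disk k per request
    (access probability * mean service time, in seconds).
    Returns (throughput, mean response time).
    """
    queue = [0.0] * len(demands)
    throughput = response = 0.0
    for n in range(1, customers + 1):
        # An arriving request sees the queue lengths of the network
        # with one customer fewer (arrival theorem).
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue = [throughput * r for r in residence]
    return throughput, response


def access_probabilities(n_disks, spun_down):
    """Read access probabilities for an n_disks RAID-1 (n_disks/2 pairs).

    Assumption: reads are spread evenly over all active disks; a primary
    whose mirror is in standby absorbs the mirror's share (doubled).
    """
    base = 1.0 / n_disks
    probs = []
    for pair in range(n_disks // 2):
        if pair < spun_down:
            probs.append(2 * base)       # primary only, mirror in standby
        else:
            probs.extend([base, base])   # primary and mirror both active
    return probs


SERVICE_TIME = 0.005   # hypothetical mean disk service time (s)
PROCESSES = 16         # hypothetical process number
THINK_TIME = 0.010     # hypothetical mean process delay (s)

def predict(n_disks, spun_down):
    demands = [p * SERVICE_TIME for p in access_probabilities(n_disks, spun_down)]
    return mva(demands, PROCESSES, THINK_TIME)

x_base, r_base = predict(8, 0)
x_eraid, r_eraid = predict(8, 2)
print("throughput degradation:", 1.0 - x_eraid / x_base)
print("response time degradation:", r_eraid / r_base - 1.0)
```

Comparing the two `predict` calls mirrors the "compare the two results" step: the same model is solved twice, once with the baseline access probabilities and once with the eRAID ones.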

Write Load Model
• Controller cache
  • write-back policy
  • FC60: two-threshold write-back policy
  • destage_threshold, max_dirty
• Disk array: M/M/1/K queueing model [4]

Write Load Performance Computing
• Dirty data arrival rate d
• SW load: d = (max throughput with infinite cache size) × cache_miss_rate
• AW load: d = λ × cache_miss_rate
  • λ is independent of the system
• Possible changes to the input parameters:
  • service rate: N/2 ⇒ (N−2i)/2
  • maximum queue length
  • cache miss rate (change is unnoticeable)
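The effect of the reduced service rate can be sketched with the standard closed-form M/M/1/K results. The slide only states that the disk array is modeled as M/M/1/K and that the service rate drops from N/2 to (N−2i)/2; the per-pair destage rate, the queue capacity, the arrival rate, and all function names below are hypothetical values chosen for illustration.

```python
def mm1k(arrival_rate, service_rate, capacity):
    """Steady-state M/M/1/K metrics.

    capacity is K, the maximum number of requests in the system.
    Returns (throughput, mean response time, blocking probability).
    """
    rho = arrival_rate / service_rate
    if abs(rho - 1.0) < 1e-12:
        probs = [1.0 / (capacity + 1)] * (capacity + 1)
    else:
        norm = (1.0 - rho) / (1.0 - rho ** (capacity + 1))
        probs = [norm * rho ** n for n in range(capacity + 1)]
    blocking = probs[capacity]                    # arrivals lost when full
    throughput = arrival_rate * (1.0 - blocking)  # accepted arrival rate
    mean_jobs = sum(n * p for n, p in enumerate(probs))
    response = mean_jobs / throughput             # Little's law
    return throughput, response, blocking


def array_service_rate(n_disks, spun_down, pair_rate):
    """Service rate scaled as on the slide: N/2 -> (N - 2i)/2 pair units."""
    return (n_disks - 2 * spun_down) / 2 * pair_rate


# Hypothetical inputs: dirty-data arrival rate d = rate * cache_miss_rate.
request_rate = 600.0       # writes per second (assumed)
cache_miss_rate = 0.5      # assumed
d = request_rate * cache_miss_rate
PAIR_RATE = 100.0          # destages per second per mirrored pair (assumed)
K = 32                     # maximum queue length (assumed)

base = mm1k(d, array_service_rate(8, 0, PAIR_RATE), K)
eraid = mm1k(d, array_service_rate(8, 1, PAIR_RATE), K)
print("baseline (throughput, response, blocking):", base)
print("eRAID    (throughput, response, blocking):", eraid)
```

With these assumed numbers the eRAID case runs at utilization 1.0 (300 arrivals against a service rate of 300), which is why the equal-rate branch in `mm1k` is handled separately.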

Solving for Energy Saving
• N-disk RAID-1
• Time window length T
• Request number R
• Mean service time t
• E_base = E_active + E_idle (N disks)
• E_eRAID = E_active + E_idle (N−i disks) + E_standby + E_switch (i disks)
• Async. load: λ2 = λ1 (eRAID serves the same request rate as the baseline)
• Sync. load: λ2 < λ1 (throughput degrades under eRAID)
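One plausible way to fill in the terms of the two energy expressions is sketched below. The decomposition into active, idle, standby, and switch energy comes from the slide; how each term is computed here (busy time as R × t, one spin-down/spin-up cycle per standby disk per window, standby power charged for the whole window) is an assumption, and the power and energy figures are placeholders rather than values taken from the slides.

```python
def base_energy(n_disks, window, requests, service_time, p_active, p_idle):
    """E_base = E_active + E_idle over all n_disks for one time window (J)."""
    busy = requests * service_time            # total disk-busy seconds
    return p_active * busy + p_idle * (n_disks * window - busy)


def eraid_energy(n_disks, spun_down, window, requests, service_time,
                 p_active, p_idle, p_standby, e_switch):
    """E_eRAID = E_active + E_idle (n_disks - spun_down disks)
               + E_standby + E_switch (spun_down disks).

    Assumption: each standby disk pays one spin-down/spin-up cycle
    (e_switch joules) per window and draws p_standby for the whole window.
    """
    busy = requests * service_time
    idle = (n_disks - spun_down) * window - busy
    standby = spun_down * (p_standby * window + e_switch)
    return p_active * busy + p_idle * idle + standby


# Placeholder parameters (watts, joules, seconds), for illustration only.
params = dict(window=600.0, requests=10_000, service_time=0.005,
              p_active=13.5, p_idle=10.2)
e_base = base_energy(8, **params)
e_eraid = eraid_energy(8, 2, p_standby=2.5, e_switch=148.0, **params)
print("energy saving fraction:", 1.0 - e_eraid / e_base)
```

For a synchronous load, the `requests` argument passed to `eraid_energy` would be the (smaller) number of requests predicted to complete under eRAID; for an asynchronous load it is the same as in the baseline.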

Control Algorithm
• Time-window based
• Solve the multi-constraint problem:
  • select LFU (least frequently used) disks
• Conservative control
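A per-window decision step along these lines might look like the sketch below. The slide names only the ingredients (time windows, a multi-constraint problem, LFU selection), so the function `plan_window`, the `predict_degradation` callback, the disk names, and the assumption that degradation grows monotonically with the number of spun-down mirrors are all hypothetical. In the full scheme the callback would be backed by the queueing-model predictions.

```python
def plan_window(mirror_access_counts, predict_degradation,
                max_throughput_loss, max_response_loss):
    """Choose which mirror disks to keep in standby for the next window.

    mirror_access_counts: {disk_name: accesses in the previous window}
    predict_degradation:  callable i -> (throughput_loss, response_loss)
                          for i spun-down mirrors (hypothetical interface)
    Returns the names of the mirrors to spin down, least used first.
    """
    best = 0
    for i in range(1, len(mirror_access_counts) + 1):
        throughput_loss, response_loss = predict_degradation(i)
        # Both constraints must hold; stop at the first violation,
        # assuming degradation does not shrink as i grows.
        if throughput_loss > max_throughput_loss or response_loss > max_response_loss:
            break
        best = i
    lfu_order = sorted(mirror_access_counts, key=mirror_access_counts.get)
    return lfu_order[:best]


# Toy usage with a made-up linear predictor.
counts = {"m0": 50, "m1": 5, "m2": 20, "m3": 1}
chosen = plan_window(counts, lambda i: (0.02 * i, 0.05 * i),
                     max_throughput_loss=0.05, max_response_loss=0.12)
print(chosen)
```

The previous window's access counts stand in for the next window's, which relies on the stated assumption that the workload changes little between consecutive windows.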

Evaluation
• Disk power model: IBM Ultrastar 36Z15
• Simulator: augmented DiskSim
• Traces: Cello99 and TPC-C20
• 8-disk RAID-1 system
• Two scenarios

Preliminary Results

Leveraging eRAID
• Associate a load threshold f (1/2 < f < 1) with each disk
  • when the load on a primary disk exceeds f, spin up a mirror disk to share the load
  • conventional mirrored layout: spin up one mirror disk
  • our new layout: spin up less than one mirror disk
• Lay out the files of one primary disk across a set of mirror disks

An example: N=10 and f=2/3

Conclusions
• eRAID: an energy-saving policy for RAID-1 systems built on conventional disks
• 30% energy saving without violating predefined performance constraints
• A new data layout scheme for further energy saving
• Limitations
  • bounded by the accuracy of the queueing models
  • approximated input parameters, e.g. process number and mean process delay
  • conservative control

Thank you! Questions?

Creating Disk Idle Periods in RAID-5: An Example
• 4-disk RAID-5 system
• A parity group contains data stripes 1, 2, 3 and parity stripe p, stored on disks 1, 2, 3 and 4 respectively
• A read request for stripe 1 can be serviced either by reading stripe 1 from disk 1, or by reading stripes 2, 3 and p and computing stripe 1 on the fly with an XOR
• More details can be found in our technical report [6]
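The XOR reconstruction in this example can be shown in a few lines. This is a minimal sketch with made-up stripe contents and a hypothetical helper name; it demonstrates only the parity arithmetic, not the request redirection policy.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for idx, byte in enumerate(block):
            out[idx] ^= byte
    return bytes(out)


# Made-up contents for data stripes 1, 2, 3 of one parity group.
stripe1 = b"AAAA"
stripe2 = b"BBBB"
stripe3 = b"CCCC"
parity = xor_blocks(stripe1, stripe2, stripe3)   # stored on disk 4

# Serve a read of stripe 1 without touching disk 1.
rebuilt = xor_blocks(stripe2, stripe3, parity)
print(rebuilt == stripe1)
```

The trade-off is that one read on disk 1 becomes three reads on disks 2, 3 and 4, which is what lets disk 1 stay idle.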
