RIMAC: Redundancy-based hierarchical I/O cache architecture for energy-efficient, high-performance storage systems
Xiaoyu Yao and Jun Wang
Computer Architecture and Storage System Laboratory (CASS)
University of Nebraska - Lincoln
Big Picture
• Current energy-efficient storage solutions are promising:
  • Saving energy at the cost of performance
  • Saving energy by using DRPM disks
• New RIMAC: Redundancy-based hierarchical I/O cache architecture
  • Making the storage cache aware of redundancy
  • Solving the performance problem with power-aware request transformation
University of Nebraska-Lincoln
Outline
• Background & Motivation
  • Why?
  • How does RIMAC differ?
• RIMAC: Redundancy-based Hierarchical I/O Cache Architecture
• Evaluation
• Conclusion
Energy Issues of Internet Data Center
[Figure: Internet data center diagram showing the Internet, router, switch, web servers, application servers, database servers, SAN, and storage system. Annotations: "27% of Total Energy*" and "70%/Year".]
* From WP’02 http://www.max-t.com
Backend Storage System
• High performance SCSI disks
• Small disk array as building block
  • RAID-1, mirrored disk array
  • RAID-5, parity disk array
• Multi-level I/O cache
  • Large storage cache
  • Moderate RAID controller cache
Related Work
Motivations
• Server workload characteristics
  • Dispersed idle periods
• High performance vs. energy conservation
  • Long “passive spin-up” delay in conventional disks (10-15 seconds)
• Exploiting existing infrastructure to consolidate the short idle periods
  • Internal redundancy in disk array
  • Multi-level I/O cache
RIMAC - Redundancy
• Identifying sources of “passive spin-up”
  • Non-blocking read
  • Derivative read due to parity update
  • Dirty block flushing [Zhu et al., HPCA’04]
• Exploiting inherent redundancy to address the untouched sources of “passive spin-up”
  • 1/N redundancy in RAID-5
  • Requests on standby disks are transformed to active disk accesses
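
The transformation of a standby-disk request into active-disk accesses rests on the XOR property of RAID-5 parity. The following is a minimal sketch of that property only, not the authors' implementation; the helper name `xor_blocks` and the 4-byte block contents are hypothetical and chosen purely for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of one or more equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("blocks must have equal length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Hypothetical 4-byte data blocks of one stripe on a 4-disk RAID-5 array.
d4 = b"\x01\x02\x03\x04"
d5 = b"\x10\x20\x30\x40"
d6 = b"\xaa\xbb\xcc\xdd"
p2 = xor_blocks(d4, d5, d6)  # parity block of the stripe

# If the disk holding d6 is in standby, d6 can be rebuilt from the
# other stripe members without spinning that disk up.
rebuilt = xor_blocks(d4, d5, p2)
assert rebuilt == d6
```

Any single member of a stripe can be recovered this way from the remaining N-1 members, which is the 1/N redundancy the slide refers to.
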
RIMAC - Cooperative Cache
• Deploying a parity-exclusive cache
  • Storage cache: user data
  • RAID controller cache: parity
• Leveraging redundancy exploitation in cache
  • High performance power-aware request transformation in multi-level I/O cache
  • Larger effective storage cache size with new placement/replacement algorithm
Sample Scenario – Transformable Read in Cache (TRC)
[Figure: the front-end receives Read (addr=6, len=1) against a four-disk RAID-5 array (data blocks 1-12, parity blocks P1-P4) with Disk1 Idle/Active, Disk2 Standby, Disk3 Idle/Active, and Disk4 Idle/Active. Blocks 4 and 5 appear in the up-half storage cache; P2 and P3 appear in the bottom-half parity cache. RIMAC produces block 6 as the response via XOR.]
Sample Scenario – Transformable Read on Disk (TRD)
[Figure: the front-end receives Read (addr=6, len=1) against a four-disk RAID-5 array (data blocks 1-12, parity blocks P1-P4) with Disk1 Idle/Active, Disk2 Standby, Disk3 Idle/Active, and Disk4 Idle/Active. Blocks 4 and 8 appear in the up-half storage cache; P2 and P3 appear in the bottom-half parity cache. RIMAC produces block 6 as the response via XOR.]
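
The two sample scenarios differ in where the remaining stripe members come from. The sketch below classifies a one-block read under the assumption that TRC applies when every other stripe member is already cached and TRD applies when the missing members can be read from disks that are not in standby. The names `ReadPath` and `plan_read`, the `SPIN_UP` fallback, and the block-to-disk placement are hypothetical; the slides do not give this logic explicitly.

```python
from enum import Enum


class ReadPath(Enum):
    DIRECT = "direct"    # block is cached or its disk is already spinning
    TRC = "TRC"          # rebuilt purely from cached stripe members
    TRD = "TRD"          # rebuilt after reading missing members from active disks
    SPIN_UP = "spin-up"  # no transformation possible; the standby disk must wake


def plan_read(target, stripe, cached, disk_of, standby):
    """Classify a one-block read; returns (path, blocks to fetch from disk)."""
    if target in cached:
        return ReadPath.DIRECT, []
    if disk_of[target] not in standby:
        return ReadPath.DIRECT, [target]
    # Target lives on a standby disk: look at the other stripe members.
    missing = [b for b in stripe if b != target and b not in cached]
    if not missing:
        return ReadPath.TRC, []
    if all(disk_of[b] not in standby for b in missing):
        return ReadPath.TRD, missing
    return ReadPath.SPIN_UP, [target]


# Hypothetical placement of the stripe protected by P2 (assumed, not from the slide).
stripe2 = ["4", "5", "6", "P2"]
disk_of = {"4": 1, "6": 2, "5": 3, "P2": 4}
standby = {2}

trc = plan_read("6", stripe2, {"4", "5", "P2", "P3"}, disk_of, standby)
trd = plan_read("6", stripe2, {"4", "8", "P2", "P3"}, disk_of, standby)
print(trc)
print(trd)
```

In both cases the response for block 6 would then be computed as the XOR of blocks 4, 5, and P2; only the source of block 5 differs.
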
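Power-aware Request Transformation
[Figure: transformed request types, grouped by request kind and by where they are served.]
• Read
  • Storage cache / parity cache: TRC
  • Disks: TRD
• Write (PUPA)
  • Storage cache / parity cache: PU-DA-C, PU-CA-C
  • Disks: PU-DA-D, PU-CA-D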
PUPA - Parity Update with Power Awareness
[Figure: Write (addr=5, len=1) on the four-disk RAID-5 layout (data blocks 1-12, parity blocks P1-P4).]
• Direct Access:
  • P2’ = 5’ XOR 5 XOR P2
• Complementary Access:
  • P2’ = 5’ XOR 6 XOR 4
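
The two parity-update formulas yield the same new parity because the old parity already equals the XOR of the stripe's data blocks. The sketch below checks this on hypothetical 4-byte blocks; the function names `parity_direct` and `parity_complementary` are illustrative and do not come from the slides.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def parity_direct(new_data: bytes, old_data: bytes, old_parity: bytes) -> bytes:
    """Direct access: P' = D' XOR D XOR P (needs the old data and old parity)."""
    return xor_bytes(xor_bytes(new_data, old_data), old_parity)


def parity_complementary(new_data: bytes, other_data: list) -> bytes:
    """Complementary access: P' = D' XOR all other data blocks of the stripe."""
    result = new_data
    for block in other_data:
        result = xor_bytes(result, block)
    return result


# Hypothetical contents of the stripe holding blocks 4, 5, 6 and parity P2.
b4 = b"\x01\x02\x03\x04"
b5 = b"\x10\x20\x30\x40"
b6 = b"\xaa\xbb\xcc\xdd"
p2 = xor_bytes(xor_bytes(b4, b5), b6)

b5_new = b"\xff\x00\xff\x00"  # Write (addr=5, len=1)
p2_direct = parity_direct(b5_new, b5, p2)
p2_complementary = parity_complementary(b5_new, [b6, b4])
assert p2_direct == p2_complementary
```

Direct access touches the old copy of block 5 and P2, while complementary access touches blocks 4 and 6, so the two forms read different sets of blocks to reach the same result.
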
Cache Placement/Replacement Algorithms
• Storage Cache
  • LRU with N-1 constraints
  • Compatible with MQ, LIRS, ARC algorithms
• Parity Cache
  • Parity stripe only
  • Second chance replacement algorithm
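
For readers unfamiliar with second chance replacement, the following is a generic textbook-style sketch of the policy (also known as the clock algorithm). It is not the RIMAC parity cache itself: the class name `SecondChanceCache`, the keys, and the choice to insert new entries with a cleared reference bit are assumptions made for illustration.

```python
class SecondChanceCache:
    """Fixed-size cache with second chance (clock) replacement."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.slots = []   # each slot is [key, value, referenced]
        self.index = {}   # key -> slot position
        self.hand = 0     # next slot the clock hand will examine

    def get(self, key):
        pos = self.index.get(key)
        if pos is None:
            return None
        self.slots[pos][2] = True  # a hit earns the entry a second chance
        return self.slots[pos][1]

    def put(self, key, value):
        """Insert or update; returns the evicted key, or None."""
        pos = self.index.get(key)
        if pos is not None:
            self.slots[pos][1] = value
            self.slots[pos][2] = True
            return None
        if len(self.slots) < self.capacity:
            self.index[key] = len(self.slots)
            self.slots.append([key, value, False])
            return None
        # Sweep: clear reference bits until an unreferenced entry is found.
        while self.slots[self.hand][2]:
            self.slots[self.hand][2] = False
            self.hand = (self.hand + 1) % self.capacity
        victim = self.slots[self.hand][0]
        del self.index[victim]
        self.slots[self.hand] = [key, value, False]
        self.index[key] = self.hand
        self.hand = (self.hand + 1) % self.capacity
        return victim


cache = SecondChanceCache(capacity=2)
cache.put("P1", b"parity-1")
cache.put("P2", b"parity-2")
cache.get("P1")                         # P1 is now marked as referenced
evicted = cache.put("P3", b"parity-3")  # P1 is spared, P2 is evicted
assert evicted == "P2"
```

The policy approximates LRU with one reference bit per entry, which keeps bookkeeping cheap for a small controller-level cache.
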
Evaluation
• Trace-driven simulation
  • DiskSim 2.0
  • 3-state disk power models (IBM 36Z15)
  • RIMAC front-end, bottom-half and upper-half implementation with 5000 lines of C code
• Workloads
  • Cello99 from HP: file server
  • TPC-D from HP: decision support
  • SPC-SE from SPC: search engine
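
A 3-state disk power model accounts energy per state plus a cost for each spin-down/spin-up cycle. The sketch below shows one simple way such accounting could work, assuming the three states are active, idle, and standby and ignoring the time spent in transitions. The class `DiskPowerModel` and every number in it are hypothetical placeholders, not IBM 36Z15 specifications and not the simulator's actual model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskPowerModel:
    active_w: float    # power while servicing requests (watts)
    idle_w: float      # power while spinning but idle (watts)
    standby_w: float   # power while spun down (watts)
    spinup_j: float    # energy of one spin-up (joules)
    spindown_j: float  # energy of one spin-down (joules)

    def energy_joules(self, active_s, idle_s, standby_s, spin_cycles):
        """Total energy for the given time in each state plus spin cycles."""
        steady = (self.active_w * active_s
                  + self.idle_w * idle_s
                  + self.standby_w * standby_s)
        return steady + spin_cycles * (self.spinup_j + self.spindown_j)

    def break_even_seconds(self):
        """Standby time at which spinning down costs the same as staying idle."""
        return (self.spinup_j + self.spindown_j) / (self.idle_w - self.standby_w)


# Placeholder values chosen only to make the arithmetic easy to follow.
model = DiskPowerModel(active_w=10.0, idle_w=8.0, standby_w=2.0,
                       spinup_j=100.0, spindown_j=20.0)
total = model.energy_joules(active_s=10, idle_s=20, standby_s=30, spin_cycles=1)
threshold = model.break_even_seconds()
print(total, threshold)
```

Idle periods shorter than the break-even threshold cost more energy if the disk spins down, which is why dispersed short idle periods are hard to exploit without consolidating them.
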
System Performance
[Figure: performance results for Cello99, TPC-D, and SPC-SE. Plot annotations: 20-30%, 2-6%, 5-14%, 30%.]
• Larger cache does improve performance
Energy Consumption
[Figure: energy consumption results for Cello99, TPC-D, and SPC-SE. Plot annotations: 14-15%, 15-16%, 33-34%.]
• Larger cache may not save more energy
Effects of Read Policies
[Figure: panels for Cello99 (64 MB), TPC-D (128 MB), and SPC-SE (256 MB). Plot annotations: 6.9%, 10.1%, 4.1%, 49.5%, 12.8%, 33.8%.]
Effects of Power-Aware Parity Update Policies
[Figure: panels for TPC-D (128 MB) and Cello99 (64 MB). Parity hit ratio annotations: 13.8%, 83.7%.]
Anatomy of Energy Consumption
[Figure: energy consumption breakdown for Cello99 (64 MB).]
Conclusions
• RIMAC: Redundancy-based hierarchical I/O cache architecture with minimum overhead
• Addresses an open problem, “passive spin-up” in energy-efficient server storage systems, by power-aware request transformation both in caches and on disks
• Reduces energy cost by up to 33% and improves performance by up to 30%
Thank you!