Learn about the gray-box approach to deconstructing storage arrays, which uses the Shear algorithm to identify characteristics such as the number of disks, chunk size, and layout scheme for file system performance tuning and array management.
Deconstructing Storage Arrays • Timothy E. Denehy, John Bent, Florentina I. Popovici, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau • University of Wisconsin, Madison
Gray-box Research • Computer systems becoming more complex • Transistors • Lines of code • Each component is becoming more complex • Interactions between subsystems can affect • Performance • Reliability • Power • Security
Gray-box Research • Interfaces remain the same • Changes can be difficult and impractical • Support multiple platforms or legacy systems • Commercial acceptance for wide-spread adoption • Hardware and software phenomenon • IA-32 instruction set, POSIX OS, SCSI storage • Problem: lack of information
Gray-box Solution • Treat target system as a gray-box • General characteristics are known • Extract information from an existing interface • e.g. determine cache contents • Exploit information to control system behavior • e.g. access cached data first
Gray-box Information Techniques • Make assumptions about target system • Observe system inputs and outputs • Statistical methods • Draw inferences about internal structure • Microbenchmarks and probes • Parameterize system components • Observe system under controlled input
Gray-box Applications • Gray-box techniques have been used to identify • Memory hierarchy parameters [Saavedra and Smith] • Processor cycle time [Staelin and McVoy] • Low-level disk characteristics [Worthington et al.] • Buffer cache replacement algorithms [Burnett et al.] • File system data structures [Sivathanu et al.] • Storage array characteristics: Shear
Shear • Software tool that automatically determines the important properties of a storage array • Enables file system performance tuning with knowledge of storage array characteristics • Acts as a management tool to help configure, monitor, and maintain storage arrays
Outline • Introduction • Shear • Background • Algorithm • Case Studies • Performance: Stripe-aligned Writes • Management: Detecting Misconfiguration, Failure • Conclusion
Shear Goals • Determine storage array characteristics • Number of disks • Chunk size • Layout and redundancy scheme
[Figure: logical blocks 0-31 behind a SCSI interface, shown mapped onto disks under RAID-0, RAID-1, and RAID-5 (P = parity) layouts]
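To make the three characteristics concrete, the sketch below shows how the number of disks and the chunk size together fix where a logical block lands under plain RAID-0 striping. The function name `raid0_location` and the values of 4 disks and 4-block chunks are illustrative assumptions, not taken from the slides.

```python
def raid0_location(block, num_disks, chunk_blocks):
    """Return (disk, block_on_disk) for a logical block under plain RAID-0.

    Hypothetical helper: assumes simple round-robin striping with no
    redundancy, which is only one of the layouts Shear has to handle.
    """
    chunk_index = block // chunk_blocks   # which chunk the block falls in
    disk = chunk_index % num_disks        # chunks rotate across the disks
    stripe = chunk_index // num_disks     # full stripes preceding this chunk
    return disk, stripe * chunk_blocks + block % chunk_blocks

# Logical blocks 0-31 grouped by the disk that stores them
# (assumed configuration: 4 disks, 4 blocks per chunk).
layout = [
    [b for b in range(32) if raid0_location(b, 4, 4)[0] == d]
    for d in range(4)
]
```

Shear works in the opposite direction: it observes only response times and has to infer the parameters that such a mapping uses.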
Shear Motivation • Performance • Tune file systems to array characteristics • Management • Verify configuration • Detect failure
Shear Techniques • Microbenchmarks and probes • Controlled, random access read and write patterns • Measure response time of access patterns • Measure steady-state performance • Statistical clustering • Automatically classify fast and slow regimes • Identify patterns that utilize only a single disk
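The slides do not say which clustering method Shear uses, so the following is only a minimal sketch of the idea of separating fast and slow regimes. It substitutes a one-dimensional two-means split; the function name `two_means` and the sample timings are made up for illustration.

```python
def two_means(values, iterations=20):
    """Split 1-D measurements into a fast and a slow group (k-means, k=2).

    Assumption: a simple stand-in for the statistical clustering the
    slides mention, not the tool's actual method.
    """
    lo, hi = min(values), max(values)     # initial cluster centers
    fast, slow = list(values), []
    for _ in range(iterations):
        fast = [v for v in values if abs(v - lo) <= abs(v - hi)]
        slow = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not slow:                      # all measurements look alike
            break
        lo = sum(fast) / len(fast)        # move centers to the cluster means
        hi = sum(slow) / len(slow)
    return fast, slow

# Made-up response times (seconds) for six access patterns.
times = [10.1, 9.8, 10.3, 4.9, 10.0, 5.1]
fast, slow = two_means(times)
```

In this setting the slow group is the interesting one: a pattern that keeps only a single disk busy gets no parallelism and so takes longest.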
Shear Assumptions • Storage array • Layout follows a repeatable pattern • Composed of homogeneous disks • System • Able to bypass the file system and buffer cache • Little traffic from other processes
Outline • Introduction • Shear • Background • Algorithm • Case Studies • Performance: Stripe-aligned Writes • Management: Detecting Misconfiguration, Failure • Conclusion
Shear Algorithm • Pattern size • Chunk size • Layout of chunks to disks • Level of redundancy
Determining the Pattern Size • Find the size of the layout's repeating pattern • Not always the stripe size • Choose a hypothetical pattern size • Perform random reads at multiples of that distance • Repeat for a range of pattern sizes • Cluster results and identify actual pattern size
Pattern Size Example • RAID-0, 4 disks, 8 KB chunks • Test hypothetical pattern sizes from 2 KB to 32 KB in 2 KB steps • Cluster the results • Actual pattern size: 32 KB
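The pattern-size search can be sketched as a small simulation of the example configuration (RAID-0, 4 disks, 8 KB chunks). This is a sketch under stated assumptions rather than the tool itself: real Shear times actual reads against the array, whereas here the request count on the busiest disk stands in for response time, and all function names are hypothetical.

```python
import random

KB = 1024

def raid0_disk(offset, num_disks=4, chunk=8 * KB):
    """Disk serving a byte offset under plain RAID-0 (simulated array)."""
    return (offset // chunk) % num_disks

def simulated_cost(pattern, num_reads=200, num_disks=4, seed=0):
    """Busiest-disk request count for random reads at multiples of `pattern`.

    Assumption: a crude proxy for measured response time. The more the
    reads pile onto one disk, the longer the batch would take.
    """
    rng = random.Random(seed)
    load = [0] * num_disks
    for _ in range(num_reads):
        offset = rng.randrange(1000) * pattern
        load[raid0_disk(offset, num_disks)] += 1
    return max(load)

def find_pattern_size(candidates):
    """Smallest candidate whose reads are the slowest (most concentrated)."""
    costs = {p: simulated_cost(p) for p in candidates}
    worst = max(costs.values())
    return min(p for p, c in costs.items() if c == worst)

candidates = [k * KB for k in range(2, 34, 2)]   # 2 KB .. 32 KB
detected = find_pattern_size(candidates)
```

Only the 32 KB candidate sends every read to the same disk, so it stands out as the slowest; 16 KB spreads reads over two disks and the remaining candidates over all four.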
Shear Algorithm • Pattern size • Chunk size • Layout of chunks to disks • Level of redundancy
Determining the Chunk Size • Chunk size: the amount of data contiguously allocated to one disk • Find the boundaries between disks • Choose a hypothetical boundary offset • Perform random reads on both sides of that offset • Repeat for all offsets in the pattern size • Cluster results and identify actual chunk size
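The boundary search above can be sketched with the same kind of simulation, again assuming a RAID-0 array with 4 disks, 8 KB chunks, and a known 32 KB pattern size. The helper names and the busiest-disk cost are assumptions made for illustration; the real tool measures response times instead. Reads on the two sides of a true chunk boundary hit different disks and proceed in parallel, so true boundaries appear as the fast offsets.

```python
import random

KB = 1024

def raid0_disk(offset, num_disks=4, chunk=8 * KB):
    """Disk serving a byte offset under plain RAID-0 (simulated array)."""
    return (offset // chunk) % num_disks

def boundary_cost(boundary, pattern, step=2 * KB, num_pairs=100, seed=0):
    """Busiest-disk request count for reads just before and after `boundary`."""
    rng = random.Random(seed)
    load = {}
    for _ in range(num_pairs):
        # Pick a random repetition of the pattern, then read both sides.
        base = rng.randrange(1, 1000) * pattern + boundary
        for offset in (base - step, base):
            disk = raid0_disk(offset)
            load[disk] = load.get(disk, 0) + 1
    return max(load.values())

def find_chunk_size(pattern, step=2 * KB):
    """Spacing between the offsets classified as chunk boundaries."""
    costs = {b: boundary_cost(b, pattern, step) for b in range(0, pattern, step)}
    fastest = min(costs.values())
    boundaries = sorted(b for b, c in costs.items() if c == fastest)
    gaps = [b - a for a, b in zip(boundaries, boundaries[1:])]
    return min(gaps) if gaps else pattern

chunk_size = find_chunk_size(32 * KB)
```

In this simulation the fast offsets are 0, 8, 16, and 24 KB, so the inferred chunk size is 8 KB.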
Chunk Size Example • RAID-0, 4 disks, 8 KB chunks • Test hypothetical boundary offsets in 2 KB steps: 0 KB through 14 KB