Tuning the Storage Subsystem
Outline • Storage Subsystem Components • Moore’s law and consequences • Magnetic disk performances • From SCSI to SAN, NAS and Beyond • Storage virtualization • Tuning the Storage Subsystem • RAID levels • RAID controller cache
Exponential Growth Moore’s law • Every 18 months: • New processing = sum of all existing processing • New storage = sum of all existing storage • 2x / 18 months ~ 100x / 10 years http://www.intel.com/research/silicon/moorespaper.pdf
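The "2x / 18 months ~ 100x / 10 years" rule of thumb can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `growth_factor` is an illustrative choice, not something from the slides.

```python
def growth_factor(years: float, doubling_months: float = 18.0) -> float:
    """Multiplicative growth after `years` if a quantity doubles every `doubling_months`."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings


if __name__ == "__main__":
    # Ten years at one doubling every 18 months is about 6.7 doublings,
    # which lands close to the 100x quoted on the slide.
    print(f"{growth_factor(10):.1f}x over 10 years")
```

Doubling also explains the "new = sum of all existing" bullets: after n doublings the total is 2^n, while everything that came before adds up to 2^n - 1.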
Consequences of “Moore’s law” • Over the last decade: • 10x better access time • 10x more bandwidth • 100x more capacity • 4000x lower media price • Scan takes 10x longer (3 min vs 45 min) • Data on disk is accessed 25x less often (on average)
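The scan-time bullet follows from the two growth rates above it: if capacity grows faster than bandwidth, reading a whole disk takes longer. A small sketch of that reasoning, with hypothetical function names and with the 100x and 10x figures taken from the slide:

```python
def scan_time_ratio(capacity_growth: float, bandwidth_growth: float) -> float:
    """How much longer a full scan takes when capacity and bandwidth grow at different rates."""
    return capacity_growth / bandwidth_growth


def full_scan_seconds(capacity_mb: float, bandwidth_mb_per_s: float) -> float:
    """Time to read an entire disk sequentially, ignoring seeks."""
    return capacity_mb / bandwidth_mb_per_s


if __name__ == "__main__":
    # 100x more capacity read at 10x more bandwidth: the scan is 10x slower.
    print(scan_time_ratio(100, 10))
```

The same ratio is why tuning advice increasingly favors avoiding full scans: capacity is cheap, but the time to touch all of it keeps growing.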
Data Flood • Disk sales double every nine months • Because the volume of stored data increases: data warehouses, Internet logs, Web archives, sky surveys • Because media price drops much faster than areal density • Graph courtesy of Joe Hellerstein. Source: J. Porter, Disk/Trend, Inc. http://www.disktrend.com/pdf/portrpkg.pdf
Memory Hierarchy (figure) • Levels, from fastest to slowest: processor cache, RAM, disks, tapes / optical disks (nearline) • Axes: access time (starting at 1 ns for the processor cache) and price ($/Mb)
Magnetic Disks • 1956: IBM (RAMAC), first disk drive: 5 Mb, 0.002 Mb/in², 35000 $/year, 9 Kb/sec • 1980: SEAGATE, first 5.25’’ disk drive: 5 Mb, 1.96 Mb/in², 625 Kb/sec • 1999: IBM MICRODRIVE, first 1’’ disk drive: 340 Mb, 6.1 MB/sec • Diagram labels: tracks, spindle, platter, read/write head, actuator, disk arm, controller, disk interface
Magnetic Disks • Access time (2001): controller overhead (0.2 ms); seek time (4 to 9 ms); rotational delay (2 to 6 ms); read/write time (10 to 500 KB/ms) • Disk interface: IDE (16 bits, Ultra DMA, 25 MHz); SCSI: width (narrow 8 bits vs. wide 16 bits), frequency (Ultra3, 80 MHz) • http://www.pcguide.com/ref/hdd/
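The access-time components add up to the cost of a single I/O. The sketch below simply sums them using the 2001 ranges from the slide; the function name and the 8 KB page size in the example are assumptions made for illustration.

```python
def access_time_ms(
    size_kb: float,
    seek_ms: float,
    rotational_ms: float,
    transfer_kb_per_ms: float,
    controller_ms: float = 0.2,
) -> float:
    """Time for one disk access: controller overhead + seek + rotational delay + transfer."""
    return controller_ms + seek_ms + rotational_ms + size_kb / transfer_kb_per_ms


if __name__ == "__main__":
    page_kb = 8  # hypothetical page size, not taken from the slides
    best = access_time_ms(page_kb, seek_ms=4, rotational_ms=2, transfer_kb_per_ms=500)
    worst = access_time_ms(page_kb, seek_ms=9, rotational_ms=6, transfer_kb_per_ms=10)
    print(f"best case:  {best:.2f} ms")
    print(f"worst case: {worst:.2f} ms")
```

For small transfers, seek and rotational delay dominate and the transfer term is almost negligible, which is why sequential I/O is so much cheaper per byte than random I/O.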
The familiar bandwidth pyramid: the farther from the CPU, the less the bandwidth • Yesterday, in megabytes per second (not to scale!): hard disk 15 per disk | SCSI 40 | PCI 133 | memory 422 • Figure axes: hardware bandwidth, system bandwidth • Slide courtesy of J. Gray/L. Chung
The familiar pyramid is gone! PCI is now the bottleneck! • In practice, 3 disks can reach saturation using sequential IO • Today, in megabytes per second (not to scale!): hard disk 26 per disk | SCSI 160 | PCI 133 | memory 1,600 • Figure axes: hardware bandwidth, system bandwidth • Slide courtesy of J. Gray/L. Chung
Storage Area Network Limitations of SCSI as an interconnect protocol (SCSI Parallel Interface) • 16 devices per SCSI bus (channel) • Limited distance for a SCSI bus (10s of meters) • One and only one host per disk
Storage Area Network • “A storage area network is one or more devices communicating via a serial SCSI protocol (such as FC or iSCSI).” Using SANs and NAS, W. Preston, O’Reilly
Storage Subsystem Components • Host: PCI bus (Infiniband); HBA (FC adapter) • Disk: capacity (areal density); rotation speed • Controller: JBOD, RAID, high-end; number of channels; cache size • Interconnect: parallel (SCSI); serial (FC, GbEthernet) • Topologies: point-to-point; bus, either synchronous (Parallel SCSI, ATA) or CSMA (GbEthernet); arbitrated loop (FC); fabric (FC)
EMC CLARiiON Cx600 Hardware Block Diagram (figure) • Two storage processors (SP), each with Fibre In x 4 and mirrored write cache • CMI x 2 • FC connections to the LCCs • 15 drives per DAE2; 240 drives total per storage system • Also shown: power supplies, SPS, fans • Slide courtesy of Lars Andersen - EMC
Storage Area Network • “A storage area network is one or more devices communicating via a serial SCSI protocol (such as FC or iSCSI).” Using SANs and NAS, W. Preston, O’Reilly • Multiple paths • Zoning • Persistent binding
Storage Virtualization • Slicing • Logical partition of each disk. Typically inner and outer tracks can be isolated in different slices. • Striping • RAID • SAME doctrine from Oracle (Stripe And Mirror Everything)
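Striping spreads consecutive logical blocks across the disks so that large reads engage every spindle. The sketch below shows one common round-robin layout; it is an assumption for illustration, not the scheme of any particular controller, and `locate` and `stripe_unit_blocks` are hypothetical names.

```python
from typing import Tuple


def locate(logical_block: int, num_disks: int, stripe_unit_blocks: int = 1) -> Tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Blocks are grouped into stripe units of `stripe_unit_blocks` blocks,
    and stripe units are dealt to the disks in round-robin order.
    """
    unit = logical_block // stripe_unit_blocks      # which stripe unit
    within = logical_block % stripe_unit_blocks     # position inside the unit
    disk = unit % num_disks                         # round-robin disk choice
    row = unit // num_disks                         # how many full stripes precede it
    return disk, row * stripe_unit_blocks + within


if __name__ == "__main__":
    for block in range(8):
        print(block, locate(block, num_disks=4, stripe_unit_blocks=2))
```

A larger stripe unit keeps a small request on a single disk, while a smaller one spreads even modest requests across several disks; which is better depends on the workload.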
RAID levels • RAID 0: striping; no fault tolerance • RAID 1: mirroring; 2 disks; faster reads but writes might be 2 times slower • RAID 5: parity; fault tolerant; 4 I/O per write • RAID 10: mirroring of stripes; trades space for time • RAID controllers: software vs. hardware RAID; disk arrays • Top TPC-C performance in August 2001: IBM e-server xSeries 370 with SQL Server 2000, 688,220.9 tpmC for $15.543.346 ($5.600.000 for storage: > 1/3) http://www.tpc.org
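The "4 I/O per write" cost of RAID 5 comes from keeping the parity block consistent on a small write: read the old data, read the old parity, write the new data, write the new parity. The sketch below models blocks as `bytes` and uses XOR parity; the function names are illustrative, and a real controller works on disk sectors rather than Python objects.

```python
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity(blocks: List[bytes]) -> bytes:
    """Parity block of a stripe: XOR of all its data blocks."""
    return reduce(xor_bytes, blocks)


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity after overwriting one data block.

    The caller has already done I/O 1 (read old data) and I/O 2 (read old parity);
    it then performs I/O 3 (write new data) and I/O 4 (write the value returned here).
    """
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


def rebuild(surviving: List[bytes], parity_block: bytes) -> bytes:
    """Recover the block of a failed disk from the surviving blocks and the parity."""
    return parity(surviving + [parity_block])


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    p = parity(stripe)
    new_block = b"\xaa\xbb"
    p = small_write_parity(stripe[1], p, new_block)
    stripe[1] = new_block
    print(p == parity(stripe))                               # parity stays consistent
    print(rebuild([stripe[0], stripe[2]], p) == new_block)   # lost block is recoverable
```

The incremental update touches only two disks regardless of stripe width, which is why the cost is a fixed four I/Os rather than one read per disk in the stripe.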
Network Attached Storage • Move processing to data sources (analogy with printers: network card + PostScript engine + print engine). • IP communication between server and NAS • Today file servers, tomorrow application servers (i.e., active disks for database systems): • Encapsulates data • Scalable