CS5226 2002 Hardware Tuning. Xiaofang Zhou School of Computing, NUS Office: S16-08-20 Email: zhouxf@comp.nus.edu.sg URL: www.itee.uq.edu.au/~zxf. Outline. Part 1: Tuning the storage subsystem RAID storage system Choosing a proper RAID level Part 2: Enhancing the hardware configuration.
CS5226 2002 Hardware Tuning Xiaofang Zhou School of Computing, NUS Office: S16-08-20 Email: zhouxf@comp.nus.edu.sg URL: www.itee.uq.edu.au/~zxf
Outline • Part 1: Tuning the storage subsystem • RAID storage system • Choosing a proper RAID level • Part 2: Enhancing the hardware configuration
Modern Storage Subsystem • More than just a disk • Disks, or disk arrays • Connections between disks and processors • Software to manage and config. devices • A logical volume for multiple devices • A file system to manage data layout
RAID Storage System • Redundant Array of Inexpensive Disks • Combines multiple small, inexpensive disk drives into a group to yield performance exceeding that of one large, more expensive drive • Appears to the computer as a single virtual drive • Supports fault tolerance by storing information redundantly in various ways
Data Striping • File blocks (e.g., 8KB per block) are spread across Disk 1 to Disk 6 • Stripe unit: blocks 1-6, 7-12, …
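The block-to-disk mapping on this slide can be sketched in a few lines. This is a minimal illustration, assuming simple round-robin placement of one block per disk over six disks; the function name `locate_block` and the 1-based numbering are choices made for the example, not part of the original slides.

```python
from typing import Tuple


def locate_block(block_no: int, num_disks: int = 6) -> Tuple[int, int]:
    """Map a 1-based file block number to (disk, stripe), both 1-based.

    Assumes round-robin striping: blocks 1..num_disks form stripe 1,
    the next num_disks blocks form stripe 2, and so on.
    """
    if block_no < 1 or num_disks < 1:
        raise ValueError("block_no and num_disks must be positive")
    index = block_no - 1
    disk = index % num_disks + 1      # which disk holds the block
    stripe = index // num_disks + 1   # which stripe the block belongs to
    return disk, stripe


if __name__ == "__main__":
    for b in (1, 6, 7, 12):
        print(b, locate_block(b))
```

Consecutive blocks land on different disks, which is what allows a sequential read of one stripe to proceed on all disks in parallel.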
Parity Check - Classical • An extra bit added to a byte to reveal errors in storage or transmission • Even (odd) parity means that the parity bit is set so that there are an even (odd) number of one bits in the word, including the parity bit • A single parity bit can only reveal single bit errors since if an even number of bits are wrong then the parity bit will not change • It is not possible to tell which bit is wrong
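As a small illustration of the points above, the following sketch computes a parity bit for one byte and shows that a single flipped bit is revealed while two flipped bits go unnoticed. The helper names `parity_bit` and `passes_check` are made up for this example.

```python
def parity_bit(byte: int, even: bool = True) -> int:
    """Return the parity bit for an 8-bit value.

    With even parity the bit is chosen so that the total number of 1 bits
    (data plus parity bit) is even; with odd parity, so that it is odd.
    """
    ones = bin(byte & 0xFF).count("1")
    bit = ones % 2                 # 1 if the data alone has an odd number of 1s
    return bit if even else bit ^ 1


def passes_check(byte: int, stored_bit: int, even: bool = True) -> bool:
    """Recompute the parity bit and compare it with the stored one."""
    return parity_bit(byte, even) == stored_bit


if __name__ == "__main__":
    data = 0b10110001
    p = parity_bit(data)
    print(passes_check(data ^ 0b00000100, p))  # one bit flipped: check fails
    print(passes_check(data ^ 0b00000110, p))  # two bits flipped: check passes
```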
Parity Check - Checksum • A computed value based on the content of a block of data • Transmitted or stored along with the data to detect data corruption • Recomputed at the receiver end to compare with the one received • Detects all errors with an odd number of bit errors, and most errors with an even number of bit errors • It is computed by summing the bytes of the data block, ignoring overflow • Other parity check methods, such as Hamming code, can also correct errors
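The "sum the bytes, ignoring overflow" rule can be sketched as follows. This assumes an 8-bit checksum (the slide does not state the width), and `checksum8` is a name chosen for the example.

```python
def checksum8(data: bytes) -> int:
    """Sum all bytes of the block and keep only the low 8 bits (overflow ignored)."""
    return sum(data) & 0xFF


def is_intact(data: bytes, stored_checksum: int) -> bool:
    """Recompute the checksum at the receiving end and compare."""
    return checksum8(data) == stored_checksum


if __name__ == "__main__":
    block = b"RAID"
    stored = checksum8(block)
    print(is_intact(b"RAIE", stored))  # a changed byte alters the sum
    print(is_intact(b"RADI", stored))  # swapped bytes keep the same sum
```

The second call shows one kind of corruption that a plain additive checksum cannot see: reordering bytes leaves the sum unchanged.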
RAID Types • Five types of array architectures, RAID 1 ~ 5 • Different disk fault-tolerance • Different trade-offs in features and performance • A non-redundant array of disk drives is often referred to as RAID 0 • Only RAID 1, 3 and 5 are commonly used • RAID 2 and 4 do not offer any significant advantages over these other types • Certain combinations are possible (10, 35 etc) • RAID 10 = RAID 1 + RAID 0
RAID 0 - Striping • No redundancy • No fault tolerance • High I/O performance • Parallel I/O
RAID 1 – Mirroring • Provides good fault tolerance • Works OK if one disk in a pair is down • One write = a physical write on each disk • One read = either read both or read the less busy one • Could double the read rate
RAID 3 - Parallel Array with Parity • Fast read/write
RAID 5 – Parity Checking • For error correction, rather than full redundancy • Each stripe unit has an extra parity stripe • Parity stripes are distributed
RAID 5 Read/Write • Read: parallel stripes read from multiple disks • Good performance • Write: 2 reads + 2 writes • Read the old data stripe; read the old parity stripe (2 reads) • XOR the old data stripe with the new one replacing it • XOR the result with the old parity stripe to obtain the new parity stripe • Write the new data stripe and the new parity stripe (2 writes)
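The 2-reads-plus-2-writes update can be checked with a short sketch. It assumes that parity is the bytewise XOR of all data stripes in a stripe group, which is the usual RAID 5 scheme; the function names (`xor_blocks`, `small_write`, `rebuild`) are hypothetical and only serve the illustration.

```python
from functools import reduce
from typing import Iterable, Tuple


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> Tuple[bytes, bytes]:
    """Return (new data stripe, new parity stripe) for a single-stripe update.

    Only the old data stripe and the old parity stripe are needed (2 reads);
    the two returned blocks are what gets written back (2 writes).
    """
    delta = xor_blocks(old_data, new_data)    # XOR old data with the new data
    new_parity = xor_blocks(delta, old_parity)  # XOR the result with the old parity
    return new_data, new_parity


def rebuild(surviving: Iterable[bytes], parity: bytes) -> bytes:
    """Recover one lost data stripe from the surviving stripes and the parity."""
    return reduce(xor_blocks, surviving, parity)


if __name__ == "__main__":
    stripes = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
    parity = reduce(xor_blocks, stripes)
    stripes[1], parity = small_write(stripes[1], parity, b"\xff\x00")
    # The incrementally updated parity matches a full recomputation.
    print(parity == reduce(xor_blocks, stripes))
    # Losing stripe 0 is recoverable from the other stripes plus parity.
    print(rebuild(stripes[1:], parity) == stripes[0])
```

The other data stripes are never read during the update, which is why the cost stays at four I/Os regardless of how many disks are in the array.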
RAID 10 – Striped Mirroring • RAID 10 = Striping + mirroring • A striped array of RAID 1 arrays • High performance of RAID 0, and high fault tolerance of RAID 1 (at the cost of doubling the disks) • More information about RAID disks at http://www.acnc.com/04_01_05.html
What RAID Provides • Fault tolerance • It does not prevent disk drive failures • It enables real-time data recovery • High I/O performance • Mass data capacity • Configuration flexibility • Lower protected storage costs • Easy maintenance
Hardware vs. Software RAID • Software RAID • Runs on the server’s CPU • Directly dependent on server CPU performance and load • Occupies host system memory and CPU cycles, degrading server performance • Hardware RAID • Runs on the RAID controller’s CPU • Does not occupy any host system memory • Is not operating system dependent • Host CPU can execute applications while the array adapter's processor simultaneously executes array functions: true hardware multi-tasking
RAID Levels - Data • Settings: accounts(number, branchnum, balance); create clustered index c on accounts(number); • 100000 rows • Cold Buffer • Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
RAID Levels - Transactions • No Concurrent Transactions: • Read Intensive: select avg(balance) from accounts; • Write Intensive, e.g. typical insert: insert into accounts values (690466,6840,2272.76); • Writes are uniformly distributed.
RAID Levels • SQL Server 7 on Windows 2000 (SoftRAID means striping/parity at host) • Read-Intensive: Using multiple disks (RAID 0, RAID 10, RAID 5) increases throughput significantly. • Write-Intensive: Without cache, RAID 5 suffers. With cache, it is OK.
Which RAID Level to Use? • Log File • RAID 1 is appropriate • Fault tolerance with high write throughput. Writes are synchronous and sequential. No benefits in striping. • Temporary Files • RAID 0 is appropriate. • No fault tolerance. High throughput. • Data and Index Files • RAID 5 is best suited for read intensive apps or if the RAID controller cache is effective enough. • RAID 10 is best suited for write intensive apps.
Controller Prefetching No, Write-back Yes • Read-ahead: • Prefetching at the disk controller level. • No information on access pattern. • Better to let the database management system do it. • Write-back vs. write-through: • Write-back: transfer terminated as soon as data is written to cache. • Batteries to guarantee write-back in case of power failure • Write-through: transfer terminated as soon as data is written to disk.
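The difference between the two write policies comes down to when the transfer is acknowledged. The toy model below is only a sketch: the class `ToyController` and its fields are invented for the illustration and do not correspond to any real controller interface.

```python
class ToyController:
    """Toy model of a controller cache with a write-back or write-through policy."""

    def __init__(self, write_back: bool) -> None:
        self.write_back = write_back
        self.cache = {}   # blocks acknowledged but not yet on disk (dirty blocks)
        self.disk = {}    # blocks that have reached the disk

    def write(self, block_no: int, data: bytes) -> str:
        """Store a block and report where it was when the transfer ended."""
        if self.write_back:
            self.cache[block_no] = data      # transfer ends once data is in cache
            return "cache"
        self.disk[block_no] = data           # transfer ends once data is on disk
        return "disk"

    def flush(self) -> None:
        """Destage dirty blocks to disk (what the battery must guarantee can finish)."""
        self.disk.update(self.cache)
        self.cache.clear()


if __name__ == "__main__":
    wb = ToyController(write_back=True)
    print(wb.write(7, b"new balance"), 7 in wb.disk)   # acknowledged before reaching disk
    wb.flush()
    print(7 in wb.disk)
```

In write-back mode there is a window in which acknowledged data exists only in the cache, which is the reason the slide mentions batteries.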
SCSI Controller Cache - Data • Settings: employees(ssnum, name, lat, long, hundreds1, hundreds2); create clustered index c on employees(hundreds2); • Employees table partitioned over two disks; Log on a separate disk; same controller (same channel). • 200 000 rows per table • Database buffer size limited to 400 Mb. • Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
SCSI (not disk) Controller Cache - Transactions • No Concurrent Transactions: update employees set lat = long, long = lat where hundreds2 = ?; • cache friendly: update of 20,000 rows (~90Mb) • cache unfriendly: update of 200,000 rows (~900Mb)
SCSI Controller Cache • SQL Server 7 on Windows 2000 • Adaptec ServerRaid controller: 80 Mb RAM, write-back mode • Updates: controller cache increases throughput whether the operation is cache friendly or not • Efficient replacement policy!
Enhancing Hardware Config. • Add memory • Cheapest option to get better performance • Can be used to enlarge the DB buffer pool • Better hit ratio • If used to enlarge the OS buffer (as disk cache), it benefits other apps as well • Add disks • Add processors
Add Disks • Larger disk ≠ better performance • Bottleneck is disk bandwidth • Add disks for • A dedicated disk for the log • Switch RAID 5 to RAID 10 for update-intensive apps • Move secondary indexes to another disk for write-intensive apps • Partition read-intensive tables across many disks • Consider intelligent disk systems • Automatic replication and load balancing
Add Processors • Function parallelism • Use different processors for different tasks • GUI, Query Optimisation, TT&CC, different types of apps, different users • Operation pipelines: • E.g., scan, sort, select, join… • Easy for RO apps, hard for update apps • Data partition parallelism • Partition data, thus the operation on the data
Parallel Join Processing • Algorithm: decompose and process in parallel • $T = R \bowtie_A S$ • Let $f: A \to \{1, \ldots, n\}$ (a hash function) • $R = \bigcup_{i=1..n} R_i$, $R_i = \{r \in R \mid f(r.A) = i\}$ • $S = \bigcup_{i=1..n} S_i$, $S_i = \{s \in S \mid f(s.A) = i\}$ • $T = \bigcup_{i=1..n} R_i \bowtie_A S_i$ • Issues • Data distribution, task decomposition and load balancing are non-trivial
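The decomposition on this slide can be sketched as a partitioned hash join. The code below is a minimal illustration under several assumptions: tuples are (A, payload) pairs, the hash function f is Python's built-in `hash` modulo n (so partitions are numbered 0..n-1 rather than 1..n), and a thread pool stands in for separate processors. All function names are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Hashable, List, Sequence, Tuple

Row = Tuple[Hashable, object]   # (join attribute A, payload)


def partition(relation: Sequence[Row], n: int) -> List[List[Row]]:
    """Split a relation into n partitions using f(t.A) = hash(t.A) mod n."""
    parts: List[List[Row]] = [[] for _ in range(n)]
    for row in relation:
        parts[hash(row[0]) % n].append(row)
    return parts


def join_partition(pair: Tuple[List[Row], List[Row]]) -> List[tuple]:
    """Join one partition pair (R_i, S_i) with a local hash table built on R_i."""
    r_part, s_part = pair
    index = defaultdict(list)
    for key, payload in r_part:
        index[key].append(payload)
    return [(key, r_payload, s_payload)
            for key, s_payload in s_part
            for r_payload in index.get(key, [])]


def parallel_join(r: Sequence[Row], s: Sequence[Row], n: int = 4) -> List[tuple]:
    """Compute the join as the union of the n partition-wise joins."""
    pairs = zip(partition(r, n), partition(s, n))
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(join_partition, pairs))
    return [row for part in results for row in part]


if __name__ == "__main__":
    R = [(1, "a"), (2, "b"), (3, "c")]
    S = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
    print(sorted(parallel_join(R, S, n=3)))
```

Because matching tuples always hash to the same partition number, no partition needs data from another one. The sketch does nothing about skew: if most tuples share a few values of A, some partitions end up much larger than others, which is the load-balancing issue the slide points to. Note also that threads in CPython mainly illustrate the structure here rather than delivering real CPU parallelism.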
Parallelism • Some tasks are easier to parallelise • E.g., scan, join, sum, min • Some tasks are not so easy • E.g., sorting, avg, nested queries
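One way to see why avg is less straightforward than sum or min: per-partition results cannot simply be combined with the same operator. The sketch below, which is an illustration and not taken from the slides, has each partition return a (sum, count) pair that is merged at the end; the function names are made up for the example.

```python
from typing import List, Sequence, Tuple


def partial_avg(chunk: Sequence[float]) -> Tuple[float, int]:
    """Per-partition work: return (sum, count) rather than the local average."""
    return sum(chunk), len(chunk)


def combine_avg(partials: List[Tuple[float, int]]) -> float:
    """Merge step: add up the sums and the counts, then divide once."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


def naive_avg_of_avgs(chunks: List[Sequence[float]]) -> float:
    """Incorrect in general: averages the local averages, ignoring partition sizes."""
    return sum(sum(c) / len(c) for c in chunks) / len(chunks)


if __name__ == "__main__":
    chunks = [[1, 2, 3], [10]]
    print(combine_avg([partial_avg(c) for c in chunks]))
    print(naive_avg_of_avgs(chunks))
```

With partitions of unequal size the two results differ, so the extra count has to travel with each partial result.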
Parallel DB Architectures • Shared memory • Tightly coupled, easy to use, but not scalable (bottlenecks when accessing shared memory and disks) • Shared nothing • A distributed system with message passing as the only communication mechanism • Highly scalable • Difficult for load distribution and balancing • Shared disk • A trade-off, but towards the shared-memory end
Summary • In this module, we have covered: • The storage subsystem • RAID: what are they and which one to use? • Memory, disks and processors • When to add what?