1 / 61

CS 554: Advanced Database System Notes 02: Hardware

CS 554: Advanced Database System Notes 02: Hardware. Hector Garcia-Molina. Outline. Hardware: Disks Access Times (disk) Optimizations (disk access time) Other Topics: Storage costs Using secondary storage Disk failures. Hardware. DBMS. Data Storage. CPU. P. Typical Computer.

russellm
Download Presentation

CS 554: Advanced Database System Notes 02: Hardware

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS 554: Advanced Database SystemNotes 02: Hardware Hector Garcia-Molina Notes 2

  2. Outline Hardware: Disks Access Times (disk) Optimizations (disk access time) Other Topics: Storage costs Using secondary storage Disk failures Notes 2

  3. Hardware DBMS Data Storage Notes 2

  4. CPU P Typical Computer Disk Controller ... ... M C Memory Secondary Storage Notes 2

  5. Secondary storage Many flavors: - Disk: Floppy (hard, soft) Removable Packs Winchester (most common) SSD disks Optical, CD-ROM… Arrays - Tape: Reel, cartridge Robots Notes 2

  6. “Typical Disk:” Platter Head … Terms: Platter, Head, Cylinder, Track, Sector (physical), Block (logical), Gap Notes 2

  7. Top View Gap Sector Track Notes 2

  8. Block Block Block = group of sectors that form a unit of access One read/write operation will read/write one block Notes 2

  9. Disk Access Time block x in memory I want block X How long ? Notes 2

  10. Platter Head … Time = Seek Time + Rotational Delay + Transfer Time + Other Seek time: to move head to the desired cylinder (track) Rotational delay: for waiting on the desired sector Transfer time: to transfer data on sectors to memory Notes 2

  11. Seek Time Once head moving, the head travels fast 3 or 5x Seek Time x Cylinders Traveled 1 N Takes time to start the head moving Notes 2

  12. Average Random Seek Time Start at cylinder i  Go to cylinder j N N  SEEKTIME (i  j) S = N(N-1) j=1 ji i=1 There are N starting cylinders and N-1 cylinders Total: N(N-1) possible values Notes 2

  13. Average Random Seek Time N N  SEEKTIME (i  j) S = N(N-1) j=1 ji i=1 “Typical” S: 10 ms  40 ms Notes 2

  14. Typical Seek Time • Ranges from • 4ms for high end drives • 15ms for mobile devices • Typical SSD (Solid State): ranges from • 0.08ms • 0.16ms • Source: Wikipedia, "Hard disk drive performance characteristics" Notes 2

  15. Rotational Delay Disk platter rotates Head is here Block I Want Notes 2

  16. Average Rotational Delay R=0 for SSDs R = 1/2 revolution Typical HDD figures Source: Wikipedia, "Hard disk drive performance characteristics" Notes 2

  17. Transfer Rate: # bits transferred/sec • Transfer rates: • HDD: up to 1000 Mbit/sec • 12x Blu-Ray: 432 Mbit/sec • 1xCD: 1.23 Mbits/sec • for SSDs, limited by interfacee.g., SATA 3000 Mbit/s • Transfer time: Amount data transferred Transfer rate Notes 2

  18. Other Delays • CPU time to issue I/O • Contention delay for disk controller • Different programs can be using the disk • Contention delay for bus, memory • Different programs can be transferring data These delays are negligible compared to Seek time + rotational delay + transfer time Notes 2

  19. So far: One (Random) Block Access • What about: Reading “Next” block? Notes 2

  20. If we do things right(e.g., Double Buffer, Stagger Blocks…) Time to get = Block Size + Negligible “next” block Transfer rate - skip gap - switch track - once in a while, next cylinder Notes 2

  21. Rule ofRandom I/O: ExpensiveThumb Sequential I/O: Much less Notes 2

  22. Cost for Writingsimilar to Reading …. unless we want to verify: need to add (full) rotation + Block size Transfer time Notes 2

  23. To Modify a Block? Notes 2

  24. To Modify Block: (a) Read Block into Memory (b) Modify block in Memory (c) Write Block [(d) Verify?] To Modify a Block? Notes 2

  25. Random Access Time • Hand Drive:Ranges from 2.9 msec (high end server drive) to 12 msec (laptop HDD) • Due to the need to move the heads and wait for the data to rotate under the read/write head Notes 2

  26. Data Transfer Rate • Hard Disk:Once the head is positioned, an enterprise HDD can transfer data at about 140 MBytes/sec. • In practice, much lower speeds because…. • Data transfer rate depends also on rotational speed (of the platter) ! Notes 2

  27. Reliability • Hard Disk: According to a study performed by CMU for both consumer and enterprise-grade HDDs, their average failure rate is 6 years, and life expectancy is 9–11 years. Notes 2

  28. Cost and Capacity • Hard Drive: • In 2013: HDDs of up to 6 TB were available. • In 2014: Cost: around $50 per TeraByte Notes 2

  29. Kibibytes • 1 kibibyte = 210 bytes = 1024 bytes. from Wikipedia Notes 2

  30. Outline • Hardware: Disks • Access Times • Optimizations • Other Topics • Storage Costs • Using Secondary Storage • Disk Failures here Notes 2

  31. Optimizations(in controller or O.S.) • Disk Scheduling Algorithms • e.g., elevator algorithm • Pre-fetch (Double buffering) • Arrays (RAID) • Mirrored Disks Notes 2

  32. Disk Scheduling: Elevator Algorithm Situation: Have many read/write requests Question: In which order do you process the requests ? Notes 2

  33. Disk Scheduling: Elevator Algorithm • Process requests • for these cylinders 2. Then process requests this way Current cylinder Notes 2

  34. Double Buffering Algorithm Problem: You have a File • Sequence of Blocks B1, B2, …, Bn You have a Program that: • Process B1 • Process B2 • Process B3 ... Notes 2

  35. Single Buffer Solution (“naïve” solution) (1) Read B1  Buffer (2) Process Data in Buffer (3) Read B2  Buffer (4) Process Data in Buffer ... Notes 2

  36. Say P = time to process/block R = time to read in 1 block n = # blocks  R (1) Read B1  Buffer (2) Process Data in Buffer (3) Read B2  Buffer (4) Process Data in Buffer ...  P  R  P Time to process n block = n(P + R) Notes 2

  37. A B C D E F G Double Buffering process Memory: Disk: Read block 1 Notes 2

  38. B A B C D E F G done Double Buffering process Memory: Disk: A Process block 1 AND read block 2 simultaneously Notes 2

  39. A B C D E F G Double Buffering process Memory: Disk: B C A AND read block 3 simultaneously Process block 2 done Notes 2

  40. Say P > R P = Processing time/block R = IO time/block n = # blocks What is processing time? Notes 2

  41. A B C D E F G Double Buffering process Memory: Disk: Read block 1  R Notes 2

  42. B A B C D E F G done Double Buffering Time needed = P (P > R) process Memory: Disk: A AND read block 2  R simultaneously Process block 1  P Notes 2

  43. A B C D E F G Double Buffering Time needed = P (P > R) process Memory: Disk: B C A AND read block 3  R simultaneously Process block 2  P done Notes 2

  44. Say P  R P = Processing time/block R = IO time/block n = # blocks What is processing time? • Double buffering time = R + nP • Single buffering time = n(R+P) Notes 2

  45. Using disk array to accelerate disk access • Why use multiple disks: • Multiple disks  multiple disk heads • Multiple outputs = Increased data rate Notes 2

  46. Techniques to deploit multiple disks • Block Striping: • Store blocks of a file over multiple disks • (This technique uses multiple disks as point 2) • Mirror disk: • Store the same data on multiple disks • RAID: • Redundant Array of Independent (inexpensive) Disks Notes 2

  47. Block Striping • Blocks of the same file stored on different disks Data blocks of 1 file Notes 2

  48. Disk Mirroring • Mirrored disks contain identical content • Read operation: n times as fast • Write operation: about the same as 1 disk logicallyone disk Notes 2

  49. Disk Arrays (Even parity) • RAIDs (various flavors) Parity block Data blocks 00 01 00 10 11 logicallyone disk Notes 2

  50. Disk Failures • Intermittent read failure • Cause: power fluctuations/failure • Intermittent write failure • Cause: power fluctuation/failure • Media decay  discuss first • Disk surface worn out • Permanent failure  redundancy… • Disk crash Notes 2

More Related