130 likes | 241 Views
Secondary storage management. Section 13.3. Eilbroun Benjamin CS 257 – Dr. TY Lin. Presentation Outline. 13.3 Accelerating Access to Secondary Storage 13.3.1 The I/O Model of Computation 13.3.2 Organizing Data by Cylinders 13.3.3 Using Multiple Disks 13.3.4 Mirroring Disks
E N D
Secondary storage management Section 13.3 Eilbroun Benjamin CS 257 – Dr. TY Lin
Presentation Outline • 13.3 Accelerating Access to Secondary Storage • 13.3.1 The I/O Model of Computation • 13.3.2 Organizing Data by Cylinders • 13.3.3 Using Multiple Disks • 13.3.4 Mirroring Disks • 13.3.5 Disk Scheduling and the Elevator Algorithm • 13.3.6 Prefetching and Large-Scale Buffering
13.3 Accelerating Access to Secondary Storage • Several approaches for more-efficiently accessing data in secondary storage: • Place blocks that are together in the same cylinder. • Divide the data among multiple disks. • Mirror disks – Making two or more copies of the data on different disks. • Use disk-scheduling algorithms. • Prefetch blocks into main memory. • Scheduling Latency – added delay in accessing data caused by a disk scheduling algorithm. • Throughput – the number of disk accesses per second that the system can accommodate.
13.3.1 The I/O Model of Computation • The number of block accesses (Disk I/O’s) is a good time approximation for the algorithm. • This should be minimized. • Ex 13.3: You want to have an index on R to identify the block on which the desired tuple appears, but not where on the block it resides. • For Megatron 747 (M747) example, it takes 11ms to read a 16k block. • A standard microprocessor can execute millions of instruction in 11ms, making any delay in searching for the desired tuple negligible.
13.3.2 Organizing Data by Cylinders • If we read all blocks on a single track or cylinder consecutively, then we can neglect all but first seek time and first rotational latency. • Ex 13.4: We request 1024 blocks of M747. • If data is randomly distributed, average latency is 10.76ms by Ex 13.2, making total latency 11s. • If all blocks are consecutively stored on 1 cylinder: • 6.46ms + 8.33ms * 16 = 139ms (1 average seek) (time per rotation) (# rotations)
13.3.3 Using Multiple Disks • If we have n disks, read/write performance will increase by a factor of n. • Striping – distributing a relation across multiple disks following this pattern: • Data on disk R1: R1, R1+n, R1+2n,… • Data on disk R2: R2, R2+n, R2+2n,… … • Data on disk Rn: Rn, Rn+n, Rn+2n, … • Ex 13.5: We request 1024 blocks with n = 4. • 6.46ms + (8.33ms * (16/4)) = 39.8ms (1 average seek) (time per rotation) (# rotations)
Continued.. Using the several disks can increase the ability of a database system to handle heavy loads of disks access requests.
13.3.4 Mirroring Disks • Mirroring Disks – having 2 or more disks hold identical copied of data. • Benefit 1: If n disks are mirrors of each other, the system can survive a crash by n-1 disks. • Benefit 2: If we have n disks, read performance increases by a factor of n. • Performance increases further by having the controller select the disk which has its head closest to desired data block for each read. • Unfortunately, the writing of disks blocks does not speed up at all.
13.3.5 Disk Scheduling and the Elevator Problem • Disk controller will run this algorithm to select which of several requests to process first. • Pseudo code: • requests[] // array of all non-processed data requests • upon receiving new data request: • requests[].add(new request) • while(requests[] is not empty) • move head to next location • if(head location is at data in requests[]) • retrieve data • remove data from requests[] • if(head reaches end) • reverse head direction
13.3.5 Disk Scheduling and the Elevator Problem (con’t) Events: Head starting point Request data at 8000 Request data at 24000 Request data at 56000 Get data at 8000 Request data at 16000 Get data at 24000 Request data at 64000 Get data at 56000 Request Data at 40000 Get data at 64000 Get data at 40000 Get data at 16000 64000 56000 48000 40000 32000 24000 16000 8000
13.3.5 Disk Scheduling and the Elevator Problem (con’t) Elevator Algorithm FIFO Algorithm
13.3.6 Prefetching and Large-Scale Buffering • If at the application level, we can predict the order blocks will be requested, we can load them into main memory before they are needed.