MM File Management • Karrie Karahalios and Brian P. Bailey • Spring 2007
Physical Disk Structure • Components: sector, track, platter, cylinder, R/W head • Example: 16 heads x 1400 cylinders x 16 sectors/track x 512 bytes/sector = 183.5 MB
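The capacity in the example is just the product of the geometry figures. A minimal Python check follows; the function name `disk_capacity_bytes` is illustrative, and MB is assumed to mean 10^6 bytes, which is the reading that reproduces the slide's 183.5 figure.

```python
def disk_capacity_bytes(heads, cylinders, sectors_per_track, bytes_per_sector):
    """Capacity of a disk with uniform geometry (same sector count on every track)."""
    return heads * cylinders * sectors_per_track * bytes_per_sector


capacity = disk_capacity_bytes(16, 1400, 16, 512)
print(capacity)           # 183500800
print(capacity / 10**6)   # 183.5008 (decimal MB)
```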
Measures of Performance • Seek time (ms) • time to move disk arm to a specific track • Latency (ms) • time for sector to rotate under disk arm • Transfer rate (Mbps) • data that can be read in one time unit
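The three measures combine into a rough service time for one request: seek, then rotational latency, then transfer. The sketch below assumes that simple additive model; the function name and the sample numbers are hypothetical, not taken from the slides.

```python
def access_time_ms(seek_ms, latency_ms, size_bits, transfer_mbps):
    """Seek + rotational latency + transfer time, all in milliseconds."""
    transfer_ms = size_bits / (transfer_mbps * 1_000_000) * 1000
    return seek_ms + latency_ms + transfer_ms


# Hypothetical request: one 4 KB block, 8 ms seek, 4 ms latency, 40 Mbps transfer rate
t = access_time_ms(seek_ms=8.0, latency_ms=4.0, size_bits=4096 * 8, transfer_mbps=40.0)
print(f"{t:.2f} ms")   # 12.82 ms
```

With numbers like these, seek and latency dominate the transfer itself, which is why the placement and scheduling methods later in the deck focus on reducing them.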
Zoned Bit Recording • Utilize the larger, outer tracks • early disks could not handle a varying number of sectors per track • so they reduced the bit density of the outer sectors • Each zone (set of tracks) has its own number of sectors per track • outer zones can hold more data and support higher transfer rates
File System • Mapped onto physical disk structure • want to match user’s conceptual model • Collection of files and directories • file is logical storage unit • directories contain information about files (names, type, location, size, protection, etc.) • Basic operations • create, write, read, reposition, delete • sequential and random access
Allocation Methods • Contiguous • Linked • Constrained • Striping • … and many others
Contiguous • File occupies a contiguous set of blocks • Strengths • minimizes seek time • supports sequential and random access • Weaknesses • suffers external fragmentation
Linked • Stored as a linked list of blocks • Strengths • eliminates external fragmentation • supports files of arbitrary length • Weaknesses • random access slow, overhead of pointers • susceptible to block errors
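The difference in random-access cost between the contiguous and linked methods can be sketched in a few lines. This is an illustration only: the function names and the in-memory `next_pointer` table are assumptions, and on a real disk each pointer hop would cost a block read.

```python
def contiguous_block(start_block, k):
    """k-th block of a contiguous file: pure arithmetic, no disk reads needed."""
    return start_block + k


def linked_block(next_pointer, first_block, k):
    """k-th block of a linked file: follow k pointers, one block read per hop."""
    block = first_block
    for _ in range(k):
        block = next_pointer[block]
    return block


# Hypothetical linked file occupying blocks 7 -> 2 -> 9 -> 4
next_pointer = {7: 2, 2: 9, 9: 4, 4: None}

print(contiguous_block(100, 3))           # 103
print(linked_block(next_pointer, 7, 3))   # 4
```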
Constrained • Linked structure, but allocate next block based on “distance” from previous one • distance = predicted seek and latency • Strengths • improves sequential access • minimizes seek time • Weaknesses • increases algorithm complexity
Striping (RAID-0) • Stripe file across an array of N disks • divide file into stripes, divide each stripe into units, assign each unit to a different disk • Strengths • reduces disk access time by up to a factor of N • Weaknesses • susceptible to failure of any one disk • p(failure) ≈ N * p(any one disk failing), for small p
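The N * p figure is the small-p approximation of the exact expression 1 - (1 - p)^N. The sketch below compares the two, assuming disks fail independently; the function names and the per-disk probability 0.01 are hypothetical.

```python
def array_failure_probability(n_disks, p_disk):
    """Probability that at least one of n independent disks fails."""
    return 1.0 - (1.0 - p_disk) ** n_disks


def linear_approximation(n_disks, p_disk):
    """The N * p estimate; an upper bound that is tight when p is small."""
    return n_disks * p_disk


for n in (2, 4, 8):
    exact = array_failure_probability(n, 0.01)
    approx = linear_approximation(n, 0.01)
    print(n, round(exact, 4), round(approx, 4))
```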
MM File System Requirements • Storing/retrieving multimedia files • large size; continuous periodic requests • Maintain high throughput • Support RT and non RT requests • Guarantee a sustained level of service
Meeting the Requirements • Methods of placing data on disk • Scheduling algorithms • Admission control policies • Maximize the share of time spent transferring data (rather than seeking)
Zipf's Law • Probability of occurrence of the kth most common word is proportional to 1/k • applies to many observable events • More generally, $P_i = k / i^{\alpha}$, where • $i$ is the rank of the $i$th most popular item; $k$ is a constant; $\alpha$ is close to 1
Apply to File Allocation • For multimedia, assume that • $\alpha = 1$ • $\sum_i P_i = 1$ • Compute the probability of each multimedia file being accessed • use for layout and prefetching
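The two assumptions fix the constant: with alpha = 1 and probabilities summing to 1, k is the reciprocal of the sum of 1/i over all files. A minimal sketch, where the function name and the four-file catalogue are illustrative:

```python
def zipf_probabilities(n_files, alpha=1.0):
    """Access probability of each file by popularity rank (rank 1 first)."""
    weights = [1.0 / (i ** alpha) for i in range(1, n_files + 1)]
    k = 1.0 / sum(weights)   # normalizing constant so the probabilities sum to 1
    return [k * w for w in weights]


probs = zipf_probabilities(4)
print([round(p, 2) for p in probs])   # [0.48, 0.24, 0.16, 0.12]
```

A layout policy could then place the highest-probability files where seeks are cheapest, and a prefetcher could favour them.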
Scheduling Algorithms • FCFS • SSTF • SCAN and C-SCAN • EDF • SCAN-EDF • Understand each algorithm and weigh advantages and disadvantages
FCFS • Serve requests based on incoming order • Inherently fair • Does not consider location of requests • can lead to high overhead
SSTF • Select request closest to current position • minimizes seek time/overhead • May cause starvation of some requests
SCAN and C-SCAN • Serves all requests in current direction • reverses when no more requests • serves middle tracks better than edges • C-SCAN scans across disk in cycles • more fair to the edge tracks
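To weigh FCFS, SSTF, SCAN and C-SCAN against each other, it helps to compare total head movement on the same queue. The sketch below is a simplified model, not the slides' own code: the head starts by sweeping toward higher tracks, it reverses (or wraps) as soon as no requests remain in the current direction, and the request queue is a hypothetical example.

```python
def fcfs(start, requests):
    """Serve in arrival order."""
    return list(requests)


def sstf(start, requests):
    """Always serve the pending request closest to the current head position."""
    pending, order, pos = list(requests), [], start
    while pending:
        nxt = min(pending, key=lambda track: abs(track - pos))
        pending.remove(nxt)
        order.append(nxt)
        pos = nxt
    return order


def scan(start, requests):
    """Sweep upward, then reverse and serve the remaining requests downward."""
    up = sorted(t for t in requests if t >= start)
    down = sorted((t for t in requests if t < start), reverse=True)
    return up + down


def c_scan(start, requests):
    """Sweep upward, jump back to the lowest request, and sweep upward again."""
    up = sorted(t for t in requests if t >= start)
    wrapped = sorted(t for t in requests if t < start)
    return up + wrapped


def head_movement(start, order):
    """Total number of tracks crossed when serving requests in the given order."""
    total, pos = 0, start
    for track in order:
        total += abs(track - pos)
        pos = track
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]   # hypothetical track numbers
for policy in (fcfs, sstf, scan, c_scan):
    print(policy.__name__, head_movement(53, policy(53, queue)))
# fcfs 640
# sstf 236
# scan 299
# c_scan 322
```

SSTF wins on movement here, but it offers no bound on how long a distant request waits, which is the starvation risk noted above.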
EDF • Attach deadlines to each request • select request with earliest deadline • can have high overhead
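SCAN-EDF • Select the request with the earliest deadline • if deadlines are equal, select the request closest to the disk’s center • Implement as EDF with perturbed deadlines • $D_i' = D_i + f(N_i)$, where $f(N_i) = N_i / N_{max}$ ($N_i$: track position of request $i$; $N_{max}$: highest track number) • Consider direction?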
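The perturbation turns SCAN-EDF into a single sort. The sketch below assumes integer deadlines and track numbers strictly below `n_max`, so that the added term stays under 1 and never reorders requests with different deadlines; the tuple layout and the sample requests are hypothetical.

```python
def scan_edf_order(requests, n_max):
    """Order requests by perturbed deadline D_i + N_i / N_max.

    requests: list of (name, deadline, track) with integer deadlines
              and 0 <= track < n_max.
    """
    return sorted(requests, key=lambda r: r[1] + r[2] / n_max)


requests = [("a", 2, 80), ("b", 1, 60), ("c", 1, 10), ("d", 2, 5)]
print([name for name, _, _ in scan_edf_order(requests, n_max=100)])
# ['c', 'b', 'd', 'a']
```

Requests sharing a deadline come out in increasing track order, that is, in the order a single upward sweep would reach them. Handling the downward direction would need a different f, which is the open question on the slide.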
Admission Control • Based on the admission control policy discussed in the paper: • C. Martin, P.S. Narayan, B. Ozden, R. Rastogi, and A. Silberschatz. The Fellini Multimedia Storage System, Journal of Digital Libraries, 1997.
Mathematical Setup • Client requests are received in cycles of duration $T$ • $T$ is referred to as the common period of the system • assumes a circular (C-SCAN) scan of the disk • the consumption rate of each real-time client is $r_i$ • In each period $T$ the file system must retrieve at least $T r_i$ bits for each client, so that retrieval keeps pace with consumption
Setup (cont.) • Serve both real-time and non-real-time clients • Reserve a fraction $\alpha$ of $T$ for real-time clients ($0 < \alpha < 1$) • use $\alpha T$ to serve real-time clients • use $(1 - \alpha) T$ to serve non-real-time clients • To retrieve $T r_i$ bits for each client, the controller must ensure that the time to retrieve $T r_1, \ldots, T r_n$ bits does not exceed $\alpha T$
Number of Disk Blocks • If b is block size, then maximum number of disk blocks to be retrieved for ri is
Latency • Retrieval of a disk block involves a seek to the track containing the block, a settle time delay, and a rotational delay • Let $t_{seek}$, $t_{rot}$, and $t_{settle}$ be the worst-case times for each measure
Maximum Latency • Thus, the maximum latency for servicing clients r1, r2, …, rq is
Transfer Time • If the transfer rate from the innermost track of the disk is $r_{disk}$, then the time to transfer $T r_i$ bits of data for request $r_i$ is $T r_i / r_{disk}$
Admission for Real-Time Clients • Thus, the total time to retrieve T*r1, …, T*rq bits for requests R1, …, Rq is the sum of the latency and transfer times • Admit new client, if on adding it, this equation is still satisfied
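The expressions themselves are missing from the preceding slides. One reconstruction that is consistent with the definitions given there is sketched below, as an assumption rather than the authors' exact formulas: $n_i$ is a name introduced here for the per-period block count, $\alpha T$ denotes the share of the period reserved for real-time clients, and every block is charged the full worst-case latency. A C-SCAN-specific bound on total seek time would be tighter, and the Fellini paper's formulas may differ in detail (for example, an extra block when data is not block-aligned).

```latex
% Blocks retrieved per period for client i (assumed block-aligned data)
n_i = \left\lceil \frac{T\, r_i}{b} \right\rceil

% Worst-case latency for clients 1..q: each block pays seek, rotation and settle
L_{\max} = \sum_{i=1}^{q} n_i \,\bigl(t_{seek} + t_{rot} + t_{settle}\bigr)

% Admission test: latency plus transfer time fits in the real-time share of T
\sum_{i=1}^{q} n_i \,\bigl(t_{seek} + t_{rot} + t_{settle}\bigr)
  \;+\; \sum_{i=1}^{q} \frac{T\, r_i}{r_{disk}} \;\le\; \alpha\, T
```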
Admission for Non RT Clients • Remainder of the period is for requests from non real-time clients • Let di be the data requested from Ci • Number of blocks is
Admission for Non RT Clients • For each request, latency plus transfer time is • Over all requests p, this becomes • Admit new non RT client, if on adding it, above equation is still satisfied
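The non-real-time expressions are likewise missing here. Under the same assumptions as for real-time clients (worst-case latency charged per block, block-aligned data), one consistent reconstruction is sketched below. $m_i$ is a name introduced for illustration, and $(1 - \alpha) T$ denotes the remainder of the period; neither is confirmed by the slides.

```latex
% Blocks needed for request i, which asks for d_i bits
m_i = \left\lceil \frac{d_i}{b} \right\rceil

% Latency plus transfer time for one request
m_i \,\bigl(t_{seek} + t_{rot} + t_{settle}\bigr) + \frac{d_i}{r_{disk}}

% Over all p requests, the total must fit in the remainder of the period
\sum_{i=1}^{p} \left[ m_i \,\bigl(t_{seek} + t_{rot} + t_{settle}\bigr)
  + \frac{d_i}{r_{disk}} \right] \;\le\; (1 - \alpha)\, T
```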
Example • Transfer rate (rdisk) = 100 KB / sec • Cycle time (T) = 10ms • Max latency = 1ms • Client A data rate (r1) = 45 KB/sec • Client B data rate (r2) = 40 KB/sec • Are the two real-time clients admissible? • If so, what proportion of the cycle time is needed to serve these clients?
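The example can be worked numerically. The slide does not say whether the 1 ms maximum latency applies once per cycle or once per client, so the sketch below evaluates both readings. It also assumes the whole cycle is available to real-time clients, since no real-time fraction is given. The function and constant names are illustrative.

```python
def realtime_time_ms(period_ms, rates_kb_s, disk_rate_kb_s, total_latency_ms):
    """Latency plus transfer time needed per cycle, in milliseconds."""
    # Each client consumes period * rate of data, read at the disk's transfer rate.
    transfer_ms = sum(period_ms * r / disk_rate_kb_s for r in rates_kb_s)
    return total_latency_ms + transfer_ms


T_MS = 10.0             # cycle time
RATES = [45.0, 40.0]    # client A and client B, KB/sec
R_DISK = 100.0          # disk transfer rate, KB/sec
MAX_LATENCY_MS = 1.0

# Reading 1 (assumption): 1 ms is the total latency for the cycle.
once = realtime_time_ms(T_MS, RATES, R_DISK, MAX_LATENCY_MS)
# Reading 2 (assumption): each client incurs 1 ms of latency.
per_client = realtime_time_ms(T_MS, RATES, R_DISK, MAX_LATENCY_MS * len(RATES))

print(once, once <= T_MS, once / T_MS)   # 9.5 True 0.95
print(per_client, per_client <= T_MS)    # 10.5 False
```

Under the first reading both clients are admissible and need 95% of the cycle; under the second, the 10.5 ms required exceeds the 10 ms cycle and the second client would be rejected.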