Fast File System 2/17/2006
Introduction • Paper describes changes to the old UNIX file system (FS), implemented in 4.2BSD • Motivation: applications require greater throughput • Large amounts of paging • Want to retain the good of the old FS • Abstraction • Backward compatibility with existing software
Old File System • Block size change from 512 bytes to 1024 bytes → performance more than doubled. Why? • Each disk access transferred 2x more data • Direct blocks held twice the data, so indirect blocks weren’t needed as much • Performance degraded over time • 175 KB/s → 30 KB/s due to randomization of block placement on disk • Fundamental limits: • Small block size • Limited read-ahead in the system • Many seeks limit file system throughput
New File System (FS) • Drives divided into partitions • Each partition has a FS described by a superblock, which is replicated for redundancy • File system blocks = 4096 bytes • Cylinder groups: what’s that?
Cylinder Groups • The FS is a collection of cylinder groups, each made of one or more consecutive cylinders on the disk. Each cylinder group has the following components: • a backup copy of the superblock • a cylinder group header, with statistics, free lists, etc., about this cylinder group, similar to those in the superblock • a number of inodes, each containing file attributes • a number of data blocks
Storage Utilization • 4x bigger block size (4096 bytes) → higher throughput • Problem: UNIX FS commonly composed of small files → wasted space
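The trade-off on this slide can be checked with a little arithmetic. The sketch below is a simplified model, not the paper's measurement: it assumes each file is rounded up to a whole number of allocation units and ignores inodes and indirect blocks. The file sizes and the helper names `allocated_bytes` and `wasted_fraction` are made up for illustration.

```python
def allocated_bytes(file_size, unit):
    """Bytes consumed when space is handed out in whole units of `unit` bytes."""
    return -(-file_size // unit) * unit  # integer ceiling division


def wasted_fraction(file_sizes, unit):
    """Fraction of the allocated space that holds no file data."""
    used = sum(file_sizes)
    allocated = sum(allocated_bytes(s, unit) for s in file_sizes)
    return (allocated - used) / allocated


# Hypothetical mix of mostly small files (sizes in bytes).
sizes = [100, 700, 1500, 3000, 5000]

for unit in (512, 1024, 4096):
    print(unit, round(wasted_fraction(sizes, unit), 3))
```

With this sample, whole 4096-byte blocks waste far more space than 512-byte units, which is the problem the fragment scheme on the following slides addresses.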
Utilization Cont’d • Small files stored in a more efficient way • Blocks broken into fragments of 512 bytes • Block map associated with each cylinder group records the space available in that cylinder group at the fragment level • Is a block available? Look at its aligned fragments
Utilization Cont’d • Fragments of adjoining blocks cannot be used as a full block, even if they are large enough. • If no block with enough aligned fragments is available at file creation, a full-size block is split, yielding the necessary fragments and a single unused fragment.
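A minimal sketch of the fragment-level block map, assuming 4096-byte blocks and 512-byte fragments as on the slides. The list-of-booleans map and the function names are illustrative choices, not the on-disk format.

```python
FRAGS_PER_BLOCK = 4096 // 512  # 8 fragments per block


def block_is_free(frag_free, block_no):
    """A block is free only if all of its own aligned fragments are free."""
    start = block_no * FRAGS_PER_BLOCK
    return all(frag_free[start:start + FRAGS_PER_BLOCK])


def find_fragment_run(frag_free, block_no, n):
    """Offset of the first run of n free fragments inside one block, or None."""
    start = block_no * FRAGS_PER_BLOCK
    run = 0
    for i in range(FRAGS_PER_BLOCK):
        run = run + 1 if frag_free[start + i] else 0
        if run == n:
            return i - n + 1
    return None


# Two blocks: the last 4 fragments of block 0 and the first 4 of block 1 are free.
frag_free = [False] * 4 + [True] * 4 + [True] * 4 + [False] * 4
```

In this map there are eight adjacent free fragments, but they straddle a block boundary, so neither block counts as free. A three-fragment request can still be served from inside either block.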
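Utilization • File growth allocation falls into one of three cases • Enough space in the already allocated block or fragment → data written to that space • File contains no fragments (and the last block is full) → allocate new blocks, with fragments for any partial tail • File contains fragments (too small for the new data) → allocate a larger run of fragments or a full block and copy the old fragment data • Problem: file growth one fragment at a time → many data copies • Soln: user programs write one block at a time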
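To see why growing a file one fragment at a time is costly, the sketch below counts tail copies under a deliberately pessimistic assumption: whenever a file's fragments are too small for the new data, the old tail is always moved (a real allocator may sometimes extend fragments in place). The function names are hypothetical.

```python
BLOCK = 4096
FRAG = 512


def tail_frags(size):
    """Fragments needed for the partial last block of a file of `size` bytes."""
    return -(-(size % BLOCK) // FRAG)


def copies_when_appending(total, chunk):
    """Worst-case tail copies when writing `total` bytes in `chunk`-byte writes."""
    size = copies = 0
    while size < total:
        new_size = min(size + chunk, total)
        old_tail = tail_frags(size)
        same_block = size // BLOCK == new_size // BLOCK
        # Case 1: new data fits in space already allocated -> no copy.
        # Case 2: file has no fragments -> fresh allocation, no copy.
        # Case 3: file has fragments that are too small -> tail is moved.
        if old_tail and not (same_block and tail_frags(new_size) == old_tail):
            copies += 1
        size = new_size
    return copies


print(copies_when_appending(4096, 512))   # fragment-sized writes
print(copies_when_appending(4096, 4096))  # one block-sized write
```

Under this model, filling one block with fragment-sized writes moves the tail on every write after the first, while a single block-sized write never copies.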
Utilization • Capacity • Problem: as unallocated space → 0, throughput falls to about 50% • System should keep ~10% unallocated space • Soln: below that threshold, only the system administrator can allocate new blocks
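The free-space reserve amounts to a simple check at allocation time. This is a sketch under assumed names (`may_allocate`, `reserve`); the 10% default is the figure from the slide.

```python
def may_allocate(free_blocks, total_blocks, is_admin, reserve=0.10):
    """Decide whether a new block may be allocated.

    Ordinary users are refused once free space drops to the reserve;
    the administrator may use the reserved blocks as well.
    """
    if is_admin:
        return free_blocks > 0
    return free_blocks > reserve * total_blocks


print(may_allocate(free_blocks=50, total_blocks=1000, is_admin=False))
print(may_allocate(free_blocks=50, total_blocks=1000, is_admin=True))
```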
Parameterization of HW • Why? • Old FS had no information about the physical characteristics of the storage device • Goal: blocks allocated optimally for the given configuration, taking into account: • Processor speed • HW support for mass transfers • Characteristics of mass storage devices (# platters, physical data layouts, etc.)
Parameterization of HW • Physical characteristics of each disk: • number of blocks per track • rate of disk spin • Cylinder group summary information: • Finding rotationally optimal blocks is not free • Soln: keep a count of the available blocks in a cylinder group at different rotational positions • FS can be parameterized with the minimum time the processor needs to schedule another disk operation
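One way to combine these parameters is to work out how many block positions rotate past the head while the processor is busy scheduling the next transfer. The function below is an illustrative sketch, not the file system's actual routine, and the drive numbers in the example are assumptions.

```python
import math


def blocks_to_skip(rpm, blocks_per_track, sched_ms):
    """Block positions that pass under the head during `sched_ms` milliseconds."""
    ms_per_revolution = 60_000 / rpm
    ms_per_block = ms_per_revolution / blocks_per_track
    return math.ceil(sched_ms / ms_per_block)


# Assumed drive: 3600 rpm, 8 blocks per track, 4 ms to schedule a transfer.
print(blocks_to_skip(rpm=3600, blocks_per_track=8, sched_ms=4.0))
```

Placing the next block of a file that many positions further along the track lets it arrive under the head just as the processor is ready, instead of costing a full extra revolution.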
Layout Policies • Two parts to data layout policies: • Top level -- global policies use FS-wide summary information to make decisions regarding the placement of inodes and blocks • Lower level -- local allocation routines use a locally optimal scheme to lay out data blocks • Global policies try to balance two conflicting goals: • localizing data that is concurrently accessed • spreading out unrelated files
Layout Policies Cont’d • Two allocatable resources • Inodes • Blocks • Layout policy tries to place all the inodes of files in a directory in the same cylinder group. • Data blocks of a file are usually accessed together → layout policy tries to place all data blocks for a file in the same cylinder group, preferably at rotationally optimal positions in the same cylinder.
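The inode placement policy can be sketched as two small functions. Keeping a file's inode in its directory's cylinder group comes from the slide. The rule for new directories (a group with at least the average number of free inodes and the fewest directories) is added here as an assumption about how unrelated data gets spread out. The `CylinderGroup` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CylinderGroup:
    free_inodes: int
    ndirs: int


def group_for_new_file(parent_dir_group):
    """Files go in the same cylinder group as their directory."""
    return parent_dir_group


def group_for_new_directory(groups):
    """Assumed policy: spread directories across lightly used groups."""
    avg = sum(g.free_inodes for g in groups) / len(groups)
    candidates = [i for i, g in enumerate(groups) if g.free_inodes >= avg]
    return min(candidates, key=lambda i: groups[i].ndirs)


groups = [CylinderGroup(10, 5), CylinderGroup(50, 3),
          CylinderGroup(60, 1), CylinderGroup(5, 0)]
print(group_for_new_directory(groups))
```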
Performance • % of bandwidth in Table 2 measures: effectiveness of utilization of the disk by the file system. • upper bound on the transfer rate from the disk: • number of bytes on a track x number of revolutions of the disk per second. • Bandwidth is calculated by comparing the data rates the file system is able to achieve as a percentage of the bound. • Results: • the old FS uses 3−5% of the disk bandwidth • new FS uses up to 47% of the bandwidth.
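The bandwidth figure is a straightforward ratio. The sketch below follows the definition on the slide; the track size and rotation rate in the example are assumed values for illustration, not the drives measured in the paper.

```python
def disk_upper_bound(bytes_per_track, revolutions_per_second):
    """Upper bound on the transfer rate in bytes per second."""
    return bytes_per_track * revolutions_per_second


def percent_of_bandwidth(fs_rate, bytes_per_track, revolutions_per_second):
    """File system data rate as a percentage of the disk's upper bound."""
    bound = disk_upper_bound(bytes_per_track, revolutions_per_second)
    return 100 * fs_rate / bound


# Assumed drive: 16384 bytes per track at 60 revolutions per second,
# with a file system delivering 30 KB/s.
print(percent_of_bandwidth(30 * 1024, 16384, 60))
```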
Performance • Some stats:
Performance Continued • Limits • Processors limit throughput • Memory to memory copying – 40% of I/O time • Block chaining would require driver rewrites • One block allocated at a time – 10% of system writes
Enhancements • File system changes and required downtime allowed for some new ideas • Long File Names • File Locking • Symbolic Links • Rename • Quotas
Summary • File System Changes from Old System • Block size increased • Layout more efficient • Fragments used to reduce space waste • Performance increased • New items implemented into FS that had been requested by users
Summary Cont’d • New FS organization • Utilization • Parameterization • New Layout Policies • Performance Improvements
Key Ideas • Old FS used only a small fraction of the available disk throughput • New FS: • Same data structures • Same FS semantics • New FS has new functionality