300 likes | 462 Views
A Fast File System for UNIX. Marshall Kirk McKusick, William N. Joy, Samuel J. Leffler and Robert S. Fabry University of California, Berkeley Presented by Catherine Vilhauer. Introduction. Problems in old UNIX file system Low data throughput rates not sufficient
E N D
A Fast File System for UNIX Marshall Kirk McKusick, William N. Joy, Samuel J. Leffler and Robert S. Fabry University of California, Berkeley Presented by Catherine Vilhauer CS533 - Concepts of Operating Systems
Introduction • Problems in old UNIX file system • Low data throughput rates not sufficient • E.g. VLSI design and image processing - little processing on large amount of data • Old UNIX provided only 2% of the maximum disk bandwidth or 20 kilobytes per second per arm. • Berkeley carried out modifications which are basis of today’s UNIX NFS CS533 - Concepts of Operating Systems
Review of Disk Structure * Taken from Modern Operating Systems, Tanenbaum CS533 - Concepts of Operating Systems
Limits to Disk Bandwidth • Seeking and settling time • Rotational delay • Disk controller may be reading data into its cache • Reading/writing time. CS533 - Concepts of Operating Systems
Review of Disk Structure * Taken from www.storagereview.com CS533 - Concepts of Operating Systems
The Old Unix File System • Each disk drive divided into one or more partitions • Each disk partition may contain one file system • File system never spans multiple partitions • File system described by superblock which contains basic parameters of the file system • Data blocks • Count of the maximum number of files • Pointer to the free list • Partitions set up by Master Boot Record CS533 - Concepts of Operating Systems
Old UNIX File System (cont’d) • Within file system are files (obviously) • Some are directories • Contain pointers to files that may also be directories • Every file has a descriptor called an i-node • Ownership of file • Time stamps marking last modification and access times • Array of indices that point to data blocks for the file • May contain references to indirect blocks containing further data block indices CS533 - Concepts of Operating Systems
Review of What I-Nodes Are CS533 - Concepts of Operating Systems
Disk Layout in the Old UNIX File System CS533 - Concepts of Operating Systems
Disk Layout in the Old UNIX File System • Problems • Segregation of i-node info from data • Long seek time • Files in a single directory not normally allocated consecutive blocks - many non-consecutive blocks of i-nodes accessed when executing operations on the i-nodes of several files in a directory CS533 - Concepts of Operating Systems
Problem with old file system i-node data More data CS533 - Concepts of Operating Systems
Other problems • Never transfers more than 512 bytes • Often finds that the next sequential data block is not on the same cylinder, forcing seeks between 512 byte transfers • Small block size, limited read-ahead in the system and many seeks severely limits file system throughput CS533 - Concepts of Operating Systems
Initial Improvements Made at Berkeley • Basic block size increased to 1024 bytes • Improved performance by more than 2x • BUT • Still only using 4% of disk bandwidth • Why? • Free list (initially ordered) became scrambled after use • Forced a seek every block access CS533 - Concepts of Operating Systems
New File System • Cylinder Groups added • Max block size - 4096 bytes • Now possible to create files up to 232 bytes • Size can be any power of 2 up to 4096 bytes • Size is stored in super-block • Allows file systems with different block sizes on same system • New allocation algorithms to improve locality CS533 - Concepts of Operating Systems
Cylinder Groups • Disk partition divided into Cylinder Groups • One or more consecutive cylinders on a disk • Cylinder group book-keeping info • Redundant copy of super-block kept at varying offset so it spirals through the disk • Space for i-nodes • Bit map describing available blocks (replaces free list) • Summary info describing usage of data blocks • Begins at varying offset (so that all super-blocks not lost) CS533 - Concepts of Operating Systems
Fragments and Fragmentation • New file system uses 4096 byte block size • Allows 4x more information to be transferred per disk transaction • Problem: Small Files • Uniformly large block wastes space • Solution: Fragments • Can divide single file system block into one or more fragments • 2, 4 or 8 fragments specified at file system creation • Lower bound 512 bytes - disk sector size CS533 - Concepts of Operating Systems
Fragment Example CS533 - Concepts of Operating Systems
Space allocated when program does write system call Each time the system checks to see if the size of the file has increased Space Allocation CS533 - Concepts of Operating Systems
There is enough space - easy No fragmented blocks and no space in last block If space in already allocated block then written here Otherwise a full block is allocated and written here Process repeated until < full block of new data remains. File contains one or more fragments with no space If size of new data plus size of data already in fragments is greater than size of a full block, then a new block allocated and contents of the fragment are copied. Then continues as in second bullet point Space Allocation - Three Types CS533 - Concepts of Operating Systems
Problems Data copied many times as a fragmented block expands But fragment reallocation can be minimized if the user program writes a full block at a time, except for a partial block at the end of the file. Results Wasted space the same as old file system Savings in space utilization offset by need to keep track of free blocks Notes File system should not be completely full - around 90% full is optimal Space Allocation (cont’d) CS533 - Concepts of Operating Systems
New file system tries to parameterize the hardware for optimum configuration Allocate new blocks on same cylinder as previous block Cylinder group summary info keeps count of available blocks in cylinder group at different rotational positions 8 rotational positions Super-block contains vector of lists called rotational layout tables Parameter that defines the number of milliseconds between completion of a data transfer and the initiation of another data transfer on same cylinder can be changed at any time File System Parameterization CS533 - Concepts of Operating Systems
Global Policies Make placement decisions for new i-nodes and data blocks Calculate rotationally optimal block layouts Decide when to force long seek because insufficient blocks Local Policies Use locally optimal scheme to lay out data blocks Aims: Increase locality of reference to minimize seek latency Improve layout of data to make larger transfers possible Too much localization = local cylinder group run out of space Layout Policies CS533 - Concepts of Operating Systems
Attempts to place all i-nodes of files in directory in the same cylinder group Directories are different New directory placed in cylinder group that has greater than average free i-nodes, and smallest number of directors Data blocks Policy tries to place all data blocks for a file in cylinder group at rotationally optimal positions But, large files will use up all space Redirect block allocation to different cylinder group when file size >48 kilobytes and every megabyte after Layout Policies (cont’d) CS533 - Concepts of Operating Systems
Global policy routines call local allocation routines If requested block not available Use next available, rotationally closest block If no blocks open on same cylinder, use block in same cylinder group If cylinder group full, quadratically hash to choose another cylinder group Exhaustive search Layout Policies (cont’d.) CS533 - Concepts of Operating Systems
New Data Layout i-node data more data CS533 - Concepts of Operating Systems
Performance CS533 - Concepts of Operating Systems
Running ‘list directory’ on a large directory, number of disk accesses for i-nodes cut by factor of 2 Containing only files - cut by factor of 8 Transfer rates no longer change over time Bandwidth Old file system uses only 3-5% of disk bandwidth New file system uses up to 47% Reads and writes faster Due to larger block size in new file system Writes same speed as reads in contrast to old system Performance CS533 - Concepts of Operating Systems
Allowed long file names - 512 bytes Implemented file locking Symbolic links References across physical file systems and inter-machine linkage Rename Old system required 3 system calls New implementation to avoid losing file with only temporary name if system crashes Quotas Restricts the file system resources a user can obtain Other new features CS533 - Concepts of Operating Systems
Conclusion • FFS is the basis for UFS (Unix File System) • Nearly all UNIX machines use a variant of UFS including • Solaris • Unix BSD, Free Unix • Mac OS offers UFS as an alternative to its HFS • Linux offers partial support for UFS CS533 - Concepts of Operating Systems
References • Andrew S. Tanenbaum, Modern Operating Systems • Entry on Unix File Systems in Wikipedia.org CS533 - Concepts of Operating Systems