This review covers the implementation of file systems, including file operations, directory operations, and different file allocation schemes such as contiguous, linked, and FAT. It discusses the advantages and disadvantages of each scheme and provides insights into bad block management and extent allocation.
Implementation of File Systems CS-4513 Distributed Computing Systems (Slides include materials from Operating System Concepts, 7th ed., by Silberschatz, Galvin, & Gagne, Modern Operating Systems, 2nd ed., by Tanenbaum, and Distributed Systems: Principles & Paradigms, 2nd ed., by Tanenbaum and Van Steen) File Implementations
Review • File • Named, persistent collection of data • Potentially very long lived, very large • File Operations • Open, Close, Read, Write, Truncate, Seek, Tell • Directories • Special kinds of files for organizing other files • Entries may point to files or other directories • Directory Operations • Lookup, List, Add, Remove, Rename, Link, Unlink
Implementation of Files • Map file abstraction to physical disk blocks • Goals • Efficient in time, space, use of disk resources • Fast enough for application requirements • Scalable to a wide variety of file sizes • Many small files (< 1 page) • Huge files (100’s of gigabytes, terabytes, spanning disks) • Everything in between
File Allocation Schemes • Contiguous • Blocks of file stored in consecutive disk sectors • Directory points to first entry • Linked • Blocks of file scattered across disk, as linked list • Directory points to first entry • Indexed • Separate index block contains pointers to file blocks • Directory points to index block
Contiguous Allocation • Ideal for large, static files • Databases, fixed system structures, OS code • Multi-media video and audio • CD-ROM, DVD • Simple address calculation • Directory entry points to first sector • File block i → disk sector address • Fast multi-block reads and writes • Minimize seeks between blocks
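The "simple address calculation" for a contiguous file can be sketched as follows. This is a minimal illustration, not taken from the slides: the function name `block_to_sector` and the `sectors_per_block` parameter are assumptions introduced here, and block indices are taken to start at 0.

```python
def block_to_sector(first_sector: int, block_index: int,
                    sectors_per_block: int = 1) -> int:
    """Return the first disk sector of file block `block_index`.

    `first_sector` is what the directory entry points to; the rest is
    pure arithmetic because the file occupies consecutive sectors.
    """
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    return first_sector + block_index * sectors_per_block


# Hypothetical file starting at sector 1000, 8 sectors per block.
print(block_to_sector(1000, 3, sectors_per_block=8))  # 1024
```

No table lookup or pointer chasing is involved, which is why contiguous files support fast random access as well as fast sequential reads.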
Contiguously Allocated Files
File Creation (Contiguous File System) • Search for an empty sequence of blocks • First-fit • Best-fit • Prone to fragmentation when … • Files come and go • Files change size • Similar to base-limit style virtual memory
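The two search strategies named above can be compared with a small sketch. It assumes free space is tracked as a list of (start block, length) runs; that representation and the function names are illustrative choices, not something the slides specify.

```python
from typing import List, Optional, Tuple

Run = Tuple[int, int]  # (start block, length in blocks) of a free run


def first_fit(free_runs: List[Run], needed: int) -> Optional[int]:
    """Return the start of the first free run that is large enough."""
    for start, length in free_runs:
        if length >= needed:
            return start
    return None


def best_fit(free_runs: List[Run], needed: int) -> Optional[int]:
    """Return the start of the smallest free run that is large enough."""
    candidates = [(length, start) for start, length in free_runs
                  if length >= needed]
    if not candidates:
        return None
    return min(candidates)[1]


free_runs = [(0, 10), (20, 4), (40, 6)]
print(first_fit(free_runs, 4))  # 0
print(best_fit(free_runs, 4))   # 20
```

First-fit stops at the first adequate hole and may split a large one; best-fit scans everything to leave the smallest leftover. Either way, holes too small to reuse accumulate as files come and go.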
Digression: Bad Block Management • Bad blocks on disks are inevitable • Part of manufacturing process (less than 1%) • Most are detected during formatting • Occasionally, blocks become bad during operation • Manufacturers typically add extra tracks to disks • Physical capacity = (1 + x) * rated_capacity • Who handles bad blocks? • Disk controller: Bad block list maintained internally • Automatically substitutes good blocks • Formatter: Re-organize track to avoid bad blocks • OS: Bad block list maintained by OS, bad blocks never used
Bad Block Management in Contiguous Allocation File Systems • Bad blocks must be concealed • Foul up the block-to-sector calculation • Methods • Look-aside list of bad sectors • Check each sector request against hash table • If present, substitute a replacement sector behind the scenes • Spare sectors in each track, remapped by formatting • Handling • Disk controller, invisible to OS • Lower levels of OS; invisible to most of file system or application
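The look-aside method can be sketched with a hash table that maps each known bad sector to its replacement. The sector numbers and the name `remap_sector` below are hypothetical, chosen only for illustration.

```python
from typing import Dict


def remap_sector(sector: int, bad_to_spare: Dict[int, int]) -> int:
    """Return the sector to actually access.

    Each request is checked against the look-aside table; a known bad
    sector is silently replaced by its spare, anything else passes through.
    """
    return bad_to_spare.get(sector, sector)


# Hypothetical table: sectors 17 and 42 are bad, spares start at 5000.
bad_to_spare = {17: 5000, 42: 5001}
print(remap_sector(17, bad_to_spare))  # 5000
print(remap_sector(18, bad_to_spare))  # 18
```

Because the substitution happens below the file system, the block-to-sector arithmetic for contiguous files stays valid.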
Contiguous Allocation – Extents • Extent: a contiguously allocated subset of a file • Directory entry points to • (For file with one extent) the extent itself • (For file with multiple extents) pointer to an extent block describing multiple extents • Advantages • Speed, ease of address calculation of contiguous file • Avoids (some of) the fragmentation issues • Can be adapted to support files across multiple disks • …
Contiguous Allocation – Extents • … • Disadvantages • Too many extents degenerate to indexed allocation • As in Unix-like systems, but not so well • Popular in 1960s & 70s • Currently used for large files in NTFS • Rarely mentioned in textbooks • Silberschatz, §11.4.1 & 22.5.1
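Address calculation with extents can be sketched as a walk over the extent list. The sketch assumes an extent block is a list of (start sector, length in blocks) pairs in file order and one sector per block; both are simplifying assumptions made here, not details from the slides.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start sector, length in blocks)


def extent_lookup(extents: List[Extent], block_index: int) -> int:
    """Map a 0-based file block index to a disk sector."""
    if block_index < 0:
        raise IndexError("block index must be non-negative")
    remaining = block_index
    for start, length in extents:
        if remaining < length:
            # Inside this extent: plain contiguous arithmetic applies.
            return start + remaining
        remaining -= length
    raise IndexError("block index is past the end of the file")


extents = [(100, 4), (500, 2), (900, 8)]
print(extent_lookup(extents, 5))  # 501
```

With one or two extents the lookup is nearly as cheap as for a contiguous file. With many short extents the list behaves like an index, which is the degeneration noted above.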
Questions?
Linked Allocation • Blocks scattered across disk • Each block contains pointer to next block • Directory points to first and last blocks • Sector header: • Pointer to next block • ID and block number of file
Linked Allocation (Note) • This is Silberschatz figure 11.5 • Links in the book are incorrect
Linked Allocation • Advantages • No space fragmentation! • Easy to create, extend files • Ideal for lots of small files • Disadvantages • Lots of disk arm movement • Space taken up by links • Sequential access only!
Variation on Linked Allocation – File Allocation Table (FAT) • Instead of link on each block, put all links in one table • the File Allocation Table (i.e., FAT) • One entry per physical block in disk • Directory points to first & last blocks of file • Each FAT entry points to next block (or EOF)
FAT File Systems • Advantages • Advantages of Linked File System • FAT can be cached in memory • Searchable at CPU speeds, pseudo-random access • Disadvantages • Limited size, not suitable for very large disks • FAT cache describes entire disk, not just open files! • Not fast enough for large databases • Used in MS-DOS, early Windows systems
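Following a file through an in-memory FAT can be sketched as below. The table is modeled as a dictionary from block number to next block number, with a hypothetical `EOF` marker; real FAT variants use reserved entry values, which this sketch does not attempt to reproduce.

```python
from typing import Dict, List

EOF = -1  # hypothetical end-of-chain marker for this sketch


def file_blocks(fat: Dict[int, int], first_block: int) -> List[int]:
    """Return every block of a file by following FAT links from the first."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat: Dict[int, int], first_block: int, n: int) -> int:
    """Return the n-th block (0-based) of a file.

    Still n link hops, but each hop is a memory lookup rather than a
    disk read, hence "pseudo-random access".
    """
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == EOF:
            raise IndexError("block index is past the end of the file")
    return block


fat = {4: 7, 7: 2, 2: EOF}
print(file_blocks(fat, 4))   # [4, 7, 2]
print(nth_block(fat, 4, 2))  # 2
```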
Disk Defragmentation • Re-organize blocks in disk so that file is (mostly) contiguous • Link or FAT organization preserved • Purpose: • To reduce disk arm movement during sequential accesses
Bad Block Management – Linked and FAT Systems • In OS: format all sectors of disk • Don’t reserve any spare sectors • Allocate bad blocks to a hidden file for the purpose • If a block becomes bad, append to the hidden file • Advantages • Very simple • No look-aside or sector remapping needed • Totally transparent without any hidden mechanism
Questions? Linked and FAT File Systems
Indexed Allocation • i-node: • Part of file metadata • Data structure lists the sector address of each block of file • Advantages • True random access • Only i-nodes of open files need to be cached • Supports small and large files
Unix/Linux i-nodes • Direct blocks: • Pointers to first n sectors • Single indirect table: • Extra block containing pointers to blocks n+1 .. n+m • Double indirect table: • Extra block containing single indirect blocks • …
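The direct / single indirect / double indirect layout can be sketched as a function that says which pointer path reaches a given block. The values `n_direct=12` and `ptrs_per_block=256` are assumptions chosen for illustration (the slide only names them n and m), and block indices here are 0-based, whereas the slide counts from 1.

```python
from typing import Tuple


def locate_block(block_index: int, n_direct: int = 12,
                 ptrs_per_block: int = 256) -> Tuple:
    """Describe how an i-node reaches file block `block_index`.

    Returns ("direct", slot), ("single", slot) or
    ("double", outer_slot, inner_slot).
    """
    if block_index < 0:
        raise IndexError("block index must be non-negative")
    if block_index < n_direct:
        return ("direct", block_index)
    block_index -= n_direct
    if block_index < ptrs_per_block:
        return ("single", block_index)
    block_index -= ptrs_per_block
    if block_index < ptrs_per_block ** 2:
        return ("double",
                block_index // ptrs_per_block,
                block_index % ptrs_per_block)
    raise IndexError("block needs a further level of indirection")


print(locate_block(5))    # ('direct', 5)
print(locate_block(12))   # ('single', 0)
print(locate_block(268))  # ('double', 0, 0)
```

Small files are served entirely by the direct pointers, while large files pay one or two extra block reads per access; this is how the same structure supports both.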
Indexed Allocation • Access to every block of file is via i-node • Bad block management • Similar to Linked/FAT systems • Disadvantage • Not as fast as contiguous allocation for large databases • Requires reference to i-node for every access vs. • Simple calculation of block to sector address
Questions?
Free Block Management in File Systems • Bitmap • Very compact on disk • Expensive to search • Supports contiguous allocation • Free list • Linked list of free blocks • Each block contains pointer to next free block • Only head of list needs to be cached in memory • Very fast to search and allocate • Contiguous allocation very difficult
Free Block Management – Bit Vector • One bit per block, bits numbered 0, 1, 2, …, n-1 • bit[i] = 0 ⇒ block[i] free • bit[i] = 1 ⇒ block[i] occupied • Free block number calculation: (number of bits per word) * (number of all-1 words skipped) + offset of first 0 bit
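Searching the bit vector for a free block can be sketched as follows, using the convention stated on this slide (0 = free, 1 = occupied). Under that convention a fully occupied word has all bits set, so the search skips all-1 words and takes the first 0 bit. The 8-bit word size and the choice of bit offset 0 as the least significant bit are assumptions made for this sketch.

```python
from typing import List, Optional


def first_free_block(words: List[int],
                     bits_per_word: int = 8) -> Optional[int]:
    """Return the number of the first free block, or None if the disk is full."""
    full = (1 << bits_per_word) - 1  # word with every block occupied
    for word_index, word in enumerate(words):
        if word == full:
            continue  # whole word compared at once: no free block here
        for offset in range(bits_per_word):
            if not (word >> offset) & 1:
                return bits_per_word * word_index + offset
    return None


# Two full words, then a word whose bit 3 is clear.
print(first_free_block([0xFF, 0xFF, 0b11110111]))  # 19
```

Comparing a whole word at a time keeps the search cheaper than testing every bit, though a nearly full disk still means a long scan.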
Free Block Management – Bit Vector (continued) • Bit map • Must be kept both in memory and on disk • Copy in memory and disk may differ • Must never have bit[i] = 1 in memory and bit[i] = 0 on disk for any block[i]
Free Block Management – Bit Vector (continued) • Solution: • Set bit[i] = 1 on disk • Allocate block[i] • Set bit[i] = 1 in memory • Similarly for set of contiguous blocks • Potential for lost blocks in event of crash! • Discussion: How do we solve this problem?
Free Block Management – Linked List • Linked list of free blocks • Not in order! • Cache first few free blocks in memory • Head of list must be stored both • On disk • In memory • Each block must be written to disk when freed • Potential for losing blocks?
Reading Assignment • Silberschatz, Chapter 11 • Ignore §11.9, 11.10 for now! • Tanenbaum (Modern Operating Systems), Chapter 6
Scalability of File Systems • Question: How large can a file be? • Answer: limited by • Number of bits in length field in metadata • Size & number of block entries in FAT or i-node • Question: How large can file system be? • Answer: limited by • Size & number of block entries in FAT or i-node
MS-DOS & Windows • FAT-12 (primarily on floppy disks): • 4096 512-byte blocks • Only 4086 blocks usable! • FAT-16 (early hard drives): • 64 K blocks; block sizes up to 32 K bytes • 2 GBytes max per partition, 4 partitions per disk • FAT-32 (Windows 95) • 2^28 blocks; up to 2 TBytes per disk • Max size FAT requires 2^32 bytes in RAM!
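The FAT-12 and FAT-16 limits follow directly from the number of table entries times the block size. A small worked calculation, using only the figures quoted on this slide (the variable names are illustrative):

```python
KIB = 1024

# FAT-12: 12-bit entries give 2**12 = 4096 addressable 512-byte blocks.
fat12_bytes = 2**12 * 512

# FAT-16: 16-bit entries give 2**16 = 64 K blocks of at most 32 KiB each.
fat16_bytes = 2**16 * 32 * KIB

print(fat12_bytes // KIB**2, "MiB")  # 2 MiB
print(fat16_bytes // KIB**3, "GiB")  # 2 GiB
```

The FAT-16 result matches the 2 GBytes per partition limit stated above: the only ways to address more space are larger blocks or wider table entries.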
MS-DOS File System (continued) • Maximum partition for different block sizes • The empty boxes represent forbidden combinations
Classical Unix • Maximum number of i-nodes = 64K! • How many files in a modern PC? • I-node structure allows very large files, but … • Limited by size of internal fields
Modern Operating Systems • Need much larger, more flexible file systems • Many terabytes per system • Multi-terabyte files • Suitable for both large and small • Cache only open files in RAM
Examples of Modern File Systems • Windows NTFS • Silberschatz §22.5 • Tanenbaum §11.7 • Linux ext2fs • Silberschatz §21.7.2 • Other file systems … • Consult your favorite Linux system documentation
New Topic
Mounting • mount -t type device pathname • Attach device (which contains a file system of type type) to the directory at pathname • File system implementation for type gets loaded and connected to the device • Anything previously below pathname becomes hidden until the device is un-mounted again • The root of the file system on device is now accessed as pathname • E.g., mount -t iso9660 /dev/cdrom /myCD
Mounting (continued) • OS automatically mounts devices in mount table at initialization time • /etc/fstab in Linux • Users or applications may mount devices at run time, explicitly or implicitly, e.g., • Insert a floppy disk • Plug in a USB flash drive • Type may be implicit in device • Windows equivalent • Map drive
Virtual File Systems • Virtual File Systems (VFS) provide an object-oriented way of implementing file systems. • VFS allows the same system call interface to be used for different types of file systems. • The API is to the VFS interface, rather than to any specific type of file system.
Schematic View of Virtual File System
Virtual File System (continued) • Mounting: formal mechanism for attaching a file system to the Virtual File interface
Linux Virtual File System (VFS) • A generic file system interface provided by the kernel • Common object framework • superblock: a specific, mounted file system • i-node object: a specific file in storage • d-entry object: a directory entry • file object: an open file associated with a process
Linux Virtual File System (continued) • VFS operations • super_operations: • read_inode, sync_fs, etc. • inode_operations: • create, link, etc. • dentry_operations: • d_compare, d_delete, etc. • file_operations: • read, write, seek, etc.
Linux Virtual File System (continued) • Individual file system implementations conform to this architecture. • May be linked to kernel or loaded as modules • Linux kernel 2.6 supports over 50 file systems in official version • E.g., minix, ext, ext2, ext3, iso9660, msdos, nfs, smb, …
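The object-oriented idea behind VFS can be sketched in a few lines: callers use one interface, and each file system type supplies its own implementation of the operations. This is a loose analogy in Python, not the kernel's C structures; the classes `FileOperations`, `MemFile`, `UpperCaseFile` and the function `vfs_read` are all hypothetical names invented for this sketch.

```python
from abc import ABC, abstractmethod


class FileOperations(ABC):
    """Hypothetical stand-in for a per-file-system table of operations."""

    @abstractmethod
    def read(self, offset: int, size: int) -> bytes: ...

    @abstractmethod
    def write(self, offset: int, data: bytes) -> int: ...


class MemFile(FileOperations):
    """Toy 'file system type' that keeps file contents in memory."""

    def __init__(self) -> None:
        self._data = bytearray()

    def read(self, offset: int, size: int) -> bytes:
        return bytes(self._data[offset:offset + size])

    def write(self, offset: int, data: bytes) -> int:
        end = offset + len(data)
        if len(self._data) < end:
            # Grow the file with zero bytes up to the end of the write.
            self._data.extend(b"\0" * (end - len(self._data)))
        self._data[offset:end] = data
        return len(data)


class UpperCaseFile(MemFile):
    """Second toy type with different read behavior, same interface."""

    def read(self, offset: int, size: int) -> bytes:
        return super().read(offset, size).upper()


def vfs_read(file: FileOperations, offset: int, size: int) -> bytes:
    """Generic entry point: dispatches to whichever implementation is attached."""
    return file.read(offset, size)


for f in (MemFile(), UpperCaseFile()):
    f.write(0, b"hello")
    print(vfs_read(f, 0, 5))
```

The caller of `vfs_read` never needs to know which implementation it is talking to, which mirrors how the same system calls work across different mounted file system types.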
Questions?
Implementation of Directories • A list of [name, information] pairs • Must be scalable from very few entries to very many • Name: • User-friendly, variable length • Any language • Fast access by name • Information: • File metadata (itself) • Pointer to file metadata block (or i-node) on disk • Pointer to first & last blocks of file • Pointer to extent block(s) • …
Very Simple Directory • Entries: [name1, attributes], [name2, attributes], [name3, attributes], [name4, attributes], … • Short, fixed length names • Attribute & disk addresses contained in directory • MS-DOS, etc.
Simple Directory • Entries: name1, name2, name3, name4, …, each pointing to an i-node (data structure containing attributes) • Short, fixed length names • Attributes in separate blocks (e.g., i-nodes) • Attribute pointers are disk addresses (or i-node numbers) • Older Unix versions, MS-DOS, etc.
More Interesting Directory • Entries: fixed-size attribute records, each pointing to its name in a heap (name1, name2, longer_name3, very_long_name4, …) • Variable length file names • Stored in heap at end • Modern Unix, Windows • Linear or logarithmic search for name • Compaction needed after • Deletion, Rename
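The "names in a heap" layout can be sketched with fixed-size entries that record where each name lives in a shared byte area. The class below is an illustrative in-memory model only: the `Directory` name, the tuple layout, and the use of an i-node number as the sole attribute are assumptions, and deletion with compaction is left out.

```python
from typing import List, Optional, Tuple


class Directory:
    """Fixed-size entries plus a heap holding variable-length names."""

    def __init__(self) -> None:
        # Each entry: (offset of name in heap, name length, i-node number)
        self.entries: List[Tuple[int, int, int]] = []
        self.heap = bytearray()

    def add(self, name: str, inode_number: int) -> None:
        encoded = name.encode("utf-8")  # names may be in any language
        self.entries.append((len(self.heap), len(encoded), inode_number))
        self.heap += encoded

    def lookup(self, name: str) -> Optional[int]:
        """Linear search by name; returns the i-node number or None."""
        target = name.encode("utf-8")
        for offset, length, inode_number in self.entries:
            if self.heap[offset:offset + length] == target:
                return inode_number
        return None


d = Directory()
d.add("name1", 11)
d.add("very_long_name4", 14)
print(d.lookup("very_long_name4"))  # 14
print(d.lookup("missing"))          # None
```

Removing or renaming an entry would leave a gap in the heap, which is why the slide notes that compaction is needed after those operations.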