File System Implementation Overview

Chapter 12: File System Implementation

Chapter 12: File System Implementation • File-System Structure • File-System Implementation • Directory Implementation • Allocation Methods • Free-Space Management • Efficiency and Performance • Recovery

Objectives • To describe the details of implementing local file systems and directory structures • To discuss block allocation and free-block algorithms and trade-offs • To discuss memory-management issues related to the file system

Definitions • A block is the smallest unit of disk space that can be allocated • Often equated with a sector, although a block may span several sectors • Typically 512 bytes, but can vary from 32 bytes to 4 KB (and more) • Sequential access to a file means reading from start to finish in order • Direct access to a file means reading a specific part, without reading what comes before or after

File-System Structure • Logical file system provides the symbolic interface to programs and users • Read “c:\bob’s stuff\bob.txt” • Includes the file control block with information such as owner and user permissions • File-organization module maps logical files to physical data • “bob.txt” = blocks 71, 12, 14, 2 • Basic file system issues device-specific commands • Read drive 1, cylinder 73, track 2, sector 10 • I/O controller and device driver control the physical device • Move reading head, wait for disk to spin, read

File Systems • UNIX • UFS (Unix File System) • FFS (Berkeley Fast File System) • Linux • ext (Extended File System: ext2, ext3) • Microsoft • FAT (File Allocation Table: FAT12, FAT16, FAT32) • NTFS • Google • GFS • 64 MB chunks • Optimized for read and append operations rather than overwriting or compressing • Files distributed over a large number of cheap computers with high failure rates

File-System Implementation • Some basic data structures commonly found in file systems • On disk: • Boot control block contains info needed to boot OS from that volume • UFS: boot block, NTFS: partition boot sector • Volume control block contains volume details, such as block sizes and number of blocks • UFS: superblock, NTFS: master file table • Directory structure organizes the files • UFS: separate data structure, NTFS: part of the master file table • File Control Block (FCB) contains details about its file • UFS: inode for each file, NTFS: row in database in master file table • In memory: • Mount table contains information on each mounted volume • Directory structure cache contains information on recently-accessed directories • System-wide open-file table contains the FCB of every open file • Per-process open-file table contains a pointer to the correct entry in the system-wide table • Unix: file descriptor, Windows: file handle • Buffers to hold blocks during read/write operations

File-System Implementation • Opening a file • Reading a file
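The relationship between the two in-memory open-file tables can be sketched as follows. This is a minimal illustration only, not how any particular kernel does it: the names FCB, SystemEntry and OpenFileTables are hypothetical, a Python dict stands in for the on-disk directory, and the file descriptor is simply the lowest unused integer for that process.

```python
from dataclasses import dataclass


@dataclass
class FCB:
    """Hypothetical file control block: just a few illustrative fields."""
    name: str
    owner: str
    blocks: list


@dataclass
class SystemEntry:
    """One row of the system-wide open-file table."""
    fcb: FCB
    open_count: int = 0


class OpenFileTables:
    def __init__(self, directory):
        self.directory = directory   # name -> FCB, stands in for the on-disk directory
        self.system_wide = {}        # name -> SystemEntry (one per open file)
        self.per_process = {}        # pid -> {fd: {"entry": SystemEntry, "offset": int}}

    def open(self, pid, name):
        entry = self.system_wide.get(name)
        if entry is None:
            # First open: copy the FCB into the system-wide table.
            entry = SystemEntry(self.directory[name])
            self.system_wide[name] = entry
        entry.open_count += 1
        table = self.per_process.setdefault(pid, {})
        fd = 0
        while fd in table:           # lowest unused descriptor for this process
            fd += 1
        table[fd] = {"entry": entry, "offset": 0}
        return fd

    def close(self, pid, fd):
        slot = self.per_process[pid].pop(fd)
        entry = slot["entry"]
        entry.open_count -= 1
        if entry.open_count == 0:
            # Last close: drop the FCB from the system-wide table.
            del self.system_wide[entry.fcb.name]
```

Two processes opening the same file share one system-wide entry, while each keeps its own descriptor and file offset.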

Directory Implementation • How to design and manage the directory structure? • Linear list of file names with pointer to the data blocks • Simple to program • Time-consuming to execute • Can be improved with memory cache and tree structures • Hash table of file names with pointer to the data blocks • Decreases directory search time • Need to handle hash collisions (using linked list for each hash value) • Fixed table size / range of hash values
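As a rough sketch of the hash-table option, the following keeps a fixed number of buckets and resolves collisions with a per-bucket list. The class name HashDirectory and the toy hash (sum of the name's bytes) are assumptions chosen so that collisions are easy to demonstrate; a real file system would use a better hash function.

```python
class HashDirectory:
    """Toy directory: fixed-size hash table, chaining on collisions."""

    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]   # fixed table size

    def _index(self, name):
        # Deterministic toy hash: sum of the bytes of the name.
        return sum(name.encode()) % len(self.buckets)

    def add(self, name, first_block):
        bucket = self.buckets[self._index(name)]
        for i, (existing, _) in enumerate(bucket):
            if existing == name:
                bucket[i] = (name, first_block)    # overwrite existing entry
                return
        bucket.append((name, first_block))         # chain on collision

    def lookup(self, name):
        # Only one bucket is searched, not the whole directory.
        for existing, first_block in self.buckets[self._index(name)]:
            if existing == name:
                return first_block
        return None
```

With this toy hash, the names "ab" and "ba" land in the same bucket, so both lookups walk the same short chain.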

Allocation Methods • Blocks on disk are small • Typically 512 bytes • Files can span several blocks • We need an allocation method! • Use space efficiently but allow quick retrieval • Disk setup: a head that moves and reads/writes on command, and disks that rotate at varying speeds • Allows for some flexibility in allocation methods • Slowest operation: moving the head (disk seek) • Contiguous allocation • Linked allocation • Indexed allocation • Some systems support multiple methods, but normally systems support just one

Contiguous Allocation • Each file occupies a set of contiguous blocks on the disk • Directory represents a file as starting block and length • Advantage: Efficiency • Sequential access to a file requires one disk seek to get to the starting block • Direct access to a file location i requires one disk seek to get to starting block + i • Disadvantages: • External fragmentation • Needs a free-space allocation strategy to choose a hole for each new file • Need to predict the final size of files • Too small and the file won’t be able to grow • Too big and the system wastes space (internal fragmentation) • Right size but file grows slowly: temporary waste of space
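The address arithmetic and the hole-selection problem above can be sketched in a few lines. The slides do not name an allocation strategy, so first fit is used here purely as an assumption; physical_block and first_fit are hypothetical helper names, and the free space is modelled as a list of (start, count) holes.

```python
def physical_block(start, length, i):
    """Direct access: logical block i of a contiguous file is start + i."""
    if not 0 <= i < length:
        raise IndexError("logical block outside the file")
    return start + i


def first_fit(free_runs, needed):
    """Pick the first hole with at least `needed` blocks.

    free_runs is a list of (start, count) holes sorted by start.
    The list is updated in place; returns the start block or None.
    """
    for idx, (start, count) in enumerate(free_runs):
        if count >= needed:
            if count == needed:
                free_runs.pop(idx)                       # hole used up exactly
            else:
                free_runs[idx] = (start + needed, count - needed)
            return start
    return None                                          # no hole is large enough
```

With holes [(0, 2), (5, 4), (20, 10)], a 3-block request is placed at block 5. A later 11-block request fails even though 13 blocks remain free in total, which is external fragmentation in miniature.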

Contiguous Allocation Improvements • Expand files using extents • An additional contiguous set of blocks • Linked (not contiguous) to the file's original contiguous set of blocks • Reduces (does not eliminate) the problem of internal fragmentation • Contiguous allocation is still used because it’s so efficient • IBM VM/CMS • Veritas FS (with extents)

Linked Allocation • Each file occupies blocks randomly scattered on the disk • Directory represents a file as starting block (and sometimes end block) • Each block contains a pointer to the next one • Advantages: • No external fragmentation • Minimal internal fragmentation • No need to know the size of files in advance • Disadvantages: • Wasted space for pointers (a few bytes per block) • Sequential access requires multiple disk seeks • No direct access • Poor reliability: one pointer error ruins an entire file (or several files!)

Linked Allocation Improvements • Allocating clusters of contiguous blocks instead of individual blocks • Pointers from cluster to cluster instead of block to block: less wasted space • One disk seek per cluster instead of per block: more efficient • More internal fragmentation • Doubly linked lists (pointers to previous and next block) • Improves reliability • Doubles wasted space for pointers • Store pointers in a File Allocation Table (FAT) • FAT at the beginning of volume, separate from data • Can be cached in memory to improve direct access efficiency and free space management • Can be duplicated to improve reliability
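To make the FAT idea concrete, here is a minimal sketch in which the table is a Python list indexed by block number and each entry holds the number of the next block. The marker values FAT_EOF and FAT_FREE and the function names are assumptions for illustration, not the on-disk encoding of any real FAT variant. The example chain reuses the block numbers 71, 12, 14, 2 from the File-System Structure slide.

```python
FAT_EOF = -1    # assumed marker: last block of a file
FAT_FREE = -2   # assumed marker: block not in use


def file_blocks(fat, start):
    """Follow the chain from `start` and return every block of the file."""
    blocks, seen = [], set()
    b = start
    while b != FAT_EOF:
        if b < 0 or b in seen:
            raise ValueError("corrupt chain")   # free marker or a loop
        seen.add(b)
        blocks.append(b)
        b = fat[b]
    return blocks


def nth_block(fat, start, i):
    """Direct access: follow i pointers in the cached table, with no disk seeks."""
    b = start
    for _ in range(i):
        b = fat[b]
        if b < 0:
            raise IndexError("logical block past end of file")
    return b


def free_blocks(fat):
    """Free-space management falls out of the same table."""
    return [n for n, nxt in enumerate(fat) if nxt == FAT_FREE]


# Example volume of 80 blocks holding one file: 71 -> 12 -> 14 -> 2.
fat = [FAT_FREE] * 80
fat[71], fat[12], fat[14], fat[2] = 12, 14, 2, FAT_EOF
```

Because the whole chain lives in the table, finding logical block 2 of the file touches only memory once the FAT is cached; the data block itself is the only disk access.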

Indexed Allocation • Each file occupies blocks randomly scattered on the disk • A single index block contains all the pointers • Directory represents a file by its index block • Advantages: • All those of linked allocation, plus • Direct access possible through the index block • Disadvantages: • Wasted space & internal fragmentation for pointers (at least one block per file) • Sequential access requires multiple disk seeks

Indexed Allocation • Index size limit is the index block size • Number of pointers per block n • Too small and you can’t handle large files • Too large and you waste space • Linked scheme • Link together several index blocks • Last pointer in a block is to the next index block or “nil” (end of index) • No fixed file size limit (bounded only by the volume) • Multilevel index • First-level index block points to second-level index blocks which point to file blocks • High file size limit: n² blocks • Combined scheme • The first n - 3 pointers point to data blocks, the last three point to multilevel indexes • Of those three, the first points to a single-level index, the second to a two-level index, and the third to a three-level index • Very high file size limit: (n-3) + n + n² + n³ blocks
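The size limits above are easy to check numerically. The sketch below assumes 512-byte blocks and 4-byte pointers (so n = 128); both values are illustrative assumptions, as are the function names. It follows the slide's simplified layout of n - 3 direct pointers, which is not the exact layout of any specific inode format.

```python
def pointers_per_block(block_size, pointer_size):
    """n: how many block pointers fit in one index block."""
    return block_size // pointer_size


def max_blocks_two_level(n):
    """Two-level index: n second-level blocks of n pointers each."""
    return n ** 2


def max_blocks_combined(n):
    """Combined scheme from the slide: (n-3) + n + n^2 + n^3 blocks."""
    return (n - 3) + n + n ** 2 + n ** 3


def locate(n, i):
    """Which part of the combined index holds logical block i?

    Returns (region, offset within that region).
    """
    direct = n - 3
    if i < direct:
        return ("direct", i)
    i -= direct
    if i < n:
        return ("single", i)
    i -= n
    if i < n ** 2:
        return ("double", i)
    i -= n ** 2
    if i < n ** 3:
        return ("triple", i)
    raise ValueError("block number exceeds the maximum file size")


n = pointers_per_block(512, 4)              # 128 pointers per block
limit_blocks = max_blocks_combined(n)       # 2,113,789 blocks
limit_bytes = limit_blocks * 512            # about 1 GiB with these assumptions
```

Under these assumptions the three-level term dominates: n³ alone contributes 2,097,152 of the 2,113,789 blocks.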

Indexed Allocation • UFS uses combined scheme • Information stored in each file’s inode

Free-Space Management • File system needs to keep track of free space on disk • The free-space list • Must allow for efficient search for free space • Must take minimal space

Free-Space Management • Keep a bit vector (bit map) with one bit per block (or cluster) • 1 for free block, 0 for used block • Easy to find n contiguous free blocks • Can be used with very efficient bit operators • Downside: can take a lot of space for large disks with small blocks • Example: 000011110101100111001110111111001110110000110100001
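Searching a bit vector for n contiguous free blocks can be sketched as a single left-to-right scan. For readability this illustration stores the vector as a string of "0" and "1" characters and checks one bit at a time; that representation and the name find_free_run are assumptions, and a real implementation would test whole machine words with bit operators.

```python
def find_free_run(bitmap, n):
    """Return the index of the first run of n free blocks, or None.

    bitmap is a string in which "1" marks a free block and "0" a used one,
    matching the convention on the slide.
    """
    run_start, run_len = None, 0
    for i, bit in enumerate(bitmap):
        if bit == "1":
            if run_len == 0:
                run_start = i        # a new run of free blocks begins here
            run_len += 1
            if run_len == n:
                return run_start
        else:
            run_len = 0              # run broken by a used block
    return None
```

Applied to the example vector from the slide, which begins 00001111, a request for four contiguous blocks is satisfied starting at block 4 (counting from 0).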

Free-Space Management • Linked list of free blocks • Allocate space by picking blocks from head of list onwards, then update head pointer • No wasted space (pointers are in free blocks) • Downsides • Inefficient to scan entire free space list (though that’s infrequent) • Poor reliability: one pointer error can mark a file as free space • File Allocation Table • FAT keeps track of which block links to which block • Simple to add a special marker for free blocks

Free-Space Management • Indexed free space (grouping method) • A free block becomes an index block containing pointers to other free blocks • Used in a linked scheme to get limitless free space (potentially the entire volume) • Contiguous free space management (counting method) • Oftentimes sets of blocks are freed at once • In contiguous allocation or with clustering • Keep track of starting block and number of free contiguous blocks instead of address of free block • Typically creates a shorter free space list
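The counting method amounts to run-length encoding the free space. As a small sketch, the hypothetical helper below converts a bit vector (again written as a string, with "1" meaning free) into the list of (starting block, number of contiguous free blocks) pairs that the counting method would store.

```python
def free_runs(bitmap):
    """Convert a free-space bit vector into (start, count) pairs."""
    runs = []
    start = None
    for i, bit in enumerate(bitmap):
        if bit == "1" and start is None:
            start = i                          # a free run begins
        elif bit == "0" and start is not None:
            runs.append((start, i - start))    # the run just ended
            start = None
    if start is not None:
        runs.append((start, len(bitmap) - start))   # run reaches the end
    return runs
```

For the ten-block vector "0011010111" the list has three entries, (2, 2), (5, 1) and (7, 3), instead of six individual free-block addresses. The saving grows with the length of the runs.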

Efficiency and Performance • Disk is the slowest part of the computer system – we want to use it efficiently! • Efficiency depends on design decisions • UFS • Pre-allocate and distribute inodes on volume, then write file data near the inodes • Wastes space, but improves allocation and free-space management, and reduces disk seeks • Vary cluster size to reduce internal fragmentation • FAT • Limit pointer sizes to limit wasted space • But that also limited the maximum disk size that could be managed • Pointers had to increase from 12 bits to 16 bits to 32 bits over time

Efficiency and Performance • We can further improve system performance by caching disk data in memory • Disk cache • In disk controller, interface between read head and system bus • Reduce latency time by reading data from disk to disk cache, then transferring from cache to memory • Buffer cache • In main memory, for blocks expected to be used again • Page cache • In main memory, for pages of file data expected to be used by a process • If the same cache also stores process pages, we call it a unified virtual memory

Efficiency and Performance • Problem: • A file is read by a process using read() system calls, and is also memory-mapped by another process • It is in two caches at once: double-caching • Wastes memory • Wastes CPU cycles • Risk of inconsistencies • A unified buffer cache uses the page cache to cache both memory-mapped file pages and file system I/O

Efficiency and Performance • Cache is memory: Memory management is an issue! • Solaris up to 2.5.1 • No distinction between process pages and page cache pages • Result: a process putting a lot of files in memory caused the page cache to expand and fill up memory • Pageout algorithm kicked in, swapped out process pages • Thrashing! • Solaris 2.6 and 7 • Optional priority paging can be enabled to limit the growth of the page cache • Adds a new threshold “cachefree” before “lotsfree”, at which Pageout swaps out page cache pages only • Solaris 8 • Pageout handles process pages and page cache pages separately • One cannot reclaim pages from the other one

Efficiency and Performance • When to write to disk? • Asynchronous writes keep the changes in the page cache and delay the write to disk until later • Faster, most commonly done • Can fail if the disk becomes unavailable in the meantime! • Synchronous writes write the changes to disk immediately • Slower, done for changes to FCB • Memory replacement algorithm • LRU not optimal for sequential file access • Free-behind removes a page from the buffer when the next page is requested • Read-ahead reads the requested page and several subsequent pages to memory at once • Minimizes the overhead of having several small page I/Os
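Free-behind and read-ahead can be illustrated together with a toy buffer for sequential reads. This is a sketch under simplifying assumptions: the class name SequentialBuffer, the read_block callback and the read_ahead parameter are all hypothetical, one "disk request" is counted per miss regardless of how many pages it fetches, and end-of-file handling is left out.

```python
class SequentialBuffer:
    """Toy page buffer combining read-ahead and free-behind."""

    def __init__(self, read_block, read_ahead=2):
        self.read_block = read_block   # callback that fetches one page from "disk"
        self.read_ahead = read_ahead   # how many extra pages to fetch on a miss
        self.pages = {}                # page number -> data currently buffered
        self.disk_requests = 0         # number of (batched) disk requests issued

    def get(self, page):
        if page not in self.pages:
            # Read-ahead: one request brings in the page and its successors.
            self.disk_requests += 1
            for p in range(page, page + self.read_ahead + 1):
                self.pages[p] = self.read_block(p)
        data = self.pages[page]
        # Free-behind: the previous page will not be needed again.
        self.pages.pop(page - 1, None)
        return data
```

Reading pages 0 through 5 in order with read_ahead=2 issues two disk requests instead of six, and the buffer never holds pages that have already been consumed.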

Recovery • Disk is non-volatile: there is an expectation that changes made will remain no matter what • But a lot of information is cached in memory for efficiency • Volatile: if the system crashes, those changes are lost • Worse: FCB changes are synchronously written, so the FCBs on disk can end up inconsistent with the data that was lost • Consistency checking • Compare info in directory structure with data blocks on disk, and try to fix inconsistencies • Unix: fsck, Windows: chkdsk • Back up data from disk to another storage device • Full backup to completely duplicate file system • Incremental backup to duplicate changes since last backup • On error, restore latest backup

Recovery • Log-based transaction-oriented (or journaling) file systems • Record each update to the file system as a transaction • All transactions are synchronously written to a log before they are done • A transaction is considered committed once it is written to the log • The requested action is done asynchronously • When the action is done, the transaction is removed from the log • If the file system crashes, we have a log of committed but undone transactions • Execute them to avoid inconsistencies
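The commit, apply and recover steps above can be sketched with two in-memory structures standing in for persistent storage. Everything here is illustrative: JournaledStore and its method names are hypothetical, the log and disk attributes are assumed to survive a crash, and each transaction is a single idempotent key/value update, which is far simpler than a real journal.

```python
class JournaledStore:
    """Toy write-ahead journal: log first, apply later, replay on recovery."""

    def __init__(self):
        self.log = []      # stands in for the on-disk log
        self.disk = {}     # stands in for the on-disk file-system structures
        self.next_id = 0

    def commit(self, key, value):
        """Synchronously append the transaction to the log; it is now committed."""
        txn = (self.next_id, key, value)
        self.next_id += 1
        self.log.append(txn)
        return txn

    def apply(self, txn):
        """Perform the update (done asynchronously in a real system)."""
        _, key, value = txn
        self.disk[key] = value     # idempotent: safe to repeat after a crash
        self.log.remove(txn)       # done, so drop it from the log

    def recover(self):
        """After a crash, redo every committed transaction still in the log."""
        for txn in list(self.log):
            self.apply(txn)
```

If the system crashes after the first of two committed transactions has been applied, recover() finds only the second in the log and replays it, leaving the on-disk state consistent.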

Review • Compare our three allocation schemes (contiguous, linked, and indexed) in terms of • External fragmentation • Sequential read efficiency • Direct read efficiency • File size increases

Exercises • Skip the following sections: 12.2.2 (Partitions and Mounting), 12.2.3 (Virtual File Systems), 12.5.5 (Space Maps), 12.8 (NFS), 12.9 (Example: The WAFL File System) • If you have the “with Java” textbook, skip the Java sections and subtract 1 from the following section numbers • 12.1 • 12.3 • 12.4 • 12.5 • 12.6 • 12.9 • 12.10 • 12.11 • 12.13 • 12.14 • 12.16 • 12.19

End of Chapter 12
