820 likes | 999 Views
Chapter 11: File System Implementation. Chapter 11: File System Implementation. File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Log-Structured File Systems NFS
E N D
Chapter 11: File System Implementation • File-System Structure • File-System Implementation • Directory Implementation • Allocation Methods • Free-Space Management • Efficiency and Performance • Recovery • Log-Structured File Systems • NFS • Example: WAFL File System
Objectives • To describe the details of implementing local file systems and directory structures • To describe the implementation of remote file systems • To discuss block allocation and free-block algorithms and trade-offs
File-System Structure • File structure • Logical storage unit • Collection of related information • File system resides on secondary storage (disks) • File system organized into layers • File control block – storage structure consisting of information about a file
Disk Organization Boot Sector Volume Directory Blk0 Blk1 … Blkk-1 Track 0, Cylinder 0 … Blkk Blkk+1 Blk2k-1 Track 0, Cylinder 1 … … Blk Blk Blk Track 1, Cylinder 0 … … Blk Blk Blk Track N-1, Cylinder 0 … … Blk Blk Blk Track N-1, Cylinder M-1
In-Memory File System Structures • The following figure illustrates the necessary file system structures provided by the operating systems. • Figure 12-3(a) refers to opening a file. • Figure 12-3(b) refers to reading a file.
Virtual File Systems • Virtual File Systems (VFS) provide an object-oriented way of implementing file systems. • VFS allows the same system call interface (the API) to be used for different types of file systems. • The API is to the VFS interface, rather than any specific type of file system.
Chapter 11: File System Implementation • File-System Structure • File-System Implementation • Directory Implementation • Allocation Methods • Free-Space Management • Efficiency and Performance • Recovery • Log-Structured File Systems • NFS • Example: WAFL File System
Directory Implementation • Linear list of file names with pointer to the data blocks. • simple to program • time-consuming to execute • Hash Table – linear list with hash data structure. • decreases directory search time • collisions – situations where two file names hash to the same location • fixed size
Chapter 11: File System Implementation • File-System Structure • File-System Implementation • Directory Implementation • Allocation Methods • Free-Space Management • Efficiency and Performance • Recovery • Log-Structured File Systems • NFS • Example: WAFL File System
Allocation Methods • An allocation method refers to how disk blocks are allocated for files: • Contiguous allocation • Linked allocation • Indexed allocation
Contiguous Allocation • Each file occupies a set of contiguous blocks on the disk • Simple – only starting location (block #) and length (number of blocks) are required • Random access
Contiguous Allocation • Mapping from logical (LA) to physical Q (quotient) LA/512 R (remainder) • Block to be accessed = Q + starting address • Displacement into block = R
Contiguous Allocation • Problems • Wasteful of space • dynamic storage-allocation problem: finding the best hole • tend to externally fragment the physical disk space into small sets of contiguous blocks • these sets are too little to contain most files • does not accommodate dynamic file sizes • if add to end of file and not enough physical adjacent blocks, must rewrite entire file. • can use best fit, first fit, worst fit strategies.
Extent-Based Systems • Many newer file systems (I.e. Veritas File System) use a modified contiguous allocation scheme • Extent-based file systems allocate disk blocks in extents • An extent is a contiguous block of disks • Extents are allocated for file allocation • A file consists of one or more extents. • When run out of space, allocate another extent and link it to the first extent • Problems • Internal fragmentation • Could get external fragmentation if extents are different sizes.
pointer block = First block … Head: 417 ... Length Length Length Byte 0 Byte 0 Byte 0 ... ... ... Byte 4095 Byte 4095 Byte 4095 Block 0 Block 1 Block N-1 Linked Allocation • Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.
Linked Allocation (Cont.) • Simple – need only starting address • Free-space management system – no waste of space • No random access • Mapping Q LA/511 R Block to be accessed is the Qth block in the linked chain of blocks representing the file. Displacement into block = R + 1
Linked Lists • Disadvantages • random access of bytes in the stream will be slow. • requires list traversal • each block must be read form device so the link field can be obtained and next block referenced • can use doubly linked lists to increase performance
Linked Lists • Disadvantages • overhead • must store pointer to next block and length of the block • if use doubly linked lists, increases storage • Pointers take up space: if pointer is 4 bytes and block is 512 bytes, waste 0.78% of disk. • Solution: • collect blocks into multiples called clusters. • allocate clusters rather than blocks. • pointers use much smaller percentage of disk space • increase in internal fragmentation • clustersimprove disk access time for many other algorithms; used in most OS’s
DOS FAT file system • File-allocation table (FAT) – disk-space allocation used by MS-DOS and OS/2. • Uses linked allocation • MS-DOS floppy disk • uses the file allocation table (FAT) file system • disk divided into • a reserved area (contains the boot program) • the file allocation tables • a root directory • file space
DOS FAT file system • Space allocated for files represented by values in the allocation table • provide a linked list of all the blocks in the file • special values indicate end-of-file, unallocated blocks, bad blocks • Original FAT • no subdirectories • limited to very small disks • hard to recover the disk if the allocation tables were damaged.
DOS FAT file system • Simple FAT variant • one FAT entry corresponding to each block on the disk • a file is a set of disk blocks • the FAT entry corresponding to the first entry designates the logical sector number of the second block • the FAT entry of the second block specifies the logical sector number of the third block • etc. • the FAT entry for the last block contains an end-of-file (EOF) designator • The FAT table maps the logical sector numbers to physical blocks
DOS FAT file system • Simple FAT variant (cont) • the file descriptor gives the address of the first sector i • this is also the index into the FAT • can use the FAT entry to reference the next logical sector in the file. • if j is the contents of the FAT table for entry i • j is the logical sector number for the next block • j is also the index to the next FAT entry • i is the physical location of the current block
DOS FAT file system • To allocate a new block: find the first 0-valued table entry • then replace the previous end-of-file value with the address of the new block • replace the 0 in the new block with the end-of-file value. • FAT can result in many disk head seeks • disk head must move to the start of the partition to read the FAT and find the location of the block in question • then move to the location of the block itself. • worst case: both moves occur for each of the blocks. • solution: cache the FAT • FAT is good for random access • disk had can find location of any block by reading the FAT.
DOS FAT file system • other FAT variants • FAT varies among disk types • change number of entries in the FAT (can be 12, 16, or 32) • change number of actual tables • change size of the logical sector addressed by a FAT entry
DOS FAT file system • other FAT variants: use clusters of sectors • cluster is a group of contiguous sectors • a cluster is treated as a virtual sector within the FAT. • the FAT entry addresses a cluster rather than an individual sectors • disk space is allocated to files on a cluster basis • today: • floppy disks use 12-bit FATs • hard drives use 16-bit or 32-bit FATs
Indexed Allocation • Brings all pointers together into the index block. • Logical view. index table
Indexed Allocation (Cont.) • Need index table • Random access • Dynamic access without external fragmentation, but have overhead of index block. • Mapping from logical to physical in a file of maximum size of 256K words and block size of 512 words. We need only 1 block for index table. Q LA/512 R Q = displacement into index table R = displacement into block
Indexed Allocation – Mapping (Cont.) • Mapping from logical to physical in a file of unbounded length (block size of 512 words). • Linked scheme – Link blocks of index table (no limit on size). Q1 LA / (512 x 511) R1 Q1= block of index table R1is used as follows: Q2 R1 / 512 R2 Q2 = displacement into block of index table R2 displacement into block of file:
Indexed Allocation – Mapping (Cont.) • Two-level index (indexed scheme) (maximum file size is 5123) Q1 LA / (512 x 512) R1 Q1 = displacement into outer-index R1 is used as follows: Q2 R1 / 512 R2 Q2 = displacement into block of index table R2 displacement into block of file:
Indexed Allocation – Mapping (Cont.) outer-index file index table
UNIX file system • UNIX file structure uses a variant of the indexed allocation scheme • external file descriptor contains pointers to 15 different storage blocks • first 12 blocks of the file are indexed directly by the first 12 pointers • the last 3 pointers used for indirect pointers • each points to an index block
UNIX file system • Usual implementation • file man uses 4K blocks • the 12 direct pointers accommodate files up to 48K • if a file needs more blocks, file man allocates an index block • the 13th pointer points to this index block • the index block contains only direct pointers • if yet more blocks are needed, the 14th pointer is pointed to a double indirect block, • this is an index block, each of whose entries point to another index block • if yet more blocks are needed, the 15th pointer is pointed to a triple indirect block
UNIX files • How large can UNIX files be? • depends on size of blocks and size of disk addresses used in the system. • 12 direct blocks • blocks 0 - 11 are accessed via the direct pointers in the inode • Assume an indirect block can store 1,000 disk addresses. • Block addresses 12 - 1, 011 accessed indirectly through the single indirect block. • the 14th block pointer is double indirect. • points to a block that points to 1,000 indirect blocks • blocks 1,012 to 1,001,011 accessed via double indirect list • the 15th block pointer is triple indirect • addresses blocks 1,001,012 to 1,001,001,011
UNIX files • How large can UNIX files be? • This is 4000GB. • 32-bit versions of BSD UNIX do not use triple indirect pointer • 32-bit block addresses preclude file sizes > 4GB • 64-bit block addresses need triple indirect
Chapter 11: File System Implementation • File-System Structure • File-System Implementation • Directory Implementation • Allocation Methods • Free-Space Management • Efficiency and Performance • Recovery • Log-Structured File Systems • NFS • Example: WAFL File System
Free-Space Management • How should unallocated blocks be managed? • Need a data structure to keep track of them • Index block • same as for conventional file, but no info in any block • initially very large; makes it impractical: index table too large • Linked list • Very large • Hard to manage spatial locality • want to allocate closely located blocks to a file • this minimizes seek time • hard to do with a linked list; have to traverse the list to find appropriate blocks
Free-Space Management • Need a data structure to keep track of them • Grouping • Modification of free-list (linked-list) approach • Store the addresses of n free blocks in the first free block • n - 1 of these are actually free; last block contains pointers to the nextn free block. Etc. • Can now find a large number of free blocks quickly. • Counting • Don’t keep the address of every free block. • Free blocks are often consecutive • So keep the address of the first free block and the number of following consecutive blocks that are free. • Each entry in the free-space list consists of a disk address and a count.
Free-Space Management • Bit vector (block status map or “disk map”) (n blocks) 0 1 2 n-1 … 0 block[i] free 1 block[i] occupied bit[i] = Block number calculation (number of bits per word) * (number of 0-value words) + offset of first 1 bit Most CPUs have special bit-manipulation instructions. Example: Intel and PowerPC have instructions that return the offset in a word of the first bit with the value 1
Free-Space Management (Cont.) • Bit map requires extra space • Example: block size = 212 bytes disk size = 230 bytes (1 gigabyte) n = 230/212 = 218 bits (or 32K bytes) • Easy to get contiguous files • Counting