Explore how file systems manage data efficiently while providing users with secure, reliable, and durable file access. Learn about directories, disk management, user permissions, and file naming conventions. Dive into the components and challenges of file system design.
File Systems CSE 2431: Introduction to Operating Systems Reading: Chap. 13, §§14.1–14.5, 15.1–15.5, 20.7, [OSC]
Contents • Files • Directories • File Operations • File System Disk Layout • File Allocation
Why Files? • Physical reality • Block oriented • Physical sector #s • No protection among users of the system • Data might be corrupted if machine crashes • File system model • Byte oriented • Named files • Users protected from each other • Robust to machine failures
File System Requirements • Users must be able to: • Create, modify, and delete files at will. • Read, write, and modify file contents with minimal fuss about blocking, buffering, etc. • Share each other's files with proper authorization • Transfer information between files. • Refer to files by symbolic names. • Retrieve backup copies of files lost through accident or malicious destruction. • See a logical view of their files without concern for how they are stored.
File Types • ASCII – plain text (also Unicode/UTF-8) • A Unix executable file • Header: magic number, sizes, entry point, flags • Text (code) • Data • Relocation bits • Symbol table • Devices • Everything else in the system
So What Makes File Systems Hard? • Files grow and shrink in pieces • Little a priori knowledge • 6 orders of magnitude in file sizes • Overcoming disk performance behavior • Desire for efficiency • Coping with failure
File System Components • Disk management • Arrange collection of disk blocks into files • Naming • User gives file name, not track or sector number, to locate data • Security • Keep information secure • Reliability/durability • When the system crashes, whatever was in memory is lost; file contents should survive • (Figure: layers between the user and the disk: file naming, file access, disk management, disk drivers)
Directories in Unix • Stored like regular files • Logic • Separates file from location in tree • Files can appear in multiple places
Directory Contents • Each entry is for one file: • File name (symbolic name) • File type: indicates the format of the file • Location: device and location on that device • Size • Protection • Creation, access, and modification dates • Owner identification
Directory Operations • Maps symbolic names into logical file names • Search • Create file • List directory • Backup, archival, file migration
Problems With Single Level Directory • Name clashes when • More than one user • Large file systems • Moving files from one system to another
Two-Level Directory • Introduced to remove naming problems between users • First level contains list of user directories • Second level contains user files • System files kept in a separate directory or at level 1 • Sharing accomplished by naming other users’ files
Tree-Structured Directories • Arbitrary depth of directories • Leaf nodes are files • Interior nodes are directories • Path name lists nodes to traverse for finding file • Use absolute paths from root • Use relative paths from current working directory pointer
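A minimal sketch of path-name traversal in a tree-structured directory. The tree contents, the `ROOT` dictionary, and the `resolve` function are made up for illustration; a real file system walks directory entries on disk rather than Python dictionaries.

```python
# Toy directory tree: interior nodes are dicts (directories),
# leaf nodes are strings (file contents). All names are hypothetical.
ROOT = {
    "home": {
        "alice": {"notes.txt": "hello", "src": {"main.c": "int main(){}"}},
    },
    "etc": {"passwd": "root:x:0:0"},
}

def resolve(path, cwd="/"):
    """Walk the tree one component at a time and return the node the path names."""
    # An absolute path starts at the root; a relative path starts at the
    # current working directory.
    full = path if path.startswith("/") else cwd.rstrip("/") + "/" + path
    stack = []
    for part in full.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if stack:
                stack.pop()          # ".." moves one level toward the root
        else:
            stack.append(part)
    node = ROOT
    for part in stack:
        if not isinstance(node, dict) or part not in node:
            raise FileNotFoundError(path)
        node = node[part]
    return node
```

For example, `resolve("/home/alice/notes.txt")` and `resolve("notes.txt", cwd="/home/alice")` name the same leaf.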
Acyclic Graph Structured Directories • Acyclic graphs allow sharing • Two users can name the same file • Implemented by links, which use logical names of files (file system and file) • Or implemented by symbolic links, which map a pathname into a new pathname • Duplicate paths complicate backup copies • Need reference counts for hard links
Symbolic Links • Symbolic links are different from regular links (often called hard links). Created with ln -s • Can be thought of as a directory entry that points to the name of another file. • Does not change the link count of the file • When the original is deleted, the symbolic link remains but no longer resolves • They exist because: • Hard links don’t work across file systems • Hard links only work for regular files, not directories • (Figure: hard link vs. symbolic link. Hard link: two dirents refer to the same file contents. Symbolic link: a dirent refers to a symlink, which names another dirent that refers to the file contents.)
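The difference between the two kinds of link can be observed from user space. The sketch below uses Python's standard `os.link`, `os.symlink`, `os.stat`, and `os.readlink` calls in a temporary directory. The file names and the `link_demo` function are hypothetical, and it assumes a POSIX-style file system that supports both link types.

```python
import os
import tempfile

def link_demo():
    """Create a hard link and a symbolic link to one file and report what happens."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        with open(original, "w") as f:
            f.write("data")

        os.link(original, hard)      # like: ln original.txt hard.txt
        os.symlink(original, soft)   # like: ln -s original.txt soft.txt

        result = {
            # Only the hard link raises the link count; the symlink does not.
            "link_count": os.stat(original).st_nlink,
            "same_inode": os.stat(original).st_ino == os.stat(hard).st_ino,
            "symlink_points_to_name": os.readlink(soft) == original,
        }

        os.remove(original)          # delete the original directory entry
        with open(hard) as f:
            result["hard_link_still_readable"] = f.read() == "data"
        result["symlink_entry_remains"] = os.path.islink(soft)
        result["symlink_resolves"] = os.path.exists(soft)
        return result
```

After the original name is removed, the hard link still reaches the data because the link count has not dropped to zero, while the symbolic link is left pointing at a name that no longer exists.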
General Graph Structured Directories • Cycles • More flexible • More costly • Need garbage collection (circular structures) • Must prevent infinite searches
Relevant Definitions • File descriptor (fd): Integer used to represent a file – easier than using names • Metadata: Data about data - bookkeeping data used to eventually access the “real” data • Open file table: System-wide list of descriptors in use
Types of Metadata • Inode: index node, or a specific set of information kept about each file • Two forms – on disk and in memory • Directory: names and location information for files and subdirectories • Note: stored in files in Unix • Superblock: contains information to describe the file system, disk layout • Information about free blocks/inodes on disk
Contents of an Inode • Disk inode: • File type, size, blocks on disk • Owner, group, permissions (r/w/x) • Reference count • Times: creation, last access, last mod • Inode generation number • Padding & other stuff • 128 bytes on classic Unix
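Several of these inode fields are visible to user programs through the `stat` system call. A small sketch using Python's standard `os.stat` and `stat` modules follows; the `inode_summary` function and its dictionary keys are names chosen for this example, not part of any API.

```python
import os
import stat

def inode_summary(path):
    """Return a few inode fields for `path` as reported by the stat system call."""
    st = os.stat(path)
    return {
        "inode_number": st.st_ino,
        "is_regular_file": stat.S_ISREG(st.st_mode),
        "is_directory": stat.S_ISDIR(st.st_mode),
        "permissions": stat.filemode(st.st_mode),  # type letter plus rwx bits
        "size_bytes": st.st_size,
        "link_count": st.st_nlink,                 # the inode's reference count
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        "last_access": st.st_atime,
        "last_modification": st.st_mtime,
    }
```

Calling `inode_summary(".")` reports the current directory's inode; the same call works on regular files.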
Data Structures for Typical File System • Process control block: open file pointer array • Open file table (systemwide) • Memory inode • Disk inode
Open-file Table Information • File Pointer • Current file position pointer • File Open Count • Counter which tracks the number of file opens and closes. Why? • Disk Location • Information needed to locate the file on disk (in inode).
Opening A File fd = open(FileName, access) • File name lookup and authentication • Copy the file metadata into the in-memory data structure, if it is not there yet • Create an entry in the open file table (system wide) if there isn’t one • Create an entry in the PCB • Link up the data structures • Return the file descriptor to the user • (Figure: file system on disk → file name lookup & authenticate → metadata → allocate & link up data structures: open file table, PCB)
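The steps of opening a file can be sketched as a toy model. Everything here is a simplification for illustration: the class `FileSystemSketch`, its dictionaries, and the choice to key tables by file name are assumptions, and real kernels differ in where they keep the file position and in how they reuse descriptor numbers.

```python
class FileSystemSketch:
    """Toy model of the tables touched by open() and close()."""

    def __init__(self, disk_inodes):
        self.disk_inodes = disk_inodes   # name -> metadata stored "on disk"
        self.memory_inodes = {}          # name -> in-memory copy of the metadata
        self.open_file_table = {}        # system-wide: name -> inode + open count

    def open(self, pcb, name):
        if name not in self.disk_inodes:             # file name lookup
            raise FileNotFoundError(name)
        if name not in self.memory_inodes:           # copy metadata in only once
            self.memory_inodes[name] = dict(self.disk_inodes[name])
        entry = self.open_file_table.setdefault(
            name, {"inode": self.memory_inodes[name], "open_count": 0})
        entry["open_count"] += 1
        # Per-process entry: links to the system-wide entry, holds the position.
        pcb.append({"name": name, "entry": entry, "position": 0})
        return len(pcb) - 1                          # the file descriptor

    def close(self, pcb, fd):
        slot = pcb[fd]
        pcb[fd] = None
        slot["entry"]["open_count"] -= 1
        # Only when the last opener closes can the shared state be released.
        if slot["entry"]["open_count"] == 0:
            del self.open_file_table[slot["name"]]
            del self.memory_inodes[slot["name"]]
```

The open count is what lets two processes share one system-wide entry safely: the in-memory inode is dropped only after every process that opened the file has closed it.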
Reading And Writing What happens when you… • Read 10 bytes from a file? • Write 10 bytes into an existing file? • Write 4096 bytes into a file? Disk works on blocks (sectors)
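Reading A Block read(fd, userBuf, size) • Follow fd through the PCB and the open file table to the file's metadata • Translate the logical block to a physical block • Get the physical block into sysBuf (buffer cache), then copy to userBuf • The disk device driver performs read(device, phyBlock, size)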
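A minimal sketch of why a small read still moves whole blocks. The 512-byte block size, the `disk` dictionary standing in for the device, and the `block_map` list standing in for the file's metadata are all assumptions made for this example.

```python
BLOCK_SIZE = 512  # assumed block size for this sketch

def blocks_touched(offset, size, block_size=BLOCK_SIZE):
    """Return the logical block numbers covering bytes [offset, offset + size)."""
    if size <= 0:
        return []
    first = offset // block_size
    last = (offset + size - 1) // block_size
    return list(range(first, last + 1))

def read_bytes(disk, block_map, offset, size, block_size=BLOCK_SIZE):
    """Read `size` bytes at `offset` by fetching whole blocks, then copying a slice."""
    out = bytearray()
    for logical in blocks_touched(offset, size, block_size):
        physical = block_map[logical]   # logical -> physical translation
        sys_buf = disk[physical]        # a whole block lands in the system buffer
        base = logical * block_size
        start = max(offset, base) - base
        end = min(offset + size, base + block_size) - base
        out += sys_buf[start:end]       # only the requested bytes reach the user
    return bytes(out)
```

Reading 10 bytes at offset 0 touches one block, while reading 10 bytes at offset 508 straddles a block boundary and touches two. Writing 10 bytes into an existing file follows the same translation but must read the block, modify it, and write it back.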
Disk Layout A possible file system layout
A Disk Layout for A File System • Superblock defines a file system • Size of the file system • Size of the file descriptor area • Free list pointer, or pointer to bitmap • Location of the file descriptor of the root directory • Other metadata such as permission and various times • For reliability, replicate the superblock
Effects of Corruption • Inode: file gets “damaged” • Directory • “Lose” files/directories • Might get to read deleted files • Free space bitmap information • Two files allocated the same block • Some blocks never get used • Superblock • Can’t figure out anything • This is why we replicate the superblock • How do you check for possible corruption?
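One way to check for the bitmap problems listed above is to cross-check the free-space bitmap against the blocks that files actually claim, in the spirit of a consistency checker. This is only a sketch under simplifying assumptions: `check_blocks` and its data layout are hypothetical, and blocks reserved for the superblock or inodes are ignored.

```python
def check_blocks(num_blocks, free_bitmap, files):
    """Cross-check the free bitmap against the blocks each file claims.

    free_bitmap[b] is True when block b is marked free.
    files maps a file name to the list of block numbers in its inode.
    """
    use_count = [0] * num_blocks
    for blocks in files.values():
        for b in blocks:
            use_count[b] += 1

    problems = {"multiply_used": [], "used_but_marked_free": [], "lost": []}
    for b in range(num_blocks):
        if use_count[b] > 1:
            problems["multiply_used"].append(b)         # two files share a block
        if use_count[b] >= 1 and free_bitmap[b]:
            problems["used_but_marked_free"].append(b)  # may be handed out again
        if use_count[b] == 0 and not free_bitmap[b]:
            problems["lost"].append(b)                  # block never gets used
    return problems
```

A consistent file system yields three empty lists; any non-empty list points at the kind of damage to repair.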
File Allocation in Disk Space • Low-level access methods depend upon disk allocation scheme used to store file data • Contiguous allocation • Linked list allocation • Indexed allocation
Contiguous Allocation • The file's size must be requested in advance • Search bit map or linked list to locate a space: best fit, first fit, etc. • File header • First sector in file • Number of sectors • Pros • Fast sequential access • Easy random access • Easy to recover in case of crash • Cons • External fragmentation • Hard to grow files
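A minimal sketch of first-fit contiguous allocation over a bitmap. The function names and the list-of-booleans bitmap (True means in use) are assumptions for this example, and `length` is expected to be at least 1.

```python
def first_fit(bitmap, length):
    """Return the start of the first run of `length` free blocks, or None."""
    run_start, run_len = 0, 0
    for i, used in enumerate(bitmap):
        if used:
            run_len = 0                 # the run of free blocks is broken
        else:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == length:
                return run_start
    return None

def allocate(bitmap, length):
    """Mark a contiguous run as used and return the file header (start, length)."""
    start = first_fit(bitmap, length)
    if start is None:
        return None
    for i in range(start, start + length):
        bitmap[i] = True
    return (start, length)
```

External fragmentation shows up directly: with the bitmap `[False, True, False, True, False]` three blocks are free, yet a request for two contiguous blocks fails.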
Linked Files • File header points to 1st block on disk • Each block points to next • Example: FAT (MS-DOS) • Pros • Can grow files dynamically • Space efficient, little fragmentation • Cons • Random/direct access: horrible • Unreliable: losing a block means losing the rest • Need some bytes to store pointers • (Figure: file header points to a chain of blocks; the last block's pointer is null)
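A minimal sketch of following a linked chain, with the next-block pointers held in a table in the style of a FAT. The dictionary layout, the `END_OF_CHAIN` marker, and the function names are assumptions for this example, not the on-disk FAT format.

```python
END_OF_CHAIN = -1  # marker chosen for this sketch

def file_blocks(fat, first_block):
    """Return every block of the file by following next-pointers to the end."""
    blocks = []
    block = first_block
    while block != END_OF_CHAIN:
        blocks.append(block)
        block = fat[block]
    return blocks

def nth_block(fat, first_block, n):
    """Return (physical block holding logical block n, number of links followed)."""
    block, hops = first_block, 0
    for _ in range(n):
        block = fat[block]
        hops += 1
        if block == END_OF_CHAIN:
            raise IndexError("logical block past end of file")
    return block, hops
```

Reaching logical block n costs n pointer lookups, which is why random access is slow, and a damaged entry in the middle of the chain cuts off everything after it.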
Indexed Allocation • Solves external fragmentation • Supports sequential, direct and indexed access • Each data access first needs at most one access to the index block, which can be cached in main memory • File can be extended by rewriting a few blocks and the index block • Requires extra space for the index block, possibly wasted space • Extending to big files is an issue: one index block holds only so many pointers
Other Forms of Indexed File • Linked: link full index blocks together using the last entry.
An Example of Indexed Allocation UNIX inode
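The slide's figure is not reproduced in this text, so the sketch below assumes the conventional UNIX inode layout: a handful of direct pointers followed by single, double, and triple indirect pointers. The specific numbers (12 direct pointers, 4096-byte blocks, 4-byte block pointers) and the function names are assumptions for illustration and vary between file systems.

```python
BLOCK_SIZE = 4096       # assumed block size in bytes
POINTER_SIZE = 4        # assumed size of one block pointer in bytes
NUM_DIRECT = 12         # assumed number of direct pointers in the inode
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # pointers held by one index block

def pointer_level(logical_block):
    """Return which inode pointer covers a logical block, plus the index path."""
    n = logical_block
    if n < NUM_DIRECT:
        return "direct", [n]
    n -= NUM_DIRECT
    if n < PTRS_PER_BLOCK:
        return "single indirect", [n]
    n -= PTRS_PER_BLOCK
    if n < PTRS_PER_BLOCK ** 2:
        return "double indirect", [n // PTRS_PER_BLOCK, n % PTRS_PER_BLOCK]
    n -= PTRS_PER_BLOCK ** 2
    if n < PTRS_PER_BLOCK ** 3:
        return "triple indirect", [n // PTRS_PER_BLOCK ** 2,
                                   (n // PTRS_PER_BLOCK) % PTRS_PER_BLOCK,
                                   n % PTRS_PER_BLOCK]
    raise ValueError("file too large for this inode layout")

def max_file_blocks():
    """Largest number of data blocks one inode can address under these assumptions."""
    return (NUM_DIRECT + PTRS_PER_BLOCK
            + PTRS_PER_BLOCK ** 2 + PTRS_PER_BLOCK ** 3)
```

Small files are reached through direct pointers with no extra disk access, while only very large files pay for the deeper levels of indirection.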
Summary • Files • Directories • File Operations • File System Disk Layout • File Allocation