240 likes | 360 Views
Operating Systems CMPSCI 377 Lecture 15: File Systems I. Emery Berger University of Massachusetts, Amherst. Course Outline. Processes & Threads CPU Scheduling Synchronization & Deadlock Memory Management File Systems & I/O Distributed Systems. Files: OS Abstraction.
E N D
Operating SystemsCMPSCI 377Lecture 15: File Systems I Emery Berger University of Massachusetts, Amherst
Course Outline • Processes & Threads • CPU Scheduling • Synchronization & Deadlock • Memory Management • File Systems & I/O • Distributed Systems
Files: OS Abstraction • Files: another OS-provided abstraction over hardware resources
File System Abstraction • Applications operate on files in a file system • Device-independent interface • open(), close(), link(), read(), write(), rename() • Device-level interface • Manage disk in terms of sectors, tracks, & blocks • seek(), readblock(), writeblock() • OS converts these to operations on raw hardware (disk)
User Expectations on Data • Persistence • Data lives across jobs, power outages, crashes • Speed • Quick access to data • Size • Ability to store lots of data • Sharing/protection • Users can share or restrict access to data when appropriate • Ease of use • User can easily find, examine & modify data
Hardware/OS Support for Data • Persistence: disks: non-volatile memory • Speed: random access devices • Size: disk capacity growing fast • Persistence: redundancy,fault-tolerance • Sharing/protection: UNIX privileges • Ease of use: • Names associated with data (files) • Hierarchical directories • Transparent mapping of devices hardware operating system
Files • Named collection of related information recorded on secondary storage • Logical unit of storage on a device • e.g., helloworld.cpp, resume.doc • Can contain programs (source, binary) or data • Can be structured or unstructured • IBM mainframe OS: series of records (structured) • UNIX: file = stream of bytes (unstructured) • Files have attributes: • name, type, location, size, protection, creation time...
User Interface to File System • Data operations: • create(), delete() • open(), close() • read(), write(), seek() • Naming operations: • get & set attributes (ownership, protection) • hard link, soft links, rename
OS File-Related Data Structures • Open file table – shared by all processes with open file • open count • file attributes (ownership, protection, etc.) • location(s) of file on disk • pointers to location(s) of file in memory • Per-process file table – one for each file • Pointer to entry in open file table • Current position in file (offset) • Mode in which process accesses file (r, w, r/w...) • Pointers to file buffer
File Operations: Creating a File • Create(name) • Allocates disk space • Checks disk quotas, permissions, etc. • Creates file descriptor for file • (name, location on disk, attributes) • Adds file descriptor to directory that contains file • May mark file with type attribute (esp. Mac) • Advantages: error detection, launch appropriate app • Disadvantages: not supported everywhere, complicates file system and OS, less flexible • UNIX: no types, Windows: file extensions indicate type
File Operations: Deleting a File • Delete(name) • Find directory containing file • Free disk blocks used by file • Remove file descriptor from directory
File Operations:Opening Files • fd = open(name, mode) • Check if file is already open by another process • If not: • Find file • Copy file descriptor into system-wide open file table • Check protection of file against requested mode • If not OK, abort • Increment open count • Create entry in process’s file table pointing to entry in system-wide table • Initialize current file pointer to start of file
File Operations: Closing Files • close (fd) • Remove entry for file in process’s file table • Decrement open count in system-wide file table • If open count == 0: • Remove entry from system-wide file table
OS File Operations: Reading File • Random access: read(fd, from, size, buf) • OS reads “size” bytes from file position “from” into “buf” for (i = from; i < from + size; i++) buf[i – from] = file[i]; • Sequential access: read(fd, size, buf) • OS reads “size” bytes from current file position “from” into “buf”; increments file position for (i = 0; i < size; i++) buf[i] = file[fp + i]; fp += size;
OS File Operations • write • Similar to read, but copies from buffer to file • seek • Just updates fp • memory mapping file • map portion of virtual address space to file • read/writes to that memory = OS reads/writes from corresponding location in file • avoids need for explicit reads/writes
File Access Methods • Common file access patterns from programmer’s perspective • Sequential: data processed in order • Most programs use this method • Example: compiler reading source file • Keyed: address block based on key table • Example: database search, hash table, dictionary • Common file access patterns from OS perspective • Sequential: pointer to next byte; update on read/write • Random: address any block directly given offset in file
Naming & Directories • Need method of retrieving files from disk! • OS uses numbers (inodes) for files • People prefer names... • OS provides directory to map names to file descriptors
Flat File Systems • One level directory • One namespace for entire disk, every name unique • Directory contains (name, index) pairs • Used by Apple, CP/M, DOS 1.0, first MacOS • Two level directories • Separate directories for each user Not a problem for 140K disks, but...
Hierarchical File Systems • Tree-structured name space • Used by all modern operating systems • Directory becomes special file on disk • Marked by special flag bit • User programs may read directories, but only system may manipulate directories • Each directory contains (name, inode) pairs, names need not be unique in file system • Distinguished root directory (UNIX) or drive names (Windows)
Referential Naming • Hard links (UNIX: ln command) • Allows multiple links to single file • Example: “ln A B”: • initially: A -> file # 100 • after: A, B -> file # 100 • OS maintains reference counts • Deletes file only after last link to it is deleted • Problem: users could create circular links with directories • Reference counting can’t reclaim cycles • Solution: can’t hard link directories
Referential Naming • Soft (symbolic) links (UNIX: ln -s) • Makes symbolic pointer from one file to another • “ln -s A B”: • Initially, A -> file #100 • After, A -> file #100, B -> A • Removing B does not affect A • Removing A: dangling pointer • Problem: Circular links can cause infinite loops • E.g., list all files in directory & its subdirectories • Solution: (lame) Limit number of links traversed
Directory Operations • Search for file: locate entry • Create file: adds directory listing • Delete file: removes directory listing • List directory: list all files (ls) • Rename file • Traverse file system
Protection • OS must allow users to control access to files • Grant or deny access to file operations depending on protection information • Access lists and groups (Windows) • Access list for each file with user name and access type • Lists can become large & tedious to maintain • Access control bits (UNIX) • Three categories of user (owner, group, world) • Three types of access privileges (read, write, execute) • One bit per operation (111101000 = rwxr-x----)
Summary • File systems provide • Naming • Protection • Persistence • Fast access • Next time: file system implementation