File Systems A file system is a collection of directories and files • Many operating systems support multiple file system organizations through a virtual file system (VFS) • A VFS is an abstraction that lets a single system call hide the file system organization details from the developer • The system call provides a middle layer, which dispatches to the correct low-level, object-oriented interface
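The middle-layer idea can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the names FileSystemDriver, Vfs, mount, and read are hypothetical, and mount prefixes are assumed to end with "/". A real VFS dispatches inside the kernel through per-file-system operation tables.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical low-level interface that each concrete file system implements.
interface FileSystemDriver {
    String read(String relativePath);
}

// Hypothetical middle layer: one entry point that dispatches on the mount prefix.
final class Vfs {
    private final Map<String, FileSystemDriver> mounts = new HashMap<>();

    // Assumption: every prefix ends with "/" (for example "/" or "/mnt/usb/").
    void mount(String prefix, FileSystemDriver driver) {
        mounts.put(prefix, driver);
    }

    // Single call for the application; the longest matching prefix picks the driver.
    String read(String path) {
        String best = null;
        for (String prefix : mounts.keySet()) {
            if (path.startsWith(prefix) && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no file system mounted for " + path);
        }
        return mounts.get(best).read(path.substring(best.length()));
    }
}
```

The caller only ever sees Vfs.read; which driver handles the request is decided by the mount table.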
File Record Structure File: collection of records, Record: collection of fields • No Structure: A sequence of bytes • Record structure: Lines of text, Fixed length, variable length • Complex Record Structures • Formatted documents with appropriate control characters • Relocatable load files • Database table rows • Combination of binary fields • Who decides the structure: • Operating system • Program
File Control Block (FCB) OS data structure consisting of information about a file • Name – human-readable • Identifier – unique number that identifies each file • Type – most systems support different types • Location – pointer to file location on device • Size – current file size • Protection – access rights and owner • Time, date, and user identification – data for protection, security, and usage • Where is file information maintained? In a disk-resident directory structure
File System Abstraction of a ‘raw’ partition as collections of files and directories • Partition: Contains a file system on disk, consists of: • File control blocks (FCB): Define a file’s attributes • Directory/Folder: A collection of FCBs • Boot Control Block: OS load information • Partition Control Block: Information about the partition
File System Operations A file is an abstract data type with well-defined operations • Create • Write and Read • Reposition within file (Seek) • Delete or Truncate • Open – Load the file information from the directory structure into memory • Close – update the file information on disk and release resources
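The operations above map onto the standard Java file API roughly as follows. This is a minimal sketch on a temporary file; the class and method names (FileOpsDemo, run) are made up for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileOpsDemo {
    // Exercises create, open, write, seek, read, truncate, close, and delete.
    static String run() {
        try {
            Path path = Files.createTempFile("fileops", ".dat");               // Create
            String result;
            try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw")) { // Open
                raf.write("0123456789".getBytes(StandardCharsets.US_ASCII));    // Write 10 bytes
                raf.seek(3);                                                     // Reposition (seek)
                byte[] buf = new byte[4];
                raf.readFully(buf);                                              // Read bytes 3..6
                raf.setLength(5);                                                // Truncate to 5 bytes
                result = new String(buf, StandardCharsets.US_ASCII) + ":" + raf.length();
            }                                                                    // Close
            Files.delete(path);                                                  // Delete
            return result + ":" + Files.exists(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling FileOpsDemo.run() returns "3456:5:false": the four bytes read after the seek, the length after truncation, and whether the file still exists after the delete.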
Open File Information • File pointer: pointer to last read/write location, per process that has the file open • File-open count: Allows removal from the open-file list on the last close • Pointers: Disk location and a data access cache • Access rights: per-process access mode information • Locking Information: mediates access to a file • Mandatory – access denied based on record locks • Advisory – processes can inquire lock status
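The file-open count can be illustrated with a small table keyed by file name. This is a sketch only: OpenFileTable and its methods are hypothetical names, and a real system keeps far more state per entry (file pointer, access mode, disk location).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical system-wide open-file table that tracks only the open count.
final class OpenFileTable {
    private final Map<String, Integer> openCounts = new HashMap<>();

    // Each open increments the count for that file.
    void open(String name) {
        openCounts.merge(name, 1, Integer::sum);
    }

    // Returns true when this was the last close and the entry was removed.
    boolean close(String name) {
        Integer count = openCounts.get(name);
        if (count == null) {
            throw new IllegalStateException(name + " is not open");
        }
        if (count == 1) {
            openCounts.remove(name);
            return true;
        }
        openCounts.put(name, count - 1);
        return false;
    }

    boolean isOpen(String name) {
        return openCounts.containsKey(name);
    }
}
```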
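Java File Exclusive Lock
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.channels.FileChannel;
    import java.nio.channels.FileLock;

    public class ExclusiveLockDemo {
        // Passing shared == false to lock() requests an exclusive lock
        public static final boolean EXCLUSIVE = false;

        public static void main(String[] args) {
            try (RandomAccessFile raf = new RandomAccessFile("file.txt", "rw");
                 FileChannel ch = raf.getChannel()) {
                // Exclusively lock the first half of the file (position 0, size length/2)
                FileLock exclusive = ch.lock(0, raf.length() / 2, EXCLUSIVE);
                try {
                    /* Now modify the data . . . */
                } finally {
                    exclusive.release(); // release the lock even if the update fails
                }
            } catch (IOException ioe) {
                System.err.println("I didn't like that");
            }
        }
    }
lock() blocks until the region can be locked. It can also fail with FileLockInterruptionException (the waiting thread was interrupted) or AsynchronousCloseException (another thread closed the channel).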
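Java File Shared Lock
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.channels.FileChannel;
    import java.nio.channels.FileLock;

    public class SharedLockDemo {
        // Passing shared == true to lock() requests a shared lock
        public static final boolean SHARED = true;

        public static void main(String[] args) {
            try (RandomAccessFile raf = new RandomAccessFile("file.txt", "rw");
                 FileChannel ch = raf.getChannel()) {
                // Shared lock on the second half; lock() takes (position, size, shared)
                long len = raf.length();
                FileLock shared = ch.lock(len / 2, len - len / 2, SHARED);
                try {
                    /* Now read the data . . . */
                } finally {
                    shared.release(); // release the lock
                }
            } catch (IOException ioe) {
                System.err.println("I didn't like that");
            }
        }
    }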
Direct and Sequential Access • Sequential Access: read next, write next, append, reset, rewrite (no read after the last write) • Direct (Random) Access: seek, read, write
File System Software Structure • Virtual File System (VFS): wrapper between applications and different file systems • Uniform application view [Figure: Layered Approach, with layers labeled File Types, Files/Folders, Read/Write]
Directory Structure Directory: A collection of nodes containing file information [Figure: Typical File System Organization, showing a directory whose entries refer to files F1, F2, F3, F4, ..., Fn]
Directory Design Note: A directory is another abstract data type • Operations: Search, Create, Delete, List, Rename, Traverse • Design Criteria • Efficiency – locating a file quickly • Naming – convenient to users, aliases, unique fully qualified path names • Grouping – by extension or properties • Access control • Design decisions • Should sub-directories be removed on a delete operation? • What kind of path names should be allowed? • Are absolute and relative paths supported?
Directory Structure Goals: a) Convenient name space b) Quick to access and locate c) Ability to group related files • Definitions: path (absolute, relative), working directory • Single level: fails goals b and c • Two level: fails goal c • Tree structured
Single and Two Level Directories • Single level • Disadvantages: name conflicts, no sub-folders • Two level • Can have the same names for different users • Efficient searching but no sub-folders
Tree-Structured Directories • Efficient searching, can group by sub-folders, Working directory, absolute/relative path names • Problem to resolve: How should links (aliases) work?
Acyclic-Graph Directories Cycles can lead to infinite loops • Problems sharing directories and files • Aliased names (link) • Multiple link levels • Dangling pointers • Solutions • A linked list of back pointers • Lazy detection • Follow link chains • Remove data when entry count = 0
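The last solution above, removing the data when the entry count reaches 0, can be sketched as a reference count per file. All names here (LinkedStore, link, unlink) are hypothetical, and the maps stand in for on-disk directory entries and file blocks.

```java
import java.util.HashMap;
import java.util.Map;

final class LinkedStore {
    private final Map<String, Integer> directory = new HashMap<>();  // name -> file id
    private final Map<Integer, Integer> linkCount = new HashMap<>(); // file id -> entry count
    private final Map<Integer, String> data = new HashMap<>();       // file id -> contents
    private int nextId = 1;

    void create(String name, String contents) {
        if (directory.containsKey(name)) {
            throw new IllegalArgumentException("name in use: " + name);
        }
        int id = nextId++;
        directory.put(name, id);
        linkCount.put(id, 1);
        data.put(id, contents);
    }

    // Adds an alias: a second directory entry that refers to the same file.
    void link(String existing, String alias) {
        Integer id = directory.get(existing);
        if (id == null || directory.containsKey(alias)) {
            throw new IllegalArgumentException("cannot link " + alias + " to " + existing);
        }
        directory.put(alias, id);
        linkCount.merge(id, 1, Integer::sum);
    }

    // Removes one entry; the data is freed only when no entry refers to it.
    void unlink(String name) {
        Integer id = directory.remove(name);
        if (id == null) {
            throw new IllegalArgumentException("no such file: " + name);
        }
        int remaining = linkCount.merge(id, -1, Integer::sum);
        if (remaining == 0) {
            linkCount.remove(id);
            data.remove(id);
        }
    }

    String read(String name) {
        Integer id = directory.get(name);
        return id == null ? null : data.get(id);
    }

    int storedFiles() {
        return data.size();
    }
}
```

Because the count lives with the file rather than with any one name, deleting the original name leaves the alias usable, which avoids the dangling-pointer problem.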
General Graph Directory Issues • Cycle detection algorithms • Garbage collection algorithms • Only allow links to files, not directories
Mount Points • Definitions • Mount: Loading a remote file system for local access • Mount Point: the path point where a remote file system merges with the local structure • Top figure: un-mounted file systems • Bottom figure: The top right file system mounted over the users directory of the file system of the top left
File Sharing Files are shared by users locally, over networks, and over grids • Sharing protection: user and group identifications and access codes • Client Server Network Models: Network File System (NFS) or CIFS (Windows Common Internet File System) using remote procedure calls • Consistency for simultaneous access • Remote File Transfer: FTP (WinSCP) • Remote Login: TELNET (PuTTY) • Issues • Handling network and server failure • Transaction based systems • Stateless protocol: easy recovery, but less security • State-based protocols: difficult recovery, better security • Establishing when updates become visible to other users
Access Control • File owner/creator controls what can be done and by whom • Types of access: Read, Write, Execute, Append, Delete, List • Mode of access: read, write, execute • Three classes of users and examples of access rights: a) owner access (u): 7 = 111 (RWX) b) group access (g): 6 = 110 (RW) c) public access (o): 1 = 001 (X) • The system administrator creates group names and adds lists of users to them • The owner defines access to a particular file (say game) or subdirectory • Commands to set access rights on a file for the classes owner (user), group, and public (other): • Example: chmod u+rwx,g+rw,o+x game • Example: chmod g-x,u-rw game • Example: chmod u=rwx,g=rw,o=x game, equivalent to chmod 761 game • Associate file game with group staff: chgrp staff game
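The relationship between the octal form (761) and the symbolic form (rwx, rw, x) is plain bit arithmetic. The helper below is a hypothetical illustration (ModeBits is not a library class); it prints the nine permission bits in owner, group, other order.

```java
final class ModeBits {
    // Converts a mode such as 0761 into the nine-character form "rwxrw---x".
    static String toSymbolic(int mode) {
        String flags = "rwxrwxrwx";
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 9; i++) {
            int bit = 1 << (8 - i);          // highest bit is the owner's read bit
            sb.append((mode & bit) != 0 ? flags.charAt(i) : '-');
        }
        return sb.toString();
    }
}
```

For example, ModeBits.toSymbolic(0761) gives "rwxrw---x": 7 = 111 for the owner, 6 = 110 for the group, and 1 = 001 for others.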
File System Transient Data in Memory (a) Opening a file (b) Reading a file
Directory Structure Alternatives • Simple List: names & disk pointers • simple to program • O(n) search time • Hashed • O(1) directory search time • collisions possible • fixed hash table size • Other alternatives • Separate chaining • Sorted list: O(log n) find; O(n) deletion [Figure: Simple list structure]
Allocating Space for Files Contiguous allocation • Each file occupies a set of contiguous blocks on the disk • Simple – only the starting block # and the number of blocks are required • Both random and sequential access are possible • External fragmentation (holes) • Files cannot easily grow, because the adjacent space might already be allocated • Some systems allocate in groups of blocks (extents or clusters); files are then linked lists of these contiguous allocations • Logical to physical translation of record R: Block = start + floor(R * record size / block size); Offset = (R * record size) % block size
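As a worked instance of the translation for contiguous allocation, the sketch below uses integer division for the floor. The class name and the sample numbers (start block 100, 120-byte records, 512-byte blocks) are assumptions chosen for illustration.

```java
final class ContiguousFile {
    // Physical block holding the first byte of record r.
    static long block(long start, long r, long recordSize, long blockSize) {
        return start + (r * recordSize) / blockSize;   // integer division acts as floor
    }

    // Byte offset of record r inside that block.
    static long offset(long r, long recordSize, long blockSize) {
        return (r * recordSize) % blockSize;
    }
}
```

Record 10 starts at byte 10 * 120 = 1200, so with the sample numbers it lies in block 100 + 1200 / 512 = 102 at offset 1200 % 512 = 176.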
Linked Allocation of File Space • Files are linked lists of blocks: blocks may be anywhere • Simple – the directory only needs the file’s starting block address • No external fragmentation, but no efficient random access • File-allocation table (FAT), used by MS-DOS and OS/2: a table with one entry per cluster of blocks; each entry holds the number of the next cluster in the file’s chain, and unused clusters are marked as free • Caching the FAT reduces disk seeks • Location of record R: Block = located by linked list traversal; Offset = (R * record size) % block size
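Locating a record under linked allocation means following the chain one block at a time. The sketch below models the FAT as an int array where entry i holds the block that follows block i; the END_OF_CHAIN marker value and the class name are assumptions for illustration, not the on-disk FAT encoding.

```java
final class FatFile {
    static final int END_OF_CHAIN = -1; // assumed marker for the last block of a file

    // Follows the chain from firstBlock to the block that holds record r.
    static int blockOf(int[] fat, int firstBlock, long r, long recordSize, long blockSize) {
        long hops = (r * recordSize) / blockSize;  // number of links to follow
        int block = firstBlock;
        for (long i = 0; i < hops; i++) {
            block = fat[block];
            if (block == END_OF_CHAIN) {
                throw new IndexOutOfBoundsException("record lies beyond the end of the file");
            }
        }
        return block;
    }
}
```

The cost is proportional to the record's position in the file, which is why linked allocation suits sequential access and why keeping the table cached in memory matters.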
Indexed Allocation of File Space • An index block contains the file’s block pointers • The index table must be maintained; index blocks can be linked together for large files • Random access possible • Allows dynamic allocation without external fragmentation • Index table can be cached • Location of record R: Block = located by index table lookup; Offset = (R * record size) % block size
Multi-level Indexed Allocation of File Space [Figure: outer index, index table, and file blocks; UNIX inode (4K bytes per block)]
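With an index block, the lookup becomes a single array access. This is a minimal single-level sketch; IndexedFile and the sample index contents are hypothetical.

```java
final class IndexedFile {
    // indexBlock[i] is the physical block that stores logical block i of the file.
    static int blockOf(int[] indexBlock, long r, long recordSize, long blockSize) {
        long logical = (r * recordSize) / blockSize;
        if (logical >= indexBlock.length) {
            throw new IndexOutOfBoundsException("record lies beyond the end of the file");
        }
        return indexBlock[(int) logical];
    }
}
```

With index {9, 16, 1, 10, 25}, 120-byte records, and 512-byte blocks, record 10 falls in logical block 2, which is physical block 1.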
Management of Free Space • Bit vector (one bit per block; 0 = free) • Extra space needed • Example: block size = 4096 bytes (2^12), disk size = 1 gigabyte (2^30 bytes), bit vector space = 2^30/(2^12 * 2^3) = 2^15 bytes = 32 KB • Easy to find groups of free blocks • Linked list (free list) • Finding contiguous space is hard • No waste of space • Grouping: separate lists ordered by contiguous block size • Counting: a linked list contains block #s plus a count of adjacent free blocks • Issue: Maintaining consistency between memory structures and those on disk
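A bit vector is easy to model with java.util.BitSet. The sketch below follows the slide's convention (a clear bit, 0, means free) and uses a first-fit search for a run of free blocks; FreeSpaceMap and its method names are hypothetical.

```java
import java.util.BitSet;

final class FreeSpaceMap {
    private final BitSet allocated;   // set bit = allocated, clear bit (0) = free
    private final int blockCount;

    FreeSpaceMap(int blockCount) {
        this.blockCount = blockCount;
        this.allocated = new BitSet(blockCount);
    }

    // First fit: marks n contiguous free blocks (n > 0) as allocated and
    // returns the first block number, or -1 if no such run exists.
    int allocate(int n) {
        int start = allocated.nextClearBit(0);
        while (start + n <= blockCount) {
            int nextUsed = allocated.nextSetBit(start);
            if (nextUsed < 0 || nextUsed - start >= n) {
                allocated.set(start, start + n);
                return start;
            }
            start = allocated.nextClearBit(nextUsed);
        }
        return -1;
    }

    void free(int start, int n) {
        allocated.clear(start, start + n);
    }

    // Size of the bit vector in bytes: one bit per block, rounded up.
    static long bitmapBytes(long diskBytes, long blockBytes) {
        long blocks = diskBytes / blockBytes;
        return (blocks + 7) / 8;
    }
}
```

FreeSpaceMap.bitmapBytes(1L << 30, 4096) evaluates to 32768, matching the 32 KB figure in the example above.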
Efficiency • Efficiency depends on: • Allocation and access algorithms • FCB and directory content • Caching • Caching • By buffer: cache disk blocks in a separate section of memory • By page: cache pages using virtual memory techniques (memory-mapped I/O) • Algorithm optimizations • Use free-behind (release previously read blocks) and read-ahead to optimize sequential access • Dedicate a section of memory as a virtual disk (RAM disk) [Figure: Various Disk-Caching Locations]
Unified and Non-unified Buffer Cache • Page cache: holds pages, rather than disk blocks • Buffer cache: holds recently used disk blocks • Unified buffer cache: same cache for both file I/O and memory-mapped files • Non-unified buffer cache: separate caches for memory-mapped files and for file I/O; requires extra copying [Figures: Unified Buffer; No Unified Buffer]
Reliability • Consistent backup procedures • System programs perform full or incremental backups • Data recovery restores lost data from a backup device • Consistency checking on reboot • Inconsistent directory/block allocations automatically repaired • Log structured (or journaling) to minimize seeks • Write file system operations as a transaction to a circular buffer (or log) • Transaction committed after log write operations complete • A background task processes log transactions • Asynchronously updates the file system • Deletes appropriate log records after the update completes • After a crash, the system finishes any partial operations
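The journaling steps can be sketched with an in-memory queue standing in for the on-disk log. This is an illustration under loud assumptions: the Journal class and its methods are hypothetical, and it assumes that transactions never marked committed are discarded during recovery, which the slide does not spell out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class Journal {
    // One logged transaction: a list of {key, value} updates plus a commit flag.
    private static final class Tx {
        final List<String[]> updates = new ArrayList<>();
        boolean committed;
    }

    private final Deque<Tx> log = new ArrayDeque<>();        // stands in for the on-disk log
    final Map<String, String> fileSystem = new HashMap<>();  // stands in for on-disk structures
    private Tx current;

    void begin() {
        current = new Tx();
        log.addLast(current);
    }

    // Updates go to the log first, not to the file system structures.
    void write(String key, String value) {
        current.updates.add(new String[] {key, value});
    }

    // The transaction counts as committed once its log writes are complete.
    void commit() {
        current.committed = true;
        current = null;
    }

    // Background task or crash recovery: apply committed transactions in order,
    // delete their log records, and (assumption) discard uncommitted ones.
    void replay() {
        while (!log.isEmpty()) {
            Tx tx = log.removeFirst();
            if (!tx.committed) {
                continue;
            }
            for (String[] update : tx.updates) {
                fileSystem.put(update[0], update[1]);
            }
        }
        current = null;
    }
}
```

If a crash happens after commit() but before the background task runs, calling replay() on restart still brings the file system up to date from the log.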
The Sun Network File System (NFS) Software specification for accessing remote files across LAN or WAN • Networked system view: independent and heterogeneous • Sharing of file systems: transparent to users • Mount operations: • require specifying the host IP address • Remote directories are mounted over any local file system directory; they hide the directories and subdirectories over which they mount • Cascading mounts: locally mount over other mounted file systems. Users do not get access to subdirectories remotely mounted over remote directories • Implementation: • Remote Procedure calls (RPC) & External Data Representation (XDR) protocol • Servers are stateless but maintain client lists for server shutdowns
NFS Mounting Purpose: Establish connections • Mount operation • usr/shared mounts over usr/local • User loses access to the local contents of usr/local • Cascaded mount operation • usr/dir2 mounts over usr/local/dir1 • Now dir2 hides dir1 • Pseudo code: 1) Establish connection with server 2) Request name of remote directory to mount 3) Server returns file handle, containing file-system identifier/inode number 4) User view changes and the remote file system becomes available 5) Remote procedure calls for file/directory operations become available [Figure: Three independent file systems]
NFS Protocol • NFS uses buffering (server side) and caching (client side); the local kernel checks whether the local cache is up to date • All operations are synchronous • Utilizes RPC calls • 1-1 API with UNIX system calls (except open, close) • No concurrency control • Requests are stateless, each carrying a full set of arguments
NFS Path-Name Translation • Performed by breaking the full path into path component names and performing a separate NFS lookup call for every pair: path component name and directory virtual node (vnode) • To make lookup faster, a directory name lookup cache on the client’s side holds the vnodes for remote directory names
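The component-by-component lookup and the client-side cache can be sketched as follows. Everything here is a hypothetical stand-in: LookupServer plays the role of the remote NFS lookup call, integer handles play the role of vnodes, and the cache keeps every component it resolves rather than directories only.

```java
import java.util.HashMap;
import java.util.Map;

final class PathTranslator {
    // Hypothetical stand-in for the remote call: (directory handle, name) -> handle.
    interface LookupServer {
        int lookup(int dirHandle, String name);
    }

    private final LookupServer server;
    private final Map<String, Integer> cache = new HashMap<>(); // "dirHandle/name" -> handle
    int remoteCalls = 0;                                         // counts simulated RPCs

    PathTranslator(LookupServer server) {
        this.server = server;
    }

    // Resolves the path one component at a time, starting from the root handle.
    int translate(int rootHandle, String path) {
        int handle = rootHandle;
        for (String component : path.split("/")) {
            if (component.isEmpty()) {
                continue;                     // skip the empty piece before a leading "/"
            }
            String key = handle + "/" + component;
            Integer next = cache.get(key);
            if (next == null) {               // cache miss: one remote lookup per pair
                next = server.lookup(handle, component);
                remoteCalls++;
                cache.put(key, next);
            }
            handle = next;
        }
        return handle;
    }
}
```

Translating /usr/local/dir1 costs three remote lookups; a later translation of /usr/local/dir2 costs only one more, because the handles for usr and local are already cached.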