File Systems Fred Kuhns (fredk@arl.wustl.edu, http://www.arl.wustl.edu/~fredk) Department of Computer Science and Engineering Washington University in St. Louis
Why a file system
• There is a general need for long-term and shared data storage:
  • need to store large amounts of information
  • persistent storage (outlives processes and system reboots)
  • concurrent sharing of information
• Files meet these requirements
• Files are managed by the file manager, or file system, within the OS
Cs422 – Operating Systems Organization
File Concept
• Abstraction presented to the user
• Named collection of related information on secondary storage.
• File name may encode the file type
  • file extensions in UNIX and Windows
• Common examples of file types
  • Regular files, directories
  • Executable files
  • Special files (block and character)
  • Archives
File Structure
• None - sequence of words, bytes
• Simple record structure
  • Lines, fixed length, variable length
• Complex structures
  • Formatted documents, multi-media documents
• Who decides:
  • Operating system
  • Application
  • “Middleware”
  • DBMS
File Attributes
• Name – only information kept in human-readable form.
• Type – needed for systems that support different types.
• Location – pointer to file location on device.
• Size – current file size.
• Protection – controls who can do reading, writing, executing.
• Time, date, and user identification – data for protection, security, and usage monitoring.
• Information about files is kept in the directory structure, which is maintained on the disk.
File Operations
• create
• write
• read
• reposition within file – file seek
• delete
• truncate
• open(Fi) – search the directory structure on disk for entry Fi, and move the content of the entry to memory.
• close(Fi) – move the content of entry Fi in memory to the directory structure on disk.
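The operations listed above map closely onto the system-call wrappers in Python's `os` module. The following is a minimal sketch, not part of the original slides; the file name `example.dat` and the helper name `demo_file_operations` are made up for illustration.

```python
import os
import tempfile

def demo_file_operations():
    """Run create, write, seek, read, truncate, close and delete on a scratch file."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "example.dat")      # hypothetical name
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)  # create + open
        try:
            os.write(fd, b"hello, file system")            # write
            os.lseek(fd, 7, os.SEEK_SET)                   # reposition to byte 7
            data = os.read(fd, 4)                          # read 4 bytes
            os.ftruncate(fd, 5)                            # truncate to 5 bytes
            size = os.fstat(fd).st_size
        finally:
            os.close(fd)                                   # close
        os.unlink(path)                                    # delete
        return data, size, os.path.exists(path)
```

Calling `demo_file_operations()` returns the bytes read after the seek, the size after truncation, and whether the file still exists after deletion.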
File Types – name, extension
Access Methods
• Sequential Access
  • read next
  • write next
  • reset
  • no read after last write (rewrite)
• Direct Access (n = relative block number)
  • read n
  • write n
  • position to n
    • read next
    • write next
  • rewrite n
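A small sketch of the two access methods, assuming a block size of 512 bytes and an ordinary seekable file object; the function names are illustrative only.

```python
import io

BLOCK_SIZE = 512  # assumed block size in bytes

def read_sequential(f):
    """Sequential access: 'reset', then 'read next' until end of file."""
    f.seek(0)                       # reset
    blocks = []
    while True:
        block = f.read(BLOCK_SIZE)  # read next
        if not block:
            break
        blocks.append(block)
    return blocks

def read_direct(f, n):
    """Direct access: 'position to n', then read relative block n."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)
```

With `f = io.BytesIO(...)` standing in for a file, `read_direct(f, 1)` fetches the second block without touching the first, whereas `read_sequential(f)` must pass over every block in order.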
Directory Structure
• A collection of nodes containing information about all files.
[Figure: a directory whose entries point to files F1, F2, F3, F4, …, Fn]
• Both the directory structure and the files reside on disk.
• Backups of these two structures are kept on tapes.
Information in a Device Directory
• Name
• Type
• Address
• Current length
• Maximum length
• Date last accessed (for archival)
• Date last updated (for dump)
• Owner ID (who pays)
• Protection information (discuss later)
Operations Performed on Directory
• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• Traverse the file system
Organize the Directory (Logically)
• Efficiency – locating a file quickly.
• Naming – convenient to users.
  • Two users can have the same name for different files.
  • The same file can have several different names.
• Grouping – logical grouping of files by properties (e.g., all Pascal programs, all games, …)
Single-Level Directory
• A single directory for all users.
• Naming problem
• Grouping problem
Two-Level Directory
• Separate directory for each user.
• Path name
• Different users can have files with the same name
• Efficient searching
• No grouping capability
Tree-Structured Directories
Tree-Structured Directories
• Efficient searching
• Grouping capability
• Current directory (working directory)
  • cd /spell/mail/prog
  • type list
Tree-Structured Directories
• Absolute or relative path name
• Creating a new file is done in the current directory.
• Delete a file: rm <file-name>
• Creating a new subdirectory is done in the current directory: mkdir <dir-name>
• Example: if the current directory is /spell/mail, then mkdir count creates /spell/mail/count
[Figure: mail with subdirectories prog, copy, prt, exp, count]
• Deleting mail means deleting the entire subtree rooted at ‘mail’
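The mkdir example and the subtree deletion can be tried out in a scratch directory. This is only a sketch: the `spell/mail` tree is recreated under a temporary directory rather than at the real root, and `demo_subtree_delete` is a made-up helper name.

```python
import os
import shutil
import tempfile

def demo_subtree_delete():
    """Build spell/mail/{prog,copy,prt,exp}, add count, then delete mail."""
    with tempfile.TemporaryDirectory() as root:
        mail = os.path.join(root, "spell", "mail")
        for name in ("prog", "copy", "prt", "exp"):
            os.makedirs(os.path.join(mail, name))
        os.mkdir(os.path.join(mail, "count"))   # like: cd /spell/mail; mkdir count
        before = sorted(os.listdir(mail))
        shutil.rmtree(mail)                     # removes the whole subtree
        remaining = os.listdir(os.path.join(root, "spell"))
        return before, remaining
```

After `shutil.rmtree(mail)` nothing below `mail` survives, which is the behavior the slide describes for deleting a directory in a tree-structured system.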
Acyclic-Graph Directories
• Have shared subdirectories and files.
Acyclic-Graph Directories
• Two different names (aliasing)
• If dict deletes all ⇒ dangling pointer (the other name still refers to the deleted file).
• Solutions:
  • Backpointers, so we can delete all pointers. Variable-size records are a problem.
  • Backpointers using a daisy chain organization.
  • Entry-hold-count solution.
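One way to read the entry-hold-count solution is as reference counting: each file records how many directory entries point at it and is reclaimed only when that count reaches zero. The classes below are a toy in-memory model with invented names, not an actual file-system structure.

```python
class SharedFile:
    """A file plus the number of directory entries that refer to it."""
    def __init__(self, data):
        self.data = data
        self.hold_count = 0

class Directory:
    def __init__(self):
        self.entries = {}

    def link(self, name, file):
        """Add a name for the file and bump its hold count."""
        self.entries[name] = file
        file.hold_count += 1

    def unlink(self, name):
        """Remove a name; return True when the file can be reclaimed."""
        file = self.entries.pop(name)
        file.hold_count -= 1
        return file.hold_count == 0
```

Because the file is freed only when the last name goes away, removing one alias never leaves the other alias dangling.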
General Graph Directory
General Graph Directory (Cont.)
• How do we guarantee no cycles?
  • Allow links only to files, not to subdirectories.
  • Garbage collection.
  • Every time a new link is added, use a cycle detection algorithm to determine whether it is OK.
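The cycle check on link creation can be sketched as a reachability test: a new link from `parent` to `target` closes a cycle exactly when `parent` is already reachable from `target`. The dictionary-of-lists representation and the function name are assumptions made for this example.

```python
def creates_cycle(children, parent, target):
    """Return True if adding the link parent -> target would create a cycle.

    children maps a directory name to the list of directories it links to.
    """
    stack = [target]
    seen = set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True          # parent reachable from target: cycle
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return False
```

Sharing a subdirectory from a sibling is accepted, while linking a directory back to one of its ancestors is rejected. The cost is a traversal on every link operation.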
Protection
• File owner/creator should be able to control:
  • what can be done
  • by whom
• Types of access
  • Read
  • Write
  • Execute
  • Append
  • Delete
  • List
Access Lists and Groups
• Mode of access: read, write, execute
• Three classes of users
                        R W X
  a) owner access   7 ⇒ 1 1 1
  b) group access   6 ⇒ 1 1 0
  c) public access  1 ⇒ 0 0 1
• Ask manager to create a group (unique name), say G, and add some users to the group.
• For a particular file (say game) or subdirectory, define an appropriate access:
  chmod 761 game   (owner = 7, group = 6, public = 1)
• Attach a group to a file:
  chgrp G game
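The octal digits in `chmod 761` are just the three RWX bit triples from the table above. A short sketch that decodes them (the helper name `mode_to_string` is made up):

```python
def mode_to_string(mode):
    """Render a 9-bit permission mode as owner/group/public rwx triples."""
    out = []
    for shift in (6, 3, 0):              # owner, group, public
        bits = (mode >> shift) & 0o7
        out.append("r" if bits & 4 else "-")
        out.append("w" if bits & 2 else "-")
        out.append("x" if bits & 1 else "-")
    return "".join(out)
```

For example, `mode_to_string(0o761)` yields `rwxrw---x`: full access for the owner, read and write for the group, execute only for everyone else.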
File-System Structure
• Disk divided into one or more partitions
  • independent FS on each partition
• Sector 0 contains the Master Boot Record (MBR)
  • MBR contains the partition table
  • one partition is marked as active
• Boot block – first block of the active partition
  • BIOS reads and executes the MBR, which reads the boot block and executes it.
  • The program in the boot block loads the OS and runs it.
• Often the FS contains a superblock, which holds key FS parameters
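As an illustration of the partition table, the sketch below parses sector 0 under the assumption of the classic PC MBR layout (four 16-byte entries at byte offset 446, boot signature 0x55 0xAA at offset 510, status byte 0x80 for the active partition). The function name and the returned tuple shape are choices made for this example.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # partition table offset in a classic PC MBR
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80

def parse_partition_table(sector0):
    """Return (index, active, type, first_lba, sector_count) per used entry."""
    if len(sector0) != SECTOR_SIZE or sector0[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        # status byte, 3 CHS bytes skipped, type byte, 3 CHS bytes skipped,
        # then little-endian first LBA and sector count
        status, ptype, first_lba, count = struct.unpack(
            "<B3xB3xII", sector0[start:start + ENTRY_SIZE])
        if ptype != 0:   # type 0 marks an unused entry
            partitions.append(
                (index, status == ACTIVE_FLAG, ptype, first_lba, count))
    return partitions
```

The entry whose `active` field is true is the one whose first block (the boot block) the MBR code would load next.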
Example Disk and Filesystem Layout
[Figure: disk layout: MBR (containing the partition table), partition 1, partition 2 (active), partition 3]
[Figure: partition layout: boot block, super block, free space management, inode list, root dir, files & dirs]
Files: Contiguous Allocation
• Each file occupies a set of contiguous blocks on the disk.
• Simple – only starting location (block #) and length (number of blocks) are required.
• Random access.
• Wasteful of space (external fragmentation); compaction may be used to fix this.
• Files cannot grow.
• Mapping from logical to physical.
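The logical-to-physical mapping for contiguous allocation is a single division. A minimal sketch, assuming 512-byte blocks and a byte-granular logical address (both assumptions, as is the function name):

```python
def contiguous_map(start_block, length_blocks, logical_addr, block_size=512):
    """Map a logical byte address to (physical block, offset within block)."""
    q, r = divmod(logical_addr, block_size)
    if q >= length_blocks:
        raise IndexError("address beyond end of file")
    return start_block + q, r
```

A file starting at block 14 with length 3 maps logical byte 1030 to block 16, offset 6. This constant-time calculation is why contiguous allocation supports random access.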
Linked Allocation
• Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.
[Figure: each block holds a pointer to the next block]
• Allocate as needed, link together; e.g., file starts at block 9
Linked Allocation (Cont.)
Linked Allocation (Cont.)
• Simple – need only starting address
• Free-space management system – no waste of space
• No random access
• Mapping
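With linked allocation the mapping has to walk the chain, which is why random access is poor. In the sketch below the chain is modeled as a dictionary from block number to next block number; the file starts at block 9 as in the slide, but the remaining block numbers are made up for illustration.

```python
def linked_map(next_block, start_block, logical_block):
    """Follow the chain from start_block to reach the given logical block."""
    block = start_block
    for _ in range(logical_block):
        block = next_block[block]        # one disk read per hop on real hardware
        if block is None:
            raise IndexError("logical block beyond end of file")
    return block
```

With the chain `{9: 16, 16: 1, 1: 10, 10: 25, 25: None}`, reaching logical block 3 takes three hops and ends at block 10. The cost grows linearly with the position in the file.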
Indexed Allocation
• Brings all pointers together into the index block.
• Logical view.
[Figure: index table whose entries point to the file's data blocks]
Example of Indexed Allocation
Indexed Allocation (Cont.)
• Need index table
• Random access
• Dynamic access without external fragmentation, but with the overhead of an index block.
• Wasted space
• Index levels
Indexed Allocation – Mapping
[Figure: outer index → index table → file]
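The two-level mapping in the figure can be sketched as two table lookups: the outer index selects an index table, and the index table selects the data block. All names, the dictionary representation, and the table sizes below are assumptions for illustration.

```python
def two_level_map(outer_index, index_tables, logical_block, entries_per_table):
    """Map a logical block number to a physical block via a two-level index.

    outer_index: list of index-table block numbers
    index_tables: dict mapping an index-table block number to its entries
    """
    q, r = divmod(logical_block, entries_per_table)
    table_block = outer_index[q]         # first lookup: which index table
    return index_tables[table_block][r]  # second lookup: which data block
```

With four entries per table, logical block 5 falls in the second index table at position 1. Unlike linked allocation, the cost is two lookups regardless of position in the file.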
Combined Scheme: UNIX
Disk Space Management
• Block size – disk utilization and performance are dependent on file size.
• For example, assume a median file size of 2 KB
  • 130,000 bytes/track, 7200 RPM (8.33 ms per rotation), 10 ms average seek time
  • Transfer time in ms for a block of k bytes (seek + average rotational delay + transfer) = 10 + 4.165 + 8.33*k/130000
  • Data rate = k/(10 + 4.165 + 8.33*k/130000)
• Disk space utilization
  • with larger blocks you increase internal fragmentation
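The trade-off can be explored numerically with the figures from the slide. The sketch below evaluates the transfer-time and data-rate formulas for a block of k bytes and, as an added assumption, estimates space utilization for a single 2 KB file stored in whole blocks.

```python
import math

SEEK_MS = 10.0          # average seek time
ROTATION_MS = 8.33      # one rotation at 7200 RPM
TRACK_BYTES = 130_000   # bytes per track

def transfer_time_ms(k):
    """Seek + half a rotation + time to transfer k bytes."""
    return SEEK_MS + ROTATION_MS / 2 + ROTATION_MS * k / TRACK_BYTES

def data_rate(k):
    """Bytes per millisecond when reading one block of k bytes."""
    return k / transfer_time_ms(k)

def utilization(k, file_size=2048):
    """Fraction of allocated space actually used by one file (assumed model)."""
    blocks = math.ceil(file_size / k)
    return file_size / (blocks * k)
```

Larger blocks raise the data rate because the fixed 14.165 ms overhead is amortized over more bytes, but a 4 KB block holding a 2 KB file is only half used, which is the internal fragmentation mentioned above.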
Free-Space Management
• Bit vector (n blocks), one bit per block: bits 0, 1, 2, …, n-1
  • bit[i] = 0 ⇒ block[i] free
  • bit[i] = 1 ⇒ block[i] occupied
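A minimal bit-vector allocator using the slide's convention (0 = free, 1 = occupied) might look like the following. The class and method names are invented, and a real implementation would scan a word at a time rather than bit by bit.

```python
class BitVector:
    """Free-space bit vector: bit i is 0 if block i is free, 1 if occupied."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray((n_blocks + 7) // 8)

    def is_free(self, i):
        return ((self.bits[i // 8] >> (i % 8)) & 1) == 0

    def allocate(self):
        """Return the first free block number and mark it occupied."""
        for i in range(self.n_blocks):
            if self.is_free(i):
                self.bits[i // 8] |= 1 << (i % 8)
                return i
        raise RuntimeError("no free blocks")

    def free(self, i):
        """Mark block i as free again."""
        self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF
```

Allocation always returns the lowest-numbered free block, so a block that is freed and then reallocated comes back before any higher-numbered block is used.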
Free-Space Management (Cont.)
• Bit map requires extra space. Example:
  • block size = 2^12 bytes (4096 bytes)
  • disk size = 2^34 bytes (16 GBytes)
  • n = 2^34/2^12 = 2^22 bits (4 Mbits = 512 KBytes)
• Linked list (free list)
  • Cannot get contiguous space easily
  • No waste of space
• Grouping
• Counting
Free-Space Management (Cont.)
• Need to protect:
  • Pointer to free list
  • Bit map
    • Must be kept on disk
    • Copy in memory and disk may differ.
    • Cannot allow for block[i] to have a situation where bit[i] = 1 in memory and bit[i] = 0 on disk.
• Solution:
  • Set bit[i] = 1 on disk.
  • Allocate block[i].
  • Set bit[i] = 1 in memory.
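The ordering in the solution can be checked with a toy model in which two Python lists stand in for the on-disk and in-memory bit maps (an assumption; no real I/O happens). The `trace` argument records the (disk, memory) bit pair after each update so the forbidden state can be looked for.

```python
def allocate_block(i, disk_bits, memory_bits, trace):
    """Mark block i allocated, updating the disk copy before the memory copy."""
    disk_bits[i] = 1                               # step 1: persist on disk
    trace.append((disk_bits[i], memory_bits[i]))
    # step 2: block[i] may now be handed to the file
    memory_bits[i] = 1                             # step 3: update cached copy
    trace.append((disk_bits[i], memory_bits[i]))
    return i
```

The forbidden pair (disk = 0, memory = 1) never appears in the trace, so a crash between the steps can at worst leave a block marked occupied on disk that nobody uses. It cannot cause a block in use to look free after reboot.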
Directory Implementation
• Linear list of file names with pointers to the data blocks.
  • simple to program
  • time-consuming to execute
• Hash table – linear list with hash data structure.
  • decreases directory search time
  • collisions – situations where two file names hash to the same location
  • fixed size
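The two directory organizations can be contrasted with a small in-memory sketch. The class names, the byte-sum hash function, and the use of chaining to resolve collisions are all choices made for this example. The slides do not say how collisions are handled.

```python
class LinearDirectory:
    """Directory as a linear list of (name, first_block) entries."""
    def __init__(self):
        self.entries = []

    def add(self, name, first_block):
        self.entries.append((name, first_block))

    def lookup(self, name):
        for entry_name, block in self.entries:   # O(n) scan
            if entry_name == name:
                return block
        return None

class HashedDirectory:
    """Directory with a fixed-size hash table; collisions are chained."""
    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, name):
        # Toy hash: sum of the name's bytes modulo the table size.
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def add(self, name, first_block):
        self._bucket(name).append((name, first_block))

    def lookup(self, name):
        for entry_name, block in self._bucket(name):
            if entry_name == name:
                return block
        return None
```

The names `ab` and `ba` collide under this toy hash, yet both remain retrievable because each bucket keeps a short list. The table size is fixed at creation, which matches the "fixed size" drawback listed above.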
Efficiency and Performance
• Efficiency dependent on:
  • disk allocation and directory algorithms
  • types of data kept in a file’s directory entry
• Performance
  • disk cache – separate section of main memory for frequently used blocks
  • free-behind and read-ahead – techniques to optimize sequential access
  • improve PC performance by dedicating a section of memory as a virtual disk, or RAM disk.
Various Disk-Caching Locations
Recovery
• Consistency checker – compares data in directory structure with data blocks on disk, and tries to fix inconsistencies.
• Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape).
• Recover lost file or disk by restoring data from backup.
File System Implementations
• UNIX Examples - SVR4 and BSD
UNIX FS Framework
• Provides persistent storage
• Facilities for managing data
• Interface
  • exported abstractions: files, directories, file descriptors and file systems
  • kernel does not interpret file contents
  • files and directories form a tree structure
File and Directory Organization
[Figure: directory tree rooted at /, with entries bin, etc, dev, usr, vmunix, sh, local, etc, bin and bash connected by (hard) links; the example path shown is /usr/local/bin/bash]
File Attributes
• Type - for example regular, FIFO, special.
• Reference count
• Size in bytes
• Device id
• Ownership
• Access modes
• Timestamps
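On a UNIX-like system most of these attributes are visible through the `stat` system call. The sketch below collects them with Python's `os.stat`; the helper name and the dictionary keys are made up, and the mapping of "special" to character and block devices is this example's interpretation.

```python
import os
import stat

def file_attributes(path):
    """Collect the attributes listed above for one path."""
    st = os.stat(path)
    if stat.S_ISREG(st.st_mode):
        kind = "regular"
    elif stat.S_ISDIR(st.st_mode):
        kind = "directory"
    elif stat.S_ISFIFO(st.st_mode):
        kind = "FIFO"
    elif stat.S_ISCHR(st.st_mode) or stat.S_ISBLK(st.st_mode):
        kind = "special"
    else:
        kind = "other"
    return {
        "type": kind,
        "reference_count": st.st_nlink,       # number of hard links
        "size": st.st_size,                   # size in bytes
        "device": st.st_dev,                  # device id
        "owner": (st.st_uid, st.st_gid),      # ownership
        "mode": stat.S_IMODE(st.st_mode),     # access mode bits
        "timestamps": (st.st_atime, st.st_mtime, st.st_ctime),
    }
```

Calling `file_attributes(".")` reports the current directory's type, link count, and permission bits.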
User View of Files
• File Descriptors (open, dup, dup2, fork)
  • All I/O is through file descriptors
  • references the open file object
  • per process object
• File Object - holds context
  • created by an open() system call
  • stores file offset
  • reference to vnode
• vnode - abstract representation of a file
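Because the file offset lives in the open file object rather than in the descriptor, two descriptors produced by `dup` share one offset, while a second `open` of the same file gets its own. The following sketch demonstrates this on a scratch file; the helper name and file name are invented.

```python
import os
import tempfile

def demo_shared_offset():
    """Return (offset seen via a dup'ed fd, offset seen via a separate open)."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "data.bin")   # hypothetical name
        with open(path, "wb") as f:
            f.write(b"0123456789")

        fd = os.open(path, os.O_RDONLY)
        fd_dup = os.dup(fd)                    # same open file object
        fd_other = os.open(path, os.O_RDONLY)  # new open file object
        try:
            os.read(fd, 4)                     # advances the shared offset
            dup_offset = os.lseek(fd_dup, 0, os.SEEK_CUR)
            other_offset = os.lseek(fd_other, 0, os.SEEK_CUR)
        finally:
            for d in (fd, fd_dup, fd_other):
                os.close(d)
        return dup_offset, other_offset
```

Reading four bytes through `fd` moves the offset observed through `fd_dup`, while `fd_other` still sits at the start of the file.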
How it works
[Figure: per-process file descriptors (0 to 5, each a uf_ofile entry) reference open file objects {*f_vnode, f_offset, f_count, ...}, which in turn reference vnode/vfs objects, the in-memory representation of a file]
File Systems
• File hierarchy is composed of one or more file systems
• One file system is designated the root file system
• Other file systems are attached at mount points
• A file cannot span multiple file systems
• A file system resides on one logical disk
Logical Disks
• Viewed as a linear sequence of fixed-size, randomly accessible blocks.
• A file system must reside on a logical disk; however, a logical disk need not contain a file system.
• Typically, physical disks are divided into partitions that correspond to logical disks