Operating Systems CST 352 File Systems CST 352 - Operating Systems
Topics • Introduction • File System Considerations • Naming • Structure • Types • Access • Operations CST 352 - Operating Systems
Topics • Directories • Single Level • Hierarchical • Operations • Implementation • Layout • Allocation • Directories • Free Space Management CST 352 - Operating Systems
Introduction • All computer systems need to store and retrieve information. • A machine must be capable of being powered down without losing important information. • Information must also remain usable beyond the lifetime of the process that creates or consumes it.
Introduction • Persistence Requirements • Information must transcend process boundaries. • Information must transcend power cycles of a machine. • Multiple processes/threads must be able to have simultaneous access to information. • Stored information must be capable of growing to large quantities. CST 352 - Operating Systems
Introduction To deal with the aforementioned requirements, the most common solution is to use a magnetic disk. On the magnetic disk, information needs to be organized in groups, known as files. CST 352 - Operating Systems
Introduction A File is an abstract way to represent information stored on some persistent media, such as a magnetic disk. Using a file, information can be created by a thread in a process and written to a disk, then later read by a thread in a separate process. CST 352 - Operating Systems
File System Considerations File Naming • To deal with files on a magnetic disk, the process using the file must have some way to refer to the chunk of disk storage. • File names must conform to standards set by the operating system. • A typical naming scheme deals with two parts – a name and an extension. • The extension can be used by the OS to associate files with application programs. CST 352 - Operating Systems
File System Considerations File Naming – Possible components. • protocol (or scheme) — access method (e.g., http, ftp, file etc.) • host (or network-ID) — host name, IP address, domain name, or LAN network name (e.g., wikipedia.org, 207.142.131.206, \\MYCOMPUTER, SYS:, etc.) • device (or node) — port, socket, drive, root mountpoint, disc, volume (e.g., C:, /, SYSLIB, etc.) • directory (or path) — directory tree (e.g., /usr/bin, \TEMP, [USR.LIB.SRC], etc.) • file — base name of the file • type (format or extension) — indicates the content type of the file (e.g., .txt, .exe, .COM, etc.) • version — revision number of the file CST 352 - Operating Systems
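As a rough illustration of the components listed above, this Python sketch splits a URL-style name with the standard library. The sample name is made up for the example, and real systems differ in which components they support.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical name, used only for illustration.
parts = urlsplit("ftp://wikipedia.org/usr/bin/report.txt")
path = PurePosixPath(parts.path)

scheme = parts.scheme          # protocol (access method)
host = parts.hostname          # host
directory = str(path.parent)   # directory (path)
base = path.stem               # base name of the file
ext = path.suffix              # type (extension)
```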
File System Considerations File Structure • Structure as simply a sequence of 8-bit values (bytes). • Most common and most flexible. • The process using the file must interpret it. • Structure as a sequence of fixed-length records. • Allows each record to be structured for block reads. • Structure the fields in the file using a B-tree type structure. • Allows fast access and indexing.
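The fixed-length record structure can be sketched in Python with the struct module. The record layout below (an id, a name, a balance) is an assumption chosen for the example; the point is that a fixed record size turns "find record n" into a multiplication.

```python
import struct

# Assumed layout: 4-byte unsigned id, 16-byte name, 8-byte float balance.
RECORD = struct.Struct("<I16sd")

def pack_record(rec_id, name, balance):
    """Encode one record; short names are padded with zero bytes."""
    return RECORD.pack(rec_id, name.encode("ascii"), balance)

def read_record(data, index):
    """Return record number `index` from a byte string of packed records."""
    offset = index * RECORD.size   # fixed length: offset is a simple product
    rec_id, raw_name, balance = RECORD.unpack_from(data, offset)
    return rec_id, raw_name.rstrip(b"\x00").decode("ascii"), balance

# Four records stored back to back, as they would be in a file.
data = b"".join(pack_record(i, f"user{i}", i * 1.5) for i in range(4))
```

With a plain byte-sequence file, the same bytes would be stored, but the layout would be known only to the process that reads them.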
File System Considerations File Types • Regular Files: These files are created by users and contain information relative to those user processes. • Directories: System files used to manage groupings of files. • Character Special Files: Used by the system for spooling of serial I/O devices. • Block Special Files: Used by the system to model disk I/O. CST 352 - Operating Systems
File System Considerations File Types • Regular Files are of two types: • ASCII – The file can be dumped directly to the screen. All characters are printable. • Binary – The file contains a sequence of binary information, not necessarily ASCII. • In UNIX, executable binary files start with a “magic number”. The OS uses this to determine whether the file is a true executable file.
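A minimal sketch of a magic-number check in Python. It assumes the ELF format used by Linux and many other UNIX-like systems, whose files begin with the bytes 0x7F 'E' 'L' 'F'; the function name and the other categories are illustrative only.

```python
ELF_MAGIC = b"\x7fELF"   # first four bytes of an ELF executable

def classify(header: bytes) -> str:
    """Guess a file's kind from its first bytes (a simplified heuristic)."""
    if header.startswith(ELF_MAGIC):
        return "ELF executable"
    if header.startswith(b"#!"):
        return "script"
    try:
        header.decode("ascii")
    except UnicodeDecodeError:
        return "other binary"
    return "ASCII text"
```

In practice the header would come from reading the first few bytes of the file opened in binary mode.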
File System Considerations File Access • Sequential Access – The bytes in a file must be accessed in a serial fashion. There can be no random seeks through a file. • Random Access – The bytes in a file can be accessed in any order. Access is done based on a “key-value” pair.
File System Considerations File Organization • Sequential Access • Pile • Variable length records • Variable set of fields • Chronological order Each record in the file contains a burst of data. Data is appended to the file as it shows up for write. Read access must be done sequentially. Search for a particular item must be an exhaustive search. CST 352 - Operating Systems
File System Considerations File Organization • Sequential Access • Pile [Figure: image not preserved]
File System Considerations File Organization • Sequential Access • Sequential File • Fixed-length Records. • Fixed set of fields in fixed order. • Sequential order based on a “key” field. CST 352 - Operating Systems
File System Considerations File Organization • Sequential Access • Sequential File [Figure: image not preserved]
File System Considerations File Organization • Random Access • Indexed Sequential File • Characteristics are the same as those of a sequential file. • A “key” based index of access pointers (file pointers) is maintained to give random access points into file records. CST 352 - Operating Systems
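The indexed sequential idea can be sketched as follows: records are kept in key order, a small index holds the key and file offset of every few records, and a lookup jumps to the nearest indexed record and scans forward. The record layout and the index spacing are assumptions for the example.

```python
import bisect
import io
import struct

REC = struct.Struct("<I12s")   # assumed layout: 4-byte key, 12-byte payload

def build_file(keys):
    """Write fixed-length records in key order into an in-memory 'file'."""
    buf = io.BytesIO()
    for k in sorted(keys):
        buf.write(REC.pack(k, f"rec{k}".encode("ascii")))
    return buf

def build_index(buf, every=4):
    """Sparse index: (key, file offset) for every `every`-th record."""
    data = buf.getvalue()
    index = []
    for n in range(0, len(data) // REC.size, every):
        key, _ = REC.unpack_from(data, n * REC.size)
        index.append((key, n * REC.size))
    return index

def lookup(buf, index, key):
    """Jump to the closest indexed record at or before `key`, then scan."""
    slot = bisect.bisect_right([k for k, _ in index], key) - 1
    if slot < 0:
        return None
    buf.seek(index[slot][1])
    while True:
        chunk = buf.read(REC.size)
        if len(chunk) < REC.size:
            return None            # ran off the end of the file
        k, payload = REC.unpack(chunk)
        if k == key:
            return payload.rstrip(b"\x00").decode("ascii")
        if k > key:
            return None            # passed the spot where the key would be
```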
File System Considerations File Organization • Random Access • Indexed Sequential File [Figure: image not preserved]
File System Considerations File Organization • Random Access • Indexed File • A tree based index is created to give direct access to file records. • Multiple tree indexes may be deployed to search on different key types. • Records can be variable length. CST 352 - Operating Systems
File System Considerations File Organization • Random Access • Indexed File [Figure: image not preserved]
File System Considerations File Organization • Random Access • Hashed Index • Disk records are of variable length. • A hash table is deployed to map a key to the actual disk address. CST 352 - Operating Systems
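A small sketch of a hashed index over variable-length records. Here a Python dict stands in for the hash table, an in-memory buffer stands in for the disk, and the 2-byte length prefix is an assumption made so that each record's extent can be recovered.

```python
import io
import struct

LEN = struct.Struct("<H")   # assumed 2-byte length prefix per record

class HashedFile:
    def __init__(self):
        self.store = io.BytesIO()   # stands in for the disk
        self.index = {}             # hash table: key -> byte offset

    def put(self, key, payload: bytes):
        """Append a variable-length record and remember where it starts."""
        self.store.seek(0, io.SEEK_END)
        offset = self.store.tell()
        self.store.write(LEN.pack(len(payload)) + payload)
        self.index[key] = offset

    def get(self, key):
        """Hash the key to an offset and read exactly one record."""
        offset = self.index.get(key)
        if offset is None:
            return None
        self.store.seek(offset)
        (length,) = LEN.unpack(self.store.read(LEN.size))
        return self.store.read(length)
```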
File System Considerations File Organization • Random Access • Hashed Index [Figure: image not preserved]
File System Considerations File Attributes In the header of a file, special attributes are stored for management of the file: • Permission – Which processes can and cannot access the file. • Password – File access control. • Creator – Which process or user created the file. • Owner – Who is the current owner of the file. • RW Flag – Is the file read-only or read/write?
File System Considerations File Attributes • Hidden Flag – Can the file be seen by directory listings? • System Flag – Is this a system file? • Archive Flag – Has this file been archived? • Binary Flag – Is this a binary file? • Etc. CST 352 - Operating Systems
File System Considerations File Operations • Create – Write a file entry point in the file system. • Delete – Free up any disk space associated with the file to be deleted. • Open – Get the file attributes and address copied from the disk into main memory. Initialize the file pointer. • Close – Remove the file attribute cache from main memory. CST 352 - Operating Systems
File System Considerations File Operations (cont’d) • Read – Read a sequence of bytes from the file from the current file pointer position. • Write – Write a sequence of bytes to the file based on the current file pointer position. • Append – Add a sequence of bytes to the end of a file. • Seek – Move the file pointer to a new position in the file. CST 352 - Operating Systems
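The operations above map closely onto calls available in most languages. The sketch below uses Python's built-in open together with os and tempfile; the temporary file and the demo function are only there to make the example self-contained.

```python
import os
import tempfile

def demo():
    fd, path = tempfile.mkstemp()        # create: a new, empty file
    os.close(fd)
    try:
        with open(path, "r+b") as f:     # open: file pointer starts at 0
            f.write(b"hello world")      # write at the current position
            f.seek(6)                    # seek: move the file pointer
            f.write(b"W")                # overwrite one byte in place
            f.seek(0)
            first = f.read(5)            # read from the current position
        with open(path, "ab") as f:      # append: always writes at the end
            f.write(b"!")
        with open(path, "rb") as f:
            whole = f.read()
        size = os.stat(path).st_size     # get attributes
    finally:
        os.remove(path)                  # delete
    return first, whole, size
```

Leaving each with block closes the file, which corresponds to the Close operation.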
File System Considerations File Operations (cont’d) • Get Attributes – Fetch only the attributes of a file. • Set Attributes – Set the attributes of a file. • Rename – Assign a new logical name to the file. CST 352 - Operating Systems
File System Considerations File System Generic Layered Architecture [Figure: image not preserved]
File System Considerations File System Generic Layered Architecture • Physical Device – The actual hardware. • Device Drivers – Perform operations on the hardware (e.g. start, stop, read, write, etc.) • Basic File System – Block interface, buffering, read commands, write commands. • Basic I/O Subsystem – File I/O initiation and termination. Management of control structures. • Logical I/O Interface – Present file I/O to users and applications as records of data.
Directories Directories provide an abstract method to create a “file of files”. A directory makes it possible to group and store multiple files in a file system.
Directories Single Level There is one “file of files” in the system. The directory can only contain files, not directories. CST 352 - Operating Systems
Directories Hierarchical Allow any directory to contain directories as entries. A directory can therefore contain both files and other directories.
Directories Operations • Find – find a file or directory in a directory. • Create File – create a new file entry in this directory. • Delete File – delete a file from this directory. • List – list a directory or file, or all directories and files contained in this directory.
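A hierarchical directory and its operations can be sketched with a small in-memory class. This is an illustration only: the class and method names are hypothetical, and a real file system stores directory entries on disk rather than in a Python dict.

```python
class Directory:
    def __init__(self):
        self.entries = {}   # name -> bytes (a file) or Directory

    def create_file(self, name, data=b""):
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = data

    def create_dir(self, name):
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = Directory()
        return self.entries[name]

    def delete_file(self, name):
        if isinstance(self.entries.get(name), Directory):
            raise IsADirectoryError(name)
        del self.entries[name]

    def list(self):
        return sorted(self.entries)

    def find(self, path):
        """Walk a '/'-separated path downward from this directory."""
        node = self
        for part in path.strip("/").split("/"):
            if not isinstance(node, Directory) or part not in node.entries:
                return None
            node = node.entries[part]
        return node
```

A single-level system corresponds to using one Directory and never calling create_dir.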
Implementation Allocation Strategies • To manage files, the free space on the disk must be tracked. • Every time a file is created and written to, the file manager must write to a block (group of sectors) on disk. • In addition to handling free and used disk space, the file system must keep track of what blocks go with which files. CST 352 - Operating Systems
Implementation Allocation Strategies • Contiguous – Keep a file as a contiguous sequence of disk blocks. • Advantages: • Simple implementation • Must only keep track of the starting block and the number of blocks used for the file. • Read performance is ideal • A file is located in a continuous sequence of blocks, requiring minimum seek. CST 352 - Operating Systems
Implementation Allocation Strategies • Contiguous (cont’d) • Disadvantages: • High fragmentation • Initially, new files are added to the end of free space. • As files are deleted, space will open up. • The file system will then reuse this free space, most likely for a file smaller than the one that was deleted, leaving small unusable gaps. • To create a file, the file size must be known in advance. • Files that grow larger than their original size must be relocated to a new contiguous area of free blocks on the disk.
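The fragmentation problem can be demonstrated with a toy contiguous allocator. The sketch assumes a first-fit search and a disk modelled as a Python list; both are simplifications chosen for the example.

```python
class ContiguousDisk:
    def __init__(self, n_blocks):
        self.blocks = [None] * n_blocks   # None marks a free block
        self.files = {}                   # name -> (start block, length)

    def create(self, name, length):
        """First fit: use the first run of `length` free blocks (length >= 1)."""
        run = 0
        for i, owner in enumerate(self.blocks):
            run = run + 1 if owner is None else 0
            if run == length:
                start = i - length + 1
                self.blocks[start:i + 1] = [name] * length
                self.files[name] = (start, length)
                return start
        return None                       # no contiguous run is large enough

    def delete(self, name):
        start, length = self.files.pop(name)
        self.blocks[start:start + length] = [None] * length

    def free_blocks(self):
        return self.blocks.count(None)
```

For example, on a 10-block disk holding three 3-block files, deleting the middle file leaves four free blocks in total, yet a 4-block file cannot be created because the free blocks are not adjacent.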
Implementation Allocation Strategies • Linked List – Keep each file, and the free space, as a linked list of disk blocks. • The start of each block holds the address of the next block in the chain. • When a file is created or grows, the next free block is taken from the free list and linked to the file. • The file system just needs to keep track of the first block address.
Implementation Allocation Strategies • Linked List (cont’d) • Advantages • There will be no external disk fragmentation, since any free block can be used. • Management is simple • Disadvantages • File reading is effectively limited to sequential access • Random access is impractical: reaching block n means following n pointers from the start of the file.
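A toy model of linked allocation shows why random access is costly. The sketch keeps the per-block "next" pointers in a separate list for readability, although on a real disk they would sit at the start of each block; the class and method names are hypothetical.

```python
END = -1   # marks the end of a chain

class LinkedDisk:
    def __init__(self, n_blocks):
        self.data = [b""] * n_blocks
        # Initially every block is chained into one free list: 0 -> 1 -> 2 ...
        self.next = list(range(1, n_blocks)) + [END]
        self.free_head = 0

    def _take_free(self):
        if self.free_head == END:
            raise OSError("disk full")
        block = self.free_head
        self.free_head = self.next[block]
        self.next[block] = END
        return block

    def write_file(self, chunks):
        """Store one chunk per block; return the first block address."""
        first = prev = END
        for chunk in chunks:
            block = self._take_free()
            self.data[block] = chunk
            if prev == END:
                first = block
            else:
                self.next[prev] = block
            prev = block
        return first

    def read_block(self, first, n):
        """Return (data, hops) for the n-th block of a file, counting from 0."""
        block, hops = first, 0
        for _ in range(n):
            block = self.next[block]
            hops += 1
            if block == END:
                raise IndexError("past end of file")
        return self.data[block], hops
```

The hops value grows linearly with the block number, and on a real disk each hop is a separate disk read.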
Implementation Free Space Management • Bitmap • Keep a bitmap where each bit corresponds to a block on disk. • 1 – allocated • 0 – free CST 352 - Operating Systems
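A minimal bitmap sketch using the convention above (1 = allocated, 0 = free). The class name and the first-free search strategy are assumptions for the example.

```python
class BlockBitmap:
    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray((n_blocks + 7) // 8)   # one bit per block

    def is_allocated(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self):
        """Mark the first free block as allocated and return its number."""
        for block in range(self.n_blocks):
            if not self.is_allocated(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        return None                                  # disk is full

    def free(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
```

The bitmap costs one bit per block, so a disk with one million blocks needs about 122 KiB of bitmap.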
Implementation Popular File Systems • FAT (File Allocation Table – originally designed by Bill Gates and Marc McDonald at Microsoft) • NTFS (New Technology File System – Microsoft) • XFS (Silicon Graphics) • HFS+ (Hierarchical File System Plus – Apple) • ext2/ext3/ext4 (Linux, Android, and others) • ZFS (originally from Sun Microsystems; also available on FreeBSD) • UFS (Unix File System) – less popular today
Implementation Popular File Systems • File Allocation Table (FAT) – Keep the file pointers in a table in memory (an array implementation of a linked list). • A cluster is a group of sectors on the hard drive and is the unit in which file data is stored. • A 16 KB cluster has 32 sectors in it (512 × 32 = 16384). • Each cluster has an entry in the FAT. • FAT16 – Limited to 2^16 (65,536) entries (clusters), because each entry is 16 bits. • A file name maps to an entry in the FAT.
Implementation Popular File Systems • File Allocation Table (FAT) – Entry Structure:
FAT Code Range    Meaning
0000h             Available Cluster
0002h-FFEFh       Used, Next Cluster in File
FFF0h-FFF6h       Reserved Cluster
FFF7h             BAD Cluster
FFF8h-FFFFh       Used, Last Cluster in File
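The code ranges in the entry table are enough to follow a file's cluster chain. The sketch below models a FAT16 table as a Python list indexed by cluster number; the function names are hypothetical and the table contents are made up for the example.

```python
def is_last_cluster(entry):
    return 0xFFF8 <= entry <= 0xFFFF

def cluster_chain(fat, first_cluster):
    """Return the list of clusters making up one file, in order."""
    chain = []
    cluster = first_cluster
    while True:
        chain.append(cluster)
        if len(chain) > len(fat):
            raise ValueError("loop in FAT chain")
        entry = fat[cluster]
        if is_last_cluster(entry):
            return chain
        if not 0x0002 <= entry <= 0xFFEF:
            raise ValueError(f"unexpected FAT entry {entry:04X}h")
        cluster = entry            # the entry is the next cluster number

# Example table: a file starting at cluster 2 occupies clusters 2 -> 5 -> 3.
fat = [0x0000] * 10
fat[2], fat[5], fat[3] = 0x0005, 0x0003, 0xFFFF
```

Because the whole table is an array, each step of the chain is a memory lookup rather than a disk read.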
Implementation Popular File Systems • File Allocation Table (FAT) – Keep the file pointers in a table in memory (an array implementation of a linked list). • Advantages • File pointer access is fast because the table is in memory. • The entire block is available for data, since no pointer is stored inside it. • A file chain may be followed without accessing the disk.
Implementation Popular File Systems • File Allocation Table (FAT) • Disadvantages • The entire FAT must be in memory. • A 20 GB disk with 1 KB blocks requires 20 million entries. Each entry is 4 bytes (FAT32), so the table will occupy 80 MB of memory.
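The table size follows directly from the disk size, the block size, and the entry width. The short calculation below uses decimal units (20 GB = 20 × 10^9 bytes, 1 KB = 10^3 bytes) to match the round figure of 20 million entries, and shows both a 4-byte and a 3-byte entry for comparison.

```python
def fat_table_bytes(disk_bytes, block_bytes, entry_bytes):
    """Return (number of entries, table size in bytes)."""
    entries = disk_bytes // block_bytes
    return entries, entries * entry_bytes

entries, size_4_byte = fat_table_bytes(20 * 10**9, 10**3, 4)
_, size_3_byte = fat_table_bytes(20 * 10**9, 10**3, 3)
# entries     -> 20,000,000
# size_4_byte -> 80,000,000 bytes (80 MB)
# size_3_byte -> 60,000,000 bytes (60 MB)
```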
Implementation Popular File Systems • NTFS • Physical disk space is divided into clusters (like FAT). • MFT – 12% of the partition is set aside for the Master File Table. • The first 16 MFT files are special “housekeeping” files for NTFS (called metafiles).
Implementation Popular File Systems • NTFS • MFT – the common table of files. • It is the centralized directory of all other files on the disk, and of itself. • The MFT is divided into fixed-size records (usually 1 KB). • Each record corresponds to some file.
Implementation Popular File Systems • NTFS – Partition Layout [Figure: image not preserved]
Implementation Popular File Systems • NTFS – MFT Entry Structure [Figure: image not preserved]
Implementation Popular File Systems • NTFS – Metafiles
$MFT        The MFT itself
$MFTmirr    Copy of the first 16 MFT records, placed in the middle of the disk
$LogFile    Journaling support file
$Volume     Housekeeping information: volume label, file system version, etc.
$AttrDef    List of standard file attributes on the volume
$.          Root directory
$Bitmap     Volume free space bitmap
$Boot       Boot sector (bootable partition)
$Quota      File where users' rights on disk space usage are recorded (began to work only in NT5)
$Upcase     Table of correspondence between uppercase and lowercase letters in file names on the current volume. It is needed because NTFS stores file names in Unicode, which has about 65 thousand characters, so finding uppercase and lowercase equivalents is not trivial.