phones off (please)
CSCI2413 Lecture 8
Operating Systems: File systems
Lecture Outline
• File Principles
• File Allocation
  • Contiguous
  • Linked-list
  • Indexed
• File Allocation Table
• MS-DOS
• NTFS
• UNIX
CSCI2413 - L8
Basic File Principles
• To create a new file, a new directory entry is made in the logical file system
• To use a file for I/O, the directory structure has to be searched to find its attributes and data blocks
  • at each read/write this would entail high overhead
• A file is opened once at the start of use
  • the operating system copies the file information into a table in operating system memory
  • the file is then referred to by index into this table
  • in UNIX these indices are called file descriptors
  • in Windows / WinNT they are called file handles
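The open-once idea can be sketched as a toy table. This is an illustration only, not how any real kernel stores its tables; the class `OpenFileTable`, the `directory` dictionary and the file names are all hypothetical.

```python
class OpenFileTable:
    """Toy table mapping small integers to cached file information."""

    def __init__(self, directory):
        self.directory = directory  # file name -> attributes and data blocks
        self.entries = []           # position in this list is the descriptor

    def open(self, name):
        # The directory is searched once, at open time.
        info = self.directory[name]
        # Reuse the lowest free slot if there is one, otherwise grow the table.
        for fd, entry in enumerate(self.entries):
            if entry is None:
                self.entries[fd] = info
                return fd
        self.entries.append(info)
        return len(self.entries) - 1

    def blocks(self, fd):
        # Later reads and writes use the index only; no directory search.
        return self.entries[fd]["blocks"]

    def close(self, fd):
        self.entries[fd] = None


# Hypothetical directory contents, for illustration only.
directory = {
    "notes.txt": {"size": 1200, "blocks": [7, 8, 9]},
    "data.bin": {"size": 300, "blocks": [20]},
}
table = OpenFileTable(directory)
fd_notes = table.open("notes.txt")
fd_data = table.open("data.bin")
```

Here `fd_notes` and `fd_data` play the role of file descriptors (or handles): every later operation passes the small integer rather than the file name.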
Contiguous Allocation
• Each file is stored in a single group of adjacent blocks on the hard disk
  • simple to implement
  • excellent performance for file reads
• The same principles apply as in the contiguous allocation of primary memory
  • similar allocation algorithms can be used
  • e.g. first-fit or best-fit
• As with memory allocation, this scheme suffers from fragmentation problems
  • only worse, as de-fragmentation takes much longer!
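A minimal first-fit sketch, assuming the disk is modelled as a list of booleans (`True` meaning free). The function names and the 16-block disk are made up for illustration; the last allocation shows the fragmentation problem.

```python
def first_fit(free_map, size):
    """Return the start of the first run of `size` free blocks, or None."""
    run_start, run_len = None, 0
    for block, is_free in enumerate(free_map):
        if is_free:
            if run_len == 0:
                run_start = block
            run_len += 1
            if run_len == size:
                return run_start
        else:
            run_len = 0
    return None


def allocate(free_map, size):
    """Mark a contiguous run as used and return its start block, or None."""
    start = first_fit(free_map, size)
    if start is not None:
        for block in range(start, start + size):
            free_map[block] = False
    return start


disk = [True] * 16          # hypothetical 16-block disk, all free
a = allocate(disk, 4)       # takes blocks 0-3
b = allocate(disk, 6)       # takes blocks 4-9
for block in range(a, a + 4):
    disk[block] = True      # delete the first file, leaving a 4-block hole
c = allocate(disk, 5)       # the hole is too small, so the run starts later
d = allocate(disk, 5)       # 5 blocks are free in total, but not adjacent
```

After these calls five blocks are still free, yet `d` is `None` because no five of them are adjacent.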
Non-Contiguous Allocation
[Figure: the directory entry points ("first") to file block 1; each file block points ("next") to the following one, from file block 1 through file block 4, which is marked "end"]
• Keep each file as a linked-list of disk blocks
  • the directory entry points to the first block number
  • each block then points to the next block in the list
• This overcomes the allocation problem, but
  • random file access is very slow
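Why random access is slow can be seen in a small sketch. The `disk` dictionary, the `END` marker and the block numbers are hypothetical; each disk block is modelled as a pair of (data, next block number).

```python
END = -1  # hypothetical end-of-list marker

# block number -> (data, number of the next block in the file)
disk = {
    4: (b"AAAA", 9),
    9: (b"BBBB", 2),
    2: (b"CCCC", 11),
    11: (b"DD", END),
}
directory = {"file1.doc": 4}  # directory entry holds the first block number


def read_nth_block(name, n):
    """Return (data, blocks_read) for the n-th block (0-based) of a file."""
    block = directory[name]
    reads = 0
    for _ in range(n):
        # Each earlier block must be read just to find the next pointer.
        _, next_block = disk[block]
        reads += 1
        if next_block == END:
            raise IndexError("file has fewer blocks than requested")
        block = next_block
    data, _ = disk[block]
    reads += 1
    return data, reads
```

Reaching block n costs n + 1 block reads, so access time grows with the position in the file.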
Indexed Allocation
• Keep a list of all blocks allocated to a file in a special index table
  • an array of disk block numbers that comprise the file
  • non-allocated blocks have an index of a special value
  • usually null (zero)
• Non-contiguous storage allowed
  • random access is quick
• However, a lot of space is wasted on index entries that are never used
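A minimal sketch of an index table lookup, assuming an 8-slot index, 512-byte blocks and zero as the "not allocated" value. The block numbers and function names are invented for illustration.

```python
NULL = 0          # special value for a slot with no block allocated
BLOCK_SIZE = 512  # assumed block size in bytes

# Hypothetical index table for one file: 4 blocks in use, 4 slots unused.
index_block = [17, 3, 25, 9, NULL, NULL, NULL, NULL]


def block_for_offset(index_block, offset, block_size=BLOCK_SIZE):
    """Map a byte offset in the file to a disk block with one table lookup."""
    slot = offset // block_size
    if slot >= len(index_block) or index_block[slot] == NULL:
        raise IndexError("offset is beyond the allocated blocks")
    return index_block[slot]


def wasted_slots(index_block):
    """Count index entries that are reserved but hold no block number."""
    return sum(1 for block in index_block if block == NULL)
```

The lookup cost does not depend on the offset, which is why random access is quick; the unused slots are the wasted space.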
Allocation Illustrations
[Figure: three panels, each showing a directory above a disk of 32 blocks numbered 0 to 31]

contiguous
  name        start   size
  file1.doc   0       2
  file2.doc   5       5
  file3.doc   16      8

linked list
  name        start
  file1.doc   0
  file2.doc   1
  file3.doc   2

indexed
  name        index
  file1.doc   0
  file2.doc   8
  file3.doc   16
Partition Tables
• The very first sector on a hard drive is the primary boot sector, which also contains the partition table
• A partition is a collection of adjacent blocks on the hard drive that are grouped together
• The partition table defines each partition as a contiguously allocated run of blocks, given by start block and size
  • partitions are referred to as logical drives C:, D:, etc.
  • each partition can hold its own OS and so its own logical file system
  • e.g. MS-DOS/Win9x, Windows NT, Linux
Directory Entries

  field             size (bytes)
  file name         8
  ext               3
  attribute flags   1
  reserved          10
  time              2
  date              2
  start             2
  size              4

• MS-DOS/Win essentially uses the linked-list allocation method with one important refinement
  • the links are not held with the data blocks, but are kept in a special array, called the file allocation table (FAT)
  • blocks are referred to as clusters
• A directory entry consists of
  • the file name, other information and the start block
  • the root directory is held at a fixed location on the disk and is of a fixed size (e.g. 112 entries on some floppy formats, typically 512 on a FAT16 hard disk)
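The field widths above (8, 3, 1, 10, 2, 2, 2, 4) add up to a 32-byte entry, which can be unpacked with Python's `struct` module. This is a sketch under stated assumptions: little-endian fields, space-padded ASCII names, and a sample entry whose values are invented. The time and date fields are left as raw integers.

```python
import struct

# name(8) ext(3) attributes(1) reserved(10) time(2) date(2) start(2) size(4)
DIR_ENTRY = struct.Struct("<8s3sB10sHHHI")


def parse_entry(raw):
    """Unpack one 32-byte directory entry into a dictionary."""
    name, ext, attr, _reserved, time, date, start, size = DIR_ENTRY.unpack(raw)
    return {
        "name": name.decode("ascii").rstrip(),
        "ext": ext.decode("ascii").rstrip(),
        "attributes": attr,
        "time": time,
        "date": date,
        "start_cluster": start,
        "size": size,
    }


# Invented sample entry: FILE1.DOC, starting at cluster 5, 2048 bytes long.
sample = DIR_ENTRY.pack(b"FILE1   ", b"DOC", 0x20, bytes(10), 0, 0, 5, 2048)
entry = parse_entry(sample)
```

The `start_cluster` value is where the walk through the file allocation table begins.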
Clusters
• One sector is too small to adopt as the main unit of storage
• Sectors are grouped together into fixed-size units called CLUSTERS
• Typical cluster sizes (with 512-byte sectors):
  - 2048 bytes (2K, 4 sectors)
  - 4096 bytes (4K, 8 sectors)
  - 65536 bytes (64K, 128 sectors)
File Allocation - MS-DOS
• FAT: a table of pointers to clusters
• Each pointer points to the next cluster in which the file continues
• The cluster in which the end of file occurs has an entry of FFFF
• An empty cluster is given an entry of 0000
[Figure: a FAT with the start of a file marked and its chain of entries followed to the end-of-file entry]
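Following a file through the FAT can be sketched as below, using FFFF for end of file and 0000 for an empty cluster as stated above. The 12-entry table and its contents are invented, and details of a real FAT (such as reserved entries at the start of the table) are ignored.

```python
FREE = 0x0000  # entry for an empty cluster
EOF = 0xFFFF   # entry for the cluster holding the end of the file

# Toy FAT: a file starting at cluster 2 continues in 5, 6 and 9.
fat = [FREE] * 12
fat[2] = 5
fat[5] = 6
fat[6] = 9
fat[9] = EOF


def cluster_chain(fat, start):
    """Return the list of clusters of a file, in order, from its start."""
    chain = []
    cluster = start
    while True:
        chain.append(cluster)
        if len(chain) > len(fat):
            raise ValueError("cycle in FAT chain")
        next_cluster = fat[cluster]
        if next_cluster == EOF:
            return chain
        if next_cluster == FREE:
            raise ValueError("chain runs into an empty cluster")
        cluster = next_cluster


chain = cluster_chain(fat, 2)
free_clusters = fat.count(FREE)
```

Because the links live in one array rather than in the data blocks, the whole chain can be followed without touching the data clusters themselves.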
FAT System
• How many clusters can we have on a disk?
• What is the maximum amount of usable disk space?
• The FAT can be selected as 16-bit or 32-bit
• 16-bit FAT gives 65536 (64K) entries
  • the file system is compatible with MS-DOS
  • 64K entries × 32 KB clusters = 2 GB maximum partition size
• 32-bit FAT gives about 4 × 10^9 entries
  • in practice FAT32 uses only 28 of the 32 bits, about 2.7 × 10^8 entries
  • extremely large maximum partition size!
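The two questions on this slide come down to one multiplication: number of FAT entries times cluster size. The sketch below is an upper bound only; it assumes every entry value addresses a usable cluster, whereas a real FAT reserves a few values. The function name is made up for illustration.

```python
KB = 1024
GB = 1024 ** 3
TB = 1024 ** 4


def max_partition_bytes(entry_bits, cluster_bytes):
    """Upper bound on partition size: one cluster per possible FAT entry."""
    return (2 ** entry_bits) * cluster_bytes


fat16_limit = max_partition_bytes(16, 32 * KB)  # 64K entries x 32 KB clusters
fat32_limit = max_partition_bytes(32, 4 * KB)   # full 32-bit entries, 4 KB clusters
```

With 16-bit entries and 32 KB clusters the bound is 2 GB; with full 32-bit entries even 4 KB clusters would give 16 TB.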
NTFS
• The shortcomings of the (V)FAT structure are overcome by NTFS:
  • full access protection security
  • large capacity allocation scheme
  • internationalisation using long Unicode file names
  • robust fault-tolerance with transaction logging
• NTFS uses 64-bit disk addresses, supporting disk partitions of up to 2^64 bytes
NTFS Structures
• NTFS uses a complex index allocation scheme
  • each NTFS volume is organised as a linear sequence of blocks (clusters); block size can be 512 bytes to 64 KB
  • the principal component is the master file table (MFT)
  • the MFT is a variable-size file that holds records describing every file and directory in a volume
  • a mirror (copy) of the MFT is kept near the middle of the volume
File Space Allocation – Unix Systems
Unix systems have a pool of fixed-size data structures called i-nodes. When a file is created, it is given an i-node which summarises all information about the file.

I-Node Structure

  Field                                    Example data
  Owner of file                            p03456789
  Group Id (group to which user belongs)   students
  File type                                regular
  Permissions                              rwxr-xr-x
  Last accessed                            25 Nov 2004 18.30
  Last modified                            20 Nov 2004 09.00
  Last I-Node modification                 20 Nov 2004 09.00
  File size                                56348 bytes
  No. of links (how many aliases)          2
  Pointers to disk allocations             13
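Most of these i-node fields can be read from a running system with Python's `os.stat`. The sketch below assumes a Unix-like system (on Windows some fields have different meanings); the helper name `inode_summary` and the temporary file are illustrative, and the block pointers themselves are not exposed by this call.

```python
import os
import stat
import tempfile


def inode_summary(path):
    """Collect the i-node fields that os.stat exposes for a file."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        "type_and_permissions": stat.filemode(st.st_mode),
        "last_accessed": st.st_atime,
        "last_modified": st.st_mtime,
        "inode_changed": st.st_ctime,
        "size": st.st_size,
        "links": st.st_nlink,
    }


# Create a scratch file of the size used in the example table and inspect it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "wb") as f:
        f.write(b"x" * 56348)
    summary = inode_summary(path)

# A regular file with permissions rwxr-xr-x has mode bits 0o100755.
example_mode = stat.filemode(0o100755)
```

`stat.filemode` renders the file type and permission bits in the same `rwxr-xr-x` notation as the table, with a leading `-` for a regular file.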
File Space Allocation – Unix Systems
• Pointers point to disk blocks allocated to the file.
• Disk blocks are typically 512 bytes
• Pointers may point directly to a block or else point to index blocks which contain another set of pointers
UNIX I-node
• The i-node is organised as 13 3-byte addresses:
  • 10 addresses point to the first 10 data blocks of the file.
  • The last 3 addresses point to blocks on disk that contain the next portion of the index, which in turn contain pointers to succeeding blocks in the file.
I-node Structure
[Figure: an i-node holding the file attributes, ten direct pointers to data blocks 1 to 10, and single indirect, double indirect and triple indirect pointers. The single indirect index covers data blocks 11 to 266, the double indirect index covers blocks 267 to 65802, and the triple indirect index starts at block 65803 ("huge!")]
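The block numbers in the figure (11, 267 and 65803 as the first block of each indirect level) are consistent with 256 pointers per index block. Taking that as an assumption, the level needed to reach a given data block can be sketched as follows; the function and constant names are invented.

```python
DIRECT = 10  # direct pointers in the i-node
PTRS = 256   # pointers per index block (assumed, to match the figure)

# Total number of data blocks reachable through all 13 pointers.
MAX_BLOCKS = DIRECT + PTRS + PTRS ** 2 + PTRS ** 3


def pointer_level(block_no):
    """Return which kind of pointer reaches data block `block_no` (1-based)."""
    if block_no < 1:
        raise ValueError("block numbers start at 1")
    n = block_no - 1
    if n < DIRECT:
        return "direct"
    n -= DIRECT
    if n < PTRS:
        return "single indirect"
    n -= PTRS
    if n < PTRS ** 2:
        return "double indirect"
    n -= PTRS ** 2
    if n < PTRS ** 3:
        return "triple indirect"
    raise ValueError("block number exceeds the maximum file size")
```

Small files are reached through the direct pointers alone, so the extra disk reads for index blocks are only paid by large files.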
Summary
• MS-DOS systems used the FAT method of managing files.
• FAT-based systems had many deficiencies; the most important of these was the lack of security features.
• Early versions of Windows were merely interfaces to the basic MS-DOS OS using FAT.
• NTFS replaced FAT in Windows NT and in its successors Windows 2000 and Windows XP; Windows 95 and the other Win9x systems continued to use FAT (VFAT).
• NTFS has many advantages over FAT and is one of the key components of the Windows NT OS.
• The Unix file system uses three main tables: the file descriptor table, the open file description table and the i-node table. The i-node table is the most important of these, containing all the administrative information about a file and the location of its blocks.
• Protection is based on controlling read, write and execute access for the owner, group and others.
Appendices ….
• A file allocation table (FAT) is a table that an operating system maintains on a hard disk that provides a map of the clusters (the basic units of logical storage on a hard disk) that a file has been stored in.
• When you write a new file to a hard disk, the file is stored in one or more clusters that are not necessarily next to each other; they may be rather widely scattered over the disk.
• The operating system creates a FAT entry for the new file that records where each cluster is located and the clusters' sequential order. When you read a file, the operating system reassembles the file from its clusters and places it as an entire file where you want to read it.
• Virtual File Allocation Table (VFAT) is the part of the Windows 95 and later operating systems that handles long file names, which otherwise could not be handled by the original file allocation table programming.
….
• NTFS (NT file system; sometimes New Technology File System) is the file system that the Windows NT operating system uses for storing and retrieving files on a hard disk. NTFS is the Windows NT equivalent of the Windows 95 file allocation table (FAT) and the OS/2 High Performance File System (HPFS). However, NTFS offers a number of improvements over FAT and HPFS in terms of performance, extendibility, and security.
• Notable features of NTFS include:
  • Use of a B-tree directory scheme to keep track of file clusters
  • Information about a file's clusters and other data is stored with each cluster, not just in a governing table (as in FAT)
  • Support for very large files (up to 2 to the 64th power bytes, about 1.8 × 10^19 bytes or 16 exabytes)
  • An access control list (ACL) that lets a server administrator control who can access specific files
  • Integrated file compression
  • Support for names based on Unicode
  • Support for long file names
  • Data security on both removable and fixed disks
How NTFS Works
• When a hard disk is formatted (initialized), it is divided into partitions or major divisions of the total physical hard disk space. Within each partition, the operating system keeps track of all the files that are stored by that operating system. Each file is actually stored on the hard disk in one or more clusters or disk spaces of a predefined uniform size. Using NTFS, the sizes of clusters range from 512 bytes to 64 kilobytes. Windows NT provides a recommended default cluster size for any given drive size. For example, for a 4 GB (gigabyte) drive, the default cluster size is 4 KB (kilobytes). Note that clusters are indivisible. Even the smallest file takes up one cluster and a 4.1 KB file takes up two clusters (or 8 KB) on a 4 KB cluster system.
• The selection of the cluster size is a trade-off between efficient use of disk space and the number of disk accesses required to access a file. In general, using NTFS, the larger the hard disk the larger the default cluster size, since it's assumed that a system user will prefer to increase performance (fewer disk accesses) at the expense of some amount of space inefficiency.
• When a file is created using NTFS, a record about the file is created in a special file, the Master File Table (MFT). The record is used to locate a file's possibly scattered clusters. NTFS tries to find contiguous storage space that will hold the entire file (all of its clusters).
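The cluster arithmetic above (a 4.1 KB file occupying two 4 KB clusters) can be checked with a short sketch. It follows the text's statement that even the smallest file takes one cluster; the function names are made up for illustration.

```python
def clusters_needed(file_bytes, cluster_bytes):
    """Whole clusters occupied by a file; at least one, as clusters are indivisible."""
    whole = -(-file_bytes // cluster_bytes)  # integer ceiling division
    return max(1, whole)


def slack_bytes(file_bytes, cluster_bytes):
    """Allocated space that the file does not actually use."""
    return clusters_needed(file_bytes, cluster_bytes) * cluster_bytes - file_bytes


CLUSTER = 4 * 1024          # 4 KB clusters
file_size = 4199            # roughly 4.1 KB, expressed in bytes
used = clusters_needed(file_size, CLUSTER)
wasted = slack_bytes(file_size, CLUSTER)
```

Larger clusters mean fewer disk accesses per file but more slack per file, which is the trade-off described in the second bullet.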