Phones OFF Please
File Systems
Parminder Singh Kang
Home: www.cse.dmu.ac.uk/~pkang
Email: pkang@dmu.ac.uk
Introduction
• File systems are needed:
  • because main memory is not big enough
  • to maintain permanent copies of information
• A file system should be device independent, so that programs can use the same commands with different devices.
• Disks provide the bulk of the secondary storage on which a file system is maintained.
• The advantages of using disks are:
  • Data can be rewritten in the same place.
  • Any given block of information can be accessed directly.
  • Transferring data in units of blocks, instead of byte by byte, improves the efficiency and performance of operations.
1 File Subsystem
• Provides users/applications with a logical interface:
  • imposes a uniform structure on storage, typically a hierarchical directory structure
  • refers to elements by meaningful names
  • specifies operations on storage in application terms, e.g. read a real number
• Maps the logical organisation onto the physical storage media
• Hides the details of the physical organisation
• Using device drivers allows all I/O to be treated alike
1.1 Device drivers
• A transfer from/to a peripheral requires a series of steps:
  • Check current status
  • Initiate status change
  • Request transfer
  • Receive notification of completion
1.2 File Subsystem requirements
• Specifying logical file characteristics
• Cataloguing/locating files
• Mapping between the logical and physical organisation
• Supporting file operations
• Controlling access
1.3 File System Structure
• I/O Control:
  • Consists of device drivers and interrupt handlers that transfer information between main memory and the disk system.
  • Its basic function is to access a specific location on the device.
• Basic File System:
  • Its main function is to issue generic commands to the appropriate device driver to read and write physical blocks on disk.
  • It works in terms of a physical address space: each block is identified by a numeric disk address (e.g. drive, cylinder, track, sector).
• File Organization Module:
  • Keeps track of the file allocation method in use and of the mapping between logical and physical blocks.
  • Knowing the type of file allocation and the physical location of the file, it translates logical block addresses to physical addresses.
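The translation done by the file organization module and the basic file system can be sketched as follows. This is a minimal illustration, not how any particular system does it: it assumes contiguous allocation (the simplest case), and the disk geometry constants and function names are hypothetical.

```python
from typing import NamedTuple


class DiskAddress(NamedTuple):
    cylinder: int
    track: int   # track (surface) within the cylinder
    sector: int


# Hypothetical geometry, chosen only for illustration.
TRACKS_PER_CYLINDER = 4
SECTORS_PER_TRACK = 16


def logical_to_physical_block(start_block, length, logical_block):
    """File organization module: map a file-relative block number to a
    disk block number, assuming the file occupies `length` contiguous
    blocks starting at `start_block`."""
    if not 0 <= logical_block < length:
        raise ValueError("logical block is outside the file")
    return start_block + logical_block


def block_to_disk_address(block):
    """Basic file system: turn a linear block number into the
    (cylinder, track, sector) address handed to the device driver."""
    sectors_per_cylinder = TRACKS_PER_CYLINDER * SECTORS_PER_TRACK
    cylinder, rest = divmod(block, sectors_per_cylinder)
    track, sector = divmod(rest, SECTORS_PER_TRACK)
    return DiskAddress(cylinder, track, sector)


# Logical block 3 of a 10-block file that starts at disk block 100.
physical = logical_to_physical_block(100, 10, 3)
address = block_to_disk_address(physical)
```

With linked or indexed allocation only `logical_to_physical_block` would change; the layers below it stay the same, which is the point of the layering.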
Layers, from top to bottom: Application programs → Logical File System → File-Organization Module → Basic File System → I/O Control → Devices
• Logical File System:
  • Contains metadata (file structure information, i-node information and file control block information).
  • All of this information is managed through the file control block (FCB). The FCB includes information about the file name, i-nodes, permissions, location of the file contents, etc.
1.4 File System Structure Implementation
• A file system is implemented by structures held both on disk and in memory.
• The implementation varies with the operating system and the file system in use.
• On Disk Implementation:
  • Disk Label (VTOC)
  • Boot Control Block
  • Primary Super Block
  • Backup Super Block
  • i-nodes
• In Memory Implementation:
  • Contains information about each mounted partition.
  • The in-memory directory structure holds information about recently accessed directories.
  • The open file table contains a copy of the i-node for each open file.
• Question: what do the on-disk and in-memory file system implementations contain, and why are both needed?
2. UNIX file characteristics
• Structure
  • A file is a flat sequence of bytes
  • UNIX imposes no structure
  • Other operating systems sometimes do
• Naming conventions
  • A name is a sequence of characters
  • Maximum length can vary
• Type
  • UNIX does not infer type from the name
  • UNIX has several file types:
    • Regular
    • Directories
    • Links (hard or soft)
    • Device
• Organisation
  • Defines whether the file is accessed sequentially or randomly
  • Both are supported by the current offset pointer
• Access
  • Defines who can do what to the file
  • Records when it was last done
3. Physical Storage
• Physical storage is mainly:
  • hard disks
  • CD-ROM/DVD
  • floppy disks
  • magnetic tape (backup)
4. UNIX File System
• tree structured
• only one tree (one root)
• may span multiple disks
• i.e. it uses a device-independent hierarchical structure, which is regarded as a tree:
  • root
    • user
      • user files
    • bin, etc, dev, usr
• Device file systems can be attached to the tree using the program /etc/mount
• Once a device is mounted, its files can be accessed by directory name; the device does not have to be known, e.g.
  • cp /user/test.dat mytest.dat
  copies a file to the current directory.
• This has the advantage that a file system can be moved to a different device without modifying the programs that use it.
• MS-DOS and Windows, on the other hand, are not device independent, i.e. one has to use device names, e.g.
  • copy a:test.dat c:\user
4.1 Types of file
• UNIX has:
  • regular files: users' programs and data, etc.
  • directories
  • special files: I/O devices, e.g. /dev/tty and /dev/hd1
• MS-DOS/Windows has:
  • regular files: users' programs and data, etc.
  • directories
  • special files: prn:, con:, com1:, etc.
4.2 File names
• Standard UNIX allows up to 14 characters in a filename, with combinations of name and extension as required, e.g. test.data.
• In UNIX the extension is solely for the programmer's convenience, i.e. to identify types of files at a logical level, e.g. a user may end all data files with .data
4.3 Using files
• Two basic operations are needed: read and write.
• At program level one usually has a set of language, library or system calls to access files.
4.4 Directories (only the OS can write into a directory. Justify?)
• A directory is a special system file which contains details of other files.
• Directories provide a logical interface that lets the user keep track of files.
• Simple file systems, e.g. CP/M, have one directory per device:
  • these become large with many users
  • name conflicts can occur
• Alternatives are to have one directory per user (RSX) or many directories per user (UNIX, MS-DOS, Windows).
• With many directories, a file can be named by:
  • an absolute path name (from the root), e.g. on UNIX /usr1/stf/bb/public/opsys/progs/pipe1.cc
  • a path name relative to the current directory, e.g. network/programs/terminal or ../../test_data/test.data
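As a small illustration of how a relative path name is interpreted against the current directory, the sketch below uses Python's standard `posixpath` module. The current directory chosen here is an assumption made for the example; it is simply the directory of the absolute path quoted above.

```python
import posixpath


def to_absolute(current_dir, path):
    """Turn a path into an absolute path the way a UNIX-style system
    would read it: relative names are taken from current_dir, and
    "." and ".." components are folded away.

    Note: normpath works purely on the text of the path. It does not
    look at the disk, so it ignores symbolic links."""
    if posixpath.isabs(path):
        return posixpath.normpath(path)
    return posixpath.normpath(posixpath.join(current_dir, path))


# Assumed current directory, for illustration only.
cwd = "/usr1/stf/bb/public/opsys/progs"

absolute_unchanged = to_absolute(cwd, "/usr1/stf/bb/public/opsys/progs/pipe1.cc")
relative_down = to_absolute(cwd, "network/programs/terminal")
relative_up = to_absolute(cwd, "../../test_data/test.data")
```

Each ".." moves one level towards the root, so `../../test_data/test.data` climbs from `progs` through `opsys` to `public` before descending again.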
4.5 File structure
• At the lowest (physical) level one reads/writes blocks from/to a device.
• At the logical level one reads/writes records, where a record may be anything from a byte (read a character) to N bytes (read a large structure).
4.6 Disk space management
• The surface of a disk is divided into a number of concentric tracks, each of which is divided into a number of sectors.
• The unit of reading/writing from/to a disk is a block (a block may be anything from a sector to a track).
• Disk I/O transfer time is made up of the following components (some of which depend on the rotation speed of the disk):
  • start-up time (for floppies) ~ 0.5 sec
  • track seek time (time to move the head to the required track) ~ 5-20 msec for a hard disk
  • latency (time for the required sector to appear under the head) ~ 0.5-5 msec
  • transfer time ~ 0.05-0.1 msec for a 1 Kbyte block
• The larger the block size, the faster the transfer rate, but the more space is wasted.
• A disk cache can improve transfer rates significantly.
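The trade-off between block size, transfer rate and wasted space can be worked through numerically. The sketch below picks mid-range values from the figures quoted above (10 msec seek, 3 msec latency, 0.1 msec per Kbyte) and assumes, as a simplification, that transfer time grows linearly with block size. The function names are made up for this example.

```python
def access_time_ms(seek_ms, latency_ms, transfer_ms_per_kb, block_kb):
    """Time to fetch one block: seek + rotational latency + transfer."""
    return seek_ms + latency_ms + transfer_ms_per_kb * block_kb


def throughput_kb_per_s(seek_ms, latency_ms, transfer_ms_per_kb, block_kb):
    """Effective transfer rate when every block needs its own seek."""
    total_ms = access_time_ms(seek_ms, latency_ms, transfer_ms_per_kb, block_kb)
    return block_kb / (total_ms / 1000.0)


def wasted_bytes(file_size, block_size):
    """Unused space in the last block of a file (internal fragmentation)."""
    return (-file_size) % block_size


# Assumed mid-range figures, for illustration only.
SEEK_MS, LATENCY_MS, TRANSFER_MS_PER_KB = 10.0, 3.0, 0.1

for block_kb in (1, 4, 8):
    rate = throughput_kb_per_s(SEEK_MS, LATENCY_MS, TRANSFER_MS_PER_KB, block_kb)
    waste = wasted_bytes(2500, block_kb * 1024)
    print(f"{block_kb} Kbyte blocks: {rate:.0f} Kbyte/s, "
          f"{waste} bytes wasted for a 2500-byte file")
```

Seek and latency dominate the cost of each access, so a larger block spreads that fixed cost over more data, while a small file leaves more of its last block unused.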
4.7 Disk Partitions
• To assist in the overall organisation of the file system, large disks can be split into logical disks by partitions:
  • Boot block (master boot record)
  • Partition 1
  • Partition 2, etc.
• Each partition can be either “raw” (containing no file system) or “cooked” (containing a file system).
• A raw partition is used where no file system is appropriate. For example, UNIX swap space uses a raw partition, as it uses its own format and does not use a file system.
• Boot information can be stored in a separate partition. (Why?)
• The root partition, which contains the operating system kernel and system files, is mounted at boot time.
5 Unix File storage
• A file system needs to keep track of:
  • where every file is stored
  • details of each file
  • where the next block of a file is
  • which blocks are free
  • which blocks are in use
5.1 Allocating Space to Files
• Allocate a contiguous sequence of blocks
  • leaves unused small areas
  • problem with file growth
• Linked list of blocks
  • problem with random access
• Index blocks (UNIX uses a version of this)
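The random access problem can be made concrete by counting how many pointers must be read to find the k-th block of a file under each scheme. This is a toy model with made-up block numbers; the dictionaries and lists stand in for on-disk structures.

```python
def contiguous_lookup(start_block, k):
    """Contiguous allocation: the address is computed, nothing is read."""
    return start_block + k, 0


def linked_lookup(next_block, first_block, k):
    """Linked allocation: follow k 'next' pointers from the first block.
    Returns (block number, number of pointers read)."""
    block = first_block
    reads = 0
    for _ in range(k):
        block = next_block[block]
        reads += 1
    return block, reads


def indexed_lookup(index_block, k):
    """Indexed allocation: one read of the index entry for block k."""
    return index_block[k], 1


# A 4-block file stored in disk blocks 7, 2, 9 and 4 (hypothetical layout).
chain = {7: 2, 2: 9, 9: 4}     # each block's pointer to the next one
index = [7, 2, 9, 4]           # the same file described by an index block

linked_result = linked_lookup(chain, 7, 3)
indexed_result = indexed_lookup(index, 3)
contiguous_result = contiguous_lookup(20, 3)
```

With linked allocation the cost of reaching a block grows with its position in the file, which is why an index is preferred when random access matters.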
5.2 UNIX file storage structure
• Only the O/S can write to a directory
• Each entry has an i-node number and a name:
┌───────────────┬───────────┐
│ i-node number │ file name │
└───────────────┴───────────┘
• The command df (disk free) will tell you how many i-nodes are free (on many current systems this needs df -i).
• An i-node contains the following information on the file:
  • file mode (indicates type of file: normal file, special file, etc.)
  • number of links to the file (e.g. from other directories)
  • owner's user id
  • owner's group id
  • access permissions for each user type, e.g. read, write, and execute
  • file size in characters
  • times of last access, last modification and last i-node change
  • addresses of the first 10 blocks (a file of up to 10 blocks is addressed entirely from the i-node)
  • single indirect, double indirect and triple indirect block addresses
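Most of the i-node fields listed above can be inspected from a program through the `stat` system call. The sketch below uses Python's standard `os.stat`; the helper name `describe_inode` and the file name are made up for the example, and the block addresses themselves are not exposed through this interface.

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Collect the i-node information that stat() makes available."""
    st = os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        kind = "directory"
    elif stat.S_ISREG(st.st_mode):
        kind = "regular"
    else:
        kind = "other"
    return {
        "inode": st.st_ino,                        # i-node number
        "type": kind,                              # from the file mode
        "permissions": stat.filemode(st.st_mode),  # e.g. -rw-r--r--
        "links": st.st_nlink,                      # number of links
        "uid": st.st_uid,                          # owner's user id
        "gid": st.st_gid,                          # owner's group id
        "size": st.st_size,                        # size in bytes
        "accessed": st.st_atime,                   # last access
        "modified": st.st_mtime,                   # last modification
        "changed": st.st_ctime,                    # last i-node change (on UNIX)
    }


# Create a small throwaway file and look at its i-node.
with tempfile.TemporaryDirectory() as tmp:
    example_path = os.path.join(tmp, "x.data")
    with open(example_path, "wb") as f:
        f.write(b"hello")
    info = describe_inode(example_path)
```

Note that the file name is not among the fields: it lives in the directory entry, not in the i-node.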
5.3 Locating a file
• Locating a file's data requires repeating the following loop along the path:
  • i-node → data → i-node → data ...
• Note:
  • The root i-node is always i-node 2
  • Each directory has entries for . (current directory) and .. (parent directory)
Figure: locating x.data through the directories root, user, staff and sam. Each directory entry gives an i-node number, and the location field of that i-node points to the next directory or, at the end, to the file:
root dir [user] → user i-node [location] → user dir [staff] → staff i-node [location] → staff dir [sam] → sam i-node [location] → sam dir [x.data] → x.data i-node [location] → file
The sequence of events is:
• the root directory is searched for the user directory file entry
• the i-node number is extracted and the location of the user directory found from the i-node
• the user directory is searched for the staff directory file entry
• the i-node number is extracted and the location of the staff directory found from the i-node
• the staff directory is searched for the sam directory file entry
• the i-node number is extracted and the location of the sam directory found from the i-node
• the sam directory is searched for the x.data file entry
• the i-node number is extracted and the locations of the x.data file blocks found from the i-node
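The sequence above can be sketched as a loop over a toy i-node table. Everything here is a simplification for illustration: the i-node numbers other than the root are invented, and each i-node's "location" is collapsed into a Python object (a dictionary for a directory, bytes for a file) instead of a list of disk block addresses.

```python
ROOT_INODE = 2  # the root i-node number stated in the notes

# i-node number -> (type, contents). Numbers other than 2 are made up.
inodes = {
    2: ("dir", {"user": 5}),
    5: ("dir", {"staff": 9}),
    9: ("dir", {"sam": 14}),
    14: ("dir", {"x.data": 21}),
    21: ("file", b"sample contents"),
}


def resolve(path):
    """Walk an absolute path one component at a time and return the
    i-node number of the final component."""
    inode_no = ROOT_INODE
    for name in path.split("/"):
        if not name:
            continue  # skip the empty parts produced by leading/double slashes
        kind, contents = inodes[inode_no]      # read the i-node, find its data
        if kind != "dir":
            raise NotADirectoryError(name)
        if name not in contents:               # search the directory
            raise FileNotFoundError(path)
        inode_no = contents[name]              # extract the i-node number
    return inode_no


file_inode = resolve("/user/staff/sam/x.data")
file_data = inodes[file_inode][1]
```

Each pass through the loop is one "i-node, then data" step, so a path with four components costs four directory searches before the file's own i-node is reached.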
5.4 The i-node and data addressing
• The addresses of a file's data blocks are stored in the i-node:
  • 10 direct pointers
  • 1 single indirect pointer to a block of addresses
  • 1 double indirect pointer to a block which contains pointers to blocks of addresses
  • Some systems have a third-level (triple indirect) pointer.
5.4.1 How big should the pointers be?
• Early versions of Unix used 16-bit pointers.
  • With 1K blocks this limited the largest disk to about 64 Mbytes (65,536 blocks x 1 Kbyte).
• Later versions of Unix used 32-bit pointers and 4K (or 8K) blocks.
  • This gives a maximum file/disk size of 4 Kbytes x 4G blocks = 16 Tbytes.
• Modern versions of Unix use 64-bit pointers.
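The limits quoted above follow from simple arithmetic, sketched here. The function names are made up; the only assumptions are that an indirect block is completely filled with pointers and that block numbers use the full pointer width.

```python
def max_addressable_bytes(pointer_bits, block_size):
    """Largest disk a block pointer can cover: one block per pointer value."""
    return 2 ** pointer_bits * block_size


def max_file_blocks(block_size, pointer_bytes, direct=10, indirect_levels=3):
    """Blocks reachable from one i-node: the direct pointers plus one
    single, double, ... indirect tree per level."""
    per_block = block_size // pointer_bytes   # pointers held in one block
    return direct + sum(per_block ** level
                        for level in range(1, indirect_levels + 1))


MBYTE = 2 ** 20
TBYTE = 2 ** 40

early_limit = max_addressable_bytes(16, 1024)    # 16-bit pointers, 1K blocks
later_limit = max_addressable_bytes(32, 4096)    # 32-bit pointers, 4K blocks

# 1K blocks with 4-byte pointers, direct + single + double indirect only.
blocks_two_levels = max_file_blocks(1024, 4, indirect_levels=2)
```

Here `early_limit` is 64 Mbytes and `later_limit` is 16 Tbytes, matching the figures in the notes. The second function shows why the indirect levels matter: with 256 pointers per block, the double indirect pointer alone reaches 65,536 blocks.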
5.4.2 File Operations
• Opening an existing file
• Creating a file
• Reading from a file
• Writing to a file
• Closing a file
• Deleting a file
• Changing access permissions
• Renaming a file
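Each operation in this list corresponds to a system call. As a sketch, the sequence below exercises them through Python's `os` module, which wraps the underlying calls fairly directly. The file names are hypothetical and the permission check assumes a UNIX-like system.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "x.data")       # hypothetical file names
    renamed = os.path.join(tmp, "y.data")

    # Create the file (O_EXCL makes this fail if it already exists).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    written = os.write(fd, b"hello")         # write
    os.close(fd)                             # close

    fd = os.open(path, os.O_RDONLY)          # open an existing file
    data = os.read(fd, 100)                  # read up to 100 bytes
    os.close(fd)

    os.chmod(path, 0o400)                    # change access permissions
    mode_after_chmod = stat.S_IMODE(os.stat(path).st_mode)

    os.rename(path, renamed)                 # rename
    exists_after_rename = os.path.exists(renamed)

    os.remove(renamed)                       # delete
    exists_after_delete = os.path.exists(renamed)
```

Renaming and deleting only change directory entries, which is why they work here even after the file itself has been made read-only.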
5.4.3 Links in Unix
• Each file has one i-node but may have many directory entries
• Each name entry is a link to an i-node
• Links may be hard or soft
• Hard links
  • each directory entry points directly at the same i-node
  • the i-node maintains a count of the links to it
  • only operate within a single device
• Soft links
  • a special file containing the path to the target file
  • has a separate i-node
  • can span devices
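The difference between the two kinds of link can be observed directly. The sketch below uses `os.link` and `os.symlink` from Python's standard library in a throwaway directory; the file names are hypothetical, and it assumes a UNIX-like system where both calls are permitted.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "x.data")
    hard = os.path.join(tmp, "hard.data")
    soft = os.path.join(tmp, "soft.data")

    with open(target, "w") as f:
        f.write("hello")

    os.link(target, hard)       # hard link: a second name for the same i-node
    os.symlink(target, soft)    # soft link: a new file holding the target path

    same_inode = os.stat(target).st_ino == os.stat(hard).st_ino
    link_count = os.stat(target).st_nlink
    soft_is_link = os.path.islink(soft)
    # lstat looks at the link itself instead of following it.
    soft_has_own_inode = os.lstat(soft).st_ino != os.stat(target).st_ino
    soft_points_to = os.readlink(soft)

    # Remove the original name: the data survives through the hard link,
    # but the soft link is left pointing at a name that no longer exists.
    os.remove(target)
    with open(hard) as f:
        hard_still_readable = f.read() == "hello"
    soft_dangling = not os.path.exists(soft)
```

The file's data is only released when the link count in the i-node drops to zero, which is why removing one name leaves the hard link intact.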
5.5 Efficiency and Performance
• Unix sets aside a large area of memory as a buffer cache.
• As blocks are read they are stored in the cache; reading the next block can go on while the current block is being processed.
• Question: is the cache sufficiently large or not?
• Further improvement comes from delayed write (which can be a problem if the system crashes):
  • i-nodes are written back immediately
  • written data blocks are flushed after a few seconds
  • a written block is queued for writing to disk and marked "delayed write"
  • the block can be modified further before it reaches the head of the list, when it is then written. This is useful if the file is deleted before the block is written.
• Eventually the cache fills:
  • the block that was accessed longest ago is flushed
• Read ahead improves efficiency.
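A buffer cache with delayed write and least-recently-used replacement can be sketched in a few lines. This is a toy model, not the UNIX implementation: the class and method names are invented, a dictionary stands in for the disk, and the timed flush and read ahead are left out.

```python
from collections import OrderedDict


class BufferCache:
    """Toy buffer cache: delayed writes, least recently used eviction."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk              # dict: block number -> data (the "device")
        self.blocks = OrderedDict()   # block number -> (data, dirty), oldest first
        self.disk_writes = 0

    def read(self, block_no):
        if block_no in self.blocks:               # cache hit
            self.blocks.move_to_end(block_no)
            return self.blocks[block_no][0]
        data = self.disk[block_no]                # cache miss: go to the disk
        self._insert(block_no, data, dirty=False)
        return data

    def write(self, block_no, data):
        # Delayed write: only the cached copy changes for now.
        self._insert(block_no, data, dirty=True)

    def _insert(self, block_no, data, dirty):
        self.blocks[block_no] = (data, dirty)
        self.blocks.move_to_end(block_no)
        while len(self.blocks) > self.capacity:
            # Evict the block that was accessed longest ago.
            old_no, (old_data, old_dirty) = self.blocks.popitem(last=False)
            if old_dirty:
                self._write_back(old_no, old_data)

    def _write_back(self, block_no, data):
        self.disk[block_no] = data
        self.disk_writes += 1

    def sync(self):
        """Flush every delayed-write block to the disk."""
        for block_no, (data, dirty) in list(self.blocks.items()):
            if dirty:
                self._write_back(block_no, data)
                self.blocks[block_no] = (data, False)


disk = {0: b"a", 1: b"b", 2: b"c"}
cache = BufferCache(capacity=2, disk=disk)
cache.write(0, b"x")
cache.write(0, b"y")             # second change absorbed by the cache
disk_before_eviction = disk[0]   # the disk has not been touched yet
cache.read(1)
cache.read(2)                    # cache is full: block 0 is evicted and written
```

Two writes to block 0 cost a single disk write, which is the benefit of delaying. The risk mentioned in the notes is also visible: until the eviction, the disk still held the old data.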
6 Log Structured File Systems
• CPUs are getting faster
• Memory is getting faster
• Disks are getting bigger, but not much faster
• This creates a bottleneck in the file system, especially for large file servers (Solution?)
• Log structured file systems:
  • Most accesses are to the cache
  • Writes slow the system (each is a small quantity of data)
  • Disks operate most efficiently with large writes (one or more tracks), therefore:
    • collect writes together and write them all at once as a log record
    • if the record is big (~ 1 Mbyte) the disk will operate efficiently
  • A record contains i-nodes, directories and data mixed together.
  • A table is needed to keep track of where every i-node is. It is kept in memory and on disk.
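The idea of batching small writes into one large log record, with a table recording where each i-node currently lives, can be sketched as follows. This is a toy model with invented names: a Python list stands in for the disk, each entry pairs an i-node number with its data, and the record size is counted in entries instead of bytes.

```python
class LogFS:
    """Toy log-structured store: writes are buffered, then appended to
    the log in one batch; a map records each i-node's latest position."""

    def __init__(self, record_size=4):
        self.log = []            # the "disk": an append-only list of entries
        self.pending = []        # writes collected in memory
        self.inode_map = {}      # i-node number -> index of latest entry in log
        self.record_size = record_size
        self.disk_writes = 0     # number of (large) writes issued to the disk

    def write(self, inode_no, data):
        self.pending.append((inode_no, data))
        if len(self.pending) >= self.record_size:
            self.flush()

    def flush(self):
        """Write all pending entries as one log record."""
        if not self.pending:
            return
        base = len(self.log)
        self.log.extend(self.pending)
        for offset, (inode_no, _) in enumerate(self.pending):
            self.inode_map[inode_no] = base + offset   # newest copy wins
        self.pending = []
        self.disk_writes += 1

    def read(self, inode_no):
        for pending_no, data in reversed(self.pending):  # newest unflushed copy
            if pending_no == inode_no:
                return data
        return self.log[self.inode_map[inode_no]][1]

    def compact(self):
        """Garbage collection: keep only the live (latest) entries."""
        live = sorted(self.inode_map.items(), key=lambda item: item[1])
        self.log = [self.log[position] for _, position in live]
        self.inode_map = {inode_no: i for i, (inode_no, _) in enumerate(live)}


fs = LogFS(record_size=3)
fs.write(1, "a")
fs.write(2, "b")
fs.write(1, "c")                 # third write triggers one batched disk write
length_before = len(fs.log)      # includes the stale first copy of i-node 1
fs.compact()
length_after = len(fs.log)
```

Three small writes cost one disk write, at the price of stale copies accumulating in the log until it is compacted.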
• Note:
  • Much more complex to administer.
  • Eventually the disk fills.
  • A garbage collection process goes through the log records and compacts them. The disk operates like a very large circular buffer.
7 DOS/Windows
• The file system has:
  • Boot sector
  • FAT
  • Root directory and data blocks
• The directory entry contains all the details about the file, including the name.
• It has a pointer to the first block.
• To find the next block the system uses a FAT (File Allocation Table).
• The FAT is a large one-dimensional array. There is an entry for each block, which contains either:
  • the address of the next block, or
  • an End Of File marker
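Following a file through the FAT can be sketched as a walk along the array. The marker values and the table contents below are hypothetical, chosen for readability; real FAT variants reserve particular bit patterns for end of file and free blocks.

```python
EOF = -1    # hypothetical End Of File marker
FREE = 0    # hypothetical marker for an unused block


def file_blocks(fat, first_block):
    """Return the blocks of a file in order, starting from the first
    block recorded in its directory entry."""
    blocks = []
    block = first_block
    while block != EOF:
        if len(blocks) >= len(fat):
            raise ValueError("FAT chain loops back on itself")
        blocks.append(block)
        block = fat[block]      # the FAT entry gives the next block
    return blocks


# An 8-entry FAT. One file starts at block 1 and continues 4, 7, 2.
#      index: 0     1  2    3     4  5     6     7
fat = [FREE, 4, EOF, FREE, 7, FREE, FREE, 2]

chain = file_blocks(fat, 1)
```

Because the whole table can be held in memory, reaching the n-th block needs n array lookups but no extra disk reads, which eases the random access problem of a plain linked list of blocks.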