CSNB334 Advanced Operating Systems 7. File Management
Introduction • The file system part of the operating system provides the resource abstractions typically associated with secondary storage. • A file is a collection of data with the following properties: • Long-term existence. • Sharable between processes. • Structure: hierarchical. • Typical operations on a file • Create • Delete • Open • Close • Read • Write. • Attributes of a file • Owner, creation time, time last modified, access privileges etc.
Linux File Structure • Linux views a file as a named stream of bytes for writing/reading from storage devices without distinction into physical fields, records and so on. • A simple description of the UNIX system, also applicable to Linux, is this: • "On a UNIX system, everything is a file; if something is not a file, it is a process." • Files include • Programs, services, texts, images, and so forth. • named pipes • Sockets • Input and output devices.
Linux File Manager • Gives a set of functions (system calls) to manipulate files: • int open(char *pathname, int oflag [, int mode]); • int creat(char *pathname, int mode); • int read(int filedes, char *buf, unsigned int nbytes); • int close(int filedes); • int write(int filedes, char *buf, unsigned int nbytes); • long lseek(int filedes, long offset, int where); // to position a file • where = 0: offset is measured from the beginning of the file. • where = 1: offset is measured from the current position in the file. • where = 2: offset is measured from the end of the file (new position = size of file + offset). • int ioctl(int filedes, unsigned long request, char *arg); • Used to change the behaviour of an open file.
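Example (a minimal sketch, not part of the original slides): the calls above can be combined to write a file, reposition with lseek() and read back from that position. The helper name write_then_read_from and the path used in the usage note are hypothetical. Current POSIX headers declare these calls with const char *, size_t, ssize_t and off_t rather than the simplified types shown on the slide.

```c
#include <fcntl.h>     /* open, O_* flags */
#include <string.h>    /* strlen */
#include <unistd.h>    /* read, write, lseek, close */

/* Writes `text` to `path`, seeks to byte `skip` from the start of the file,
 * and reads the remainder into `out` (NUL-terminated).
 * Returns the number of bytes read, or -1 on any error. */
long write_then_read_from(const char *path, const char *text,
                          long skip, char *out, unsigned int outsz)
{
    if (outsz == 0)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    size_t len = strlen(text);
    if (write(fd, text, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* SEEK_SET is the symbolic name for where = 0. */
    if (lseek(fd, skip, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, out, outsz - 1);
    close(fd);
    if (n < 0)
        return -1;

    out[n] = '\0';
    return (long)n;
}
```

Calling write_then_read_from("/tmp/demo.txt", "hello, vfs", 7, buf, sizeof buf) would leave the last three bytes of the text in buf, because the file position was moved to offset 7 before the read.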
The Linux (Virtual) File System • Linux includes a versatile and powerful file handling facility – the VFS – designed to support a wide variety of file management systems and file structures. • Basically a VFS is a kernel software layer that handles all system calls related to a standard Unix filesystem. • Its main strength is providing a common interface to several kinds of filesystems to user processes regardless of the target file system or the underlying processor hardware. • This allows Linux to access files from disks in other OS formats such as Windows, MINIX etc.
Filesystems supported by the VFS • Filesystems for Linux • Second Extended Filesystem (Ext2), the recent Third Extended Filesystem (Ext3), and the Reiser Filesystems (ReiserFS). • Filesystems for Unix variants • sysv filesystem (System V, Coherent, Xenix), UFS (BSD, Solaris, NEXTSTEP), MINIX filesystem, and VERITAS VxFS (SCO UnixWare) • Microsoft filesystems • MS-DOS, VFAT (Windows 95 and later releases), and NTFS (Windows NT 4 and later releases) • ISO9660 CD-ROM filesystem and Universal Disk Format (UDF) DVD filesystem • Other proprietary filesystems • IBM's OS/2 (HPFS), Apple's Macintosh (HFS), Amiga's Fast Filesystem (AFFS), and Acorn Disk Filing System (ADFS) • Additional filesystems originating in systems other than Linux • such as IBM's JFS and SGI's XFS • You can see which file systems are registered by looking at /proc/filesystems.
The Role of the Virtual Filesystem (VFS) • Let's assume that a user issues the shell command: $ cp /floppy/TEST /tmp/test • where /floppy is the mount point of an MS-DOS diskette and /tmp is a normal Second Extended Filesystem (Ext2) directory. • The VFS is an abstraction layer between the application program and the filesystem implementations. • Therefore, the cp program is not required to know the filesystem types of /floppy/TEST and /tmp/test. • Instead, cp interacts with the VFS by means of generic system calls.
The Common File Model • The key idea behind the VFS consists of introducing a common file model capable of representing all supported filesystems. • This model strictly mirrors the file model provided by the traditional Unix filesystem. • However, each specific filesystem implementation must translate its physical organization into the VFS's common file model.
The Linux (Virtual) File System • When a process initiates a file-oriented system call, the kernel calls a function in the VFS. • This function handles the file-system-independent manipulations such as • Checking access rights • Closing an open file • Modifying the file pointer (with lseek()) • The file-system-dependent manipulations such as • Determining where blocks are located on the disk • Instructing the device driver to read/write blocks • are handled by a translator (mapping function) that converts the call from the VFS into a call to the target file system.
An example • From the previous example (cp command), consider the read( ) command. This would be translated by the kernel into a call specific to the MS-DOS filesystem. • The application's call to read( ) makes the kernel invoke the corresponding sys_read( ) service routine. • Each file is represented by a file data structure in kernel memory. • This data structure contains a field called f_op that contains pointers to functions specific to MS-DOS files, including a function that reads a file. • sys_read( ) finds the pointer to this function and invokes it. • Thus, the application's read( ) is turned into the rather indirect call: • file->f_op->read(...);
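Example (a simplified user-space sketch of the dispatch described above, not kernel code): a cut-down struct file carries an f_op pointer, and a generic read routine calls through it without knowing which file system is behind it. The names demo_sys_read, msdos_read, msdos_fops and the data/size fields are hypothetical; the real kernel structures have many more members and a different read signature.

```c
#include <stddef.h>
#include <string.h>

struct file;  /* forward declaration so the operations table can refer to it */

/* Cut-down operations table holding only read(). */
struct file_operations {
    long (*read)(struct file *f, char *buf, size_t n);
};

/* Cut-down file object. `data` and `size` stand in for the on-disk file. */
struct file {
    const struct file_operations *f_op;
    const char *data;
    size_t size;
    size_t f_pos;   /* current file position */
};

/* Hypothetical "MS-DOS" read: copies from the backing store and
 * advances the file position. */
long msdos_read(struct file *f, char *buf, size_t n)
{
    size_t left = f->size - f->f_pos;
    if (n > left)
        n = left;
    memcpy(buf, f->data + f->f_pos, n);
    f->f_pos += n;
    return (long)n;
}

const struct file_operations msdos_fops = { msdos_read };

/* Stand-in for sys_read(): file-system independent, it only follows
 * the pointer stored in the file object. */
long demo_sys_read(struct file *f, char *buf, size_t n)
{
    if (f == NULL || f->f_op == NULL || f->f_op->read == NULL)
        return -1;
    return f->f_op->read(f, buf, n);
}
```

Supporting another file system in this sketch would mean supplying a different operations table; demo_sys_read itself would not change.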
The Linux (Virtual) File System • The VFS is an object-oriented scheme. • Conceptually there is a base class, named filesystem, with a set of virtual methods that are overridden by every concrete file system present in the kernel. • Since the kernel is written in C rather than an object-oriented language, • VFS objects are implemented simply as C data structures. • Each object contains • Data • Pointers to file-system-dependent functions that operate on that data.
The Linux (Virtual) File System • The four primary object types in VFS are: • Superblock object. • Represents a specific mounted file system. • Inode object. • Metadata for a file on disk. • Dentry object. • A specific component in a path • File object. • In-memory representation of an open file.
The superblock object • The superblock object holds information about each mounted file system. • The name is historical: • the first block of a disk partition (called the superblock) was used to hold the meta-information about the partition itself. • The actual data structure in Linux • struct super_block. • Holds information such as • Device that this filesystem is mounted on. • Basic block size of the file system. • Flags, such as a read-only flag. • Mount time. • File system type. • Dirty flag, to indicate that the superblock has been changed but not written back to disk. • Semaphore for controlling access to the file system. • List of superblock operations.
Superblock • struct super_block { • kdev_t s_dev; /* device */ • unsigned long s_blocksize; /* block size */ • unsigned char s_blocksize_bits; /* log2(block size) */ • unsigned char s_lock; /* superblock lock */ • unsigned char s_rd_only; • unsigned char s_dirt; • struct file_system_type *s_type; • struct super_operations *s_op; • unsigned long s_flags; • unsigned long s_magic; • unsigned long s_time; • struct inode *s_covered; /* mount point */ • struct inode *s_mounted; /* root inode */ • struct wait_queue *s_wait; /* s_lock wait queue */ • union { • struct minix_sb_info minix_sb; • struct ext2_sb_info ext2_sb; • …. • void *generic_sb; • } u; • };
Superblock operations • s_op points to a vector of functions for accessing the file system. • struct super_operations { void (*read_inode)(struct inode *); /* Reads a specified inode from a mounted file system. */ int (*notify_change)(struct inode *, struct iattr *); /* Called when inode attributes are changed. */ void (*write_inode)(struct inode *); /* Writes the given inode to disk. */ void (*put_inode)(struct inode *); /* Called when the inode is no longer required, for example when a file is deleted and its blocks are released. */ void (*put_super)(struct super_block *); /* Called when the file system is unmounted, to release the superblock. */ void (*write_super)(struct super_block *); /* Called when the VFS decides that the superblock needs to be written to disk. */ void (*statfs)(struct super_block *, struct statfs *); void (*remount_fs)(struct super_block *, int *, char *); }; • These functions serve to translate the specific representation of the superblock and inode on data media to their general form in memory and vice versa.
The Inode Object • An inode (Index Node) is associated with each file. • The inode object holds all information (metadata) about a named file (except its name and the actual data contents). • Owner • Group • Permissions • Access time • On-disk location of the file’s data. • Size of data it holds • Number of links • To obtain the inode number for a file : ls -i • An inode is both a physical object located on the disk of a filesystem and a conceptual object described in the kernel by a struct inode • Each inode object is associated with an inode number that uniquely identifies the file within the filesystem.
INODE • struct inode { kdev_t i_dev; /* ID of device containing the file, or 0 */ unsigned long i_ino; /* file's inode number */ umode_t i_mode; /* permissions */ nlink_t i_nlink; /* number of hard links */ /* uid, gid etc. */ dev_t i_rdev; /* only if device special file */ /* size, times of modification, access, creation etc. */ struct inode_operations *i_op; ……. ……. }; • System calls related to obtaining the metadata of a file • int stat(const char *path, struct stat *buf); • int fstat(int fd, struct stat *buf);
Inode Operations • struct inode_operations { struct file_operations *default_file_ops; int (*create)(struct inode *, const char *, int, int, struct inode **); int (*lookup)(struct inode *, const char *, int, struct inode **); int (*link)(struct inode *, struct inode *, const char *, int); int (*unlink)(struct inode *, const char *, int); int (*symlink)(struct inode *, const char *, int); int (*mkdir)(struct inode *, const char *, int); }; • NOTE: All these functions are called directly from the implementation of the corresponding system call.
The Dentry Object • A directory entry (dentry) associates a filename with an inode. • The directory structure is very simple: each directory is an array of links.
Directory and Link Structure • A link is a structure which associates a name (string) with an inode number. • Each file has to have at least one link in one directory. • This is true for directories too, except for the root directory. • All files can be identified by their path, which is the list of links which have to be traversed to reach the file (either starting at the root directory, or at the current directory). • A file can have links in many directories; • a directory has exactly one link pointing to it from its parent directory (not counting the "." and ".." entries).
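Example (a minimal sketch, assuming a POSIX system and a file system that supports hard links): link() adds a second name for an existing inode, and the inode's link count, reported by stat() as st_nlink, goes from 1 to 2. The helper names link_count and make_second_name are hypothetical.

```c
#include <fcntl.h>      /* open, O_* flags */
#include <sys/stat.h>   /* stat, struct stat */
#include <unistd.h>     /* link, close */

/* Returns the number of links (names) that refer to the inode of `path`,
 * or -1 if the path cannot be examined. */
long link_count(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}

/* Creates an empty file `first`, then adds `second` as another name for
 * the same inode. Returns the resulting link count, or -1 on error. */
long make_second_name(const char *first, const char *second)
{
    int fd = open(first, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    if (link(first, second) != 0)
        return -1;
    return link_count(first);
}
```

Removing one of the two names with unlink() drops the count back to 1; the file's data is released only when the last link is removed and no process still has the file open.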
File Object • Every file that is opened by a process has a corresponding file object. • An "open file" is described in the Linux kernel by a struct file item; • the structure encloses a pointer to the inode representing the file. The file object by itself has no corresponding image on the disk. • The main information stored in a file object is the file pointer: the current position in the file from which the next operation will take place. It can differ between processes that have opened the same file. • File structures are created by system calls like open, pipe and socket, and are shared by parent and child across fork.
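Example (a minimal sketch, assuming a POSIX system): the file position lives in the file object, not in the descriptor or the inode. The sketch uses dup() rather than fork() to keep it short; both give a second descriptor that refers to the same file object, so the position is shared, whereas a second open() of the same path creates a new file object with its own position. The helper name compare_positions is hypothetical.

```c
#include <fcntl.h>    /* open, O_* flags */
#include <unistd.h>   /* dup, write, lseek, close */

/* Writes 4 bytes through one descriptor, then reports the file position
 * seen through (pos[0]) a dup()'d descriptor sharing the same file object
 * and (pos[1]) a descriptor from a separate open() of the same path.
 * Returns 0 on success, -1 on error. */
int compare_positions(const char *path, long pos[2])
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    int shared = dup(fd);                 /* same file object as fd */
    int separate = open(path, O_RDONLY);  /* new file object, same inode */
    int rc = -1;

    if (shared >= 0 && separate >= 0 && write(fd, "abcd", 4) == 4) {
        /* lseek(..., 0, SEEK_CUR) returns the current position unchanged. */
        pos[0] = (long)lseek(shared, 0, SEEK_CUR);
        pos[1] = (long)lseek(separate, 0, SEEK_CUR);
        rc = 0;
    }

    if (shared >= 0)
        close(shared);
    if (separate >= 0)
        close(separate);
    close(fd);
    return rc;
}
```

After the call, pos[0] reflects the 4 bytes written through the original descriptor, while pos[1] is still at the start of the file.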
File Structure struct file { struct list_head f_list; //This field links files together into one of a number of lists. There is one list for each active file-system, starting at the s_files pointer in the super-block. struct dentry *f_dentry; //This field records the directory entry that points to the inode for this file struct file_operations *f_op; //This field points to the methods to use on this file atomic_t f_count; //The number of references to this file. One for each different user-process file descriptor. unsigned int f_flags; // This field stores the flags for this file such as access type (read/write), nonblocking, appendonly etc. mode_t f_mode; loff_t f_pos; //This records the current file position which will be the address used for the next read request, and for the next write request if the file does NOT have the O_APPEND flag. unsigned long f_reada, f_ramax, f_raend, f_ralen, f_rawin; struct fown_struct f_owner; unsigned int f_uid, f_gid; //These fields get set to the owner and group of the process which opened the file. int f_error; unsigned long f_version; /* needed for tty driver, and maybe others */ void *private_data; };
File Methods struct file_operations { loff_t (*llseek) (struct file *, loff_t, int); ssize_t (*read) (struct file *, char *, size_t, loff_t *); ssize_t (*write) (struct file *, const char *, size_t, loff_t *); int (*readdir) (struct file *, void *, filldir_t); unsigned int (*poll) (struct file *, struct poll_table_struct *); int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long); int (*mmap) (struct file *, struct vm_area_struct *); int (*open) (struct inode *, struct file *); int (*flush) (struct file *); int (*release) (struct inode *, struct file *); int (*fsync) (struct file *, struct dentry *); int (*fasync) (int, struct file *, int); int (*check_media_change) (kdev_t dev); int (*revalidate) (kdev_t dev); int (*lock) (struct file *, int, struct file_lock *); };
Filesystem Mounting • Before using a filesystem, there are two basic operations that must be performed • Registration • Done when you build the kernel. • Mounting • For root filesystem – done at system initialization. • For other filesystems – done at any time.
Mounting • All files accessible in a UNIX system are arranged in one big tree, • the file hierarchy, rooted at /. • These files can be spread out over several devices. • The mount command serves to attach the file system found on some device to the big file tree.
Mount command • For example, the DVD-ROM drive must be mounted before you can access it: mount -t iso9660 /dev/hdc /cdrom • mount makes a device part of the file system. • -t iso9660 specifies the format of the file system being mounted. (iso9660 is the standard format for data CDs (and most DVDs); it would be msdos if we were mounting a floppy drive with a DOS-formatted floppy in it.) • /dev/hdc is the path to the DVD-ROM drive's device file. • /cdrom is the directory to "map" the device to in the file system so it can be accessed. • Called the "mount point"; it can be any user-defined directory.
Mounting (Contd..) • The mount_root() function takes care of mounting the first file system. • Every mounted file system is then represented by a super_block structure. • The function read_super() of the virtual file system is used to initialize the superblock.
Registering the File Systems • When you build the Linux kernel you are asked whether you want each of the supported file systems. • You can see which file systems are registered by looking at /proc/filesystems. For example, it may list: ext2, nodev proc, iso9660 (a nodev prefix marks a file system that needs no block device). • When the kernel is built, the file system startup code contains calls to the initialization routines of all of the built-in file systems. • Each file system's initialization routine registers itself with the Virtual File System and is represented by a file_system_type data structure which contains the name of the file system and a pointer to its VFS superblock read routine.
file_system_type data structures • Each file_system_type data structure contains the following information: • Superblock read routine • This routine is called by the VFS when an instance of the file system is mounted. • File system name • The name of this file system, for example ext2. • Device needed • Does this file system need a device to support it? Not all file systems need a device to hold them. The /proc file system, for example, does not require a block device.
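Example (a user-space sketch of the registration idea, not the kernel's implementation): each file system type is described by a structure holding its name, a "device needed" flag and a superblock read routine, and registered types are kept on a linked list headed by file_systems. The field layout and the demo_ function names are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

struct super_block;  /* opaque in this sketch */

struct file_system_type {
    const char *name;                 /* e.g. "ext2" */
    int requires_dev;                 /* 0 for types such as proc */
    struct super_block *(*read_super)(struct super_block *sb);
    struct file_system_type *next;    /* next registered type */
};

/* Head of the list of registered file system types. */
struct file_system_type *file_systems = NULL;

/* Appends `fs` to the list. Returns 0 on success, or -1 if a type with
 * the same name is already registered. */
int demo_register_filesystem(struct file_system_type *fs)
{
    struct file_system_type **p = &file_systems;

    for (; *p != NULL; p = &(*p)->next)
        if (strcmp((*p)->name, fs->name) == 0)
            return -1;

    fs->next = NULL;
    *p = fs;
    return 0;
}

/* Walks the list looking for `name`. Returns the matching entry,
 * or NULL if this type is not registered. */
struct file_system_type *demo_find_filesystem(const char *name)
{
    struct file_system_type *fs;

    for (fs = file_systems; fs != NULL; fs = fs->next)
        if (strcmp(fs->name, name) == 0)
            return fs;
    return NULL;
}
```

In this model, listing /proc/filesystems corresponds to walking the list and printing each name, with nodev in front of entries whose requires_dev is 0.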
Opening a file • To open a file • The file manager searches the storage systems for the specified pathname. • This involves opening each directory in the pathname in turn and searching it for the next file or directory component of the pathname. • If the search encounters a mount point, it moves from one file system to the other and continues the search. • Once the file is found, • VFS checks the file's permissions against the user's. If the process has the correct permissions, VFS sets up various table entries to manage I/O. • Entry in the file descriptor table (each process has one), in addition to the entries for stdin (0), stdout (1) and stderr (2). • The index of this entry is the integer value returned by the open() system call • Used for all subsequent references to the file. • The entry in the file descriptor table points to an entry in the open file table, which is of type struct file. • The file structure entry holds the status information specific to the process that opened the file. • E.g. the value of the file position for this process's use. • The file structure entry references the VFS inode after it has been created in primary memory.
Mounting a File System – How is it done? $ mount -t iso9660 /dev/cdrom /mnt/cdrom • This mount command will pass the kernel three pieces of information; • the name of the file system, • the physical block device that contains the file system and, • thirdly, where in the existing file system topology the new file system is to be mounted. • The first thing that the Virtual File System must do is to find the file system. • To do this it searches through the list of known file systems by looking at each file_system_type data structure in the list pointed at by file_systems. • If it finds a matching name it now knows that this file system type is supported by this kernel and it has the address of the file system specific routine for reading this file system's superblock.
Mounting a File System – How is it done? • Next, if the physical device passed by mount is not already mounted, the VFS must find the VFS inode of the directory that is to be the new file system's mount point. • Once the inode has been found, it is checked to see that it is a directory and that there is not already some other file system mounted there. • The same directory cannot be used as a mount point for more than one file system. • Next, the VFS mount code must allocate a VFS superblock and pass the mount information to the superblock read routine for this file system. • The superblock read routine must fill out the VFS superblock fields based on information that it reads from the physical device. • For the EXT2 file system this mapping or translation of information is quite easy: it simply reads the EXT2 superblock and fills out the VFS superblock from there. • For other file systems, such as the MS-DOS file system, it is not quite such an easy task. • If the block device cannot be read from, or if it does not contain this type of file system, then the mount command will fail.
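Example (a user-space model of the checks described above, not kernel code): the mount table is a small array, the device and mount point are compared as strings, and the superblock read routine is passed in as a function pointer. All demo_ names, the error codes and the block size filled in by the sample reader are illustrative assumptions; the check that the mount point is a directory is omitted for brevity.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_MOUNTS 8

enum { DEMO_OK = 0, DEMO_EBUSY = -1, DEMO_EINVAL = -2, DEMO_ENOSPC = -3 };

struct demo_super_block {
    const char *dev;
    unsigned long blocksize;
};

/* Per-type superblock reader: fills `sb` from `dev`. Returns 0 on success,
 * -1 if the device is unreadable or holds a different file system. */
typedef int (*demo_read_super_fn)(struct demo_super_block *sb, const char *dev);

struct demo_mount {
    const char *dev;              /* device holding the file system */
    const char *dir;              /* mount point */
    struct demo_super_block sb;   /* superblock set up at mount time */
};

struct demo_mount demo_mounts[DEMO_MAX_MOUNTS];
int demo_nmounts = 0;

int demo_mount_fs(demo_read_super_fn read_super, const char *dev, const char *dir)
{
    int i;

    if (read_super == NULL)               /* file system type not known */
        return DEMO_EINVAL;

    for (i = 0; i < demo_nmounts; i++) {
        if (strcmp(demo_mounts[i].dev, dev) == 0)
            return DEMO_EBUSY;            /* device already mounted */
        if (strcmp(demo_mounts[i].dir, dir) == 0)
            return DEMO_EBUSY;            /* directory already a mount point */
    }
    if (demo_nmounts == DEMO_MAX_MOUNTS)
        return DEMO_ENOSPC;

    /* "Allocate" a superblock and let the type-specific routine fill it. */
    struct demo_mount *m = &demo_mounts[demo_nmounts];
    if (read_super(&m->sb, dev) != 0)
        return DEMO_EINVAL;               /* mount fails */

    m->dev = dev;
    m->dir = dir;
    demo_nmounts++;
    return DEMO_OK;
}

/* Sample reader that recognises only the device "/dev/cdrom". */
int demo_iso_read_super(struct demo_super_block *sb, const char *dev)
{
    if (strcmp(dev, "/dev/cdrom") != 0)
        return -1;
    sb->dev = dev;
    sb->blocksize = 2048;   /* illustrative value */
    return 0;
}
```

With this model, demo_mount_fs(demo_iso_read_super, "/dev/cdrom", "/mnt/cdrom") succeeds once; repeating it with the same device or the same directory is rejected, and a device the reader does not recognise makes the mount fail without adding a table entry.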
A Mounted File System • Each mounted file system is described by a vfsmount data structure; these structures are queued on a list pointed at by vfsmntlist. • Each vfsmount structure contains • the device number of the block device holding the file system, • the directory where this file system is mounted and • a pointer to the VFS superblock allocated when this file system was mounted. • In turn the VFS superblock points at • the file_system_type data structure for this sort of file system and • the root inode for this file system. • This inode is kept resident in the VFS inode cache all of the time that this file system is loaded.