Explore essential information on long-term data storage, file system management, file structure, attributes, operations, and more. Learn key aspects of file naming, access, and directory organization for optimal data storage efficiency.
Long-term Information Storage • Must store large amounts of data • Information stored must survive the termination of the process using it • Multiple processes must be able to access the information concurrently. In short:
Long-term Information Storage Files: Good! No Files: Bad!
File System • Operating system determines how files are: • Structured • Named • Accessed • Used • Protected • Implemented • Most important aspect to users is how files appear to them: naming convention, available operations, protection, etc. (Not implementation!!).
File Naming • Unix: • Case sensitive. • Allows, but does not require, extensions (e.g., prog.c). • Assigns no meaning to extensions. • Add as many extensions as desired (e.g., prog.back.stupid.c). • Allows spaces in names, but the shell needs them quoted or escaped (e.g., my\ work). • Windows: • Not case sensitive. • MS-DOS limited extensions to 1-3 characters; current Windows versions allow longer ones. • Extensions have meaning (to applications, not to the OS). • Allows spaces in file names. (Ever tried to copy “my work” to Unix?)
File Naming Typical file extensions.
File Structure • None - sequence of words, bytes • Simple record structure • Lines • Fixed length • Variable length • Complex Structures • Formatted document • Relocatable load file
File Structure • Three kinds of files • byte sequence (i.e., no structure). • record sequence • Tree (e.g., data base)
File Structure • Can simulate last two with first method by inserting appropriate control characters • Who decides: • Operating system • Program (i.e., programs can support any model they want) • Unix and Windows support only the sequence of bytes functionality.
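As a minimal sketch of how a program can layer a record structure on top of a plain byte sequence, the Python snippet below packs fixed-length records into one flat run of bytes and fetches record i by computing its offset. The record layout (a 4-byte id plus a 12-byte name) and the helper names are assumptions made for illustration only.

```python
import struct

# Hypothetical record layout: 4-byte unsigned id followed by a 12-byte name.
RECORD = struct.Struct("<I12s")

def pack_records(rows):
    """Serialize (id, name) pairs into one flat byte sequence."""
    return b"".join(RECORD.pack(rid, name.encode("ascii")) for rid, name in rows)

def read_record(data, index):
    """Return record number `index` by computing its byte offset."""
    rid, raw = RECORD.unpack_from(data, index * RECORD.size)
    # struct pads short names with NUL bytes; strip them again.
    return rid, raw.rstrip(b"\x00").decode("ascii")

data = pack_records([(1, "alice"), (2, "bob"), (3, "carol")])
print(len(data), read_record(data, 1))
```

The operating system sees only 48 bytes here; the record boundaries exist purely in the program's own convention.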
File Access • Sequential access • read all bytes/records from the beginning • cannot jump around, could rewind or back up • convenient when medium was mag tape • Random access • bytes/records read in any order • essential for data base systems • read can be … • move file marker (seek), then read or … • read and then move file marker
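The two access styles can be sketched in Python, whose os module wraps the Unix calls. This is an illustration under assumptions: the helper names and the temporary demo file are invented for the example.

```python
import os
import tempfile

def read_at(path, offset, count):
    """Random access: move the file marker with lseek, then read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        return os.read(fd, count)
    finally:
        os.close(fd)

def read_sequential(path, chunk=4):
    """Sequential access: start at the beginning and read chunk by chunk."""
    parts = []
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            parts.append(block)
    return b"".join(parts)

# Demo file created only for this example.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789ABCDEF")
demo_path = tmp.name

print(read_at(demo_path, 10, 3))
print(read_sequential(demo_path))
```

Here read_at jumps straight to byte 10, while read_sequential has to pass over every earlier byte to get there.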
File Attributes • Name – only information kept in human-readable form • Identifier (file descriptor) – unique tag (number) identifies file within file system • Type – needed for systems that support different types • Location – pointer to file location on device • Size – current file size
File Attributes • Protection – controls who can do reading, writing, executing • Time, date, and user identification – data for protection, security, and usage monitoring • Information about files is kept in the directory structure, which is maintained on the disk (although generally cached).
File Operations • Create • Write • Read • Reposition within file • Delete • Truncate
File Operations in Unix • int fd = open(Fi, flags) – search the directory structure on disk for entry Fi, and move the content of the entry to memory • fd is a file descriptor (integer). • close(fd) – move the content of entry Fi in memory to the directory structure on disk • lseek(fd, offset, whence) – change the pointer to the current location in the file. • read(fd, buf, num_bytes) – read up to num_bytes bytes into buf, starting at the current location.
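The basic operations can be tried from Python, whose os module exposes thin wrappers over the Unix calls. The sketch below is only an illustration: the file name, its contents, and the function name are assumptions.

```python
import os
import tempfile

def file_lifecycle(path):
    """Create, write, reposition, read, truncate, and delete a file."""
    # Create: O_CREAT makes the file; open returns a small integer descriptor.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"hello, file system")  # Write
        os.lseek(fd, 7, os.SEEK_SET)         # Reposition within the file
        data = os.read(fd, 4)                # Read 4 bytes starting at offset 7
        os.ftruncate(fd, 5)                  # Truncate the file to 5 bytes
        size = os.fstat(fd).st_size
    finally:
        os.close(fd)                         # Close releases the descriptor
    os.unlink(path)                          # Delete removes the directory entry
    return isinstance(fd, int), data, size

demo_dir = tempfile.mkdtemp()
result = file_lifecycle(os.path.join(demo_dir, "demo.txt"))
os.rmdir(demo_dir)
print(result)
```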
Open Files • Several pieces of data are needed to manage open files: • File pointer: pointer to last read/write location, per process that has the file open • Open-file count: counter of the number of times a file is open – to allow removal of data from the open-file table when the last process closes it • Disk location of the file: cache of data access information • Access rights: per-process access mode information
Open Files • Unix maintains an open-file table for each process and for the whole system. • File descriptor is used as an index into the process open-file table. Entries are items that have to do with that particular process (e.g., file pointer, access rights, etc.). • A pointer to the system-wide open-file table is also in the process open-file table. • System-wide open-file table holds process-independent information (e.g., location on disk, last access time, file size, count of the number of processes using the file).
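The two-table arrangement can be pictured with a toy model. The classes below are a teaching sketch, not kernel code: every name and field is an assumption, chosen to mirror the items listed above.

```python
from dataclasses import dataclass, field

@dataclass
class SystemEntry:
    """Process-independent data, shared by every opener of the file."""
    disk_location: int
    size: int
    open_count: int = 0

@dataclass
class ProcessEntry:
    """Per-process data; `system` points into the system-wide table."""
    system: SystemEntry
    file_pointer: int = 0
    mode: str = "r"

@dataclass
class Process:
    fd_table: list = field(default_factory=list)

    def open(self, system_table, name, mode="r"):
        entry = system_table[name]
        entry.open_count += 1
        self.fd_table.append(ProcessEntry(entry, 0, mode))
        return len(self.fd_table) - 1  # the descriptor is an index

    def close(self, fd):
        entry = self.fd_table[fd]
        self.fd_table[fd] = None
        entry.system.open_count -= 1
        # True when the last opener has closed the file.
        return entry.system.open_count == 0

system_table = {"notes.txt": SystemEntry(disk_location=100, size=42)}
p1, p2 = Process(), Process()
fd1 = p1.open(system_table, "notes.txt")
fd2 = p2.open(system_table, "notes.txt", "w")
p1.fd_table[fd1].file_pointer = 10  # p2's pointer is unaffected
```

Both processes get descriptor 0, yet each has its own file pointer and access mode, while the open count lives in the shared entry.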
Open File Locking • Provided by some operating systems and file systems • Mediates access to a file • Mandatory or advisory: • Mandatory – access is denied depending on locks held and requested • Advisory – processes can find status of locks and decide what to do
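A small experiment with advisory locking, assuming a Unix-like system where Python's fcntl module is available. The helper name and the temporary lock file are invented for the example; two separate opens of the same file stand in for two cooperating processes.

```python
import fcntl
import os
import tempfile

def try_lock(fd):
    """Try to take an exclusive advisory lock without blocking."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False

tmp_fd, lock_path = tempfile.mkstemp()
os.close(tmp_fd)
holder = os.open(lock_path, os.O_RDWR)
other = os.open(lock_path, os.O_RDWR)

first = try_lock(holder)   # the lock is free, so this succeeds
second = try_lock(other)   # conflicts with the lock held through `holder`
# Advisory means nothing enforces the lock: this write still goes through.
written = os.write(other, b"ignored the lock")

fcntl.flock(holder, fcntl.LOCK_UN)
third = try_lock(other)    # free again after the unlock

os.close(holder)
os.close(other)
os.remove(lock_path)
print(first, second, written, third)
```

Under a mandatory scheme the write through `other` would be denied; with an advisory lock it is up to each process to check the lock and behave.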
Directory • A collection of data structures containing information about files. • (Figure: a directory whose entries point to files F1, F2, F3, F4, …, Fn.) • Both the directory structure and the files reside on disk. • Backups of these two structures are kept on tapes.
Operations Performed on Directory • Search for a file • Create a file • Delete a file • List a directory • Rename a file • Traverse the file system
Organize the Directory (Logically) to Obtain • Efficiency – locating a file quickly • Naming – convenient to users • Two users can have same name for different files • The same file can have several different names • Grouping – logical grouping of files by properties, (e.g., all Java programs, all games, …)
Single-Level Directory • Naming problem • Grouping problem
Two-Level Directory • Separate directory for each user • Path name • Can have the same file name for different users • Efficient searching • No grouping capability
Tree-Structured Directories (Cont) • Efficient searching • Grouping Capability • In Unix, a directory is a file that contains meta-data about the files it contains.
Tree-Structured Directories (Cont) • Most OS support absolute and relative path names. • Unix has two pre-defined relative path names: • . Represents current directory • .. Represents parent directory • Current directory (working directory) • cd /spell/mail/prog or • cd .. (relative to CD)
Path Names A UNIX directory tree
Relative Path Name Assume Current Directory is /usr/jim. Then .. is /usr, . is /usr/jim To access dict: ../lib/dict.
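The same resolution can be checked with Python's posixpath module. This is a purely lexical sketch: it collapses . and .. in the string and does not consult the disk or follow links. The function name is an assumption.

```python
import posixpath

def resolve(cwd, path):
    """Resolve `path` against `cwd`, collapsing `.` and `..` components."""
    return posixpath.normpath(posixpath.join(cwd, path))

print(resolve("/usr/jim", ".."))           # /usr
print(resolve("/usr/jim", "."))            # /usr/jim
print(resolve("/usr/jim", "../lib/dict"))  # /usr/lib/dict
```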
Unix: mkdir creates a new sub-directory below the current working directory. • rmdir removes an empty directory; it fails if the directory still contains files or sub-directories. • rm deletes a file. • If someone suggests that you try out a cool command called rm -r * don’t do it!! (It recursively deletes everything below the current directory.)
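The os module calls that correspond to mkdir, rmdir, and rm can be used to check how these commands treat a directory that still has something in it. The directory and file names below are made up for the demonstration, which runs inside a throwaway temporary directory.

```python
import os
import tempfile

base = tempfile.mkdtemp()
sub = os.path.join(base, "project")
os.mkdir(sub)                       # like `mkdir project`
note = os.path.join(sub, "notes.txt")
with open(note, "w") as f:
    f.write("keep me")

try:
    os.rmdir(sub)                   # like `rmdir project`
    refused = False
except OSError:
    refused = True                  # the directory is not empty

os.remove(note)                     # like `rm notes.txt`
os.rmdir(sub)                       # now empty, so rmdir succeeds
removed = not os.path.exists(sub)
os.rmdir(base)
print(refused, removed)
```

The first rmdir is refused because the directory is not empty; only a recursive delete such as rm -r removes a whole subtree in one step.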
Shared Files (1) File system containing a shared file
Links • This is termed a hard link: both directory entries point to the same inode. • (a) Situation prior to linking (b) After the link is created (c) After the original owner removes the file
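Situations (a) to (c) can be reproduced with a short experiment, assuming a file system that supports hard links. The file names are placeholders.

```python
import os
import tempfile

work = tempfile.mkdtemp()
original = os.path.join(work, "original.txt")
alias = os.path.join(work, "alias.txt")

with open(original, "w") as f:      # (a) before linking
    f.write("shared data")

os.link(original, alias)            # (b) create the hard link
same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
links_after_link = os.stat(original).st_nlink

os.remove(original)                 # (c) the original owner removes its entry
with open(alias) as f:
    survivor = f.read()             # the data is still reachable via the alias
links_after_remove = os.stat(alias).st_nlink

os.remove(alias)
os.rmdir(work)
print(same_inode, links_after_link, survivor, links_after_remove)
```

The inode's link count goes from 2 back to 1 in step (c); the data blocks are released only once the count reaches zero.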
Symbolic Links • The linked file stores the path name of the target file. • The link holds no pointer to the target's inode; processes using the link reach the file only by following the stored path through the directory structure. • What happens when the file is deleted by its owner?
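One way to explore the deletion question is to try it. The sketch below assumes a Unix-like system where unprivileged processes may create symbolic links; the file names are placeholders.

```python
import os
import tempfile

work = tempfile.mkdtemp()
target = os.path.join(work, "target.txt")
link = os.path.join(work, "link.txt")

with open(target, "w") as f:
    f.write("data")

os.symlink(target, link)
stored_path = os.readlink(link)     # the link's content is just a path
# lstat inspects the link itself, which has an inode of its own.
own_inode = os.lstat(link).st_ino != os.stat(target).st_ino

os.remove(target)                   # the owner deletes the target
still_listed = os.path.lexists(link)  # the link entry is still there...
resolves = os.path.exists(link)       # ...but it no longer leads anywhere

os.remove(link)
os.rmdir(work)
print(stored_path == target, own_inode, still_listed, resolves)
```

After the target is removed the link dangles: it is still listed in the directory, but opening it fails because the stored path no longer names a file.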
Implementing Directories (1) (a) A simple directory: fixed-size entries, with disk addresses and attributes in the directory entry (b) Directory in which each entry just refers to an i-node
Allocation of File Blocks • Contiguous allocation • Linked-list allocation • FAT • Indexed (inodes).
Directory Structure with Contiguous Allocation of File Blocks
Implementing Files: Contiguous Allocation (a) Contiguous allocation of disk space for 7 files (b) State of the disk after files D and E have been removed
Table size: each entry is 4 bytes and each block is 1K. 20 million entries (entries, not files!) × 4 bytes = 80 MB for the table.
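The arithmetic can be spelled out in a few lines, under the assumption that the table is a FAT-style structure with one entry per disk block. The function name and the choice of 1K = 1024 bytes are illustrative assumptions.

```python
def table_size(num_blocks, block_bytes=1024, entry_bytes=4):
    """Return (table size, space covered) in bytes, one entry per block."""
    table_bytes = num_blocks * entry_bytes
    disk_bytes = num_blocks * block_bytes
    return table_bytes, disk_bytes

table_bytes, disk_bytes = table_size(20_000_000)
print(table_bytes / 10**6, "MB of table for", disk_bytes / 10**9, "GB of blocks")
```

The table grows with the number of blocks on the disk, not with the number of files stored on it, which is the point of "entries, not files".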
Unix Directory Entry • File Name: Tester • Inode: 15
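On a Unix-like system the name-to-inode pairs held in a directory can be listed directly. The sketch below creates a scratch directory containing a file called Tester, echoing the entry above; the inode number actually assigned will differ from 15, and the helper name is an assumption.

```python
import os
import tempfile

def list_entries(directory):
    """Return {file name: inode number} for each entry in `directory`."""
    with os.scandir(directory) as entries:
        return {entry.name: entry.inode() for entry in entries}

demo_dir = tempfile.mkdtemp()
tester = os.path.join(demo_dir, "Tester")
open(tester, "w").close()

table = list_entries(demo_dir)
expected_inode = os.stat(tester).st_ino

os.remove(tester)
os.rmdir(demo_dir)
print(table)
```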
Unix File System • Disk layout: boot area (B) | superblock (S) | inode list | data blocks • 1 inode for each file/directory.
Inode (figure) • File attributes • Direct blocks: point to file data blocks • Single indirect: points to a data block whose data is the addresses of data blocks belonging to the file • Double and triple indirect blocks not shown
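To make the direct/indirect split concrete, the sketch below reports which inode pointer covers a given logical block of a file. The parameters are assumptions, not values from the slides: 12 direct pointers, and 256 addresses per indirect block (1K blocks holding 4-byte addresses).

```python
def locate_block(logical_block, num_direct=12, ptrs_per_block=256):
    """Say which inode pointer covers a given logical block of a file."""
    if logical_block < num_direct:
        # Reached straight from the inode.
        return ("direct", logical_block)
    logical_block -= num_direct
    if logical_block < ptrs_per_block:
        # One extra disk read: fetch the indirect block, then the data block.
        return ("single indirect", logical_block)
    # Past this point the double and triple indirect blocks take over.
    return ("double/triple indirect", logical_block - ptrs_per_block)

print(locate_block(3))
print(locate_block(12))
print(locate_block(300))
```

With these assumed numbers, the first 12 KB of a file is reachable directly from the inode and the next 256 KB through the single indirect block.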
Inode list (figure): inodes 0-3, each holding file attributes and direct block pointers (block numbers shown: 100, 180, 253, 400); inode 0 is the root inode. • Problem: open file /usr/pmd • Step 1: Fetch the inode for / (root, always inode 0 in this example).