This article discusses the main points of file systems, including file layout, directory layout, file management operations such as WriteFile(), CreateFile(), ReadFile(), and CloseHandle(). It also covers the external view of the file manager application and its interaction with the device and memory managers. The article provides a modern perspective on file management in UNIX and Windows operating systems.
Main Points • File layout • Directory layout
The External View of the File Manager • Application programs reach the file manager through system calls • UNIX: mount(), open(), close(), read(), write(), lseek() • Windows: CreateFile(), CloseHandle(), ReadFile(), WriteFile(), SetFilePointer() • In both systems the file manager sits beside the process, memory, and device managers, above the hardware Operating Systems: A Modern Perspective, Chapter 13
Why Programmers Need Files • Example: foo.html (<head> … </head> <body> … </body>) is written by an HTML editor and read by a web browser, each going through the file manager • Structured information • Can be read by any application • Accessibility • Protocol • Persistent storage • Shared device
File Management • File is a named, ordered collection of information • The file manager administers the collection by: • Storing the information on a device • Mapping the block storage to a logical view • Allocating/deallocating storage • Providing file directories
Information Structure (layers, top to bottom) • Applications • Records • Byte stream files • Stream-block translation • Storage device
Byte Stream File Interface • fileID = open(fileName) • close(fileID) • read(fileID, buffer, length) • write(fileID, buffer, length) • seek(fileID, filePosition)
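The abstract calls on this slide map closely onto the POSIX calls open(), read(), write(), lseek(), and close(). The following is a minimal sketch, assuming a POSIX system; the helper names write_all and read_at are illustrative and not part of any API.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) a file and write len bytes.
   Returns 0 on success, -1 on error. */
int write_all(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);   /* may write fewer bytes than asked */
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return close(fd);
}

/* Hypothetical helper: read up to len bytes starting at byte position pos.
   Returns the number of bytes read, or -1 on error. */
ssize_t read_at(const char *path, off_t pos, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);                 /* fileID = open(fileName) */
    if (fd < 0)
        return -1;
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1) {   /* seek(fileID, filePosition) */
        close(fd);
        return -1;
    }
    ssize_t n = read(fd, buf, len);                /* read(fileID, buffer, length) */
    close(fd);                                     /* close(fileID) */
    return n;
}
```

For example, after write_all(path, "hello world", 11), a call to read_at(path, 6, buf, 5) fills buf with the five bytes "world".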
Low Level Files • Application code: fid = open("fileName", …); … read(fid, buf, buflen); … close(fid); • The file is seen as a byte stream b0, b1, b2, …, bi, … • The file manager implements int open(…) {…}, int close(…) {…}, int read(…) {…}, int write(…) {…}, int seek(…) {…} • Stream-block translation • Storage device response to commands
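Stream-block translation comes down to integer division: a byte position in the stream selects a block and an offset inside it. The sketch below assumes fixed 4096-byte blocks; BLOCK_SIZE and both function names are illustrative.

```c
#include <assert.h>

#define BLOCK_SIZE 4096u   /* assumed block size in bytes */

struct block_pos {
    unsigned long block;   /* logical block number within the file */
    unsigned long offset;  /* byte offset inside that block */
};

/* Translate a byte position in the stream to (block, offset). */
struct block_pos stream_to_block(unsigned long byte_pos)
{
    struct block_pos p;
    p.block  = byte_pos / BLOCK_SIZE;
    p.offset = byte_pos % BLOCK_SIZE;
    return p;
}

/* Number of blocks touched by a transfer of len bytes starting at byte_pos. */
unsigned long blocks_spanned(unsigned long byte_pos, unsigned long len)
{
    assert(len > 0);
    unsigned long first = byte_pos / BLOCK_SIZE;
    unsigned long last  = (byte_pos + len - 1) / BLOCK_SIZE;
    return last - first + 1;
}
```

With these assumptions, byte 5000 lives in block 1 at offset 904, and a 200-byte read starting at byte 4000 touches two blocks.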
Structured Files (layers, top to bottom) • Records • Structured record files • Record-block translation
Record-Oriented Sequential Files • The unit of access is the logical record • fileID = open(fileName) • close(fileID) • getRecord(fileID, record) • putRecord(fileID, record) • seek(fileID, position)
Record-Oriented Sequential Files • Each logical record is stored as an H byte header followed by a k byte logical record • Records follow one another in the file • Physical storage blocks hold the sequence of records • Fragment
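Record-block translation for this layout can be sketched with simple arithmetic. The sketch assumes records are packed back to back with no padding and that blocks are 4096 bytes; neither assumption comes from the slide, and the function names are illustrative.

```c
#include <assert.h>

#define BLOCK_SIZE 4096u   /* assumed block size in bytes */

/* Byte position of the header of record i (0-based), for an H byte
   header followed by a k byte logical record. */
unsigned long record_start(unsigned long i, unsigned long H, unsigned long k)
{
    return i * (H + k);
}

/* Returns 1 if record i crosses a block boundary, 0 otherwise. */
int record_straddles(unsigned long i, unsigned long H, unsigned long k)
{
    assert(H + k > 0);
    unsigned long start = record_start(i, H, k);
    unsigned long end   = start + H + k - 1;   /* last byte of the record */
    return start / BLOCK_SIZE != end / BLOCK_SIZE;
}
```

With H = 4 and k = 96, every record takes 100 bytes, so record 40 occupies bytes 4000 through 4099 and is split across blocks 0 and 1.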
Indexed Sequential File • Suppose we want to directly access records • Add an index to the file • fileID = open(fileName) • close(fileID) • getRecord(fileID, index) • index = putRecord(fileID, record) • deleteRecord(fileID, index)
Indexed Sequential File (cont) • The application structure holds index values (index = i, index = k, index = j) • The index maps each value to a record in the file • Records are keyed by account number: 012345, 123456, 294376, ..., 529366, ..., 965987
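One way an index can give direct access is to keep (key, record position) pairs sorted by key and search them. The sketch below uses the standard bsearch(); the struct layout and the record positions are hypothetical, and only the account numbers come from the slide.

```c
#include <assert.h>
#include <stdlib.h>

struct index_entry {
    long account;   /* key: account number */
    long record;    /* position of the record in the file (hypothetical) */
};

/* Order index entries by account number. */
static int cmp_entry(const void *a, const void *b)
{
    const struct index_entry *x = a, *y = b;
    return (x->account > y->account) - (x->account < y->account);
}

/* Look up an account in an index sorted by account number.
   Returns the record position, or -1 if the account is not indexed. */
long index_lookup(const struct index_entry *index, size_t n, long account)
{
    assert(index != NULL || n == 0);
    if (n == 0)
        return -1;
    struct index_entry key = { account, 0 };
    const struct index_entry *hit =
        bsearch(&key, index, n, sizeof *index, cmp_entry);
    return hit ? hit->record : -1;
}
```

Account 012345 must be written as 12345 in C source, because a leading zero makes an integer literal octal.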
File System Design • File System is an organized collection of regular files and directories (mkfs) • Data structures • Directories: file name -> file metadata • Store directories as files • File metadata: how to find file data blocks • Free map: list of free disk blocks
File System Design Constraints • For small files: • Small blocks for storage efficiency • Files used together should be stored together • For large files: • Contiguous allocation for sequential access • Efficient lookup for random access • May not know at file creation • Whether file will become small or large
Design Challenges • Index structure • How do we locate the blocks of a file? • Index granularity • What block size do we use? • Free space • How do we find unused blocks on disk? • Locality • How do we preserve spatial locality? • Reliability • What if machine crashes in middle of a file system op?
File Systems • Traditional FFS-style file systems (Berkeley UNIX, Linux) • Microsoft’s FAT, FAT32 and NTFS file systems • Journaling file systems, ext3 • … others
Microsoft File Allocation Table (FAT) • Linked list index structure • Simple, easy to implement • Still widely used (e.g., thumb drives) • File table: • Linear map of all blocks on disk • Each file a linked list of blocks
FAT • Pros: • Easy to find free block • Easy to append to a file • Easy to delete a file • Cons: • Small file access is slow • Random access is very slow • Fragmentation • File blocks for a given file may be scattered • Files in the same directory may be scattered • Problem becomes worse as disk fills
Berkeley UNIX FFS (Fast File System) • inode table • Analogous to FAT table • inode • Metadata • File owner, access permissions, access times, … • Set of 12 data pointers • With 4KB blocks => up to 48KB reachable through the direct pointers alone
File System Structure • Basic unit for allocating space on the disk is a block • A disk is divided into partitions; each partition holds a file system • File system layout: boot block, super-block, i-node table, data blocks
I-nodes • Each file or directory in the file system has a unique entry in the i-node table, which records: • File type (regular, symbolic link, directory, …) • Owner • Permissions • Timestamps for last access, last modification, last status change • Size • …
[Figure: an i-node entry pointing to data blocks (DB 0, …, DB 5, …)]
FFS inode • Metadata • File owner, access permissions, access times, … • Set of 12 data pointers • With 4KB blocks => 48KB through direct pointers • Indirect block pointer • Pointer to a disk block of data pointers • 4KB block size with 4-byte pointers => 1K data blocks => 4MB • Doubly indirect block pointer • Doubly indirect block => 1K indirect blocks • 4GB (+ 4MB + 48KB) • Triply indirect block pointer • Triply indirect block => 1K doubly indirect blocks • 4TB (+ 4GB + 4MB + 48KB)
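The size limits on this slide can be checked with a few lines of arithmetic. The sketch assumes 12 direct pointers and one single, one double, and one triple indirect pointer, as on the slide; the 4-byte pointer size is implied by "4KB block => 1K pointers". The function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define NDIRECT 12   /* direct pointers per inode, from the slide */

/* Maximum file size in bytes reachable through the direct, single,
   double, and triple indirect pointers of one inode. */
uint64_t ffs_max_file_size(uint64_t block_size, uint64_t ptr_size)
{
    assert(ptr_size > 0 && block_size >= ptr_size);
    uint64_t ppb = block_size / ptr_size;   /* pointers per block */
    uint64_t blocks = NDIRECT               /* direct */
                    + ppb                   /* single indirect */
                    + ppb * ppb             /* double indirect */
                    + ppb * ppb * ppb;      /* triple indirect */
    return blocks * block_size;
}
```

For 4KB blocks and 4-byte pointers this gives 4,402,345,721,856 bytes, which is 4TB + 4GB + 4MB + 48KB.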
FFS Asymmetric Tree • Small files: shallow tree • Efficient storage for small files • Large files: deep tree • Efficient lookup for random access in large files
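The asymmetry can be made concrete by asking how many index blocks must be read to reach a given logical block. The sketch below assumes 12 direct pointers and takes the number of pointers per block as a parameter; the function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

enum { NDIRECT = 12 };   /* direct pointers per inode */

/* Depth of the tree for logical block lbn: 0 = direct, 1 = single
   indirect, 2 = double indirect, 3 = triple indirect, -1 = out of range.
   The depth equals the number of index blocks read before the data. */
int ffs_tree_depth(uint64_t lbn, uint64_t ptrs_per_block)
{
    assert(ptrs_per_block > 0);
    if (lbn < NDIRECT)
        return 0;
    lbn -= NDIRECT;
    uint64_t span = ptrs_per_block;   /* blocks covered by this level */
    for (int level = 1; level <= 3; level++) {
        if (lbn < span)
            return level;
        lbn -= span;
        span *= ptrs_per_block;
    }
    return -1;
}
```

With 1K pointers per block, the first 12 blocks need no index reads, the next 1024 need one, and only blocks past roughly 4GB need three.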
Disk Organization • Boot sector and volume directory at the start of the disk • Blk 0, Blk 1, …, Blk k-1 on Track 0, Cylinder 0 • Blk k, Blk k+1, …, Blk 2k-1 on Track 0, Cylinder 1 • … • Remaining blocks on Track 1, Cylinder 0 through Track N-1, Cylinder M-1
Low-level File System Architecture • Storage is viewed as a sequence of blocks b0, b1, b2, b3, …, bn-1, starting at Block 0 • The device may be a randomly accessed device or a sequential device
Block Management • The job of selecting & assigning storage blocks to the file • Three basic strategies: • Contiguous allocation • Linked lists • Indexed allocation
Contiguous Allocation • Maps the N logical blocks of a file into N contiguous blocks on the secondary storage device • Difficult to support dynamic file sizes • Example file descriptor: head position 237, …, first block 785, number of blocks 25
Linked Lists • Each block contains a header with • Number of bytes in the block • Pointer to next block • Blocks need not be contiguous • Files can expand and contract • Seeks can be slow • Example: the file descriptor holds a first-block pointer and Head: 417; each of Block 0, Block 1, …, Block N-1 holds a Length field followed by Byte 0 … Byte 4095
Indexed Allocation • Extract headers and put them in an index • Simplify seeks • May link indices together (for large files) • Example: the file descriptor holds an index-block pointer and Head: 417; the index block holds the Length entries for Block 0, Block 1, …, Block N-1, each of which stores Byte 0 … Byte 4095
DOS FAT Files • Linked blocks: the file descriptor points to disk block 43, which links to disk block 254, which links to disk block 107 • File Allocation Table (FAT): the links move out of the disk blocks and into the table, so the entry for 43 holds 254 and the entry for 254 holds 107
UNIX Files • inode fields: mode, owner, …, Direct block 0, Direct block 1, …, Direct block 11, Single indirect, Double indirect, Triple indirect • Direct block pointers point straight to data blocks • Single, double, and triple indirect pointers reach data blocks through one, two, and three levels of index blocks
Unallocated Blocks • How should unallocated blocks be managed? • Need a data structure to keep track of them • Linked list • Very large • Hard to manage spatial locality • Block status map (“disk map”) • Bit per block • Easy to identify nearby free blocks • Useful for disk recovery
FFS Locality • Block group allocation • Block group is a set of nearby cylinders • Files in same directory located in same group • Subdirectories located in different block groups • inode table spread throughout disk • inodes, bitmap near file blocks • First fit allocation • Small files fragmented, large files contiguous
FFS • Pros • Efficient storage for both small and large files • Locality for both small and large files • Locality for metadata and data • Cons • Inefficient for tiny files (a 1 byte file requires both an inode and a data block) • Inefficient encoding when file is mostly contiguous on disk (no equivalent to superpages) • Need to reserve 10-20% of free space to prevent fragmentation
NTFS • Master File Table • Flexible 1KB storage for metadata and data • Extents • Block pointers cover runs of blocks • Similar approach in Linux (ext4) • File create can provide hint as to size of file • Journaling for reliability • Discussed next time
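An extent replaces many per-block pointers with one (start, length) pair covering a run of blocks. The following is a generic sketch of extent lookup, not the NTFS or ext4 on-disk format; the struct layout and names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One run of contiguous blocks (hypothetical in-memory layout). */
struct extent {
    long file_block;   /* first logical block covered by the run */
    long disk_block;   /* disk block where the run starts */
    long length;       /* number of blocks in the run */
};

/* Disk block holding logical block lbn, or -1 if no extent covers it. */
long extent_lookup(const struct extent *runs, size_t n, long lbn)
{
    assert(runs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        long off = lbn - runs[i].file_block;
        if (off >= 0 && off < runs[i].length)
            return runs[i].disk_block + off;
    }
    return -1;
}
```

Two extents such as {0, 1000, 8} and {8, 5000, 100} describe a 108-block file with two entries, where per-block pointers would need 108.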