CSE451 Introduction to Operating Systems Spring 2007 Module 17 File System Implementations Gary Kimura & Mark Zbikowski May 2007
Road Map • Recap • OS Structure • Processes, threads, and scheduling • Synchronization • Memory • Storage • Still to come • File system implementations & Caching • Linking and loading • Protection and security • Final June 6th 2:30 to 4:20
The Real Goal of the File System • Should be to give back your data the way you saved it • The question we need to answer is: what is on the disk, where is it on the disk, and how is it maintained?
On-Disk Structures • Many format variations exist. • Why? • Plenty of literature on the Unix/Linux ones (read chapter 9 of TLK to help with project #3) • We’ll spend some of our class time looking at what Windows uses • Most volumes have a starting point: a predefined location on the disk that holds the volume information. This is often not the first sector on the disk • Think of this as the root directory of a volume; it contains • Volume size information • Creation information • Protection information • Label information • Free space and used space information
Common terminology • In discussing disks and file systems we often need to distinguish where a sector is located • Logical Sector Numbers (LSN) identify the continuous stream of sectors presented by the disk driver • Virtual Sector Numbers (VSN) identify the continuous stream of sectors presented by the file system • Besides location there is size • Sectors: the minimum read/write granularity for a disk • Clusters: the minimum unit of allocation used by the file system (it can vary for each volume). A cluster is a multiple of sectors. There are also Logical Cluster Numbers (LCN) and Virtual Cluster Numbers (VCN).
Example: FAT File System • The DOS file system (also called the FAT file system) is a simple structure that sets plenty of bad examples but is easy to understand. As we dissect it I’ll point out some of the shortcomings • On a FAT volume there are four areas of interest • BIOS Parameter Block (BPB): identifies the volume as a FAT file system • File Allocation Table (FAT): used to control the allocation and lookup of clusters for each file • Root Directory: contains entries for each file and directory on the root of the volume • File Data Area: used by the file system to store files and additional sub-directories
The BPB • For a small volume the BPB contains • BytesPerSector: size of a sector on the volume • SectorsPerCluster: number of sectors per cluster • ReservedSectors: number of sectors skipped before the first FAT • Fats: number of FATs on the volume (typically 2 for some bizarre reason) • RootEntries: number of files/directories we can have in the root directory • Sectors: number of sectors on the volume • SectorsPerFat: number of sectors needed to store each copy of the FAT
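The BPB fields are enough to work out where each of the four areas begins. The sketch below is illustrative only: the `BpbInfo` struct and the function names are hypothetical (this is not the packed on-disk layout), and it relies on two conventions the slide does not state, namely that a directory entry is 32 bytes and that the first cluster of the file data area is numbered 2.

```c
#include <assert.h>
#include <stdint.h>

/* Field names follow the slide; the struct itself is a hypothetical
   in-memory copy, not the packed on-disk BPB. */
typedef struct {
    uint32_t BytesPerSector;
    uint32_t SectorsPerCluster;
    uint32_t ReservedSectors;
    uint32_t Fats;
    uint32_t RootEntries;
    uint32_t Sectors;
    uint32_t SectorsPerFat;
} BpbInfo;

enum { DIRENT_SIZE = 32 };  /* bytes per directory entry */

/* LSN of the first sector of the first FAT copy. */
uint32_t first_fat_lsn(const BpbInfo *b) {
    return b->ReservedSectors;
}

/* LSN of the first root directory sector (after all FAT copies). */
uint32_t root_dir_lsn(const BpbInfo *b) {
    return b->ReservedSectors + b->Fats * b->SectorsPerFat;
}

/* Sectors occupied by the fixed-size root directory, rounded up. */
uint32_t root_dir_sectors(const BpbInfo *b) {
    assert(b->BytesPerSector != 0);
    return (b->RootEntries * DIRENT_SIZE + b->BytesPerSector - 1)
           / b->BytesPerSector;
}

/* LSN of the first sector of the file data area. */
uint32_t data_area_lsn(const BpbInfo *b) {
    return root_dir_lsn(b) + root_dir_sectors(b);
}

/* Assumption: data clusters are numbered starting at 2. */
uint32_t cluster_to_lsn(const BpbInfo *b, uint32_t cluster) {
    assert(cluster >= 2);
    return data_area_lsn(b) + (cluster - 2) * b->SectorsPerCluster;
}
```

With the values of a 1.44 MB floppy (512-byte sectors, 1 sector per cluster, 1 reserved sector, 2 FATs of 9 sectors each, 224 root entries), the FAT starts at sector 1, the root directory at sector 19, and the data area at sector 33.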
The FAT • The volume is divided up into clusters, each one with a number (i.e., LCN). • The FAT is logically an array containing LCN values. The size of the FAT is the number of clusters on the volume. • For example, FAT[10] corresponds to the 10th cluster on the volume • If a cluster is free then its FAT entry is 0. • If a cluster is in use then its FAT entry is the LCN of the next cluster in the file, or if it is the last cluster in the file then its value is –1 • Bad clusters also have a distinguished value
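Reading a file therefore means following a linked list threaded through the FAT. A minimal sketch of that walk, using the simplified model from this slide (0 means free, –1 means last cluster); the function name and the use of a signed 32-bit array are assumptions made for illustration, and real FAT variants pack entries into 12, 16, or 32 bits:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT_FREE 0      /* free cluster, per the slide */
#define FAT_END  (-1)   /* last cluster of a file, per the slide */

/* Returns the number of clusters in the chain starting at `first`,
   or -1 if the chain is corrupt: it leaves the table, reaches a free
   entry, or loops back on itself. */
long fat_chain_length(const int32_t *fat, size_t fat_entries, int32_t first) {
    long count = 0;
    int32_t lcn = first;

    assert(fat != NULL);
    while (lcn != FAT_END) {
        if (lcn <= 0 || (size_t)lcn >= fat_entries)
            return -1;                      /* out of range or bad marker */
        if (fat[lcn] == FAT_FREE)
            return -1;                      /* chain runs into free space */
        if ((size_t)++count > fat_entries)
            return -1;                      /* more links than clusters: a loop */
        lcn = fat[lcn];                     /* follow the link */
    }
    return count;
}
```

The corruption checks hint at why FAT volumes are fragile: a single damaged entry can truncate a file, cross-link two files, or create a cycle.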
Directory Entries • On FAT, file names could only be “8.3” (i.e., file names are at most 8 characters and extensions are at most 3 characters). This limits the maximum size of individual directory entries

    typedef struct _PACKED_DIRENT {
        FAT8DOT3       FileName;                 // offset =  0
        UCHAR          Attributes;               // offset = 11
        UCHAR          NtByte;                   // offset = 12
        UCHAR          CreationMSec;             // offset = 13
        FAT_TIME_STAMP CreationTime;             // offset = 14
        FAT_DATE       LastAccessDate;           // offset = 18
        union {
            USHORT     ExtendedAttributes;       // offset = 20
            USHORT     FirstClusterOfFileHi;     // offset = 20
        };
        FAT_TIME_STAMP LastWriteTime;            // offset = 22
        USHORT         FirstClusterOfFile;       // offset = 26
        ULONG32        FileSize;                 // offset = 28
    } PACKED_DIRENT;                             // sizeof = 32
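The 11-byte `FileName` field holds the name and extension as two space-padded parts with no dot stored on disk. As a rough sketch (the function name `format_8dot3` is hypothetical, and case folding and character-set issues are ignored), turning that field back into a printable name could look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Converts an 11-byte, space-padded 8.3 field such as "README  TXT"
   into "README.TXT". `out` needs 13 bytes: 8 + '.' + 3 + NUL. */
void format_8dot3(const unsigned char raw[11], char out[13]) {
    size_t base_len = 8;
    size_t ext_len = 3;
    size_t n = 0;

    assert(raw != NULL && out != NULL);

    /* Trim the trailing space padding from each part. */
    while (base_len > 0 && raw[base_len - 1] == ' ')
        base_len--;
    while (ext_len > 0 && raw[8 + ext_len - 1] == ' ')
        ext_len--;

    memcpy(out, raw, base_len);
    n = base_len;
    if (ext_len > 0) {
        out[n++] = '.';                 /* the dot is implied, not stored */
        memcpy(out + n, raw + 8, ext_len);
        n += ext_len;
    }
    out[n] = '\0';
}
```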
File Attributes • The first byte in the directory entry tells us quite a bit about it

    FAT_DIRENT_NEVER_USED            0x00
    FAT_DIRENT_REALLY_0E5            0x05
    FAT_DIRENT_DIRECTORY_ALIAS       0x2e
    FAT_DIRENT_DELETED               0xe5

• The attribute byte tells us more

    FAT_DIRENT_ATTR_READ_ONLY        0x01
    FAT_DIRENT_ATTR_HIDDEN           0x02
    FAT_DIRENT_ATTR_SYSTEM           0x04
    FAT_DIRENT_ATTR_VOLUME_ID        0x08
    FAT_DIRENT_ATTR_DIRECTORY        0x10
    FAT_DIRENT_ATTR_ARCHIVE          0x20
    FAT_DIRENT_ATTR_DEVICE           0x40
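Putting the two bytes together, a directory scanner can sort each 32-byte entry into a handful of cases. The sketch below uses the constant values from this slide and the attribute offset (11) from the directory entry layout; the `EntryKind` enum and both function names are hypothetical, and long-file-name entries (which reuse the attribute bits) are deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT_DIRENT_NEVER_USED       0x00
#define FAT_DIRENT_REALLY_0E5       0x05
#define FAT_DIRENT_DIRECTORY_ALIAS  0x2e
#define FAT_DIRENT_DELETED          0xe5
#define FAT_DIRENT_ATTR_VOLUME_ID   0x08
#define FAT_DIRENT_ATTR_DIRECTORY   0x10

typedef enum {
    ENTRY_UNUSED,        /* slot was never used */
    ENTRY_DELETED,       /* file was deleted; slot can be reused */
    ENTRY_ALIAS,         /* "." or ".." */
    ENTRY_VOLUME_LABEL,
    ENTRY_DIRECTORY,
    ENTRY_FILE
} EntryKind;

/* `dirent` points at one 32-byte directory entry. */
EntryKind classify_dirent(const uint8_t *dirent) {
    assert(dirent != NULL);
    uint8_t first = dirent[0];
    uint8_t attr = dirent[11];      /* Attributes, offset 11 */

    if (first == FAT_DIRENT_NEVER_USED)      return ENTRY_UNUSED;
    if (first == FAT_DIRENT_DELETED)         return ENTRY_DELETED;
    if (first == FAT_DIRENT_DIRECTORY_ALIAS) return ENTRY_ALIAS;
    if (attr & FAT_DIRENT_ATTR_VOLUME_ID)    return ENTRY_VOLUME_LABEL;
    if (attr & FAT_DIRENT_ATTR_DIRECTORY)    return ENTRY_DIRECTORY;
    return ENTRY_FILE;
}

/* First character of the name, undoing the 0x05 escape that stands in
   for a real 0xe5 (which would otherwise read as "deleted"). */
uint8_t dirent_first_char(const uint8_t *dirent) {
    assert(dirent != NULL);
    return dirent[0] == FAT_DIRENT_REALLY_0E5 ? 0xe5 : dirent[0];
}
```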
Limitations and Extensions • Easy to get corrupt volumes • Volume size limitation in earlier versions of FAT • Directories • The root directory on FAT has a fixed number of directory entries as specified in the BPB. • On older systems before sub-directories this meant that the volume had a maximum number of files • Unordered • Extensions as the years went on • Larger disks meant extending the BPB • Long file names meant hacking away at directory entries • Other hacks too hideous to even write about
A Brief History of NTFS • In its early years (before official public release) Windows NT only supported FAT/DOS and HPFS (from OS/2) • Early in its development NTFS was called FRFS (Ugh!) • NTFS was designed with features tailored specifically for NT • It was done originally by 4 software engineers who had earlier implemented FAT and HPFS for NT. These four developers also did the cache manager and major parts of NT’s kernel-mode runtime libraries • NTFS has sprouted additional features since its release in 1993. Most of these new features have been forward compatible.
NTFS Features • The basic model is that everything is a file • The master file table (MFT) describes each file on the volume including itself • Bitmap for allocation (of both clusters and MFT records) • Retrieval pointer information is stored in a compact form • Directories are B+ trees – collated by file name • Multiple (or alternate) data streams per file • Recoverable meta-data using a logging/journaling file • Hard links and reparse points • Compressed and sparse data files • 64 bit system • NTFS is designed for 64 bits • Volume and file sizes are stored as 64 bits • NTFS and Windows NT in general also store time as 64 bits with a 100 ns resolution starting at 1601
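Windows documents its 64-bit file times as counts of 100-nanosecond intervals since January 1, 1601 (UTC). Converting to and from Unix time is then one division and one offset. The function and macro names below are made up for illustration; the offset of 11,644,473,600 seconds is the gap between the 1601 and 1970 epochs.

```c
#include <assert.h>
#include <stdint.h>

#define NT_TICKS_PER_SECOND       10000000LL     /* 100 ns ticks */
#define NT_TO_UNIX_EPOCH_SECONDS  11644473600LL  /* 1601-01-01 to 1970-01-01 */

/* Whole seconds since 1970 for a given NT timestamp (sub-second
   ticks are discarded). */
int64_t nt_time_to_unix_seconds(int64_t nt_time) {
    assert(nt_time >= 0);
    return nt_time / NT_TICKS_PER_SECOND - NT_TO_UNIX_EPOCH_SECONDS;
}

/* NT timestamp for a given count of seconds since 1970. */
int64_t unix_seconds_to_nt_time(int64_t unix_seconds) {
    return (unix_seconds + NT_TO_UNIX_EPOCH_SECONDS) * NT_TICKS_PER_SECOND;
}
```

A signed 64-bit count of 100 ns ticks covers roughly 29,000 years, which is why the format has not needed a rollover fix.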
Where to start on an NTFS disk • The NTFS volume starts with a boot sector at LBN=0, and a duplicate boot sector at LBN=(number of sectors on the partition div 2).† So a disk with N sectors starts with two boot sectors as illustrated.

    0                    N/2                  N
    +-----------+-------+------------+-------+------------+
    |BootSector |  ...  | BootSector |  ...  |            |
    +-----------+-------+------------+-------+------------+

• †In later versions this changed to LBN=N and not N/2.
Structure of the MFT • The master file table contains the file record segments for every file on the volume. The first 16 or so file record segments are reserved for special files. User file records start at file record #16.

      0   1   2   3   4   5   6   7   8   9   ...
    +---+---+---+---+---+---+---+---+---+---+-----+
    | M | M | L | V | A | R | B | B | B | Q |     |
    | f | f | o | o | t | o | i | o | a | u |     |
    | t | t | g | l | t | o | t | o | d | o |     |
    |   | 2 | F | D | r | t | M | t | C | t | ... |
    |   |   | i | a | D | D | a |   | l | a |     |
    |   |   | l | s | e | i | p |   | u |   |     |
    |   |   | e | d | f | r |   |   | s |   |     |
    +---+---+---+---+---+---+---+---+---+---+-----+

(Reading down each column: Mft, Mft2, LogFile, VolDasd, AttrDef, RootDir, BitMap, Boot, BadClus, Quota.)
Structure of a File Record • Each file record is a fixed size and is used to store the meta data for the file in a packed form where each tag starts with a [type, size] pair. • Name • Dates • Protection • Data streams (including size and retrieval information) • Indexes • The MFT File Record also contains a bitmap for allocating and freeing file records
Attributes • File data is stored in NTFS file records in what are called “attributes” • Attributes are either resident or nonresident depending on the size of the data and room within the file record • In a resident attribute the data is actually stored within the file record • If the data is nonresident then the file record essentially contains [vcn, lcn, size] triples describing where the data is actually stored on the disk • And that is the essence of NTFS in a nutshell. • One early extension was to add compressed and sparse files
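For a nonresident attribute, finding the disk location of a given part of the file comes down to searching those triples. A minimal sketch follows, with a hypothetical `Run` struct and function name; the real on-disk encoding is more compact than this, and a real driver would likely search sorted runs rather than scan linearly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One [vcn, lcn, size] triple: `size` clusters of the file, starting at
   virtual cluster `vcn`, live on disk starting at logical cluster `lcn`. */
typedef struct {
    int64_t vcn;
    int64_t lcn;
    int64_t size;
} Run;

/* Returns the LCN backing `vcn`, or -1 if no run covers it. */
int64_t vcn_to_lcn(const Run *runs, size_t count, int64_t vcn) {
    assert(runs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (vcn >= runs[i].vcn && vcn < runs[i].vcn + runs[i].size)
            return runs[i].lcn + (vcn - runs[i].vcn);
    }
    return -1;  /* unmapped */
}
```

Compared with a FAT chain, a contiguous file of any length needs only one triple, and a lookup never has to walk cluster by cluster.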
Compressed Files • Data compression occurs on an individual attribute basis • Uses a patented compression format called LZNT1 • Allows for quick random read/write access to the file data • On a write operation NTFS attempts to compress every 16 clusters; if the result is less than 16 clusters then the compressed data is written to disk. • On a read operation NTFS decompresses and buffers (as a mapped file) every 16 clusters
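The write-side rule can be expressed as a small calculation. The sketch below covers only the space accounting, not LZNT1 itself; the function names are hypothetical, and the cluster size is a parameter because it varies per volume.

```c
#include <assert.h>
#include <stdint.h>

enum { COMPRESSION_UNIT_CLUSTERS = 16 };

/* Clusters needed to hold `bytes` bytes, rounded up. */
uint32_t clusters_for_bytes(uint32_t bytes, uint32_t cluster_size) {
    assert(cluster_size > 0);
    return (bytes + cluster_size - 1) / cluster_size;
}

/* Nonzero if a 16-cluster unit that compressed down to
   `compressed_bytes` is worth storing in compressed form, i.e. it
   saves at least one whole cluster. */
int store_compressed(uint32_t compressed_bytes, uint32_t cluster_size) {
    return clusters_for_bytes(compressed_bytes, cluster_size)
           < COMPRESSION_UNIT_CLUSTERS;
}

/* Clusters actually written for one unit under that rule. */
uint32_t clusters_written(uint32_t compressed_bytes, uint32_t cluster_size) {
    return store_compressed(compressed_bytes, cluster_size)
               ? clusters_for_bytes(compressed_bytes, cluster_size)
               : COMPRESSION_UNIT_CLUSTERS;
}
```

With 4 KB clusters, a unit that compresses to 20,000 bytes occupies 5 clusters instead of 16, while one that only shrinks to 61,441 bytes still rounds up to 16 clusters and is stored uncompressed.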
Sparse Files • Sparse files essentially follow the same paradigm as compressed files, but special-case the situation where the data is all zeros • The implication with compressed and sparse files is that the actual storage on the disk can be less than the actual file size • Therefore it is possible to have a file larger than your disk • And writing into the middle of a file can fail with an out of space error
Utilities • Format • Lay down the initial volume structure on the disk • Sometimes also does low-level media formatting • Error checking and correcting utilities • Chkdsk, scandisk, fsck, … • Backup and restore • Other disk management utilities • Volume management • Defragmenters and compactors • Indexing
File System Driver Implementation Issues • In Windows the file system driver stores in memory a partial representation of the directory and file structure found on the disk, starting with the root directory. • These in-memory data structures are • Needed for volume management • Fast allocation of disk space • Concurrent access between processes • Needed to manage opened files and directories • Needed for each opened handle
Memory Mapped Files • Two paradigms for accessing data in a file • Read and write calls • Memory mapped files • With memory mapped files an allocated region of memory is mapped to a particular offset in a file • The user can “window” through the file by changing the offset of the mapping in the file • The memory manager (MM) usually handles faulting in the data and writing dirty data using its demand paging logic
Software Caching • The idea is to keep user data and meta data in main memory to reduce the number of actual disk accesses. • There is Logical Caching and Virtual Caching. A logical cache tags its entries with logical disk blocks; a virtual cache tags them with virtual blocks within a file • There are write-through and write-behind strategies. • Windows uses a virtual cache with write-behind • How much address space and physical space to dedicate for the cache is an issue • Older systems used a statically sized cache • It is possible to use a dynamically sized cache
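To make the write-behind idea concrete, here is a toy virtual-block cache for a single file. Everything in it is an assumption made for illustration (the `FakeDisk` stand-in, the tiny sizes, the direct-mapped slot choice); it is not how the Windows cache manager is built. Writes only mark a slot dirty, and the disk is touched when a dirty slot is evicted or the cache is flushed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK_SIZE = 16, CACHE_SLOTS = 4, DISK_BLOCKS = 32 };

/* Stand-in for the disk, with counters so I/O can be observed. */
typedef struct {
    uint8_t blocks[DISK_BLOCKS][BLOCK_SIZE];
    unsigned reads;
    unsigned writes;
} FakeDisk;

typedef struct {
    int valid;
    int dirty;
    uint32_t vbn;               /* virtual block number within the file */
    uint8_t data[BLOCK_SIZE];
} Slot;

typedef struct {
    FakeDisk *disk;
    Slot slots[CACHE_SLOTS];
} Cache;

void cache_init(Cache *c, FakeDisk *d) {
    memset(c, 0, sizeof *c);
    c->disk = d;
}

/* Push a dirty slot out to the disk; clean or empty slots cost nothing. */
static void write_back(Cache *c, Slot *s) {
    if (s->valid && s->dirty) {
        memcpy(c->disk->blocks[s->vbn], s->data, BLOCK_SIZE);
        c->disk->writes++;
        s->dirty = 0;
    }
}

/* Find the slot for `vbn`, loading it from disk on a miss.
   Direct-mapped: block vbn can only live in slot vbn % CACHE_SLOTS. */
static Slot *get_slot(Cache *c, uint32_t vbn) {
    assert(vbn < DISK_BLOCKS);
    Slot *s = &c->slots[vbn % CACHE_SLOTS];
    if (!s->valid || s->vbn != vbn) {
        write_back(c, s);       /* evict whatever was there */
        memcpy(s->data, c->disk->blocks[vbn], BLOCK_SIZE);
        c->disk->reads++;
        s->valid = 1;
        s->dirty = 0;
        s->vbn = vbn;
    }
    return s;
}

void cache_write(Cache *c, uint32_t vbn, const uint8_t *src) {
    Slot *s = get_slot(c, vbn);
    memcpy(s->data, src, BLOCK_SIZE);
    s->dirty = 1;               /* write-behind: the disk is not touched yet */
}

void cache_read(Cache *c, uint32_t vbn, uint8_t *dst) {
    memcpy(dst, get_slot(c, vbn)->data, BLOCK_SIZE);
}

/* Write every dirty slot to disk, e.g. from a periodic lazy writer. */
void cache_flush(Cache *c) {
    for (int i = 0; i < CACHE_SLOTS; i++)
        write_back(c, &c->slots[i]);
}
```

A write-through variant would call `write_back` at the end of every `cache_write`, trading extra disk writes for a smaller window in which a crash can lose data.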
Programming Considerations • Smallish directory size for quicker enumeration and opens • Short names were needed for backwards compatibility but in the end were a performance nightmare • Delayed closing of files for quicker reopening of files • Relative opens for quicker open calls (and doing a prefix lookup for common directories) • What to do to avoid or cause wasted MFT space • Fragmented disk and files • Multiple named data streams • Removable media (a real Pandora’s box) • Change notify calls