UNIX File Systems (based on Chap. 4 of the book “The Design of the UNIX Operating System”) Acknowledgement: Soongsil Univ. presentation materials; lecture notes of Prof. Jongmoo Choi (최종무), Dankook University
Hard Disk Drive Structure • Characteristics • Seek time, rotational latency, transmission time • Disk Scheduling?, cylinder group?, Buffering?
Flash Memory • Limitations • Block erasure: data is erased in units of blocks • Memory wear: the number of erase cycles is limited • Read disturb • Seek optimization is unnecessary • Requires strong reliability (e.g. power failure recovery)
File System • The system in which files are named and where they are placed logically for storage and retrieval • A scheme for storing and organizing files and data on a computer so that they can be found and accessed (read/write) easily • Properties • Hierarchical structure • Ability to create, read/write, and delete files • Dynamic growth of files • Protection of file data
What a File System Does • Support file interfaces to the user • File interface: open, close, read, write, … • Directory manipulation: mkdir, cd, rmdir, … • File system interfaces: mount, mkfs, fsck, … • File management • i-node, FAT • Access control, attribute control, … • Storage management • Disk management (block allocation, I/O handling, …)
File Systems by OS • UNIX: UNIX File System • LINUX: EXT2, EXT3, EXT4 • Windows: NTFS • OS X (Apple Mac OS): HFS+ • Traditional OS (MS-Windows 9x): FAT32
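FAT • Size limitations of FAT32 • Partitions larger than 32GB cannot be created (a limit of the Windows format tool) • Files larger than 4GB cannot be created • Simple and fast • Good compatibility (usable on most OSes)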
FAT File Allocation Table
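NTFS • Default file system of the MS-Windows OS family • Security (access permissions and encryption) • Reliability (automatic recovery from file system errors) • Storage growth • Support for large volume and file sizes (up to 16EB) • Lower compatibility (read-only on Mac OS; on LINUX, support varies by distribution/version)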
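UNIX File System Overview • Boot block: stores the information needed for booting • Super block: stores information about the file system as a whole • i-list: one i-node is allocated per file and stores that file's information • Data blocks: store the actual file contents in units of blocks • Block: the unit in which data is read or written at one time (typically 1KB~4KB)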
File System Layout • The i-node of the root directory is fixed in advance.
File Sharing • Each process has a “user file descriptor table” (or “per-process open file table”) • Each entry in the file descriptor table is a pointer to an entry in the system-wide “open file table” • Each entry in the open file table contains a file offset (file pointer) and a pointer to an entry in the “memory-resident i-node table” • If a process opens an already-open file, a new open file table entry is created (with a new file offset), pointing to the same entry in the memory-resident i-node table • If a process forks, the child gets a copy of the file descriptor table, whose entries point to the same open file table entries (and thus share the same file offset)
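The three tables can be modelled in a few lines to make the difference between "open twice" and "fork" concrete. This is a minimal sketch, not kernel code: the class names `Kernel`, `Process`, `OpenFileEntry` and `InCoreInode`, and the helper `read`, are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class InCoreInode:
    number: int
    ref_count: int = 0      # how many open file table entries point here


@dataclass
class OpenFileEntry:
    inode: InCoreInode
    offset: int = 0         # file offset lives here, not in the process
    ref_count: int = 1      # how many descriptor slots point here


class Process:
    def __init__(self):
        self.fd_table = []  # per-process table of pointers to OpenFileEntry


class Kernel:
    def __init__(self):
        self.inode_table = {}   # memory-resident inode table (hypothetical model)

    def open(self, proc, inode_number):
        # Every open() creates a NEW open file table entry with its own offset.
        inode = self.inode_table.setdefault(inode_number, InCoreInode(inode_number))
        inode.ref_count += 1
        proc.fd_table.append(OpenFileEntry(inode))
        return len(proc.fd_table) - 1

    def fork(self, parent):
        # The child copies the pointers, so parent and child SHARE each entry.
        child = Process()
        child.fd_table = list(parent.fd_table)
        for entry in child.fd_table:
            entry.ref_count += 1
        return child


def read(proc, fd, nbytes):
    # Reading only advances the offset in this simplified model.
    proc.fd_table[fd].offset += nbytes
```

Two processes that open the same file get independent offsets but one shared in-core inode, while a forked child advances the same offset as its parent.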
EXT2 file system • Default LINUX file system • Ext2 features • Introduces block groups: a file's blocks are kept in the same block group (the same cylinder on the hard disk) where possible, improving performance (disk seek time, fragmentation) • Ext3 features • Introduces journaling: logging and fast recovery • Ext4 features • Support for large files
Table of Contents • Inodes • Structure of a regular file • Directories • Conversion of a path name to an Inode • Super block • Inode assignment to a new file • Allocation of disk blocks • Other file types • Summary
Summary • An inode is the data structure that describes the attributes of a file, including the layout of its data on disk. • Two versions of the inode • Disk copy : stores the inode information when the file is not in use • In-core copy : records information about active files • Inode disk blocks • Directories : files that correlate file name components to inode numbers • namei : converts file names to inodes • Inode assignment • Disk block assignment
Definition of Inodes • Every file has a unique inode • Contain the information necessary for a process to access a file • Exist in a static form on disk • Kernel reads them into an in-core inode to manipulate them.
Contents of Disk Inodes • File owner identifier (individual/group owner) • File type (regular, directory, …) • File access permissions (owner, group, other) • File access times • Number of links to the file • Table of contents for the disk addresses of data in a file (users see a byte stream; the kernel stores it in discontiguous disk blocks) • File size * The inode does not specify the path name(s) used to access the file
Sample Disk Inode • Owner: mjb • Group: os • Type: regular file • Perms: rwxr-xr-x • Accessed: Oct 23 1984 1:45 P.M. • Modified: Oct 22 1984 10:30 A.M. • Inode: Oct 23 1984 1:30 P.M. • Size: 6030 bytes • Disk addresses
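On a current POSIX system most of the fields listed for the disk inode can be read back with `stat`. The following is a rough sketch using Python's standard `os.stat`; the function name `describe_inode` and the dictionary keys are illustrative choices, and the disk address table is not shown because `stat` does not expose it.

```python
import os
import stat
import tempfile
import time


def describe_inode(path):
    """Collect the inode fields that os.stat exposes for one file."""
    st = os.stat(path)
    if stat.S_ISREG(st.st_mode):
        ftype = "regular file"
    elif stat.S_ISDIR(st.st_mode):
        ftype = "directory"
    else:
        ftype = "other"
    return {
        "inode number": st.st_ino,
        "owner uid": st.st_uid,
        "group gid": st.st_gid,
        "type": ftype,
        "perms": stat.filemode(st.st_mode),      # e.g. "-rwxr-xr-x"
        "links": st.st_nlink,
        "size": st.st_size,
        "accessed": time.ctime(st.st_atime),
        "modified": time.ctime(st.st_mtime),     # file data last changed
        "inode changed": time.ctime(st.st_ctime),  # inode last changed (POSIX)
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        sample = os.path.join(d, "sample")
        with open(sample, "wb") as f:
            f.write(b"\0" * 6030)
        os.chmod(sample, 0o755)
        for key, value in describe_inode(sample).items():
            print(f"{key:14} {value}")
```

Note that the path name appears only as the argument to `os.stat`; nothing in the returned structure records it, which matches the remark that the inode does not store the path.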
Distinction Between Writing an Inode and Writing a File • File contents change only when the file is written. • The inode changes when the file contents change, or when the file's owner, permissions, or link settings change. • Changing a file implies a change to the inode, • but changing the inode does not imply that the file contents change.
Contents of the In-core Copy of the Inode • Fields of the disk inode • Status of the in-core inode, indicating whether • the inode is locked • a process is waiting for the inode to become unlocked • the in-core inode differs from the disk copy as a result of a change to the data in the inode • the in-core inode differs from the disk copy as a result of a change to the file data • Inode number (disk inodes are stored in a linear array, so the disk inode does not need this field) • Reference count
Inode Lock and Reference Count • The kernel manipulates them independently • Inode lock • Set during execution of a system call to prevent other processes from accessing the inode while it is in use • The kernel releases the lock at the conclusion of the system call • Reference count • The kernel increments it when a reference becomes active and decrements it when the reference becomes inactive • Prevents the kernel from reallocating an active in-core inode
Inode Data Blocks • Entries in the inode's table of contents: direct 0 through direct 9, single indirect, double indirect, triple indirect (Figure: Direct and Indirect Blocks in Inode)
Byte Capacity of a File • System V UNIX. Assume that • The inode's table of contents has 13 entries (10 direct, 1 single indirect, 1 double indirect, 1 triple indirect) • 1 logical block : 1K bytes • Block number address : a 32-bit (4-byte) integer • 1 block can hold up to 256 block numbers (1024 bytes / 4 bytes) • 10 direct blocks with 1K bytes each = 10K bytes • 1 indirect block with 256 direct blocks = 1K*256 = 256K bytes • 1 double indirect block with 256 indirect blocks = 256K*256 = 64M bytes • 1 triple indirect block with 256 double indirect blocks = 64M*256 = 16G bytes • Maximum size of a file : 4G bytes (2^32), if the file size field in the inode is 32 bits * Refer to the link on our class webpage for the case of current UNIX file systems.
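The arithmetic on this slide can be checked mechanically. A small sketch under the slide's assumptions (1K-byte blocks, 4-byte block numbers, 10 direct entries); the constant and function names are illustrative.

```python
BLOCK_SIZE = 1024      # bytes per logical block (slide assumption)
ADDR_SIZE = 4          # bytes per block number (32-bit integer)
N_DIRECT = 10          # direct entries in the inode
SIZE_FIELD_BITS = 32   # width of the file size field in the inode


def capacity_by_level():
    """Bytes addressable through each kind of inode entry."""
    per_block = BLOCK_SIZE // ADDR_SIZE          # 256 block numbers per block
    return {
        "direct": N_DIRECT * BLOCK_SIZE,
        "single indirect": per_block * BLOCK_SIZE,
        "double indirect": per_block ** 2 * BLOCK_SIZE,
        "triple indirect": per_block ** 3 * BLOCK_SIZE,
    }


def max_file_size():
    """The smaller of what the block pointers and the size field allow."""
    addressable = sum(capacity_by_level().values())
    return min(addressable, 2 ** SIZE_FIELD_BITS)


if __name__ == "__main__":
    for level, nbytes in capacity_by_level().items():
        print(f"{level:16} {nbytes:>14,} bytes")
    print(f"{'max file size':16} {max_file_size():>14,} bytes")
```

The block pointers alone could address a little over 16G bytes, so under these assumptions the 32-bit size field, not the pointer structure, is the binding limit.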
Byte Offset and Block Number • Processes access data in a file by byte offset. • The file starts at logical block 0 and continues to the logical block number corresponding to the file size • The kernel accesses the inode and converts the logical file block into the appropriate disk block
Conversion of Byte Offset to Block Number
Algorithm bmap /* block map of logical file byte offset to file system block */
Input : inode, byte offset
Output : (1) block number in file system, (2) byte offset into block, (3) bytes of I/O in block, (4) read ahead block number
{
    calculate logical block number in file from byte offset;
    calculate start byte in block for I/O; /* output 2 */
    calculate number of bytes to copy to user; /* output 3 */
    check if read-ahead applicable, mark inode; /* output 4 */
    determine level of indirection;
    while(not at necessary level of indirection){
        calculate index into inode or indirect block from logical block number in file;
        get disk block number from inode or indirect block;
        release buffer from previous disk read, if any (algorithm brelse);
        if(no more levels of indirection)
            return (block number);
        read indirect disk block (algorithm bread);
        adjust logical block number in file according to level of indirection;
    }
}
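Block Layout of a Sample File and Its Inode • Inode table of contents (entries 0 to 12): 4096, 228, 45423, 0, 0, 11111, 0, 101, 367, 0, 428 (single indirect), 9156 (double indirect), 824 (triple indirect) • Byte 9000 in the file -> logical block 8 (direct entry 8) -> disk block 367, 808th byte in the block • Byte 350,000 in the file -> past the direct and single indirect range (10K+256K), so entry 11 (double indirect) -> block 9156, entry 0 -> single indirect block 331, entry 75 -> data block 3333, 816th byte in the block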
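The bmap walk can be traced on the sample inode with a short simulation. This is a simplified sketch, not the kernel algorithm: it omits buffers (bread/brelse) and read-ahead, returns the bytes remaining in the block as an upper bound for output 3, and models indirect blocks as a hypothetical dictionary `disk` that maps a block number to its non-zero entries.

```python
BLOCK_SIZE = 1024
PTRS_PER_BLOCK = 256   # 1024-byte block / 4-byte block number
N_DIRECT = 10


def bmap(inode_addrs, disk, offset):
    """Map a byte offset to (disk block, byte in block, bytes left in block).

    A returned disk block of 0 means the offset falls in a hole.
    """
    logical = offset // BLOCK_SIZE
    byte_in_block = offset % BLOCK_SIZE
    bytes_left = BLOCK_SIZE - byte_in_block

    if logical < N_DIRECT:                      # direct block: no indirection
        return inode_addrs[logical], byte_in_block, bytes_left

    # Determine the level of indirection (1 = single, 2 = double, 3 = triple).
    logical -= N_DIRECT
    level, span = 1, PTRS_PER_BLOCK
    while logical >= span and level <= 3:
        logical -= span
        span *= PTRS_PER_BLOCK
        level += 1
    if level > 3:
        raise ValueError("offset beyond triple indirect range")

    block = inode_addrs[N_DIRECT + level - 1]
    while level > 0:
        if block == 0:                          # unallocated indirect block
            return 0, byte_in_block, bytes_left
        span //= PTRS_PER_BLOCK                 # data blocks covered per entry
        index = logical // span
        logical %= span
        block = disk.get(block, {}).get(index, 0)
        level -= 1
    return block, byte_in_block, bytes_left


# Sample inode from the slide; only the entries the example needs are modelled.
SAMPLE_INODE = [4096, 228, 45423, 0, 0, 11111, 0, 101, 367, 0, 428, 9156, 824]
SAMPLE_DISK = {9156: {0: 331}, 331: {75: 3333}}

if __name__ == "__main__":
    print(bmap(SAMPLE_INODE, SAMPLE_DISK, 9000))
    print(bmap(SAMPLE_INODE, SAMPLE_DISK, 350_000))
```

For byte 350,000 the code subtracts the 10 direct blocks and the 256 single indirect blocks from logical block 341, leaving block 75 of the double indirect range, which is why entry 0 of block 9156 and entry 75 of block 331 are used.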
Block Entry in the Inode is 0 • The logical block contains no data. • The process never wrote data into the file at that byte offset • No disk space is wasted on such blocks • Caused by using the lseek and write system calls to write past the end of the file
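A file with such holes can be produced from user space by seeking past the end and writing. A minimal sketch using Python's `os.lseek` and `os.write`; the helper name `make_sparse_file` is hypothetical, and whether the hole actually consumes no disk blocks depends on the underlying file system.

```python
import os
import tempfile


def make_sparse_file(path, hole_bytes, payload=b"x"):
    """Seek hole_bytes past the start of a new file, then write payload."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.lseek(fd, hole_bytes, os.SEEK_SET)   # skip over never-written bytes
        os.write(fd, payload)
    finally:
        os.close(fd)
    return os.stat(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        st = make_sparse_file(os.path.join(d, "sparse"), 1_000_000)
        print("logical size:", st.st_size)
        # On POSIX systems st_blocks counts 512-byte units actually allocated;
        # on file systems that support holes it is usually far below the size.
        print("allocated 512-byte units:", getattr(st, "st_blocks", "n/a"))
```

Reading from the hole returns zero bytes, even though no data block was ever written for that range.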
Directories • A directory is a file • Its data is a sequence of entries, each consisting of an inode number and the name of a file contained in the directory • Path name is a null terminated character string divided by “/” • Each component except the last must be the name of a directory, last component may be a non-directory file
Directory Layout for /etc Each entry : inode number and filename
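A directory's data can be pictured as a flat array of fixed-size records. The sketch below assumes the classic System V layout of 16-byte entries (a 2-byte inode number followed by a 14-byte, null-padded name); the little-endian byte order and the sample entries are assumptions made only for illustration.

```python
import struct

# One directory entry: 2-byte inode number + 14-byte name (assumed layout).
ENTRY = struct.Struct("<H14s")


def pack_directory(entries):
    """Serialize (inode number, name) pairs into directory file data."""
    return b"".join(ENTRY.pack(ino, name.encode("ascii")) for ino, name in entries)


def lookup(dir_data, name):
    """Linear search of the directory data; inode number 0 marks an empty slot."""
    for offset in range(0, len(dir_data), ENTRY.size):
        ino, raw = ENTRY.unpack_from(dir_data, offset)
        if ino != 0 and raw.rstrip(b"\0").decode("ascii") == name:
            return ino
    return None


if __name__ == "__main__":
    # Hypothetical contents; the numbers are illustrative only.
    data = pack_directory([(83, "."), (2, ".."), (1798, "init"), (0, "removed"), (1276, "fsck")])
    print(len(data), "bytes of directory data")
    print("init ->", lookup(data, "init"))
    print("removed ->", lookup(data, "removed"))
```

Because every entry has the same size, the byte offset of entry n is simply n times the entry size, which is what lets the kernel scan a directory block by block.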
Directories : How • How do you find files? • Read the directory, search for the name you want (checking for wildcards) • How do you list files (ls)? • Read the directory contents, print the name field • How do you list file attributes (ls -l)? • Read the directory contents, read each entry's inode, print name + attributes
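The same two steps (read the directory, then consult each inode) are visible from user space. A rough sketch using Python's standard `os.scandir` and `stat`; the function names `ls` and `ls_l` are illustrative and only a few attributes are shown.

```python
import os
import stat
import tempfile


def ls(path):
    """Names only: needs nothing but the directory contents."""
    return sorted(entry.name for entry in os.scandir(path))


def ls_l(path):
    """Names plus attributes: one inode lookup (stat) per directory entry."""
    rows = []
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        st = entry.stat(follow_symlinks=False)
        rows.append((stat.filemode(st.st_mode), st.st_nlink, st.st_size, entry.name))
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "a"), "wb") as f:
            f.write(b"hello")
        os.mkdir(os.path.join(d, "b"))
        print(ls(d))
        for row in ls_l(d):
            print(*row)
```

`os.scandir` does not report the "." and ".." entries, so they are absent from both listings here.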
Algorithm for Conversion of a Path Name to an Inode
Algorithm namei /* convert path name to inode */
Input : path name
Output : locked inode
{
    if(path name starts from root)
        working inode = root inode (algorithm iget);
    else
        working inode = current directory inode (algorithm iget);
    while(there is more path name){
        read next path name component from input;
        verify that working inode is of directory, access permissions OK;
        if(working inode is of root and component is “..”)
            continue; /* loop back to while */
        read directory (working inode) by repeated use of algorithms bmap, bread and brelse;
        if(component matches an entry in directory (working inode)){
            get inode number for matched component;
            release working inode (algorithm iput);
            working inode = inode of matched component (algorithm iget);
        }
        else /* component not in directory */
            return (no inode);
    }
    return (working inode);
}
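The control flow of namei can be exercised on a toy in-memory file system. This is a sketch only: inode locking, permission checks and the iget/iput/bmap/bread calls are left out, the root inode number `ROOT_INO = 2` is an assumption, and the `inodes` dictionary layout is hypothetical.

```python
ROOT_INO = 2   # assumed root inode number for this toy model


def namei(inodes, path, cwd_ino=ROOT_INO):
    """Resolve path to an inode number, or None if a component is missing."""
    working = ROOT_INO if path.startswith("/") else cwd_ino
    for component in path.split("/"):
        if component == "":
            continue                      # skip empty pieces from "/" or "//"
        node = inodes[working]
        if node["type"] != "dir":
            return None                   # non-final component is not a directory
        if working == ROOT_INO and component == "..":
            continue                      # ".." in the root stays in the root
        matched = node["entries"].get(component)
        if matched is None:
            return None                   # component not in directory
        working = matched
    return working


# Toy file system: /etc/passwd
TOY_FS = {
    2: {"type": "dir", "entries": {".": 2, "..": 2, "etc": 5}},
    5: {"type": "dir", "entries": {".": 5, "..": 2, "passwd": 9}},
    9: {"type": "file"},
}

if __name__ == "__main__":
    for p in ["/etc/passwd", "/etc/../etc/passwd", "/../etc", "/etc/passwd/x", "/nope"]:
        print(p, "->", namei(TOY_FS, p))
```

Each loop iteration replaces the working inode with the inode of the matched component, mirroring the iput/iget pair in the pseudocode.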
Super block • File system layout: boot block | super block | inode list | data blocks • Super block consists of • the size of the file system • the number of free blocks in the file system • a list of free blocks available on the file system • the index of the next free block in the free block list • the size of the inode list • the number of free inodes in the file system • a list of free inodes in the file system • the index of the next free inode in the free inode list • lock fields for the free block and free inode lists • a flag indicating that the super block has been modified
Inode Assignment to a New File • The super block contains an array that stores the index numbers of free inodes in the file system • Remembered inode • The starting point for the next disk search: apart from the inodes already in the super block free list, no free inode has a lower number than this one • Reduces free inode search time
Algorithm for Assigning New Inodes
Algorithm ialloc /* allocate inode */
Input : file system
Output : locked inode
{
    while(not done){
        if(super block locked){
            sleep(event super block becomes free);
            continue;
        }
        if(inode list in super block is empty){
            lock super block;
            get remembered inode for free inode search;
            search disk for free inodes until super block full, or no more free inodes (algorithms bread and brelse);
            unlock super block;
            wake up (event super block becomes free);
            if(no free inodes found on disk)
                return (no inode);
            set remembered inode for next free inode search;
        }
        /* there are inodes in super block inode list */
        get inode number from super block inode list;
        get inode (algorithm iget);
        if(inode not free after all){
            write inode to disk;
            release inode (algorithm iput);
            continue; /* while loop */
        }
        /* inode is free */
        initialize inode;
        write inode to disk;
        decrement file system free inode count;
        return (inode);
    } /* end of while */
}
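The refill-then-pop behaviour of ialloc can be sketched with a single-threaded model. Assumptions to note: super block locking and sleep/wakeup are omitted, free disk inodes are modelled as a Python set, the class name `FileSystem` and its fields are hypothetical, and the list is kept in descending order so that the lowest-numbered inode is handed out first, as in the slide's figures.

```python
class FileSystem:
    def __init__(self, n_inodes, list_size, free, remembered=1):
        self.n_inodes = n_inodes
        self.list_size = list_size          # capacity of the super block list
        self.free_on_disk = set(free)       # disk inodes currently marked free
        self.free_list = []                 # super block free inode list
        self.remembered = remembered        # where the next disk search starts
        self.free_count = len(self.free_on_disk)

    def _refill(self):
        """Search the disk from the remembered inode until the list is full."""
        found = []
        for ino in range(self.remembered, self.n_inodes + 1):
            if ino in self.free_on_disk:
                found.append(ino)
                if len(found) == self.list_size:
                    break
        if found:
            self.free_list = sorted(found, reverse=True)
            self.remembered = found[-1]     # highest inode number found
        return bool(found)

    def ialloc(self):
        """Return a newly allocated inode number, or None if none are free."""
        while True:
            if not self.free_list and not self._refill():
                return None
            ino = self.free_list.pop()      # lowest number sits at the end
            if ino not in self.free_on_disk:
                continue                    # inode not free after all: retry
            self.free_on_disk.remove(ino)   # "initialize inode; write to disk"
            self.free_count -= 1
            return ino


if __name__ == "__main__":
    fs = FileSystem(n_inodes=700, list_size=4,
                    free={471, 475, 476, 535, 600}, remembered=470)
    print([fs.ialloc() for _ in range(6)])
```

Starting the search at the remembered inode is what saves work: inode blocks below it are never re-read during a refill.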
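Assigning Free Inode from Middle of List • Before: the super block free inode list holds free inodes …, 83, 48 in array slots 18 and 19; slot 20 is empty; the index points at the next free inode (48) • After inode 48 is assigned: the list holds …, 83 in slot 18; slots 19 and 20 are empty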
Assigning Free Inode – Super Block List Empty • Before: the super block free inode list is empty apart from the remembered inode 470 in slot 0; index = 0 • After the disk search: the list is refilled with free inodes 535 (slot 0), …, 476, 475, 471 (slots 48, 49, 50)
Algorithm for Freeing Inode
Algorithm ifree /* inode free */
Input : file system inode number
Output : none
{
    increment file system free inode count;
    if(super block locked)
        return;
    if(inode list full){
        if(inode number less than remembered inode for search)
            set remembered inode for search = input inode number;
    }
    else
        store inode number in inode list;
    return;
}
Placing Free Inode Numbers Into the Super Block • Original super block list of free inodes: 535, 476, 475, 471; remembered inode = 535 • After freeing inode 499: 499, 476, 475, 471; remembered inode = 499 • After freeing inode 601: 499, 476, 475, 471 (unchanged); remembered inode = 499
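The two cases of ifree on a full list can be replayed with a small function. This sketch makes an assumption taken from the figure rather than from the pseudocode text: the remembered inode occupies a slot in the full list, so lowering it replaces that slot. The super block lock check and the free inode count are omitted, and the function signature is hypothetical.

```python
def ifree(free_list, remembered, list_size, ino):
    """Return the updated (free_list, remembered) after freeing inode ino."""
    free_list = list(free_list)             # leave the caller's list untouched
    if len(free_list) >= list_size:         # super block list is full
        if ino < remembered:
            # Assumed from the figure: the freed inode takes over the slot
            # of the old remembered inode, which stays free on disk.
            if remembered in free_list:
                free_list[free_list.index(remembered)] = ino
            remembered = ino
        # else: a higher-numbered inode is dropped; a later disk search
        # starting at the remembered inode will find it again.
    else:
        free_list.append(ino)               # room left: just store the number
    return free_list, remembered


if __name__ == "__main__":
    state = ([535, 476, 475, 471], 535)
    for freed in (499, 601):
        state = ifree(state[0], state[1], 4, freed)
        print("free inode", freed, "->", state)
```

Dropping inode 601 loses nothing: it is still marked free on disk and lies above the remembered inode, so the next search from 499 upward reaches it.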