ecs150 Spring 2014: Operating System #3: File Systems
Dr. S. Felix Wu
Computer Science Department
University of California, Davis
ecs150, spring 2014
File Disk
• separate the disk into blocks
• separate the file into blocks as well
• “paging” from file to disk
blocks: 4 - 7 - 2 - 10 - 12
How to represent the file? How to link these 5 pages together?
BitTorrent pieces
• 1 big file (X gigabytes) with a number of pieces (5%) already downloaded (and being shared with others).
• How much disk space do we need at this moment?
One Logical File, Physical Disk Blocks
efficient representation & access
[Figure labels: File - Pieces/Blocks; Free Pieces/Blocks]
File Disk blocks
file block 0 -> disk block 4
file block 1 -> disk block 7
file block 2 -> disk block 2
file block 3 -> disk block 10
file block 4 -> disk block 12
• What are the disadvantages?
• disk access can be slow for “random access”.
• How big is each block? 2^X bytes? 2^X+8 bytes?
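The block table above can be read as a lookup from a byte offset in the file to a disk block. The following is only a sketch: the 1K block size, the array `block_map`, and the function `file_offset_to_disk` are assumptions made for illustration, not part of any real file system API.

```c
#include <stddef.h>

/* Assumed block size for this sketch: 1K = 2^10 bytes. */
#define BLOCK_SIZE 1024

/* Block map from the slide: file block i lives in disk block block_map[i]. */
static const int block_map[] = { 4, 7, 2, 10, 12 };
#define NUM_FILE_BLOCKS (sizeof block_map / sizeof block_map[0])

/*
 * Translate a byte offset within the file into a disk block number and
 * an offset inside that block. Returns -1 if the offset is negative or
 * lies past the last mapped file block.
 */
int file_offset_to_disk(long offset, long *offset_in_block)
{
    if (offset < 0)
        return -1;

    size_t file_block = (size_t)(offset / BLOCK_SIZE);
    if (file_block >= NUM_FILE_BLOCKS)
        return -1;

    *offset_in_block = offset % BLOCK_SIZE;
    return block_map[file_block];
}
```

With this layout, reading the file sequentially visits disk blocks 4, 7, 2, 10, 12 in that order, which is why scattered blocks can make access slow.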
An i-node
[Figure labels: A file; ??? entries in one disk block]
Requirements
• Performance/Efficiency
• Robustness/Recovery
• Extensibility/Flexibility
struct ufs2_dinode {
    u_int16_t    di_mode;         /*   0: IFMT, permissions; see below. */
    int16_t      di_nlink;        /*   2: File link count. */
    u_int32_t    di_uid;          /*   4: File owner. */
    u_int32_t    di_gid;          /*   8: File group. */
    u_int32_t    di_blksize;      /*  12: Inode blocksize. */
    u_int64_t    di_size;         /*  16: File byte count. */
    u_int64_t    di_blocks;       /*  24: Bytes actually held. */
    ufs_time_t   di_atime;        /*  32: Last access time. */
    ufs_time_t   di_mtime;        /*  40: Last modified time. */
    ufs_time_t   di_ctime;        /*  48: Last inode change time. */
    ufs_time_t   di_birthtime;    /*  56: Inode creation time. */
    int32_t      di_mtimensec;    /*  64: Last modified time. */
    int32_t      di_atimensec;    /*  68: Last access time. */
    int32_t      di_ctimensec;    /*  72: Last inode change time. */
    int32_t      di_birthnsec;    /*  76: Inode creation time. */
    int32_t      di_gen;          /*  80: Generation number. */
    u_int32_t    di_kernflags;    /*  84: Kernel flags. */
    u_int32_t    di_flags;        /*  88: Status flags (chflags). */
    int32_t      di_extsize;      /*  92: External attributes block. */
    ufs2_daddr_t di_extb[NXADDR]; /*  96: External attributes block. */
    ufs2_daddr_t di_db[NDADDR];   /* 112: Direct disk blocks. */
    ufs2_daddr_t di_ib[NIADDR];   /* 208: Indirect disk blocks. */
    int64_t      di_spare[3];     /* 232: Reserved; currently unused */
};
di_size
di_blocks * (1K)
struct ufs1_dinode {
    u_int16_t    di_mode;         /*   0: IFMT, permissions; see below. */
    int16_t      di_nlink;        /*   2: File link count. */
    union {
        u_int16_t oldids[2];      /*   4: Ffs: old user and group ids. */
    } di_u;
    u_int64_t    di_size;         /*   8: File byte count. */
    int32_t      di_atime;        /*  16: Last access time. */
    int32_t      di_atimensec;    /*  20: Last access time. */
    int32_t      di_mtime;        /*  24: Last modified time. */
    int32_t      di_mtimensec;    /*  28: Last modified time. */
    int32_t      di_ctime;        /*  32: Last inode change time. */
    int32_t      di_ctimensec;    /*  36: Last inode change time. */
    ufs1_daddr_t di_db[NDADDR];   /*  40: Direct disk blocks. */
    ufs1_daddr_t di_ib[NIADDR];   /*  88: Indirect disk blocks. */
    u_int32_t    di_flags;        /* 100: Status flags (chflags). */
    int32_t      di_blocks;       /* 104: Blocks actually held. */
    int32_t      di_gen;          /* 108: Generation number. */
    u_int32_t    di_uid;          /* 112: File owner. */
    u_int32_t    di_gid;          /* 116: File group. */
    int32_t      di_spare[2];     /* 120: Reserved; currently unused */
};
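#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */

int main(void)
{
    FILE *f1 = fopen("./sss.txt", "w");
    int i;

    if (f1 == NULL) {
        perror("fopen");
        return 1;
    }
    /* Write at 1000 random offsets; the gaps in between are never written. */
    for (i = 0; i < 1000; i++) {
        fseek(f1, rand(), SEEK_SET);
        fprintf(f1, "%d%d%d%d", rand(), rand(), rand(), rand());
        if (i % 100 == 0)
            sleep(1);
    }
    fflush(f1);
    fclose(f1);
    return 0;
}

# ./t
# ls -l ./sss.txt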
di_size vs. di_blocks
• Logical
• Physical
• fstat
• du
• ls -i
• df -i
• stat <filename>
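One way to see the logical/physical difference from user space is to compare the `st_size` and `st_blocks` fields returned by `stat()`. This is a minimal sketch: the helper name `logical_vs_physical` is made up for illustration, and it assumes `st_blocks` counts 512-byte units, as it does on Linux and the BSDs (POSIX leaves the unit to the implementation).

```c
#include <sys/stat.h>

/*
 * Report the logical size (st_size, the file byte count) and the
 * physical size (st_blocks * 512, the space actually allocated).
 * Assumption: st_blocks is in 512-byte units.
 * Returns 0 on success, -1 if stat() fails.
 */
int logical_vs_physical(const char *path, long long *logical,
                        long long *physical)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    *logical = (long long)st.st_size;
    *physical = (long long)st.st_blocks * 512;
    return 0;
}
```

For a file with holes, the physical size can come out far smaller than the logical size; for a small ordinary file it is usually somewhat larger, because space is allocated in whole blocks.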
Try on different platforms
An i-node
[Figure labels: A file; ??? entries in one disk block]
Typical: each block 1K
Dirent/iNode
Directory entries: naming the files (under a hierarchy)
iNode: pointers to the content (blocks)
But, nevertheless, it (iNode) provides a “sequential” list of pointers to every single block. The order of “access” (to the whole file) is still under the control of the user.
“Control” of Access Sequences in Sharing
Access-control i-node
[Figure labels: A file; ??? entries in one disk block]
Typical: each block 1K
i-node
1K block, 32-bit ptr, (2^32)^3
• How many disk blocks can a FS have?
• How many levels of i-node indirection will be necessary to store a file of 2G bytes? (I.e., 0, 1, 2 or 3)
• What is the largest possible file size in i-node?
• What is the size of the i-node itself for a file of 10GB with only 512 MB downloaded?
Answers
• How many disk blocks can a FS have?
  • 2^64 or 2^32: pointer (to blocks) size is 8/4 bytes.
• How many levels of i-node indirection will be necessary to store a file of 2G (2^31) bytes? (I.e., 0, 1, 2 or 3)
  • 12*2^10 + 2^8*2^10 + 2^8*2^8*2^10 + 2^8*2^8*2^8*2^10 >? 2^31
• What is the largest possible file size in i-node?
  • 12*2^10 + 2^8*2^10 + 2^8*2^8*2^10 + 2^8*2^8*2^8*2^10
  • 2^64 - 1
  • 2^32 * 2^10
You need to consider three issues and find the minimum!
1K = 2^10 bytes per block
4 bytes = 2^2 bytes per pointer
So one block holds 2^10 / 2^2 = 2^8 pointers.
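The indirection arithmetic can be checked with a few lines of C. This is a sketch under the slide's assumptions (1K blocks, 4-byte pointers, 12 direct pointers, at most three levels of indirection); the function names `capacity_with_levels` and `levels_needed` are invented for illustration.

```c
#include <stdint.h>

#define BLOCK_SIZE   1024u                     /* 1K = 2^10 bytes per block */
#define PTR_SIZE     4u                        /* 32-bit pointer = 2^2 bytes */
#define NUM_DIRECT   12u                       /* direct pointers in the i-node */
#define PTRS_PER_BLK (BLOCK_SIZE / PTR_SIZE)   /* 2^8 = 256 pointers per block */

/*
 * Bytes reachable through the direct pointers plus `levels` levels of
 * indirection (0 = direct only, 3 = up to triple indirect).
 */
uint64_t capacity_with_levels(int levels)
{
    uint64_t total = (uint64_t)NUM_DIRECT * BLOCK_SIZE;
    uint64_t data_blocks = 1;

    for (int i = 1; i <= levels; i++) {
        data_blocks *= PTRS_PER_BLK;           /* 2^8, 2^16, 2^24 data blocks */
        total += data_blocks * BLOCK_SIZE;
    }
    return total;
}

/*
 * Smallest number of indirection levels whose capacity covers `size`
 * bytes, or -1 if even triple indirection is not enough.
 */
int levels_needed(uint64_t size)
{
    for (int levels = 0; levels <= 3; levels++) {
        if (capacity_with_levels(levels) >= size)
            return levels;
    }
    return -1;
}
```

Under these assumptions, direct plus single plus double indirection reach only about 64 MB, so a 2^31-byte file needs the triple-indirect pointer. The pointer structure alone tops out a little above 16 GB, which is why it has to be compared against the other two limits in the answer.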
A Little Review
• File systems have to be
  • hierarchical (directory), efficient in time and space (i-node), robust against all sorts of failure (soft update, fsck, snapshots), extensible to new functionalities (v-node)
• But why can I still not find the file(s) I really want on my 500 GB HD?
• What are the pieces of the social informatics around your and my hard drives (or those on the cloud, or in the cache of some routers)?
A File System
[Figure: the disk is divided into partitions; one partition is laid out as b, s, i-list, then directory and data blocks (d); the i-list is a sequence of i-nodes: i-node, i-node, ..., i-node]
DIR *dirp = opendir(filename);            /* filename: const char * */
struct dirent *direntp = readdir(dirp);

struct dirent {
    ino_t d_ino;
    char  d_name[NAME_MAX+1];
};

[Figure: a directory holds dirents; each dirent pairs an inode with a file_name, and each inode leads to a file]
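The `opendir`/`readdir` calls above can be put into a small loop that prints the two `dirent` fields shown on the slide. This is a minimal sketch; the function name `list_dirents` is made up, and the cast of `d_ino` to `unsigned long long` is only there so that `printf` works regardless of how wide `ino_t` is on a given platform.

```c
#include <dirent.h>
#include <stdio.h>

/*
 * Print "inode-number name" for every entry in the directory `path`.
 * Returns the number of entries seen, or -1 if the directory cannot
 * be opened.
 */
int list_dirents(const char *path)
{
    DIR *dirp = opendir(path);
    struct dirent *direntp;
    int count = 0;

    if (dirp == NULL)
        return -1;

    /* readdir() returns NULL once all entries have been returned. */
    while ((direntp = readdir(dirp)) != NULL) {
        printf("%llu %s\n",
               (unsigned long long)direntp->d_ino, direntp->d_name);
        count++;
    }

    closedir(dirp);
    return count;
}
```

Calling `list_dirents("/usr")` prints roughly what `ls -ai /usr` shows: the directory stores only names and i-node numbers, and everything else about a file lives in its i-node.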
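[Figure: i-nodes and directory contents for a small file tree]
i-node 2: directory /, root wheel, drwxr-xr-x, Apr 1 2004
    entries: . -> 2, .. -> 2, usr -> 4, vmunix -> 5
i-node 4: directory /usr, root wheel, drwxr-xr-x, Apr 1 2004
    entries: . -> 4, .. -> 2, bin -> 7, foo -> 6
i-node 5: file /vmunix, root wheel, rwxr-xr-x, Apr 15 2004
    contents: text, data
i-node 6: file /usr/foo, kirk staff, rw-rw-r--, Jan 19 2004
    contents: Hello World!
i-node 7: directory /usr/bin, root wheel, drwxr-xr-x, Apr 1 2004
    entries: . -> 7, .. -> 4, ex -> 9, groff -> 10, vi -> 9
i-node 9: file /usr/bin/vi, bin bin, rwxr-xr-x, Apr 15 2004
    contents: text, data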
struct dirent {
    ino_t d_ino;
    char  d_name[NAME_MAX+1];
};

struct stat {… short nlinks; …};

[Figure: a directory holds dirents; each dirent pairs an inode with a file_name, and each inode leads to a file]
What is the difference?
• ln -s /usr/src/sys/sys/proc.h ppp.h
• ln /usr/src/sys/sys/proc.h ppp.h
• How about?
• ln -social fbid:372567192/usr/src/sys/sys/proc.h ppp.h
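The difference between the first two commands can be observed with `link()`, `symlink()`, and `lstat()`. This is a sketch, not a definitive test: the helper names `touch_file` and `compare_links` are invented for illustration, and it assumes the target and both link paths sit on the same file system in a writable directory.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create an empty file, like touch(1). Returns 0 on success, -1 on error. */
int touch_file(const char *path)
{
    int fd = open(path, O_CREAT | O_WRONLY, 0644);

    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/*
 * Make a hard link (`ln target hard`) and a symbolic link
 * (`ln -s target sym`), then report:
 *   *hard_same_inode = 1 if the hard link names the target's own i-node
 *   *sym_is_link     = 1 if the symlink is a separate i-node of type "link"
 * Both links are removed again. Returns 0 on success, -1 on error.
 */
int compare_links(const char *target, const char *hard, const char *sym,
                  int *hard_same_inode, int *sym_is_link)
{
    struct stat t, h, s;
    int rc = -1;

    if (link(target, hard) != 0)
        return -1;
    if (symlink(target, sym) != 0) {
        unlink(hard);
        return -1;
    }

    /* lstat() examines the link itself instead of following it. */
    if (stat(target, &t) == 0 && lstat(hard, &h) == 0 && lstat(sym, &s) == 0) {
        *hard_same_inode = (t.st_dev == h.st_dev && t.st_ino == h.st_ino);
        *sym_is_link = (S_ISLNK(s.st_mode) && s.st_ino != t.st_ino);
        rc = 0;
    }

    unlink(hard);
    unlink(sym);
    return rc;
}
```

A hard link is just another dirent for the same i-node, while a symbolic link is a new i-node whose content is the path name of the target.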
A File System
We have been using this model “for a while”…
[Figure: the disk is divided into partitions; one partition is laid out as b, s, i-list, then directory and data blocks (d); the i-list is a sequence of i-nodes: i-node, i-node, ..., i-node]
nlinks
• A counter: how many dirents (directory entries) are pointing into this i-node?
• How about snlinks? (maybe with a weight, positive and negative, representing trust and recommendation)
• BTW, how many copies of “X” (or its derivatives) are out there? (even Google might not be able to tell you, FYI.)
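The counter is visible from user space as `st_nlink` in `struct stat`. The sketch below records it after creating a file, after adding a second name with `link()`, and after removing that name again; the function names `link_count` and `nlink_demo` are hypothetical, and the two paths are assumed to be unused names in a writable directory on one file system.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return the link count (st_nlink) of `path`, or -1 on error. */
long link_count(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}

/*
 * Create `name`, add a second directory entry `alias` for the same
 * i-node, then remove the alias, recording the link count after each
 * step in counts[0..2]. Cleans up both names.
 * Returns 0 on success, -1 on error.
 */
int nlink_demo(const char *name, const char *alias, long counts[3])
{
    int fd = open(name, O_CREAT | O_EXCL | O_WRONLY, 0644);

    if (fd < 0)
        return -1;
    close(fd);
    counts[0] = link_count(name);      /* one dirent points at the i-node */

    if (link(name, alias) != 0) {
        unlink(name);
        return -1;
    }
    counts[1] = link_count(name);      /* two dirents, same i-node */

    unlink(alias);
    counts[2] = link_count(name);      /* back to one dirent */

    unlink(name);
    return 0;
}
```

On a typical local file system the three recorded values are 1, 2, and 1; the i-node and its blocks are only released when the count drops to zero and no process still has the file open.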
A Cloud-based i-node
[Figure labels: A file; ??? entries in one disk block]
Social cloud-based i-node
[Figure labels: A file; ??? entries in one disk block]
/usr/src/sys/ufs/ffs/ffs_balloc.c
Buffer pages to Disk/I-Node
Snapshot of the FS
• backup and restore
• reliably dump an active file system
• what would we do today to dump “consistent” snapshots of our 40GB FS? (at midnight…)
• “background FSCK checks”