1 / 104

File System Design: Issues and Objectives

This outline discusses the issues and objectives related to file system design, including file naming, access control, performance optimization, and the role of files. It also covers file system data structures, file sharing, and the directory subsystem.

jcrabtree
Download Presentation

File System Design: Issues and Objectives

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Outline Administrative: • Issues? Objective: • File System Design

  2. Introduction to File Systems

  3. File System Issues • What is the role of files? What is the file abstraction? • File naming. How to find the file we want?Sharing files. Controlling access to files. • Performance issues - how to deal with the bottleneck of disks? What is the “right” way to optimize file access?

  4. Role of Files • Persistence - long-lived - data for posterity • non-volatile storage media • semantically meaningful (memorable) names

  5. Abstractions User view Addressbook, record for Duke CPS Application addrfile ->fid, byte range* fid File System bytes block# device, block # Disk Subsystem surface, cylinder, sector

  6. *File Abstractions • UNIX-like files • Sequence of bytes • Operations: open (create), close, read, write, seek • Memory mapped files • Sequence of bytes • Mapped into address space • Page fault mechanism does data transfer • Named, Possibly typed

  7. User grp others rwx rwx rwx 111 100 000 O_RDONLYO_WRONLY O_RDWR O_CREAT O_APPEND ... Relative to beginning, current position, end of file Unix File Syscalls int fd, num, success, bufsize; char data[bufsize]; long offset, pos; fd = open (filename, mode [,permissions]); success = close (fd); pos = lseek (fd, offset, mode); num = read (fd, data, bufsize); num = write (fd, data, bufsize);

  8. R, W, X, none Shared, Private, Fixed, Noreserve Memory Mapped Files fd = open (somefile, consistent_mode); pa = mmap(addr, len, prot, flags, fd, offset); fd + offset pa len len VAS Reading performed by Load instr.

  9. Nachos File Syscalls/Operations Create(“zot”); OpenFileId fd; fd = Open(“zot”); Close(fd); char data[bufsize]; Write(data, count, fd); Read(data, count, fd); Limitations: 1. small, fixed-size files and directories 2. single disk with a single directory 3. stream files only: no seek syscall 4. file size is specified at creation time 5. no access control, etc.

  10. Functions of File System • Determine layout of files and metadata on disk in terms of blocks. Disk block allocation. Bad blocks. • Handle read and write system calls • Initiate I/O operations for movement of blocks to/from disk. • Maintain buffer cache

  11. r-w pos, mode r-w pos, mode pos pos File System Data Structures System-wide Open file table System-wide File descriptor table Process descriptor in-memory copy of inode ptr to on-disk inode File data stdin stdout per-process file ptr array stderr

  12. Data Block Addr ... File Attributes ... ... ... ... ... UNIX Inodes 3 3 3 3 Data blocks Block Addr 1 2 2 ... Decoupling meta-data from directory entries 1 2 2 1

  13. File Sharing Between Parent/Child main(int argc, char *argv[]) { char c; int fdrd, fdwt, fdpriv; if ((fdrd = open(argv[1], O_RDONLY)) == -1) exit(1); if ((fdwt = creat([argv[2], 0666)) == -1) exit(1); fork(); if ((fdpriv = open([argv[3], O_RDONLY)) == -1) exit(1); while (TRUE) { if (read(fdrd, &c, 1) != 1) exit(0); write(fdwt, &c, 1); } }

  14. r-w pos, mode r-w pos, mode forked process’s Process descriptor openafterfork File System Data Structures System-wide Open file table System-wide File descriptor table Process descriptor in-memory copy of inode ptr to on-disk inode stdin stdout per-process file ptr array stderr

  15. user ID process ID process group ID parent PID signal state siblings children user ID process ID process group ID parent PID signal state siblings children Sharing Open File Instances shared seek offset in shared file table entry parent shared file (inode or vnode) child system open file table process file descriptors process objects

  16. Directory Subsystem • Map filenames to fileids-open (create) syscall. Create kernel data structures. • Maintain naming structure (unlink, mkdir, rmdir)

  17. File Attributes File Attributes Directory node Directory node File Attributes current inode# Proj inode# File Attributes Proj Directory node proj3 inode# Pathname Resolution cps110 “cps110/current/Proj/proj3” current proj3 data file index node of wd

  18. open(/foo/bar/file);read(fd,buf,sizeof(buf)); read(fd,buf,sizeof(buf)); read(fd,buf,sizeof(buf)); close(fd); read(rootdir); read(inode); read(foo);read(inode); read(bar); read(inode); read(filedatablock); read(rootdir); read(inode); read(foo);read(inode); read(bar); read(inode); read(fd,buf,sizeof(buf)); read(fd,buf,sizeof(buf)); read(fd,buf,sizeof(buf)); Access Patterns Along the Way Proc Dirsubsys F.S

  19. Functions of Device Subsystem In general, deal with device characteristics • Translate block numbers (the abstraction of device shown to file system) to physical disk addresses. Device specific (subject to change with upgrades in technology) intelligent placement of blocks. • Schedule (reorder?) disk operations

  20. What to do about Disks? • Avoid them altogether! Caching • Disk scheduling • Idea is to reorder outstanding requests to minimize seeks. • Layout on disk • Placement to minimize disk overhead • Build a better disk (or substitute) • Example: RAID

  21. File Naming

  22. Goals of File Naming • Foremost function - to find files (e.g., in open() ), Map file name to file object. • To store meta-data about files. • To allow users to choose their own file names without undue name conflict problems. • To allow sharing. • Convenience: short names, groupings. • To avoid implementation complications

  23. Naming Structures • Flat name space - 1 system-wide table, • Unique naming with multiple users is hard.Name conflicts. • Easy sharing, need for protection • Per-user name space • Protection by isolation, no sharing • Easy to avoid name conflicts • Register identifies with directory to use to resolve names, possibility of user-settable (cd)

  24. Naming Structures Naming network • Component names - pathnames • Absolute pathnames - from a designated root • Relative pathnames - from a working directory • Each name carries how to resolve it. • Short names to files anywhere in the network produce cycles, but convenience in naming things.

  25. Full Naming Network* Terry A • /Jamie/lynn/project/D • /Jamie/d • /Jamie/lynn/jam/proj1/C • (relative from Terry)A • (relative from Jamie)d grp1 root Lynn TA Jamie lynn project jam B proj1 d D E C D project * not Unix

  26. Full Naming Network* Terry A • /Jamie/lynn/project/D • /Jamie/d • /Jamie/lynn/jam/proj1/C • (relative from Terry)A • (relative from Jamie)d grp1 root Lynn TA Jamie lynn project jam B proj1 d D E C D project Why? * Unix

  27. File size File type Protection - access control information History: creation time, last modification,last access. Location of file - which device Location of individual blocks of the file on disk. Owner of file Group(s) of users associated with file Meta-Data

  28. Restricting to a Hierarchy • Problems with full naming network • What does it mean to “delete” a file? • Meta-data interpretation

  29. Operations on Directories (UNIX) • link (oldpathname, newpathname) - make entry pointing to file • unlink (filename) - remove entry pointing to file • mknod (dirname, type, device) - used (e.g. by mkdir utility function) to create a directory (or named pipe, or special file) • getdents(fd, buf, structsize) - reads dir entries

  30. Reclaiming Storage Terry A grp1 X root Jo X TA Series of unlinks X Jamie joe project jam B proj1 d D E What should be dealloc? C D project

  31. Reclaiming Storage Terry A grp1 X root Jo X TA Series of unlinks X Jamie joe project jam B proj1 d D E C D project

  32. Reference Counting? Terry A 2 grp1 X root Jo X TA 3 1 Series of unlinks X Jamie joe 1 2 project jam B proj1 d D E 2 C D project

  33. Garbage Collection * Terry Phase 1 marking A * Phase 2 collect grp1 X root * Jo X TA Series of unlinks X Jamie joe project jam B proj1 d D E C D project

  34. Restricting to a Hierarchy • Problems with full naming network • What does it mean to “delete” a file? • Meta-data interpretation • Eliminating cycles • allows use of reference counts for reclaiming file space • avoids garbage collection

  35. Given: Naming Hierarchy (because of implementation issues) / bin etc tmp usr vmunix ls sh project users packages mount point (volume root) leaf tex emacs

  36. A Typical Unix File Tree Each volume is a set of directories and files; a host’s file tree is the set of directories and files visible to processes on a given host. / File trees are built by grafting volumes from different devices or from network servers. bin etc tmp usr vmunix In Unix, the graft operation is the privileged mount system call, and each volume is a filesystem. ls sh project users packages mount point coveredDir • mount (coveredDir, volume) • coveredDir: directory pathname • volume: device • volume root contents become visible at pathname coveredDir

  37. A Typical Unix File Tree Each volume is a set of directories and files; a host’s file tree is the set of directories and files visible to processes on a given host. / File trees are built by grafting volumes from different devices or from network servers. bin etc tmp usr vmunix In Unix, the graft operation is the privileged mount system call, and each volume is a filesystem. ls sh project users packages mount point (volume root) coveredDir mount (coveredDir, volume) tex emacs /usr/project/packages/coveredDir/emacs

  38. A Typical Unix File Tree Each volume is a set of directories and files; a host’s file tree is the set of directories and files visible to processes on a given host. / File trees are built by grafting volumes from different devices or from network servers. bin etc tmp usr vmunix In Unix, the graft operation is the privileged mount system call, and each volume is a filesystem. ls sh project users packages mount point • mount (coveredDir, volume) • coveredDir: directory pathname • volume: device specifier or network volume • volume root contents become visible at pathname coveredDir (volume root) tex emacs /usr/project/packages/coveredDir/emacs

  39. Reclaiming Convenience • Symbolic links - indirect filesfilename maps, not to file object, but to another pathname • allows short aliases • slightly different semantics • Search path rules

  40. directory A directory B wind: 18 0 0 inode link count = 2 sleet: 48 rain: 32 hail: 48 inode 48 Unix File Naming (Hard Links) A Unix file may have multiple names. Each directory entry naming the file is called a hard link. Each inode contains a reference count showing how many hard links name it. link system call link (existing name, new name) create a new name for an existing file increment inode link count unlink system call (“remove”) unlink(name) destroy directory entry decrement inode link count if count = 0 and file is not in active use free blocks (recursively) and on-disk inode

  41. wind: 18 0 0 directory A directory B sleet: 67 rain: 32 hail: 48 inode link count = 1 ../A/hail/0 inode 48 inode 67 Unix Symbolic (Soft) Links • Unix files may also be named by symbolic (soft) links. • A soft link is a file containing a pathname of some other file. symlink system call symlink (existing name, new name) allocate a new file (inode) with type symlink initialize file contents with existing name create directory entry for new file with new name The target of the link may be removed at any time, leaving a dangling reference. How should the kernel handle recursive soft links? Convenience, but not performance!

  42. Soft vs. Hard Links What’s the difference in behavior?

  43. Soft vs. Hard Links What’s the difference in behavior? / Lynn Terry Jamie

  44. Soft vs. Hard Links What’s the difference in behavior? / Lynn Terry Jamie X

  45. X Soft vs. Hard Links What’s the difference in behavior? / Lynn Terry Jamie

  46. File Structure

  47. After Resolving Long PathnamesOPEN(“/usr/faculty/carla/classes/cps110/spring02/lectures/lecture13.ppt”,…)Finally Arrive at File • What do users seem to want from the file abstraction? • What do these usage patterns mean for file structure and implementation decisions? • What operations should be optimized 1st? • How should files be structured? • Is there temporal locality in file usage? • How long do files really live?

  48. File System design and impl Usage patterns observed today Know your Workload! • File usage patterns should influence design decisions. Do things differently depending: • How large are most files? How long-lived?Read vs. write activity. Shared often? • Different levels “see” a different workload. • Feedback loop

  49. Generalizations from UNIX Workloads • Standard Disclaimers that you can’t generalize…but anyway… • Most files are small (fit into one disk block) although most bytes are transferred from longer files. • Most opens are for read mode, most bytes transferred are by read operations • Accesses tend to be sequential and 100%

  50. More on Access Patterns • There is significant reuse (re-opens) - most opens go to files repeatedly opened & quickly. Directory nodes and executables also exhibit good temporal locality. • Looks good for caching! • Use of temp files is significant part of file system activity in UNIX - very limited reuse, short lifetimes (less than a minute).

More Related