
File Systems and Mass Storage

Sorin Manolache, sorma@ida.liu.se


Presentation Transcript


  1. Sorin Manolache sorma@ida.liu.se File Systems and Mass Storage

  2. Last on TTIT61 • Binding: compile time, load time, execution time • Swapping • Contiguous memory allocation: external fragmentation • Paging: internal fragmentation, sharing, protection • Segmentation: external fragmentation, sharing, protection • Virtual memory: page replacement, thrashing

  3. Lecture Plan • What is an operating system? What are its functions? Basics of computer architectures (Part I of the textbook) • Processes, threads, schedulers (Part II, chap. IV-VI) • Synchronisation (Part II, chap. VII) • Primary memory management (Part III, chap. IX, X) • File systems and secondary memory management (Part III, chap. XI, XII, Part IV) • Security (Part VI)

  4. Outline • The concept of file • Operations on files • Access methods • Directories • Operations on directories • Directory hierarchies • File sharing • Protection • Disk scheduling

  5. Files • A named collection of related information that is stored on secondary storage • The smallest allotment of logical secondary storage (when we want to store something on secondary storage, we store it in files) • The format of a file is typically defined by its creator

  6. File Attributes • Name (identifier for human use) • Identifier (typically a numeric identifier for the internal use of the OS) • Type (for OSs that support file types) • Size • Time, date, user identification (last access, last modification, creation) • Location (on the device) • Protection (permissions to read/write/execute/etc.)

  7. Operations on Files • Creation (needs a name and protection information) • Deletion (needs a name) • Writing • Reading • Truncating • Repositioning within a file

  8. Opening a File • If the file to read from (or write to) were specified by name in every read (write) system call, the OS would have to look up the disk blocks of the named file on each invocation ⇒ performance penalty • Most OSs therefore require the user to perform an open system call that • maps the file name to an identifier • initialises a memory structure with the disk location of the opened file and other data (see slides) • Implicit opening: the file is opened automatically at first access and closed at process exit

  9. Disks [Figure: disk drive with its heads and a cylinder labelled]

  10. Disk Organisation [Figure: platter surface with tracks, a sector, and a gap labelled]

  11. Disk Organisation • The geometry of disks is given in C/H/S (Cylinders/Heads/Sectors) • Initially, C/H/S corresponded to the true physical geometry • The access granularity is the physical block (sector) • The operating system maps logical records onto physical blocks • Disk space is always allocated in blocks • A file's size is generally not an integer multiple of the block size ⇒ the last block is not fully used • This is internal fragmentation
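The arithmetic behind internal fragmentation can be sketched as follows. This is a minimal illustration, not part of the slides: the function names `blocks_needed` and `wasted_bytes` are hypothetical, and the block size is simply a parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Number of whole blocks needed to hold file_size bytes (rounds up). */
size_t blocks_needed(size_t file_size, size_t block_size)
{
    assert(block_size > 0);
    return (file_size + block_size - 1) / block_size;
}

/* Bytes allocated to the file but not used by it: the unused tail of
   the last block, i.e. the internal fragmentation. */
size_t wasted_bytes(size_t file_size, size_t block_size)
{
    return blocks_needed(file_size, block_size) * block_size - file_size;
}
```

For instance, a 1300-byte file on 512-byte blocks occupies 3 blocks and leaves 236 bytes of the last block unused.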

  12. Access Methods • Direct access • Sequential access • Indexed access

  13. Direct Access • Direct access • The read (write) system call specifies the relative block number from which to read (to which to write) • E.g. • write(fd, buf, sizeof(buf), 30); writes sizeof(buf) bytes from the buffer buf to the 30th block of the file identified by fd.
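The four-argument write on the slide is illustrative. On a POSIX system a similar effect can be obtained with pread and pwrite, which take an explicit byte offset. The sketch below assumes a block size of 512 bytes; `BLOCK_SIZE`, `write_block`, `read_block`, and `direct_access_demo` are hypothetical names introduced for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 512  /* assumed block size, for illustration only */

/* Write one block at relative block number block_no. The file pointer
   of fd is neither used nor changed. Returns bytes written or -1. */
ssize_t write_block(int fd, const void *buf, long block_no)
{
    assert(block_no >= 0);
    return pwrite(fd, buf, BLOCK_SIZE, (off_t)block_no * BLOCK_SIZE);
}

/* Read one block from relative block number block_no. */
ssize_t read_block(int fd, void *buf, long block_no)
{
    assert(block_no >= 0);
    return pread(fd, buf, BLOCK_SIZE, (off_t)block_no * BLOCK_SIZE);
}

/* Writes block 30 of the file at path, reads it back, and checks that
   the file pointer is still 0. Returns 0 on success, -1 otherwise. */
int direct_access_demo(const char *path)
{
    char out[BLOCK_SIZE], in[BLOCK_SIZE];
    int ok;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    memset(out, 'A', sizeof out);
    ok = write_block(fd, out, 30) == BLOCK_SIZE
      && read_block(fd, in, 30) == BLOCK_SIZE
      && memcmp(in, out, BLOCK_SIZE) == 0
      && lseek(fd, 0, SEEK_CUR) == 0;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

Because the offset is passed with every call, no file pointer is involved, which is the defining property of direct access.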

  14. Sequential Access • Sequential access • The block from which to read (to which to write) is not specified. The OS keeps a file pointer that it updates accordingly. • A read (write) operation reads (writes) data at the current file offset, stored in the file pointer • After the read or write, the file pointer is incremented by the number of records transferred. • E.g.: • write(fd, buf, sizeof(buf)): writes sizeof(buf) bytes from the buffer buf to the file identified by fd at the current file offset, then increments the file pointer by sizeof(buf)

  15. Indexed Access
      Index file (key of the first record in a block, block number):
        143520, 1311
        607896, 1312
        853661, 1400
      File:
        Block 1311: 143520, $10 • 245679, $30 • 509877, $5 • 510978, $15
        Block 1312: 607896, $20 • 610942, $10 • 790134, $15 • 829842, $8
        Block 1400: 853661, $10 • 898541, $20 • 934625, $6 • 973147, $7
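A lookup through such an index might be sketched as below: find the last index entry whose key does not exceed the search key, then read only that block. The structure and function names (`index_entry`, `find_block`, `slide_index`) are hypothetical. The sketch also assumes that the index is sorted by key and holds the first key of each block, as the figure suggests.

```c
#include <assert.h>
#include <stddef.h>

struct index_entry {
    long first_key;  /* smallest key stored in the block */
    long block;      /* number of the block that holds it */
};

/* The index file from the slide. */
const struct index_entry slide_index[] = {
    {143520, 1311}, {607896, 1312}, {853661, 1400},
};

/* Binary search for the block that may contain key. Returns the block
   number, or -1 if key is smaller than every indexed key. */
long find_block(const struct index_entry *idx, size_t n, long key)
{
    size_t lo = 0, hi = n;  /* invariant: answer lies in idx[0, hi) */

    assert(idx != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (idx[mid].first_key <= key)
            lo = mid + 1;
        else
            hi = mid;
    }
    /* lo is now the number of entries with first_key <= key */
    return lo == 0 ? -1 : idx[lo - 1].block;
}
```

With the index above, a search for key 610942 yields block 1312, so only one data block has to be read.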

  16. Writing to Files in Unix • Writing • Which file to write to • What to write • The write system call does not specify a file offset (a write pointer pointing at the position in the file where the writing should begin) • Sequential access

  17. Reading from Files in Unix • Reading • Which file to read from • How much to read • Where to put what we read • The read system call does not specify a file offset (a read pointer pointing at the position in the file where the reading should begin)

  18. Read/Write Pointers • Typically, a file is used either for reading or for writing by a process • The OS keeps just one single file pointer for both reading and writing

  19. File Operations in Unix • int creat(const char *name, int permissions); • int open(const char *name, int flags); • int read(int fd, char *buffer, int requested_size); • int write(int fd, const char *buffer, int size); • int close(int fd); • long lseek(int fd, long offset, int whence); • int stat(const char *name, struct stat *status);

  20. Usage Example
      int fd, n;
      char wbuf[] = "Hello!\n", rbuf[MAX_BUF];
      fd = creat("source.txt", S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
      write(fd, wbuf, sizeof(wbuf));
      n = read(fd, rbuf, MAX_BUF);
      close(fd);
      • From which file offset will read read? What is the value of n? What will rbuf contain?

  21. Usage Example
      int fd, n;
      char wbuf[] = "Hello!\n", rbuf[MAX_BUF];
      fd = creat("source.txt", S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
      write(fd, wbuf, sizeof(wbuf));
      close(fd);
      fd = open("source.txt", O_RDONLY);
      n = read(fd, rbuf, MAX_BUF);
      close(fd);
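The two usage examples can be run side by side on a POSIX system. The sketch below assumes POSIX semantics, where creat opens the file for writing only, so read on that descriptor fails. The function name `creat_then_read` and the value of `MAX_BUF` are assumptions made for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#define MAX_BUF 64  /* assumed buffer size */

/* Runs both variants on path. *n_first receives the result of read() on
   the descriptor returned by creat(); *n_second receives the result of
   read() after closing and re-opening the file read-only. */
void creat_then_read(const char *path, int *n_first, int *n_second)
{
    char wbuf[] = "Hello!\n", rbuf[MAX_BUF];
    int fd;

    assert(path && n_first && n_second);
    *n_first = *n_second = -1;

    fd = creat(path, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (fd < 0)
        return;
    if (write(fd, wbuf, sizeof(wbuf)) != (ssize_t)sizeof(wbuf)) {
        close(fd);
        return;
    }
    /* fd is write-only, so this read fails instead of returning data */
    *n_first = (int)read(fd, rbuf, MAX_BUF);
    close(fd);

    fd = open(path, O_RDONLY);  /* new descriptor, file pointer at 0 */
    if (fd >= 0) {
        *n_second = (int)read(fd, rbuf, MAX_BUF);
        close(fd);
    }
    unlink(path);
}
```

Under these assumptions the first read returns -1. The second returns 8, since sizeof(wbuf) counts the terminating NUL byte that write also stored. Even with a readable descriptor, the first variant would start reading at the file pointer left behind by write, that is, at the end of the file.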

  22. Directories • Special files that contain directory entries • A directory entry is a data structure containing the file attributes
      Example hierarchy: / contains bin/, lib/ and vmlinux-2.6.11; bin/ contains ls; lib/ contains libc.so and libgcc.so
      Root dir:
        bin, dir, root:root, rwxr-xr-x, 10050
        lib, dir, root:root, rwxr-xr-x, 52175
        vmlinux-2.6.11, reg, root:root, r-x------, 120311
      bin dir:
        ls, reg, root:root, r-xr-xr-x, 10052
      lib dir:
        libc.so, reg, root:root, r-xr-xr-x, 52177
        libgcc.so, reg, root:root, r-xr-xr-x, 60621

  23. Operations on Directories • List the contents of the directory • Delete (unlink) a file • Rename a file • Open and close the directory • Search for a file • Traverse the file system

  24. Directory Hierarchies • Tree-structured directories • Leaf nodes are files, all other nodes are directories • Acyclic-graph directories • Same with the exception that the structure is an acyclic graph • General graph directories • May contain cycles

  25. Acyclic Graph Directories • We need reference counters. A file is removed and its blocks marked as free when the reference counter reaches 0. • Removing = unlinking
      Example hierarchy: / contains bin/, lib/ and vmlinux-2.6.11; bin/ contains ls; lib/ contains libc.so, libgcc.so and kernel, where kernel carries the same number (120311) as /vmlinux-2.6.11
      Root dir:
        bin, dir, root:root, rwxr-xr-x, 10050
        lib, dir, root:root, rwxr-xr-x, 52175
        vmlinux-2.6.11, reg, root:root, r-x------, 120311
      bin dir:
        ls, reg, root:root, r-xr-xr-x, 10052
      lib dir:
        libc.so, reg, root:root, r-xr-xr-x, 52177
        libgcc.so, reg, root:root, r-xr-xr-x, 60621
        kernel, reg, sys:root, r-xr-x---, 120311
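On a POSIX system the reference counter is visible as the link count returned by stat. The following sketch observes it while a second directory entry is created and removed. The helper names `link_count` and `hard_link_demo` are hypothetical, and the sketch assumes that both paths lie on the same file system and that it supports hard links.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Reference counter (number of directory entries) of path, or -1. */
int link_count(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (int)st.st_nlink : -1;
}

/* Creates file a, adds a second name b for it, then removes a.
   counts[0..2] receive the link count after each step.
   Returns 0 on success, -1 on error. */
int hard_link_demo(const char *a, const char *b, int counts[3])
{
    int fd;

    assert(a && b && counts);
    unlink(a);  /* start clean; failure just means "did not exist" */
    unlink(b);

    fd = open(a, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    counts[0] = link_count(a);  /* one entry refers to the file */

    if (link(a, b) != 0) {      /* second entry for the same file */
        unlink(a);
        return -1;
    }
    counts[1] = link_count(a);

    unlink(a);                  /* counter drops, file survives as b */
    counts[2] = link_count(b);

    unlink(b);                  /* counter reaches 0, blocks are freed */
    return 0;
}
```

The file's blocks are released only by the last unlink, which matches "removing = unlinking" on the slide.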

  26. Symbolic Links • A hard link is an additional directory entry that points to an already existing file. It has no blocks of its own • A symbolic link is a special, very short file that contains the name of the file it points to
      Example hierarchy: / contains bin/, lib/ and vmlinux-2.6.11; bin/ contains ls; lib/ contains libc.so, libgcc.so and kernel, where kernel is a symbolic link to /vmlinux-2.6.11
      Root dir:
        bin, dir, root:root, rwxr-xr-x, 10050
        lib, dir, root:root, rwxr-xr-x, 52175
        vmlinux-2.6.11, reg, root:root, r-x------, 120311
      bin dir:
        ls, reg, root:root, r-xr-xr-x, 10052
      lib dir:
        libc.so, reg, root:root, r-xr-xr-x, 52177
        libgcc.so, reg, root:root, r-xr-xr-x, 60621
        kernel, link, guest:guest, rwxrwxrwx, 71220
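That a symbolic link merely stores a name can be checked with the POSIX calls symlink and readlink. In this sketch the function name `symlink_stores_name` is hypothetical. The target does not need to exist, which is itself a consequence of the link holding only a name.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a symbolic link at linkpath that points to target, reads the
   link's content back, and removes the link again. Returns 1 if the
   stored content is exactly the target name, 0 if it differs, and -1
   on error. */
int symlink_stores_name(const char *target, const char *linkpath)
{
    char buf[256];
    ssize_t n;

    assert(target && linkpath);
    unlink(linkpath);  /* ignore failure: the link may not exist yet */
    if (symlink(target, linkpath) != 0)
        return -1;
    n = readlink(linkpath, buf, sizeof buf);  /* not NUL-terminated */
    unlink(linkpath);
    if (n < 0)
        return -1;
    return (size_t)n == strlen(target) && memcmp(buf, target, (size_t)n) == 0;
}
```

Unlike a hard link, the symbolic link does not raise the reference counter of the target, so it can be left dangling when the target is unlinked.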

  27. File Sharing • When files can be shared • Should all writes be allowed to occur or should the OS protect the user actions from each other? • Should a write be immediately visible to all the other users who share the file?

  28. File Sharing
      Directory entry: file.txt, ftp:users, rw-rw-rw-, 10228
      Proc A:
        fd = open("file.txt", O_RDWR);
        n = read(fd, buf, 10);
        printf("%s\n", buf);
        n = read(fd, buf, 5);
        printf("%s\n", buf);
        n = read(fd, buf, 5);
        printf("%s\n", buf);
        write(fd, "yyyyyyyyyy", 10);
        close(fd);
      Proc B:
        fd = open("file.txt", O_RDWR);
        n = read(fd, buf, 10);
        printf("%s\n", buf);
        write(fd, "xxxxxxxxxx", 10);
        n = read(fd, buf, 10);
        printf("%s\n", buf);
        close(fd);
      [Figure: file contents, the open file tables of Proc A and Proc B, and the OS-wide open file table, with file offsets and reference counts changing as the calls execute]

  29. Protection • Keep safe from improper access • We introduce the notion of file owner • A user ID kept on the disk in the directory entry • Users have IDs • Processes, besides their process IDs (pid), have user IDs, typically the user ID of the user that executes them (user ID (uid) or effective user ID (euid)) • Files typically are owned by the user who creates them • The effective user ID of the file creator is written in the directory entry • Besides owner, a file is characterised by its group

  30. Protection • Controlled access is introduced by specifying which users (or user groups) are allowed to perform operations on the file • Examples of controlled operations: • Read • Write • Execute • Append • Delete • List

  31. Unix File Protection • Read: • Read for files, list for directories • Write: • Write/modify for files, create/delete entries for directories • Execute: • Execute for files, the right to change into (traverse) the directory for directories

  32. Access Control Lists (ACL) • Each file (directory) has an access control list attached • The access control list specifies for each controlled operation the users that are allowed to perform this operation • Advantage: • Very general and flexible • Disadvantages: • Difficult to construct if we do not know all the users beforehand • Directory entry of variable size, more difficult to manage

  33. Condensed ACL • Use of condensed ACLs instead • Use per-owner, per-group, per-others permissions • E.g.: Sara writes a book, Jim, Dawn, and Jill help her. Sara has all the rights, Jim, Dawn, and Jill may read or write but not delete, all the others may only read • Sara is the owner, has rw- permissions • A group book is created, the file is owned by user Sara and group book, Jim, Dawn, and Jill are added to group book, the group has rw- permissions • Others have r-- permissions • The directory in which the book resides has rwxr-xr-x permissions • If Sara wants Joe to have read/write access to chapter 1, she cannot add him to group book • Instead, user Joe is added to the ACL

  34. Condensed ACL • What if: • kim:staff rw-r-xr-- script.sh • User kim belongs to group staff • Should kim be allowed to execute script.sh? • If we consider that the permissions of the owner apply, then no • If we consider that the permissions of the group apply, then yes • Precedence is given to the most specific class: the owner permissions apply, so kim may not execute script.sh
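The precedence rule can be sketched as a small permission check in which exactly one class (owner, group, or others) is consulted. This is a simplified model, not a kernel implementation: the names `file_meta` and `may_access` are hypothetical, and supplementary groups and the superuser are ignored.

```c
#include <assert.h>
#include <stddef.h>

enum { PERM_R = 4, PERM_W = 2, PERM_X = 1 };

struct file_meta {
    int uid;        /* owner */
    int gid;        /* group */
    unsigned mode;  /* e.g. 0654 for rw-r-xr-- */
};

/* Returns nonzero if a process running as (uid, gid) may perform all
   operations in want. Only the most specific matching class counts:
   owner if uid matches, else group if gid matches, else others. */
int may_access(const struct file_meta *f, int uid, int gid, unsigned want)
{
    unsigned bits;

    assert(f != NULL);
    if (uid == f->uid)
        bits = (f->mode >> 6) & 7;  /* owner bits */
    else if (gid == f->gid)
        bits = (f->mode >> 3) & 7;  /* group bits */
    else
        bits = f->mode & 7;         /* others bits */
    return (bits & want) == want;
}
```

With script.sh modelled as owner kim, group staff, and mode 0654, kim is denied execution even though other members of staff are granted it.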

  35. Disk Access Scheduling • The OS has to ensure that the resources are used efficiently • Bandwidth = transferred bytes / length of the interval between the first request and the completion of the last request [Figure: disk with rotational latency and seek time indicated]

  36. FCFS Disk Scheduling • Following requests: cylinders 98, 183, 37, 122, 14, 124, 65, 67 • Head initially at cylinder 53 • Total head movement of 640 cylinders [Figure: head movement over cylinders 0 to 183 under FCFS]
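The figure of 640 cylinders can be reproduced by summing the distances between consecutive requests in arrival order. A minimal sketch follows, in which the function name `fcfs_head_movement` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Total head movement, in cylinders, when the requests are served
   strictly in arrival order (first come, first served). */
int fcfs_head_movement(int start, const int *requests, size_t n)
{
    int total = 0, pos = start;
    size_t i;

    assert(requests != NULL || n == 0);
    for (i = 0; i < n; i++) {
        total += abs(requests[i] - pos);  /* seek distance to next request */
        pos = requests[i];
    }
    return total;
}
```

For the queue on the slide and a start at cylinder 53, the terms are 45 + 85 + 146 + 85 + 108 + 110 + 59 + 2 = 640.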

  37. Shortest Seek Time First • Movement of only 236 cylinders • May cause starvation • Not optimal: if the head moved from 53 to 37 and then to 14 before serving 65, 67, etc. ⇒ 208 cylinders [Figure: head movement over cylinders 0 to 183 under SSTF]
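The greedy choice behind SSTF can be sketched as follows: repeatedly serve the pending request closest to the current head position. The function name `sstf_head_movement` and the bound `MAX_REQUESTS` are assumptions of this example. Ties are broken in favour of the request that arrived first, a detail the slide leaves open.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_REQUESTS 64  /* assumed upper bound on the queue length */

/* Total head movement, in cylinders, when the pending request with the
   shortest seek distance is always served next. */
int sstf_head_movement(int start, const int *requests, size_t n)
{
    int served[MAX_REQUESTS] = {0};
    int total = 0, pos = start;
    size_t done, i;

    assert(n <= MAX_REQUESTS);
    for (done = 0; done < n; done++) {
        size_t best = n;  /* n means "no candidate found yet" */
        for (i = 0; i < n; i++) {
            if (served[i])
                continue;
            if (best == n ||
                abs(requests[i] - pos) < abs(requests[best] - pos))
                best = i;
        }
        served[best] = 1;
        total += abs(requests[best] - pos);
        pos = requests[best];
    }
    return total;
}
```

For the queue 98, 183, 37, 122, 14, 124, 65, 67 and a start at cylinder 53, the service order is 65, 67, 37, 14, 98, 122, 124, 183, giving 12 + 2 + 30 + 23 + 84 + 24 + 2 + 59 = 236 cylinders.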

  38. SCAN Scheduling • When the head reaches one end, the pending requests are more likely to be near the other end than near the read/write head • Those requests have also waited the longest ⇒ C-SCAN algorithm [Figure: head movement over cylinders 0 to 183 under SCAN]

  39. Circular-SCAN Algorithm [Figure: head movement over cylinders 0 to 183 under C-SCAN]

  40. LOOK and C-LOOK Algorithms [Figures: head movement over cylinders 0 to 183 under LOOK and under C-LOOK]
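The slides show LOOK only as a figure. As commonly defined, the head serves requests in its current direction up to the last pending request in that direction and then reverses, without travelling to the end of the disk. Under that definition the total movement depends only on the extreme requests on each side of the head, which the sketch below exploits. The function name `look_head_movement` and the `upward_first` parameter are hypothetical, and the initial direction is an assumption because the transcript does not state it.

```c
#include <assert.h>
#include <stddef.h>

/* Total head movement, in cylinders, under LOOK. If upward_first is
   nonzero the head first moves toward higher cylinder numbers,
   otherwise toward lower ones. */
int look_head_movement(int start, const int *requests, size_t n,
                       int upward_first)
{
    int lo = start, hi = start;  /* extreme requests on each side */
    size_t i;

    assert(requests != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (requests[i] < lo)
            lo = requests[i];
        if (requests[i] > hi)
            hi = requests[i];
    }
    if (upward_first)
        /* up to the highest request, then down only if needed */
        return (hi - start) + (lo < start ? hi - lo : 0);
    /* down to the lowest request, then up only if needed */
    return (start - lo) + (hi > start ? hi - lo : 0);
}
```

For the queue used throughout (98, 183, 37, 122, 14, 124, 65, 67, head at 53), starting upward gives 130 + 169 = 299 cylinders, while starting downward gives 39 + 169 = 208 cylinders.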

  41. Which One? • Depends on the load • Depends on file allocation • SSTF and LOOK seem reasonable alternatives • The real disk geometry is hidden from the OS • Disk manufacturers include scheduling in the hard disk controller • Then why not let the hard disk do all the scheduling? • Because some requests have different semantics and have to be treated differently (accesses by a higher-priority process, or paging, for example)

  42. Summary • Files • Operations • Sharing • Protection • Directory hierarchies • Disk scheduling algorithms
