320 likes | 488 Views
System Programming. Chapter 3 File I/O. Announcement. The first exercise is due today. We will NOT accept late assignment this time. Submission site is up. Sign up your account, activate your account, and submit your exercise on-line if possible.
E N D
System Programming Chapter 3 File I/O
Announcement • The first exercise is due today. We will NOT accept late assignment this time. • Submission site is up. • Sign up your account, activate your account, and submit your exercise on-line if possible. • You can also hand in your exercise to TAs. • MP1 is out and is due on March 21st.
Fun Project • Topobo (MIT media lab) • A new way of “programming” without “writing programs”?
Tips for Writing Good Codes – Design to Test • Avoid the errors is much easier than fixing the errors. • Tips: • Writing accessible code: be prepared for giving your code away when you start. • Document your code. • Giving the variables good names, or things can get very confusing, very fast, as the cognitive dissonance interferes with your brain's normal processing. • InputFromFile is better than buf1/data_2. • validsize(InputNum) is better than checksize(InputNum) • Automatically test your code not manually • Writing unit test
Writing Unit Tests • Write a test module for your function. • Example: double mySqrt(double num). • What to test your function? • Pass in a negative argument and ensure that it is rejected. • Pass in an argument of zero to ensure that it is accepted. • Pass in values between zero and the maximum expressible argument and verify that the different between the square of the result and the original argument is less than some value epsilon.
24 void testValue(double num, double expected) { 25 double result = mySqrt(num); 26 if (num < 0) { 27 // Should be NaN if negative arg 28 assert(isnan(result)); 29 } 30 else { 31 // Should be within tolerance otherwise 32 assert(fabs(expected-result) < epsilon); 33 } 34 }
Contents • Preface/Introduction • Standardization and Implementation • File I/O • Standard I/O Library • Files and Directories • System Data Files and Information • Environment of a Unix Process • Process Control • Signals • Inter-process Communication
File I/O • Objective of this chapter: • Functions available for file I/O • Atomic operations in multiprogramming environments • Unbuffered I/O • Popular functions: open, close, read, write, lseek, dup, fcntl, ioctl • Each read() and write() invokes a system call!
File I/O • File • A sequence of bytes • Directory • A file that includes info on how to find other files.
File I/O • Path name • Absolute path name • Start at the root / of the file system • /user/john/fileA • Relative path name • Start at the “current directory” which is an attribute of the process accessing the path name. • ./dirA/fileB • Links (wait till later chapters)
File I/O • File Descriptor • Non-negative integer returned by open() or creat(): 0 .. OPEN_MAX • Virtually un-bounded for SVR4 & 4.3+BSD • Per-process base • POSIX.1 – 0: STDIN_FILENO, 1: STDOUT_FILENO, 2: STDERR_FILENO • <unistd.h> • Convention employed by the Unix shells and applications
File I/O - File Manipulation • Operations • open, close, read, write, lseek, dup, fcntl, ioctl, trunc, rename, chmod, chown, mkdir, cd, opendir, readdir, closedir, etc. • File descriptor data block data block i-node i-node Read(4, …) i-node sync System Open File Table In-core v-node list Tables of Opened Files (per process)
File I/O – open() #include <sys/types> #include <sys/stat.h> #include <fcntl.h> int open(const char*pathname, int oflag, …/*, mode_t mode */); • File/Path Name • PATH_MAX, NAME_MAX • _POSIX_NO_TRUNC -> ENAMETOOLONG if error occurs (NAME_MAX or PATH_MAX). • O_RDONLY, O_WRONLY, O_RDWR • O_APPEND, O_TRUNC, (O_NOCTTY) • O_CREAT, O_EXCL -> (atomicity) • O_NONBLOCK • O_DSYNC (write data only), O_RSYNC (read waits for write completion), O_SYNC (write data & attr)
File I/O – creat() and close() #include <sys/types> #include <sys/stat.h> #include <fcntl.h> int creat(const char*pathname, mode_t mode); • = open(pathname, O_WRONLY | O_CREAT | O_TRUNC, mode) • Only for write-access #include <unistd.h> int close(int filedes); • All open files are automatically closed by the kernel when a process terminates.
Why need atomicity? fd = open(“foo”, O_CREAT | O_EXCL, mode); Can be rewritten to … if ((fd = open(“foo”, O_WRONLY)) < 0) { // multiprogramming can do something bad unexpectedly if (errno == ENOENT) { if ((fd = creat(“foo”, mode)) < 0) { err_sys(“create error”); } ….. One complex operation with multiple functiion calls -> all steps or none executed
File I/O - lseek #include <sys/types> #include <unistd.h> off_t lseek(int filedes, off_t offset, int whence); • Change the offset in a file descriptor • whence: SEEK_SET(beginning), SEEK_CUR, SEEK_END • Example: fd = open(); lseek(fd, 100, SEEK_SET) • (Say you lose track) how do you find the current offset? currpos = lseek(fd, 0, SEEK_CUR) • off_t: typedef long off_t; /* 231 bytes */ • or typedef longlong_t off_t; /* 263 bytes */ • No I/O takes place until next read or write.
File I/O - lseek • Program 3.1 – Page 52 • Test if “standard input” is capable of seeking? • cat < /etc/motd | seek cannot seek a FIFO or pipe (EPIPE) • seek < /var/spool/cron/FIFO • Program 3.2 – Page 53, hole creating! • od –c file.hole 000000 a b c \0 \0 \n 000006
File I/O – read and write #include <unistd.h> ssize_t read(int filedes, void *buf, size_t nbytes); • Less than nbytes of data are read: • EOF, terminal device (line-input), network buffering, record-oriented devices (e.g., tape) • Offset is increased for every read() – SSIZE_MAX #include <unistd.h> ssize_t write(int filedes, const void *buf, size_t nbytes); • Write errors for disk-full or file-size-limit causes. • When O_APPEND is set, the file offset is set to the end of the file before each write operation.
File I/O - Efficiency • Program 3.3 – Page 56 • No needs to open/close standard input/output • Copy stdin to stdout (> /dev/null) • Try I/O redirection in reading an 1.4M file • Buffersize UsrCPU SysCPU Clock #loops 1 23.8s 397.9s 423.4s 1468802 64 0.3s 6.6s 7.0s 22951 512 0.0s 1.0s 1.1s 2869 1024 0.0s 0.6s 0.6s 1435 8192 0.0s 0.3s 0.3s 180 131072 0.0s 0.3s 0.3s 12
File I/O – Sharing • Table per process: filedes flags (close-on-exec), a pointer • Sys open file table: file status(open flags), offset, a v-node pointer • V-node (V = virtual) • i-node: owner, file size, residing device, block ptr,.. data block data block i-node i-node Read(4, …) i-node sync System Open File Table In-core V-node list Tables of Opened Files (per process)
filedes flags file size file status / offset System Open File Table In-core i-node list Tables of Opened Files (per process) File I/O – Sharing • Each “independently opened file” has its offset. • Examples • Write offset is incremented! • O_APPEND offset = current file size before each write • lseek() causes no I/O (only on the system open file table) • fork() & dup()causes the sharing of entries in the (system open) file table. • filedes flags versus file status flags
Why needs Atomic Operation? • Atomic Operation • Composed of multiple steps (function calls)? • Say you want to append to the end of a file • What can go wrong in multiple processes? if (lseek(fd, 0L, 2) < 0) err_sys(“lseek err”); if (write(fd, buf, 10) != 10) err_sys(“wr err”); • Problem solved with O_APPEND fd = open(pathname, O_WRONLY | O_APPEND, mode); write(fd, buf, 10)
File I/O – pread and pwrite #include <unistd.h> ssize_t pread(int filedes, void *buf, size_t nbytes, off_t offset); ssize_t pwrite(int filedes, const void *buf, size_t nbytes, off_t offset); • Read and write from a filedes at a “specified” offset. • Why create them? (= lseek + read/write ?) • Do not update the file offset • Atomic operation
File I/O – dup and dup2 #include <unistd.h> int dup(int filedes); int dup2(int filedes, int newfiledes); • dup(): duplicate filedes & return the lowest available filedes. • dup2(): duplicate filedes to newfiledes • What if newfiledes is already open? • close(newfiledes); fcntl(fildes, F_DUPFD, newfiledes); // atomic Tables of Opened Files (per process) System Open File Table In-core i-node list
File I/O: sync(), fsync(), fdatasync() Kernel maintains a buffer cache between apps & disks. #include <unistd.h> int fsync(int filedes); // data + attr sync int fdatasync(int filedes); // data sync void sync(); // returns immediately Apps write() Kernel buffer cache Disk DD queue Disks
File I/O - fcntl #include <sys/types> #include <unistd.h> #include <fcntl.h> int fcntl(int filedes, int cmd, … /* int arg */); • Changes the properties of opened files • F_DUPFD: duplicate an existing file descriptor (>= arg). • FD_CLOEXEC is cleared (for exec()). • F_GETFD, F_SETFD: file descriptor flag, e.g., FD_CLOEXEC • F_GETFL, F_SETFL: file status flags • O_APPEND, O_NOBLOCK, O_SYNC, O_ASYNC, O_RDONLY, O_WRONLY … val = fcntl(fd, F_GET_FL, 0); val |= flags; fcntl(fd, F_SETFL, val) • F_GETOWN, F_SETOWN
File I/O - fcntl • Figure 3.10 • Print file flags for a specified descriptor • Figure 3.11 • Turn on one or more flags • val &= ~flags: clear the flag. • set_fl(STDOUT_FILENO, O_SYNC);
File I/O - ioctl #include <unistd.h> #include <sys/ioctl.h> int ioctl(int filedes, int request, …); • Catchall for I/O operations (not just disk I/O) • E.g., setting of the size of a terminal’s window. • SVR4 prototype • More headers could be required: • Disk labels (<disklabel.h>), file I/O (<ioctl.h>), mag tape (<mtio.h>), socket I/O (<ioctl.h>), terminal I/O (<ioctl.h>)
File I/O - /dev/fd • /dev/fd/n • open(“/dev/fd/n”,mode) duplicate descriptor n (assuming that n is open) • open(“/dev/fd/0”,mode) same as fd=dup(0) • The new mode must be a subset of that of the referenced file. • Uniformity and Cleanliness! • cat /dev/fd/0
What did I learn today? • What can you do with File I/O? • Why is atomicity needed? • What is the impact of sync on I/O performance?