Source: Robbins and Robbins, UNIX Systems Programming, Prentice Hall, 2003. Chapter 4 UNIX I/O. 4.1 Device Terminology.
Device Terminology
• A peripheral device is a piece of hardware accessed by a computer system
• Examples are floppy disk, hard disk, CD-ROM drive, monitor, keyboard, printer, mouse, and network interface
• User programs perform control and I/O to these devices through system calls to operating system modules called device drivers
• A device driver hides the details of device operation and protects the device from unauthorized use
• Some operating systems provide specific system calls for each type of peripheral device
• This requires the systems programmer to learn a complex set of calls to control the various devices
• UNIX greatly simplifies the programmer device interface by providing uniform access to most devices through five functions: open, close, read, write, and ioctl
• All devices are represented by files, called special files, that are located in the /dev directory
• Thus, disk files and other devices are named and accessed in the same way
• A regular file is just an ordinary data file on a disk
• A block special file represents a device with characteristics similar to a disk (i.e., block transfer and random access)
• A character special file represents a device with characteristics similar to a terminal, consisting of a stream of bytes that must be accessed in sequential order
Standard File Descriptors
• A file descriptor represents a file or device that is open and can be considered an index into the process file descriptor table
• The file descriptor table is in the process user area and provides access to the system information for the associated file or device
• When a program is executed, it automatically starts with three open file streams

    Symbolic Name    C I/O Name   Meaning                        Default
    STDIN_FILENO     stdin        standard input device          keyboard
    STDOUT_FILENO    stdout       standard output device         monitor
    STDERR_FILENO    stderr       standard error report device   monitor
The read() Function
• The read() function provides sequential retrieval of data from files and other devices
    #include <unistd.h>
    ssize_t read(int fileDes, void *buffer, size_t nByte);
• The function attempts to retrieve nByte bytes from the file or device denoted by fileDes into the user variable buffer
• The programmer must provide a buffer that is large enough to hold at least nByte bytes of data; passing an uninitialized buffer pointer variable is an error
• If successful, the function returns the number of bytes actually read
• If unsuccessful, it returns -1 and sets errno
• A read operation on a regular file may return fewer bytes than requested if it reaches end of file before completely satisfying the read request
• A read operation on a regular file returns 0 to indicate end of file
• When reading from a terminal, end of file is commonly signaled by entering Ctrl-D
Example use of read()

    #include <stdio.h>
    #include <unistd.h>
    #include <errno.h>

    #define BUFFER_SIZE 20

    // Function prototype
    int readLine(int fileDescriptor, char *buffer, int nbrOfBytes);

    // ********************************************
    int main(void)
    {
        char myBuffer[BUFFER_SIZE];
        int status;

        status = readLine(STDIN_FILENO, myBuffer, BUFFER_SIZE);
        if (status == -1)
            perror("readLine failed");
        else if (status > 0)
            printf("Read %d bytes: %s", status, myBuffer);
        return 0;
    } // End main

    // ********************************************
    // Reads one line, including its newline, into buffer as a string.
    // Returns the number of bytes read, 0 if end of file is the first
    // thing encountered, or -1 with errno set on error.
    int readLine(int fileDescriptor, char *buffer, int nbrOfBytes)
    {
        int bufferIndex = 0;
        ssize_t returnValue;

        while (bufferIndex < nbrOfBytes - 1)   // Leave space for '\0'
        {
            returnValue = read(fileDescriptor, buffer + bufferIndex, 1);
            if (returnValue == -1)             // An error occurred
                return -1;
            if (returnValue == 0)              // EOF encountered
            {
                if (bufferIndex == 0)
                    return 0;                  // EOF encountered as the first character read
                // EOF encountered before a newline character was read
                errno = EINVAL;                // Invalid value
                return -1;
            } // End if
            bufferIndex++;
            if (buffer[bufferIndex - 1] == '\n')   // Read end of line character
            {
                buffer[bufferIndex] = '\0';        // Add null byte to end of string
                return bufferIndex;
            } // End if
        } // End while

        // nbrOfBytes was reached before encountering end of line character
        errno = EINVAL;
        return -1;
    } // End readLine
The write() Function
• The write() function provides sequential transfer of data to files and other devices
    #include <unistd.h>
    ssize_t write(int fileDes, const void *buffer, size_t nByte);
• The function attempts to output nByte bytes from the user variable buffer to the file or device denoted by fileDes
• The programmer must provide a buffer holding at least nByte bytes of data
• If successful, the function returns the number of bytes actually written, which may be fewer than nByte
• If unsuccessful, it returns -1 and sets errno
• Example call (the buffer is assumed to have been filled with data beforehand)
    #define MAX_SIZE 100

    // Code segment inside a function
    char buffer[MAX_SIZE];
    ssize_t nbrBytesWritten;

    nbrBytesWritten = write(STDOUT_FILENO, buffer, MAX_SIZE / 2);
Example use of read() and write()

    #include <stdio.h>
    #include <unistd.h>

    #define BUFFER_SIZE 200

    // Function prototype
    int copyFile(int sourceFD, int targetFD);

    // ********************************************
    int main(void)
    {
        int nbrBytesCopied;

        nbrBytesCopied = copyFile(STDIN_FILENO, STDOUT_FILENO);
        fprintf(stderr, "\nNumber of bytes copied: %d\n", nbrBytesCopied);
        return 0;
    } // End main

    // **********************************************
    // Copies bytes from sourceFD to targetFD until end of file or an error.
    // Returns the number of bytes written. A short write (fewer bytes
    // written than were read) is not retried here.
    int copyFile(int sourceFD, int targetFD)
    {
        char buffer[BUFFER_SIZE];
        ssize_t nbrBytesRead;
        ssize_t nbrBytesWritten;
        int totalBytes = 0;

        for ( ; ; )   // Infinite loop
        {
            nbrBytesRead = read(sourceFD, buffer, BUFFER_SIZE);
            if (nbrBytesRead <= 0)      // End of file or error
                break;

            nbrBytesWritten = write(targetFD, buffer, (size_t)nbrBytesRead);
            if (nbrBytesWritten <= 0)   // Error, or nothing written
                break;

            totalBytes = totalBytes + (int)nbrBytesWritten;
        } // End for

        return totalBytes;
    } // End copyFile
The open() Function
• The open() function associates a file descriptor in a program with a physical file or device
    #include <fcntl.h>
    #include <sys/stat.h>
    int open(const char *path, int openFlag, ...);
• The path parameter points to the pathname of the file or device
• The openFlag parameter specifies status flags and access modes for the opened file
• A third permissions parameter must be included to specify access permissions if a file is being created
• If successful, open() returns a nonnegative integer representing the open file descriptor
• If unsuccessful, open() returns -1 and sets errno
The openFlag Parameter
• The openFlag parameter is formed by taking the bitwise OR ( | ) of the desired access mode and any additional flags
• The access mode flags are O_RDONLY, O_WRONLY, and O_RDWR
• Exactly one of these modes must be used in a call
• Additional flags are
    O_APPEND     Moves the file offset to the end of the file before each write, so data is appended
    O_CREAT      Causes a file to be created if it doesn't exist; a third parameter is required to designate the file permissions
    O_EXCL       Used with O_CREAT, makes open() fail if the file already exists, which avoids writing over it
    O_NONBLOCK   Controls whether the open function returns immediately or blocks until the device is ready
• Below is an example of an open call to read a file
    int inputFileFD;

    inputFileFD = open("/home/user/numbers.dat", O_RDONLY);
The Permissions Parameter
• Each file has three permission classes associated with it: user, group, and others
• The possible permissions in each class are read (r), write (w), and execute (x)
• When a file is opened with the O_CREAT flag, a third permissions parameter of type mode_t must be specified
• The table on the next slide lists the symbolic names and their meanings for designating the permissions
• To form the permissions value, the programmer performs a bitwise OR of the symbols corresponding to the desired permissions
    int fileID;
    mode_t mode = (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);

    fileID = open("info.dat", O_RDWR | O_CREAT, mode);
    if (fileID == -1)
        perror("Could not open file");
Symbolic Permission Names

    Symbol     Meaning
    S_IRUSR    read by owner
    S_IWUSR    write by owner
    S_IXUSR    execute by owner
    S_IRGRP    read by group
    S_IWGRP    write by group
    S_IXGRP    execute by group
    S_IROTH    read by others
    S_IWOTH    write by others
    S_IXOTH    execute by others
UNIX Permissions
• The permissions can be viewed as three sets of three binary digits, one set per class
    user   group   others
    rwx    rwx     rwx
    111    111     111
• In each class, a 1 in a column means that specific permission is granted; otherwise the value is 0
• The three binary digits in a class can be interpreted as one octal digit
• For example, 700 means the permissions are user 7, group 0, and others 0
    7   r, w, and x all have the value 1   111 binary is 7 octal
    1   only x has the value 1             001 binary is 1 octal
    4   only r has the value 1             100 binary is 4 octal
    5   r and x have the value 1           101 binary is 5 octal
    6   r and w have the value 1           110 binary is 6 octal
• The chmod command can be used to change the permissions of a file or directory
• Examples
    Command                 Resulting listing
    chmod 600 myfile.dat    -rw-------  myfile.dat
    chmod 700 privateDir    drwx------  privateDir
    chmod 700 programA      -rwx------  programA
    chmod 711 http          drwx--x--x  http
    chmod 644 index.html    -rw-r--r--  index.html
The close() Function
• The close() function releases the resources of the specified open file descriptor in a program
    #include <unistd.h>
    int close(int fileDescriptor);
• The fileDescriptor parameter should be a valid open file descriptor
• If successful, close() returns zero
• If unsuccessful, close() returns -1 and sets errno
Example use of open() and close()

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/stat.h>

    #define READ_FLAGS O_RDONLY
    #define WRITE_FLAGS (O_WRONLY | O_CREAT | O_EXCL)
    #define WRITE_PERMS (S_IRUSR | S_IWUSR)
    #define BUFFER_SIZE 200

    // Function prototype (copyFile is defined in the read() and write() example)
    int copyFile(int sourceFD, int targetFD);

    // **************************************************
    int main(int argc, char *argv[])
    {
        int byteCount;
        int fromFile;
        int toFile;

        if (argc != 3)
        {
            fprintf(stderr, "Usage: %s from_file to_file\n", argv[0]);
            return 1;
        } // End if

        fromFile = open(argv[1], READ_FLAGS);
        if (fromFile == -1)
        {
            perror("Failed to open input file");
            return 1;
        } // End if

        // O_EXCL makes this fail if the output file already exists
        toFile = open(argv[2], WRITE_FLAGS, WRITE_PERMS);
        if (toFile == -1)
        {
            perror("Failed to create output file");
            close(fromFile);   // Release the descriptor that was opened
            return 1;
        } // End if

        byteCount = copyFile(fromFile, toFile);
        printf("%d bytes copied from %s to %s\n", byteCount, argv[1], argv[2]);

        close(fromFile);
        close(toFile);
        return 0;
    } // End main
File Representation
• Files can be designated within C programs either by a file descriptor of type int or by a file pointer of type FILE *
    int inputfile;
    FILE *dataFile;
• The UNIX I/O functions (open, read, write, close, and ioctl) use file descriptors of type int
• The standard C I/O functions (fopen, fscanf, fprintf, fread, fwrite, fclose, etc.) use file pointers of type FILE *
• File descriptors and file pointers provide logical designations called handles for performing device-independent input and output operations
• The stdio.h file defines the symbolic names for the standard input (stdin), output (stdout), and error (stderr) file pointers
• These are analogous to the STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO symbolic names defined in the unistd.h file
File Descriptors
• The open() function associates a file or physical device with the logical handle used in the program
• The file or physical device is specified by a character string such as "/home/user/myfile.dat" or "/dev/tty"
• The handle is an integer that serves as an index into a file descriptor table that is specific to a process
• This table contains an entry for each open file in a process
• It is part of the process user area, but the program cannot access it except through functions using the file descriptor
• Refer to the figure on the next slide
• The open() function creates an entry in the process file descriptor table that points to an entry in the system file table; in the figure's example it returns the value 3, specifying that the file descriptor entry is in position three of the process file descriptor table
• The system file table, which is shared by all the processes in the system, has an entry for each active open file; each system file table entry contains the file offset, an indication of the access mode, and a count of the number of file descriptor table entries pointing to it
• Several system file table entries may correspond to the same physical file; each of these entries points to the same entry in the in-memory inode table, which contains an entry for each active file in the system
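Relationship between File Tables
    myfd = open("/home/ann/my.dat", O_RDONLY);
[Figure: the process file descriptor table, the system file table, and the in-memory inode table after this call]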
File Pointers and Buffering
• When using the I/O functions in the C function library, file pointers serve as the handles
• A file pointer points to a data structure named FILE that is created in the user area of the process
    FILE *myfp;

    myfp = fopen("/home/ann/my.dat", "w");
    if (myfp == NULL)
    {
        perror("Failed to open the file");
        return 1;
    }
    else
        fprintf(myfp, "This is a test");
• Refer to the figure on the next slide
• The FILE structure contains a buffer and a file descriptor value, which is the index of the entry in the file descriptor table
• The contents of disk files are usually buffered, which means the phrase "This is a test" is not written immediately to the disk, but instead is written to the buffer in the FILE structure
• When the buffer fills, or when the file is closed, the I/O subsystem calls the write() function and transfers the buffer contents to the disk
Handling of a File Pointer
    myfp = fopen("/home/ann/my.dat", "w");
[Figure: the FILE structure, with its buffer and file descriptor value, after this call]
More on Output Buffering
• The delay between the time when a program executes fprintf() and the time when the disk writing actually occurs may have consequences, especially if the program crashes
• A program can avoid the effects of buffering by using the fflush() call, which forces the current buffer contents to be written out
• Terminal I/O works a little differently
• Files associated with terminals are line buffered rather than fully buffered
• The only exception to this is standard error, which by default is not buffered
• On output, line buffering means that the line is not written out until the buffer is full or until a newline symbol is encountered
• If a program that uses file pointers for a buffered device crashes, the last partial buffer created from the fprintf() calls may not be written out
• Even the completion of a write() operation doesn't mean that the data has finally made it to the disk
• The operating system copies the data to a system buffer cache and periodically writes these blocks to the disk
• If the operating system crashes, the data in this cache could be lost
• When the fork() function is used to create a child process, the child inherits a copy of the parent's file descriptor table
Filters
• UNIX provides many utilities (i.e., programs) that are written as filters
• A filter reads from standard input, performs a transformation on the data, and writes the results to standard output
• Error messages are written to standard error
• All of the parameters of a filter are communicated as command-line arguments
• The input data should have no headers or trailers, and the filter should not require any interaction with the user
• Examples of useful UNIX filters are head, tail, more, sort, grep, and awk
• The cat utility can also act as a filter
• It takes a sequence of one or more filenames as command-line arguments, reads each of the files in succession, and echoes the contents of each file to standard output
• If no input files are specified, cat takes its input from standard input and writes its results to standard output; in this case, cat behaves like a filter
• It also can function as a simple editor as shown in the example below
    % cat > letter.txt
    This is a test
    <Ctrl-D>
    %
Redirection
• Each of the file streams can be redirected at the command line as shown below
    Redirect standard input
        a.out < inputfile.dat
    Redirect standard output
        a.out > outfile.dat
    Redirect standard error
        a.out 2> errorfile.dat
    Redirect standard input and standard output
        a.out < inputfile.dat > outputfile.dat
• A simple pipe ( | ) can also be used for redirection; it directs the standard output of the first program to the standard input of the second program
        a.out | sort
• Redirection and piping can be combined on one command line
        ps -ef | grep "jjt107" > myProcesses.txt