Unix Programming Environment Part 6-2 – Standard I/O Library, File and Directory Prepared by Xu Zhenya( xzy@buaa.edu.cn ) Draft – Xu Zhenya( 2002/10/01 )
Agenda
• 1. Overview
• 2. Standard I/O Routines
  • Chapter 6
• 3. Files and Directories
  • Chapter 7 (1, 2, 3)
Overview (2)
• 1. POSIX I/O:
  • ssize_t pread( int fd, void *buf, size_t nbytes, off_t offset );
  • ssize_t pwrite( int fd, const void *buf, size_t nbytes, off_t offset );
• 2. Scatter/gather I/O:
  • ssize_t readv( int fd, const struct iovec *iov, int iovcnt );
  • ssize_t writev( int fd, const struct iovec *iov, int iovcnt );
  • struct iovec { void *iov_base; size_t iov_len; } (older systems declare the fields as caddr_t and int)
• 3. Nonblocking I/O
  • O_NONBLOCK & O_NDELAY
• 4. I/O multiplexing
  • select() & poll()
• 5. Async I/O
  • Performance: commercial RDBMS, concurrency model
  • SIGIO
  • Implementation: user-level, kernel-level
• 6. Memory-mapped file I/O
  • void *mmap( void *addr, size_t len, int prot, int flags, int fd, off_t off );
• 7. 64-bit files: off_t => offset_t, llseek
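The positional and gather calls listed above can be combined in a few lines. The following is a minimal sketch, not part of the original slides: the helper name `gather_then_pread` and the temp-file template are assumptions made for illustration. It writes two buffers with one writev() call and reads a slice back with pread(), which leaves the file offset untouched.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: gather-write "hello, " + "world" into a temp file,
 * then read `len` bytes starting at `offset` with pread().
 * Returns the number of bytes read, or -1 on error. */
ssize_t gather_then_pread(char *out, size_t len, off_t offset)
{
    char path[] = "/tmp/sgioXXXXXX";   /* illustrative template for mkstemp() */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                      /* file disappears once fd is closed */

    char head[] = "hello, ";
    char tail[] = "world";
    struct iovec iov[2];
    iov[0].iov_base = head;
    iov[0].iov_len  = strlen(head);
    iov[1].iov_base = tail;
    iov[1].iov_len  = strlen(tail);

    ssize_t n = -1;
    /* One system call writes both buffers back to back. */
    if (writev(fd, iov, 2) == (ssize_t)(strlen(head) + strlen(tail)))
        n = pread(fd, out, len, offset);   /* no lseek() needed */

    close(fd);
    return n;
}
```

Calling `gather_then_pread(buf, 5, 7)` should fill `buf` with the five bytes that follow the seven-byte first buffer.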
Standard I/O (1)
• 1. History
  • The standard I/O library was written by Dennis Ritchie around 1975. Surprisingly, little has changed in the standard I/O library after more than 27 years.
• 2. Streams and FILE objects
  • Linux: /usr/include/libio.h, struct _IO_FILE
  • On some UNIX systems the number of open streams is limited: < 256 / 1024
  • STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO
• 3. Buffering
  • Fully buffered:
    • files that reside on disk
    • fflush(): forces a write of all user-space buffered data for the given output or update stream
Standard I/O (2)
  • Line buffered:
    • I/O is performed when a newline character is encountered on input or output.
    • Used when the stream refers to a terminal.
  • Unbuffered:
    • The standard I/O library does not buffer the characters.
    • stderr
  • Notes:
    • Default setting:
      • Standard error is never fully buffered; in practice it is unbuffered.
      • All other streams are line buffered if they refer to a terminal device; otherwise they are fully buffered.
    • Buffer sizes are implementation-dependent; on older systems line buffers are 128 bytes and full buffers are 1 KB.
    • setbuf(), setvbuf()
      • Make sure that the buffer space still exists for as long as the stream is open (for example, do not pass an automatic array and then return from the function).
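The effect of full buffering can be observed directly. This is a hedged sketch, assuming a temp stream from tmpfile() and two hypothetical helpers (`disk_size`, `buffering_demo`): the bytes only reach the file after fflush().

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: size of the file behind fp as the kernel sees it. */
static long disk_size(FILE *fp)
{
    struct stat st;
    return fstat(fileno(fp), &st) == 0 ? (long)st.st_size : -1L;
}

/* Writes 5 bytes to a fully buffered temp stream and records the file size
 * before and after fflush(). Returns 0 on success, -1 on error. */
int buffering_demo(long *before, long *after)
{
    static char buf[BUFSIZ];     /* static: must outlive the open stream */
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    /* setvbuf() must be called before any other operation on the stream. */
    if (setvbuf(fp, buf, _IOFBF, sizeof buf) != 0) {
        fclose(fp);
        return -1;
    }
    fputs("hello", fp);          /* stays in the user-space buffer */
    *before = disk_size(fp);
    fflush(fp);                  /* forces the write() */
    *after = disk_size(fp);
    fclose(fp);
    return 0;
}
```

The buffer is declared static on purpose: an automatic array would vanish when the function returns, which is the pitfall the note on setbuf()/setvbuf() warns about.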
Standard I/O (3)
• 4. Reading the textbook:
  • P131: table 6-3
  • P141: table 6-4
• 5. Misc.
  • FILE *freopen( const char *pathname, const char *type, FILE *fp );
    • Typically used to open a specified file as one of the predefined streams: standard input, output or error.
  • FILE *fdopen( int fd, const char *type );
    • Often used with descriptors that are returned by the functions that create pipes and network communication channels.
Standard I/O (4)
  • int fileno( FILE *stream );
  • ferror(), feof(), clearerr()
  • Positioning a stream:
    • ftell(), fseek(), rewind()
    • fgetpos(), fsetpos()
  • Formatted output:
    • vprintf, vfprintf, vsprintf
    • va_list; man vprintf; man 3 stdarg
  • Binary I/O
    • size_t fread( void *p_buf, size_t size, size_t nobj, FILE *fp );
    • Read and write a C structure?
      • Compilers: the binary layout of a structure: alignment (#pragma pack(1))
      • CPU architectures: floating-point values
      • To exchange binary data: network protocols
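Reading and writing a C structure with fwrite()/fread() looks like the sketch below. The `struct record` type and the helper name are hypothetical. The round trip is only safe on the same machine with the same compiler settings, for the layout reasons listed above.

```c
#include <stdio.h>

/* Hypothetical record type; its padding and byte order are
 * compiler- and CPU-specific. */
struct record {
    int    id;
    double value;
};

/* Writes *in to a temp file as raw bytes and reads it back into *out.
 * Returns 0 on success, -1 on error. */
int record_roundtrip(const struct record *in, struct record *out)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    int ok = fwrite(in, sizeof *in, 1, fp) == 1;
    rewind(fp);   /* an update stream needs repositioning between write and read */
    ok = ok && fread(out, sizeof *out, 1, fp) == 1;

    fclose(fp);
    return ok ? 0 : -1;
}
```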
Standard I/O (5)
• 6. Temporary files
  • char *tmpnam( char *ptr );
    • If ptr is NULL, it returns a pointer to a static area, so copy the result before the next call.
    • If ptr is not NULL, then lengthof( ptr ) >= L_tmpnam ( <stdio.h> )
  • char *tempnam( const char *directory, const char *prefix );
    • Lets the caller specify both the directory and a prefix of the generated pathname.
  • FILE *tmpfile( void );
    • Creates a temporary file (wb+) that is automatically removed when it is closed or on program termination.
• 7. Cautions for using standard I/O
  • Don’t mix standard I/O and system-level I/O.
  • If we write to the file at system level, but then read at the standard I/O level => we might lose the changes.
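A related hazard of mixing the two levels is easy to reproduce. The sketch below is a variant of the caution above (it shows write ordering rather than a lost read), with a hypothetical helper name: data written through the stream sits in the stdio buffer while a direct write() reaches the file first.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: writes "A" through stdio and "B" through write(2)
 * on the same temp file, then reports the resulting file contents.
 * Returns 0 on success, -1 on error; `out` receives a 2-character string. */
int mixed_io_order(char out[3])
{
    FILE *fp = tmpfile();          /* fully buffered: not a terminal */
    if (fp == NULL)
        return -1;
    int fd = fileno(fp);

    fputs("A", fp);                /* held in the stdio buffer */
    if (write(fd, "B", 1) != 1) {  /* goes straight to the file */
        fclose(fp);
        return -1;
    }
    fflush(fp);                    /* only now does "A" reach the file */

    ssize_t n = pread(fd, out, 2, 0);
    out[n > 0 ? n : 0] = '\0';
    fclose(fp);
    return n == 2 ? 0 : -1;
}
```

Although the program issued "A" before "B", the file holds them in the opposite order, which is why the two interfaces should not be interleaved on one file without explicit flushing.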
File and Directory (1)
• Process table entry (one slot per file descriptor, starting at 0):
  • the file descriptor flags: close-on-exec (FD_CLOEXEC)
  • a pointer to a file table entry
• The file structure in the kernel:
  • the file status flags (read, write, append, sync, nonblocking, etc.)
  • the current file offset, the reference count, and a pointer to the v-node table entry for this file
• v-node structure
• On Solaris, we can use “crash” to trace all data structures in the diagram.
• Linux: inode->i_op => ramfs.c::ramfs_get_inode (OO: object model)
File Operations (1)
• 1. open & create
  • open, creat
• 2. read & write, synchronize the file to disk, change the offset location, change the size of a file
  • read, write
  • sync, fsync
  • lseek, llseek
  • ftruncate, truncate
• 3. close, delete
  • close, unlink, remove
File Operations (2)
• 4. check accessibility, files’ attributes
  • stat, fstat, lstat
  • access
  • umask, chmod, fchmod, chown, fchown, lchown, utime
• 5. fcntl, ioctl
• 6. processes
  • chdir, fchdir, getcwd, chroot
  • umask, dup, dup2
open and creat
• int open( const char *pathname, int flags, mode_t mode );
  • O_RDONLY, O_WRONLY, O_RDWR
  • O_APPEND, O_TRUNC
  • O_CREAT, O_EXCL
  • O_SYNC
  • O_NONBLOCK, O_NDELAY
• int creat( const char *pathname, mode_t mode );
  • Equivalent to: open( pathname, O_WRONLY | O_CREAT | O_TRUNC, mode );
  • Temporary files: creat() opens the file for writing only, so to write a file and then read it back use:
    • open( pathname, O_RDWR | O_CREAT | O_TRUNC, mode );
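The O_CREAT | O_EXCL pair makes "test and create" a single atomic step. A minimal sketch, assuming a hypothetical helper `create_exclusive` and a mode of 0600 chosen for illustration:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if this call created `path`,
 * 0 if the file already existed, -1 on any other error. */
int create_exclusive(const char *path)
{
    /* With O_EXCL, open() fails with EEXIST instead of reusing the file. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return errno == EEXIST ? 0 : -1;
}
```

This is the usual basis for simple lock files: only one of several competing processes gets the return value 1.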
close, unlink, lseek, truncate
• close, unlink, remove
  • Two things are involved: data structures in kernel memory, and the links of an inode in the file system.
  • When a process terminates, all open files are automatically closed by the kernel.
  • unlink: decrements the link count (links--)
    • If the name was not the last link to a file: only the name is removed.
    • If the name was the last link to a file and no process has the file open: the file is deleted and its space is freed.
    • If the name was the last link to a file and any process has the file open: the file remains in existence until the last descriptor referring to it is closed.
  • remove deletes a name from the file system. It calls unlink for files, and rmdir for directories.
• lseek
  • Symbolic constants: SEEK_SET(0), SEEK_CUR(1), SEEK_END(2)
  • It only records the current file offset within the kernel => NO I/O operation takes place.
  • Seeking past the end and then writing creates a hole in the file.
• int truncate( const char *path, off_t length );
• int ftruncate( int fd, off_t length );
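Both the "unlink while open" case and the file hole can be seen in one short sketch. The helper name `make_hole` and the temp-file template are assumptions; `gap` is assumed to be at least 1 so that offset 1 lies inside the hole.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes one byte, seeks `gap` bytes past it, writes
 * one more byte. Returns the final file size (or -1) and stores the byte
 * found at offset 1, which lies inside the hole, in *mid. */
off_t make_hole(off_t gap, char *mid)
{
    char path[] = "/tmp/holeXXXXXX";   /* illustrative mkstemp() template */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* last link gone, but the file lives until close(fd) */

    off_t size = -1;
    if (write(fd, "a", 1) == 1 &&
        lseek(fd, gap, SEEK_CUR) != (off_t)-1 &&   /* offset update only */
        write(fd, "b", 1) == 1) {
        size = lseek(fd, 0, SEEK_END);
        if (pread(fd, mid, 1, 1) != 1)             /* holes read back as 0 */
            size = -1;
    }
    close(fd);
    return size;
}
```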
read and write
• read
  • The number of bytes actually read can be less than the requested amount:
    • A regular file: the end of file is reached
    • A terminal device: normally up to one line is read at a time
    • A network endpoint: buffering within the network may deliver fewer bytes than requested
    • Record-oriented devices like a magnetic tape: up to a single record is returned at a time
• write
  • The return value is usually equal to the nbytes argument; otherwise an error has occurred: the disk is full, the file size limit is exceeded, etc.
• I/O Efficiency
  • Textbook, p148
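Because read() may return short counts, callers that need an exact number of bytes usually loop. The following is a common pattern sketched under the assumption of a blocking descriptor; the name `read_full` is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling read() until `n` bytes have arrived,
 * end of file is reached, or an error occurs.
 * Returns the number of bytes read (possibly < n at EOF), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t n)
{
    size_t done = 0;
    while (done < n) {
        ssize_t r = read(fd, (char *)buf + done, n - done);
        if (r == 0)
            break;                 /* end of file */
        if (r < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)r;
    }
    return (ssize_t)done;
}
```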
fcntl
• int fcntl( int fd, int cmd, … /* int arg */ );
• Changes the properties of a file that is already open.
• Example:
    flag = fcntl( fd, F_GETFL, 0 );
    fcntl( fd, F_SETFL, flag | O_NONBLOCK );
• The fcntl function is used for five different purposes:
  • Duplicate an existing descriptor
  • Get/set file descriptor flags
  • Get/set file status flags
  • Get/set asynchronous I/O ownership
  • Get/set record locks
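The two-step example on this slide can be wrapped with error checking. A minimal sketch; the function name `set_nonblocking` is an assumption:

```c
#include <fcntl.h>

/* Hypothetical helper: turns on O_NONBLOCK for fd while preserving the
 * other file status flags. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);      /* fetch current status flags */
    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    return 0;
}
```

Reading the flags first matters: passing only O_NONBLOCK to F_SETFL would clear flags such as O_APPEND that were already set.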
ioctl
• int ioctl( int fd, int request, … );
• Manipulates the underlying device parameters of special files: device drivers, etc.
• Terminal I/O: these operations are now covered by functions in the POSIX.1 standard
stat, fstat and lstat
• int stat( const char *file_name, struct stat *buf );
• int fstat( int filedes, struct stat *buf );
• int lstat( const char *file_name, struct stat *buf );

    struct stat {
        dev_t         st_dev;      /* device */
        ino_t         st_ino;      /* inode */
        mode_t        st_mode;     /* protection */
        nlink_t       st_nlink;    /* number of hard links */
        uid_t         st_uid;      /* user ID of owner */
        gid_t         st_gid;      /* group ID of owner */
        dev_t         st_rdev;     /* device type (if inode device) */
        off_t         st_size;     /* total size, in bytes */
        unsigned long st_blksize;  /* blocksize for filesystem I/O */
        unsigned long st_blocks;   /* number of blocks allocated */
        time_t        st_atime;    /* time of last access */
        time_t        st_mtime;    /* time of last modification */
        time_t        st_ctime;    /* time of last change */
    };

• Textbook: p156, example: checkmail.c
access, umask and chmod
• int access( const char *pathname, int mode );
  • Checks whether the process would be allowed to read or write the file (or other file system object), or tests whether it exists.
  • For a symbolic link, the check applies to the file referred to by the link.
  • R_OK, W_OK, X_OK, F_OK
• umask, chmod, fchmod, chown, fchown, lchown, utime
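How umask interacts with the mode argument of open() can be checked with a short sketch. The helper name and the requested mode 0666 are assumptions for illustration, and the directory is assumed to have no default ACL that would alter the result.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates `path` with requested mode 0666 while the
 * process umask is `mask`, and returns the permission bits the file
 * actually got (or -1 on error). The file is removed again afterwards. */
int mode_under_umask(const char *path, mode_t mask)
{
    mode_t old = umask(mask);            /* umask() returns the previous mask */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);
    umask(old);                          /* restore the caller's mask */
    if (fd < 0)
        return -1;

    struct stat st;
    int result = fstat(fd, &st) == 0 ? (int)(st.st_mode & 0777) : -1;
    close(fd);
    unlink(path);
    return result;
}
```

The resulting permissions are the requested mode with the umask bits cleared, that is `0666 & ~mask`.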
File Sharing
• In one process
  • int dup( int oldfd );
  • int dup2( int oldfd, int newfd );
  • Textbook, p163, system()
  • I/O redirection, some servers like WWW
• Among processes
  • O_CREAT & O_EXCL
  • Atomic operations:
    • lseek followed by write => use pread & pwrite
    • the offset of an open file is kept in the kernel
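Descriptors produced by dup() point at the same file table entry, so they share one file offset. A minimal sketch with a hypothetical helper name and temp-file template:

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes 4 bytes through one descriptor and reports
 * the current offset as seen through its dup()ed twin (or -1 on error). */
off_t shared_offset_after_dup(void)
{
    char path[] = "/tmp/dupXXXXXX";    /* illustrative mkstemp() template */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    off_t pos = -1;
    int fd2 = dup(fd);                 /* same file table entry as fd */
    if (fd2 >= 0) {
        if (write(fd, "abcd", 4) == 4)
            pos = lseek(fd2, 0, SEEK_CUR);   /* query offset via the copy */
        close(fd2);
    }
    close(fd);
    return pos;
}
```

Two independent open() calls on the same path would instead get separate file table entries and separate offsets.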
Directory
• 1. directory format ( textbook, p153 )

    struct dirent {
    #ifndef __USE_FILE_OFFSET64
        __ino_t d_ino;
        __off_t d_off;
    #else
        __ino64_t d_ino;
        __off64_t d_off;
    #endif
        unsigned short int d_reclen;
        unsigned char d_type;
        char d_name[256];    /* We must not include limits.h! */
    };

• 2. accessing a directory
  • DIR *opendir( const char *name );
  • struct dirent *readdir( DIR *dir );
  • void rewinddir( DIR *dir );
  • int closedir( DIR *dir );
  • exec(): closes all directory streams
• 3. creating, deleting and renaming a directory
  • mkdir, rmdir, rename
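The directory stream functions are typically used in an opendir/readdir/closedir loop. A short sketch, with `count_entries` as a hypothetical helper name:

```c
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: counts the entries of `path`, skipping "." and "..".
 * Returns the count, or -1 if the directory cannot be opened. */
int count_entries(const char *path)
{
    DIR *d = opendir(path);
    if (d == NULL)
        return -1;

    int n = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {       /* NULL marks end of directory */
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        n++;
    }
    closedir(d);
    return n;
}
```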
Symbolic Link
• readlink, symlink
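One detail of readlink() is worth a sketch: it does not NUL-terminate the buffer it fills. The wrapper below is an illustration only, with a hypothetical name.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads the target of the symbolic link `linkpath`
 * into `buf` as a C string. Returns the target length, or -1 on error. */
ssize_t link_target(const char *linkpath, char *buf, size_t size)
{
    if (size == 0)
        return -1;
    /* Leave room for the terminator that readlink() does not write. */
    ssize_t n = readlink(linkpath, buf, size - 1);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

A link made with `symlink("/etc/hosts", path)` would yield a length of 10 from this helper, whether or not the target exists, because a symbolic link only stores the path text.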
mmap() – (1)
• void *mmap( void *start, size_t length, int prot, int flags, int fd, off_t offset );
• 1. Maps a file or a POSIX shared memory object into the calling process’s address space. The underlying object may be:
  • a regular file
  • a special device (/dev/zero): anonymous mapping
  • shm_open(): unrelated processes ( POSIX )
• 2. Function list: mmap, munmap, msync, mprotect
• 3. Linux: shared libraries, executable binary files, read/write operations.
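A file-backed MAP_SHARED mapping might be used as in the sketch below. The helper name and temp-file template are assumptions. The file is sized with ftruncate() before mapping, because touching mapped pages beyond the end of the underlying object raises SIGBUS.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: stores `msg` (with its terminator) in a temp file
 * through a shared mapping, then reads it back with pread().
 * Returns 0 on success, -1 on error. */
int mmap_store(const char *msg, char *out, size_t outlen)
{
    size_t len = strlen(msg) + 1;
    if (len > outlen)
        return -1;

    char path[] = "/tmp/mmapXXXXXX";   /* illustrative mkstemp() template */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    int rc = -1;
    if (ftruncate(fd, (off_t)len) == 0) {        /* give the file a size */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            memcpy(p, msg, len);                 /* plain memory access */
            if (msync(p, len, MS_SYNC) == 0 &&   /* push changes to file */
                pread(fd, out, len, 0) == (ssize_t)len)
                rc = 0;
            munmap(p, len);
        }
    }
    close(fd);
    return rc;
}
```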
• 1. vm_area; the vm_area driver = filemap
• 2. The vm_area is linked to the inode, and so are its pages.
• 3. When a page fault occurs while accessing the memory, the fault handler calls the driver, which should invoke the inode’s mmap() to load the page from the file.
mmap() – (2)
• Notes for mmap():
  • 1. flag == MAP_PRIVATE: writes do not change the underlying object; on the first write to a page, the kernel makes a private copy of it. This implies that changes made to the underlying object are still visible through the mapping until that first write.
  • 2. flag == MAP_SHARED: the underlying object will be changed.
  • 3. The mapped length & the size of the underlying object
    • PAGESIZE: sysconf( _SC_PAGESIZE );
    • SIGBUS: access beyond the length of the underlying object
    • SIGSEGV: access outside the mapped segment
    • Use truncate & ftruncate first to resize the underlying object
  • 4. fork(): the child process inherits the mapped memory; if flag == MAP_SHARED, parent and child share the same pages.
mmap() – (3)
  • 5. How to use mmap()
    • Access the memory directly, without using read() & write()
      • !!!! But the system calls read() & write() are atomic operations
      • Not every file descriptor can be mapped into memory, for example a terminal or a socket
    • Implement shared memory among unrelated processes
      • The underlying object provides the initial values for the mapped memory.
      • Any change will be written back to the underlying object.
  • 6. How to use anonymous mapping?
    • On SYSV, we can map /dev/zero: ZFOD (zero-fill-on-demand)
    • After the program is loaded, BSS, heap and stack use anonymous mappings.
    • In applications, a parent and its child processes can use an anonymous mapping to share memory without creating or opening a real file.
  • 7. truncate & ftruncate to resize the underlying object