COIT29222-Structured Programming Lecture Week 12 • Reading: Textbook (4th Ed.), Chapter 14 Textbook (6th Ed.), Chapter 17 Study Guide Book 3, Module 4 • This week, we will cover the following topics: • Files • Physical and Logical Files • Text and Binary Files • File Processing • Sequential Access File (SAF) • Random Access File (RAF) • Reading/Writing Data from/to SAF • Reading/Writing Data from/to RAF
Physical files • You are familiar with files, e.g.: • word processor documents • text-editor files • C++ source, object & executable files • etc. • Files are stored on: • hard disk, floppy disk, CD-ROM, memory stick, etc. • their storage is persistent • i.e. the computer can be turned off and the files are accessible when the computer is turned on again
Primary & secondary storage • Files are stored in secondary storage • a collective name for all storage not consisting of the computer’s main memory • A computer’s main memory or primary storage is volatile (not persistent) • storage for variables in a running program is allocated in primary storage • data stored in variables is temporary: it can be accessed during program execution, but is lost when the program terminates
Files – a means of communication • Files provide a means of communication between a running program and the ‘outside world’, i.e. the environment in which the program runs • data can be read from a file into a program and • a program can communicate with the ‘outside world’ by writing data to a file • Example: • a program gets and validates employee time-sheet entries input by a user and stores the data in a file • this file is subsequently read by another program to generate fortnightly pay cheques
Secondary storage devices • Sequential-access devices • tape devices • to get to point q on the tape, the drive needs to pass through points a to p • analogous to audio tapes • Direct-/random-access devices • magnetic disk, floppy disk and CD-ROM • allows direct access to a particular file or a particular position within a file • analogous to audio CDs
Data encoding in files • All data stored in physical files is encoded into ones and zeros • There are 2 basic types of encoding: • Text files • separate encoding for each character in the underlying character set (typically ASCII) • human readable, i.e. can be read by a text editor • Binary files • data must be interpreted by a program (or processor) that understands the formatting of the file • not human readable • executable programs and certain data files are encoded in binary format
Efficiency: Text files vs. binary files • Storage efficiency • Binary files are generally more storage-efficient than text files • Example: • In binary files, integers are stored in the same fixed number of bytes as in main memory • Example: 123 fits in 1 byte (in fact, 7 bits) when held in a one-byte integer type; an int typically occupies 4 bytes whatever its value • In text files, the length of a formatted integer determines the storage required • Example: “123” requires 3 bytes, while “1234567890” requires 10 bytes
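To make the comparison concrete, the following sketch computes both storage costs. The helper names textBytes and binaryBytes are hypothetical (they do not come from the lecture), and the size of int is implementation-defined, although it is 4 bytes on most current platforms.

```cpp
#include <cstddef>
#include <string>

// Bytes needed to store `value` as formatted text: one byte per character,
// including a leading minus sign for negative values.
std::size_t textBytes(int value) {
    return std::to_string(value).size();
}

// Bytes needed to store any int in binary form: fixed by the type,
// not by the value being stored.
std::size_t binaryBytes() {
    return sizeof(int);
}
```

Under these assumptions, textBytes(123) is 3 and textBytes(1234567890) is 10, while binaryBytes() is the same for every value. Binary storage wins for large values, whereas a small value such as 5 is actually shorter as text.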
Logical files • In order for a program to read from, or write to, a physical file, we must be able to represent the file (at an abstract level) in the program • a logical file is an abstraction that can be viewed as a ‘channel’ that connects the program to a physical file • All references to a physical file within the program are made via its logical representation • The logical file has a logical name (a variable) which is used to refer to the file inside the program
Logical file representation of a physical file • The operating system is responsible for associating a logical file in a program with a physical file on an external storage medium • I/O devices (e.g. keyboard, console, printer) are also represented by logical files in a computer program
The logical file is a data structure • A logical file is a data structure which consists of a sequence of components of the same type • Similar to the array construct • Significant differences – a logical file: • is (theoretically) of unlimited size • has a concept of current position which is an implicit reference to some element in the sequence
C++ file streams • The file stream is the C++ logical file structure • C++ programs communicate with I/O devices (keyboard, printer, console) and physical files in secondary storage via stream objects – familiar examples: • cin, the pre-defined input stream object which, by default, is connected to the keyboard • cout, the pre-defined output stream object which, by default, is connected to the console
What are file stream objects? • File stream objects are objects of a pre-defined C++ class • you can think of a class as a type and an object as a variable declared to be of that type • cin is an object (variable) of the istream class and cout is an object of the ostream class • these objects are pre-defined in the iostream library • hence the requirement for #include <iostream> in programs which use these objects
C++ logical file – A sequence of bytes • A C++ file stream is a sequence of bytes • i.e. the components of C++ logical files are bytes • there is no inherent “record” structure in the C++ view of a file • any such structure must be imposed by the C++ program reading or writing the file • individual bytes could be read into, and written from, the fields of a struct-type object
Defining your own file stream objects • To access files in secondary storage from a C++ program you need to define your own file stream objects. • These objects must be defined to be one of the following class types: • ifstream – for input (read) operations only • ofstream – for output (write) operations only • fstream – for input & output (read/write) operations • These classes are all declared in the fstream header file (#include <fstream>)
Defining your own file stream objects - Example • Example: 3 file stream objects defined: • OutFile - an output file stream object • InFile - an input file stream object • InOutFile - an object that can be used for both input and output
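The code for this slide did not survive in the transcript. Definitions along the following lines would produce the three objects described; this is a reconstruction from the bullet points, not the original listing.

```cpp
#include <fstream>
using namespace std;

ofstream OutFile;    // output (write) operations only
ifstream InFile;     // input (read) operations only
fstream  InOutFile;  // input and output (read/write) operations
```

At this point none of the three objects is connected to a physical file; that association is made later with open().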
Current position • During program execution: • When a logical file is associated with a physical file the notion of current position becomes well-defined • i.e. it refers to a particular element in the linear sequence of components • Each read or write operation advances this reference one position • i.e. successive operations access successive elements automatically. • this reference to the current position in the file is called the file window (or file pointer).
File window (file pointer) • The file window is automatically created when a logical file is associated with a physical file. • An individual component of a file can be “seen” (is accessible) in the program, only when the file window is positioned over it:
File access types • The logical file access types are: • Sequential access • components can be accessed in the sequence in which the data is stored in the file • automatic advance of the file window after a read/write operation is the only way of changing the current position • Random (direct) access • components can be accessed in any order (including sequentially) • the file window is implicitly advanced after read/write operations and can be explicitly positioned with a seek operation
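As a sketch of explicit positioning in C++, the function below moves the file window with seekg(), the seek operation for input streams, and then reads one byte. The function name byteAt is hypothetical, and opening the file in binary mode is an assumption made here so that byte offsets are not affected by any text-mode translation.

```cpp
#include <fstream>
using namespace std;

// Returns the byte at offset `position` (counting from 0),
// or '\0' if the file cannot be opened or the position cannot be read.
char byteAt(const char* fileName, long position) {
    ifstream InFile;
    InFile.open(fileName, ios::in | ios::binary);

    char ch = '\0';
    InFile.seekg(position);  // explicitly position the file window
    InFile.get(ch);          // the read then advances the window by one byte
    InFile.close();
    return ch;
}
```

Calling byteAt in any order of positions gives random access; reading repeatedly without seekg() would give ordinary sequential access to the same stream.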
Associating logical & physical files • At the point of association between a logical file name and a physical file, the following are usually specified or take certain default values: • the file window is set to some specified position in the file • the type of data encoding (text or binary) • the access type (sequential or random)
Attaching file stream objects to external files/devices • In C++, before a file stream object (logical file) can be used, it must be associated with an external file or device (physical file). • achieved with a call to the open() function – Syntax: <file stream object>.open(<physical file name>, <file access mode>) • <physical file name>: • a C-style character array (since C++11, a std::string is also accepted) • must specify a file name which adheres to the file-naming requirements of the operating system
Attaching file stream objects to external files/devices • <file access mode>: • allows specification of: • file window position: defaults to beginning of file (can be set to end of file) • data encoding: defaults to text (can be set to binary) • Note: no file access type is specified since files in C++ are not distinguished as direct-access or sequential-access files • optional for file stream objects of type ifstream (defaults to ios::in) and ofstream (defaults to ios::out) • in standard C++ it is also optional for objects of type fstream (defaults to ios::in | ios::out), but specifying it explicitly makes the intended access clear
Attaching file stream objects to external files/devices - Example • associates clients.dat with OutFile: writing to OutFile generates output to clients.dat • associates trans.dat with InFile: reading from InFile will take input from trans.dat
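The code for this slide is missing from the transcript. Calls of the following form would create the two associations described; wrapping them in a function named attachStreams is an assumption made here purely to keep the sketch self-contained.

```cpp
#include <fstream>
using namespace std;

// Attaches the two file stream objects to their physical files.
// The access mode is omitted, so each stream uses its default:
// output for the ofstream object and input for the ifstream object.
void attachStreams(ofstream& OutFile, ifstream& InFile) {
    OutFile.open("clients.dat");  // writing to OutFile now goes to clients.dat
    InFile.open("trans.dat");     // reading from InFile now takes input from trans.dat
}
```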
C++ file access modes • The file mode designators are ORed together, using the bitwise OR operator, |, to achieve the required file-access type. • Thus, to open the disk file, emp.dat, for both input and output:
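The example for this slide is not in the transcript; a call of the following shape would open emp.dat for both input and output. The wrapper function openEmployeeFile is hypothetical and is used only so that the result of the call can be reported.

```cpp
#include <fstream>
using namespace std;

// Opens emp.dat for reading and writing by ORing two mode designators.
// With ios::in | ios::out the file must already exist; otherwise the open fails.
bool openEmployeeFile(fstream& InOutFile) {
    InOutFile.open("emp.dat", ios::in | ios::out);
    return InOutFile.is_open();
}
```

Further designators can be ORed in the same way, for example ios::in | ios::out | ios::binary for a binary file.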
C++ file access modes • Position of the file window • by default, the beginning of the file • mode designators, ios::ate and ios::app, can be specified to alter this default status • Data encoding • by default, C++ files are text files • The underlying character set on an IBM compatible PC is the ASCII character set, so C++ text files are encoded in ASCII format. • the mode designator, ios::binary, must be used to specify binary encoding in a file
C++ file access modes - Defaults for output file streams • When a file stream object is opened for output the default file status depends on whether the file associated with the object exists or not. • Examples: • if a disk file exists and is opened for output, the contents of the file will be lost • if the same file is opened for input and output, the contents of the file remain unchanged
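The effect of these defaults can be observed with a small experiment. In the sketch below, the function names demoOutputDefaults and firstLine and the strings written are all hypothetical; the point is only that a plain open() for output discards existing contents, whereas ios::app keeps them.

```cpp
#include <fstream>
#include <string>
using namespace std;

// Writes to the same file three times using different modes.
void demoOutputDefaults(const char* fileName) {
    ofstream OutFile;

    OutFile.open(fileName);            // default output mode
    OutFile << "first";
    OutFile.close();

    OutFile.open(fileName);            // file exists: its contents ("first") are lost
    OutFile << "second";
    OutFile.close();

    OutFile.open(fileName, ios::app);  // append: existing contents are kept
    OutFile << "+more";
    OutFile.close();
}

// Returns the first line of the file, or an empty string if nothing can be read.
string firstLine(const char* fileName) {
    ifstream InFile;
    InFile.open(fileName);
    string Line;
    getline(InFile, Line);
    return Line;
}
```

After demoOutputDefaults runs, the file holds second+more: the second open() removed first, and the append-mode open() preserved second.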
Testing the success of the open() operation • An attempt to associate a file stream object with a physical file/device might fail for various reasons – Examples: • attempting to open a non-existent file for reading • attempting to open a file for writing when no disk space is available (i.e., the disk is full) • The success of an open() operation can be determined with the use of the fail() function - invoked on a file stream object. • returns false if the last operation on the file stream was successful, and true otherwise
Fatal error: terminate program execution • The inability to open a file stream object is usually a fatal error – one from which the program cannot recover. • When a fatal error occurs, program execution is generally terminated. • In C++, this can be achieved with the use of the exit() function (#include <cstdlib>) • The argument to exit() is returned to the environment in which the program was executed. • An argument of 0 indicates that the program terminated normally; • an argument other than 0 indicates that the program terminated due to an error.
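Combining the fail() test with exit() gives the usual open-or-terminate pattern. The sketch below assumes a helper named openOrExit and an error message of our own wording; neither appears in the lecture.

```cpp
#include <cstdlib>
#include <fstream>
#include <iostream>
using namespace std;

// Attaches InFile to the named file, or terminates the program if that fails.
void openOrExit(ifstream& InFile, const char* fileName) {
    InFile.open(fileName);
    if (InFile.fail()) {               // true if the open() operation did not succeed
        cerr << "Error: cannot open " << fileName << endl;
        exit(1);                       // non-zero: terminated due to an error
    }
}
```

A caller would write, for example, openOrExit(InFile, "trans.dat"); and can then assume the stream is usable on the following line.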
Closing the association between a logical & physical file • This operation is like “hanging up” the connection between a physical file and a program. • In C++ the association between an external file/device and a file stream is terminated with a call to the close() function – Syntax: <file stream object>.close() • Example: InOutFile.close(); • File stream objects should be closed once processing on the file is complete.
Writing sequentially to a C++ file stream • An output file stream object that has been defined and associated with an external file via a call to the open() function can be used in the same way as the pre-defined output stream object, cout, i.e. data is written to it with the stream insertion operator, <<.
To generate output to a sequential-access file in C++ • 6 basic steps: • include the file stream library: #include <fstream> • define an output file stream object of the ofstream class • associate the output file stream object with an external file using the open() function on the file stream object • test to ensure that the open() operation succeeded • transfer data to the external file by using the stream insertion operator on the file stream object • close the file stream object when data transfer is complete
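The six steps can be sketched as a single function. The function name writeSampleClients and the two records written are invented for illustration, and the function returns false rather than calling exit() so that the caller decides how to handle a failed open.

```cpp
#include <fstream>                       // step 1: include the file stream library
using namespace std;

// Writes two sample records to the named file.
// Returns false if the file could not be opened.
bool writeSampleClients(const char* fileName) {
    ofstream OutFile;                    // step 2: define an output file stream object
    OutFile.open(fileName);              // step 3: associate it with an external file
    if (OutFile.fail()) {                // step 4: test that open() succeeded
        return false;
    }

    // step 5: transfer data with the stream insertion operator, as with cout
    OutFile << 100 << ' ' << "Jones" << ' ' << 24.98 << '\n';
    OutFile << 200 << ' ' << "Smith" << ' ' << 345.67 << '\n';

    OutFile.close();                     // step 6: close the file stream object
    return true;
}
```

The spaces and newlines written between items matter: they are the “white space” that lets a later program separate the values when reading them back.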
Reading sequentially from a C++ file stream • To read a sequential-access file we define an input file stream object and associate it with an external file via a call to the open() function. • Data can be read from this file stream object using the stream extraction operator, >>, in the same way as data can be read from the pre-defined input stream object, cin. • Recall from your experience with cin, that when reading from an input stream, “white space” serves to separate data items.
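As an illustration of reading with the extraction operator, the sketch below adds up every integer in a file. The function name sumIntegers is hypothetical, and the file is assumed to contain only integers separated by any mixture of spaces, tabs and newlines.

```cpp
#include <fstream>
using namespace std;

// Returns the sum of the white-space-separated integers in the named file.
// Returns 0 if the file cannot be opened or contains no integers.
int sumIntegers(const char* fileName) {
    ifstream InFile;
    InFile.open(fileName);

    int Value;
    int Total = 0;
    while (InFile >> Value) {   // >> skips leading white space; the loop ends when a read fails
        Total += Value;
    }

    InFile.close();
    return Total;
}
```

Because >> skips white space, the layout of the numbers in the file does not affect the result.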
Copying files • To copy the entire contents of a text file (including “white space” characters) to the screen/another file, we can use the get() function on an input file stream object. • Example: InFile.get(CharRead); • stores the next character in the input file stream, InFile, in the character variable, CharRead. • However, to read all the characters in a file we would need to know the number of characters in the file (in general, an unlikely scenario), or we need to know when we have reached the end of the file.
End-of-file status • A logical file is theoretically of unlimited size since it must be able to represent physical files of arbitrary size. • However, given a logical file that has been associated with an existing physical file, there are a fixed number of components that can be read from the file. • a logical file must provide a means of determining when all the components in the file have been read • we can view a logical file as having an end-of-file component following the last component in the file
End-of-file status Conceptual view of the logical file after a read operation:
Detecting end-of-file in C++ • There is an end-of-file function, eof(), which can be called on objects of the file stream classes. • This function returns true if an attempt has been made to read beyond the last component in the file. • With reference to the previous slide, this corresponds to the file window being positioned over the EOF component when a read operation is performed.
C++ I/O state bits • However, the eof() function tests only for the end-of-file condition. • In the example of the previous slide there is no test to ensure that the get() operation succeeded. • to do this requires a review of the C++ I/O state bits: • eofbit: set if an attempt has been made to read the EOF marker • failbit: set if an operation failed on a stream, for example, on bad format of input data (note that this includes an attempt to read the EOF marker) • badbit: set when the stream becomes unstable due to some unrecoverable I/O system or hardware error; this usually involves a loss of data
Checking the status of C++ file streams • The status of a file stream can be tested with the following functions: • eof(): true if the eofbit is set • bad(): true if the badbit is set • fail(): true if either the badbit is set or the failbit is set • good(): true if none of the state bits (eofbit, failbit, badbit) are set
Copying files - Example • The success of the get() operation can be ensured with the use of the fail() function – i.e. loop until: • an operation fails on the stream or • the end-of-file status has been set
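The listing for this slide is missing from the transcript. The sketch below follows the approach the bullets describe: read with get() and loop until fail() reports that a read did not succeed, which includes reaching end-of-file. The function name copyFile and its bool result are assumptions; CharRead and InFile follow the names used on the earlier slide.

```cpp
#include <fstream>
using namespace std;

// Copies every character, including white space, from one file to another.
// Returns true only if the whole input file was read and written successfully.
bool copyFile(const char* inName, const char* outName) {
    ifstream InFile;
    ofstream OutFile;
    InFile.open(inName);
    OutFile.open(outName);
    if (InFile.fail() || OutFile.fail()) {
        return false;                    // one of the open() operations failed
    }

    char CharRead;
    InFile.get(CharRead);                // initial read before the loop test
    while (!InFile.fail()) {             // fail() is also true once a read hits end-of-file
        OutFile.put(CharRead);
        InFile.get(CharRead);
    }

    // The loop should have stopped at end-of-file, not because of an I/O error.
    bool ReachedEnd = InFile.eof() && !InFile.bad() && !OutFile.fail();
    InFile.close();
    OutFile.close();
    return ReachedEnd;
}
```

Testing the stream after each get(), rather than testing eof() before it, ensures that a character is written only when the read that produced it actually succeeded.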
Reading from a sequential-access file in C++ • 6 basic steps: • include the file stream library: #include <fstream> • define an input file stream object of the ifstream class • associate the input file stream object with an external file using the open() function • test to ensure that the open() operation succeeded • transfer data from the external file to the program • use the stream extraction operator if “white space” characters are to be ignored • use the get()/getline() function if “white space” characters are to be read • close the file stream when data transfer is complete
Numeric data in sequential-access files– Example: • A disk file, in.dat, contains a set of integer value pairs. • A program is to read this file and generate a disk file, out.dat, consisting of the product of each pair of numbers. • The format of these disk files is shown below:
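The file layouts for this example are not preserved in the transcript, so the sketch below assumes one: in.dat holds integers separated by white space, taken two at a time, and out.dat receives one product per line. The function name writeProducts and its return value are likewise assumptions.

```cpp
#include <fstream>
using namespace std;

// Reads pairs of integers from inName and writes the product of each pair,
// one per line, to outName.
// Returns the number of pairs processed, or -1 if either file could not be opened.
int writeProducts(const char* inName, const char* outName) {
    ifstream InFile;
    ofstream OutFile;
    InFile.open(inName);
    OutFile.open(outName);
    if (InFile.fail() || OutFile.fail()) {
        return -1;
    }

    int First, Second;
    int Pairs = 0;
    while (InFile >> First >> Second) {   // stops when a complete pair cannot be read
        OutFile << First * Second << '\n';
        ++Pairs;
    }

    InFile.close();
    OutFile.close();
    return Pairs;
}
```

A call such as writeProducts("in.dat", "out.dat") on a file containing 2 3, 4 5 and 10 -2 would write 6, 20 and -20. A trailing unpaired value is ignored under this assumed format.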