380 likes | 532 Views
CS1010: Programming Methodology http://www.comp.nus.edu.sg/~cs1010/. Week 13: File Processing. Objectives: Understand concepts of file I/O Learn how to use stdlib functions for formatted, character and line I/O References: Chapter 3, Chapter 7 Lessons 3.6, 3.8, 7.4. Week13 - 2.
E N D
CS1010: Programming Methodologyhttp://www.comp.nus.edu.sg/~cs1010/
Week 13: File Processing Objectives: Understand concepts of file I/O Learn how to use stdlib functions for formatted, character and line I/O References: Chapter 3, Chapter 7 Lessons 3.6, 3.8, 7.4 CS1010 (AY2011/2 Semester 1) Week13 - 2
Week 13: Outline 0. Week 12 Exercise #3: Health Screen Introduction Opening a File and File Modes Closing a File I/O Functions Formatted I/O with Demo #1 Detecting End of File & Errors with Demo #2 Character I/O with Demo #3 Exercise #1: Trimming Blanks Line I/O with Demo #4 Video on Data Display Debugger (DDD) CS1010 (AY2011/2 Semester 1) Week13 - 3
Wk12. Ex #3 (take-home): Health Screen (1/2) • Write a program Week12_Health_Screen.c to read in a list of health screen readings • Each input line represents a reading consisting of 2 numbers: a float value indicating the health score, and an int value indicating the number of people with that score • You may assume that there are at most 100 readings • The input should end with the reading 0 0, or when 100 readings have been read. • As the readings are gathered from various clinics, there might be duplicate scores in the input. You are to determine how many unique scores there are. • A skeleton program Week12_Health_Screen.c is given. • We’ll discuss this at our next lecture. CS1010 (AY2011/2 Semester 1)
Wk12. Ex #3 (take-home): Health Screen (2/2) • A sample run is shown below Enter score and frequency (end with 0 0): 5.2135 3 3.123 4 2.9 3 0.87 2 2.9 2 8.123 6 3.123 2 7.6 3 2.9 4 0.111 5 0 0 Number of unique readings = 7 • Possible extension: Which is the score that has the highest combined frequency? (Do this on your own.) CS1010 (AY2011/2 Semester 1)
1. Introduction: Input/Output • In C, input/output is done based on the concept of a stream • A stream can be a file or a consumer/producer of data • Examples: screen, keyboard, hard disk, printer, network port • A stream is accessed using file pointer variable of type FILE * • The I/O functions/macros are defined in header file stdio.h • Two types of streams: text and binary CS1010 (AY2011/2 Semester 1) Week13 - 6
1. Introduction: Text vs Binary Streams • Text stream • Consists of a sequence of characters organized into lines • Each line contains 0 or more characters followed by a newline character ‘\n’ • Text streams stored in files can be viewed/edited easily • Example: source code of C program • Binary stream • Beyond the scope of CS1010 CS1010 (AY2011/2 Semester 1) Week13 - 7
1. Introduction: Standard Streams • Three streams are predefined for each C program: stdin, stdout, stderr • stdin points to a default input stream (keyboard) • stdout points to a default output stream (screen) • stderr points to a default output stream for error messages (screen) • printf writes output to stdout • scanf reads input from stdin • The 3 standard streams do not need to be declared, opened, and closed CS1010 (AY2011/2 Semester 1) Week13 - 8
1. Introduction: Constants • There are two useful constants used in file processing • NULL: null pointer constant • EOF: used to represent end of file or error condition CS1010 (AY2011/2 Semester 1) Week13 - 9
2. Opening a File and File Modes (1/2) • Prototype: • FILE *fopen(const char *filename, const char *mode) • Returns NULL if error; otherwise, returns a pointer of FILE type • Possible errors: non-existent file, or don’t have permission to open the file • File mode depends on intended file operations CS1010 (AY2011/2 Semester 1) Week13 - 10
2. Opening a File and File Modes (2/2) • File modes for text files: • We will focus only on “r” and “w” • Note that • In “r” mode, file must already exist • In “w” mode, new data overwrite existing data (if any) CS1010 (AY2011/2 Semester 1) Week13 - 11
3. Closing a File • Prototype: • int fclose(FILE *fp) • Allows a file that is no longer used to be closed • Returns EOF if error is detected; otherwise, returns 0 • It is good practice to close a file after use. CS1010 (AY2011/2 Semester 1) Week13 - 12
4. I/O Functions • Formatted I/O: fprintf, fscanf • Uses format strings to control conversion between character and numeric data • Character I/O: fputc, putc , putchar , fgetc , getc , getchar , ungetc • Reads and writes single characters • Line I/O: fputs, puts , fgets , gets • Reads and writes lines • Used mostly for text streams • Block I/O: fread, fwrite • Used mostly for binary streams CS1010 (AY2011/2 Semester 1) Week13 - 13
5. Formatted I/O (1/4) • Uses format strings to control conversion between character and numeric data • fprintf: converts numeric data to character form and writes to an output stream • fscanf: reads and converts character data from an input stream to numeric form • Both fprintf and fscanf functions can have variable numbers of arguments • Example: float weight, height; FILE *fp1, *fp2; . . . fscanf(fp1, "%f %f", &weight, &height); fprintf(fp2, "Wt: %f, Ht: %f\n", weight, height); CS1010 (AY2011/2 Semester 1) Week13 - 14
5. Formatted I/O (2/4) • fprintf returns a negative value if an error occurs; otherwise, returns the number of characters written • fscanf returns EOF if an input failure occurs before any data items can be read; otherwise, returns the number of data items that were read and stored printf(" … "); = fprintf(stdout, " … "); scanf(" … "); = fscanf(stdin, " … "); CS1010 (AY2011/2 Semester 1) Week13 - 15
5. Formatted I/O (3/4): Demo #1 Week13_Demo1.c #include <stdio.h> int main(void) { FILE *infile, *outfile; char x; int y; float z; infile = fopen("demo1.in", "r"); outfile = fopen("demo1.out", "w"); fscanf(infile, "%c %d %f", &x, &y, &z); fprintf(outfile, "Data read: %c %d %.2f\n", x, y, z); fclose(infile); fclose(outfile); return 0; } File “demo1.in”: 10 20 30 What’s the output in “demo1.out”? CS1010 (AY2011/2 Semester 1) Week13 - 16
5. Formatted I/O (4/4): Demo #1 ver2 Week13_Demo1_v2.c #include <stdio.h> #include <stdlib.h> int main(void) { . . . if ((infile = fopen("demo1.in", "r")) == NULL) { printf("Cannot open file demo1.in\n"); exit(1); } if ((outfile = fopen("demo1.out", "w")) == NULL) { printf("Cannot open file demo1.out\n"); exit(2); } . . . } To use exit() Check if file can be opened. CS1010 (AY2011/2 Semester 1) Week13 - 17
6. Detecting End of File & Errors (1/2) • Each stream is associated with two indicators: error indicator & end-of-file (EOF) indicator • Both indicators are cleared when the stream is opened • Encountering end-of-file sets end-of-file indicator • Encountering read/write error sets error indicator • An indicator once set remains set until it is explicitly cleared by calling clearerr or some other library function • feof returns a non-zero value if the end-of-file indicator is set; otherwise returns 0 • ferrorreturns a non-zero value if the error indicator is set; otherwise returns 0 • Need to include <stdio.h> CS1010 (AY2011/2 Semester 1) Week13 - 18
6. Demo #2: Using feof( ) (2/2) Week13_Demo2.c #include <stdio.h> #include <stdlib.h> int main(void) { . . . while (!feof(infile)) { fscanf(infile, "%d", &num); printf("Value read: %d\n", num); } . . . } • Caution on using feof() Input file “demo2.in” 10 20 30 Output: Value read: 10 Value read: 20 Value read: 30 Value read: 30 Why does the last line appear twice? To be discussed in Week 13 discussion. CS1010 (AY2011/2 Semester 1) Week13 - 19
7. Character I/O: Output Functions (1/4) • Output functions: fputc, putchar int ch = 'A'; FILE *fp; putchar(ch); // writes ch to stdout fp = fopen( ... ); fputc(ch, fp); // writes ch to fp • fputc and putchar return EOF if a write error occurs; otherwise, return character written CS1010 (AY2011/2 Semester 1) Week13 - 20
7. Character I/O: Input Functions (2/4) • Input functions: fgetc, getchar, ungetc int ch; FILE *fp; ch = getchar(); // reads a char from stdin fp = fopen( ... ); ch = fgetc(fp); // reads a char from fp • fgetcand getchar return EOF if a read error occurs or end of file is reached; otherwise, return character read • Need to call either feof or ferror to distinguish the 2 cases CS1010 (AY2011/2 Semester 1) Week13 - 21
7. Character I/O: Demo #3 (3/4) Week13_CopyFile.c int copyFile(char *sourcefile, char *destfile) { FILE *sfp, *dfp; if ((sfp = fopen(sourcefile, "r")) == NULL) exit(1); // error - can't open source file if ((dfp = fopen(destfile, "w")) == NULL) { fclose(sfp); // close source file exit(2); // error - can't open destination file } while ((ch = fgetc(sfp)) != EOF) { if (fputc(ch, dfp) == EOF) { fclose(sfp); fclose(dfp); exit(3); // error - can't write to file } } fclose(sfp); fclose(dfp); return 0; } CS1010 (AY2011/2 Semester 1) Week13 - 22
7. Character I/O: ungetc (4/4) • ungetc pushes back a character read from a stream • ungetc returns the character it pushes back • Example: Read a sequence of digits and stop at the first non-digit int ch; FILE *fp = fopen( ... ); while (isdigit(ch = getc(fp))) { // process digit read . . . } ungetc(ch, fp); // pushes back last char read CS1010 (AY2011/2 Semester 1) Week13 - 23
8. Exercise #1: Trimming Blanks • Write a program Week13_TrimBlanks.c that contains a functioninttrimBlanks(char *infile, char *outfile)that takes an input text file and produces a new text file that is a duplicate copy of the input file except that each sequence of consecutive blank characters is replaced by a single blank character. • The function returns -1 if there is an error; otherwise, it returns the number of blank characters trimmed. • An incomplete program Week13_TrimBlanks.c is given. A test input file trimblanks.in is also given. CS1010 (AY2011/2 Semester 1) Week13 - 24
9. Line I/O: Output Functions (1/6) • Output functions: fputs, puts FILE *fp; // writes to stdout with newline character appended puts("Hello world!"); fp = fopen( ... ); // writes to fp without newline character appended fputs("Hello world!", fp); • fputs and puts return EOF if a write error occurs; otherwise, return a non-negative number CS1010 (AY2011/2 Semester 1) Week13 - 25
9. Line I/O: Input Functions (2/6) • Input functions: fgets, gets char s[100]; FILE *fp; gets(s); // reads a line from stdin fp = fopen( ... ); fgets(s, 100, fp); // reads a line from fp • fgets and gets store a null character at the end of the string • fgets and gets return a null pointer if a read error occurs or end-of-file is encountered before storing any character; otherwise, return first argument • Avoid using gets due to security issue CS1010 (AY2011/2 Semester 1) Week13 - 26
9. Line I/O: fgets (3/6) • Prototype: char *fgets(char *s, int n, FILE *fp) • s is a pointer to the beginning of a character array • n is a count • fp is an input stream • Characters are read from the input stream fp into s until • a newline character is seen, • end-of-file is reached, or • n – 1 characters have been read without encountering newline character or end-of-file • If the input was terminated because of a newline character, the newline character will be stored in the array before the terminating null character (‘\0’) CS1010 (AY2011/2 Semester 1) Week13 - 27
9. Line I/O: fgets (cont.) (4/6) • If end-of-file is encountered before any characters have been read from the stream, • fgets returns a null pointer • The contents of the array s are unchanged • If a read error is encountered, • fgets returns a null pointer • The contents of the array s are indeterminate • Whenever NULL is returned, feofor ferror should be used to determine the status CS1010 (AY2011/2 Semester 1) Week13 - 28
9. Demo #5: Counting Lines (5/6) • Write a function that takes as input the name of a text file and returns the number of lines in the input file. • If an error occurs, the function should return a negative number. • Assume that the length of each line in the file is at most 80 characters. CS1010 (AY2011/2 Semester 1) Week13 - 29
9. Demo #5: Counting Lines (6/6) Week13_CountLines.c #define MAX_LINE_LENGTH 80 intcountLines(char *filename) { int count = 0; FILE *fp; char s[MAX_LINE_LENGTH+1]; if ((fp = fopen(filename, "r")) == NULL) return -1; // error while (fgets(s, MAX_LINE_LENGTH+1, fp) != NULL) count++; if (!feof(fp)) // read error encountered count = -1; fclose(fp); return count; } CS1010 (AY2011/2 Semester 1) Week13 - 30
10. Data Display Debugger (DDD) • See video • http://www.youtube.com/watch?v=VF7IBEAA8Ig CS1010 (AY2011/2 Semester 1) Week13 - 31
Summary for Today • In today’s lecture we learnt • I/O on text files • I/O functions: • File operations: fopen, fclose, feof, ferror • Formatted I/O: fprintf, fscanf • Character I/O: fputc, putchar, fgets, getchar, ungetc • Line I/O: fputs, puts, fgets, gets CS1010 (AY2011/2 Semester 1)
Post-CS1010 (1/2) • For CS/CM/CB students: CS1010 CS1020 CS2010 or CS1010 CS2020 (accelerated) • For IS and EC students: CS1010 CS1020 • CS1020 Data Structures and Algorithms I • Emphasis on algorithms and data structures • Using an object-oriented programming language Java • Textbook: Data Abstraction & Problem Solving with Java: Walls and Mirrors by F. Carrano and J. Prichard CS1010 (AY2011/2 Semester 1) Week13 - 33
Post-CS1010 (2/2) • CS1020 has sit-in labs (like mini-PEs) every other week • Strengthen your programming skills to prepare for CS1020 CS1010 (AY2011/2 Semester 1) Week13 - 34
CS1010 Final Examination • CS1010 Exam • 21 November 2011, Monday, 1 – 3pm • Venue: To be announced by Registrar’s Office • Format • MCQ section: 6 questions • 5 other questions (comprising tracing questions and programming questions) • Grading of CS1010 is not based on bell curve CS1010 (AY2011/2 Semester 1) Week13 - 35
How to Prepare for Exams? • Preparing for Exams: A Guide for NUS Students • http://www.cdtl.nus.edu.sg/examprep/ CS1010 (AY2011/2 Semester 1) Week13 - 36
CS1010 (AY2011/2 Semester 1) Week13 - 37
End of File CS1010 (AY2011/2 Semester 1)