280 likes | 289 Views
Computing Science 1P. Lecture 15: Friday 9 th February. Simon Gay Department of Computing Science University of Glasgow. 2006/07. Using Files (Chapter 11). The unit of long-term data storage is the file . Most computer programs work with data stored in files, to some extent.
E N D
Computing Science 1P Lecture 15: Friday 9th February Simon Gay Department of Computing Science University of Glasgow 2006/07
Using Files (Chapter 11) The unit of long-term data storage is the file. Most computer programs work with data stored in files, to some extent. Python provides functions for using files: the open function makes a file available for reading or writing file1 = open("data.txt","r") file2 = open("output.txt","w") Computing Science 1P Lecture 15 - Simon Gay
Using Files If we have a file that is open for reading, there are several ways of getting data from it. returns the entire contents, as a string file1.read() returns the next 10 characters file1.read(10) returns the next line, as a string file1.readline() returns all of the remaining lines, as a list of strings file1.readlines() When we have finished using the file: file1.close() Computing Science 1P Lecture 15 - Simon Gay
What is a file? A file is a sequence of bytes. We will mostly work with text files. A text file can be thought of as a sequence of characters. Some of these characters will be newlines, so that the file is separated into lines. Examples We're going to look at examples in which data is stored in a text file, with a specified format for each line. A convenient way of working is to read one line at a time, then work out how to process the line. Computing Science 1P Lecture 15 - Simon Gay
Case Study: Planning / Problem Solving This is a problem from last year's version of CS1P. It's about calculating students' grade point averages (it's a dirty job, but someone has to do it…) Data on students, their modules and grades is stored in a file: 0606559 VBN2 30 B FGH9 50 A SDF2 50 N DFG5 60 C 0500300 HHJ3 40 D FGH3 10 E JHG3 10 F 0605456 LKJ1 10 G CDS2 30 C 0601234 XDX2 40 A ABC1 40 B credits grade module code matric number Computing Science 1P Lecture 15 - Simon Gay
Grade Point Averages For each student, the following calculation must be done. Each grade is converted into grade points using the following table: all other results (H, N, CR, …) have 0 grade points. For each module, the number of credits is multiplied by the grade points corresponding to the grade. These are added up to give the total grade points for the student. The grade point average (GPA) is the total grade points divided by the total number of credits. Computing Science 1P Lecture 15 - Simon Gay
Grade Point Average: Example The student whose details are 0605456 LKJ1 10 G CDS2 30 C has 10 * 2 + 30 * 12 = 380 grade points and a GPA of 380 / 40 = 9.5 Computing Science 1P Lecture 15 - Simon Gay
The Problem Write a program to read in students' data from a file, calculate the GPA for each student, and then print the GPA of each student, separated into two categories: students with a GPA of at least 10.0, and students with a GPA of less than 10.0. Students with GPA >= 10.0 0606559 190 10.2 0601234 80 15.0 Students with GPA < 10.0 0500300 60 9.0 0605456 40 9.5 GPA total credits Computing Science 1P Lecture 15 - Simon Gay
Planning How do we begin to write the program? We need a plan. To develop a plan, we need to think about the important data structures for the problem, and the necessary algorithms (i.e. computational processes). In other words: what data do we need to store, and how do we need to process it? Algorithms + Data Structures = Programs Niklaus Wirth (1976) Computing Science 1P Lecture 15 - Simon Gay
Do we need to store anything at all? Do we need to store information about students and their GPAs, for example in a list or dictionary, or can we just do the processing and produce the output while we are reading the original data in? (Remember that in the lab exam bin-packing problem, there was no need to store information about which bin each object went into.) Computing Science 1P Lecture 15 - Simon Gay
Do we need to store information about students? • Yes • No • Don't know Computing Science 1P Lecture 15 - Simon Gay
Storing information about students Because the output has to be separated into two groups (high GPAs and then low GPAs), we need to store the data. We have information about each student, so we'll use a dictionary in which the keys are matric numbers, represented by strings. What do we store for each student? The total credits and the GPA. Put them in a dictionary. { "0605456": { "credits":40, "gpa":9.5 }, ... } Computing Science 1P Lecture 15 - Simon Gay
Top-level Plan • Read data from the file into the dictionary • Print data from the dictionary in the required format Is this enough? If someone gave you this plan and told you to complete the program, would it be straightforward? Or would you still have to do some problem-solving and make design decisions? Computing Science 1P Lecture 15 - Simon Gay
Refining step 1 of the plan • Read data from the file into the dictionary • For each line in the file: • 1.1 Calculate the total credits and GPA • 1.2 Store them in the dictionary, with the matric no. as key 1.2 seems straightforward now, but 1.1 is still unclear. Computing Science 1P Lecture 15 - Simon Gay
Refining step 1.1 of the plan 1.1 Calculate the total credits and GPA 1.1.1 Split the current line into a list of strings 1.1.2 For each module, add the credits to a running total, and add the grade points to a running total 1.1.3 Calculate the GPA This is almost enough, but there is one detail: converting the grade into the corresponding grade points. The word FUNCTION should be leaping into your mind! Computing Science 1P Lecture 15 - Simon Gay
Converting grades to grade points There are various ways to consider implementing this table: Quite a nice way is to use a dictionary: def gp(g): table = { 'A':16, 'B':14, 'C':12, 'D':10, 'E':8, 'F':6, 'G':2 } return table.get(g,0) Computing Science 1P Lecture 15 - Simon Gay
Refining step 2 of the top-level plan 2. Print data from the dictionary in the required format 2.1 For each student in the dictionary: If GPA >= 10.0: print details 2.2 For each student in the dictionary: If GPA < 10.0: print details Computing Science 1P Lecture 15 - Simon Gay
The complete plan • Read data from the file into the dictionary • For each line in the file: • 1.1 Calculate the total credits and GPA • 1.1.1 Split the current line into a list of strings • 1.1.2 For each module, add the credits to a • running total; likewise the grade points • 1.1.3 Calculate the GPA • 1.2 Store them in the dictionary, with the matric no. as key • Print data from the dictionary in the required format • 2.1 For each student in the dictionary: • If GPA >= 10.0: print details • 2.2 For each student in the dictionary: • If GPA < 10.0: print details Computing Science 1P Lecture 15 - Simon Gay
Developing the program f = open("grades.txt","r") record = process(f) # Step 1 f.close() output(record) # Step 2 Now we need to define the functions process and output. Computing Science 1P Lecture 15 - Simon Gay
The function process def process(f): record = {} # For each line in the file f, do something return record Computing Science 1P Lecture 15 - Simon Gay
The function process def process(f): record = {} line = f.readline() while line != '': # Process the information in line line = f.readline() return record Computing Science 1P Lecture 15 - Simon Gay
The function process def process(f): record = {} line = f.readline() while line != '': line = line[:-1] # Remove the final newline character # Process the information in line line = f.readline() return record Computing Science 1P Lecture 15 - Simon Gay
The function process def process(f): record = {} line = f.readline() while line != '': line = line[:-1] data = split(line,' ') # Step 1.1.1 matric = data[0] credits = 0 # Initialise running totals points = 0 # Step 1.1.2: for each module... line = f.readline() return record 0606559 VBN2 30 B FGH9 50 A SDF2 50 N DFG5 60 C data = [ "0606559", "VBN2", "30", "B", "FGH9", "50", … ] Computing Science 1P Lecture 15 - Simon Gay
The function process def process(f): record = {} line = f.readline() while line != '': line = line[:-1] data = split(line,' ') matric = data[0] credits = 0 points = 0 modules = (len(data)-1)/3 # for each for i in range(modules): # module ... credit = int(data[i*3+2]) # credits = credits + credit # add credit grade = data[i*3+3] # points = points + gp(grade)*credit # add points # Step 1.1.3: calculate GPA line = f.readline() return record Computing Science 1P Lecture 15 - Simon Gay
The function process def process(f): record = {} line = f.readline() while line != '': line = line[:-1] data = split(line,' ') matric = data[0] credits = 0 points = 0 modules = (len(data)-1)/3 for i in range(modules): credit = int(data[i*3+2]) credits = credits + credit grade = data[i*3+3] points = points + gp(grade)*credit gpa = float(points)/credits record[matric] = { "credits":credits, "gpa":gpa } line = f.readline() return record Computing Science 1P Lecture 15 - Simon Gay
The function output def output(r): print "Students with GPA >= 10.0" print for m in r: # m is each matric number in turn if r[m]["gpa"] >= 10.0: print m, r[m]["credits"], r[m]["gpa"] print print "Students with GPA < 10.0" print for m in r: # m is each matric number in turn if r[m]["gpa"] < 10.0: print m, r[m]["credits"], r[m]["gpa"] Computing Science 1P Lecture 15 - Simon Gay
The function output def output(r): print "Students with GPA >= 10.0" print for m in r: if r[m]["gpa"] >= 10.0: print m, "%3d" % r[m]["credits"], "%4.1f" % r[m]["gpa"] print print "Students with GPA < 10.0" print for m in r: if r[m]["gpa"] < 10.0: print m, "%3d" % r[m]["credits"], "%4.1f" % r[m]["gpa"] read Section 11.2 Computing Science 1P Lecture 15 - Simon Gay
One more idea The following outline program will be useful for Unit 12. def getChoice(): print "Choose from the following options:" print "a: operation 1" print "b: operation 2" print "q: quit" choice = raw_input("Enter your choice: ") return choice c = getChoice() while c != "q": if c == "a": print "You chose operation 1" elif c == "b": print "You chose operation 2" else: print "You did not choose a valid option" c = getChoice() Computing Science 1P Lecture 15 - Simon Gay