Expanded I/O options
Building on basics
• We had
  • Input from the keyboard:
      nameIn = raw_input("What is your name?")
  • and output to the console:
      print "Hello", nameIn
• Additions:
  • A default for a value that was not input:
      nameIn = raw_input("What is your name?")
      if not nameIn:               # no input provided
          nameIn = "Anonymous"
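The default-value idiom can be tried without typing at the keyboard by moving the test into a small helper. This is only a sketch: the name `name_or_default` is hypothetical (it does not appear in the slides), the `raw_input` call is left out so the snippet runs unattended, and `print` is written with parentheses so the code runs under Python 2 or Python 3.

```python
def name_or_default(name_in, default="Anonymous"):
    """Return name_in, or default when name_in is empty (hypothetical helper)."""
    # An empty string counts as false, so 'not name_in' is true exactly
    # when the user pressed return without typing anything.
    if not name_in:
        name_in = default
    return name_in


print("Hello " + name_or_default(""))      # blank response falls back to the default
print("Hello " + name_or_default("Ada"))   # a real response is kept unchanged
```

In a real program the argument would come from `raw_input`, as on the slide.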
More additions
• Printing a comma-separated list of items puts a space between each pair.
• That leaves an unwanted space between the team name and the colon.
• To fix this, concatenate the strings with the + operator. Numbers must be converted to strings explicitly with str().
• Concatenation gives full control of the spacing.
    >>> team = "Wildcats"
    >>> rank = 5
    >>> print team, ": ranked", rank, "this week."
    Wildcats : ranked 5 this week.
    >>> print team + ": ranked " + str(rank) + " this week."
    Wildcats: ranked 5 this week.
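The two spacing behaviours can be compared as plain strings. In this sketch, joining with a single space is used as a stand-in for what the comma-separated print statement produces; that substitution is an assumption made so the result can be inspected, and the variable names other than `team` and `rank` are hypothetical.

```python
team = "Wildcats"
rank = 5

# Joining the pieces with one space mimics the comma-separated print.
with_commas = " ".join([team, ": ranked", str(rank), "this week."])

# Concatenation inserts nothing, so every space is written out by hand
# and the integer is converted with str() first.
with_plus = team + ": ranked " + str(rank) + " this week."

# Leaving out str() fails: a string and an integer cannot be added.
try:
    broken = team + rank
except TypeError:
    broken = None

print(with_commas)
print(with_plus)
```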
Formatting Strings
• Format strings give further control over how individual fields of output are presented.
• % marks each formatting code inside the string; used as an operator after the string, it supplies the item (or tuple of items) to be formatted.
• %s for strings
• %d for integers (d for decimal)
• %f for floats (numbers with decimal parts)
• %.3f displays three decimal places
Formatted Strings (continued)
• The previous print statement can be written with a format string.
• The format codes are:
  • %s is for strings
  • %d is for integers
  • %f is for floats. %.2f gives two decimal places.
    >>> print '%s: ranked %d this week.' % (team, rank)
    Wildcats: ranked 5 this week.
• Notice the quotes around the whole format specification.
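A short sketch of the three basic codes follows. The `average` value is made up purely to have a float to format; it is not from the slides. The formatted result is stored in a variable before printing so the same code runs under Python 2 or Python 3.

```python
team = "Wildcats"
rank = 5
average = 0.34567          # hypothetical value, only here to illustrate %f

# Several items: the values are supplied as a tuple after the % operator.
line = '%s: ranked %d this week.' % (team, rank)

# %.3f rounds the float to three decimal places.
stat = '%s average: %.3f' % (team, average)

print(line)
print(stat)
```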
Formatting details
• Further options
  • %10s: string in a field at least 10 characters wide
  • %4d: integer in a field at least 4 characters wide
  • %5.2f: float in a field at least 5 characters wide, two of them decimal places
• Note: %n.df makes the total field width n columns (a minimum), of which d are for the decimal places.
    >>> print 'Rank %5.2f as a float.' % rank
    Rank  5.00 as a float.
    >>> print 'Rank %10.2f as a float.' % rank
    Rank       5.00 as a float.
    >>> rank = 100
    >>> print "Rank %3.2f with field too small" % rank
    Rank 100.00 with field too small
• %3.2f asks for 3 columns in total: one for the decimal point and two for the decimal digits, leaving none for the whole-number part. The field is automatically expanded to fit the actual value.
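Padding is easier to see when the field is wrapped in visible delimiters. The square brackets in this sketch are an addition for illustration only, and the variable names are hypothetical.

```python
rank = 5

narrow = '[%5.2f]' % rank        # "5.00" needs 4 columns, padded on the left to 5
wide = '[%10.2f]' % rank         # same value, padded on the left to 10
overflow = '[%3.2f]' % 100       # "100.00" needs 6 columns, so the field grows
text = '[%10s]' % 'Wildcats'     # 8 characters, right-justified in 10 columns
number = '[%4d]' % rank          # integer right-justified in 4 columns

for field in (narrow, wide, overflow, text, number):
    print(field)
```

The width is a minimum: a value is never truncated to fit it.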
Working with Files • Information stored in RAM (main memory) goes away (is volatile) when the computer is shut off. • Information stored on disk is non-volatile (does not go away when the computer is turned off). • Writing to and reading from a file can help preserve information between different executions of a program.
Python File Type
• Creating a new file instance is accomplished in the same way that a new list object is made:
    fileObj = file(filename)
Reading from a File: Counting lines, words, and characters, version 1 (corrected typos and added formatting)
    filename = raw_input('What is the filename? ')
    source = file(filename)
    text = source.read()                  # Read entire file as one string
    numchars = len(text)
    numwords = len(text.split())
    numlines = len(text.split('\n'))
    print '%10d Lines\n%10d Words\n%10d Characters' % (numlines, numwords, numchars)
    source.close()
• Note: this version reads the whole file at once, as a single string.
    What is the filename? citeseertermcount.txt
         30002 Lines
        156521 Words
        920255 Characters
Reading from a File: Counting lines, words, and characters, version 2
    numlines = numwords = numchars = 0
    line = source.readline()
    while line:                           # line length is not zero
        numchars += len(line)
        numwords += len(line.split())
        numlines += 1
        # Done with current line. Read the next
        line = source.readline()
    print '%10d Lines\n%10d Words\n%10d Characters' % (numlines, numwords, numchars)
    source.close()
• Now we read one line at a time, process it, and read the next.
    What is the filename? citeseertermcount.txt
         30001 Lines
        156521 Words
        920255 Characters
• Note the different number of lines.
Reading from a File: Counting lines, words, and characters, version 3
    filename = raw_input('What is the filename? ')
    source = file(filename)
    numlines = numwords = numchars = 0
    for line in source:                   # reads one line at a time until no more
        numchars += len(line)
        numwords += len(line.split())
        numlines += 1
    print '%10d Lines\n%10d Words\n%10d Characters' % (numlines, numwords, numchars)
    source.close()
• Note that "for line in source" actually does the read of a line. No explicit readline is used.
         30001 Lines
        156521 Words
        920255 Characters
• Note the number of lines.
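The version 3 loop can be tried on a small file of known content. This sketch makes several assumptions: the loop is wrapped in a hypothetical function `count_file`, the built-in `open()` stands in for `file()` so the code also runs under Python 3, and a two-line temporary file replaces citeseertermcount.txt.

```python
import os
import tempfile


def count_file(filename):
    """Return (lines, words, characters) for a text file (hypothetical helper)."""
    source = open(filename)
    numlines = numwords = numchars = 0
    for line in source:                  # one line per pass, newline included
        numchars += len(line)
        numwords += len(line.split())
        numlines += 1
    source.close()
    return numlines, numwords, numchars


# Build a small sample file so the counts can be checked by hand.
handle, path = tempfile.mkstemp(suffix='.txt')
os.close(handle)
sample = open(path, 'w')
sample.write('one two\nthree\n')
sample.close()

counts = count_file(path)
print('%10d Lines\n%10d Words\n%10d Characters' % counts)
os.remove(path)                          # clean up the temporary file
```

Returning the three counts as a tuple lets the same format string from the slide be reused unchanged.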
Spot check 1 • Why was there a difference in the number of lines found by the three versions of the program? • Discuss on the blackboard forum, then enter your answer. Consultation and collaboration is good, but write your own answer and be sure you understand it.
Writing to a File
• To create a file object that can be written to, pass the mode 'w' along with the file name:
    result = file(filename, 'w')
• If a file named filename already exists, it will be overwritten.
• Only strings can be written to a file:
    pi = 3.14159
    result.write(pi)           # illegal: pi is a float, not a string
    result.write(str(pi))      # legal
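The strings-only rule can be checked directly. In this sketch `open()` stands in for `file()` (an assumption, so that the code also runs under Python 3), a temporary file is used instead of a user-supplied name, and the flag `wrote_number` is a hypothetical name.

```python
import os
import tempfile

pi = 3.14159

handle, path = tempfile.mkstemp(suffix='.txt')
os.close(handle)

result = open(path, 'w')
try:
    result.write(pi)                 # a float is not a string
    wrote_number = True
except TypeError:
    wrote_number = False             # the write is rejected
result.write(str(pi) + '\n')         # convert explicitly, then write
result.write('%.2f\n' % pi)          # a format string also produces a string
result.close()

check = open(path)
contents = check.read()
check.close()
os.remove(path)

print(contents)
```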
Writing to a File • When is the information actually written to a file? • File writing is time expensive so files may not be written immediately. • A file can be forced to be written in two ways: • flush(): file written but not closed. • close(): file written and then closed.
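The effect of flush() and close() can be observed by reading the file back through a second file object. This is a sketch under assumptions: `open()` stands in for `file()`, a temporary file is used, and the names `after_flush` and `after_close` are hypothetical. What a reader would see before the flush depends on buffering, so the sketch does not check it.

```python
import os
import tempfile

handle, path = tempfile.mkstemp(suffix='.txt')
os.close(handle)

result = open(path, 'w')
result.write('first line\n')
result.flush()                       # force the pending text out; file stays open

reader = open(path)
after_flush = reader.read()          # the flushed text is now visible
reader.close()

result.write('second line\n')
result.close()                       # write whatever is pending, then close

reader = open(path)
after_close = reader.read()
reader.close()
os.remove(path)

print(after_flush)
print(after_close)
```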
File Write Danger • Note that there is no built-in protection against destroying a file that already exists! • If you want to safeguard against accidentally overwriting an existing file, what would you do? • Discuss
Trying to Read a File That Doesn't Exist
• What if a file is opened for reading and no file with that name exists? An IOError is raised, which crashes the program.
• To avoid the crash, handle the exception:
    filename = raw_input('Enter filename: ')
    try:
        source = file(filename)
    except IOError:
        print 'Sorry, unable to open file', filename
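The handler can be exercised with a name that is certain not to exist. In this sketch the helper `try_open` is hypothetical, `open()` stands in for `file()` so the code also runs under Python 3, and the missing name is built inside a fresh, empty temporary directory rather than typed by a user.

```python
import os
import tempfile


def try_open(filename):
    """Return an open file object, or None if it cannot be opened (hypothetical)."""
    try:
        return open(filename)
    except IOError:
        print('Sorry, unable to open file ' + filename)
        return None


# A fresh temporary directory is empty, so this name cannot exist.
folder = tempfile.mkdtemp()
missing = os.path.join(folder, 'missing.txt')

source = try_open(missing)           # prints the apology instead of crashing
os.rmdir(folder)
```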
File Utilities
    # Prompt for filename until file is successfully opened.
    def fileReadRobust():
        source = None
        while not source:
            filename = raw_input('Input filename: ')
            try:
                source = file(filename)
            except IOError:
                print 'Sorry, unable to open file', filename
        return source
File Utilities (continued)
    def openFileWriteRobust(defaultName):
        """Repeatedly prompt user for filename until successfully opening with write access.

        Return a newly open file object with write access.

        defaultName    a suggested filename. This will be offered within the prompt and
                       used when the return key is pressed without specifying another name.
        """
        writable = None
        while not writable:                    # still no successfully opened file
            prompt = 'What should the output be named [%s]? ' % defaultName
            filename = raw_input(prompt)
            if not filename:                   # user gave blank response
                filename = defaultName         # try the suggested default
            try:
                writable = file(filename, 'w')
            except IOError:
                print 'Sorry. Unable to write to file', filename
        return writable
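The retry-until-success pattern used by these utilities can be tried without a keyboard by feeding the loop a list of candidate names in place of repeated prompts. This is a variation for experimenting, not the utilities themselves: the function `open_first_readable` is hypothetical, `open()` stands in for `file()`, and the loop stops when the candidates run out instead of prompting forever.

```python
import os
import tempfile


def open_first_readable(candidates):
    """Try each name in turn; return the first file that opens, else None."""
    for filename in candidates:          # each candidate plays the role of one prompt
        try:
            return open(filename)
        except IOError:
            print('Sorry. Unable to open file ' + filename)
    return None


# One missing name followed by one real file, both in a temporary directory.
folder = tempfile.mkdtemp()
missing = os.path.join(folder, 'missing.txt')
present = os.path.join(folder, 'present.txt')
maker = open(present, 'w')
maker.write('hello\n')
maker.close()

source = open_first_readable([missing, present])
first_line = source.readline()
source.close()

os.remove(present)
os.rmdir(folder)
```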
Testing the File Utilities
    from FileUtilities import *

    sourceFile = openFileReadRobust()
    if sourceFile is not None:
        print "Successful read of ", sourceFile
    filenone = "anyname"
    outFile = openFileWriteRobust(filenone)
    if outFile is not None:
        print "File ", outFile, " opened for writing"

    What is the filename? citeseertermcount.txt
    Successful read of  <open file 'citeseertermcount.txt', mode 'r' at 0x60f9d0>
    What should the output be named [anyname]? abc.txt
    File  <open file 'abc.txt', mode 'w' at 0x60fa20>  opened for writing
Numbering lines in a file
    # Program: annotate.py
    # Authors: Michael H. Goldwasser
    #          David Letscher
    #
    # This example is discussed in Chapter 8 of the book
    # Object-Oriented Programming in Python
    #
    from FileUtilities import openFileReadRobust, openFileWriteRobust

    print 'This program annotates a file, by adding'
    print 'Line numbers to the left of each line.\n'

    source = openFileReadRobust()
    annotated = openFileWriteRobust('annotated.txt')

    # process the file
    linenum = 1
    for line in source:
        annotated.write('%4d %s' % (linenum, line))
        linenum += 1
    source.close()
    annotated.close()
    print 'The annotation is complete.'
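The numbering loop can be checked on a tiny file by wrapping it in a function that takes file names instead of prompting. This is a sketch under assumptions, not the book's program: the function `annotate` and its return value are hypothetical, `open()` stands in for `file()`, and temporary files replace the names a user would type.

```python
import os
import tempfile


def annotate(source_name, annotated_name):
    """Copy source_name to annotated_name with line numbers; return the line count."""
    source = open(source_name)
    annotated = open(annotated_name, 'w')
    linenum = 1
    for line in source:
        # %4d right-justifies the number; the line keeps its own newline.
        annotated.write('%4d %s' % (linenum, line))
        linenum += 1
    source.close()
    annotated.close()
    return linenum - 1


folder = tempfile.mkdtemp()
original = os.path.join(folder, 'original.txt')
numbered = os.path.join(folder, 'numbered.txt')

maker = open(original, 'w')
maker.write('alpha\nbeta\n')
maker.close()

total = annotate(original, numbered)     # writes to a different file name
check = open(numbered)
result_text = check.read()
check.close()

os.remove(original)
os.remove(numbered)
os.rmdir(folder)
print(result_text)
```

Writing to a separate output name leaves the original file untouched.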
Running the annotation program
    This program annotates a file, by adding
    Line numbers to the left of each line.

    What is the filename? fileUtilities.py
    What should the output be named [annotated.txt]? annotatedUtilities.txt
    The annotation is complete.
• Directory after the program runs:
    FileUtilities.pyc         citeseertermcount.txt    readfile1.py
    abc.txt                   fileUtilTest.py          readfile2.py
    annotate.py               fileUtilities.py         readfile3.py
    annotatedUtilities.txt    readexception.py
The annotated file
       1 # Program: FileUtilities.py
       2 # Authors: Michael H. Goldwasser
       3 #          David Letscher
       4 #
       5 # This example is discussed in Chapter 8 of the book
       6 # Object-Oriented Programming in Python
       7 #
       8 """A few utility functions for opening files."""
       9 def openFileReadRobust():
      10     """Repeatedly prompt user for filename until successfully opening with read access.
      11
      12     Return the newly open file object.
      13     """
      14     source = None
      15     while not source:     # still no successfully opened file
      16         filename = raw_input('What is the filename? ')
      17         try:
      18             source = file(filename)
      19         except IOError:
      20             print 'Sorry. Unable to open file', filename
      21     return source
      22
      23 def openFileWriteRobust(defaultName):
      24     """Repeatedly prompt user for filename until successfully opening with write access.
      25
      26     Return a newly open file object with write access.
• Rest not shown for space limitations.
Spot Check 2 • Run the annotate program against a file of your choosing and get the line numbers added. • Be careful not to overwrite the original file. • What would be the effect if you added line numbers to a program file? • How would you remove the line numbers if you got them into the wrong file?
Tally • Read through the case study of constructing a tally sheet class. • Compare what you see here to the frequency distribution content that you saw in the NLTK book.
NLTK chapter 3 • That is written very much as a tutorial and I don’t think I can do much with slides and no narration. • Please read through that chapter and do the “Your turn” exercises. Use the Discussion board to comment on what you do and to share observations and ask questions.
Assignment • In two weeks: • Do either exercise 8.18 or exercise 8.21 • (Do you prefer to work with numbers or words?) • Be sure to design good test cases for your program. • For chapter review (and quiz preparation), be sure you can do exercises 8.7 to 8.9.