Lecture 4B, Thursday, January 17, 2008
File input and output; if-then-else
Genome 559: Introduction to Statistical and Computational Genomics
Prof. William Stafford Noble
Opening files
• The open() command returns a file object.
    <filehandle> = open(<filename>, <access type>)
• Python can read, write or append to a file:
    • 'r' = read
    • 'w' = write
    • 'a' = append
• Create a file called “hello.txt” containing one line: “Hello, world!”
    >>> myFile = open("hello.txt", "r")
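The slide asks you to create hello.txt first because 'r' mode only works on a file that already exists. The sketch below illustrates this under some assumptions: it uses print(...) with parentheses so that it also runs under Python 3, the temporary path is hypothetical, and the try/except construct goes beyond what this lecture covers.

```python
import os
import tempfile

# Hypothetical path inside a fresh temporary directory; no file exists there yet.
filename = os.path.join(tempfile.mkdtemp(), "hello.txt")

# Opening a missing file for reading fails with an IOError.
try:
    myFile = open(filename, "r")
    myFile.close()
    opened = True
except IOError:
    opened = False
    print("Cannot open for reading: the file does not exist")

# Opening the same path for writing creates the file.
myFile = open(filename, "w")
myFile.write("Hello, world!\n")
myFile.close()

# Now reading works.
myFile = open(filename, "r")
firstLine = myFile.readline()
myFile.close()
print(repr(firstLine))
```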
Reading the whole file
• You can read the contents of the file into a single string.
    >>> myString = myFile.read()
    >>> print myString
    Hello, world!

    >>>
• Why is there a blank line here?
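One way to investigate the blank line is to look at the exact characters in the string. The sketch below assumes the file holds the single line from the previous slide and uses repr() to make the end-of-line character visible; it is written with print(...) so it also runs under Python 3.

```python
# What read() returns for a file containing the single line "Hello, world!".
myString = "Hello, world!\n"

# repr() shows escape sequences such as \n instead of acting on them.
visible = repr(myString)
print(visible)

# The string already ends with a newline, and print adds one of its own.
endsWithNewline = myString.endswith("\n")
print(endsWithNewline)
```

The string carries its own newline and print appends a second one, which accounts for the blank line.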
Reading the whole file
• Now add a second line to your file (“How ya doin’?”) and try again.
    >>> myFile = open("hello.txt", "r")
    >>> myString = myFile.read()
    >>> print myString
    Hello, world!
    How ya doin'?

    >>>
Reading the whole file
• Alternatively, you can read the file into a list of strings.
    >>> myFile = open("hello.txt", "r")
    >>> myStringList = myFile.readlines()
    >>> print myStringList
    ['Hello, world!\n', "How ya doin'?\n"]
    >>> print myStringList[1]
    How ya doin'?
Reading one line at a time
• The readlines() command puts all the lines into a list of strings.
• The readline() command returns the next line.
    >>> myFile = open("hello.txt", "r")
    >>> myString = myFile.readline()
    >>> print myString
    Hello, world!

    >>> myString = myFile.readline()
    >>> print myString
    How ya doin'?

    >>>
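A natural follow-up question is what readline() returns once the file is exhausted. The sketch below assumes a two-line hello.txt created in a temporary directory (a hypothetical location) and runs under Python 3 as well as Python 2.

```python
import os
import tempfile

# Build the two-line file from the slides in a throwaway directory.
filename = os.path.join(tempfile.mkdtemp(), "hello.txt")
myFile = open(filename, "w")
myFile.write("Hello, world!\nHow ya doin'?\n")
myFile.close()

myFile = open(filename, "r")
first = myFile.readline()    # first line, newline included
second = myFile.readline()   # second line, newline included
third = myFile.readline()    # nothing left to read
myFile.close()

print(repr(first))
print(repr(second))
print(repr(third))
```

At the end of the file readline() returns an empty string rather than raising an error, so an empty result can serve as an end-of-file test.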
Writing to a file
• Open the file for writing or appending.
    >>> myFile = open("new.txt", "w")
• Use the <file>.write() method.
    >>> myFile.write("This is a new file\n")
    >>> myFile.close()
    >>> ^D
    > cat new.txt
    This is a new file
• Always close a file after you are finished reading from or writing to it.
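As an alternative to calling close() by hand, Python 2.6 and later offer the with statement, which closes the file when its block ends. This is a sketch of that idiom, not something the lecture itself uses; the temporary path is an assumption.

```python
import os
import tempfile

# Hypothetical location for new.txt so the sketch leaves your files alone.
filename = os.path.join(tempfile.mkdtemp(), "new.txt")

# The file is closed automatically when the indented block ends.
with open(filename, "w") as myFile:
    myFile.write("This is a new file\n")

# The .closed attribute confirms that no explicit close() was needed.
closedAfterBlock = myFile.closed

with open(filename, "r") as myFile:
    contents = myFile.read()

print(closedAfterBlock)
print(repr(contents))
```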
Print vs write
• <file>.write() does not automatically append an end-of-line character.
• <file>.write() requires a string as input.
    >>> newFile.write("foo")
    >>> newFile.write(1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: argument 1 must be string or read-only character buffer, not int
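Both points can be handled explicitly: convert numbers with str() and add "\n" yourself. A minimal sketch, assuming a hypothetical temporary file and written to run under Python 3 as well:

```python
import os
import tempfile

filename = os.path.join(tempfile.mkdtemp(), "numbers.txt")  # hypothetical file name

newFile = open(filename, "w")
newFile.write("foo")     # no end-of-line character is added
newFile.write(str(1))    # convert the integer to a string first
newFile.write("\n")      # end the line explicitly
newFile.close()

newFile = open(filename, "r")
contents = newFile.read()
newFile.close()
print(repr(contents))
```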
The if statement
    >>> if (seq.startswith("C")):
    ...     print "Starts with C"
    ...
    Starts with C
    >>>
• A block is a group of lines of code that belong together.
    if (<test evaluates to true>):
        <execute this block of code>
• In the Python interpreter, the ellipsis (...) indicates that you are inside a block.
• Python uses indentation to keep track of blocks.
• You can use any number of spaces to indicate blocks, but you must be consistent.
• An unindented line (or, in the interactive interpreter, a blank line) indicates the end of a block.
The if statement
• Try doing an if statement without indentation.
    >>> if (seq.startswith("C")):
    ... print "Starts with C"
      File "<stdin>", line 2
        print "Starts with C"
            ^
    IndentationError: expected an indented block
Multiline blocks
• Try doing an if statement with multiple lines in the block.
    >>> if (seq.startswith("C")):
    ...     print "Starts with C"
    ...     print "All right by me!"
    ...
    Starts with C
    All right by me!
Multiline blocks
• What happens if you don’t use the same number of spaces to indent the block?
    >>> if (seq.startswith("C")):
    ...     print "Starts with C"
    ...   print "All right by me!"
      File "<stdin>", line 4
        print "All right by me!"
            ^
    SyntaxError: invalid syntax
Comparison operators
• Boolean: and, or, not
• Numeric: <, >, ==, !=, <>, >=, <=
• String: in
Examples
    seq = 'CAGGT'
    >>> if ('C' == seq[0]):
    ...     print 'C is first'
    ...
    C is first
    >>> if ('CA' in seq):
    ...     print 'CA in', seq
    ...
    CA in CAGGT
    >>> if (('CA' in seq) and ('CG' in seq)):
    ...     print "Both there!"
    ...
    >>>
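The operators listed earlier all produce True or False, so their results can be stored and inspected directly. The sketch below reuses seq = 'CAGGT'; the variable names are illustrative assumptions, and it avoids <>, which only exists in Python 2.

```python
seq = "CAGGT"

# Numeric comparison on the length of the sequence.
isLong = len(seq) >= 5

# String membership and its negation.
hasCA = "CA" in seq
lacksCG = "CG" not in seq

# Boolean operators combine individual tests.
either = ("CA" in seq) or ("CG" in seq)
both = ("CA" in seq) and ("CG" in seq)

print(isLong)
print(hasCA)
print(lacksCG)
print(either)
print(both)
```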
Beware! = versus ==
• Single equal assigns a value to a variable name.
    >>> myString == "foo"
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    NameError: name 'myString' is not defined
    >>> myString = "foo"
    >>> myString == "foo"
    True
• Double equal tests for equality.
    >>> if (myString = "foo"):
      File "<stdin>", line 1
        if (myString = "foo"):
                     ^
    SyntaxError: invalid syntax
    >>> if (myString == "foo"):
    ...     print "Yes!"
    ...
    Yes!
if-else statements
    if <test1>:
        <statement>
    else:
        <statement>
• The else block executes only if <test1> is false.
    >>> if (seq.startswith('T')):
    ...     print 'T start'
    ... else:
    ...     print 'starts with', seq[0]
    ...
    starts with C
    >>>
• Here the test evaluates to False, so 'T start' is not printed.
If-elif-else
    if <test1>:
        <statement>
    elif <test2>:
        <statement>
    else:
        <statement>
• The elif block executes only if <test1> is false and <test2> is true.
Example
    >>> base = 'C'
    >>> if (base == 'A'):
    ...     print "adenine"
    ... elif (base == 'C'):
    ...     print "cytosine"
    ... elif (base == 'G'):
    ...     print "guanine"
    ... elif (base == 'T'):
    ...     print "thymine"
    ... else:
    ...     print "Invalid base!"
    ...
    cytosine
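The same chain can be tried on several inputs by wrapping it in a function. This is only a sketch: the function name baseName is a hypothetical helper, and defining functions goes beyond this lecture.

```python
def baseName(base):
    # Tests are tried from top to bottom; only the first true one runs.
    if base == 'A':
        return "adenine"
    elif base == 'C':
        return "cytosine"
    elif base == 'G':
        return "guanine"
    elif base == 'T':
        return "thymine"
    else:
        # Reached only when every test above is false.
        return "Invalid base!"

cName = baseName('C')
xName = baseName('X')
print(cName)
print(xName)
```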
<file> = open(<filename>, 'r'|'w'|'a')
<string> = <file>.read()
<string> = <file>.readline()
<string list> = <file>.readlines()
<file>.write(<string>)
<file>.close()

if <test1>:
    <statement>
elif <test2>:
    <statement>
else:
    <statement>

• Boolean: and, or, not
• Numeric: <, >, ==, !=, <>, >=, <=
• String: in, not in
Sample problem #1
• Write a program read-first-line.py that takes a file name from the command line, opens the file, reads the first line, and prints the result to the screen.
    > python read-first-line.py hello.txt
    Hello, world!

    >
Solution #1
    import sys
    filename = sys.argv[1]
    myFile = open(filename, "r")
    firstLine = myFile.readline()
    myFile.close()
    print firstLine
Sample problem #2
• Modify your program to print the first line without the extra newline.
    > python read-first-line.py hello.txt
    Hello, world!
    >
Solution #2
    import sys
    filename = sys.argv[1]
    myFile = open(filename, "r")
    firstLine = myFile.readline()
    firstLine = firstLine[:-1]
    myFile.close()
    print firstLine
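The slice firstLine[:-1] removes the last character whatever it is, which matters if the line has no trailing newline (for example, the last line of a file that does not end with one). The sketch below compares it with the string method rstrip("\n"), offered here as an alternative that the solution itself does not use.

```python
withNewline = "Hello, world!\n"
withoutNewline = "Hello, world!"   # a line that lacks the final newline

# Slicing always drops exactly one character.
slicedA = withNewline[:-1]
slicedB = withoutNewline[:-1]      # loses the "!"

# rstrip("\n") removes trailing newline characters only when they are present.
strippedA = withNewline.rstrip("\n")
strippedB = withoutNewline.rstrip("\n")

print(repr(slicedA))
print(repr(slicedB))
print(repr(strippedA))
print(repr(strippedB))
```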
Sample problem #3
• Write a program add-two-numbers.py that reads one integer from the first line of one file and a second integer from the first line of a second file and then prints their sum.
    > python add-two-numbers.py nine.txt four.txt
    9 + 4 = 13
    >
Solution #3
    import sys
    fileOne = open(sys.argv[1], "r")
    valueOne = int(fileOne.readline())
    fileOne.close()
    fileTwo = open(sys.argv[2], "r")
    valueTwo = int(fileTwo.readline())
    fileTwo.close()
    print valueOne, "+", valueTwo, "=", valueOne + valueTwo
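The solution passes the result of readline() straight to int() even though the line still ends with a newline. This works because int() ignores surrounding whitespace, as the short sketch below shows; the literal strings stand in for the first lines of nine.txt and four.txt.

```python
# Stand-ins for what readline() returns from nine.txt and four.txt.
lineOne = "9\n"
lineTwo = "4\n"

# int() tolerates leading and trailing whitespace, including the newline.
valueOne = int(lineOne)
valueTwo = int(lineTwo)
total = valueOne + valueTwo

print(total)
```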
Sample problem #4
• Write a program find-base.py that takes as input a nucleotide and a DNA sequence. The program should print the position at which the nucleotide first occurs in the sequence, or a message saying it’s not there.
    > python find-base.py A GTAGCTA
    A occurs at position 2
    > python find-base.py A GTGCT
    A does not occur at all
• Hint: S.find('G') returns -1 if it can't find the requested string. Positions are counted from 0.
Solution #4
    import sys
    base = sys.argv[1]
    sequence = sys.argv[2]
    position = sequence.find(base)
    if (position == -1):
        print base, "does not occur at all"
    else:
        print base, "occurs at position", position
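It helps to see exactly what find() reports. The sketch below uses the sequences from the sample problem; the count() call and the one-based conversion are additions for comparison, not part of the solution.

```python
sequence = "GTAGCTA"

# find() returns the zero-based index of the first match.
position = sequence.find("A")

# Adding 1 gives the position as a person counting from 1 would state it.
oneBased = position + 1

# count() returns how many times the nucleotide occurs.
occurrences = sequence.count("A")

# find() signals "not found" with -1.
missing = "GTGCT".find("A")

print(position)
print(oneBased)
print(occurrences)
print(missing)
```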
Reading
• Chapter 13 of Learning Python (3rd edition) by Lutz.