CISC 101: Fall 2011 Hossain Shahriar shahriar@cs.queensu.ca
Announcement and reminder! • The tentative date for the final exam needs to be fixed! • Topics to be covered in this lecture(s) • List • File • Regular expression
List • A flexible data structure that can store any number of items • L = [1, 2, 3] • Some useful methods of list • list.append(x): Add an item to the end of the list • list.insert(i, x): Insert an item at a given position • list.remove(x): Remove the first item whose value is x. It raises a ValueError if there is no such item (so call it inside a try-except block)
List (cont.) • list.pop([i]): Remove the item at the given position in the list, and return it. If no index is specified, list.pop() removes and returns the last item in the list. • list.index(x): Return the index in the list of the first item whose value is x. • list.count(x): Return the number of times x appears in the list. • list.sort(): Sort the items of the list, in place. • list.reverse(): Reverse the elements of the list, in place.
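The list methods above can be tried out in one short session. This is a minimal sketch: the list `items` and the other variable names are illustrative and do not come from the slides.

```python
# Illustrative list; the names here are made up for this example.
items = [3, 1, 2]

items.append(4)       # add 4 to the end        -> [3, 1, 2, 4]
items.insert(0, 9)    # insert 9 at position 0  -> [9, 3, 1, 2, 4]
items.remove(9)       # remove the first 9      -> [3, 1, 2, 4]

# remove() raises ValueError when the value is absent, so guard the call.
try:
    items.remove(42)
    removed_missing = True
except ValueError:
    removed_missing = False

last = items.pop()    # removes and returns the last item (4)
first = items.pop(0)  # removes and returns the item at index 0 (3)

items.append(1)              # items is now [1, 2, 1]
position = items.index(2)    # index of the first 2
ones = items.count(1)        # how many times 1 appears

items.sort()          # sorts in place    -> [1, 1, 2]
items.reverse()       # reverses in place -> [2, 1, 1]
print(items)
```

Note that sort() and reverse() change the list itself and return None, so `items = items.sort()` would lose the list.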
List operators (cont.) • The slice operator selects a sub-list from a list • L = [1, 2, 3, 4, 5, 6] • L[:3] = [1, 2, 3] # select elements from index 0 to (3-1) • L[3:3] = [ ] # start and end are equal, so nothing is selected • L[2:] = [3, 4, 5, 6] # select elements from index 2 to (6-1) • Concatenating two lists (+) • L1 = [1, 2, 3], L2 = [4, 5, 6] • L = L1 + L2 = [1, 2, 3, 4, 5, 6] • Multiplication (*) • L1 = [1, 2, 3] • L2 = L1 * 2 = [1, 2, 3, 1, 2, 3]
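The slice, concatenation, and multiplication operators can be checked directly in the interpreter. The sketch below reuses the lists from the slide; the result variable names are illustrative.

```python
L = [1, 2, 3, 4, 5, 6]

first_three = L[:3]   # indices 0, 1, 2
empty = L[3:3]        # start equals end, so the slice is empty
tail = L[2:]          # index 2 through the end of the list

L1 = [1, 2, 3]
L2 = [4, 5, 6]
joined = L1 + L2      # concatenation builds a new list
repeated = L1 * 2     # repetition builds a new list

print(first_three, empty, tail)
print(joined, repeated)
```

Each of these operators returns a new list; L, L1, and L2 themselves are left unchanged.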
List operators (cont.) • The del statement deletes an element • L = [1, 3, 5, 7] • del L[1] # L is now [1, 5, 7] • in tests whether an element is in a list or not • Returns True or False • L = [1, 3, 5, 7] • 3 in L = True • 3 not in L = False # not in is the negated test
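A short sketch of del and the membership tests, using the same values as the slide. The list name `odds` and the result names are illustrative.

```python
odds = [1, 3, 5, 7]

has_three = 3 in odds          # membership test before deleting
lacks_three = 3 not in odds    # the negated test

del odds[1]                    # deletes the element at index 1 (the value 3)

still_has_three = 3 in odds    # 3 is gone now
print(odds, has_three, lacks_three, still_has_three)
```

Unlike pop(), del is a statement and does not return the deleted value.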
File • Stores data permanently • How to read a file? • Open a file • Read the data line by line • Close the file • Do these steps in a try-except block
def fileread(name):
    try:
        f = open(name, "r")      # open the file for reading only
        lines = f.readlines()    # read every line into a list
        f.close()
        for line in lines:
            print(line)
    except IOError:
        raise ValueError("Filename does not exist, or cannot be read.")

fileread("C:/Users/shahriar/Desktop/CISC101/test.txt")
File • How to write a file? • Open a file • Write data • Close the file • Do these steps in a try-except block
def filewrite(name):
    try:
        f = open(name, "w")      # this will erase any previous content
        f.write('hello world')
        f.close()
    except IOError:
        raise ValueError("File cannot be opened for writing")

filewrite("C:/Users/shahriar/Desktop/CISC101/test.txt")
File • More on opening modes • f = open(name, "w") • 'r': the file will be opened for reading only • 'w': the file will be opened for writing only, and all of its contents are erased if the file already exists • 'a': the file will be opened for appending; any data written to the file is automatically added to the end
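The difference between the 'w' and 'a' modes is easiest to see by writing to the same file several times. This is a minimal sketch that assumes a scratch file in a temporary directory; the file name `modes_demo.txt` is made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "modes_demo.txt")

f = open(path, "w")          # 'w' creates the file
f.write("first line\n")
f.close()

f = open(path, "w")          # opening with 'w' again erases the old content
f.write("second line\n")
f.close()

f = open(path, "a")          # 'a' keeps the content and writes at the end
f.write("third line\n")
f.close()

f = open(path, "r")          # 'r' opens the file for reading only
lines = f.readlines()
f.close()
print(lines)

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(folder)
```

Only the second and third lines survive, because the second open with 'w' discarded the first one.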
Regular expression • Very useful for string data processing • Tokenization • First, we import the regular expression library: import re • We need to know how to write regular expressions • Match any sequence of alphanumeric characters: r'\w+' • text = "I know that you do not know" • Mylist = re.findall(r'\w+', text) • ['I', 'know', 'that', 'you', 'do', 'not', 'know'] • We have just performed string tokenization!! • Each element of the list is a token that we obtained from text • For this sentence, we could have obtained Mylist using text.split() too!
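The two tokenization approaches agree on the sentence above, but they differ once punctuation appears. The sketch below illustrates this; the second sentence (`punctuated`) is an invented example, not one from the slides.

```python
import re

text = "I know that you do not know"
tokens = re.findall(r'\w+', text)   # runs of alphanumeric characters
split_tokens = text.split()         # pieces separated by whitespace
same = tokens == split_tokens       # identical for this sentence

# Invented sentence with punctuation, to show where the two differ.
punctuated = "I know, you know."
regex_tokens = re.findall(r'\w+', punctuated)   # punctuation is dropped
space_tokens = punctuated.split()               # punctuation stays attached

print(tokens)
print(regex_tokens)
print(space_tokens)
```

split() only looks at whitespace, so commas and periods remain part of the tokens, while r'\w+' keeps only letters, digits, and underscores.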
Regular expression • \d matches a single digit; \d+ matches a run of one or more digits, i.e. a whole number • How can we recognize numbers in a string? -- \d+ • numlist = re.findall(r'\d+', "i am 19 years old ") • print(numlist) • >>> ['19'] # findall returns the matches as strings • Take home assignment • Can you find two more examples of metacharacters and show what output they produce?
Regular expression • \d: Matches any decimal digit; [0-9]. • \D: Matches any non-digit character; [^0-9]. • \s: Matches any whitespace character; [ \t\n\r\f\v]. • \S: Matches any non-whitespace character; [^ \t\n\r\f\v]. • \w: Matches any alphanumeric character or underscore; [a-zA-Z0-9_]. • \W: Matches any non-alphanumeric character; [^a-zA-Z0-9_]. • +: One or more repetitions of the preceding item • *: Zero or more repetitions of the preceding item
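The character classes and repetition operators above can be compared on a single string. This is a small sketch; the sample strings are invented for illustration.

```python
import re

# Invented sample string.
sample = "Room 12B, floor 3"

digits = re.findall(r'\d', sample)        # one match per digit
digit_runs = re.findall(r'\d+', sample)   # one match per run of digits
non_digits = re.findall(r'\D+', sample)   # runs of everything except digits
words = re.findall(r'\w+', sample)        # runs of letters, digits, underscore
spaces = re.findall(r'\s', sample)        # each whitespace character
non_words = re.findall(r'\W+', sample)    # runs of punctuation and spaces

# + needs at least one 'b' after 'a'; * also accepts none.
plus = re.findall(r'ab+', "a ab abb")
star = re.findall(r'ab*', "a ab abb")

print(digits, digit_runs, words)
print(plus, star)
```

The patterns are written as raw strings (the r prefix) so that Python passes the backslashes to the re module unchanged.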
Regular expression • match(): Determine whether the RE matches at the beginning of the string. • search(): Scan through a string, looking for any location where the RE matches. • findall(): Find all substrings where the RE matches, and return them as a list.
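The three functions behave differently on the same input, as the following sketch shows. It reuses the sentence from the earlier slide; the second string and the variable names are illustrative.

```python
import re

text = "i am 19 years old"

# match() only looks at the start of the string, so it finds nothing here.
at_start = re.match(r'\d+', text)

# search() scans the whole string and stops at the first match.
anywhere = re.search(r'\d+', text)
found = anywhere.group()      # the matched text
where = anywhere.start()      # index where the match begins

# findall() collects every match as a list of strings.
all_numbers = re.findall(r'\d+', "ages 19 and 21")

print(at_start, found, where, all_numbers)
```

match() and search() return None when nothing matches, so check the result before calling group() on it.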