Outline
• Lab 1 Solution
• Program 2
• Scoping
• Algorithm efficiency
• Sorting
• Hashes
• Review for midterm
• Quiz 3
BINF 634 Fall 2013 - LECTURE06
Lab 1 Solution
BINF634 Fall 2013 Regular Expression Lab (Key)
All problems except number 9 are worth 11 points. Number 9 is worth 12 points.
1) Write a Perl regular expression that would match only the strings: “bat”, “at”, and “t”.
    /^(b?a)?t$/
   (The simpler /^b?a?t$/ would also match “bt”.)
2) Write a Perl regular expression to recognize any string that contains the substring “jeff”.
    /jeff/
Lab 1 Solution (cont.)
3) Write a Perl regular expression that would match the strings: “bat”, “baat”, “baaat”, “baa…aat”, etc. (strings that start with b, followed by one or more a’s, ending with a t).
    /^ba+t$/
4) Write a Perl regular expression that matches the strings: “hog”, “Hog”, “hOg”, “HOG”, “hOG”, etc. (That is, “hog” written in any combination of uppercase or lowercase letters.)
    /^[hH][oO][Gg]$/
5) Write a Perl regular expression that matches any positive number (with or without a decimal point).
   Hint #1: if there is a decimal point, there must be at least one digit following the decimal point.
   Hint #2: since the dot “.” matches any character, you must use \. to match a decimal point.
    /^\d+(\.\d+)?$/
Lab 1 Solution (cont.)
6) Write a Perl regular expression to match any integer that doesn’t end in 8.
    /^\d*[0-79]$/
   (A negated class such as [^8] would also accept a non-digit as the last character, e.g. “12a”.)
7) Write a Perl regular expression to match any line with exactly two words (or numbers) separated by any amount of whitespace (spaces or tabs). There may or may not be whitespace at the beginning or end of the line.
    /^\s*\w+\s+\w+\s*$/
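The patterns in this key can be checked by running them against a few sample strings. The following is a minimal sketch and not part of the original lab: the hash %patterns, its labels, the helper matches(), and the sample strings are all names and data chosen here for illustration. It uses /^(b?a)?t$/ for problem 1 and /^\d*[0-79]$/ for problem 6, forms that reject “bt” and “12a” respectively.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical test harness: labels and sample strings are illustrative only.
my %patterns = (
    p1_bat      => qr/^(b?a)?t$/,
    p3_baaat    => qr/^ba+t$/,
    p4_hog      => qr/^[hH][oO][gG]$/,
    p5_number   => qr/^\d+(\.\d+)?$/,
    p6_not8     => qr/^\d*[0-79]$/,
    p7_twowords => qr/^\s*\w+\s+\w+\s*$/,
);

# Return 1 if $string matches the named pattern, 0 otherwise.
sub matches {
    my ($name, $string) = @_;
    return $string =~ $patterns{$name} ? 1 : 0;
}

# Each case is [ pattern label, sample string ].
my @cases = (
    [ p1_bat      => 'at' ],
    [ p1_bat      => 'bt' ],
    [ p3_baaat    => 'baaat' ],
    [ p4_hog      => 'hOG' ],
    [ p5_number   => '3.14' ],
    [ p5_number   => '3.' ],
    [ p6_not8     => '128' ],
    [ p6_not8     => '127' ],
    [ p7_twowords => '  hello   world ' ],
    [ p7_twowords => 'one two three' ],
);

for my $case (@cases) {
    my ($name, $string) = @$case;
    my $result = matches($name, $string) ? 'match' : 'no match';
    printf "%-12s %-20s %s\n", $name, "'$string'", $result;
}
```

Adding a new sample string to @cases is a quick way to probe edge cases before settling on a pattern.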
Program 2 Discussions
• Questions on Program 2?
• Discussions on the permute function
Scoping: Be Careful With Scope

    #!/usr/bin/perl
    use strict;
    use warnings;
    my $x = 23;
    print "value in main body is $x \n";
    mysub($x);
    print "value in main body is $x \n";
    exit;
    sub mysub{
        print "value in subroutine is $x \n";
        $x=33;
    }

Output:
    value in main body is 23
    value in subroutine is 23
    value in main body is 33

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        my $x = 23;
        print "value in main body is $x \n";
        mysub($x);
        print "value in main body is $x \n";
        exit;
    }
    sub mysub{
        print "value in subroutine is $x \n";
        $x=33;
    }

This will not compile: $x is declared with my inside the block, so it is not visible in mysub, and use strict rejects the undeclared $x.
Scoping: Be Careful With Scope (cont.)

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        my $x = 23;
        print "value in main body is $x \n";
        mysub($x);
        print "value in main body is $x \n";
        exit;
    }
    sub mysub{
        my($x) = @_;
        $x=33;
        print "value in subroutine is $x \n";
    }

Output:
    value in main body is 23
    value in subroutine is 33
    value in main body is 23
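When a subroutine works on its own copy of an argument, the usual way to get a changed value back to the caller is to return it. The sketch below illustrates that pattern; the subroutine name add_ten and the variable $y are hypothetical and do not appear in the lecture.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: works on a private copy and returns the new value.
sub add_ten {
    my ($value) = @_;    # copy of the argument, scoped to this subroutine
    $value += 10;
    return $value;
}

{
    my $x = 23;
    my $y = add_ten($x); # $x is unchanged; $y receives the result
    print "x is $x, y is $y\n";
}
```

Keeping every variable lexical and passing values in and out explicitly makes it clear which code can change which data.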
Algorithm Efficiency: Data Structures and Algorithm Efficiency
Algorithm is O(N^2), where N = size of lists

    # An inefficient way to compute intersections
    my @a = qw/ A B C D E F G H I J K X Y Z /;
    my @b = qw/ Q R S A C D T U G H V I J K X Z /;
    my @intersection = ();
    for my $i (@a) {
        for my $j (@b) {
            if ($i eq $j) {
                push @intersection, $i;
                last;
            }
        }
    }
    print "@intersection\n";
    exit;

Output:
    A C D G H I J K X Z
Algorithm Efficiency: Data Structures and Algorithm Efficiency (version 2; version 1 is the nested-loop method)
Algorithm is O(N), where N = size of lists

    # A better way to compute intersections
    my @a = qw/ A B C D E F G H I J K X Y Z /;
    my @b = qw/ Q R S A C D T U G H V I J K X Z /;
    my @intersection = ();
    # "mark" each item in @a
    my %mark = ();
    for my $i (@a) { $mark{$i} = 1 }
    # intersection = any "marked" item in @b
    for my $j (@b) {
        if (exists $mark{$j}) {
            push @intersection, $j;
        }
    }
    print "@intersection\n";
    exit;

Output:
    A C D G H I J K X Z
Algorithm Efficiency: Demonstration
• Unix commands:
  • /usr/bin/time
  • head
  • diff
  • cmp

    % wc -l list1 list2
       24762 list1
       12381 list2
       37143 total
    % /usr/bin/time intersect1.pl list1 list2 > out1
    22.91 real 22.88 user 0.02 sys
    % /usr/bin/time intersect2.pl list1 list2 > out2
    0.06 real 0.05 user 0.00 sys

22.88 / 0.05 ≈ 458, so the hash-based version used roughly 458 times less user time.
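The gap in running time comes from how much work each method does per item. As a rough sketch (not the lecture's intersect1.pl or intersect2.pl, whose source is not shown), the two methods can be wrapped in subroutines that also count their basic operations. The names intersect_nested and intersect_hash are hypothetical, and the sketch assumes neither list contains duplicates.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Nested-loop method: returns (reference to common items, number of comparisons).
sub intersect_nested {
    my ($first, $second) = @_;
    my @common;
    my $comparisons = 0;
    for my $i (@$first) {
        for my $j (@$second) {
            $comparisons++;
            if ($i eq $j) {
                push @common, $i;
                last;
            }
        }
    }
    return (\@common, $comparisons);
}

# Hash method: returns (reference to common items, number of hash lookups).
sub intersect_hash {
    my ($first, $second) = @_;
    my %mark = map { $_ => 1 } @$first;   # "mark" each item in the first list
    my @common;
    my $lookups = 0;
    for my $j (@$second) {
        $lookups++;
        push @common, $j if exists $mark{$j};
    }
    return (\@common, $lookups);
}

my @list1 = qw/ A B C D E F G H I J K X Y Z /;
my @list2 = qw/ Q R S A C D T U G H V I J K X Z /;

my ($common1, $count1) = intersect_nested(\@list1, \@list2);
my ($common2, $count2) = intersect_hash(\@list1, \@list2);

print "nested loops: @$common1 ($count1 comparisons)\n";
print "hash lookup:  @$common2 ($count2 lookups)\n";
```

The nested-loop count grows with the product of the two list sizes, while the lookup count grows only with the size of the second list, which is why the difference becomes dramatic on lists with thousands of lines.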
Hashes: Hashes and Efficiency
• Hashes provide a very fast way to look up information associated with a set of scalar values (keys)
• Examples:
  • Count how many times each word appears in a file
  • Also: whether or not a certain word appeared in a file
  • Count how many times each codon appears in a DNA sequence
  • Whether a given codon appears in a sequence
  • How many times an item appears in a given list
  • Intersections
Hashes: Examples
• Write a subroutine get_intersection(\@a, \@b) that returns the intersection of two lists.
• Write a subroutine first_list_only(\@a, \@b) that returns the items that are in list @a but not in @b.
• Write a subroutine unique(@a) that returns the unique items in list @a (that is, removes the duplicates).
• Write a subroutine dups($n, @a) that returns a list of items that appear in @a at least $n times.
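One possible set of solutions to these exercises is sketched below. It is an illustration rather than the course's answer key. The subroutine names come from the exercises, but the internal variable names and the sample lists @x and @y are hypothetical. The sketch also assumes that get_intersection and dups should report each item once, and that first_list_only keeps duplicates from the first list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Items of the second list that also occur in the first (each reported once).
sub get_intersection {
    my ($first, $second) = @_;
    my %in_first = map { $_ => 1 } @$first;
    my %seen;
    return grep { $in_first{$_} && !$seen{$_}++ } @$second;
}

# Items of the first list that do not occur in the second (duplicates kept).
sub first_list_only {
    my ($first, $second) = @_;
    my %in_second = map { $_ => 1 } @$second;
    return grep { !$in_second{$_} } @$first;
}

# Each distinct item, in order of first appearance.
sub unique {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Distinct items that appear at least $n times, in order of first appearance.
sub dups {
    my ($n, @items) = @_;
    my %count;
    $count{$_}++ for @items;
    my %seen;
    return grep { $count{$_} >= $n && !$seen{$_}++ } @items;
}

my @x = qw/ A B C A D A B /;
my @y = qw/ B D E B /;

print "intersection:    @{[ get_intersection(\@x, \@y) ]}\n";
print "first list only: @{[ first_list_only(\@x, \@y) ]}\n";
print "unique:          @{[ unique(@x) ]}\n";
print "at least twice:  @{[ dups(2, @x) ]}\n";
```

All four subroutines make a single pass over each list and rely on hash lookups, so none of them needs a nested loop.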
Sorting
• sort LIST -- returns list sorted in string order
• sort BLOCK LIST -- compares according to BLOCK
• sort SUBNAME LIST -- compares according to subroutine SUBNAME
Sorting: Our First Attempt

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        my(@unsorted) = (17, 8, 2, 111);
        my(@sorted) = sort @unsorted;
        print "@unsorted \n";
        print "@sorted \n";
        exit;
    }

Output:
    17 8 2 111
    111 17 2 8
Sorting: The Comparison Operator
1. $a <=> $b returns 0 if equal, 1 if $a > $b, -1 if $a < $b
2. The "cmp" operator gives similar results for strings
3. $a and $b are special global variables: do NOT declare with "my" and do NOT modify.
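The difference between the two operators shows up when the same pair of values is compared both ways. This is a small sketch; the helper name compare_both and the sample pairs are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the numeric and the string comparison of one pair.
sub compare_both {
    my ($x, $y) = @_;
    return ($x <=> $y, $x cmp $y);
}

for my $pair ([2, 10], [10, 2], [7, 7]) {
    my ($numeric, $string) = compare_both(@$pair);
    printf "%2d vs %2d: <=> gives %2d, cmp gives %2d\n", @$pair, $numeric, $string;
}
```

For 2 and 10 the two operators disagree: numerically 2 is smaller, but as strings "2" sorts after "10" because the comparison proceeds character by character.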
Sorting: Sorting Numerically

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        my(@unsorted) = (17, 8, 2, 111);
        my(@sorted) = sort { $a <=> $b } @unsorted;
        print "@unsorted \n";
        print "@sorted \n";
        exit;
    }

Output:
    17 8 2 111
    2 8 17 111
Sorting: Sorting Using a Subroutine

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        my(@unsorted) = (17, 8, 2, 111);
        my(@sorted) = sort numerically @unsorted;
        print "@unsorted \n";
        print "@sorted \n";
        exit;
    }
    sub numerically { $a <=> $b }

Output:
    17 8 2 111
    2 8 17 111
Sorting: Sorting Descending

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        my(@unsorted) = (17, 8, 2, 111);
        my(@reversesorted) = reverse sort numerically @unsorted;
        print "@unsorted \n";
        print "@reversesorted \n";
        exit;
    }
    sub numerically { $a <=> $b }

Output:
    17 8 2 111
    111 17 8 2
Sorting: Sorting DNA by Length

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        # Sorting strings:
        my @dna = qw/ TATAATG TTTT GT CTCAT /;
        ## Sort @dna by length:
        @dna = sort { length($a) <=> length($b) } @dna;
        print "@dna\n"; # Output: GT TTTT CTCAT TATAATG
        exit;
    }

Output:
    GT TTTT CTCAT TATAATG
Sorting: Sorting DNA by Number of T’s (Largest First)

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        # Sorting strings:
        my @dna = qw/ TATAATG TTTT GT CTCAT /;
        @dna = sort { ($b =~ tr/Tt//) <=> ($a =~ tr/Tt//) } @dna;
        print "@dna\n"; # Output: TTTT TATAATG CTCAT GT
        exit;
    }

Output:
    TTTT TATAATG CTCAT GT
Sorting: Sorting DNA by Number of T’s (Largest First) (Take 2)

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        # Sorting strings:
        my @dna = qw/ TATAATG TTTT GT CTCAT /;
        @dna = reverse sort { ($a =~ tr/Tt//) <=> ($b =~ tr/Tt//) } @dna;
        print "@dna\n"; # Output: TTTT TATAATG CTCAT GT
        exit;
    }

Output:
    TTTT TATAATG CTCAT GT
Sorting: Sorting Strings Without Regard to Case

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        # Sort strings without regard to case:
        my(@unsorted) = qw/ mouse Rat HUMAN eColi /;
        my(@sorted) = sort { lc($a) cmp lc($b) } @unsorted;
        print "@unsorted \n";
        print "@sorted \n";
        exit;
    }

Output:
    mouse Rat HUMAN eColi
    eColi HUMAN mouse Rat
Sorting: Sorting Hashes by Value

    #!/usr/bin/perl
    use strict;
    use warnings;
    {
        my(%sales_amount) = ( auto=>100, kitchen=>2000, hardware=>200 );
        sub bysales { $sales_amount{$b} <=> $sales_amount{$a} }
        for my $dept (sort bysales keys %sales_amount) {
            printf "%s:\t%4d\n", $dept, $sales_amount{$dept};
        }
        exit;
    }

Output (the separator after each colon is a tab):
    kitchen: 2000
    hardware:  200
    auto:  100
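When two keys have the same value, a comparison on the value alone leaves their relative order unspecified. A common remedy is to add a second comparison as a tie-breaker. The sketch below assumes a hypothetical extra department, garden, tied with hardware, and a hypothetical subroutine name keys_by_value_desc.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data: 'garden' is added here to create a tie with 'hardware'.
my %sales_amount = (auto => 100, kitchen => 2000, hardware => 200, garden => 200);

# Return the keys of a hash sorted by value, largest first.
# Keys with equal values are ordered alphabetically.
sub keys_by_value_desc {
    my ($hash_ref) = @_;
    return sort {
        $hash_ref->{$b} <=> $hash_ref->{$a}   # primary: value, descending
            or
        $a cmp $b                             # tie-breaker: key, ascending
    } keys %$hash_ref;
}

for my $dept (keys_by_value_desc(\%sales_amount)) {
    printf "%-10s %4d\n", $dept, $sales_amount{$dept};
}
```

Because <=> returns 0 for equal values, the or falls through to the cmp only when the primary comparison cannot decide.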
Midterm: Review for BINF634 Midterm
• Material
  • Tisdall Chapters 1-9
  • Wall Chapter 5
  • Lecture notes
• The exam will be open book and notes
• You cannot work together on it
• You cannot use outside material
• You will have the full period to take the midterm
• You will be asked to program
Midterm: Some Example Questions
• Given two DNA fragments contained in $DNA1 and $DNA2, how can we concatenate these to make a third string $DNA3?
• What does this line of code do?
    $RNA =~ s/T/U/ig;
• What does this statement do?
    $revcom =~ tr/ACGT/TGCA/;
• What do these four lines do?
    @bases = ('A', 'C', 'G', 'T');
    $base1 = pop @bases;
    unshift (@bases, $base1);
    print "@bases\n\n";
• What does this code snippet do if COND is true?
    unless (COND) {
        # do something
    }
• What does this code fragment do?
    $protein = join('', @protein);
• What does this code fragment do?
    $myfile = "myfile";
    open(MYFILE, ">$myfile");
• What does this code fragment do?
    while ($DNA =~ /a/ig) { $a++ }
• What is the effect of using the command use strict; at the beginning of your program?
• What is contained in the reserved variable $0 and in the array @ARGV?
• What is the difference between “pass by value” and “pass by reference”?
• What is a pointer and what does it mean to dereference a pointer?
• How do you invoke perl with the debugger?
• Given an array @verbs, what is going on here?
    $verbs[rand @verbs]
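A good way to study questions like these is to run the snippets and compare the result with your prediction. The sketch below wraps three of them in a runnable script. The subroutine names revcom and count_base and the sample sequences are hypothetical. Note that the revcom shown here also reverses the string and handles lowercase bases, which the tr statement in the question does not do on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reverse complement: reverse the string, then swap complementary bases.
sub revcom {
    my ($dna) = @_;
    my $revcom = reverse $dna;             # scalar context reverses the characters
    $revcom =~ tr/ACGTacgt/TGCAtgca/;
    return $revcom;
}

# Count occurrences of a base, ignoring case, using a global match in a loop.
sub count_base {
    my ($dna, $base) = @_;
    my $count = 0;
    $count++ while $dna =~ /\Q$base\E/ig;
    return $count;
}

my @bases = ('A', 'C', 'G', 'T');
my $base1 = pop @bases;        # removes the last element, 'T'
unshift(@bases, $base1);       # puts it back at the front
print "@bases\n";              # prints: T A C G

print revcom('ACGTTG'), "\n";            # prints: CAACGT
print count_base('ATGaaC', 'a'), "\n";   # prints: 3
```

The counter in count_base is a lexical $count rather than $a, which avoids touching the special variable that sort uses.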
For the Curious Regarding Data Structures and Their Implications
• Niklaus Wirth, Algorithms + Data Structures = Programs, Prentice Hall, 1976.
• Dated in terms of language (Pascal), but very well written and understandable