
Outline




  1. Outline (BINF 634 Fall 2013 - LECTURE06)
     • Lab 1 Solution
     • Program 2
     • Scoping
     • Algorithm efficiency
     • Sorting
     • Hashes
     • Review for midterm
     • Quiz 3

  2. Lab 1 Solution
     BINF634 Fall 2013 Regular Expression Lab (Key)
     All problems except number 9 are worth 11 points. Number 9 is worth 12 points.
     1) Write a Perl regular expression that would match only the strings: “bat”, “at”, and “t”.
         /^(b?a)?t$/
     2) Write a Perl regular expression to recognize any string that contains the substring “jeff”.
         /jeff/

  3. Lab 1 Solution
     3) Write a Perl regular expression that would match the strings: “bat”, “baat”, “baaat”, “baa…aat”, etc. (strings that start with b, followed by one or more a’s, ending with a t).
         /^ba+t$/
     4) Write a Perl regular expression that matches the strings: “hog”, “Hog”, “hOg”, “HOG”, “hOG”, etc. (That is, “hog” written in any combination of uppercase or lowercase letters.)
         /^[hH][oO][Gg]$/
     5) Write a Perl regular expression that matches any positive number (with or without a decimal point). Hint #1: if there is a decimal point, there must be at least one digit following the decimal point. Hint #2: Since the dot “.” matches any character, you must use \. to match a decimal point.
         /^\d+(\.\d+)?$/

  4. Lab 1 Solution
     6) Write a Perl regular expression to match any integer that doesn’t end in 8.
         /^\d*[0-79]$/
     7) Write a Perl regular expression to match any line with exactly two words (or numbers) separated by any amount of whitespace (spaces or tabs). There may or may not be whitespace at the beginning or end of the line.
         /^\s*\w+\s+\w+\s*$/
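A quick way to check a lab answer is to run it against strings that should and should not match. The sketch below does this for the patterns from problems 3, 5, and 7; the test strings and the `%pattern`, `%cases`, and `%matched` names are made up for illustration and are not part of the lab key.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Patterns copied from problems 3, 5, and 7 of the lab key.
my %pattern = (
    'b, one or more a, t' => qr/^ba+t$/,
    'positive number'     => qr/^\d+(\.\d+)?$/,
    'exactly two words'   => qr/^\s*\w+\s+\w+\s*$/,
);

# Hypothetical test strings: the first two in each list should match,
# the last two should not.
my %cases = (
    'b, one or more a, t' => [ 'bat', 'baaat', 'bt', 'bath' ],
    'positive number'     => [ '42', '3.14', '3.', '.5' ],
    'exactly two words'   => [ 'hello world', '  a   1  ', 'one', 'a b c' ],
);

my %matched;    # "$name|$string" => 1 (match) or 0 (no match)
for my $name (sort keys %pattern) {
    for my $string (@{ $cases{$name} }) {
        my $hit = ($string =~ $pattern{$name}) ? 1 : 0;
        $matched{"$name|$string"} = $hit;
        printf "%-22s %-14s %s\n", $name, "'$string'",
            $hit ? 'match' : 'no match';
    }
}
```

Including strings that should fail is what exposes a pattern that is too permissive, so each list deliberately contains near misses.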

  5. Program 2 Discussions
     • Questions on Program 2?
     • Discussions on the permute function

  6. Scoping: Be Careful With Scope

         #!/usr/bin/perl
         use strict;
         use warnings;
         my $x = 23;
         print "value in main body is $x \n";
         mysub($x);
         print "value in main body is $x \n";
         exit;
         sub mysub{
             print "value in subroutine is $x \n";
             $x=33;
         }

     Output:

         value in main body is 23
         value in subroutine is 23
         value in main body is 33

         #!/usr/bin/perl
         use strict;
         use warnings;
         {
             my $x = 23;
             print "value in main body is $x \n";
             mysub($x);
             print "value in main body is $x \n";
             exit;
         }
         sub mysub{
             print "value in subroutine is $x \n";
             $x=33;
         }

     This will not compile: $x is now lexically scoped to the block, so it is not visible inside mysub and use strict rejects it.

  7. Scoping: Be Careful With Scope (cont.)

         #!/usr/bin/perl
         use strict;
         use warnings;
         {
             my $x = 23;
             print "value in main body is $x \n";
             mysub($x);
             print "value in main body is $x \n";
             exit;
         }
         sub mysub{
             my($x) = @_;
             $x=33;
             print "value in subroutine is $x \n";
         }

     Output:

         value in main body is 23
         value in subroutine is 33
         value in main body is 23

  8. Algorithm Efficiency: Data Structures and Algorithm Efficiency
     Algorithm is O(N^2), where N = size of the lists

         # An inefficient way to compute intersections
         my @a = qw/ A B C D E F G H I J K X Y Z /;
         my @b = qw/ Q R S A C D T U G H V I J K X Z /;
         my @intersection = ();
         for my $i (@a) {
             for my $j (@b) {
                 if ($i eq $j) {
                     push @intersection, $i;
                     last;
                 }
             }
         }
         print "@intersection\n";
         exit;

     Output:

         A C D G H I J K X Z

  9. Algorithm Efficiency: Data Structures and Algorithm Efficiency
     Algorithm is O(N), where N = size of the lists (version 1: nested loops; version 2: hash lookup)

         # A better way to compute intersections
         my @a = qw/ A B C D E F G H I J K X Y Z /;
         my @b = qw/ Q R S A C D T U G H V I J K X Z /;
         my @intersection = ();
         # "mark" each item in @a
         my %mark = ();
         for my $i (@a) { $mark{$i} = 1 }
         # intersection = any "marked" item in @b
         for my $j (@b) {
             if (exists $mark{$j}) {
                 push @intersection, $j;
             }
         }
         print "@intersection\n";
         exit;

     Output:

         A C D G H I J K X Z

  10. Algorithm Efficiency: Demonstration
      • Unix commands:
        • /usr/bin/time
        • head
        • diff
        • cmp

          % wc -l list1 list2
             24762 list1
             12381 list2
             37143 total
          % /usr/bin/time intersect1.pl list1 list2 > out1
                 22.91 real        22.88 user         0.02 sys
          % /usr/bin/time intersect2.pl list1 list2 > out2
                  0.06 real         0.05 user         0.00 sys

      Speedup in user time: 22.88 / 0.05 ≈ 458
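The same comparison can be approximated inside a single script with the core Benchmark module. This is a rough sketch, not the lecture's intersect1.pl and intersect2.pl: the subroutine names `intersect_nested` and `intersect_hash` are hypothetical, and the input lists are generated integers rather than the list1 and list2 files. Timings will vary by machine.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Made-up input: 1000 integers and the 500 even numbers among them.
my @list1 = (1 .. 1000);
my @list2 = map { 2 * $_ } (1 .. 500);

# Nested loops: compares every item of the first list with the second.
sub intersect_nested {
    my ($first, $second) = @_;
    my @common;
    for my $i (@$first) {
        for my $j (@$second) {
            if ($i eq $j) { push @common, $i; last; }
        }
    }
    return @common;
}

# Hash lookup: mark the first list, then scan the second list once.
sub intersect_hash {
    my ($first, $second) = @_;
    my %mark;
    $mark{$_} = 1 for @$first;
    return grep { exists $mark{$_} } @$second;
}

# Check that both versions agree before timing them.
my @r1 = intersect_nested(\@list1, \@list2);
my @r2 = intersect_hash(\@list1, \@list2);
print "both versions return ", scalar(@r1), " items\n" if "@r1" eq "@r2";

# Run each version 10 times and report the elapsed CPU time.
timethese(10, {
    nested => sub { my @r = intersect_nested(\@list1, \@list2) },
    hash   => sub { my @r = intersect_hash(\@list1, \@list2) },
});
```

Benchmark may warn that the hash version ran too few iterations for a reliable count, which is itself a sign of how much faster it is.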

  11. Hashes: Hashes and Efficiency
      • Hashes provide a very fast way to look up information associated with a set of scalar values (keys)
      • Examples:
        • Count how many times each word appears in a file
          • Also: whether or not a certain word appeared in a file
        • Count how many times each codon appears in a DNA sequence
          • Whether a given codon appears in a sequence
        • How many times an item appears in a given list
        • Intersections
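As an illustration of the codon-counting example, the following sketch tallies codons with a hash. The sequence and the variable names `$dna` and `%count` are made up for the example; it assumes the reading frame starts at the first base.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $dna = 'ATGGCCATGGCGTAA';    # made-up sequence: ATG GCC ATG GCG TAA

# Count how many times each codon appears.
my %count;
for (my $i = 0; $i + 3 <= length $dna; $i += 3) {
    my $codon = substr($dna, $i, 3);
    $count{$codon}++;
}

# Report the counts in alphabetical order of codon.
for my $codon (sort keys %count) {
    print "$codon\t$count{$codon}\n";
}

# Membership test: did a given codon appear at all?
print "GCC appears in the sequence\n" if exists $count{GCC};
print "TTT does not appear\n"         unless exists $count{TTT};
```

Each lookup or increment takes roughly constant time regardless of how many distinct codons have been seen, which is why the same pattern works for words in a file or items in a list.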

  12. Hashes: Examples
      • Write a subroutine get_intersection(\@a, \@b) that returns the intersection of two lists.
      • Write a subroutine first_list_only(\@a, \@b) that returns the items that are in list @a but not in @b.
      • Write a subroutine unique(@a) that returns the unique items in list @a (that is, removes the duplicates).
      • Write a subroutine dups($n, @a) that returns a list of items that appear in @a at least $n times.
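One possible set of answers to these exercises is sketched below. The subroutine names and argument conventions come from the exercise statements; everything else (the helper hash names, the sample lists, and the choice to return each duplicate only once from `dups`) is an assumption made for illustration, not the course's official solution.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Items of @$second that also occur in @$first.
sub get_intersection {
    my ($first, $second) = @_;
    my %in_first = map { $_ => 1 } @$first;
    return grep { $in_first{$_} } @$second;
}

# Items of @$first that do not occur in @$second.
sub first_list_only {
    my ($first, $second) = @_;
    my %in_second = map { $_ => 1 } @$second;
    return grep { !$in_second{$_} } @$first;
}

# Each distinct item once, in order of first appearance.
sub unique {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Distinct items that appear at least $n times.
sub dups {
    my ($n, @items) = @_;
    my %count;
    $count{$_}++ for @items;
    return grep { $count{$_} >= $n } unique(@items);
}

my @list_a = qw/ A B C D E F G H I J K X Y Z /;
my @list_b = qw/ Q R S A C D T U G H V I J K X Z /;
my @bases  = qw/ A C A G C A /;

my @both   = get_intersection(\@list_a, \@list_b);
my @only_a = first_list_only(\@list_a, \@list_b);
my @uniq   = unique(@bases);
my @twice  = dups(2, @bases);

print "intersection: @both\n";      # A C D G H I J K X Z
print "first only:   @only_a\n";    # B E F Y
print "unique:       @uniq\n";      # A C G
print "dups (>= 2):  @twice\n";     # A C
```

All four subroutines rely on the same idea: build a hash in one pass, then answer membership or count questions with constant-time lookups.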

  13. Sorting
      • sort LIST -- returns list sorted in string order
      • sort BLOCK LIST -- compares according to BLOCK
      • sort USERSUB LIST -- compares according to subroutine USERSUB

  14. Sorting: Our First Attempt

          #!/usr/bin/perl
          use strict;
          use warnings;
          {
              my(@unsorted) = (17, 8, 2, 111);
              my(@sorted) = sort @unsorted;
              print "@unsorted \n";
              print "@sorted \n";
              exit;
          }

      Output:

          17 8 2 111
          111 17 2 8

  15. Sorting: The Comparison Operator
      1. $a <=> $b returns 0 if equal, 1 if $a > $b, -1 if $a < $b
      2. The "cmp" operator gives similar results for strings
      3. $a and $b are special global variables: do NOT declare with "my" and do NOT modify.
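The return values of the two operators can be inspected directly, and because both return 0 on a tie they can be chained with `or` to sort on a second key. The values and the `@words` list below are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# <=> compares numerically, cmp compares as strings.
my @numeric = (2 <=> 10, 10 <=> 2, 5 <=> 5);
my @string  = ('2' cmp '10', 'abc' cmp 'abd', 'x' cmp 'x');
print "@numeric\n";    # -1 1 0
print "@string\n";     # 1 -1 0  ('2' sorts after '10' as a string)

# Chaining: sort by length first, then alphabetically among equal lengths.
my @words  = qw/ TTTT GT AC CTCAT /;
my @sorted = sort { length($a) <=> length($b) or $a cmp $b } @words;
print "@sorted\n";     # AC GT TTTT CTCAT
```

The `or` only evaluates the second comparison when the first returns 0, which is what makes it act as a tie-breaker.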

  16. Sorting: Sorting Numerically

          #!/usr/bin/perl
          use strict;
          use warnings;
          {
              my(@unsorted) = (17, 8, 2, 111);
              my(@sorted) = sort { $a <=> $b } @unsorted;
              print "@unsorted \n";
              print "@sorted \n";
              exit;
          }

      Output:

          17 8 2 111
          2 8 17 111

  17. Sorting: Sorting Using a Subroutine

          #!/usr/bin/perl
          use strict;
          use warnings;
          {
              my(@unsorted) = (17, 8, 2, 111);
              my(@sorted) = sort numerically @unsorted;
              print "@unsorted \n";
              print "@sorted \n";
              exit;
          }
          sub numerically { $a <=> $b }

      Output:

          17 8 2 111
          2 8 17 111

  18. Sorting: Sorting Descending

          #!/usr/bin/perl
          use strict;
          use warnings;
          {
              my(@unsorted) = (17, 8, 2, 111);
              my(@reversesorted) = reverse sort numerically @unsorted;
              print "@unsorted \n";
              print "@reversesorted \n";
              exit;
          }
          sub numerically { $a <=> $b }

      Output:

          17 8 2 111
          111 17 8 2

  19. Sorting: Sorting DNA by Length

          #!/usr/bin/perl
          use strict;
          use warnings;
          {
              # Sorting strings:
              my @dna = qw/ TATAATG TTTT GT CTCAT /;
              ## Sort @dna by length:
              @dna = sort { length($a) <=> length($b) } @dna;
              print "@dna\n"; # Output: GT TTTT CTCAT TATAATG
              exit;
          }

      Output:

          GT TTTT CTCAT TATAATG

  20. Sorting: Sorting DNA by Number of T’s (Largest First)

          #!/usr/bin/perl
          use strict;
          use warnings;
          {
              # Sorting strings:
              my @dna = qw/ TATAATG TTTT GT CTCAT /;
              @dna = sort { ($b =~ tr/Tt//) <=> ($a =~ tr/Tt//) } @dna;
              print "@dna\n"; # Output: TTTT TATAATG CTCAT GT
              exit;
          }

      Output:

          TTTT TATAATG CTCAT GT

  21. Sorting: Sorting DNA by Number of T’s (Largest First) (Take 2)

          #!/usr/bin/perl
          use strict;
          use warnings;
          {
              # Sorting strings:
              my @dna = qw/ TATAATG TTTT GT CTCAT /;
              @dna = reverse sort { ($a =~ tr/Tt//) <=> ($b =~ tr/Tt//) } @dna;
              print "@dna\n"; # Output: TTTT TATAATG CTCAT GT
              exit;
          }

      Output:

          TTTT TATAATG CTCAT GT

  22. Sorting: Sorting Strings Without Regard to Case

          #!/usr/bin/perl
          use strict;
          use warnings;
          {
              # Sort strings without regard to case:
              my(@unsorted) = qw/ mouse Rat HUMAN eColi /;
              my(@sorted) = sort { lc($a) cmp lc($b) } @unsorted;
              print "@unsorted \n";
              print "@sorted \n";
              exit;
          }

      Output:

          mouse Rat HUMAN eColi
          eColi HUMAN mouse Rat

  23. Sorting: Sorting Hashes by Value

          #!/usr/bin/perl
          use strict;
          use warnings;
          {
              my(%sales_amount) = ( auto=>100, kitchen=>2000, hardware=>200 );
              sub bysales { $sales_amount{$b} <=> $sales_amount{$a} }
              for my $dept (sort bysales keys %sales_amount) {
                  printf "%s:\t%4d\n", $dept, $sales_amount{$dept};
              }
              exit;
          }

      Output (the gap after each colon is a tab):

          kitchen:        2000
          hardware:        200
          auto:    100

  24. Review for Midterm: BINF634 Midterm
      • Material
        • Tisdall Chapters 1-9
        • Wall Chapter 5
        • Lecture notes
      • The exam will be open book and notes
      • You cannot work together on it
      • You cannot use outside material
      • You will have the full period to take the midterm
      • You will be asked to program

  25. Midterm: Some Example Questions
      • Given two DNA fragments contained in $DNA1 and $DNA2, how can we concatenate these to make a third string $DNA3?

  26. Midterm: Some Example Questions
      • What does this line of code do?
          $RNA =~ s/T/U/ig;

  27. Midterm: Some Example Questions
      • What does this statement do?
          $revcom =~ tr/ACGT/TGCA/;
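The three string operations in these questions (concatenation, substitution, and transliteration) can be tried together in a short script. The sequences are made up, and reversing the string before applying tr/// is an assumption about how $revcom would normally be prepared; the question itself only shows the tr/// step.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $DNA1 = 'ACGGGA';    # made-up fragments
my $DNA2 = 'TTCATG';

# Concatenation with the dot operator (interpolating "$DNA1$DNA2" also works).
my $DNA3 = $DNA1 . $DNA2;
print "$DNA3\n";        # ACGGGATTCATG

# Transcription: copy the DNA, then replace every T or t with U.
# /g replaces all occurrences; /i also matches lowercase t.
my $RNA = $DNA3;
$RNA =~ s/T/U/ig;
print "$RNA\n";         # ACGGGAUUCAUG

# Reverse complement: reverse the string, then swap A<->T and C<->G.
my $revcom = reverse $DNA3;
$revcom =~ tr/ACGT/TGCA/;
print "$revcom\n";      # CATGAATCCCGT
```

Note that s/T/U/ig turns a lowercase t into an uppercase U, and that tr/ACGT/TGCA/ leaves lowercase bases untouched.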

  28. Midterm: Some Example Questions
      • What do these four lines do?
          @bases = ('A', 'C', 'G', 'T');
          $base1 = pop @bases;
          unshift (@bases, $base1);
          print "@bases\n\n";
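Tracing the array after each statement makes the effect visible. The sketch below is the same four lines with `my` declarations added so that it runs under use strict, plus comments showing the array contents.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @bases = ('A', 'C', 'G', 'T');
my $base1 = pop @bases;      # removes the last element: $base1 is 'T', @bases is A C G
unshift(@bases, $base1);     # puts it at the front: @bases is T A C G
print "@bases\n\n";          # prints "T A C G" followed by a blank line
```

Together, pop and unshift rotate the array by one position to the right.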

  29. Midterm: Some Example Questions
      • What does this code snippet do if COND is true?
          unless(COND){
              #do something
          }

  30. Midterm: Some Example Questions
      • What does this code fragment do?
          $protein = join('',@protein);

  31. Midterm: Some Example Questions
      • What does this code fragment do?
          $myfile = "myfile";
          open(MYFILE, ">$myfile");

  32. Midterm: Some Example Questions
      • What does this code fragment do?
          while($DNA =~ /a/ig){$a++}
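A runnable variant of this fragment is sketched below with a made-up sequence. It uses a counter named `$count` instead of `$a`, since `$a` is one of the special variables used by sort, and it shows tr/// as an alternative way to get the same count.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $DNA = 'AaccgTAa';    # made-up sequence with four a/A characters

# In scalar context, /g makes each match resume where the previous one
# ended, so the loop body runs once per match.
my $count = 0;
while ($DNA =~ /a/ig) { $count++ }
print "regex count: $count\n";    # 4

# tr/// with an empty replacement list counts characters without changing them.
my $tr_count = ($DNA =~ tr/Aa//);
print "tr count:    $tr_count\n"; # 4
```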

  33. Midterm: Some Example Questions
      • What is the effect of using the command use strict; at the beginning of your program?

  34. Midterm: Some Example Questions
      • What is contained in the reserved variable $0 and in the array @ARGV?

  35. Midterm: Some Example Questions
      • What is the difference between “pass by value” and “pass by reference”?
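A small contrast between the two styles is sketched below. The subroutine names `by_value` and `by_reference` are hypothetical. In Perl the elements of @_ are aliases to the caller's arguments, so copying them into `my` variables is what gives by-value behavior.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pass by value: the subroutine works on a private copy.
sub by_value {
    my ($n) = @_;        # copy the argument
    $n *= 2;             # changes only the copy
    return $n;
}

# Pass by reference: the subroutine receives a reference
# and changes the caller's variable through it.
sub by_reference {
    my ($n_ref) = @_;
    $$n_ref *= 2;        # dereference and modify the original
    return;
}

my $x = 23;
my $doubled = by_value($x);
print "after by_value:     x = $x, returned $doubled\n";    # x = 23, returned 46

by_reference(\$x);
print "after by_reference: x = $x\n";                       # x = 46
```

References are also the usual way to pass two arrays to one subroutine without having them flattened into a single list.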

  36. Midterm: Some Example Questions
      • What is a pointer and what does it mean to dereference a pointer?
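In Perl the corresponding construct is called a reference. The sketch below creates references to an array and a hash and dereferences them; the data and variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @codons = qw/ ATG GCC TAA /;
my %amino  = (ATG => 'M', GCC => 'A');

# The backslash operator creates a reference to an existing variable.
my $array_ref = \@codons;
my $hash_ref  = \%amino;

# Dereferencing follows the reference back to the underlying data.
my $first  = $array_ref->[0];       # 'ATG' (same as $$array_ref[0])
my $length = scalar @$array_ref;    # 3
my $met    = $hash_ref->{ATG};      # 'M'

# Changes made through the reference affect the original array.
push @$array_ref, 'TGA';
print "first=$first length=$length met=$met\n";
print "codons now: @codons\n";      # ATG GCC TAA TGA
```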

  37. Midterm: Some Example Questions
      • How do you invoke perl with the debugger?

  38. Midterm: Some Example Questions
      • Given an array @verbs, what is going on here?
          $verbs[rand @verbs]
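The expression can be unpacked step by step. The contents of `@verbs` below are made up, and the chosen element differs from run to run.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @verbs = qw/ bind cleave fold splice /;

# rand evaluates @verbs in scalar context (its length, 4) and returns
# a fractional number from 0 up to, but not including, 4.
my $fraction = rand @verbs;

# An array subscript is truncated to an integer, so the index is 0..3.
my $index = int $fraction;

# Putting it together: pick one element at random.
my $verb = $verbs[rand @verbs];
print "random verb: $verb\n";
```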

  39. For the Curious Regarding Data Structures and Their Implications
      • Niklaus Wirth, Algorithms + Data Structures = Programs, Prentice Hall, 1976.
        • Dated in terms of language (Pascal), but very well written and understandable
