Perl: subroutines
Hash Example • Codons • Use stdin/file re-direction
Outline • String operators and sorting • variable scope revisited • Subroutines
#!/usr/bin/perl
# guessNum.pl
#
# Example to show STDIN and STDOUT.
# This will take anything you type, and print it back
# to the screen.
#
# Note that there is a blank line between every line.
# This is because STDIN gets read when you hit the "ENTER" key.
# This gets stored as a new-line in the "$line" variable.
#
# Then, a new-line is printed after the new-line you just entered --
# therefore, you get 2 new-lines.
#
# We could either take the \n out of the print statement,
# or we could "chomp" $line.
#
# CHOMP
#
# Let's use STDIN to do something more interesting --
# a number guessing game.
#
# Don't get too uptight over this line -- it is just setting
# a "seed" for the rand() function with a value that approximates
# a random number. If you must know, it takes the process ID ($$),
# shifts its bits left 15 places, adds the process ID to the shifted
# value, then does a bit-wise XOR (^) with the current time().
#
srand(time() ^ ($$ + ($$ << 15)) );
$high = 100;
$low = 0;
$rand = int rand(101);    # Integer between 0 and 100 inclusive
print "Guess a number: \n";
while(chomp ($line = <STDIN>) && ($line != $rand)) {
    if($line =~ m/^\d+$|^-\d+$/) {    # A number has been entered
        if($line > $high) {
            print "Error: number has to be <= ".($high)."\n";
        }
        elsif($line < $low) {
            print "Error: number has to be >= ".($low)."\n";
        }
        elsif($line > $rand) {
            $high = $line;
            print "lower: between $low and $high inclusive\n";
        }
        else {
            $low = $line;
            print "higher: between $low and $high inclusive\n";
        }
    }
    else {
        print "Invalid characters: $line\nTry again.\n";
    }
}
# Since I am out of the loop, I must have guessed correctly
# (the loop also ends if input runs out before a correct guess)
print "Correct: The number was $rand\n";
Output
./guessNum.pl
Guess a number:
6
higher: between 6 and 100 inclusive
50
lower: between 6 and 50 inclusive
25
lower: between 6 and 25 inclusive
15
higher: between 15 and 25 inclusive
20
lower: between 15 and 20 inclusive
25
Error: number has to be <= 20
18
higher: between 18 and 20 inclusive
19
Correct: The number was 19
#!/usr/bin/perl
# randomSeq.pl
#
# This example is a modification of the "number guessing"
# example. I will modify the random number generation
# to generate nucleotide sequence.
#
# Don't get too uptight over this line -- it is just setting
# a "seed" for the rand() function with a value that approximates
# a random number. If you must know, it takes the process ID ($$),
# shifts its bits left 15 places, adds the process ID to the shifted
# value, then does a bit-wise XOR (^) with the current time().
#
srand(time() ^ ($$ + ($$ << 15)) );
print "Enter sequence length (nucleotides): ";
chomp($length = <STDIN>);    # length of sequence to generate -- from screen
if(!($length =~ m/^\d+$/)) {    # accept only a non-negative integer
    # $& is only set by a successful match, so report the input itself
    die ("Invalid input: $length\n");
}
while($length) {    # stay in loop until have generated enough sequence
    $rand = int rand(4);    # Integer between 0 and 3 inclusive
    if($rand == 0) {
        $newNT = 'A';
    }
    elsif($rand == 1) {
        $newNT = 'C';
    }
    elsif($rand == 2) {
        $newNT = 'G';
    }
    elsif($rand == 3) {
        $newNT = 'T';
    }
    else {
        die("Error: $rand out of bounds (0-3)\n");
    }
    $length = $length-1;    # decrease loop counter
    $seq = $seq . $newNT;   # keep the nucleotide I just created
}
# Since I am out of the loop, I must be done
print "$seq\n";
#!/usr/bin/perl
# randomSeqSub.pl
#
# This example is a modification of the "number guessing"
# example. I will modify the random number generation
# to generate nucleotide sequence.
#
# Don't get too uptight over this line -- it is just setting
# a "seed" for the rand() function with a value that approximates
# a random number. If you must know, it takes the process ID ($$),
# shifts its bits left 15 places, adds the process ID to the shifted
# value, then does a bit-wise XOR (^) with the current time().
#
srand(time() ^ ($$ + ($$ << 15)) );
print "Enter sequence length (nucleotides): ";
chomp($length = <STDIN>);    # length of sequence to generate -- from screen
if(!($length =~ m/^\d+$/)) {    # accept only a non-negative integer
    # $& is only set by a successful match, so report the input itself
    die ("Invalid input: $length\n");
}
while($length) {    # stay in loop until have generated enough sequence
    $rand = int rand(4);    # Integer between 0 and 3 inclusive
    # print "rand $rand\n";
    # Substitute each digit with a nucleotide letter
    $rand =~ s/0/A/;
    $rand =~ s/1/C/;
    $rand =~ s/2/G/;
    $rand =~ s/3/T/;
    # print "letter rand $rand\n";
    $length = $length-1;    # decrease loop counter
    $seq = $seq . $rand;    # keep the nucleotide I just created
}
# Since I am out of the loop, I must be done
print "$seq\n";
#!/usr/bin/perl
# randomSeqTR.pl
#
# This example is a modification of the "number guessing"
# example. I will modify the random number generation
# to generate nucleotide sequence.
#
# Don't get too uptight over this line -- it is just setting
# a "seed" for the rand() function with a value that approximates
# a random number. If you must know, it takes the process ID ($$),
# shifts its bits left 15 places, adds the process ID to the shifted
# value, then does a bit-wise XOR (^) with the current time().
#
srand(time() ^ ($$ + ($$ << 15)) );
print "Enter sequence length (nucleotides): ";
chomp($length = <STDIN>);    # length of sequence to generate -- from screen
if(!($length =~ m/^\d+$/)) {    # accept only a non-negative integer
    # $& is only set by a successful match, so report the input itself
    die ("Invalid input: $length\n");
}
while($length) {    # stay in loop until have generated enough sequence
    $rand = int rand(4);    # Integer between 0 and 3 inclusive
    $rand =~ tr/0123/ACTG/;    # transliterate the digit to a nucleotide letter
    $length = $length-1;    # decrease loop counter
    $seq = $seq . $rand;    # keep the nucleotide I just created
}
# Since I am out of the loop, I must be done
print "$seq\n";
Strings and Sorting
index(STRING, SUBSTRING, POSITION)
index(STRING, SUBSTRING)
-- returns the index of the first character of SUBSTRING within STRING
-- if it matches at the first character, it returns 0
-- on failure to match, it returns -1
-- may start the search at position POSITION (optional)
$stuff = "Hello world";
$pos = index($stuff, "wor");    # pos = 6
$pos = index($stuff, "o", 5);   # pos = 7
substr
substr(STRING, START, LENGTH)
substr(STRING, START)
-- returns the sub-string of STRING, beginning at location START, for LENGTH number of characters
-- omission of LENGTH returns the remaining string from START to the end of the string
$rock = substr("Fred Flintstone", 5, 4);     # rock = Flin
$rock = substr("Fred Flintstone", 10, 50);   # rock = stone
index and substr work well together
$long = "this is a very long string";
$found = substr($long, index($long, "o") );  # found = ong string
look at: perldoc -f substr
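As a minimal sketch of index and substr working together, the loop below uses the optional POSITION argument to find every occurrence of a substring, then pulls out the characters that follow the first hit. The sequence and the variable names ($seq, @hits, $next) are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $seq = "ATGCCATGGATG";    # made-up sequence for illustration
my @hits;                    # every position where "ATG" starts

# index returns -1 when there are no more matches
my $pos = index($seq, "ATG");
while ($pos != -1) {
    push @hits, $pos;
    $pos = index($seq, "ATG", $pos + 1);    # resume just past the last hit
}
print "ATG found at: @hits\n";

# The three characters that follow the first hit
my $next = substr($seq, $hits[0] + 3, 3);
print "After the first ATG: $next\n";
```

Restarting the search at $pos + 1 rather than $pos + 3 also catches overlapping matches, which matters for substrings that can overlap themselves.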
Usage of substr • substr/index functionality may be done with regular expressions • substr/index can be faster • no overhead of regexp's • never case insensitive • no metacharacters • no automatic memory variables
split
split /PATTERN/,EXPR,LIMIT
split /PATTERN/,EXPR
split /PATTERN/
split
Splits the string EXPR into a list of strings and returns that list. By default, empty leading fields are preserved, and empty trailing ones are deleted. (If all fields are empty, they are considered to be trailing.)
In scalar context, returns the number of fields found and splits into the @_ array. Use of split in scalar context is deprecated, however, because it clobbers your subroutine arguments.
If EXPR is omitted, splits the $_ string. If PATTERN is also omitted, splits on whitespace (after skipping any leading whitespace). Anything matching PATTERN is taken to be a delimiter separating the fields. (Note that the delimiter may be longer than one character.)
Split -- continued
split /PATTERN/,EXPR,LIMIT
split /PATTERN/,EXPR
split /PATTERN/
split
If LIMIT is specified and positive, it represents the maximum number of fields the EXPR will be split into, though the actual number of fields returned depends on the number of times PATTERN matches within EXPR.
If LIMIT is unspecified or zero, trailing null fields are stripped (which potential users of "pop" would do well to remember).
If LIMIT is negative, it is treated as if an arbitrarily large LIMIT had been specified.
Note that splitting an EXPR that evaluates to the empty string always returns the empty list, regardless of the LIMIT specified.
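The LIMIT rules are easier to see on a small case. The sketch below splits one made-up record (two trailing empty fields) with no LIMIT, a positive LIMIT, and a negative LIMIT, and also shows the whitespace form; the variable names are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $record = "a:b:c::";    # made-up record ending in two empty fields

my @default = split /:/, $record;        # trailing empty fields are dropped
my @limited = split /:/, $record, 2;     # at most 2 fields; the rest stays joined
my @all     = split /:/, $record, -1;    # negative LIMIT keeps trailing empties

print scalar(@default), " fields by default\n";
print scalar(@limited), " fields with LIMIT 2, last one is '$limited[1]'\n";
print scalar(@all),     " fields with LIMIT -1\n";

# A single-space string as the pattern splits on runs of whitespace
# and skips leading whitespace.
my @words = split ' ', "  two   words ";
print scalar(@words), " words: @words\n";
```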
Split example
$line = ">gi:25952121:ref:NM_033028.2: Homo saps (BBS4)\n";
($junk,$gi,$junk,$ref,$long,$short) = split /[:\(\)]/,$line;
print "gi=$gi ref=$ref long=$long short=$short\n";
Advanced Sorting
sort LIST allowed us to sort arrays lexicographically (by ASCII values)
How do we sort by other orders?
Use a sort subroutine
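A sort subroutine compares the two special variables $a and $b and returns a negative, zero, or positive value. The sketch below contrasts the default string sort with a numeric comparison and with a named sort subroutine; the data and the name by_length are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @lengths = (100, 9, 25, 1000);    # made-up numbers

my @default = sort @lengths;                # compared as strings
my @numeric = sort { $a <=> $b } @lengths;  # compared as numbers, ascending
my @reverse = sort { $b <=> $a } @lengths;  # swap $a and $b for descending

print "default: @default\n";    # default: 100 1000 25 9
print "numeric: @numeric\n";    # numeric: 9 25 100 1000
print "reverse: @reverse\n";    # reverse: 1000 100 25 9

# A named sort subroutine: shortest string first, ties broken alphabetically
sub by_length {
    return length($a) <=> length($b) || $a cmp $b;
}
my @seqs   = ("TTG", "A", "CC", "AT");
my @sorted = sort by_length @seqs;
print "by length: @sorted\n";   # by length: A AT CC TTG
```

Use <=> for numbers and cmp for strings; mixing them up is the usual cause of a sort that looks almost right.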
Subroutines
Name of a subroutine is a Perl identifier (letters, digits, and underscores – cannot begin with a digit).
Subroutine definition:
sub increment {
    $count++;    # global variable defined elsewhere
    print "Incrementing\n";
}
Subroutines • Subroutine definitions can be anywhere in your program • former C and Pascal programmers tend to put them in the front • others put them at the end so that the main program is at the front • subroutine definitions are global (are known everywhere in the program) • if you define a subroutine twice (bad practice since it typically means the programmer is confused), the later one will be used
Call a subroutine from a program
Invoke a subroutine from within ANY expression or location of your program with the subroutine name – typically preceded by an ampersand &
#!/usr/bin/perl
sub increment {
    $count++;    # global variable defined elsewhere
    print "Incrementing\n";
}
########### main program starts here
$count = 5;
&increment;          # Incrementing
print "$count\n";    # 6
Note how program execution proceeds from the main program, transfers to the subroutine, and then "returns" back to the main program
Return Values • All subroutines return a value • may be used • may be discarded/ignored (previous example) • the last expression evaluated is automatically returned unless a value is explicitly returned
Example
sub sum {
    print "sum subroutine\n";
    $one + $two;    # sum is returned
}
# I would consider this a "poor" programming practice since
# any modification to the subroutine after the last line would change
# the behavior
$one = 2;
$two = 4;
$c = &sum;
# The same subroutine after a line is added at the end:
sub sum {
    print "sum subroutine\n";
    $one + $two;
    print "two = $two \n";    # 1 is returned instead
}
Example
sub larger {
    if ($one > $two) {
        $one;
    }
    else {
        $two;
    }
}
Subroutines (functions, procedures) • A grouping of code (instructions) that typically performs some task, action (or "function") • A useful and effective technique for grouping and organizing programs into smaller and more manageable tasks • Hierarchical (a subroutine can call a subroutine, which can call a sub…)
Example • SimpleTranslate -- hash example
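The SimpleTranslate script itself is not reproduced in these slides. As a rough sketch of the idea only, and not the author's script, the code below uses a hash keyed on codons to translate a DNA string. The hash name %codon2aa, the deliberately partial codon table, and the use of 'X' for codons missing from it are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Partial codon table (assumption: a real table would list all 64 codons)
my %codon2aa = (
    ATG => 'M',                           # methionine (start)
    TGG => 'W',                           # tryptophan
    GCT => 'A', GCC => 'A',               # alanine
    AAA => 'K',                           # lysine
    TAA => '*', TAG => '*', TGA => '*',   # stop codons
);

my $dna     = "ATGGCTAAATGGTAA";    # made-up sequence
my $protein = '';

# Walk the sequence three nucleotides at a time
for (my $i = 0; $i + 3 <= length($dna); $i += 3) {
    my $codon = uc substr($dna, $i, 3);
    # 'X' marks a codon that is not in the partial table
    $protein .= exists $codon2aa{$codon} ? $codon2aa{$codon} : 'X';
}
print "$protein\n";
```

To read the sequence from standard input instead (for example with file re-direction, `script.pl < seq.txt`, both names hypothetical), replace the literal with `chomp(my $dna = <STDIN>);`.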
Good Programming Strategies for Subroutines • Bad form to always use "global" values in subroutines • Ideally, you would never use global variables in a subroutine • facilitates modular design • encourages code reusability • "Pass" arguments to the subroutine • review scope of variables
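Arguments passed to a subroutine arrive in the special array @_. The sketch below rewrites the earlier global-variable examples so that they take arguments and keep them in my variables; the names larger and sum echo the earlier slides, but these particular versions are illustrative assumptions, not code from the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the larger of two numbers; no global variables are touched
sub larger {
    my ($one, $two) = @_;    # copy the arguments into local variables
    return $one > $two ? $one : $two;
}

# Return the sum of any number of arguments
sub sum {
    my $total = 0;
    $total += $_ for @_;
    return $total;           # explicit return: later edits cannot change it by accident
}

my $big   = larger(2, 4);
my $total = sum(1, 2, 3);
print "larger: $big\n";      # larger: 4
print "sum: $total\n";       # sum: 6
```

Because each subroutine depends only on its arguments, it can be copied into another program unchanged, which is the reusability the slide is asking for.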
Variable Scope • global variables: variables that are accessible/visible from any part of a program • local variables: accessible to a limited portion of the program • ensures that variables are not unintentionally manipulated • Perl • variables are always global unless you specify otherwise • my $variable_name specifies a "local" variable • scope usually refers to "blocks of code" • also applies to subroutines • example – scope.pl
#!/usr/bin/perl
$i = 5;
print "i=$i\n";        # 5
{
    print "i=$i\n";    # 5
    my $i = 3;
    print "i=$i\n";    # 3
}
print "i=$i\n";        # 5
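The same rule applies inside a subroutine body, which is just another block. In the sketch below (the subroutine name bump_copy is hypothetical), a my variable inside the subroutine has the same name as a global one, and changing it leaves the global untouched.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $count = 5;    # global (package) variable

sub bump_copy {
    my $count = shift;    # local copy of the first argument; hides the global here
    $count++;
    return $count;
}

my $result = bump_copy($count);
print "inside: $result, outside: $count\n";    # inside: 6, outside: 5
```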