Subroutines

Q: Why do we need subroutines?
A: Because we are lazy!

Repeatedly typing out the code for a chore that is used over and over again (or even only a few times) would waste time and space, and would make the code hard to read.

Subroutines: Defining a subroutine

Minimally, all we need is a statement block of Perl code that we have given a name!

    sub Dont_do_much {
        # any code you like!!
    }

Notes on the definition:
- Capital letters are OK in the name.
- No semicolon is required after the closing brace.

Once defined, it can be called just by invoking its name:

    Dont_do_much();

Subroutines: Declaring a subroutine

If we want to call a subroutine without parentheses, Perl must have at least seen a declaration of it before the call. As long as this has been done, it is OK to define it later. (A call written with parentheses, as below, works even without the declaration.)

    sub Dont_do_much;        # Declaration

    Dont_do_much();          # Subroutine call

    # Rest of your program ...

    sub Dont_do_much {       # Definition
        # any code you like!!
    }
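A minimal sketch of what the forward declaration buys us. The subroutine name say_hello and its message are made up for illustration; the point is only that, once declared, the subroutine can be called without parentheses before its definition appears.

```perl
use strict;
use warnings;

# Forward declaration: tells Perl that say_hello is a subroutine.
sub say_hello;

# Thanks to the declaration above, this call works without parentheses.
my $greeting = say_hello;
print "$greeting\n";

# The definition can come later in the file.
sub say_hello {
    return "Hello from a subroutine";
}
```

Without the `sub say_hello;` line, the bare word `say_hello` would be rejected under `use strict`.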

Subroutines: Passing parameters

Arguments are expressions that are passed to the subroutine. @_ is the special built-in array variable that contains the passed parameters.

    sub expand_name {
        my $amino_acid = $_[0];      # first element of @_
        my %convert = ("R" => "Arg", "A" => "Ala");   # etc.
        if (exists $convert{$amino_acid}) {
            print "$amino_acid is $convert{$amino_acid}!\n";
        }
    }

    expand_name("R");

Output: R is Arg!
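When a subroutine takes more than one argument, a common idiom is to copy all of @_ into named variables in one step. The sketch below assumes a hypothetical helper, describe_residue, that is not part of the slides.

```perl
use strict;
use warnings;

# describe_residue is a hypothetical helper used only for illustration.
sub describe_residue {
    my $count = scalar @_;                    # number of arguments passed
    my ($one_letter, $three_letter) = @_;     # copy the arguments out of @_
    return sprintf("%s is %s (%d arguments)", $one_letter, $three_letter, $count);
}

my $line = describe_residue("R", "Arg");
print "$line\n";    # prints: R is Arg (2 arguments)
```

Copying into `my` variables keeps the subroutine from accidentally modifying the caller's values through @_.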

Subroutines: Return values

Often the subroutine performs a computation based on the arguments and returns a result. We can return a scalar or a list (an array or a hash is returned as a flat list).

    sub expand_name {
        my $amino_acid = $_[0];
        my %convert = ("R" => "Arg", "A" => "Ala");   # etc.
        my $result = $convert{$amino_acid};
        return $result;
    }

    ############ Main Code #########
    my $variable = expand_name("R");
    print "R is $variable!\n";

Output: R is Arg!
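Returning a list rather than a single scalar might look like the following sketch. The subroutine name_parts and its two lookup hashes are assumptions made for illustration; the caller unpacks the returned list into two variables.

```perl
use strict;
use warnings;

# name_parts is a hypothetical subroutine: it returns a two-element list.
sub name_parts {
    my ($code) = @_;
    my %three_letter = ("E" => "Glu", "K" => "Lys");
    my %full_name    = ("E" => "Glutamate", "K" => "Lysine");
    return ($three_letter{$code}, $full_name{$code});
}

# List assignment puts each returned element into its own variable.
my ($abbrev, $name) = name_parts("K");
print "K is $abbrev is $name\n";    # prints: K is Lys is Lysine
```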

Subroutines: In-class exercise #1

Download subroutine1.pl
- Modify the program to accept input amino acids from the command line.
- Modify the program to process MULTIPLE amino acids from the command line.
- Hint: remember the @ARGV built-in array!
- Hint: you can iterate over an array with foreach.
- If you get stuck, subroutine2.pl shows one approach that works.

When ready to go on: Download subroutine2.pl
- Modify the program to return a list of results and print out its contents.
- If you get stuck, subroutine3.pl shows how I did it.
- The push command may be helpful to add a new element to an array:

    push (@some_existing_array, $new_element);

References: Why do we need them?

Example: Often we want to keep track of multiple pieces of information about an object, e.g.
- K is Lys is Lysine, and is encoded by AAA or AAG
- D is Asp is Aspartate, and is encoded by GAU or GAC
- Etc.

Problem: Hashes and lists can only be populated with scalars!

References: Why do we need them? (continued)

What we need here is a way to make a "hash of lists".

[Diagram: a hash with keys E, L, V, I, S; key E points to the list (Glu, Glutamate)]

References: What are they?

References are essentially the memory location of a scalar, list, hash or subroutine. The key point is that references (and therefore memory addresses) are plain old scalar values and can therefore be stored in a hash or in a list.

    my @E_array = ("Glu", "Glutamate");
    my @L_array = ("Leu", "Leucine");

    # The \ indicates we want the memory location, not the value stored at it!
    my $reference = \@E_array;

    # Since references are scalars, it is OK to make a hash of references to lists.
    my %aa_hash = ("E" => \@E_array,
                   "L" => \@L_array);   # etc.

We can even anonymize the list, so that it has no name. This is an alternative way to build the same hash:

    my %aa_hash = ("E" => ["Glu", "Glutamate"],
                   "L" => ["Leu", "Leucine"]);

References: What are they? (continued)

We can access the data stored at the address pointed to by the reference by simply "dereferencing" it. The { } dereference the pointer.

    my %aa_code = ("E" => ["Glu", "Glutamate"],
                   "L" => ["Leu", "Leucine"]);

    print "$aa_code{'E'} is an address\n";
    print "@{$aa_code{'E'}} is a list\n";
    print "${$aa_code{'E'}}[0] is a list element\n";

Output (the exact address will vary):

    ARRAY(0x223f74) is an address
    Glu Glutamate is a list
    Glu is a list element
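Perl also offers the arrow operator as a shorter way to dereference. The sketch below is an illustration rather than part of the original slides; the variable names and the "K" entry (taken from the Lysine example earlier) are chosen for the example.

```perl
use strict;
use warnings;

my %aa_code = ("E" => ["Glu", "Glutamate"],
               "K" => ["Lys", "Lysine"]);

my $list_ref = $aa_code{"E"};          # a scalar holding an array reference

my $kind   = ref $list_ref;            # "ARRAY": what the reference points to
my $first  = $list_ref->[0];           # arrow syntax: element 0 of the list
my $second = $aa_code{"E"}->[1];       # the same, straight from the hash
my $size   = scalar @{$list_ref};      # @{ } gives the whole list; here its length

print "$kind reference with $size elements: $first, $second\n";
```

`$list_ref->[0]` and `${$list_ref}[0]` reach the same element; the arrow form is usually easier to read.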

References: In-class exercise #2

Download references1.pl
- Examine the code and make sure that you understand how referencing and dereferencing work. This is an example of using a reference to an array within a hash.
- Uncomment some of the print statements to see how to access different levels of the data structure.

Download convert.txt
- Use this code structure as a starting point for adding the ability to print all of the possible anticodon sequences for each one-letter amino acid code.
- If you get stuck, download references2.pl to see my approach to this.
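
References: Putting them to work

A list within a list within a hash is one way to get what we need.

[Diagram: a hash with keys E, L, V, I, S; key E points to the list (Glu, Glutamate, *), whose third element points to a second list (GAA, GAG)]

To get at an element of the inner list (here GAG) we might use something like the following:

    $element = $hash{'E'}[2][1];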
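The structure in the diagram can be sketched with nested anonymous lists. Only the entry for E is filled in here, using the values shown on the slide; the variable @codons is an addition for illustration.

```perl
use strict;
use warnings;

# A hash whose value is a list, whose third element is itself a list.
my %hash = ("E" => ["Glu", "Glutamate", ["GAA", "GAG"]]);

my $element = $hash{'E'}[2][1];       # third item of the outer list, second codon
my @codons  = @{ $hash{'E'}[2] };     # dereference the whole inner list

print "$element\n";                   # prints: GAG
print "@codons\n";                    # prints: GAA GAG
```

Between adjacent subscripts the arrow is optional, so `$hash{'E'}[2][1]` means the same as `$hash{'E'}->[2]->[1]`.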

Working with Files: Opening a file

Built-in filehandles (not much choice):
- <STDIN> - standard input (keyboard)
- <STDOUT> - standard output (screen)
- <STDERR> - standard error (screen)

e.g.

    $dna = <STDIN>;

Waits for keyboard input, and puts the result in $dna.

Working with Files: Opening a file (continued)

Opening your own filehandle:

    open SEQ_FILE, $filename or die $!;

Notes:
- SEQ_FILE: the filehandle takes no "my" and is normally written in all caps.
- $!: the special variable for passing errors.

Working with Files: Reading whole files into an array

    open SEQ_FILE, $filename or die $!;
    my @sequences;
    @sequences = <SEQ_FILE>;

Can take an awful lot of memory!!

Working with Files: Reading lines

    open SEQ_FILE, $filename or die $!;
    my $linenum = 1;
    while (<SEQ_FILE>) {      # filehandle in diamonds acts just like <STDIN>
        print "$_";           # $_ is the same as the default iterator
        $linenum++;
    }
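The loop above keeps a line counter but never prints it. The sketch below puts the counter to use. It swaps in two things that are not on the slides: a lexical filehandle ($seq_fh) with three-argument open, which is a common alternative to the bareword handle, and a throwaway temporary file with made-up sequences so the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small temporary input file (the contents are invented sample data).
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} "ATGGCC\nTTGACA\n";
close($tmp_fh) or die "Cannot close $filename: $!";

# Read the file back one line at a time, numbering each line.
open(my $seq_fh, "<", $filename) or die "Cannot open $filename: $!";
my $linenum = 0;
my @numbered;
while (my $line = <$seq_fh>) {
    chomp $line;                      # remove the trailing newline
    $linenum++;
    push @numbered, sprintf("%d: %s", $linenum, $line);
}
close($seq_fh);

print "$_\n" for @numbered;
```

Reading line by line keeps only one line in memory at a time, unlike slurping the whole file into an array.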

Working with Files: Writing to files

We can simply redirect output from <STDOUT> to whatever filehandle we choose!

    open SEQ_IN, $in_seq or die $!;
    open SEQ_OUT, ">$out_seq" or die $!;    # > indicates we wish to write

    # Do a line-wise copy of the file
    while (<SEQ_IN>) {
        print SEQ_OUT $_;                   # redirecting from <STDOUT> to SEQ_OUT
    }
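A small self-contained sketch of writing to a file and checking the result. It assumes a temporary directory and a made-up file name (copy.txt) and sample lines, and it uses lexical filehandles with three-argument open rather than the bareword handles shown above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a temporary directory that is removed when the program exits.
my $dir     = tempdir(CLEANUP => 1);
my $out_seq = "$dir/copy.txt";          # hypothetical output file name

# ">" opens for writing: the file is created, or emptied if it already exists.
open(my $out_fh, ">", $out_seq) or die "Cannot write $out_seq: $!";
print {$out_fh} "ATGGCC\n";             # goes to the file, not the screen
print {$out_fh} "TTGACA\n";
close($out_fh) or die "Cannot close $out_seq: $!";

# Read the file back to confirm what was written.
open(my $in_fh, "<", $out_seq) or die "Cannot read $out_seq: $!";
my @lines = <$in_fh>;
close($in_fh);

my $count = scalar @lines;
print "Wrote $count lines\n";
```

Closing the output handle before reading the file back makes sure buffered output has actually reached the disk.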

Basic File Handling: In-class exercise #3

Download inputSeq.txt
- Modify References3.pl (or your own code) to read its input from inputSeq.txt and output to both the screen AND to a file called output.txt.