Bioinformatics Lecture 7: Introduction to Perl
Introduction
• Basic concepts in Perl syntax:
• Variables, strings, input and output
• Conditionals and iteration
• File handling and error handling
• Arrays, lists and hashes
First program
• A basic strings program: Test.pl
• #!/usr/bin/perl
• print "Hello boys and girls!\n this is introduction to perl";
• Open Notepad and type the above
• Save the file as hello.pl
• Ensure that the hide file extensions option is unchecked
• Run via the command line
Variable declarations
• $variable_name: scalars (integers, floats, strings)
• @array_name: arrays
• Arithmetic operators:
• +, -, *, /, ** (exponentiation), % (modulus)
• Double vs single quotation marks
• $x = 'I am from Cork';
• print "the value of $x is $x\n"; # both $x are replaced by the value
• print 'the value of $x is $x\n'; # nothing is replaced; even \n is printed literally
• print "the value of \$x is $x\n"; # note the \$x: the first $x stays literal
• # evaluating expressions in print (# is the comment symbol)
• $x = 15;
• print "the value of x is ", $x + 3, "\n"; (ArithmeticExample.pl)
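The quoting rules above can be tried out with a minimal sketch like the following. The sub name quoting_demo is made up for illustration and is not one of the lecture's example files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# quoting_demo is a hypothetical helper: it returns the three quoting
# variants from the slide for a given value.
sub quoting_demo {
    my ($x) = @_;
    my $double  = "the value of x is $x";     # $x is interpolated
    my $single  = 'the value of $x is $x';    # single quotes: kept literally
    my $escaped = "the value of \$x is $x";   # \$ keeps the first $x literal
    return ($double, $single, $escaped);
}

print "$_\n" for quoting_demo('I am from Cork');

# Expressions are evaluated when they sit outside the quotes.
print "the value of x is ", 15 + 3, "\n";

# Exponentiation and modulus from the operator list.
print 2 ** 10, " ", 17 % 5, "\n";
```

Adding use strict and use warnings is an addition to the lecture's style; with strict on, every variable has to be declared with my.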
Input, output and file handling
• Input
• $var = <>; (reads a line of text and assigns it to $var; the line includes the trailing newline character)
• chomp $var; removes the trailing newline from $var (chop, by contrast, removes the last character whatever it is)
• Alternatively: chomp($var = <>);
• $line = <DATA>; reads in "hard-coded data" (lines placed after a __DATA__ marker at the end of the script)
• Output
• print (already covered)
• File handling
• open MYFILE, 'data.txt'; (open file for reading)
• open MYFILE, '>data.txt'; (open file for writing)
• open MYFILE, '>>data.txt'; (open file for appending)
• $line = <MYFILE>; # read one line from the file
• @entire_file = <MYFILE>; # read the whole file into an array (called slurping)
• print MYFILE "Do you like computers….", $number/3, "\n"; # write out to the file
• close MYFILE;
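As a rough sketch of the three open modes, the following writes, appends to and then slurps a scratch file. It uses the three-argument form of open with lexical filehandles instead of the bareword MYFILE style on the slide; that swap, the sub name write_append_read and the temporary file are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write lines to a file, append one more,
# then read everything back into an array.
sub write_append_read {
    my ($path, @lines) = @_;

    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "$_\n" for @lines;
    close $out;

    open my $app, '>>', $path or die "Cannot append to $path: $!";
    print {$app} "appended line\n";
    close $app;

    open my $in, '<', $path or die "Cannot read $path: $!";
    my @entire_file = <$in>;    # slurp: one array element per line
    close $in;

    chomp @entire_file;         # remove the newline from every element
    return @entire_file;
}

# A scratch file that is deleted when the program exits.
my ($fh, $tmp) = tempfile(UNLINK => 1);
close $fh;
print join('|', write_append_read($tmp, 'first', 'second')), "\n";
```

Opening with '>' truncates the file, so the appended line only survives because the second open uses '>>'.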
Conditional operators (numeric)
• == Equality $a == $b
• != Not equal $a != $b
• < Less than $a < $b
• > Greater than $a > $b
• <= Less than or equal to $a <= $b
• >= Greater than or equal to $a >= $b
• ! Logical not $a = !$b
String conditional operators
• eq Equality $a eq $b
• ne Not equal $a ne $b
• lt Less than $a lt $b
• gt Greater than $a gt $b
• le Less than or equal to $a le $b
• ge Greater than or equal to $a ge $b
• . Concatenation $a . $c
• =~ Pattern match $a =~ /gatc/
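The numeric and string operators can give different answers for the same pair of values, which a short sketch makes visible. The sub name compare_both and the sample values are chosen here for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compare two values numerically (==) and as strings (eq).
sub compare_both {
    my ($left, $right) = @_;
    my $numeric = ($left == $right) ? 'equal' : 'different';
    my $string  = ($left eq $right) ? 'equal' : 'different';
    return ($numeric, $string);
}

# '1.0' and '1' are the same number but not the same string.
my ($n, $s) = compare_both('1.0', '1');
print "numeric: $n, string: $s\n";

# Numerically 10 is not less than 9, but as text '10' sorts before '9'.
print "10 < 9 is false\n"     unless 10 < 9;
print "'10' lt '9' is true\n" if '10' lt '9';

# Concatenation and pattern match.
my $seq = 'aa' . 'gatc' . 'tt';
print "found gatc in $seq\n" if $seq =~ /gatc/;
```

In practice this means using ==, < and > for counts and positions, and eq, lt and gt for sequence strings and identifiers.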
Conditional statements
• if, elsif and else (if_else.pl)
• #!/usr/bin/perl
• print "Enter your age: ";
• $age = <>;
• if ($age <= 0) {
• print "You are way too young to be using a computer.\n";
• }
• elsif ($age >= 100) # note the spelling: elsif, not elseif
• {
• print "Not in a dog's life!\n";
• } else
• {
• print "Your age in dog years is ", $age/7, "\n";
• }
Iteration: loops
• While-loops
• #!/usr/bin/perl
• $count = 1;
• while ($count <= 5) {
• print "$count potato\n";
• $count = $count + 1;
• }
• Until-loops
• #!/usr/bin/perl
• $count = 1;
• until ($count > 5) {
• print "$count potato\n";
• $count = $count + 1;
• }
Loops with defined (loops_defined.pl)
• #!/usr/bin/perl
• # defined() is true if $line was assigned a value; it is false at end of input
• print "Type something. 'quit' to finish\n ";
• while ( defined($line = <>) ) {
• chomp $line;
• last if $line eq 'quit'; # breaks out of the loop at quit
• print "You typed '$line'\n\n";
• print "Type something> ";
• }
• print "goodbye!\n";
Shorthand input notation
• #!/usr/bin/perl
• print "Type something. 'quit' to finish\n ";
• while (<>) {
• chomp; # works on $_, the default variable
• last if $_ eq 'quit';
• print "You typed '$_'\n\n";
• print "Type something> ";
• }
• print "goodbye!\n";
Changing standard input/output
• Redirect stdout to a file
• U:\test> test.pl > stdout.txt [produces a text file]
• Output from print goes to the file and not to the screen
• Run loops_defined.pl and redirect its output to a file
• The <> input operator has one special feature: if a file name is given on the command line it reads from that file; otherwise it reads from the keyboard
• U:\test> commandline.pl stdin.txt
Finding length of file
• #!/usr/bin/perl
• # File_size_1.pl
• $length = 0; # set length counter to zero
• $lines = 0; # set number of lines to zero
• print "enter text one line at a time and press (ctrl z) to quit\n";
• while (<>) { # read input one line at a time
• chomp; # remove terminal newline
• $length = $length + length $_;
• $lines = $lines + 1;
• }
• print "LENGTH = $length\n";
• print "LINES = $lines\n";
• Try it with the keyboard as stdin (Ctrl-Z ends the input) and with a file name on the command line
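Dynamic arrays
• Declaration of an array in Perl
• @sequences = ('123a', '23ed4', '2334d');
• The array contains 3 strings
• Array operations:
• $one_seq = $sequences[2]; (a single element takes the $ sigil; arrays are zero-based, so this is the third element)
• @seq = @sequences; copies one array to another
• @seq = (@seq, '125f'); adds a value to the end
• @combined = (@seq, @seq2); joins two arrays
• Removing (splice): @removed = splice @seq, 1, 2; removes 2 elements starting at index 1 and returns them
• Slicing: @slice = @seq[1,2]; copies elements 1 and 2 and leaves @seq unchanged
• Splice_slice_array.pl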
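The difference between a slice (which copies) and a splice (which removes) can be sketched as follows. The helper splice_out is a made-up name and is not the lecture's Splice_slice_array.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove $count elements starting at $offset and
# return references to (removed elements, remaining elements).
sub splice_out {
    my ($offset, $count, @seq) = @_;
    my @removed = splice @seq, $offset, $count;
    return (\@removed, \@seq);
}

my @sequences = ('123a', '23ed4', '2334d');
my $one_seq   = $sequences[2];          # third element (zero-based index 2)
my @seq       = (@sequences, '125f');   # add a value to the end
my @slice     = @seq[1, 2];             # copy of elements 1 and 2

my ($removed, $remaining) = splice_out(1, 2, @seq);

print "element 2: $one_seq\n";
print "slice: @slice\n";
print "removed: @$removed\n";
print "remaining: @$remaining\n";
```

Because splice_out works on its own copy of the list, @seq in the calling code is left untouched; calling splice directly on @seq would shorten it.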
Dynamic arrays
• push @sequences, '2345d'; adds an element to the end of the array
• pop @sequences; removes and returns the last element of the array
• shift @sequences; removes and returns the first element of the array
• unshift @sequences, '2345d'; adds an element or list of elements onto the beginning of the array
Shift, pop, push and unshift example
• #!/usr/bin/perl
• # The 'pushpop' program - pushing, popping, shifting and unshifting.
• @sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA',
• 'CTATGCGGTA', 'ATCTGACCTC' );
• print "@sequences\n";
• $last = pop @sequences;
• print "@sequences\n";
• $first = shift @sequences;
• print "@sequences\n";
• unshift @sequences, $last;
• print "@sequences\n";
• push @sequences, ( $first, $last );
• print "@sequences\n";
• What is the expected output? (Run the code to confirm.)
Arrays: two more functions
• substr (extracting a substring from a string)
• $sub = substr($string, $offset, $length); # $offset is the zero-based position at which extraction begins, $length is the size of the substring
• index (finding the position of one string within another): $pos = index($string, $substring);
• To obtain the reverse complement of a DNA sequence, assuming the sequences are stored in an array (GGGGTTTT becomes AAAACCCC):
• Iterating through an array:
• foreach $dna (@dna)
• {
• $dna = reverse $dna; # reverse the contents of the scalar $dna
• $dna =~ tr/gatcGATC/ctagCTAG/;
• # tr translates each character of the first set into the matching character of the second (e.g. g becomes c), which gives the complement
• }
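The reverse-then-translate idea can be wrapped in a small sub, shown here as a sketch alongside substr and index. The name reverse_complement is an assumption for illustration; the tr sets are the ones from the slide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the reverse complement of a DNA string.
sub reverse_complement {
    my ($dna) = @_;
    my $rc = scalar reverse $dna;      # reverse the characters of the string
    $rc =~ tr/gatcGATC/ctagCTAG/;      # swap each base for its complement
    return $rc;
}

my $dna = 'GGGGTTTT';
print "original:           $dna\n";
print "reverse complement: ", reverse_complement($dna), "\n";   # AAAACCCC

# substr: 3 characters starting at zero-based offset 4.
print "substr: ", substr($dna, 4, 3), "\n";    # TTT

# index: position of the first occurrence of 'TT'.
print "index:  ", index($dna, 'TT'), "\n";     # 4
```

The explicit scalar matters when reverse is used inside a list, such as the arguments to print; without it reverse would reverse the order of the list items instead of the characters.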
Questions
• How would you read a file of DNA sequences into an array and print both the original and the reverse complement of each sequence?
• What use could this program have? (biology-related answer)
Arrays and lists
• A list is a sequence of constants or variables
• The values of a list can be assigned to an array
• @clones = ('192a8', '18c10', '327h1', '201e4');
• The values in an array can be assigned to a list of variables
• ($first, $second, $third) = @clones;
Hashes: associative arrays
• Similar to arrays, but the elements are unordered
• Each entry has two parts: the identifier (name) and a scalar value (here a string)
• Adding elements; elements are referred to by strings:
• %oligos = ();
• $oligos{'192a8'} = 'GGGTTCCGATTTCCAA';
• $oligos{'18c10'} = 'CTCTCTCTAGAGAGAGCCCC';
• $oligos{'327h1'} = 'GGACCTAACCTATTGGC';
• Note: put the name part in quotes (' ')
• Removing elements:
• delete $oligos{'192a8'};
Hashes
• Outputting hash results
• $s = $oligos{'192a8'};
• print "oligo 192a8 is $s\n";
• print "oligo 192a8 is ", length $oligos{'192a8'}, " base pairs long\n";
• print "oligo 18c10 is $oligos{'18c10'}\n";
• Expected output (input_output_hash.pl):
• oligo 192a8 is GGGTTCCGATTTCCAA
• oligo 192a8 is 16 base pairs long
• oligo 18c10 is CTCTCTCTAGAGAGAGCCCC
Hashes
• Example of the use of a hash table
• hash_bases.pl program
• For loops and hash tables
• foreach $clone ('327h1', '192a8', '18c10') {
• print "$clone: $oligos{$clone}\n";
• }
• %oligos refers to the whole hash table
• $oligos{...} is used to refer to individual elements
• $size = keys %oligos; returns the number of entries
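A minimal sketch of these hash operations, using the oligo data from the earlier slides, might look like this. The sub name oligo_lengths is hypothetical and is not taken from hash_bases.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %oligos = (
    '192a8' => 'GGGTTCCGATTTCCAA',
    '18c10' => 'CTCTCTCTAGAGAGAGCCCC',
    '327h1' => 'GGACCTAACCTATTGGC',
);

# Hypothetical helper: return "name: length" strings, sorted by name.
sub oligo_lengths {
    my (%table) = @_;
    return map { $_ . ': ' . length($table{$_}) } sort keys %table;
}

my $size = keys %oligos;    # keys in scalar context gives the entry count
print "entries: $size\n";
print "$_\n" for oligo_lengths(%oligos);

delete $oligos{'192a8'};    # remove one entry
print "192a8 removed\n" unless exists $oligos{'192a8'};
print "entries now: ", scalar(keys %oligos), "\n";
```

Sorting the keys gives a repeatable order; a hash on its own makes no promise about the order in which entries come back.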
Displaying all entries in a hash table
• while ( ( $genome, $count ) = each %gene_counts )
• {
• print "`$genome' has a gene count of $count\n";
• }
• foreach $genome ( sort keys %gene_counts )
• {
• print "`$genome' has a gene count of $gene_counts{$genome}\n";
• }
• Refer to genes.pl
Error handling
• die function:
• open MYFILE, 'stdin.txt' or
• die "could not open file aborting…\n";
• If the file does not exist the program terminates with the above message
• Exercise: write a program that reads data from a file into an array and, once all the data has been read, outputs it in reverse order
• Exercise: create a hash table that performs the codon to amino acid conversion and use it to convert codons (entered from the keyboard) into their corresponding amino acids
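The open ... or die idiom from this slide can be sketched as below. Including $! in the message (Perl's variable holding the operating system's reason for the failure) and trapping the error with eval are additions for illustration, and read_lines_or_die is a made-up name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return all lines of a file, or die with the reason.
sub read_lines_or_die {
    my ($path) = @_;
    open my $fh, '<', $path or die "Could not open '$path': $!\n";
    my @lines = <$fh>;
    close $fh;
    chomp @lines;
    return @lines;
}

# eval traps the die so this demonstration can report it and carry on;
# without eval the program would stop at the failed open.
my @lines = eval { read_lines_or_die('no_such_directory/no_such_file.txt') };
print "caught: $@" if $@;
print "program continues\n";
```

The low-precedence or is what makes the idiom work with a comma-separated argument list; using || here would bind to the last argument instead of to the result of open.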