Learn why Perl is widely used for scripting, with its powerful text manipulation capabilities, wide range of libraries, and ease of use. Discover its limitations and explore its applications.
Perl Introduction
Why Perl?
• Widely used scripting language
• Powerful text manipulation capabilities
• Relatively easy to use
• Has a wide range of libraries available
• Fast
• Good support for file and process operations
Less suitable for:
• Building large and complex applications (alternatives: Java, C/C++, C#)
• Applications with a GUI (alternatives: Java, C/C++, C#)
• High performance/memory efficient applications (alternatives: Java, C/C++, C#, Fortran)
• Statistics (alternative: R)
Learning to script Knowledge + Skills
Exercise: Determine the percentage GC-content of human chromosome 22
open file
read lines
per line:
    skip if header line
    count Cs and Gs
    count all nucleotides
report percentage Cs and Gs
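The steps above can be sketched in Perl as follows. This is one possible solution under stated assumptions, not the course's reference answer: it assumes a FASTA-style file in which header lines start with ">", it counts only A, C, G and T (so unknown bases such as N are ignored), and it uses tr and a pattern match, which are introduced later than this point in the course. The name gc_percentage and the file name in the comment are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the percentage of Cs and Gs among the nucleotides read from $fh.
sub gc_percentage {
    my ($fh) = @_;
    my ($gc, $total) = (0, 0);
    while (my $line = <$fh>) {                # read lines, one per iteration
        chomp($line);
        next if $line =~ /^>/;                # skip if header line
        $gc    += ($line =~ tr/CGcg//);       # count Cs and Gs
        $total += ($line =~ tr/ACGTacgt//);   # count all nucleotides
    }
    return $total ? 100 * $gc / $total : 0;   # avoid dividing by zero
}

# Demo on a small in-memory "file". For the real exercise, open the
# chromosome file instead, e.g. open(my $fh, '<', 'chr22.fa') (hypothetical name).
my $example = ">demo sequence\nATGC\nGGCC\n";
open(my $fh, '<', \$example) or die "Cannot open example: $!";
printf "GC-content: %.1f%%\n", gc_percentage($fh);
close($fh);
```

Reading line by line keeps memory use low, which matters for a file the size of a chromosome.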
Hello World
Simple line of Perl code:
    print "Hello World";
Run from a terminal:
    perl -e 'print "Hello World";'
Now try this and notice the difference:
    perl -e 'print "Hello World\n";'
\n “backslash-n” newline character, 'Enter' key
\t “backslash-t” 'Tab' key
Hello World (cont)
To create a text file with this line of Perl code:
    echo 'print "Hello World\n";' > HelloWorld.pl
    perl HelloWorld.pl
In the terminal window, type
    kate HelloWorld.pl
and then hit the Enter key. Now you can edit the Perl code.
Pythagoras' theorem
a² + b² = c²
3² + 4² = 5²
Pythagoras.pl
    $a = 3;
    $b = 4;
    $a2 = $a * $a;
    $b2 = $b * $b;
    $c2 = $a2 + $b2;
    $c = sqrt($c2);
    print $c;
$a: a single value, or scalar variable. Its name starts with a $ followed by the variable's name.
Perl scripts
Add these lines at the top of each Perl script:
    #!/usr/bin/perl
    # author:
    # description:
    use strict;
    use warnings;
perl Pythagoras.pl
    Global symbol "$a2" requires explicit package name at Pythagoras.pl line 8.
    Global symbol "$b2" requires explicit package name at Pythagoras.pl line 9.
    Global symbol "$c2" requires explicit package name at Pythagoras.pl line 10.
    Global symbol "$a2" requires explicit package name at Pythagoras.pl line 10.
    Global symbol "$b2" requires explicit package name at Pythagoras.pl line 10.
    Global symbol "$c" requires explicit package name at Pythagoras.pl line 11.
    Global symbol "$c2" requires explicit package name at Pythagoras.pl line 11.
    Global symbol "$c" requires explicit package name at Pythagoras.pl line 12.
    Execution of Pythagoras.pl aborted due to compilation errors.
With use strict, every variable must be declared before it is used. $a and $b are not reported because they are Perl's built-in sort variables, which use strict does not check.
Pythagoras.pl
    my $a = 3;
    my $b = 4;
    my $a2 = $a * $a;
    my $b2 = $b * $b;
    my $c2 = $a2 + $b2;
    my $c = sqrt($c2);
    print $c;
my: The first time a variable appears in the script, it should be declared using ‘my’. Only the first time...
Pythagoras.pl
    my($a,$b,$c,$a2,$b2,$c2);
    $a = 3;
    $b = 4;
    $a2 = $a * $a;
    $b2 = $b * $b;
    $c2 = $a2 + $b2;
    $c = sqrt($c2);
    print $c;
Pythagoras.pl
    $a = 3;
    $b = 4;
    $a2 = $a * $a;
    $b2 = $b * $b;
    $c2 = $a3 + $b2;    # typo: $a3 instead of $a2
    $c = sqrt($c2);
    print $c;
Pythagoras.pl
    my $a = 3;
    my $b = 4;
    my $a2 = $a * $a;
    my $b2 = $b * $b;
    my $c2 = $a3 + $b2;    # typo: $a3 instead of $a2
    my $c = sqrt($c2);
    print $c;
perl Pythagoras.pl
    Global symbol "$a3" requires explicit package name at Pythagoras.pl line 10.
    Execution of Pythagoras.pl aborted due to compilation errors.
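To see this check at work without a script that refuses to start, the snippet below compiles a small piece of code at run time and reports the error. It is only a sketch for experimenting: the string eval and the variable names $code, $ok and $error are not part of the course material, and the exact wording of the message differs slightly between Perl versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The code to check is kept in a single-quoted string, so Perl does not
# try to fill in the variable names; eval compiles it while the script runs.
my $code = 'use strict; my $a2 = 9; my $b2 = 16; my $c2 = $a3 + $b2; 1;';

my $ok    = eval $code;   # undefined when compilation fails
my $error = $@;           # the error message, if any

if (!$ok) {
    print "Compilation failed:\n$error";
}
```

Because the misspelled $a3 was never declared with my, use strict rejects the code before any of it runs.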
Text or number
Variables can contain text (strings) or numbers
    my $var1 = 1;
    my $var2 = "2";
    my $var3 = "three";
Try these four statements:
    print $var1 + $var2;
    print $var2 + $var3;
    print $var1.$var2;
    print $var2.$var3;
Text or number
Variables can contain text (strings) or numbers
    my $var1 = 1;
    my $var2 = "2";
    my $var3 = "three";
Results of the four statements:
    print $var1 + $var2;    => 3
    print $var2 + $var3;    => 2
    print $var1.$var2;      => 12
    print $var2.$var3;      => 2three
variables can be added, subtracted, multiplied, divided and modulo’d with: + - * / % variables can be concatenated with: .
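A short sketch of these operators in use. The variable names and the values 17 and 5 are chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $x = 17;
my $y = 5;

my $sum        = $x + $y;   # 22
my $difference = $x - $y;   # 12
my $product    = $x * $y;   # 85
my $quotient   = $x / $y;   # 3.4
my $remainder  = $x % $y;   # 2, what is left after dividing 17 by 5
my $joined     = $x . $y;   # "175", the two values glued together as text

print "$sum $difference $product $quotient $remainder $joined\n";
```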
sequence.pl
    print "Please type a DNA sequence: ";
    #this is a comment line
    #Read a line from the standard input (keyboard)
    my $DNAseq = <STDIN>;
    #Remove the newline (Enter) from the typed text
    chomp($DNAseq);
    #Get the length of the text (DNA sequence)
    my $length = length($DNAseq);
    print "It has $length nucleotides\n";
Program flow is top-down: Perl executes the statements in sequence.pl one after another, from the first line to the last.
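<STDIN> reads the characters that are typed on the keyboard and stops after the Enter key is pressed.
<> does the same when the script is started without arguments: STDIN is then the default and can be left out. (If file names are given on the command line, <> reads from those files instead.) Leaving out defaults is a recurring and confusing theme in Perl...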
sequence.pl
    print "Please type a DNA sequence: ";
    #this is a comment line
    #Read a line from the standard input (keyboard)
    my $DNAseq = <>;
    #Remove the newline (Enter) from the typed text
    chomp($DNAseq);
    #Get the length of the text (DNA sequence)
    my $length = length($DNAseq);
    print "It has $length nucleotides\n";
$output = function($input)
• input and output can be left out
• parentheses are optional
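The variations can be tried out with chomp and length, two functions used in sequence.pl. This is a small sketch; the names $text, $n1, $n2 and $n3 are made up for the example. When the input is left out, both functions fall back on Perl's default variable $_.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "ACGT\n";

chomp($text);              # output left out: the return value is ignored
my $n1 = length($text);    # with parentheses
my $n2 = length $text;     # parentheses are optional

$_ = "GATTACA\n";          # $_ is Perl's default variable
chomp;                     # input left out: works on $_
my $n3 = length;           # input left out: length of $_

print "$n1 $n2 $n3\n";
```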
sequence2.pl
    print "Please type a DNA sequence: ";
    my $DNAseq = <>;
    chomp($DNAseq);
    #Get the first three characters of $DNAseq
    my $first3bases = substr($DNAseq,0,3);
    print "The first 3 bases: $first3bases\n";
$frag = substr($text, $start, $num) Extract a fragment of string $text starting at $start and with $num characters. The first letter is at position 0!
perldoc
    perldoc -f substr
    substr EXPR,OFFSET,LENGTH,REPLACEMENT
    substr EXPR,OFFSET,LENGTH
    substr EXPR,OFFSET
Extracts a substring out of EXPR and returns it. First character is at offset 0, .....
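A few calls to substr on a fixed string show how the offset and length work. The sequence "ATGGCCTAA" and the variable names are made up for this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $DNAseq = "ATGGCCTAA";

my $codon1 = substr($DNAseq, 0, 3);   # "ATG": 3 characters starting at position 0
my $codon2 = substr($DNAseq, 3, 3);   # "GCC": 3 characters starting at position 3
my $rest   = substr($DNAseq, 6);      # "TAA": no length given, so up to the end
my $last3  = substr($DNAseq, -3);     # "TAA": a negative offset counts from the end

print "$codon1 $codon2 $rest $last3\n";
```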
print
    perldoc -f print
    print FILEHANDLE LIST
    print LIST
    print
Prints a string or a list of strings. If you leave out the FILEHANDLE, STDOUT is the destination: your terminal window.
print
In Perl, items in a list are separated by commas:
    print "Hello World","\n";
is the same as:
    print "Hello World\n";
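The FILEHANDLE form can be tried without creating a file. In this sketch the filehandle $out writes into the variable $buffer instead of the terminal; both names, and the use of an in-memory filehandle, are choices made for this example only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Open a filehandle that writes into $buffer rather than to a file on disk.
my $buffer = '';
open(my $out, '>', \$buffer) or die "Cannot open in-memory filehandle: $!";

# print FILEHANDLE LIST: no comma after the filehandle, commas between the items.
print $out "Hello", " ", "World", "\n";
close($out);

# print LIST: without a filehandle the text goes to STDOUT, the terminal window.
print "The buffer contains: $buffer";
```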
sequence3.pl
    print "Please type a DNA sequence: ";
    my $DNAseq = <>;
    chomp($DNAseq);
    #Get the second codon of $DNAseq
    my $codon2 = substr($DNAseq,3,3);
    print "The second codon: $codon2\n";
sequence4.pl
    print "Please type a DNA sequence: ";
    my $DNAseq = <>;
    chomp($DNAseq);
    #Get the first three characters of $DNAseq
    my $codon = substr($DNAseq,0,3);
    if($codon eq "ATG") {
        print "Found a start codon\n";
    }
Conditional execution
    if ( condition ) {
        do something
    }

    if ( condition ) {
        do something
    } else {
        do something else
    }
Conditional execution
    if ( $number > 10 ) {
        print "larger than 10";
    } elsif ( $number < 10 ) {
        print "smaller than 10";
    } else {
        print "number equals 10";
    }

    unless ( $door eq "locked" ) {
        openDoor();
    }
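The same if / elsif / else chain can be wrapped in a small runnable script. The function name classify and the test values are made up for this sketch, and the unless block prints a message instead of calling openDoor().

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Describe how $number compares to 10.
sub classify {
    my ($number) = @_;
    if ( $number > 10 ) {
        return "larger than 10";
    } elsif ( $number < 10 ) {
        return "smaller than 10";
    } else {
        return "number equals 10";
    }
}

foreach my $n (3, 10, 42) {
    print "$n: ", classify($n), "\n";
}

# unless runs its block only when the condition is false.
my $door = "closed";
unless ( $door eq "locked" ) {
    print "The door can be opened\n";
}
```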
Conditions are true or false 1 < 10 : true 21 < 10 : false
Examples
    if ( 1 == 1 ) {          # TRUE
    if ( 1 == 2 ) {          # FALSE
    if ( 1 != 2 ) {          # TRUE
    if ( -1 > 10 ) {         # FALSE
    if ( "hi" eq "dag" ) {   # FALSE
    if ( "hi" gt "dag" ) {   # TRUE
    if ( "hi" == "dag" ) {   # TRUE !!!
The last example may surprise you: "hi" is not equal to "dag", so you would expect FALSE. But == is a numerical comparison, and text that does not look like a number counts as 0, so both sides are 0. Use eq, ne, lt and gt to compare text, and ==, !=, < and > to compare numbers.
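The difference between the two kinds of comparison can be checked with a short script. The helper names same_number and same_text are made up for this sketch. With use warnings, Perl normally warns when text such as "hi" is used as a number; that warning is switched off inside same_number only to keep the output readable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Numerical comparison: both values are first converted to numbers.
sub same_number {
    my ($left, $right) = @_;
    no warnings 'numeric';   # silence the "isn't numeric" warning for this demo
    return $left == $right ? 1 : 0;
}

# Text comparison: the values are compared character by character.
sub same_text {
    my ($left, $right) = @_;
    return $left eq $right ? 1 : 0;
}

foreach my $pair (["hi", "dag"], ["10", "10.0"], ["abc", "abc"]) {
    my ($left, $right) = @$pair;
    printf qq{"%s" and "%s": == gives %d, eq gives %d\n},
        $left, $right, same_number($left, $right), same_text($left, $right);
}
```

The pair "10" and "10.0" shows the opposite surprise: the two values are equal as numbers but different as text.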