Perl Tutorial

Why Perl?
• Perl is built around regular expressions
• REs are good for string processing
• Therefore Perl is a good scripting language
• Perl is especially popular for CGI scripts
• Perl makes full use of the power of UNIX
• Short Perl programs can be very short
• “Perl is designed to make the easy jobs easy, without making the difficult jobs impossible.” -- Larry Wall, Programming Perl
HY439 Autumn 2005

Why not Perl?
• Perl is very UNIX-oriented
  • Perl is available on other platforms...
  • ...but isn’t always fully implemented there
  • However, Perl is often the best way to get some UNIX capabilities on less capable platforms
• Perl does not scale well to large programs
  • Weak subroutines, heavy use of global variables
• Perl’s syntax is not particularly appealing

What is a scripting language?
• Operating systems can do many things
  • copy, move, create, delete, compare files
  • execute programs, including compilers
  • schedule activities, monitor processes, etc.
• A command-line interface gives you access to these functions, but only one at a time
• A scripting language is a “wrapper” language that integrates OS functions

Major scripting languages
• UNIX has sh, Perl
• Macintosh has AppleScript, Frontier
• Windows has no major scripting languages
  • probably due to the weaknesses of DOS
• Generic scripting languages include:
  • Perl (most popular)
  • Tcl (easiest for beginners)
  • Python (new, Java-like, best for large programs)

Perl Example 1
    #!/usr/local/bin/perl
    #
    # Program to do the obvious
    #
    print 'Hello world.';    # Print a message

Comments on “Hello, World”
• Comments run from # to the end of the line
• But the first line, #!/usr/local/bin/perl, tells the system where to find the Perl interpreter
• Perl statements end with semicolons
• Perl is case-sensitive
• Perl is compiled and run in a single operation

Perl Example 2
    #!/ex2/usr/bin/perl
    # Remove blank lines from a file
    # Usage: singlespace < oldfile > newfile
    while ($line = <STDIN>) {
        if ($line eq "\n") { next; }
        print "$line";
    }

More Perl notes
• On the UNIX command line:
  • < filename means to get input from this file
  • > filename means to send output to this file
• In Perl, <STDIN> reads a line from standard input; STDOUT is the standard output filehandle, which print uses by default
• Scalar variables start with $
• Scalar variables hold strings or numbers, and they are interchangeable
• Examples:
  • $priority = 9;
  • $priority = '9';
• Array variables start with @

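
To make "interchangeable" concrete, here is a minimal sketch (the variable names $sum and $label are illustrative, not from the slides). The operator, not the variable, decides whether a scalar is treated as a number or a string.

```perl
use strict;
use warnings;

# A scalar holds a string or a number; Perl converts as the operator requires.
my $priority = '9';             # stored as a string
my $sum      = $priority + 1;   # numeric operator: 9 + 1
my $label    = $priority . 1;   # string operator: "9" . "1"

print "sum = $sum, label = $label\n";   # prints: sum = 10, label = 91
```
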
Perl Example 3
    #!/usr/local/bin/perl
    # Usage: fixm <filenames>
    # Replace \r with \n -- replaces input files
    foreach $file (@ARGV) {
        print "Processing $file\n";
        if (-e "fixm_temp") { die "*** File fixm_temp already exists!\n"; }
        if (! -e $file) { die "*** No such file: $file!\n"; }
        open DOIT, "| tr \'\\015' \'\\012' < $file > fixm_temp"
            or die "*** Can't: tr '\\015' '\\012' < $file > fixm_temp\n";
        close DOIT;
        open DOIT, "| mv -f fixm_temp $file"
            or die "*** Can't: mv -f fixm_temp $file\n";
        close DOIT;
    }

Comments on example 3
• In # Usage: fixm <filenames>, the angle brackets just mean to supply a list of file names here
• In UNIX text editors, the \r (carriage return) character usually shows up as ^M (hence the name fixm_temp)
• The UNIX command tr '\015' '\012' replaces all \015 characters (\r) with \012 (\n) characters
• The format of the open and close commands is:
  • open fileHandle, fileName
  • close fileHandle
• "| tr \'\\015' \'\\012' < $file > fixm_temp" says: start the tr command (the leading | opens a pipe to it), with its input redirected from $file and its output written to fixm_temp

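
The same character translation can also be done inside Perl, without starting an external tr process. The sketch below is an alternative to the slide's approach, not the author's method; fix_cr is a hypothetical helper name, and it works on a string in memory rather than on files.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every carriage return (\015) with a newline (\012).
sub fix_cr {
    my ($text) = @_;
    (my $fixed = $text) =~ tr/\015/\012/;   # copy first, then translate the copy
    return $fixed;
}

print fix_cr("line one\015line two\015");
```
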
Arithmetic in Perl
    $a = 1 + 2;    # Add 1 and 2 and store in $a
    $a = 3 - 4;    # Subtract 4 from 3 and store in $a
    $a = 5 * 6;    # Multiply 5 and 6
    $a = 7 / 8;    # Divide 7 by 8 to give 0.875
    $a = 9 ** 10;  # Nine to the power of 10, that is, 9^10 = 3486784401
    $a = 5 % 2;    # Remainder of 5 divided by 2
    ++$a;          # Increment $a and then return it
    $a++;          # Return $a and then increment it
    --$a;          # Decrement $a and then return it
    $a--;          # Return $a and then decrement it

String and assignment operators
    $a = $b . $c;  # Concatenate $b and $c
    $a = $b x $c;  # $b repeated $c times
    $a = $b;       # Assign $b to $a
    $a += $b;      # Add $b to $a
    $a -= $b;      # Subtract $b from $a
    $a .= $b;      # Append $b onto $a

Single and double quotes
• $a = 'apples';
• $b = 'bananas';
• print $a . ' and ' . $b;
  • prints: apples and bananas
• print '$a and $b';
  • prints: $a and $b
• print "$a and $b";
  • prints: apples and bananas

Arrays
• @food = ("apples", "bananas", "cherries");
• But a single element is a scalar, so it is written with $ and indexed from 0:
  • print $food[1];
  • prints "bananas"
• @morefood = ("meat", @food);
  • @morefood is now ("meat", "apples", "bananas", "cherries")
• ($a, $b, $c) = (5, 10, 20);

push and pop
• push adds one or more things to the end of a list
  • push (@food, "eggs", "bread");
  • push returns the new length of the list
• pop removes and returns the last element
  • $sandwich = pop(@food);
• $len = @food;  # $len gets length of @food
• $#food  # returns index of last element

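
The following sketch runs the operations from the slide in sequence so the return values can be seen together. The variable names $count and $last are added for illustration.

```perl
use strict;
use warnings;

my @food = ("apples", "bananas", "cherries");

my $len      = push(@food, "eggs", "bread");  # push returns the new length
my $sandwich = pop(@food);                    # removes and returns the last element
my $count    = @food;                         # array in scalar context gives its length
my $last     = $#food;                        # index of the last element

print "len=$len sandwich=$sandwich count=$count last=$last\n";
# prints: len=5 sandwich=bread count=4 last=3
```
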
foreach
    # Visit each item in turn and call it $morsel
    foreach $morsel (@food) {
        print "$morsel\n";
        print "Yum yum\n";
    }

Tests
• “Zero” is false. This includes: 0, '0', "0", '', "", and the undefined value
• Anything not false is true
• Use == and != for numbers, eq and ne for strings
• &&, ||, and ! are and, or, and not, respectively

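
A short sketch of these rules, using a hypothetical helper is_true. Note that only the empty string and the exact string "0" are false; strings such as "00" or "0.0" are true even though they are numerically zero.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether Perl treats a value as true.
sub is_true {
    my ($v) = @_;
    return $v ? 1 : 0;
}

for my $v (0, '0', '', '00', '0.0', 'a') {
    printf "%-6s is %s\n", "'$v'", is_true($v) ? "true" : "false";
}

# == compares numerically, eq compares as strings.
print "10 == 10.0\n"    if 10 == 10.0;
print "'10' ne '10.0'\n" if '10' ne '10.0';
```
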
for loops
• for loops are just as in C or Java
    for ($i = 0; $i < 10; ++$i) {
        print "$i\n";
    }

while loops
    #!/usr/local/bin/perl
    print "Password? ";
    $a = <STDIN>;
    chomp $a;    # Remove the newline at end (chop would remove any last character)
    while ($a ne "fred") {
        print "sorry. Again? ";
        $a = <STDIN>;
        chomp $a;
    }

do..while and do..until loops
    #!/usr/local/bin/perl
    do {
        print "Password? ";
        $a = <STDIN>;
        chomp $a;    # Remove the newline at end
    } while ($a ne "fred");

if statements
    if ($a) {
        print "The string is not empty\n";
    }
    else {
        print "The string is empty\n";
    }

if - elsif statements
    if (!$a) {
        print "The string is empty\n";
    }
    elsif (length($a) == 1) {
        print "The string has one character\n";
    }
    elsif (length($a) == 2) {
        print "The string has two characters\n";
    }
    else {
        print "The string has many characters\n";
    }

Why Perl?
• Two factors make Perl important:
• Pattern matching/string manipulation
  • Based on regular expressions (REs)
  • REs are similar in power to those in Formal Languages…
  • …but have many convenience features
• Ability to execute UNIX commands
  • Less useful outside a UNIX environment

Basic pattern matching
• $sentence =~ /the/
  • True if $sentence contains "the"
• $sentence = "The dog bites.";
  if ($sentence =~ /the/)  # is false
  • …because Perl is case-sensitive
• !~ is "does not contain"

RE special characters
    .    # Any single character except a newline
    ^    # The beginning of the line or string
    $    # The end of the line or string
    *    # Zero or more of the last character
    +    # One or more of the last character
    ?    # Zero or one of the last character

RE examples
    ^.*$     # matches the entire string
    hi.*bye  # matches from "hi" to "bye" inclusive
    x +y     # matches x, one or more blanks, and y
    ^Dear    # matches "Dear" only at beginning
    bags?    # matches "bag" or "bags"
    hiss+    # matches "hiss", "hisss", "hissss", etc.

Square brackets
    [qjk]     # Either q or j or k
    [^qjk]    # Neither q nor j nor k
    [a-z]     # Anything from a to z inclusive
    [^a-z]    # Any character that is not a lower case letter
    [a-zA-Z]  # Any letter
    [a-z]+    # Any non-empty sequence of lower case letters

More examples
    [aeiou]+       # matches one or more vowels
    [^aeiou]+      # matches one or more nonvowels
    [0-9]+         # matches an unsigned integer
    [0-9A-F]       # matches a single hex digit
    [a-zA-Z]       # matches any letter
    [a-zA-Z0-9_]+  # matches identifiers

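
A bracket expression on its own matches anywhere in a string. To test whether a whole string has a given shape, the pattern is usually anchored with ^ and $. The sketch below assumes two hypothetical helpers, is_unsigned_int and is_identifier, built from the classes in the slide.

```perl
use strict;
use warnings;

# Hypothetical helpers: anchor the slide's patterns so the whole string must match.
sub is_unsigned_int { my ($s) = @_; return $s =~ /^[0-9]+$/        ? 1 : 0; }
sub is_identifier   { my ($s) = @_; return $s =~ /^[a-zA-Z0-9_]+$/ ? 1 : 0; }

for my $s ("2005", "fixm_temp", "14, Leafy") {
    printf "%-10s int=%d ident=%d\n", $s, is_unsigned_int($s), is_identifier($s);
}
```
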
More special characters
    \n  # A newline
    \t  # A tab
    \w  # Any word character (alphanumeric or underscore); same as [a-zA-Z0-9_]
    \W  # Any non-word char; same as [^a-zA-Z0-9_]
    \d  # Any digit. The same as [0-9]
    \D  # Any non-digit. The same as [^0-9]
    \s  # Any whitespace character
    \S  # Any non-whitespace character
    \b  # A word boundary, outside [] only
    \B  # No word boundary

Quoting special characters
    \|  # Vertical bar
    \[  # An open square bracket
    \)  # A closing parenthesis
    \*  # An asterisk
    \^  # A caret symbol
    \/  # A slash
    \\  # A backslash

Alternatives and parentheses
    jelly|cream  # Either jelly or cream
    (eg|le)gs    # Either eggs or legs
    (da)+        # Either da or dada or dadada or...

The $_ variable
• Often we want to process one string repeatedly
• The $_ variable holds the current string
• If a subject is omitted, $_ is assumed
• Hence, the following are equivalent:
  • if ($sentence =~ /under/) …
  • $_ = $sentence; if (/under/) ...

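
$_ is most often seen in loops, where each element becomes the current string automatically. A minimal sketch, with sample data invented for illustration:

```perl
use strict;
use warnings;

my @lines = ("under the bridge", "over the hill", "down under");

my $hits = 0;
for (@lines) {            # no loop variable named, so each element is aliased to $_
    $hits++ if /under/;   # same as: $_ =~ /under/
}

print "$hits lines contain 'under'\n";   # prints: 2 lines contain 'under'
```
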
Case-insensitive substitutions
• s/london/London/i
  • case-insensitive substitution; will replace london, LONDON, London, LoNDoN, etc.
• You can combine global substitution with case-insensitive substitution
  • s/london/London/gi

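
The difference between /i alone and /gi shows up when the string contains more than one occurrence. A small sketch with an invented sample string:

```perl
use strict;
use warnings;

my $text = "london calling; LONDON answering";

(my $once = $text) =~ s/london/London/i;    # first occurrence only
(my $all  = $text) =~ s/london/London/gi;   # every occurrence

print "$once\n";   # prints: London calling; LONDON answering
print "$all\n";    # prints: London calling; London answering
```
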
Remembering patterns
• Any part of the pattern enclosed in parentheses is assigned to the special variables $1, $2, $3, …, $9
• Numbers are assigned according to the left (opening) parentheses
• "The moon is high" =~ /The (.*) is (.*)/
• Afterwards, $1 = "moon" and $2 = "high"

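
Because $1 and $2 are overwritten by the next successful match, it is common to copy them into named variables straight away. A sketch (the names $subject and $state are illustrative):

```perl
use strict;
use warnings;

my $sentence = "The moon is high";

# In list context a successful match returns the captured groups directly.
my ($subject, $state) = $sentence =~ /The (.*) is (.*)/;

print "subject=$subject state=$state\n";   # prints: subject=moon state=high
```
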
Dynamic matching
• During the match, an early part of the match that is tentatively assigned to $1, $2, etc. can be referred to by \1, \2, etc.
• Example:
  • \b\w+\b matches a single word
  • /(\b\w+\b) \1/ matches repeated words
  • "Now is the the time" =~ /(\b\w+\b) \1/
  • Afterwards, $1 = "the"

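
A backreference can be wrapped in a small function to detect doubled words. This is a sketch only; repeated_word is a hypothetical name, and the pattern adds a trailing \b so that "the then" is not reported as a repeat.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first word that appears twice in a row, or undef.
sub repeated_word {
    my ($s) = @_;
    return $s =~ /\b(\w+) \1\b/ ? $1 : undef;
}

print repeated_word("Now is the the time"), "\n";   # prints: the
```
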
tr
• tr does character-by-character translation
• tr returns the number of characters it translated
• $sentence =~ tr/abc/edf/;
  • replaces a with e, b with d, c with f
• $count = ($sentence =~ tr/*/*/);
  • counts asterisks
• tr/a-z/A-Z/;
  • converts $_ to all uppercase

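
The counting idiom works because tr/*/*/ replaces each asterisk with itself, leaving the string unchanged while still returning how many characters matched. A sketch with an invented sample string:

```perl
use strict;
use warnings;

my $sentence = "a*b*c";

my $stars = ($sentence =~ tr/*/*/);        # counts asterisks; string is unchanged
(my $upper = $sentence) =~ tr/a-z/A-Z/;    # translate a copy to uppercase

print "stars=$stars upper=$upper original=$sentence\n";
# prints: stars=2 upper=A*B*C original=a*b*c
```
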
split
• split breaks a string into parts
• $info = "Caine:Michael:Actor:14, Leafy Drive";
  @personal = split(/:/, $info);
• Now @personal is ("Caine", "Michael", "Actor", "14, Leafy Drive")

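
A runnable version of the slide's example, with a couple of illustrative additions: the fields are picked out with an array slice, and join (the inverse of split) rebuilds the original string.

```perl
use strict;
use warnings;

my $info     = "Caine:Michael:Actor:14, Leafy Drive";
my @personal = split(/:/, $info);

my ($surname, $first) = @personal[0, 1];   # array slice: first two fields
my $rejoined = join(":", @personal);       # join reverses the split

print scalar(@personal), " fields; name: $first $surname\n";
# prints: 4 fields; name: Michael Caine
```
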
Associative arrays
• Associative arrays allow lookup by name rather than by index
• Associative array names begin with %
• Example:
  • %fruit = ("apples", "red", "bananas", "yellow", "cherries", "red");
• Now, $fruit{"bananas"} returns "yellow"
• Note: braces, not parentheses

Associative Arrays II
• Can be converted to normal arrays:
  @food = %fruit;
• You cannot index an associative array by position, but you can use the keys and values functions:
    foreach $f (keys %fruit) {
        print("The color of $f is " . $fruit{$f} . "\n");
    }

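
keys returns the names in no particular order, so output that should be stable is usually sorted first. The sketch below also filters the keys by value; the @red array is an illustrative addition.

```perl
use strict;
use warnings;

my %fruit = ("apples", "red", "bananas", "yellow", "cherries", "red");

# Sort the keys so the output order is predictable.
for my $f (sort keys %fruit) {
    print "The color of $f is $fruit{$f}\n";
}

# Collect the names whose value is "red".
my @red = sort grep { $fruit{$_} eq "red" } keys %fruit;
print "Red: @red\n";   # prints: Red: apples cherries
```
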
Calling subroutines
• Assume you have a subroutine printargs that just prints out its arguments
• Subroutine calls:
  • printargs("perly", "king");
    • Prints: "perly king"
  • printargs("frog", "and", "toad");
    • Prints: "frog and toad"

Defining subroutines
• Here's the definition of printargs:
    sub printargs { print "@_\n"; }
• Where are the parameters?
• Parameters are put in the array @_ which has nothing to do with $_

Returning a result
• The value of a subroutine is the value of the last expression that was evaluated
    sub maximum {
        if ($_[0] > $_[1]) { $_[0]; }
        else { $_[1]; }
    }
    $biggest = maximum(37, 24);

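
Relying on the last evaluated expression works, but an explicit return is often easier to read. The sketch below is a variant of the slide's maximum, not the author's version: it copies the arguments out of @_ and accepts any number of values.

```perl
use strict;
use warnings;

# Variant of maximum: explicit return, any number of arguments.
sub maximum {
    my ($max, @rest) = @_;          # first argument is the initial candidate
    for my $n (@rest) {
        $max = $n if $n > $max;
    }
    return $max;
}

my $biggest = maximum(37, 24);
print "$biggest\n";                 # prints: 37
```
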
Local variables
• @_ is local to the subroutine, and…
• …so are $_[0], $_[1], $_[2], …
• local creates local variables (it saves the variable's old value and restores it when the subroutine exits)

Example subroutine
    sub inside {
        local($a, $b);                  # Make local variables
        ($a, $b) = ($_[0], $_[1]);      # Assign values
        $a =~ s/ //g;                   # Strip spaces from
        $b =~ s/ //g;                   #   local variables
        ($a =~ /$b/ || $b =~ /$a/);     # Is $b inside $a
                                        #   or $a inside $b?
    }
    inside("lemon", "dole money");      # true

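
In Perl 5 the same subroutine is more commonly written with my, which creates lexically scoped variables. The sketch below is one possible rewrite under that assumption, not the author's code. It uses index instead of a pattern match so that characters such as . or * in the arguments are compared literally, and the names $x and $y are chosen to avoid the special sort variables $a and $b.

```perl
use strict;
use warnings;

# Sketch: is one string contained in the other, ignoring spaces?
sub inside {
    my ($x, $y) = @_;      # lexical copies; the caller's values are untouched
    s/ //g for $x, $y;     # strip spaces from both copies
    return (index($x, $y) >= 0 || index($y, $x) >= 0) ? 1 : 0;
}

print inside("lemon", "dole money"), "\n";   # prints: 1
```
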
Perl 5
• There are only a few differences between Perl 4 and Perl 5
• Perl 5 has modules
• Perl 5 modules can be treated as classes
• Perl 5 has lexically scoped (“auto”) variables, declared with my

The End