Perl
Introduction
• Perl stands for "Practical Extraction and Report Language"
• Created by Larry Wall when awk ran out of steam
• Perl grew at almost the same rate as the Unix operating system
Introduction (cont.)
• Perl fills the gaps between programming languages of different levels
• A great tool for leverage
• Highly portable and readily available
• Without proper care during programming, Perl code can become "write-only"
Availability
• It's free and runs rather nicely on nearly everything that calls itself UNIX or UNIX-like
• Perl has been ported to the Amiga, the Atari ST, the Macintosh family, VMS, OS/2, even MS-DOS and Windows
• The sources for Perl (and many precompiled binaries for non-UNIX architectures) are available from the Comprehensive Perl Archive Network (CPAN): http://www.perl.com/CPAN
Running Perl on Unix
• Set up the PATH variable to include the directory where Perl is located
• Check /usr/local/bin or /usr/bin for "perl"
• Run a Perl script by typing "perl <filename>"
• Alternatively, make the file executable and put "#!/usr/bin/perl" on the first line of your Perl script
• The .pl extension is frequently associated with Perl scripts
Running Perl on Win32
• ActivePerl allows Perl scripts to be executed in MS-DOS/Windows
• Perl is being ported faithfully
• The #! line is not used to locate the interpreter, because it means nothing to MS-DOS/Windows (Perl itself still reads any switches on that line)
• Perl scripts are executed by typing "perl <filename>"
• Alternatively, double-click the file if the .pl extension is associated with the Perl interpreter
An Example
    #!/usr/bin/perl
    print "Hello World!";
• The #! line tells the system to hand the rest of the file to the perl executable
• All statements are terminated with ; as in C/C++/Java
• print writes its arguments to standard output by default (like printf in C or cout in C++)
• Perl completely parses and compiles the script before executing it
Variables
• Three main types of variables: scalar, hash and array
• Examples: $scalar, %hash, @array
• Perl is not a strongly typed language
• Retrieving values from the variables: $scalar, $hash{key}, $array[offset]
• Variables are all global in scope unless declared with my (private) or local
• Note: hashes and arrays hold scalar values
Examples
• Assigning values to a scalar
    $i = "hello world!";
    $j = 1 + 1;
    ($i, $j) = (2, 3);
• Assigning values to an array
    $array[0] = 1;
    $array[1] = "hello world!";
    push(@array, 1);         # stores the value 1 at the end of @array
    $value = pop(@array);    # retrieves and removes the last element from @array
    @array = (8, @array);    # inserts 8 in front of @array
Examples (cont.)
• Assigning values to a hash
    $hash{'greeting'} = "Hello world!";
    $hash{'available'} = 1;
    # or using a hash slice
    @hash{"greeting", "available"} = ("Hello world!", 1);
• Deleting a key-value pair from a hash:
    delete $hash{'key'};
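The assignments on the last two slides can be combined into one runnable sketch. The `use strict` / `use warnings` lines, the `my` declarations, and the `count` key are additions for illustration and do not appear on the slides.

```perl
use strict;
use warnings;

# Scalars hold a single value; a list assignment sets several at once.
my ($i, $j) = (2, 3);

# Arrays: push and pop work on the end, a list assignment can prepend.
my @array = (1, "hello world!");
push(@array, 1);              # @array is now (1, "hello world!", 1)
my $value = pop(@array);      # $value is 1; @array is (1, "hello world!")
@array = (8, @array);         # @array is (8, 1, "hello world!")

# Hashes: single-key assignment, a hash slice, and delete.
my %hash;
$hash{'greeting'} = "Hello world!";
@hash{"available", "count"} = (1, scalar(@array));   # "count" is a made-up key
delete $hash{'available'};

print "array: @array\n";
print "keys: ", join(",", sort keys %hash), "\n";
```

Interpolating `@array` inside double quotes joins the elements with spaces, which is a convenient way to inspect a list while experimenting.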
Conditional Statements
• Variables alone will not support switches or conditions
• If-then-else style clauses make decisions based on certain preconditions
• Keywords: if, else, elsif, unless
• The statements of each branch are enclosed by '{' and '}'
A Conditional Statement Example
    print "What is your name? ";
    $name = <STDIN>;
    chomp($name);
    if ($name eq "Randal") {
        print "Hello, Randal! How good of you to be here!\n";
    } else {
        print "Hello, $name!\n";            # ordinary greeting
    }
    unless ($name eq "Randal") {
        print "You are not Randal!!\n";     # part of the ordinary greeting
    }
• $name = <STDIN> reads a line from standard input
• chomp is a built-in function that removes the trailing newline character
Loops
• Conditional statements cannot handle repetitive tasks
• Keywords: while, foreach, for, until, do-while, do-until
• A foreach loop iterates over all of the elements of a list (an array, or the keys of a hash), executing the loop body on each element
• for is shorthand for a common form of while loop
• until is the reverse of while: it loops as long as the condition is false
Loops (cont.)
• Do-while and do-until loops execute the loop body once before checking for termination
• Statements in the loop body are enclosed by '{' and '}'
While Loop
• Syntax:
    while (some expression) {
        statements;
        …
    }
• Example:
    # prints the numbers 1 to 10 in reverse order
    $a = 10;
    while ($a > 0) {
        print $a;
        $a = $a - 1;
    }
Until Loop
• Syntax:
    until (some expression) {
        statements;
        …
    }
• Example:
    # prints the numbers 1 to 10 in reverse order
    $a = 10;
    until ($a <= 0) {
        print $a;
        $a = $a - 1;
    }
Foreach Loop
• Syntax:
    foreach [<variable>] (@some-list) {
        statements…
    }
• Example:
    # prints each element of @a
    @a = (1,2,3,4,5);
    foreach $b (@a) {
        print $b;
    }
Foreach Loop (cont.)
• Accessing a hash with the keys function:
    foreach $key (keys (%fred)) {                # once for each key of %fred
        print "at $key we have $fred{$key}\n";   # show key and value
    }
• Alternatively:
    while (($first,$last) = each(%lastname)) {
        print "The last name of $first is $last\n";
    }
For Loop
• Syntax:
    for (initial_exp; test_exp; re-init_exp) {
        statements;
        …
    }
• Example:
    # prints the numbers 1 to 10
    for ($i = 1; $i <= 10; $i++) {
        print "$i ";
    }
Do-While and Do-Until Loops
• Syntax:
    do {
        statements;
    } while some_expression;

    do {
        statements;
    } until some_expression;
• Examples that print the numbers 1 to 10 in reverse order:
    $a = 10;
    do {
        print $a;
        $a = $a - 1;
    } while ($a > 0);

    $a = 10;
    do {
        print $a;
        $a = $a - 1;
    } until ($a <= 0);
Built-in functions
• shift function
    $value = shift(@fred);    # similar to ($value,@fred) = @fred;
• unshift function
    unshift(@fred,$a);        # like @fred = ($a,@fred);
• reverse function
    @a = (7,8,9);
    @b = reverse(@a);         # gives @b the value of (9,8,7)
• sort function
    @y = (1,2,4,8,16,32,64);
    @y = sort(@y);            # @y gets 1,16,2,32,4,64,8 (sorted as strings)
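The sort example above orders the numbers as strings, which is why 16 lands before 2. A minimal sketch of the difference, assuming numeric order is what is wanted; the comparison block with `<=>` is standard Perl but is not shown on the slide, and the variable names are made up.

```perl
use strict;
use warnings;

my @y = (1, 2, 4, 8, 16, 32, 64);

my @as_strings = sort @y;                 # default: string comparison
my @as_numbers = sort { $a <=> $b } @y;   # numeric comparison
my @descending = reverse @as_numbers;     # largest first

print "@as_strings\n";   # 1 16 2 32 4 64 8
print "@as_numbers\n";   # 1 2 4 8 16 32 64
print "@descending\n";   # 64 32 16 8 4 2 1
```

Inside the comparison block, `$a` and `$b` are the two elements being compared; `cmp` is the string counterpart of `<=>`.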
Built-In Functions (cont.)
• qw function
    @words = qw(camel llama alpaca);
    # is equivalent to
    @words = ("camel","llama","alpaca");
• defined function
    Returns a Boolean value saying whether the scalar value resulting from an expression has a real value or not. Ex: defined $a;
• undef function
    Gives a variable the undefined value, so that defined returns false for it. Ex: undef $a;
Built-In Functions (cont.)
• uc and ucfirst functions vs. lc and lcfirst functions
    <result> = uc(<string>)
    <result> = ucfirst(<string>)

    $string = "abcde";
    $string2 = uc($string);        # ABCDE
    $string3 = ucfirst($string);   # Abcde
• lc and lcfirst have the reverse effect of uc and ucfirst
Basic I/O
• STDIN and STDOUT
• STDIN examples:
    $a = <STDIN>;     # reads one line
    @a = <STDIN>;     # reads all remaining lines
    while (defined($line = <STDIN>)) {
        # process $line here
    }
• STDOUT examples:
    print(list of arguments);
    print "text";
    printf([HANDLE] format, list of arguments);   # no comma after the handle
Regular Expressions
• A template to be matched against a string
• Patterns are enclosed in '/'s
• Matching against a variable is done with the =~ operator
• Syntax: /<pattern>/
• Examples:
    $string =~ /abc/    # matches "abc" anywhere in $string
    <STDIN> =~ /abc/    # matches "abc" in a line from standard input
Creating Patterns
• Single character patterns:
• "." matches any single character except newline (\n), for example: /a./
• "?" matches zero or one of the preceding character
• A character class is created with "[" and "]". A range of characters can be abbreviated with "-", and a character class can be negated with the "^" symbol.
• For example:
    [aeiouAEIOU] matches any one of the vowels
    [a-zA-Z] matches any single letter of the English alphabet
    [^0-9] matches any single non-digit
Creating Patterns (cont.)
• Predefined character class abbreviations:
    \d == [0-9]
    \D == [^0-9]
    \w == [a-zA-Z0-9_]
    \W == [^a-zA-Z0-9_]
    \s == [ \r\t\n\f]
    \S == [^ \r\t\n\f]
Creating Patterns (cont.)
• Multipliers: *, + and {}
• * matches 0 or more of the preceding character
    ab*c matches an a followed by zero or more bs and then a c
• + matches 1 or more of the preceding character
    ab+c matches an a followed by one or more bs and then a c
• {} is a general multiplier
    a{3,5}   # matches three to five "a"s in a string
    a{3,}    # matches three or more "a"s
Creating Patterns (cont.)
    a{3}     # matches exactly three consecutive "a"s, so any string containing "aaa" matches
• Complex patterns can be constructed from these operators
• For example:
    /a.*ce.*d/ matches strings such as "asdffdscedfssadfz"
Creating Patterns: Exercises
• Construct patterns for the following strings:
    1. "a xxx c xxxxxxxx c xxx d"
    2. a sequence of numbers
    3. three or more digits followed by the string "abc"
    4. strings that have an "a", one or more "b"s and at least five "c"s
    5. strings with three vowels next to each other. Hint: try a character class and the general multiplier
Creating Patterns: Exercises
• Answers:
    1. /a.*c.*d/
    2. /\d+/ or /[0-9]+/
    3. /\d\d\d+abc/ or /\d{3,}abc/
    4. /ab+c{5,}/
    5. /[aeiouAEIOU]{3}/
• Other possible answers?
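One way to check such answers is to run each pattern against a string that should match and one that should not. The sketch below does this for the five exercises; the test strings, the `qr//` operator and the nested array references are choices made for this illustration and are not part of the slides.

```perl
use strict;
use warnings;

# Each entry: [pattern, string that should match, string that should not].
my @cases = (
    [ qr/a.*c.*d/,         "a xxx c xxxxxxxx c xxx d", "a xxx d" ],
    [ qr/\d+/,             "12345",                    "abc"     ],
    [ qr/\d{3,}abc/,       "1234abc",                  "12abc"   ],
    [ qr/ab+c{5,}/,        "abbccccc",                 "abcccc"  ],
    [ qr/[aeiouAEIOU]{3}/, "queue",                    "banana"  ],
);

my $failures = 0;
for my $case (@cases) {
    my ($pattern, $good, $bad) = @$case;
    $failures++ unless $good =~ $pattern;   # expected a match
    $failures++ if     $bad  =~ $pattern;   # expected no match
}
print "failures: $failures\n";
```

Adding a rejected string for every pattern matters as much as the accepted one: a pattern such as /.*/ would pass every positive check.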
Anchoring Patterns
• No boundaries are defined by the previous patterns
• A word boundary is a position between a \w character and a \W character
• \b matches at a word boundary; \B matches anywhere that is not a word boundary
• Examples:
    /fred\b/     # matches "fred", but not "frederick"
    /\b\+\b/     # matches "x+y", but not "x + y", "++" or "+". Why?
    /\bfred\B/   # matches "frederick" but not "fred"
Anchoring Patterns (cont.)
• ^ and $
• ^ matches the beginning of a string
• $ matches the end of a string
• Examples:
    /^Fred$/     # matches only "Fred"
    /aaa^bbb/    # matches nothing
More on Matching Operators
• Additional flags for the matching operator:
    /<pattern>/i   # ignores case differences
    /fred/i        # matches FRED, fred, Fred, FreD and so on
    /<pattern>/s   # treat the string as a single line ("." also matches newline)
    /<pattern>/m   # treat the string as multiple lines (^ and $ also match at line boundaries)
More on Matching Operators (cont.)
• "(" and ")" can be used in patterns to remember matches
• The special variables $1, $2, $3 … give access to these matches
• For example:
    $string = "Hello World!";
    if ($string =~ /(\w*) (\w*)/) {
        print "$1 $2\n";    # prints Hello World
    }
More on Matching Operators (cont.)
• Alternatively:
    $string = "Hello World!";
    ($first,$second) = ($string =~ /(\w*) (\w*)/);
    print "$first $second\n";    # prints Hello World
• Line 2: remember that =~ returns a value just like a function. Normally it returns true or false (1 or the empty string), but here the list assignment together with "(" and ")" makes it return the text captured by each pair of parentheses
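The two ways of reading captures can be compared side by side. This is a small sketch under the same "Hello World!" input; the variable names `$matched`, `$one`, `$two` and `@none` are introduced here for illustration.

```perl
use strict;
use warnings;

my $string = "Hello World!";

# Boolean context: the match is true or false, captures land in $1 and $2.
my $matched = ($string =~ /(\w*) (\w*)/) ? 1 : 0;
my ($one, $two) = ($1, $2);

# List context: the match returns the captured substrings directly.
my ($first, $second) = ($string =~ /(\w*) (\w*)/);

# A failed match in list context returns an empty list.
my @none = ("Hello" =~ /(\d+)/);

print "$matched: $one $two\n";
print "$first $second\n";
print scalar(@none), " captures from the failed match\n";
```

The exclamation mark is missing from the output because \w does not match punctuation.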
Substitution
• Replaces text that matches a pattern in a string
• s/<pattern to search>/<replacement>/ig
• i makes the match case-insensitive
• g applies the substitution to every match instead of only the first
• Example:
    $which = "this this this";
    $which =~ s/this/that/;    # produces "that this this"
Substitution (cont.)
• Each line below starts again from "this this this":
    $which =~ s/this/that/g;     # produces "that that that"
    $which =~ s/THIS/that/i;     # produces "that this this"
    $which =~ s/THIS/that/ig;    # produces "that that that"
• Multipliers, anchors and memory operators can be used as well (each substitution applied to the original $string):
    $string = "This is a string";
    $string =~ s/^/So /;                  # "So This is a string"
    $string =~ s/(\w{1,})/I think $1/;    # "I think This is a string"
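Because s/// changes its target in place, experiments are easier when each one works on a fresh copy. The sketch below uses the copy-then-substitute idiom `(my $copy = $original) =~ s/.../.../`; the idiom and all variable names are additions, not something taken from the slides.

```perl
use strict;
use warnings;

my $original = "this this this";

(my $first_only = $original) =~ s/this/that/;     # only the first match
(my $all        = $original) =~ s/this/that/g;    # every match
(my $any_case   = $original) =~ s/THIS/that/ig;   # every match, ignoring case

# In scalar context s/// returns the number of substitutions made.
my $copy  = $original;
my $count = ($copy =~ s/this/that/g);

# Captured text can be reused in the replacement.
(my $swapped = "Hello World") =~ s/(\w+) (\w+)/$2 $1/;

print "$first_only | $all | $any_case\n";
print "$count substitutions\n";
print "$swapped\n";
```

The return value is handy for reporting, for example to tell whether a configuration line was changed at all.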
Split and Join Functions
• Syntax:
    <return value(s)> = split(/<pattern>/[,<variable>]);
    <return value> = join("<separator>",<array>);
• Examples:
    $string = "This is a string";
    @words = split(/ /,$string);     # splits the string into separate words
    @words = split(/\s/,$string);    # same as above
    $string = join(" ",@words);      # "This is a string"
• Great functions for parsing formatted documents
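As an illustration of parsing a formatted document, here is a sketch that splits a colon-separated record and shows how the choice of pattern changes the result. The record contents and the variable names are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical colon-separated record, in the style of /etc/passwd.
my $record = "fred:x:1001:100:Fred Flintstone:/home/fred:/bin/sh";

my @fields = split(/:/, $record);
my ($login, $full_name) = @fields[0, 4];   # array slice: first and fifth field

# / / splits on every single space; /\s+/ treats a run of whitespace as one separator.
my @loose = split(/ /,   "This  is a string");   # two spaces give an empty field
my @tight = split(/\s+/, "This  is a string");

print "$login is $full_name\n";
print scalar(@loose), " vs ", scalar(@tight), " fields\n";
print join("-", @tight), "\n";
```

For input typed by people, /\s+/ is usually the safer separator, since doubled spaces and tabs are common.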
Functions
• Automate certain tasks
• Syntax:
    sub <name> {
        …
        <statements>
    }
• Global to the current package. Since we are not doing OOP and packages, functions are "global" to the whole program
Functions (cont.)
• Example:
    sub say_hello {
        print "Hello world!\n";
    }
• Invoking a function:
    say_hello();    # call with a parameter list (empty here)
    &say_hello;     # older style, no parameter list: the caller's current @_ is passed along
Functions (cont.)
• Return values
• Two types of functions: void functions (also known as routines or procedures), and functions
• void functions have no meaningful return value
• Functions in Perl can return more than one value:
    sub threeVar {
        return ($a, $b, $c);    # returns a list of 3 values
    }
Functions (cont.)
    ($one,$two,$three) = threeVar();
• Alternatively:
    @list = threeVar();    # stores the three values into a list
• Note:
    ($one, @two, $three) = threeVar();    # $three will not have any value, why?
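The question on this slide can be explored with a short experiment. In the sketch below, `three_values` is a hypothetical stand-in for threeVar that returns fixed numbers, and the other names are made up as well.

```perl
use strict;
use warnings;

sub three_values { return (1, 2, 3); }   # hypothetical stand-in for threeVar

my ($one, $two, $three) = three_values();
my @list = three_values();

# An array in the middle of a list assignment takes every remaining value,
# so nothing is left over for the scalar after it.
my ($head, @rest, $last);
($head, @rest, $last) = three_values();

# A one-element list on the left keeps the first value;
# assigning an array to a plain scalar gives the element count.
my ($first_only) = three_values();
my $count = @list;

print "head=$head rest=@rest last=", defined($last) ? $last : "undef", "\n";
print "first=$first_only count=$count\n";
```

If a function's results need an array in the middle, a common approach is to put the array last or to return a reference instead.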
Functions (cont.)
• Functions can't do much without parameters
• Parameters to a function are stored as a list in the @_ variable
• Example:
    sub say_hello_two {
        ($string) = @_;    # gets the value of the first parameter
                           # ($string = @_ would store the parameter count instead)
    }
• Invocation:
    say_hello_two("hello world!\n");
Functions (cont.)
• For example:
    sub add {
        ($left,$right) = @_;
        return $left + $right;
    }
    $three = add(1,2);
Functions (cont.)
• Variables are all global, even if they are first used within a function
• The my keyword makes a variable private to the scope in which it is declared
• For example:
    sub add {
        my($left,$right) = @_;
        return $left + $right;
    }
Functions (cont.)
    $three = add(1,2);    # $three gets the value of 3
    print "$left\n";      # prints an empty line: $left is private to add()
    print "$right\n";     # prints an empty line: $right is private to add()
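The effect of my can be seen by running a global and a private version of add next to each other. This is a sketch only: the names `add_global` and `add_private`, the `our` declaration and the initial values are assumptions made for the demonstration.

```perl
use strict;
use warnings;

our ($left, $right) = ("outer left", "outer right");   # package globals

sub add_global {
    ($left, $right) = @_;      # overwrites the globals
    return $left + $right;
}

sub add_private {
    my ($left, $right) = @_;   # private copies, gone when the sub returns
    return $left + $right;
}

my $sum1 = add_private(1, 2);
my $after_private = $left;     # the global is untouched

my $sum2 = add_global(1, 2);
my $after_global = $left;      # the global now holds 1

print "$sum1 $after_private\n";
print "$sum2 $after_global\n";
```

Running scripts under `use strict` forces every variable to be declared, which catches most accidental globals of this kind.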
Exercises
• A trim() function that removes leading and trailing spaces in a string
    Hint: use the s/// operator in conjunction with anchors
• A date() function that converts a date string "DD:MM:YY" to a form such as "13th of December, 2003"
    Hint: use a hash to create a lookup table for the month names
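One possible approach to both exercises is sketched below; other solutions are equally valid. It assumes that two-digit years fall in 2000 to 2099, treats all whitespace (not just spaces) as trimmable, and does no validation of the input. The `%month_name` table and the suffix logic are choices made for this sketch.

```perl
use strict;
use warnings;

# Remove leading and trailing whitespace using anchored substitutions.
sub trim {
    my ($text) = @_;
    $text =~ s/^\s+//;
    $text =~ s/\s+$//;
    return $text;
}

# Lookup table from month number to month name.
my %month_name = (
    1 => "January",   2  => "February", 3  => "March",    4  => "April",
    5 => "May",       6  => "June",     7  => "July",     8  => "August",
    9 => "September", 10 => "October",  11 => "November", 12 => "December",
);

# Convert "DD:MM:YY" to a form such as "13th of December, 2003".
# Assumption: two-digit years belong to 2000-2099.
sub date {
    my ($input) = @_;
    my ($day, $month, $year) = split(/:/, trim($input));
    $day   += 0;               # "03" becomes 3
    $month += 0;
    my $suffix = "th";
    unless ($day >= 11 && $day <= 13) {   # 11th, 12th, 13th are exceptions
        $suffix = "st" if $day % 10 == 1;
        $suffix = "nd" if $day % 10 == 2;
        $suffix = "rd" if $day % 10 == 3;
    }
    return sprintf("%d%s of %s, %d", $day, $suffix, $month_name{$month}, 2000 + $year);
}

print "[", trim("   hello world  "), "]\n";
print date("13:12:03"), "\n";
```

A fuller version would reject month numbers outside 1 to 12 and decide explicitly how to map two-digit years to a century.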
File I/O
• Filehandle
• Automatic filehandles: STDIN, STDOUT and STDERR
• Syntax:
    open(<handle name>,"(<|>|>>)filename");
    close(<handle name>);
• Example:
    open(INPUTFILE,"<inputs.txt");    # opens file handle
    …
    close(INPUTFILE);                 # closes file handle
File I/O (cont.)
• Opening a file does not always succeed
• Check the return value of the open function
• Example:
    if (open(INPUT,"<inputs.txt")) {
        …    # do something
    } else {
        print "File open failed\n";
    }
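A complete round trip might look like the sketch below. It swaps in the three-argument form of open with lexical filehandles (`my $out`, `my $in`) in place of the bareword handles on the slides, and the scratch file name is hypothetical; `$!` holds the system error message when open fails.

```perl
use strict;
use warnings;

my $path = "inputs_sketch_$$.txt";   # hypothetical scratch file; $$ is the process id

# Write two lines; "or die" stops the script with the system error in $!.
open(my $out, ">", $path) or die "Cannot write $path: $!";
print $out "first line\n", "second line\n";
close($out);

# Read the lines back one at a time.
open(my $in, "<", $path) or die "Cannot read $path: $!";
my @lines;
while (defined(my $line = <$in>)) {
    chomp($line);
    push(@lines, $line);
}
close($in);
unlink($path);                       # remove the scratch file

# A failed open returns false instead of giving a usable handle.
my $opened = open(my $missing, "<", "$path.missing") ? "yes" : "no";

print scalar(@lines), " lines read, last: $lines[-1]\n";
print "missing file opened: $opened\n";
```

Keeping the mode separate from the file name avoids surprises when a name happens to begin with "<" or ">".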