Perl OBJECTIVES • What is Perl • Concepts • Variables • Control Structures • Modules • Objects • Windows
Perl • Practical Extraction and Report Language • Originally designed as a text processing and “glue” language • Perl is a scripting language • Each invocation of a Perl script compiles then executes code • Uses a C-like syntax • Has object-oriented programming features • Highly portable between OS’s
Running Perl • On Unix • Typically set line 1 to #!/usr/bin/perl (wherever Perl is installed) • On Windows • Set file extension to • .pl for standard Perl • .pls for PerlScript (ActiveX scripting engine) • Run from the perl command line
Variables • Perl is not a strongly typed language; the contents of a variable are converted as necessary • The first character of a variable name (the sigil) indicates the type of the variable • $name • The name part of a variable can also be enclosed in { } • ${name} • @{$reference_to_array}
Variables - Scalar • A scalar represents a single value • Integer • Floating point • String • Reference • The data held by the variable is converted as necessary • Scalar names start with a $ • $name • As an lvalue • $name = "george burdell";
Variables - Arrays • An array is an ordered list of scalars • Arrays are indexed by a number, starting at 0 • Negative indices count backwards from the end of the array • The indexing operator is [ ] • An array name starts with @ • To refer to the full array (or a slice) • @names • @names[1,3,5] slice • @names[2 .. 6] slice • A single element of an array starts with $ • $names[4] • $names[$value]
Variables - Arrays • As an lvalue • $names[4] = 345; • @names = (1,2,3,4,5); • @names = 1 .. 5; • $last_value = $names[-1];
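The array forms on the two slides above can be tried together in a short sketch. The data values are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical data, chosen only for illustration.
my @names = qw(ann bob cy dee eve fay gus);

my $first     = $names[0];         # element access uses the $ sigil
my $last      = $names[-1];        # a negative index counts from the end
my @odd_slots = @names[1, 3, 5];   # a slice uses the @ sigil
my @range     = @names[2 .. 4];    # a slice over a range of indices

$names[4] = 'EVE';                 # an element used as an lvalue

print "first=$first last=$last\n";
print "slice=@odd_slots range=@range\n";
```

Both slices are copies taken before the assignment, so @range still holds the original value of element 4.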
Variables - Hashes • A hash, or associative array, is an unordered collection of scalars, each stored under a unique key • Hashes are indexed by strings • The indexing operator is { } • A hash name starts with % • To refer to the entire hash • %months • A single element of a hash starts with $ • $months{'Mar'} • $months{$some_string} • As an lvalue • $months{'Mar'} = 'March'; • %months = ('Jan' => 'January', 'Feb' => 'February');
Variables - Namespaces • Two types of namespace • Global • Lexical • Global variables are kept in symbol tables that are named and accessible • They are created in the context of a package (the default package is main) • They can be referenced from another package using $package::variable • Lexical variables are created and exist only in the context of a Perl block (normally a region enclosed with { })
Literals - Numeric • Numeric literals can take several formats • 12345 integer • 12345.67 floating point • 1.23e06 scientific notation • 1_234_567 underscores for legibility • 0123 octal • 0xffff hexadecimal • 0b101010 binary
Literals - String • There are several ways to quote a string • Substitution for variables in a string is known as interpolation • print "The value is $value\n"; • print 'The value is ',$value,"\n"; • Interpolation occurs for variables and backslash escapes, and only in double-quoted strings
Literals - String • Special additions to the character set • Backslash escape characters • \n newline • \r carriage return • \t tab • \033 character represented by octal 033 • \cX Control-X • \x{263a} Unicode character • \\ backslash • Translation escapes • \u force next character to uppercase • \l force next character to lowercase • \U force all following characters to uppercase • \L force all following characters to lowercase • \E end \U or \L switch
Literals - String • There is flexibility in choosing quotes • $string = qq[This method allows inclusion of ' and "]; • $string = qq{This method allows inclusion of ' and "}; • $string = qq/This method allows inclusion of ' and "/; • The following executes a command using the OS shell and returns its output as a string • $result = qx(ls); • Word list form does not require tedious quoting • @months = qw(January February March April);
Interpolation • Interpolation is the process of expanding a variable in a string literal written in the double-quoted (") form • Scalars are resolved in place; numeric values are converted to characters • Arrays are interpolated by joining all the elements of the array, separated by the value of the special $" variable • $" = '~'; • @months = qw(jan feb mar apr may jun); • $string = "The months are: @months"; • The months are: jan~feb~mar~apr~may~jun • An entire hash (%months) is not interpolated; only individual elements such as $months{jan} and hash slices are
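A minimal sketch of array and element interpolation, assuming a hypothetical %days hash that is not in the slides. Using local confines the change to $" to the enclosing block, which is a design choice of this sketch and not something the slide prescribes.

```perl
use strict;
use warnings;

my @months = qw(jan feb mar);
my %days   = (jan => 31);      # hypothetical hash, for illustration only

my $string;
{
    local $" = '~';            # list separator used when an array is interpolated
    $string = "The months are: @months";
}

my $default = "Default: @months";          # outside the block $" is a space again
my $element = "jan has $days{jan} days";   # a single hash element interpolates

print "$string\n$default\n$element\n";
```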
List Values • A list consists of values enclosed in ( ) and separated by commas • @array = (1,3,5,7,9,11); • In list context the above example loads the array with the values • In a scalar context, each value is evaluated and the last value is returned, $value == 11 below • $value = (1,3,5,7,9,11); • There is an important difference between a list and an array, when an array is evaluated in scalar context it returns its length, $length == 6 • $length = @array; • $length = scalar @array; • $length = @array + 0;
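The difference between an array in scalar context and in list context can be checked with a short sketch. The variable names are illustrative.

```perl
use strict;
use warnings;

my @array = (1, 3, 5, 7, 9, 11);

my $length  = @array;          # array in scalar context: number of elements
my $same    = scalar @array;   # explicit form of the same thing
my ($first) = @array;          # parentheses on the left give list context

print "length=$length same=$same first=$first\n";
```

The parentheses around $first make the assignment a list assignment, so the first element is copied and the rest are discarded.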
List Values • List interpolation • (@array1, @array2, 1) • Each element above is evaluated and inserted into the list that is generated • There are no lists of lists • Lists can be indexed using [ ] • ($day,$month,$year) = (localtime())[3..5]; • Lists may be used as lvalues (see above)
Context • Every operation in Perl is evaluated in one of two contexts: scalar or list • Assignment to a scalar lvalue will cause the right side to be evaluated in scalar context • Assignment to an array, hash, or slice lvalue will cause the right side to be evaluated in list context • Assignment to a list on the left will cause the right side to be evaluated in list context • Use the scalar function to force evaluation in scalar context • Some operations return different values depending on the context in which they are evaluated • $matched = m/([^,]+)/; true or false in scalar context • @captures = m/([^,]+)/; the captured substrings in list context
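As a sketch of context-dependent results, the same match can be evaluated in both contexts. The input string is hypothetical, and the /g modifier and the "() =" counting idiom are additions that the slide does not mention.

```perl
use strict;
use warnings;

my $line = 'red,green,blue';   # hypothetical input

my $matched = $line =~ m/([^,]+)/;         # scalar context: true or false
my @fields  = $line =~ m/([^,]+)/g;        # list context with /g: every capture
my $count   = () = $line =~ m/([^,]+)/g;   # list assignment evaluated in scalar context

print "matched=$matched fields=@fields count=$count\n";
```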
Arrays and Context • An array when referenced using @ operates in a list context • An array element operates in a scalar context • When a list is assigned to an array each value is inserted into the next element • Special forms of arrays • $length = scalar @array; (scalar not required here) • $last_index = $#array; • scalar @array == $#array + 1 (an identity)
Hashes and Context • A hash when referenced in the % form operates in list context • A hash element operates in a scalar context • When a list is assigned to a hash each pair of values in the list is taken as a key-value pair • %colors = ('red',0xff0000,'green',0x00ff00,'blue',0x0000ff); • The => operator is a comma that also quotes a bareword on its left, which makes the pairs easier to read • %colors = (red => 0xff0000, green => 0x00ff00, blue => 0x0000ff); • Use the keys function to generate a list of keys for a hash • To find the number of keys in a particular hash • $number_of_keys = scalar keys %hash;
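The hash operations above can be sketched as follows. The use of exists and sort is an addition for illustration; sort is used because the order returned by keys is not defined.

```perl
use strict;
use warnings;

my %months = (Jan => 'January', Feb => 'February');
$months{Mar} = 'March';                        # a hash element as an lvalue

my $count   = keys %months;                    # keys in scalar context: number of keys
my $has_apr = exists $months{Apr} ? 1 : 0;     # test for a key without creating it

for my $abbr (sort keys %months) {
    print "$abbr => $months{$abbr}\n";
}
```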
Filehandles and Input • A filehandle refers to a file • Filehandles are, by convention, all upper case • STDIN, STDOUT, STDERR are predefined • Use the <> operator to read from a filehandle • $line = <STDIN>; read one line from STDIN • @lines = <STDIN>; read all lines from STDIN • Read and print the entire input (files named on the command line, or STDIN if there are none) • while(<>) { print; } • Each line is read into the special variable $_, which is used implicitly by both the <> and print commands
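A minimal sketch of reading line by line. To keep it self-contained it assumes an in-memory file (a reference to a string passed to open) and a lexical filehandle $fh instead of the upper-case bareword handles described on the slide; the input text is hypothetical.

```perl
use strict;
use warnings;

my $text = "alice:x:1\nbob:x:2\n";   # hypothetical input
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my @users;
while (my $line = <$fh>) {           # scalar context: one line per iteration
    chomp $line;                     # remove the trailing newline
    my @fields = split /:/, $line;
    push @users, $fields[0];
}
close $fh;

print "users=@users\n";
```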
Operators • Operator precedence • Operators can be overloaded when using objects
Simple Statements • A simple statement is an expression that is evaluated • A simple statement is terminated with a ; • A simple statement may be followed by a modifier • if expr • unless expr • while expr • until expr • foreach list • Examples • print "Value is $i\n" if $i > 5; • print "i=", $i--, "\n" while $i != 0; (operators such as -- are not evaluated inside a string, so the decrement must sit outside the quotes)
Compound Statements • Expressions containing blocks • A block is normally contained in { } • if statement • if (expr) block • if (expr) block else block • if (expr) block elsif (expr) block • if (expr) block elsif (expr) block else block • The unless statement is similar, with the sense of the test reversed; the two fragments below are equivalent
    $i = $max;
    unless ($i == $max) { $i++; }
    else { print "The max is five\n"; exit; }

    $i = $max;
    if ($i == $max) { print "The max is five\n"; exit; }
    else { $i++; }
Compound Statements • while statement • label while (expr) block • label while (expr) block continue block • until statement • label until (expr) block • label until (expr) block continue block • The label is optional • The continue block is executed after each iteration, before the condition is tested again
    while (<STDIN>) {
        chomp;
        @fields = split(/:/);
        print "Field 1: $fields[0]\n";
    }
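Compound Statements • for loop • label for (expr1 ; expr2 ; expr3) block • expr1 initialization, evaluated once before the loop • expr2 condition, tested before each iteration; the loop ends when it is false • expr3 evaluated after each iteration
    for (my $i = 0; $i < 10; $i++) {
        print "i=$i\n";
    }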
Compound Statements • foreach statement • label foreach (list) block • label foreach var (list) block • label foreach var (list) block continue block • Loops over each entry in the list • When var is omitted, $_ is used
    foreach my $key (sort keys %people) {
        print "Key: $key, Value=$people{$key}\n";
    }
    foreach my $entry (@items) {
        print "Item: $entry\n";
    }
Compound Statements • Labeled block • label block • label block continue block • Equivalent to a single iteration loop • Can be used with last, next, and redo
Loop Control • These statements can be used with blocks • The optional label further refines their effect • last label • Exit the loop (block) • The continue block is not executed • next label • Skip the rest of this iteration and start the next iteration • Execute the continue block before the next iteration begins • redo label • Restart the loop with the current iteration parameters • The continue block is not executed • The label parameter enables multi-level block control
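A short sketch of multi-level control with a label. The label name ROW and the data are hypothetical.

```perl
use strict;
use warnings;

my @rows = ([1, 2, 3], [4, -1, 6], [7, 8, 9]);   # hypothetical data
my @seen;

ROW: for my $row (@rows) {
    for my $cell (@$row) {
        next ROW if $cell < 0;    # abandon this row and continue with the next one
        last ROW if $cell == 8;   # leave both loops entirely
        push @seen, $cell;
    }
}

print "seen=@seen\n";
```

Without the label, next and last would act only on the inner loop.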
Declarations • Subroutine declaration is a global declaration • A subroutine must be declared before it can be called without parentheses • sub count; • A subroutine can be defined at declaration • sub count { … } • Pragmas are directives to the Perl compiler • use strict; • use integer; • use warnings; • use English;
Declarations • Variable declarations • Lexically scoped declarations • my $var; • my ($var1, $var2); • my $value = function(); • Lexically scoped global declarations • our $var; • Dynamically scoped global declarations • local $var;
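The three kinds of declaration behave differently when another subroutine is called. In this sketch the subroutine names are hypothetical; show() always reads the package variable.

```perl
use strict;
use warnings;

our $level = 'global';           # package variable, also visible as $main::level

sub show { return $level }       # reads the package variable

sub with_local {
    local $level = 'temporary';  # dynamic scope: seen by subs called from here
    return show();
}

sub with_my {
    my $level = 'lexical';       # lexical scope: invisible to show()
    return show();
}

my @results = (with_local(), with_my(), show());
print "@results\n";
```

After with_local() returns, the previous value of the package variable is restored automatically.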
Pattern Matching • Regular Expressions • Rule based pattern matching mechanism • Simple patterns • m/Class/ • Complex pattern • m/AE[0-9]+[A-Z]/
Regular Expressions • Meta-characters • \ | ( ) [ { ^ $ * + ? . • Have special meanings inside patterns • \ is the escape character; it makes a meta-character stand for itself in a pattern, e.g., \\ or \. • Quantifiers • * + ? {3} {2,5} • Quantifiers normally match as much text as possible (greedy) • Append ? to a quantifier to match as little text as possible, e.g., *? or +? • Character classes • [ ] or [^ ] • Grouping • ( )
Regular Expressions • The pattern matching operators • m// match • s/// substitute • tr/// transliterate • Binding operators • =~ binds a string to a pattern operator • !~ does the same and negates the result • Examples • $string =~ m/AE[0-9]{4}[A-Z]/; • $string =~ s/old/new/; • $string =~ s(old)(new); can use arbitrary delimiters • $string =~ s'old'new'; single-quote delimiters suppress interpolation
Regular Expressions • Maximal and Minimal matches • "exasperate" =~ m/e(.*)e/ • Captures "xasperat" in $1 • "exasperate" =~ m/e(.*?)e/ • Captures "xasp" in $1
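The two matches above can be run directly. In list context a match returns its captures, so each result is assigned to a parenthesised variable.

```perl
use strict;
use warnings;

my $word = 'exasperate';

my ($greedy)  = $word =~ m/e(.*)e/;    # .* takes as much as it can
my ($minimal) = $word =~ m/e(.*?)e/;   # .*? takes as little as it can

print "greedy=$greedy minimal=$minimal\n";
```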
Functions • There are many built-in functions • Can be used with or without parentheses around arguments • With parentheses it will be parsed as a function • Without parentheses it will be parsed as a prefix operator, preferred • Use the -w switch on the #!/usr/bin/perl -w line to flag when it is being parsed as a function • Example • print 1+2*4; # prints 9 • print (1+2)*4; # prints 3 • For details see the Perl documentation or the Camel book • Users may define functions • sub name { code } • User functions are called with parentheses around arguments
Functions - Arguments • Arguments are passed to functions in the built-in array @_ • The elements of @_ can be accessed by any of several techniques; each line below is an alternative definition
    sub func { my $nargs = @_; my $arg1 = shift; my @rest = @_; }   # count, first argument, remainder
    sub func { my $arg1 = shift; my $arg2 = shift; }                # shift one argument at a time
    sub func { my $arg1 = $_[0]; my $arg2 = $_[1]; }                # index @_ directly
    sub func { my $arg1 = shift; my @rest = @_; }                   # first argument, then the rest
• shift is a built-in function that removes and returns the first element of an array, moving the remaining elements down • shift works on the front of the array in the way pop works on the end
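A runnable sketch of two of these techniques. The function names join_with and area are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: joins the remaining arguments using the first one.
sub join_with {
    my $separator = shift;     # shift defaults to @_ inside a sub
    my @items     = @_;        # whatever is left after the shift
    return join $separator, @items;
}

# Hypothetical helper: unpacks @_ in a single list assignment.
sub area {
    my ($width, $height) = @_;
    return $width * $height;
}

my $joined = join_with('-', 'a', 'b', 'c');
my $area   = area(3, 4);
print "$joined $area\n";
```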
eval Function • The eval function is normally used to trap runtime errors • The eval function has two forms • eval block • Will execute the code enclosed by the block • eval expr • Compiles and executes the code in expr • The code in expr can be dynamically created • The special variable $@ contains the result of execution • $@ is set to the error message if there is an error • $@ is set to an empty string if there is no error
    eval { … };         # execute block of code
    if ($@) { … }       # handle error
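A minimal sketch of the block form, assuming a hypothetical divide function that calls die on bad input.

```perl
use strict;
use warnings;

# Hypothetical function that dies on bad input.
sub divide {
    my ($n, $d) = @_;
    die "division by zero\n" if $d == 0;
    return $n / $d;
}

my $ok       = eval { divide(10, 2) };   # block form: returns the value of the block
my $ok_error = $@;                       # empty string on success

my $bad       = eval { divide(1, 0) };   # undef, because the block died
my $bad_error = $@;                      # holds the message passed to die

print "ok=$ok error=$bad_error";
```

Copying $@ into a variable straight after the eval protects the message from being overwritten by a later eval.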
References • A reference in Perl is a scalar that contains a pointer to some data in memory • Perl has two types: symbolic and hard • Symbolic: the scalar contains the name of another variable • Hard: the scalar contains the address of the data • Use the $ prefix to dereference a reference • $ref is the scalar that contains the reference • $$ref # dereference • ${$ref} # dereference • Hard references are generally more common
References • The \ (backslash) operator is used to create a hard reference • $ref = \$sample • In this example $ref refers to $sample; $$ref and $sample name the same location in memory • Use $$ref to refer to that memory location: $$ref == $sample and ${$ref} == $sample • $ref = \@array • In this example $ref refers to @array • To access an array element: $$ref[1] or ${$ref}[1] or $ref->[1] • To access the array: @$ref or @{$ref}
Data Structures • References are useful in accessing anonymous data structures • Anonymous array • [ element1, element2, … , elementN ] • $ref = [0,1,2,3,4]; • $$ref[0] or ${$ref}[0] or $ref->[0] • Anonymous hash • { key1=>element1, key2=>element2, … , keyN=>elementN } • $ref = { Jan=>1, Feb=>2, Mar=>3, Apr=>4 }; • $$ref{Jan} or ${$ref}{Jan} or $ref->{Jan} • The -> operator is syntactic shorthand that removes the extra $ dereference
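A short sketch combining a reference to a named array with a reference to an anonymous hash. The values are illustrative.

```perl
use strict;
use warnings;

my @array = (10, 20, 30);
my $aref  = \@array;                  # hard reference to a named array
my $href  = { Jan => 1, Feb => 2 };   # reference to an anonymous hash

$aref->[1] = 25;                      # changes @array through the reference
my $second = $array[1];
my $size   = @$aref;                  # whole array, here in scalar context
my $feb    = $href->{Feb};

print "second=$second size=$size feb=$feb\n";
```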
Data Structures • Creating arbitrarily complex data structures is relatively easy using references • Create any number of anonymous structures placing their address into a scalar (reference) • Store the resulting scalars into other structures
Arrays of Arrays • An array of arrays is how to create a multi-dimensional array in Perl • In each cell of one array save a reference to another array • There is no requirement that each secondary array be the same length • The first fragment keeps the outer array behind a reference, the second in a named array
    my $array_ref;
    for (my $i = 0; $i < 4; $i++) {
        my $ref;
        for (my $j = $i; $j < $i + 4; $j++) { push @{$ref}, $j; }
        $array_ref->[$i] = $ref;
    }
    print $array_ref->[0]->[0], "\n";

    my @array;
    for (my $i = 0; $i < 4; $i++) {
        my $ref;
        for (my $j = $i; $j < $i + 4; $j++) { push @{$ref}, $j; }
        $array[$i] = $ref;
    }
    print $array[0]->[0], "\n";
Hash of Arrays • In each cell of a hash table save a reference to an array • The order in which keys returns the months is not guaranteed
    my %months = (Jan => [1..31], Feb => [1..28]);
    $" = ', ';
    foreach my $month (keys %months) {
        print "$month: @{$months{$month}}\n";
    }
    Jan: 1, 2, 3, 4, … 27, 28, 29, 30, 31
    Feb: 1, 2, 3, 4, … 27, 28
Complex Structures • Data structures can be created to any level of complexity • Can mix all types to any depth • Arrays of hashes of hashes of arrays • Hashes containing references to user defined functions • &{$func_list{$member}}(…arguments…)
    sub startup  { print "Startup\n"; }
    sub shutdown { my $code = shift; print "Shutdown: $code\n"; }
    my %func_list = (Startup => \&startup, Shutdown => \&shutdown);
    &{$func_list{Shutdown}}(99);
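A hash of code references is often used as a dispatch table. The sketch below assumes anonymous subroutines and a hypothetical @log array so that the calls can be inspected; unknown names are reported instead of being called.

```perl
use strict;
use warnings;

my @log;   # collects messages so the calls can be inspected

my %dispatch = (
    start => sub { push @log, 'start' },
    stop  => sub { my $code = shift; push @log, "stop:$code" },
);

for my $request (['start'], ['stop', 99], ['bogus']) {
    my ($name, @args) = @$request;
    if (my $handler = $dispatch{$name}) {
        $handler->(@args);             # same as &{$handler}(@args)
    } else {
        push @log, "unknown:$name";
    }
}

print "@log\n";
```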
Packages • A package is the way to isolate code in its own namespace • This is particularly useful for re-usable code (libraries) • As generally used, the scope of a package declaration is the file in which it appears • Usually package is the first line of a file that is processed by require or use • To refer to a variable in another package use $package::variable • The default package is main, $main::variable or $::variable
Modules • The module is the basic unit of re-usable Perl code • Module files end with the .pm file extension • Modules come in two forms • Traditional: functions and variables • Object-Oriented: methods and properties • Modules are accessed with the use keyword • use Module; • A module file contains a package declaration with the same name as the file • A module may export a list of functions and variables to the namespace that contains the use statement (do not export OO methods)
Modules • Module names should begin with a capital letter; the file name ends with .pm • The last statement of a module must return a true value, conventionally 1;
    File Sample.pm:
        package Sample;
        sub func1 { }
        sub func2 { }
        1;
    Calling script:
        use Sample;
        my $result = Sample::func1();
Modules • Beyond the simple form there is additional support for modules • The Exporter module can be used to place selected symbols into the Perl code that uses the module • There is a version checking mechanism • There is an autoload feature
    File Sample.pm:
        package Sample;
        require Exporter;
        our @ISA = qw(Exporter);
        our @EXPORT = qw(func1 func2);
        sub func1 { }
        sub func2 { }
        1;
    Calling script:
        use Sample;
        my $result = func1();
Objects • The module forms the basis of the Object Oriented features of Perl • The package name is the class name (type) • The function definitions in the module are the methods • A class may inherit methods from parent classes • A class may be sub-classed • Perl classes inherit methods not data • An object is a reference to an instance of a class • All Perl classes are sub-classes of the UNIVERSAL class
Objects – Method Invocation • Assume a class named Sample with an instance named $instance • Invoking a class method • Sample->class_method(…arguments…); • Invoking an instance method • $instance->instance_method(… arguments…); • The first argument of a method invocation is hidden and is either the class name (class method) or a reference to an object (instance method) • Methods can override super class methods
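A minimal sketch of a class with one class method and one instance method. The class name Counter and its methods are hypothetical, and the built-in bless function, which the slides do not cover, is assumed here as the way to tie a reference to a class.

```perl
use strict;
use warnings;

package Counter;   # hypothetical class name

# Class method: the hidden first argument is the class name.
sub new {
    my ($class, %args) = @_;
    my $self = { count => exists $args{start} ? $args{start} : 0 };
    return bless $self, $class;   # the reference now belongs to the class
}

# Instance method: the hidden first argument is the object.
sub increment {
    my $self = shift;
    return ++$self->{count};
}

package main;

my $counter = Counter->new(start => 5);
$counter->increment;
my $value = $counter->increment;
my $class = ref $counter;          # name of the class the object belongs to

print "$class $value\n";
```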