Intermediate Perl by Benjamin J. Lynch blynch@msi.umn.edu
Introduction • Perl is a powerful interpreted language that takes very little knowledge to get started. It can be used to automate many research tasks with little effort. • The greatest strength and weakness of Perl is that the same task can be accomplished with two very different pieces of code.
Outline • Review of Perl • Variable types • Context • Operators • Control structure • Pattern Matching • Subroutines • Context • References • grep • map • modules
When should I use Perl? • Perl stands for: Practical Extraction and Report Language • Perl is most useful for: • parsing files to extract desired data • doing almost anything you can do in a shell script • CGI scripts to generate HTML for web pages • updating or retrieving information from databases • acting as an interface between programs
Programming Style • Questions you should ask: • Who else might look at the code? • Co-workers? • Complete strangers? • How often will the code be modified? • Remember your target audience • There is no substitute for comments
An Interpreted Language • Perl programs are also called Perl scripts because Perl is an interpreted language. • When you execute a Perl script, the script is first compiled into a set of instructions (a parse tree) • This set of instructions is then run by the Perl interpreter • The Perl interpreter has many similarities with the virtual machine in Java • There is no need to compile a Perl script as a separate preliminary step, making Perl scripts similar to shell scripts (at least on the exterior).
A Simple Perl Script
    #!/usr/bin/perl
    print "Hello world! \n";
Running it:
    blynch@msi[~] % chmod +x hello.pl
    blynch@msi[~] % ./hello.pl
    Hello world!
    blynch@msi[~] %
• \n is a newline.
• The routine print will print the item or list of items that follows.
Variable Types • Scalar • Reference (scalar pointer to another variable) • List (array) • Hash (associative array)
Scalars
• Examples:
    $var = '3';
    $name = "Larry";
    $float = 1.1235813;
    $sum = $a + 1.2;
A scalar is a single value.
    $number = 1;
    $text = 'Hello world!';
    $a = 1.2;
    $b = 1.3;
    $sum = $a + $b;
    print "$sum \n";
Output: 2.5
• The scalar can be: an integer, a 64-bit floating point number, a string, or a reference
• The way that the data is stored (integer, floating point, ...) does not need to be specified. The Perl interpreter will determine it automatically.
Lists
• A list (or array) of values can be specified like:
    @number_list = (1,1,2,3,5,8,13,21);
    @grocery_list = ('apples','chicken','canned soup');
• The name of an array variable always starts with @
Lists (arrays)
    @mylist = (1,2,2,3,4,4,4);
    @names = ('Larry', ' Moe');
    push(@names, ' Curly');    # adds an item to the end of a list
    print @names;
Output: Larry Moe Curly
Lists
• A list (or array) of values
    @grocery_list = ('apples','chicken','canned soup');
    print $grocery_list[2];
Output: canned soup
• Note the numbering of elements: the first element has index 0.
• A '$' is used in the print statement because of the context. We only want print to handle a single value from the array, so we use $ to denote a scalar.
Hashes (associative arrays)
• A hash is an associative array
• Instead of using an integer index, a hash uses a key to access elements of the hash
    %lunch = ('monday' => 'pizza',
              'tuesday' => 'burritos',
              'wednesday' => 'sandwich',
              'thursday' => 'fish');
    print "on Tuesday I'll eat $lunch{'tuesday'}";
Output: on Tuesday I'll eat burritos
Hashes (associative arrays)
• A hash can be created with a list of key/value pairs.
• Each key has one value associated with it.
    %hash = ('Larry' => 1, 'Moe' => 2, 'Curly' => 3);
    %hash = ('Larry' , 1, 'Moe' , 2, 'Curly' , 3);
• Either form works to specify a hash
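A minimal sketch of walking through the lunch hash from the slides, assuming you want the entries in a predictable order. The variable names @days and $has_friday are illustrative and do not come from the original deck.

```perl
use strict;
use warnings;

# Same lunch menu as in the slides.
my %lunch = (
    monday    => 'pizza',
    tuesday   => 'burritos',
    wednesday => 'sandwich',
    thursday  => 'fish',
);

# keys returns the keys in no particular order, so sort them for stable output.
my @days = sort keys %lunch;
foreach my $day (@days) {
    print "$day: $lunch{$day}\n";
}

# exists checks for a key without creating it.
my $has_friday = exists $lunch{friday} ? 1 : 0;
print "No lunch planned for Friday\n" unless $has_friday;
```

Sorting the keys is a choice made here for readable output; a hash itself has no defined order.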
Variable Context
    @number_list = (1,1,2,3,5,8,13,21);
    @grocery_list = ('apples','chicken','canned soup');
    print @grocery_list;
Output: appleschickencanned soup
• If we use the array (or list) context, the print command will print out all elements from the array.
Variable Context
    @number_list = (1,1,2,3,5,8,13,21);
    @grocery_list = ('apples','chicken','canned soup');
    print $grocery_list[1];
Output: chicken
• If we use the scalar context, we must specify the element we want to print from the list.
Variable Context
    @grocery_list = ('apples','chicken','canned soup');
    $var = @grocery_list;
    print $var;
Output: 3
• If we request a scalar from an array, the array will return its length.
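The three context slides above can be summarized in one short sketch. The names $count, $first and $last_index are chosen for this illustration only.

```perl
use strict;
use warnings;

my @grocery_list = ('apples', 'chicken', 'canned soup');

my $count      = @grocery_list;     # scalar context: number of elements
my ($first)    = @grocery_list;     # list context: first element
my $last_index = $#grocery_list;    # index of the last element

# scalar() forces scalar context where a list would otherwise be expected.
print "Items: ", scalar(@grocery_list), "\n";
print "First: $first, last: $grocery_list[$last_index]\n";
```

The parentheses around $first are what switch the assignment from scalar to list context.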
Perl Operators
    $mass * $height        multiplication
    $a + $b                addition
    $a - $b                subtraction
    $a / $b                division
    $str1 . $str2          concatenation
    $count++               increment $count by 1
    $missing--             decrease $missing by 1
    $total += $subtotal    increase $total by $subtotal
    $interest *= $factor   set $interest to $interest * $factor
    $string .= $more       append $more to the end of $string
rand
• rand($num) returns a random, double-precision floating-point number greater than or equal to 0 and less than $num.
    $var = rand(4);
Control structure
    #!/usr/bin/perl
    @my_grocery_list = ('apples','chicken','canned soup');
    foreach $item (@my_grocery_list) {
        &purchase($item);
    }
    while ($condition) {    # repeats as long as $condition is true
        &do_this;
    }
Control structure
• Two ways to if/then
    if ($condition) { print "It is true \n" }
    print "It is true \n" if $condition;
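The loops and tests on the two control-structure slides can be put together in a runnable sketch. The purchase routine is only named in the slides, so the body below is a hypothetical stand-in, and @purchased and $remaining are assumptions made for this example.

```perl
use strict;
use warnings;

my @my_grocery_list = ('apples', 'chicken', 'canned soup');
my @purchased;

# Hypothetical stand-in for the purchase routine named in the slides.
sub purchase {
    my ($item) = @_;
    push @purchased, $item;
    return;
}

foreach my $item (@my_grocery_list) {
    purchase($item);
}

# while repeats as long as its condition is true.
my $remaining = @purchased;
while ($remaining > 0) {
    $remaining--;
    print "Unpacked $purchased[$remaining]\n";
}

# Block form and statement-modifier form of the same test.
if ($remaining == 0) { print "Bags are empty\n" }
print "Bags are empty\n" if $remaining == 0;
```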
Retrieving a random element from a list
    @greeting = ('Hello','Greetings','Hola','Howdy');
    print $greeting[rand @greeting];
• For one possible random value, Perl evaluates the print statement step by step like this:
    print $greeting[rand 4];
    print $greeting[2.59196266661263168];
    print $greeting[2];
    print 'Hola';
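The same idea as a small sketch with the truncation written out. The slide relies on Perl truncating a fractional index automatically; calling int() here is a stylistic choice for this example, and $index and $pick are illustrative names.

```perl
use strict;
use warnings;

my @greeting = ('Hello', 'Greetings', 'Hola', 'Howdy');

# rand evaluates @greeting in scalar context (4) and returns a
# floating-point number in [0, 4); int() drops the fractional part.
my $index = int(rand(@greeting));
my $pick  = $greeting[$index];
print "$pick\n";
```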
Subroutines
• Defined like this:
    sub my_sub_name {
        # do something
    }
• Used like this:
    &my_sub_name(variables passed);
Subroutines
• Variables passed to a subroutine enter the routine as a single list
    @list1 = ('a ','b ','d ');
    $scalar = 42;
    &mysub(@list1, $scalar);
    sub mysub {
        print @_;
    }
Output: a b d 42
Returning values from subroutines
• Subroutines return whatever is returned by the return statement, or else the last item evaluated in the subroutine.
    @list1 = (2,3);
    print &mymult(@list1);
    sub mymult {
        $product = $_[0]*$_[1];
        return $product;
    }
Output: 6
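A sketch of the same multiplication routine written with named lexical variables, which is a common way to unpack @_. The count_args helper is hypothetical and only demonstrates that arrays are flattened into a single argument list.

```perl
use strict;
use warnings;

# Copy the arguments out of @_ into named lexical variables.
sub mymult {
    my ($x, $y) = @_;
    return $x * $y;
}

# Arrays are flattened into @_, so the sub cannot tell where one ends.
sub count_args {
    return scalar @_;
}

my @list1   = (2, 3);
my $product = mymult(@list1);
my $nargs   = count_args(@list1, 42);

print "$product\n";
print "$nargs\n";
```

Using my inside the subroutine keeps $x and $y private, unlike the global $product in the slide version.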
Pattern Matching
• Perl uses a very robust pattern matching syntax
• The most basic pattern match looks like:
    $string =~ /some pattern/
    $string = ' 1 2 three';
    if ($string =~ /2/) {
        print "the number 2 is in the string\n";
    }
• In Perl, every value is considered TRUE except undef, the empty string '', the number 0, the string '0', and the empty list.
Pattern Matching
    $, = "\n";
    $string = "1 2 hello 2 5";
    $matching = ($string =~ /\d/);
    print $matching;
Output: 1
    @matches = ($string =~ /\d/g);
    print @matches;
Output:
    1
    2
    2
    5
• g is for global. This will allow the pattern to be matched multiple times.
• \d will match any single digit
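Pattern Matching
    $, = "\n";
    $string = '1.45 1.482 1.938 other text 10.2849';
    print ($string =~ /\d.\d+/g);
Output:
    1.45
    1.482
    1.938
    0.2849
• The pattern allows only one digit before the dot, and the unescaped . matches any character, so the leading 1 of 10.2849 is not part of the last match.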
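To compare the slide's pattern with a stricter one, here is a short sketch. The pattern /\d+\.\d+/ is one possible fix suggested for illustration, not something taken from the deck, and @loose and @strict are illustrative names.

```perl
use strict;
use warnings;

my $string = '1.45 1.482 1.938 other text 10.2849';

# Pattern from the slide: one digit, any character, then digits.
my @loose = ($string =~ /\d.\d+/g);

# Escaped dot with \d+ on both sides: matches whole decimal numbers.
my @strict = ($string =~ /\d+\.\d+/g);

print join(', ', @loose), "\n";
print join(', ', @strict), "\n";
```

The first line printed ends in 0.2849 and the second in 10.2849.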
Pattern matching
    /pattern/
    /(sub-expression1)(sub-expression2)/
    \d                  a digit
    \s                  whitespace
    \S                  non-whitespace
    pattern{2}          will match pattern exactly twice
    [character list]    defined character class, e.g. [abcDEF]
    [^b]                NOT 'b'
    |                   OR statement: it will match the pattern on either side
/(bb|[^b]{2})/ This is written on a T-Shirt I own
/(bb|[^b]{2})/
    bb      the literal characters bb
    |       OR statement
    [^b]    new character class: NOT 'b'
    {2}     we want 2 of them
Pattern Matching
• Sample program output to parse:
------------------------------------------------
Charge Models 2 and 3 (CM2 and CM3) and Solvation Model SM5.42
GAMESSPLUS version 4.3
------------------------------------------------
Gas-phase
------------------------------------------------
 Center   Atomic    CM3      RLPA     Lowdin
 Number   Number    Charge   Charge   Charge
------------------------------------------------
   1        3       .218    -1.090    -.938
Gas-phase dipole moment (Debye)
------------------------------------------------
             X        Y        Z      Total
 CM3       -.718    -.592   -1.748    1.980
 RLPA      -.327    1.122    -.840    1.440
 Lowdin    -.116    1.662    -.761    1.832
------------------------------------------------
Pattern Matching
• Four increasingly compact ways to capture the total CM3 dipole moment (the fourth number on the CM3 line):
    if (/ CM3\s+([-]?\d*\.\d+)\s*([-]?\d*\.\d+)\s*([-]?\d*\.\d+)\s*([-]?\d*\.\d+)\s*/) {
        $amsol[9]=$4;
    }
    if (/ CM3\s+([-]?\d*\.\d+\s*){3}([-]?\d*\.\d+)\s*/) {
        $amsol[9]=$2;
    }
    if (/ CM3\s+(-?\d*\.\d+\s*){3}(-?\d*\.\d+)\s*/) {
        $amsol[9]=$2;
    }
    if (/ CM3(\s+\S+){3}\s+(\S+)/) {
        $amsol[9]=$2;
    }
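A runnable sketch of the shortest pattern applied to the dipole-moment rows from the sample output. The exact column spacing in @lines is an assumption, since the original file layout is not shown, and $cm3_total is an illustrative name. A non-capturing group (?:...) is used for the skipped fields, so the value ends up in $1 rather than $2 as in the slide.

```perl
use strict;
use warnings;

# Three rows from the dipole-moment table (column spacing is assumed).
my @lines = (
    ' CM3       -.718    -.592   -1.748    1.980',
    ' RLPA      -.327    1.122    -.840    1.440',
    ' Lowdin    -.116    1.662    -.761    1.832',
);

my $cm3_total;
foreach (@lines) {
    # Skip three whitespace-separated fields, then capture the fourth.
    if (/ CM3(?:\s+\S+){3}\s+(\S+)/) {
        $cm3_total = $1;
    }
}
print "CM3 total dipole: $cm3_total\n";
```

Matching on \S+ is less strict than the number patterns in the earlier versions: it accepts any non-whitespace field, which is usually fine when the file format is known.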
Substitutions
    s/search pattern/replace/
    $string = 'words9words383words';
    $string =~ s/\d+/, /g;
    print $string;
Output: words, words, words
Special Variables
• $1, $2, $3, …
• Hold the contents of the most recent sub-patterns matched
    if ($string =~ /(Larry) (Moe) Curly/) {
        print $2;
    }
Output: Moe
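Captured sub-patterns can also be assigned directly to variables. The sketch below assumes a $string containing the three names from the slide; $first, $second and $third are illustrative names.

```perl
use strict;
use warnings;

my $string = 'Larry Moe Curly';

# In list context a match returns the captured groups directly.
my ($first, $second) = $string =~ /(\w+) (\w+) Curly/;

# $1, $2, ... keep their old values after a failed match, so only read
# them when the match succeeded.
my $third = 'unknown';
if ($string =~ /Moe (\w+)/) {
    $third = $1;
}
print "$first, $second, $third\n";
```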
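Special Variables
• $[
• Determines which index in a list is the first, the default is 0.
    my @mylist = ('Larry', 'Moe', 'Curly');
    print $mylist[1];
    $[ = 1;
    print $mylist[1];
Output: Moe Larry
• Changing $[ is deprecated, and assigning a non-zero value to it is a fatal error in Perl 5.30 and later.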
Special Variables • $& • The entire string matched by the most recent pattern match
Special Variables
• $/
• Input record separator, default is \n
    undef $/;
    open(FILE, '<', 'input.txt');
    $buffer = <FILE>;
• $buffer contains the entire file
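A sketch of reading a whole file at once while limiting the change to $/ to a small block. An in-memory string stands in for input.txt so that the example runs without any file on disk; $text, $fh and $line_count are assumptions made for this illustration.

```perl
use strict;
use warnings;

# In-memory text stands in for input.txt so the sketch runs without a file.
my $text = "line one\nline two\nline three\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my $buffer;
{
    local $/;           # undef inside this block only
    $buffer = <$fh>;    # reads everything up to end of file
}
close($fh);

# Count the newlines to confirm that all lines arrived in one read.
my $line_count = () = $buffer =~ /\n/g;
print "Read ", length($buffer), " characters in $line_count lines\n";
```

Using local instead of undef restores the default separator when the block ends, so later reads go back to line-by-line behavior.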
Special Variables • $. • Current line number of the filehandle that was last read
Special Variables
• $,
• Default separator used when a list is printed, default is ''
• $, = ' '; will add a space between each item if you print out a list.
• $\
• Output record separator, default is ''
• $\ = "\n"; will add a newline after each print statement.
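The effect of the two separators can be checked with a small sketch. Printing into an in-memory filehandle is a choice made here so that the result can be inspected; $out and $fh are illustrative names.

```perl
use strict;
use warnings;

my @names = ('Larry', 'Moe', 'Curly');

# Print into an in-memory filehandle so the result can be inspected.
my $out = '';
open(my $fh, '>', \$out) or die "Cannot open in-memory file: $!";
{
    local $, = ', ';    # placed between the items of a printed list
    local $\ = "\n";    # appended after every print
    print {$fh} @names;
}
close($fh);

print $out;
```

Setting both variables with local keeps the change from leaking into the rest of the program.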
Special Variables • $^T time at which the Perl program started, in seconds since the epoch • $| autoflush • $$ process ID number for Perl • $0 name of the Perl script executed • %ENV hash containing environment variables.
Special Variables • @ARGV is a list that holds all the arguments passed to the Perl script. • @_ is a list of all the variables passed to the current subroutine
Special Variables
• $_ is a variable that holds the current topic, e.g.:
    while (<FILE1>) {
        print "line $. $_";
    }
• $. is the current line number
• $_ is the current line being processed in FILE1
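A runnable sketch of the same loop, assuming an in-memory string in place of FILE1 so that no input file is needed. The names $text, $fh and @numbered are illustrative.

```perl
use strict;
use warnings;

my $text = "alpha\nbeta\ngamma\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my @numbered;
while (<$fh>) {
    chomp;                                      # strips the newline from $_
    push @numbered, sprintf('%d: %s', $., $_);  # $. is the current line number
}
close($fh);

print "$_\n" for @numbered;
```

Both chomp and the while condition work on $_ implicitly, which is why the variable never has to be named inside the loop header.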
References
• A reference is a scalar
• Instead of a number or string, a reference holds a memory location for another variable or subroutine.
    $myref = \$variable;
    $subref = \&subroutine;
Dereferencing the Reference
• To retrieve the value stored in a reference, you must dereference it.
    $name = 'Larry';
    $ref_name = \$name;
    print $ref_name , "\n";
    print $$ref_name, "\n";
Output:
    SCALAR(0x60000000000218a0)
    Larry
Dereferencing the Reference
• Modifying a dereferenced reference to a variable is the same as modifying the variable.
    $name = 'Larry';
    $ref_name = \$name;
    $$ref_name .= ', Moe, and Curly';
    print $$ref_name , "\n";
    print $name, "\n";
Output:
    Larry, Moe, and Curly
    Larry, Moe, and Curly
Where do we want to use a reference?
• References are very useful when passing lists to a subroutine.
    @mylist = ('Larry', 'Moe', 'Curly');
    $list_ref = \@mylist;
    &mysub($list_ref);
    sub mysub {
        my $ref = $_[0];
        my @list = @$ref;
        print $list[2], "\n";
    }
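A sketch of why references help when a subroutine takes more than one list. The longer_list helper, the @short array and the added name are hypothetical and serve only as illustration.

```perl
use strict;
use warnings;

my @mylist   = ('Larry', 'Moe', 'Curly');
my $list_ref = \@mylist;

# Arrow notation reads one element without copying the whole array.
my $third = $list_ref->[2];

# Passing two references keeps the two arrays separate inside the sub.
sub longer_list {
    my ($ref_a, $ref_b) = @_;
    return @$ref_a >= @$ref_b ? $ref_a : $ref_b;
}

my @short  = (1, 2);
my $longer = longer_list(\@mylist, \@short);

# Changes made through the reference affect the original array.
push @$longer, 'Shemp';
print scalar(@mylist), " names, third is $third\n";
```

The slide version copies the array with my @list = @$ref, which protects the caller's data; working through the reference as shown here avoids the copy but modifies the original.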