Comprehensive Perl Tutorial: Advanced Concepts and File Handling

COMP519: Web Programming, Autumn 2007
Perl Tutorial: Beyond the Basics
• Keyboard input and file handling
• Control structures
• Conditionals
• String matching
• Substitution and translation
• Split
• Associative arrays
• Subroutines
• References

Keyboard Input
    #!/usr/local/bin/perl
    # basic08.pl COMP519
    print "Please input the circle's radius: ";
    $radius = <STDIN>;
    $area = 3.14159265 * $radius * $radius;
    print "The area is: $area \n";
• In Perl, STDIN is the default standard input file handle, which normally means the keyboard.
    $new = <STDIN>;
• When the input is a string, we usually want to remove the \n at its end. The chomp function does this, in either of the following ways:
    chomp($new = <STDIN>);
  or
    $new = <STDIN>;
    chomp($new);

    bash-2.05b$ perl basic08.pl
    Please input the circle's radius: 100
    The area is: 31415.9265
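
The effect of chomp can be seen without typing anything. The sketch below is an illustration only: it assumes an in-memory file handle (here called $input) standing in for STDIN, and the variable names are not from the course program.

```perl
use strict;
use warnings;

# What <STDIN> would return after typing 100 and pressing Enter.
my $typed = "100\n";

# Assumption for this sketch: read from an in-memory handle instead of STDIN.
open(my $input, '<', \$typed) or die "Cannot open in-memory handle: $!";
my $radius = <$input>;
close($input);

my $length_before = length($radius);   # counts the trailing newline
chomp($radius);                        # remove the trailing newline
my $length_after = length($radius);

my $area = 3.14159265 * $radius * $radius;
print "Length before chomp: $length_before, after: $length_after\n";
print "The area is: $area\n";
```

The arithmetic works even without chomp, because Perl ignores the trailing newline when converting the string to a number. The newline matters as soon as the value is printed or compared as a string.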

File Handling
    #!/usr/local/bin/perl
    # basic09.pl COMP519
    # Program to open the password file,
    # read it in, print it, and close it again.
    $file = 'password.dat';    # Name the file
    open(INFO, $file);         # Open the file
    @lines = <INFO>;           # Read it into an array
    close(INFO);               # Close the file
    print @lines;              # Print the array
Working with files:
• Define the variable $file to hold the file name.
• Open the file with the open statement and the file handle INFO (any other name would do).
• Read the whole file into an array:
    @lines = <INFO>;
• Close the file (the close statement).
• Print the array (the print statement).
What happens if @lines is replaced by the scalar $lines?
• Other ways to open a file:
    open(INFO, $file);        # Open for input, i.e. reading
    open(INFO, ">$file");     # Open for new output, i.e. writing
    open(INFO, ">>$file");    # Open for appending
    open(INFO, "<$file");     # Also for reading
• Writing to a file:
    print INFO "This line goes to the file.\n";
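
A minimal sketch of the same steps with error checking, which also shows what changes when the array is replaced by a scalar. The three-argument form of open, the lexical handles $out and $in, and the file name demo_password.dat are assumptions made for this illustration; they are not part of the course program.

```perl
use strict;
use warnings;

my $file = 'demo_password.dat';    # hypothetical file name for this sketch

# Create a small file first so that the example is self-contained.
open(my $out, '>', $file) or die "Cannot write $file: $!";
print $out "password1\npassword2\npassword3\n";
close($out);

# List context: every line of the file ends up in the array.
open(my $in, '<', $file) or die "Cannot read $file: $!";
my @lines = <$in>;
close($in);

# Scalar context: only the first line is read.
open($in, '<', $file) or die "Cannot read $file: $!";
my $first = <$in>;
close($in);

print "Lines read into the array: ", scalar(@lines), "\n";
print "Line read into the scalar: $first";

unlink $file;    # remove the demo file again
```

Checking the result of open with "or die" reports a missing or unreadable file at once, instead of silently printing nothing.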

Example
    #!/usr/local/bin/perl
    # basic09.pl COMP519
    # Program to open the password file,
    # read it in, print it, and close it again.
    $file = 'password.dat';    # Name the file
    open(INFO, $file);         # Open the file
    @lines = <INFO>;           # Read it into an array
    close(INFO);               # Close the file
    print @lines;              # Print the array

password.dat:
    password1
    password2
    password3
    password4
    password5
    ...

    bash-2.05b$ perl basic09.pl
    password1
    password2
    password3
    password4
    password5
    ...

foreach-Statement
• The foreach statement visits the elements of a list one after another (much like a for loop in C).
    #!/usr/local/bin/perl
    # basic10.pl COMP519
    @food = ("apples", "pears", "eels");
    foreach $morsel (@food)    # Visit each item in turn and call it $morsel
    {
        print "$morsel \n";    # Print the item
        print "Yum yum\n";     # That was nice
    }

    bash-2.05b$ perl basic10.pl
    apples
    Yum yum
    pears
    Yum yum
    eels
    Yum yum

Testing
• In Perl, any non-zero number and any non-empty string other than "0" counts as true.
• The number zero, the string "0" and the empty string count as false.
• Some comparison expressions on numbers and strings:
    $a == $b    # Is $a numerically equal to $b?
                # Beware: Don't use the = operator.
    $a != $b    # Is $a numerically unequal to $b?
    $a eq $b    # Is $a string-equal to $b?
    $a ne $b    # Is $a string-unequal to $b?
• Logical operators can be used as well:
    ($a && $b)  # Are $a and $b both true?
    ($a || $b)  # Is either $a or $b true?
    !($a)       # Is $a false?
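
The truth rules can be checked directly. The following is a small sketch; the helper truth and the list of test values are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value used as a condition.
sub truth { return $_[0] ? "true" : "false"; }

my @values = (0, "0", "", "0.0", "00", " ", "a", 1);
my @results;
foreach my $value (@values) {
    push @results, truth($value);
    print "'$value' is ", truth($value), "\n";
}

# Numeric and string comparison can disagree on the same operands.
my $numeric = ("1.0" == "1") ? "equal" : "different";
my $string  = ("1.0" eq "1") ? "equal" : "different";
print "1.0 and 1 are numerically $numeric, but $string as strings\n";
```

Only 0, "0" and the empty string are false here. Strings such as "0.0" and "00" are true, even though they are numerically zero.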

Control Operations

Loop Statements
• Like Java and C, Perl has for and while loops; it also has a do-until loop.
for
    #!/usr/local/bin/perl
    # basic11.pl COMP519
    for ($i = 0; $i < 10; ++$i)    # Start with $i = 0
                                   # Do it while $i < 10
                                   # Increment $i before repeating
    {
        print "iteration $i\n";
    }
while
    #!/usr/local/bin/perl
    # basic12.pl COMP519
    print "Password?\n";           # Ask for input
    $a = <STDIN>;                  # Get input
    chomp $a;                      # Remove the newline at end
    while ($a ne "fred")           # While input is wrong..
    {
        print "sorry. Again?\n";   # Ask again
        $a = <STDIN>;              # Get input again
        chomp $a;                  # Chop off newline again
    }
do-until
    #!/usr/local/bin/perl
    # basic13.pl COMP519
    do
    {
        print "Password? ";        # Ask for input
        $a = <STDIN>;              # Get input
        chomp $a;                  # Chop off newline
    }
    until ($a eq "fred");          # Redo until correct input

Examples
for
    bash-2.05b$ perl basic11.pl
    iteration 0
    iteration 1
    iteration 2
    iteration 3
    iteration 4
    iteration 5
    iteration 6
    iteration 7
    iteration 8
    iteration 9
while
    bash-2.05b$ perl basic12.pl
    Password?
    wew
    sorry. Again?
    wqqw
    sorry. Again?
    fred
do-until
    bash-2.05b$ perl basic13.pl
    Password? werfwe
    Password? qfvcc
    Password? qcqc
    Password? fred

Conditionals
• Perl has an if-then-else statement.
• Note the spelling of elsif: the second "e" of "else if" is missing!
    if ($a)
    {
        print "The string is not empty\n";
    }
    else
    {
        print "The string is empty\n";
    }

    if (!$a)                      # The ! is the not operator
    {
        print "The string is empty\n";
    }
    elsif (length($a) == 1)       # If above fails, try this
    {
        print "The string has one character\n";
    }
    elsif (length($a) == 2)       # If that fails, try this
    {
        print "The string has two characters\n";
    }
    else                          # Now, everything has failed
    {
        print "The string has lots of characters\n";
    }

Regular Expressions
• A regular expression (RE) is a pattern that a given string either matches or does not match.
• An RE is written between two slashes: /.../
• The =~ operator tests a string against an RE. It is called the match operator.
• REs are case sensitive.
• The !~ operator tests for a non-match.
    $sentence =~ /the/
  gives TRUE if the appears in $sentence. If
    $sentence = "The quick brown fox";
  then the above match will be FALSE, and
    $sentence !~ /the/
  is TRUE because the string "the" does not appear in $sentence.

Example
    #!/usr/local/bin/perl
    # basic14.pl COMP519
    $sentence = "The quick brown fox";
    $a = ($sentence =~ /the/);
    $b = ($sentence !~ /the/);
    print "Output gives $a and $b\n";

    bash-2.05b$ perl basic14.pl
    Output gives  and 1

The $_ Special Variable
• An RE can be used in a condition:
    if ($sentence =~ /under/)
    {
        print "We're talking about rugby\n";
    }
  The condition is true if either of the following holds:
    $sentence = "Up and under";
    $sentence = "Best in Sunderland";
• Alternatively, first put the string in the special variable $_. The match operator then does not have to be typed:
    if (/under/)
    {
        print "We're talking about rugby\n";
    }
Note: $_ is the default variable for many Perl operators and is used very often.
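
A short sketch of $_ as the default variable: a foreach loop with no loop variable places each element in $_, and a bare /.../ match tests $_. The list of sentences and the array @rugby are made up for this illustration.

```perl
use strict;
use warnings;

my @sentences = ("Up and under", "Best in Sunderland", "Over the top");
my @rugby;

foreach (@sentences) {             # each element is placed in $_ in turn
    if (/under/) {                 # no =~ needed: the match tests $_
        push @rugby, $_;
        print "We're talking about rugby: $_\n";
    }
}
```

The first two sentences match because "under" also occurs inside "Sunderland"; the RE does not care about word boundaries unless told to.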

More on REs
• Writing REs takes some creativity. Here are some of the special characters in REs and their meanings:
    .    # Any single character except a newline
    ^    # The beginning of the line or string
    $    # The end of the line or string
    *    # Zero or more of the last character
    +    # One or more of the last character
    ?    # Zero or one of the last character
• Some examples:
    t.e     # t followed by anything followed by e. This will match
            # tre, tle but not te, tale
    ^f      # f at the beginning of a line
    ^ftp    # ftp at the beginning of a line
    e$      # e at the end of a line
    tle$    # tle at the end of a line
    und*    # un followed by zero or more d characters. This will
            # match un, und, undd, unddd, ...
    .*      # Any string without a newline. This is because the . matches
            # anything except a newline and the * means zero or more of these.
    ^$      # A line with nothing in it.
Remember that every RE must be enclosed in /.../.

Even more on REs
• Square brackets [] match any one of the characters inside them.
    [qjk]       # Either q or j or k
    [^qjk]      # Neither q nor j nor k
    [a-z]       # Anything from a to z inclusive
    [^a-z]      # No lower case letters
    [a-zA-Z]    # Any letter
    [a-z]+      # Any non-zero sequence of lower case letters
• Inside brackets, - means "between" and a leading ^ means "not".
• Use | to mean "or", and (..) to group things together.
    jelly|cream    # Either jelly or cream
    (eg|le)gs      # Either eggs or legs
    (da)+          # Either da or dada or dadada or...

Still More on REs
• Some more special characters in REs and their meanings:
    \n    # A newline
    \t    # A tab
    \w    # Any alphanumeric (word) character.
          # The same as [a-zA-Z0-9_]
    \W    # Any non-word character.
          # The same as [^a-zA-Z0-9_]
    \d    # Any digit. The same as [0-9]
    \D    # Any non-digit. The same as [^0-9]
    \s    # Any whitespace character: space,
          # tab, newline, etc
    \S    # Any non-whitespace character
    \b    # A word boundary, outside [] only
    \B    # No word boundary

    \|    # Vertical bar
    \[    # An open square bracket
    \)    # A closing parenthesis
    \*    # An asterisk
    \^    # A caret symbol
    \/    # A slash
    \\    # A backslash

Some Example REs
• The best way to learn REs is through practice. Here are a few more examples:
    [01]           # Either "0" or "1"
    \/0            # A division by zero: "/0"
    \/ 0           # A division by zero with a space: "/ 0"
    \/\s0          # A division by zero with a whitespace:
                   # "/ 0" where the space may be a tab etc.
    \/ *0          # A division by zero with possibly some
                   # spaces: "/0" or "/ 0" or "/  0" etc.
    \/\s*0         # A division by zero with possibly some whitespace.
    \/\s*0\.0*     # As the previous one, but with decimal
                   # point and maybe some 0s after it. Accepts
                   # "/0." and "/0.0" and "/0.00" etc and
                   # "/ 0." and "/ 0.0" and "/ 0.00" etc.
Remember that every RE must be enclosed in /.../.
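
The division-by-zero pattern \/\s*0 from the list above can be tried on a few strings. This is only a sketch; the sample expressions and the array names are made up for the illustration.

```perl
use strict;
use warnings;

# Made-up sample expressions for this sketch.
my @expressions = (
    "x = 1/0",
    "y = 2 / 0.00",
    "z = 3 /\t0",
    "w = 4/10",
    "v = 5 / 2",
    "u = 6/0.5",
);

my @flagged;
foreach my $expr (@expressions) {
    # A slash, optional whitespace, then a 0.
    push @flagged, $expr if $expr =~ /\/\s*0/;
}

print "Flagged: $_\n" foreach @flagged;
```

The pattern is a purely textual test. It flags "6/0.5" as well, because a 0 follows the slash even though the divisor is not zero, so a stricter RE would be needed to rule such cases out.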

Substitution
• The s function replaces the matches it finds with another string.
    $sentence =~ s/london/London/;
  replaces the first london in $sentence with London.
    s/london/London/;
  replaces the first london in $_ with London.
    s/london/London/g;
  replaces every london in $_ with London (using the g option).
• The following RE replaces every spelling of london, such as lOndon, lonDON, LoNDoN, with London:
    s/[Ll][Oo][Nn][Dd][Oo][Nn]/London/g
• A simpler solution is the i option, which ignores the case of letters:
    s/london/London/gi
• The i option can also be used in a plain /.../ match.

Example
    #!/usr/local/bin/perl
    # basic15.pl COMP519
    $_ = "I love london, but not london or LoNdoN";
    $a = s/london/London/;
    print "$a changes in $_ \n";
    $a = s/london/London/g;
    print "$a changes in $_ \n";
    $a = s/london/old London/gi;
    print "$a changes in $_ \n";

    bash-2.05b$ perl basic15.pl
    1 changes in I love London, but not london or LoNdoN
    1 changes in I love London, but not London or LoNdoN
    3 changes in I love old London, but not old London or old London
• Note that the changes happen in the variable $_, while $a counts the number of changes.

Remembering Patterns
• The special RE codes \1,...,\9 recall what was matched earlier in the same RE, with or without a substitution.
    $_ = "Lord Whopper of Fibbing";
    s/([A-Z])/:\1:/g;
    print "$_\n";
  puts every capital letter between colons:
    :L:ord :W:hopper of :F:ibbing
• The variables $1,...,$9 can be used in the rest of the code as well.
    if (/(\b.+\b)\1/)
    {
        print "Found $1 repeated\n";
    }
  identifies a repeated word in $_.
• Variables can be used inside an RE:
    $search = "the";
    s/${search}re/xxx/;
  replaces the first there in $_ with xxx (add the g option to replace them all).
• After a match, the variables $` and $& and $' hold what was found before the match, the match itself, and what came after it.
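
A brief sketch of remembered patterns in action. The sample strings and the variable names are made up for this illustration, and the replacement uses $1, which is the usual spelling of \1 on the right-hand side of a substitution.

```perl
use strict;
use warnings;

# Mark every capital letter; $1 holds what the parentheses matched.
my $text = "Lord Whopper of Fibbing";
(my $marked = $text) =~ s/([A-Z])/:$1:/g;
print "$marked\n";    # :L:ord :W:hopper of :F:ibbing

# Find a doubled word; \1 inside the RE refers back to the first group.
my ($before, $match, $after);
if ("this is the the end" =~ /\b(\w+) \1\b/) {
    ($before, $match, $after) = ($`, $&, $');
    print "Found $1 repeated\n";
    print "Before: '$before'  Match: '$match'  After: '$after'\n";
}
```

Copying $`, $& and $' into ordinary variables straight after the match keeps the values safe, since the next successful match overwrites them.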

Some Future Needs
    s/\+/ /g;
    s/%([0-9A-F][0-9A-F])/pack("c",hex($1))/ge;
• s/\+/ /g replaces every + sign with a space.
• /%([0-9A-F][0-9A-F])/ finds every % that is followed by two hexadecimal digits.
• The () group the characters after the % so that they can be used later through $1.
• hex($1) converts the hexadecimal digits in $1 to the number they represent.
• The pack() function takes a value and packs it into a binary structure. The "c" option tells it to turn that number into the equivalent character.
• The e option evaluates the right-hand side as an expression.

Translation
• The tr function translates character by character.
    $sentence =~ tr/abc/edf/
  replaces every a in $sentence with e, every b with d and every c with f. The function returns the number of translations made.
• Most of the special RE codes have no meaning in tr.
    $count = ($sentence =~ tr/*/*/);
  counts the * characters in $sentence and stores the result in $count.
• Consider the following examples:
    tr/a-z/A-Z/;
  converts the lower case letters in $_ to upper case.
    tr/,.//d;
  deletes every comma and full stop from $_. The d option is required: without it, an empty replacement list leaves the string unchanged.
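
The three uses of tr (counting, converting and deleting) can be compared side by side. This is a sketch with a made-up sample sentence; the variable names are chosen for the illustration.

```perl
use strict;
use warnings;

my $sentence = "Hello, world. It is a *starry*, *starry* night.";

# Counting: translating * to itself changes nothing but returns the count.
my $stars = ($sentence =~ tr/*/*/);

# Converting: work on a copy so that the original stays intact.
(my $upper = $sentence) =~ tr/a-z/A-Z/;

# Deleting: the d option removes characters that have no replacement.
(my $plain = $sentence) =~ tr/,.//d;

print "$stars stars\n";
print "$upper\n";
print "$plain\n";
```

Binding tr to a copy, as in (my $upper = $sentence) =~ tr/.../.../, is a common way to get a translated string without changing the source.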

Split
• The split function breaks a string into pieces and puts them in an array.
    $info = "Caine:Michael:Actor:14, Leafy Drive";
    @personal = split(/:/, $info);
  gives
    @personal = ("Caine", "Michael", "Actor", "14, Leafy Drive");
• With no string argument, split works on the variable $_:
    $_ = "Capes:Geoff::Shot putter:::Big Avenue";
    @personal = split(/:/);
  gives
    @personal = ("Capes", "Geoff", "", "Shot putter", "", "", "Big Avenue");
• REs can be used inside split:
    $_ = "Capes:Geoff::Shot putter:::Big Avenue";
    @personal = split(/:+/);
  gives
    @personal = ("Capes", "Geoff", "Shot putter", "Big Avenue");
• The characters of a word, the words of a sentence, the sentences of a paragraph:
    @chars = split(//, $word);
    @words = split(/ /, $sentence);
    @sentences = split(/\./, $paragraph);
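
Two details of split that the examples above do not show are sketched below: trailing empty fields are dropped unless a negative limit is given, and the special pattern ' ' splits on runs of whitespace. The sample strings and array names are made up for this illustration.

```perl
use strict;
use warnings;

my $record = "Capes:Geoff::Shot putter:::";

my @default = split(/:/, $record);        # trailing empty fields are dropped
my @all     = split(/:/, $record, -1);    # a negative limit keeps them

my @chars = split(//, "eels");                  # one element per character
my @words = split(' ', "  Up and   under ");    # ' ' skips leading and repeated whitespace

print scalar(@default), " fields by default, ", scalar(@all), " with limit -1\n";
print join("|", @chars), "\n";
print join("|", @words), "\n";
```

The empty field between "Geoff" and "Shot putter" survives in both cases, since only empty fields at the end of the list are removed.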

Some Future Needs
    @parameter_list = split(/&/, $result);
    # split string on '&' characters
    foreach (@parameter_list) {
        s/\+/ /g;
        s/%([0-9A-F][0-9A-F])/pack("c",hex($1))/ge;
    }
• First, we split the query on & characters and store the result in the array @parameter_list.
• Then we loop over the array. Each item of the array is placed in $_ in turn, and the changes are made to it.
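
The decoding loop can be tried on a sample query string. The sketch below rests on a few assumptions: the query text is made up, chr is used in place of pack("c", ...), and the pattern also accepts lower case hexadecimal digits.

```perl
use strict;
use warnings;

# Made-up query string, as a web form might send it.
my $result = "name=Michael+Caine&job=Actor%21&street=14%2C+Leafy+Drive";

my @parameter_list = split(/&/, $result);    # one element per name=value pair
foreach (@parameter_list) {
    s/\+/ /g;                                  # + stands for a space
    s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;       # %XX stands for the character with code XX
}

print "$_\n" foreach @parameter_list;
```

Because foreach makes $_ an alias for each array element, the substitutions change @parameter_list itself rather than a copy.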

Hashes
• Hashes are arrays in which every element has a string key.
    %ages = ("Michael Caine", 39,
             "Dirty Den", 34,
             "Angie", 27,
             "Willy", "21 in dog years",
             "The Queen Mother", 108);

    $ages{"Michael Caine"};       # 39
    $ages{"Dirty Den"};           # 34
    $ages{"Angie"};               # 27
    $ages{"Willy"};               # "21 in dog years"
    $ages{"The Queen Mother"};    # 108
• Note that each % has changed to a $ when a single element is accessed.
• A hash can be converted to an array and back.
    @info = %ages;        # @info is a list array. It now has 10 elements
    $info[5];             # Returns 27 if the pairs come out in the order
                          # written above; Perl does not guarantee that order
    %moreages = @info;    # %moreages is a hash.
                          # It is the same as %ages

Operators
• The elements of a hash can be accessed in several ways.
• keys returns a list of the keys; values returns a list of the values.
    foreach $person (keys %ages)
    {
        print "I know the age of $person\n";
    }
    foreach $age (values %ages)
    {
        print "Somebody is $age\n";
    }
• each returns one key/value pair at a time.
    while (($person, $age) = each(%ages))
    {
        print "$person is $age\n";
    }
• exists returns true if the given value is present in the hash as a key.
    if (exists $ages{"The Queen Mother"})
    {
        print "She is still alive...\n";
    }
• delete removes the given key together with its value.
    delete $ages{"The Queen Mother"};
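
A short sketch that combines keys, exists and delete. Sorting the keys is an addition made here so that the output order is predictable; the variable names @report and $status are made up for the illustration.

```perl
use strict;
use warnings;

my %ages = ("Michael Caine" => 39, "Dirty Den" => 34, "Angie" => 27);

# keys returns the keys in no particular order, so sort them for the report.
my @report;
foreach my $person (sort keys %ages) {
    push @report, "$person is $ages{$person}";
}
print "$_\n" foreach @report;

# delete removes the key and its value; exists then reports false.
delete $ages{"Angie"};
my $status = exists $ages{"Angie"} ? "still present" : "gone";
print "Angie is $status; ", scalar(keys %ages), " entries remain\n";
```

The => operator used to build the hash is equivalent to a comma here; it simply makes the key/value pairing easier to read.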

Subroutines
• A subroutine is defined with sub.
    sub mysubroutine
    {
        print "Not a very interesting routine\n";
        print "This does the same thing every time\n";
    }
• There is no need to declare a parameter list.
• To call a subroutine, put an & in front of its name.
    &mysubroutine;             # Call the subroutine
    &mysubroutine($_);         # Call it with a parameter
    &mysubroutine(1+2, $_);    # Call it with two parameters

Parameters
• The parameter list is passed to the subroutine in the array @_. The individual parameters can be accessed as $_[0], $_[1] and so on.
• These variables have nothing to do with the global variable $_.
    sub printargs
    {
        print "@_\n";
    }
    &printargs("perly", "king");          # Example prints "perly king"
    &printargs("frog", "and", "toad");    # Prints "frog and toad"

    sub printfirsttwo
    {
        print "Your first argument was $_[0]\n";
        print "and $_[1] was your second\n";
    }

Returning Values
• The return keyword returns a result from a subroutine.
• If return is not used, the value of the last expression evaluated is returned.
    sub maximum
    {
        if ($_[0] > $_[1])
        {
            return $_[0];
        }
        else
        {
            return $_[1];
        }
    }
    $biggest = &maximum(37, 24);    # Now $biggest is 37

Local Variables
• The array @_ and the variables $_[0], $_[1], $_[2], ... are local to the subroutine.
• Perl variables are global by default, but local variables can be declared too.
    sub inside
    {
        my ($a, $b);                    # Make local variables using the
                                        # "my" keyword
        ($a, $b) = ($_[0], $_[1]);      # Assign values
        $a =~ s/ //g;                   # Strip spaces from
        $b =~ s/ //g;                   # local variables
        ($a =~ /$b/ || $b =~ /$a/);     # Is $b inside $a or $a inside $b?
    }
    &inside("lemon", "dolemoney");      # true
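
The difference between a variable declared with my and a global one can be sketched as follows. The subroutine names bump_local and bump_global and the variable $count are made up for this illustration, and the global is declared with our so that the sketch runs under use strict.

```perl
use strict;
use warnings;

our $count = 10;    # a global (package) variable

sub bump_local {
    my $count = $_[0];    # a new local variable; the global is untouched
    $count++;
    return $count;
}

sub bump_global {
    $count++;             # no "my": this changes the global itself
    return $count;
}

my $local_result  = bump_local($count);
my $after_local   = $count;
my $global_result = bump_global();
my $after_global  = $count;

print "bump_local returned $local_result, global is still $after_local\n";
print "bump_global returned $global_result, global is now $after_global\n";
```

Declaring working variables with my keeps a subroutine from disturbing variables of the same name elsewhere in the program.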

More on Parameters
    sub maximum
    {
        # returns the maximum of the
        # parameters passed to function
        my ($current_max) = shift @_;
        foreach (@_)
        {
            if ($_ > $current_max)
            {
                $current_max = $_;
            }
        }
        $current_max;    # return the max
    }

Sorting
• Perl has a sort function for sorting the elements of a list:
    @names = ("Joe", "Bob", "Bill", "Mark");
    @sorted = sort @names;    # @sorted now has the list
                              # ("Bill", "Bob", "Joe", "Mark")
• This function sorts numbers as strings, character by character. To sort numbers numerically, use the following code:
    @numbers = (12, 3, 1, 8, 20);
    @sorted = sort { $a <=> $b } @numbers;    # @sorted now has the list
                                              # (1, 3, 8, 12, 20)
• Note that we have told sort how to compare the elements.
• The elements of a list can also be sorted in reverse order:
    @numbers = (12, 3, 1, 8, 20);
    @sorted = reverse sort { $a <=> $b } @numbers;    # sort the list from biggest
                                                      # to smallest
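
The effect of the comparison block can be seen by sorting the same numbers in several ways. This is a sketch; the array names and the %ages hash are made up for the illustration.

```perl
use strict;
use warnings;

my @numbers = (12, 3, 1, 8, 20);

my @as_strings = sort @numbers;                  # string order: 1 12 20 3 8
my @ascending  = sort { $a <=> $b } @numbers;    # numeric order: 1 3 8 12 20
my @descending = sort { $b <=> $a } @numbers;    # swapped operands: 20 12 8 3 1

# The block may compare anything, for example hash values looked up by key.
my %ages = (Curly => 41, Larry => 48, Moe => 43);
my @by_age = sort { $ages{$a} <=> $ages{$b} } keys %ages;

print "As strings: @as_strings\n";
print "Ascending:  @ascending\n";
print "Descending: @descending\n";
print "By age:     @by_age\n";
```

Swapping $a and $b in the block gives the same result as reverse sort, without building the ascending list first.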

References
    $age = 42;
    $ref_age = \$age;
    @stooges = ("Curly", "Larry", "Moe");
    $ref_stooges = \@stooges;
    $ref_salaries = [42500, 29800, 50000, 35250];
    $ref_ages = {
        'Curly' => 41,
        'Larry' => 48,
        'Moe' => 43,
    };
• To create a reference, use \ in front of a variable, [] for arrays and {} for hashes.
    $$ref_age = 30;
    $$ref_stooges[3] = "Maxine";
    $ref_stooges->[3] = "Maxine";
• To dereference, use $$ for variables and -> for lists.
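
A small sketch showing that a reference and the original variable share the same data, and that the $$ and -> notations reach the same element. The added name "Maxine" and the ages are taken from the slide or made up for the illustration.

```perl
use strict;
use warnings;

my @stooges = ("Curly", "Larry", "Moe");
my $ref_stooges = \@stooges;

# Changing the array through the reference changes @stooges itself.
push @$ref_stooges, "Maxine";

# An anonymous hash, reachable only through its reference.
my $ref_ages = { Curly => 41, Larry => 48, Moe => 43 };
$ref_ages->{Maxine} = 30;    # arrow notation
$$ref_ages{Moe} = 44;        # the same kind of access written with $$

print "Stooges: @stooges\n";
print "Second stooge: $$ref_stooges[1] or $ref_stooges->[1]\n";
foreach my $name (sort keys %$ref_ages) {
    print "$name is $ref_ages->{$name}\n";
}
```

The forms @$ref_stooges and %$ref_ages dereference the whole array or hash, which is what functions such as push and keys expect.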
