
Programming in Unix


sereno

Presentation Transcript


  1. Programming in Unix • Regular Expressions • These expressions are used in grep, sed, awk, ed, vi and the various shells

  2. Regular Expressions • A regular expression is a pattern to be matched • Perl is a superset of all these tools • Any regular expression used in Unix tools can be used in Perl

  3. Regular Expressions • The string abc becomes a regular expression when it is enclosed in slashes: $_ = "I know my abc s"; if (/abc/) { print $_; }

  4. Regular Expressions Single character patterns - a character in the expression must match a single character in the string • The dot “.” matches any single character other than “\n” /r.g/ would match rug or rag

  5. Regular Expressions • Metacharacters have a special meaning inside a pattern instead of matching themselves: . \ | ( ) [ * + ? are all metacharacters (as are { ^ and $) • A backslash in front of any metacharacter makes it non-special: 5.18 would use /5\.18/ and 01\20\03 would use /01\\20\\03/
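The effect of escaping a metacharacter can be sketched as follows. The sample strings and variable names are made up for illustration; quotemeta is Perl's built-in for escaping every metacharacter in a string.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my $version   = "perl 5.18 released";
my $lookalike = "perl 5x18 released";

# An unescaped dot matches any character, so both strings match.
my $loose_hits = grep { /5.18/ } ($version, $lookalike);

# An escaped dot matches only a literal ".", so one string matches.
my $strict_hits = grep { /5\.18/ } ($version, $lookalike);

# quotemeta adds the backslashes for us (here: in front of each "\").
my $literal      = quotemeta('01\20\03');
my $backslash_ok = ('date 01\20\03' =~ /$literal/) ? 1 : 0;

print "loose=$loose_hits strict=$strict_hits backslash=$backslash_ok\n";
```

Building the pattern with quotemeta is usually less error-prone than typing the backslashes by hand when the text comes from a variable.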

  6. Regular Expressions Some escape sequences you might see

  7. Regular Expressions

  8. Regular Expressions • Pattern /m./ matches any two character pattern that starts with m my or me would be examples of matches

  9. Regular Expressions • A character class uses a list of possible characters enclosed in brackets [ ] • It will match any one character listed within the brackets • [a-z] will match any single lowercase letter (a range can be used with the hyphen) • Negated character class: a ^ as the first character inside the brackets, as in [^a-z], matches any single character not in the list
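A small sketch of a character class and its negated form. The word list is hypothetical and only serves to show which items each class selects.

```perl
use strict;
use warnings;

# Hypothetical word list for illustration.
my @words = qw(cat Cat c4t);

# Words made up only of lowercase letters.
my @all_lower = grep { /^[a-z]+$/ } @words;

# Words containing at least one character that is NOT a letter.
my @has_nonletter = grep { /[^a-zA-Z]/ } @words;

print "lowercase only: @all_lower\n";
print "has non-letter: @has_nonletter\n";
```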

  10. Regular Expressions • Grouping Patterns - one or more of…. • Sequence - i.e., abc means a followed by b followed by c • Multipliers • * means zero or more of the immediately previous character • + means one or more of the immediately previous character • ? means zero or one of the immediately previous character

  11. Regular Expressions • General Multiplier • $_ = "fred xxxxxxxxxx barney"; • /x{5,10}/ #would look for 5 to 10 repetitions of the letter x • s/x{5,10}/and/; #would substitute and for the x's
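The general multiplier can be tried out with a short sketch like this one. The variable names are illustrative; the second check shows that square brackets form a character class rather than a repeat count.

```perl
use strict;
use warnings;

# Same shape as the slide's string: ten x's between two names.
my $text = "fred " . ("x" x 10) . " barney";

# Copy the string, then replace a run of 5 to 10 x's with "and".
(my $replaced = $text) =~ s/x{5,10}/and/;

# Square brackets are a character class, not a repeat count:
# /x[5,10]/ means "x followed by one of the characters 5 , 1 0".
my $class_hit = ("x5" =~ /x[5,10]/) ? 1 : 0;

print "$replaced\n";
```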

  12. Regular Expressions • Parentheses • (a) matches an a • ([a-z]) matches any single lowercase letter • Alternation • match exactly one of the alternatives a|b|c • /[abc]/ works the same way when every alternative is a single character

  13. Regular Expressions • Anchoring Patterns • Generally when a pattern is matched against a string it is evaluated from left to right, matching at the first opportunity • \b anchor requires a word boundary at the indicated point • \B requires that there is not a word boundary • ^ matches the beginning of the string • $ matches the end of the string

  14. Regular Expressions /fred\b/; #matches fred but not frederick /\bmo/; #matches moe but not Elmo /\bFred\b/; #matches Fred but not Freddy or AlFred /\b\+\b/; #matches "x+y" but not "++" or " + "
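The anchors can be checked against a list of names with a sketch like the following. The list itself is hypothetical; grep keeps the names for which each pattern matches.

```perl
use strict;
use warnings;

# Hypothetical word list for illustration.
my @names = qw(fred frederick moe Elmo Fred AlFred);

my @fred_then_boundary = grep { /fred\b/ }   @names;  # boundary after "fred"
my @boundary_then_mo   = grep { /\bmo/ }     @names;  # boundary before "mo"
my @whole_word_Fred    = grep { /\bFred\b/ } @names;  # boundary on both sides
my @starts_with_fred   = grep { /^fred/ }    @names;  # anchored at string start
my @ends_with_Fred     = grep { /Fred$/ }    @names;  # anchored at string end

print "fred\\b: @fred_then_boundary\n";
print "\\bmo: @boundary_then_mo\n";
print "\\bFred\\b: @whole_word_Fred\n";
print "^fred: @starts_with_fred\n";
print "Fred\$: @ends_with_Fred\n";
```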

  15. Regular Expressions • Precedence (highest to lowest) • Parentheses ( ) • Quantifiers * + ? { } • Anchors and sequence ^ $ \b \B • Alternation |

  16. Regular Expressions • Matches with m// (the m is not needed when the delimiters are slashes) • A search using /pattern/ is actually a shortcut for m/pattern/ • You may choose any pair of delimiters to quote the contents • Where you used /fred/ you can use m(fred) or m,fred, or m<fred> or m!fred!

  17. Regular Expressions • Different delimiter • To use a delimiter other than the slash (/), put the letter m in front of the new delimiter • e.g. m@/usr/etc@
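A sketch of why another delimiter helps when the pattern itself contains slashes. The path is a made-up example; all three patterns are equivalent.

```perl
use strict;
use warnings;

# Hypothetical path for illustration.
my $path = "/usr/etc/passwd";

# With slash delimiters every slash in the pattern needs a backslash.
my $with_slashes = ($path =~ /\/usr\/etc/) ? 1 : 0;

# With m and another delimiter the slashes can be written plainly.
my $with_at     = ($path =~ m@/usr/etc@) ? 1 : 0;
my $with_braces = ($path =~ m{/usr/etc}) ? 1 : 0;

print "slashes=$with_slashes at=$with_at braces=$with_braces\n";
```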

  18. Regular Expressions • Binding Operator =~ selects a different target: it tells Perl to match the pattern on the right against the string on the left (instead of matching $_) • Ignoring case with /i • [yY] matches either upper or lower case y • /^procedure/i #matches P or p
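The difference between matching $_ and binding with =~ can be sketched like this. The strings and variable names are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical input line for illustration.
my $line = "Procedure main";

# Without =~ the pattern is tested against $_, not against $line.
$_ = "nothing relevant here";
my $hit_default = /^procedure/i ? 1 : 0;

# With =~ the pattern is tested against the string on the left.
my $hit_bound = ($line =~ /^procedure/i) ? 1 : 0;

# Without /i the capital P does not match a lowercase p ...
my $hit_case_sensitive = ($line =~ /^procedure/) ? 1 : 0;

# ... unless a character class lists both cases.
my $hit_class = ($line =~ /^[pP]rocedure/) ? 1 : 0;

print "default=$hit_default bound=$hit_bound ",
      "sensitive=$hit_case_sensitive class=$hit_class\n";
```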

  19. Regular Expressions • Case shifting $_ = "I saw Barney with Fred."; s/(fred|barney)/\U$1/gi; #Now $_ is "I saw BARNEY with FRED."
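Case shifting in the replacement can be sketched as below. \U comes from the slide; \L (force lowercase) is its standard counterpart and is added here only for comparison.

```perl
use strict;
use warnings;

my $text = "I saw Barney with Fred.";

# \U uppercases the captured name; /g repeats, /i ignores case.
(my $upper = $text) =~ s/(fred|barney)/\U$1/gi;

# \L is the lowercase counterpart of \U.
(my $lower = $text) =~ s/(fred|barney)/\L$1/gi;

print "$upper\n$lower\n";
```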

  20. Regular Expressions • The split operator will break up a string according to a separator. This is useful for tab-separated or colon-separated data @fields = split /:/, "abc:def:g:h"; Gives you ("abc", "def", "g", "h") @fields = split /:/, "abc:def::g:h"; Gives you ("abc", "def", "", "g", "h")

  21. Regular Expressions • It is common to split on whitespace using /\s+/ as the pattern • Any run of whitespace then counts as a single separator $input = "This is a \t test.\n"; split /\s+/, $input; will give you the result "This", "is", "a", "test."
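Both uses of split can be checked with a short sketch. The variable names are illustrative; the inputs are the ones from the slides.

```perl
use strict;
use warnings;

# Two adjacent colons produce an empty field in the middle.
my @fields = split /:/, "abc:def::g:h";

# A run of spaces, a tab and the trailing newline each act as one separator.
my $input = "This is a \t test.\n";
my @words = split /\s+/, $input;

print scalar(@fields), " fields, ", scalar(@words), " words\n";
```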

  22. Regular Expressions • Substitutions • $_ = "foot fool buffoon"; • s/foo/bar/; # $_ is now "bart fool buffoon" • s/// will make just one replacement • s/foo/bar/g; # $_ is now "bart barl bufbarn" • /g globally replaces all possible matches
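The difference between a single and a global substitution can be sketched like this. The variable names are illustrative; the sketch also uses the fact that s/// returns the number of replacements it made.

```perl
use strict;
use warnings;

my $text = "foot fool buffoon";

# Without /g only the first match is replaced.
(my $once = $text) =~ s/foo/bar/;

# With /g every match is replaced; the return value is the count.
my $all   = $text;
my $count = ($all =~ s/foo/bar/g);

print "$once\n$all\n$count replacements\n";
```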

  23. Regular Expressions • The join function takes a list of values and glues them together. It performs the opposite of split. • For example $info = join("\n", "Name", "Address", "Zip Code"); print $info; will display Name, Address and Zip Code, each on its own line

  24. Regular Expressions • Or take a list @values = (2, 4, 6, 8, 10); $new_value = join "-", @values; # $new_value looks like "2-4-6-8-10" $new_value = join ":", @values; # $new_value looks like "2:4:6:8:10" $new_value = join "-", "cat", @values; # $new_value looks like "cat-2-4-6-8-10"
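That join and split undo each other can be sketched as a round trip. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @values = (2, 4, 6, 8, 10);

# Glue the list together with a separator ...
my $joined = join "-", @values;

# ... and take it apart again on the same separator.
my @back = split /-/, $joined;

# Extra items can be placed in front of the list before joining.
my $with_cat = join "-", "cat", @values;

print "$joined\n$with_cat\n";
```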

  25. Filehandles and File Tests • What is a filehandle? • An I/O connection between your Perl process and the outside world. • Filehandle names are like the names for labeled blocks • They are easy to confuse with future reserved words, so the recommendation is to use all UPPERCASE letters in your filehandle names

  26. Filehandles and File Tests • The syntax is: open (FILEHANDLE, "somename"); • FILEHANDLE is the new filehandle and somename is the external filename (such as a file or device) • To open a file for writing, use the same open statement but prefix the filename with a greater-than sign (caution: this will overwrite any existing file with the same name) open (OUT, ">outfile");

  27. Filehandles and File Tests • Syntax continued: • To open a file to append data to it: open (LOGFILE, ">>mylogfile"); • All forms of open return true for success and false for failure • When finished with a filehandle you close it: close(LOGFILE); • Reopening a filehandle will close the previous version

  28. Filehandles and File Tests • When a filehandle does not open successfully you can use the die function to report that an error has occurred • An unless statement can be used as a logical or • unless (this) { that; } is equivalent to this || that; • An unless statement used to report a failed open • unless (open (DATAPLACE, ">/tmp/dataplace")) { print "Sorry, I couldn't create your file"; } else { #the rest of your program }

  29. Filehandles and File Tests Or….make it even simpler with: unless (open DATAPLACE, ">/tmp/dataplace") { die "Sorry, I couldn't create your file"; } or open (DATAPLACE, ">/tmp/dataplace") || die "Sorry, I couldn't create your file";
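The write, append and failure cases can be sketched end to end as below. This is a minimal sketch under stated assumptions: it uses the three-argument form of open with lexical filehandles rather than the two-argument form and bareword handles shown on the slides, it writes into a temporary directory from the core File::Temp module instead of /tmp/dataplace, and it adds $! (the system error text) to the die messages.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: a throwaway directory stands in for /tmp.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/mylogfile";

# '>' creates the file (or overwrites an existing one).
open(my $out, '>', $path) or die "Sorry, I couldn't create $path: $!";
print {$out} "first\n";
close($out) or die "Couldn't close $path: $!";

# '>>' appends to the existing contents.
open(my $log, '>>', $path) or die "Sorry, I couldn't append to $path: $!";
print {$log} "second\n";
close($log) or die "Couldn't close $path: $!";

# '<' reads the file back.
open(my $in, '<', $path) or die "Sorry, I couldn't read $path: $!";
my @lines = <$in>;
close($in);

# open returns false on failure, which is what "or die" relies on.
my $open_failed = open(my $missing, '<', "$dir/no-such-file") ? 0 : 1;

print "read ", scalar(@lines), " lines; failed open detected: $open_failed\n";
```

When the parentheses around the arguments of open are left out, the low-precedence `or` is the safer choice, because `||` would bind to the filename string instead of to the result of open.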

  30. Filehandles and File Tests • The -x File Tests • Suppose you want to make sure that there isn't already a file by that name (so you don't blow away valuable data) before you open and write to a file • Use file tests (see page 157-8), for example -e, which tests whether a file or directory exists
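Guarding a write with -e can be sketched as follows. The temporary directory, file name and variable names are assumptions made for illustration; only the -e test itself comes from the slide.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: a throwaway directory and a hypothetical file name.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/dataplace";

my $existed_before = (-e $path) ? 1 : 0;

# Only create the file when nothing by that name exists yet.
my $wrote = 0;
unless (-e $path) {
    open(my $out, '>', $path) or die "Sorry, I couldn't create $path: $!";
    print {$out} "valuable data\n";
    close($out) or die "Couldn't close $path: $!";
    $wrote = 1;
}

# A second attempt now sees the file and would leave it alone.
my $exists_after = (-e $path) ? 1 : 0;

print "before=$existed_before wrote=$wrote after=$exists_after\n";
```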

  31. Formats • Helps you generate simple, formatted reports and charts • Keeps track of number of lines per page, current page • Use “format” to declare and “write” to execute

  32. Declaring a Format format MYNAME = FORMLIST . Note: if MYNAME is omitted, the format is named STDOUT. FORMLIST is a list containing any of the following: A comment (start the line with #) A "picture" giving the layout of one output line An argument line supplying values to plug into the previous "picture" line

  33. Special Values FORMAT_NAME_TOP defines text that will appear at the top of each page FORMAT_NAME section defines format and variables for each line that should print as the body of the report • You should define the format and format_top together somewhere in your program (often seen at the end).

  34. Example # a report on the /etc/passwd file format MY_REPORT_TOP = Password File Report Name Login Uid Gid Shell Home ------------------------------------------------------------------- .

  35. Example #how to send output to the screen format STDOUT = Password File Report Name Login Uid Gid Shell Home ------------------------------------------------------------------- . open STDOUT; write;

  36. Example (cont...) format MY_REPORT = @<<<<< @||||||| @<<<< @>>>> @>>>> @<<<<<<<<<<<< $name, $login, $uid, $gid, $shell, $home . Then to print this when you want: write MY_REPORT;

  37. Example of Code #!/usr/local/bin/perl -w print "This is an address label program\n"; print "Enter your name: \n"; $name=<>; print "Enter your street address: \n"; $street=<>; print "Enter your City, State, and Zip: \n"; $therest=<>; open (AddressLabel,">myaddrlist"); write (AddressLabel); format AddressLabel = ================================== | @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< | $name | @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< | $street | @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< | $therest ================================== .

  38. Example Entering Data # addrlabel.pl This is an address label program Enter your name: Mike Enter your street address: 14590 Roller Coaster Rd Enter your City, State, and Zip: Denver, CO 80931

  39. Example Output to File # cat myaddrlist ================================== | Mike | | 14590 Roller Coaster Rd | | Denver, CO 80931 | ==================================

  40. Format Pictures • @ or ^ indicates substitution at run-time • < left justify • > right justify • | centering • If the variable has more characters than the format picture, it will be truncated • To avoid truncating use “@*” on a format line by itself.
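The justification and truncation rules can be tried out without declaring a full format. The sketch below is an assumption-laden shortcut: instead of format and write it calls the built-in formline, which fills a picture line and stores the result in the accumulator variable $^A. The helper name render is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): fill one picture line
# with formline and return the text instead of printing it.
sub render {
    my ($picture, @values) = @_;
    $^A = "";                      # clear the format accumulator
    formline($picture, @values);   # fill the picture with the values
    my $text = $^A;
    $^A = "";
    return $text;
}

# Pictures are single-quoted so that @ is not treated as an array.
# @<<<<< is a 6-character left-justified field, @>>>> a 5-character
# right-justified one.
my $row = render('@<<<<< @>>>>' . "\n", "ab", 42);

# A value longer than its field is cut off at the field width.
my $short = render('[@<<]' . "\n", "abcdef");

print $row, $short;
```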

  41. The ^ Picture • Starting a field with ^ allows you to print part of the text with the first call • The next time you reference it, the string will only contain that part of the string that has not been printed and the next n characters will be printed and so on... • Warning!: this does destroy the original value of the variable so store it off if you will need it again.

  42. Example of the ^ # a report from a bug report form format BUG_REPORT = Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< $subject From: @<<<<<<<<<<<<<< Priority: @<<<<<<<<<< $from, $priority Description: ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< $description ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< $description ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<… $description

  43. Special Variables • $~ contains $FORMAT_NAME • $^ contains $FORMAT_NAME_TOP • $% contains the current output page number • $= contains number of lines per page • $- contains lines remaining on current page (set to zero to force a new page)

  44. To Use Special Variables • You can use these by "selecting": $myform = select(MYFORMAT); $~ = "My_Other_Format"; $^ = "My_Top_Format"; select($myform);
