Regular Expression: Pattern Matching
Web Programming
Introduction
• What is a Regular Expression?
  • Grammar/rule for matching patterns in strings
  • pattern = a sequence of characters
• Pattern matching syntax
  • $target =~ /$pattern/;
    • searches $target for $pattern
    • true if found, false otherwise
    • ("prog1.pl" =~ /pl/) : true
    • ("prog1.pl" =~ /PL/) : false
    • can use any pattern delimiter character with m
      • e.g. $target =~ m!$pattern! (or m|$pattern|)
  • $target !~ /$pattern/;
    • true if NOT found, false otherwise
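The match operators above can be tried in a few lines of Perl. This is a minimal sketch; the helper name `contains` is hypothetical and only wraps the `=~` operator described on the slide.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $target contains $pattern, 0 otherwise.
sub contains {
    my ($target, $pattern) = @_;
    return $target =~ /$pattern/ ? 1 : 0;
}

my $file = "prog1.pl";

print "matches /pl/\n"        if $file =~ /pl/;    # case-sensitive match succeeds
print "does not match /PL/\n" if $file !~ /PL/;    # !~ is true when the pattern is absent
print "matches m!pl!\n"       if $file =~ m!pl!;   # same pattern, different delimiter
print "contains 'prog'\n"     if contains($file, "prog");
```

Because `$pattern` is interpolated into the regular expression, any metacharacters it contains keep their special meaning.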
Matching Repeats
• + Match one or more of the preceding character
  • e.g. /ab+c/ matches 'abc', 'abbc', 'abbbc', etc.
• * Match zero or more of the preceding character
  • e.g. /ab*c/ matches 'ac', 'abc', 'abbc', 'abbbc', etc.
• ? Match zero or one of the preceding character
  • e.g. /ab?c/ matches 'ac' or 'abc'.
• . Match any single character
  • e.g. /b.a/ matches 'abba', 'b1a', etc.
    /b.*a/ matches 'aba', 'abba', 'banana', etc.
  • '.' does not match newline (i.e. \n)
    • /b.*a/ does not match "ab\na", "abb\na", etc.
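The repeat characters can be compared side by side by running each pattern over the same candidate strings. This is a sketch only; the helper `matching` is a hypothetical name introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the candidates that match the given pattern.
sub matching {
    my ($pattern, @candidates) = @_;
    return grep { /$pattern/ } @candidates;
}

my @words = ('ac', 'abc', 'abbc');

print "ab+c: ", join(' ', matching('ab+c', @words)), "\n";   # abc abbc
print "ab*c: ", join(' ', matching('ab*c', @words)), "\n";   # ac abc abbc
print "ab?c: ", join(' ', matching('ab?c', @words)), "\n";   # ac abc

# '.' does not match a newline, so .* cannot reach the final 'a'
print "no match across newline\n" unless "ab\na" =~ /b.*a/;
```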
Special Characters
• [$pattern] Match any one character in $pattern
  • e.g. /[Rr]ed/ matches 'bredd', 'bRedd', 'red', 'Red', etc.
    /[0-9]/ matches any digit
    /[a-zA-Z]/ matches any letter
    /[0-9a-zA-Z]/ matches any alphanumeric character
• [^$pattern] Match any one character except those in $pattern
  • e.g. /[^0-9]/ matches any non-digit character.
• \b Match at a word boundary
  • The position between a word character (alphanumeric or underscore, i.e. [0-9a-zA-Z_]) and a non-word character or the start/end of the string
  • e.g. /\bred\b/ matches 'is red', 'red rose', '$red'
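The difference between a character class and a word boundary shows up when the same word appears inside a longer one. The following is a small sketch; `has_word_red` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $text contains "red" as a whole word, else 0.
sub has_word_red {
    my ($text) = @_;
    return $text =~ /\bred\b/ ? 1 : 0;
}

for my $text ('is red', 'red rose', '$red', 'bredd') {
    printf "%-10s %s\n", $text, has_word_red($text) ? 'whole word' : 'no whole word';
}

# [Rr]ed has no boundary requirement, so it also matches inside 'bRedd'
print "found inside a word\n" if 'bRedd' =~ /[Rr]ed/;

# A negated class matches any single character that is not a digit
print "has a non-digit\n" if '2024x' =~ /[^0-9]/;
```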
Escape Sequences
• \d
  • any digit (i.e. [0-9])
• \D
  • any non-digit (i.e. [^0-9])
• \w
  • any word character (i.e. [_0-9a-zA-Z])
• \W
  • any non-word character (i.e. [^_0-9a-zA-Z])
• \s
  • any white space (i.e. [ \r\t\n\f])
• \S
  • any non-white space (i.e. [^ \r\t\n\f])
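As an illustration of these escape sequences, the sketch below classifies single characters. The helper `describe` is hypothetical, and the examples assume plain ASCII input as in the slide.

```perl
use strict;
use warnings;

# Hypothetical helper: name the class a single character belongs to.
# \d is tested before \w because every digit is also a word character.
sub describe {
    my ($char) = @_;
    return 'digit'       if $char =~ /\d/;
    return 'word'        if $char =~ /\w/;
    return 'white space' if $char =~ /\s/;
    return 'other';
}

print describe('7'),  "\n";   # digit
print describe('_'),  "\n";   # word
print describe("\t"), "\n";   # white space
print describe('-'),  "\n";   # other

# The upper-case forms are the complements of the lower-case ones
print "contains a non-digit\n" if 'abc123' =~ /\D/;
```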
Pattern Matching Options
• /$pattern/i
  • Ignore case
  • e.g. /ab/i matches 'ab', 'AB', 'Ab', 'aB'
• /$pattern/g
  • Matches every non-overlapping occurrence of the pattern
  • Returns a list of matches (in list context)
  • e.g. @matches = 'abcdcb' =~ /.b/g;
    @matches will be ('ab', 'cb')
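The two options can be checked with the slide's own examples. This is a minimal sketch; `all_matches` is a hypothetical helper that simply wraps the /g match in list context.

```perl
use strict;
use warnings;

# Hypothetical helper: every non-overlapping match of /.b/ in $text.
sub all_matches {
    my ($text) = @_;
    my @matches = $text =~ /.b/g;   # list context: /g returns all matches
    return @matches;
}

print join(',', all_matches('abcdcb')), "\n";   # ab,cb

# /i makes the match ignore case
print "case-insensitive match\n" if 'AB' =~ /ab/i;
print "case-sensitive miss\n"    if 'AB' !~ /ab/;
```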
Substitution
• $string =~ s/$pattern/$replacement/;
  • Replace the first occurrence of $pattern with $replacement in $string
    $string = "Before substitution";
    $string =~ s/Before/After/;
    # $string is now "After substitution"
• Substitution Options
  • $string =~ s/$pattern/$replacement/i;
    • Ignore case of $pattern
      $string = "One plus one is done.";
      $string =~ s/one/ONE/i;
      # $string is now "ONE plus one is done."
  • $string =~ s/$pattern/$replacement/g;
    • Change all occurrences of $pattern in $string
      $string = "  this   is  line 1.  ";
      $string =~ s/ +/ /g;
      # $string is now " this is line 1. "
  • $string =~ s/$pattern/$replacement/s;
    • Treat $string as a single line (i.e., . will match \n)
      $string = "this <img src=\n img.gif>image";
      $string =~ s/<.+>//s;
      # $string is now "this image"
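The substitution options can be exercised as follows. This is a sketch under the slide's examples; the helper names `squeeze_spaces` and `strip_tag` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse each run of spaces into a single space.
sub squeeze_spaces {
    my ($string) = @_;
    $string =~ s/ +/ /g;      # /g replaces every run, not just the first
    return $string;
}

# Hypothetical helper: remove a tag even when it spans a newline.
sub strip_tag {
    my ($string) = @_;
    $string =~ s/<.+>//s;     # /s lets '.' match "\n"
    return $string;
}

print squeeze_spaces("this   is  line 1."), "\n";          # this is line 1.
print strip_tag("this <img src=\n img.gif>image"), "\n";   # this image

my $string = "One plus one is done.";
$string =~ s/one/ONE/i;       # without /g only the first match ("One") changes
print "$string\n";            # ONE plus one is done.
```

Note that `.+` is greedy: if the string held several tags, `s/<.+>//s` would remove everything from the first `<` to the last `>`.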