Pattern Matching Regular Expression CS346 Regular Expressions
Pattern Matching
JavaScript provides two ways to do pattern matching:
1. Using RegExp objects
2. Using methods on String objects
• The regular expression (RE) syntax is the same in both cases
• The syntax is largely the same as in Perl
Simple patterns
Two categories of characters in patterns:
a. normal characters (match themselves)
b. metacharacters (have special meanings in patterns and do not match themselves): \ | ( ) [ ] { } ^ $ * + ? .
- A metacharacter is treated as a normal character if it is preceded by a backslash
- The period (.) is a metacharacter that matches any single character except newline
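A minimal sketch of the difference between a bare period and a backslashed one. The sample strings and variable names are assumptions chosen only for illustration.

```javascript
// An unescaped "." matches any single character except newline.
var unescapedDot = /3.14/;
// A backslashed "." matches only a literal period.
var escapedDot = /3\.14/;

var looseHit = unescapedDot.test("3x14");   // true: "." accepts "x"
var strictMiss = escapedDot.test("3x14");   // false: "x" is not a period
var strictHit = escapedDot.test("3.14");    // true
var dotOnNewline = /a.b/.test("a\nb");      // false: "." does not match newline
```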
Creating RegExp objects
• var varname = /reg_ex_pattern/flags;
• Simplest example: exact match
• To match an occurrence of "our" in a string such as "your", "our", "sour", "four", or "pour":
    var toMatch = /our/;
1. Matching with RegExp objects
• Use the test() method of a RegExp object to test a string for pattern matches
• Format: regexp.test( string_to_be_tested )
• test() returns a Boolean that indicates whether the pattern occurs anywhere in the tested string
• This is the most commonly used method for validation
    var toMatch = /our/;
    var result = toMatch.test("pour");  // result is true
• Example: 16-0-checkName.html
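As a small sketch, test() can be applied to each of the words mentioned earlier. The extra word "oar" is an assumption added to show a failing case.

```javascript
var toMatch = /our/;
// "oar" is a made-up non-matching word, added for contrast.
var words = ["your", "our", "sour", "four", "pour", "oar"];

// test() returns true when the pattern occurs anywhere in the word.
var results = words.map(function (word) {
  return toMatch.test(word);
});
// results is [true, true, true, true, true, false]
```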
Pattern Modifiers (Adding flags)
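A minimal sketch of three flags that JavaScript regular expressions support: i (ignore case), g (global, all matches) and m (multiline, so ^ and $ also match at line breaks). The sample strings and variable names are assumptions.

```javascript
// i: ignore case
var caseSensitive = /lee/.test("Mary Lee Ann");   // false
var ignoreCase = /lee/i.test("Mary Lee Ann");     // true

// g: act on every match instead of only the first
var firstOnly = "a-b-c".replace(/-/, "+");        // "a+b-c"
var everyMatch = "a-b-c".replace(/-/g, "+");      // "a+b+c"

// m: ^ and $ also match at the start and end of each line
var stringStart = /^two/.test("one\ntwo");        // false
var lineStart = /^two/m.test("one\ntwo");         // true
```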
2. Matching in Strings
• search() method
  - Returns the position of the first match of the RE pattern in the string (positions start at zero); returns -1 if there is no match
    var str = "Gluckenheimer";
    var position = str.search(/n/);  /* position is now 6 */
• match() method
  - Compares an RE with a string and returns the matches
• replace() method
  - Finds where an RE matches a string and replaces the matched text with a new string
search() method
• Format: string.search(reg-exp)
• Searches the string for the first match to the given regular expression
• Returns an integer that indicates the position of the match in the string (zero-indexed)
• If no match is found, the method returns -1
• Similar to the indexOf() method, but accepts a regular expression
• Example: To find the location of the first absolute link within an HTML document:
    var pos = htmlString.search(/<a\s+href\s*=\s*"http:\/\//i);
    if (pos != -1) {
        alert('First absolute link found at position ' + pos + '.');
    } else {
        alert('Absolute links not found');
    }
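A self-contained sketch of search() that runs without a browser. The HTML fragment and the variable names are assumptions; the pattern deliberately has no ^ or $ anchors so it can match in the middle of the document.

```javascript
// Hypothetical document with one relative and one absolute link.
var htmlString = '<p>See <a href="/local">here</a> or ' +
                 '<a href="http://example.com">there</a>.</p>';

// Position of the first link whose href starts with http://
var absolutePos = htmlString.search(/<a\s+href\s*=\s*"http:\/\//i);

// indexOf() finds the same place, but only for a fixed string.
var indexOfPos = htmlString.indexOf('<a href="http://');

// A pattern that does not occur yields -1.
var missingPos = htmlString.search(/<img/i);
```

search() is the better fit when spacing or letter case may vary; indexOf() is enough for an exact substring.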
match() method
• Format: string.match( regular_expression )
• With the g flag, returns an array of all the matching substrings found in the string. If no matches are found, match() returns null, which counts as false in a condition.
• Example: To check the proper format for a phone number entered by a user, with the form (XXX) XXX-XXXX:
    function checkPhone(phone) {
        var phoneRegex = /^\(\d\d\d\) \d\d\d-\d\d\d\d$/;
        if (!phone.match(phoneRegex)) {
            alert('Please enter a valid phone number');
            return false;
        }
        return true;
    }
replace() method
• Format: string.replace( reg_exp, replacement_string )
• Returns a new string in which matches of the regular expression are replaced with the replacement string; the original string is not modified
• Example: To replace every newline character (\n) with a break <br /> tag:
    /* assumes that the HTML form is the first one present in the document, and it has a field named "comments" */
    var comment = document.forms[0].comments.value;
    comment = comment.replace(/\n/g, "<br />");
• The same operation as a function:
    function formatField(fieldValue) {
        return fieldValue.replace(/\n/g, "<br />");
    }
• The function accepts any string as a parameter and returns a new string with all of the newline characters replaced by <br /> tags.
Character classes – [ ]
• A sequence of characters in brackets defines a set of characters, any one of which matches
  - e.g. [abcd]
• Dashes are used to specify spans of characters in a class
  - e.g. [a-z]
• A caret at the left end of a class definition means the opposite: any character not in the set
  - e.g. [^0-9]
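The three forms of character class can be sketched as follows. The sample strings and variable names are assumptions.

```javascript
// A set: any one of the listed characters matches.
var vowelHits = "regular".match(/[aeiou]/g);   // ["e", "u", "a"]

// A span: any lowercase letter from a to z.
var hasLower = /[a-z]/.test("ABc");            // true
var noLower = /[a-z]/.test("ABC");             // false

// A negated class: any character that is NOT a digit.
var hasNonDigit = /[^0-9]/.test("12a4");       // true
var onlyDigits = /[^0-9]/.test("1234");        // false
```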
Character class abbreviations
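A short sketch of four abbreviations that JavaScript supports: \d (digit), \w (letter, digit or underscore), \s (whitespace) and \W (anything that is not \w). The sample string is an assumption.

```javascript
var sample = "Room 12B, floor 3";

var digits = sample.match(/\d/g);            // ["1", "2", "3"]
var words = sample.match(/\w+/g);            // ["Room", "12B", "floor", "3"]
var spaceCount = sample.match(/\s/g).length; // 3
var nonWord = sample.match(/\W/g);           // three spaces and one comma
```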
From Chapter 25 of the text (Perl)
Note the difference between the use of ^ here and its use in a character class
Quantifiers
Quantifiers in braces (repetitions)
Quantifier    Meaning
{n}           exactly n repetitions
{m,}          at least m repetitions
{min,max}     at least min but at most max repetitions
Some other common quantifiers
*    zero or more repetitions, e.g., \d* means zero or more digits
+    one or more repetitions, e.g., \d+ means one or more digits
?    zero or one, e.g., \d? means zero or one digit
.    exactly one character except the newline character (a metacharacter rather than a quantifier), e.g., /.l/ matches "al" or "@l" but not "\nl" or "l" alone
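A brief sketch of the brace quantifiers together with *, + and ?. Each pattern is wrapped in ^ and $ so that it must match the whole string; the sample strings are assumptions.

```javascript
// Brace quantifiers
var exactlyThree = /^\d{3}$/.test("123");     // true
var atLeastTwo = /^\d{2,}$/.test("7");        // false: only one digit
var twoToFour = /^\d{2,4}$/.test("12345");    // false: five digits

// *, + and ?
var zeroOrMore = /^ab*$/.test("a");           // true: zero b's allowed
var oneOrMore = /^ab+$/.test("a");            // false: at least one b needed
var optionalU = /^colou?r$/.test("color");    // true: the u is optional
```

Note that no space is allowed after the comma in {2,4}.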
Anchors
• The pattern can be forced to match only at the left end of the string with ^, and only at the right end with $
  - /^Lee/ matches "Lee Ann" but not "Mary Lee Ann"
  - /Lee Ann$/ matches "Mary Lee Ann" but not "Mary Lee Ann is nice"
• The anchor operators (^ and $) do not match characters in the string; they match positions, at the beginning or end
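The anchor examples above can be checked directly. The last line is an added illustration (an assumption, not from the slides) that anchors match positions: replacing them inserts text without removing any characters.

```javascript
var startsWithLee = /^Lee/.test("Lee Ann");               // true
var notAtStart = /^Lee/.test("Mary Lee Ann");             // false
var endsWithLeeAnn = /Lee Ann$/.test("Mary Lee Ann");     // true
var notAtEnd = /Lee Ann$/.test("Mary Lee Ann is nice");   // false

// ^ and $ match empty positions, so nothing in "Lee" is consumed.
var bracketed = "Lee".replace(/^/, "[").replace(/$/, "]"); // "[Lee]"
```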
Examples
• test()
  - See 16-1checkURL.html
  - See 16-2validEmail.html
• search() method in String
  - See 16-3check_phone.html
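replace() method
• replace(RE_pattern, string)
• Finds a substring that matches the pattern and replaces it with the string
• The g modifier applies: with g, every match is replaced; without it, only the first
    var str = "Some rabbits are rabid";
    str = str.replace(/rab/g, "tim");
    /* str is now "Some timbits are timid" */
• replace() returns a new string, so the result must be assigned for str to change
• Substrings matched by parenthesized parts of the pattern can be referred to as $1, $2, etc. in the replacement string; /rab/ has no parentheses, so there are none here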
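A minimal sketch of replace() showing that the original string is left untouched, and how $1 and $2 refer to parenthesized groups inside the replacement string. The name-swapping example and the variable names are assumptions.

```javascript
var str = "Some rabbits are rabid";
var replaced = str.replace(/rab/g, "tim");
// replaced is "Some timbits are timid"; str itself is unchanged

// $1 and $2 stand for the first and second parenthesized groups.
var swapped = "Lee, Ann".replace(/(\w+), (\w+)/, "$2 $1");
// swapped is "Ann Lee"
```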
match(pattern)
• Most general pattern-matching method
• Returns an array of results of the pattern-matching operation, or null if there is no match
• With the g modifier, returns an array of the substrings that matched
• Without the g modifier, the first element of the returned array is the matched substring; the other elements hold the values of $1, $2, ... captured by the parenthesized parts of the pattern
    var str = "My 3 kings beat your 2 aces";
    var matches = str.match(/[ab]/g);
    /* matches is set to ["b", "a", "a"] */
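The two modes of match() can be compared on the same string. The patterns below are assumptions chosen to show the difference; they are not taken from the course examples.

```javascript
var str = "My 3 kings beat your 2 aces";

// With g: every matching substring, no group information.
var allDigits = str.match(/\d/g);        // ["3", "2"]

// Without g: the first match, followed by the parenthesized parts.
var first = str.match(/(\d) (\w+)/);     // ["3 kings", "3", "kings"]
var firstIndex = first.index;            // 3, where the match starts

// No match at all gives null rather than an empty array.
var none = str.match(/z/);               // null
```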
match(pattern) example
• 16-4matchExample.html
    var str = "Having a take-home exam that takes 3 hours to complete is better than a 1-hour in-class exam";
    var matches = str.match(/\d/g);
    /* matches is set to ["3", "1"]; the elements are strings, not numbers */
Parentheses in RE
• Example: 16-5complexMatchEx.html
    var str = "I have 118 credits; but I need 120 to graduate";
    var matches = str.match(/(\d+)([^\d]+)(\d+)/);
    document.write(matches, "<br />");
• The 1st element of matches is the whole match, the 2nd is the value of $1, the 3rd is $2, the 4th is $3, etc.
• matches array as written: 118 credits; but I need 120,118, credits; but I need ,120
  - "118 credits; but I need 120" is the match with the RE
  - "118" is $1
  - " credits; but I need " is $2
  - "120" is $3
Alternate patterns
• Use the alternation operator |
• Example: 16-6matchAlternatives.html
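A small sketch of the alternation operator. The patterns and strings are assumptions and are not taken from 16-6matchAlternatives.html.

```javascript
// Any one of the alternatives may match.
var fruit = /apple|banana|cherry/;
var hasFruit = fruit.test("banana split");          // true
var noFruit = fruit.test("plain yogurt");           // false

// Parentheses limit how far the alternation reaches.
var title = /^(Mr|Ms|Dr)\. /;
var knownTitle = title.test("Dr. Lee");             // true
var unknownTitle = title.test("Prof. Lee");         // false

var spellings = "gray or grey".match(/gr(a|e)y/g);  // ["gray", "grey"]
```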
split(parameter) of String
• Splits a string into substrings based on a pattern
• ":" and /:/ both work as the parameter
• Example: 16-7splitEx.html
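A minimal sketch showing that a string separator and the equivalent regular expression give the same result, and that a regular expression can accept several separators at once. The sample data is an assumption and is not taken from 16-7splitEx.html.

```javascript
var record = "root:x:0:0";

var byString = record.split(":");   // ["root", "x", "0", "0"]
var byRegex = record.split(/:/);    // same result

// A pattern can allow a comma or semicolon plus optional spaces.
var flexible = "a, b;c".split(/[,;]\s*/);   // ["a", "b", "c"]
```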
Program Structure
• Example: 16-3check_phone.html
• Limitations?
• How can you make it more flexible?
• Can you generalize it for checking multiple fields?
Uniform Program Structure for multiple tests
• regex_name.test( string_to_be_tested ) to test each field
• if test() returns false, compile an error message
• See 16-8Structure.html
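One possible way to sketch this structure is a table of rules that is checked in a loop. The field names, patterns, messages and the validateFields function are all hypothetical; this is not the code from 16-8Structure.html.

```javascript
// Hypothetical rules: one pattern and one error message per field.
var rules = [
  { name: "phone", pattern: /^\(\d{3}\) \d{3}-\d{4}$/,
    message: "Phone must look like (XXX) XXX-XXXX" },
  { name: "zip", pattern: /^\d{5}$/,
    message: "ZIP code must be five digits" }
];

// Tests every field and collects a message for each failure.
function validateFields(values) {
  var errors = [];
  for (var i = 0; i < rules.length; i++) {
    var rule = rules[i];
    var value = values[rule.name] || "";   // a missing field counts as empty
    if (!rule.pattern.test(value)) {
      errors.push(rule.message);
    }
  }
  return errors;
}

var okErrors = validateFields({ phone: "(715) 555-1234", zip: "54701" });
var badErrors = validateFields({ phone: "555-1234", zip: "54701" });
```

Adding a field then means adding one entry to rules rather than writing another function.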
Examples of curly braces { }
• 16-9-curly_braces.html
Table – Regular Expression Codes
• See "Regular Expression Codes.doc"