Lecture 12: Regular Expression in JavaScript
Regular Expression • A regular expression is a way to describe a pattern of characters. • It is used for pattern matching or keyword search. • Regular expressions are frequently used to test whether a string entered in an HTML form has a certain format. • For example, you may want to test whether a user entered a string that contains 3 consecutive digits: • \d\d\d
Regular expression in JavaScript • JavaScript provides regular expression capabilities through instances of the built-in RegExp object. • var regTest = new RegExp(pattern, modifiers); • pattern specifies the pattern of the expression. • modifiers specify whether a search should be global, case-insensitive, or both. • The i modifier performs case-insensitive matching. • The g modifier performs a global match (finds all matches rather than stopping after the first match). • match() is a method of a String instance that returns an array of match results if the string matches the pattern; otherwise, it returns null.
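The effect of the i and g modifiers can be sketched as follows. The sample string and variable names are illustrative assumptions, and console.log is used instead of alert so the snippet also runs outside a browser.

```javascript
// Sample text is an illustrative assumption, not from the lecture.
const text = "FSU and fsu and Fsu";

// No modifiers: case-sensitive, and the search stops at the first match.
const plain = new RegExp("fsu");
// "g" finds every match, "i" ignores case.
const globalInsensitive = new RegExp("fsu", "gi");

const firstOnly = text.match(plain);
const allMatches = text.match(globalInsensitive);

console.log(firstOnly[0]);          // fsu
console.log(allMatches.join(", ")); // FSU, fsu, Fsu
```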
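match() in JavaScript • match() is a method of a String instance that returns an array describing the match if the string matches the pattern; otherwise, it returns null. An array is truthy and null is falsy, so the result can be used directly in an if statement. For example: var word = "I am a FSU student"; var re = /fsu/i; if (word.match(re)) alert("Yes"); else alert("No");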
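A short sketch of what match() actually hands back, using the same sentence as the slide. The variable names hit and miss are illustrative, and console.log stands in for alert.

```javascript
const word = "I am a FSU student";

// With the i modifier the pattern matches "FSU".
const hit = word.match(/fsu/i);
// Without it there is no lowercase "fsu", so match() returns null.
const miss = word.match(/fsu/);

console.log(hit[0]);             // FSU
console.log(hit.index);          // 7
console.log(miss);               // null
console.log(hit ? "Yes" : "No"); // Yes
```

Because the result is either an array or null, it behaves like a boolean inside an if condition even though it is not one.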
test() • test(argument) is a method of a RegExp instance that returns true if the argument contains a match for the pattern; otherwise, it returns false. For example: var phoneNum = "8502349 023"; var regTest = new RegExp("^\\d{10}$"); if (regTest.test(phoneNum)) { alert("Valid phone number"); } else { alert("Invalid phone number"); } What is the output?
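One way to check the answer is to run the same pattern against the slide's input and against a second string with the space removed. The second string and the name tenDigits are assumptions added for contrast.

```javascript
// Same pattern as the slide: exactly ten digits from start to end.
const tenDigits = new RegExp("^\\d{10}$");

// The slide's input contains a space, which \d does not match.
console.log(tenDigits.test("8502349 023")); // false
// Hypothetical input with the space removed.
console.log(tenDigits.test("8502349023"));  // true
```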
Caret ^ and dollar sign $ • ^ indicates the beginning of a string. • $ indicates the end of a string. • For example: • \d\d\d matches any string containing three consecutive digits, such as "my number is 123, blah blah". • ^\d\d\d$ matches only strings consisting of exactly three digits.
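The difference between the anchored and unanchored patterns can be sketched like this. The names anywhere and whole are illustrative.

```javascript
const anywhere = /\d\d\d/;  // three digits somewhere in the string
const whole = /^\d\d\d$/;   // the entire string is exactly three digits

const sentence = "my number is 123, blah blah";

console.log(anywhere.test(sentence)); // true
console.log(whole.test(sentence));    // false
console.log(whole.test("123"));       // true
console.log(whole.test("1234"));      // false
```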
Regular expression literal • It is tiresome to write duplicate backslashes. • var regTest = new RegExp("^\\d\\d\\d$"); • An alternative syntax for creating a RegExp instance is • var regTest = /^\d\d\d$/; • The expression on the right-hand side is known as a regular expression literal. Because it is not a string literal, its backslashes do not need to be doubled; the scripting engine passes them to the regular expression as written.
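A small sketch comparing the two syntaxes, plus what happens when the backslashes in the string form are not doubled. The variable names are illustrative.

```javascript
const fromString = new RegExp("^\\d\\d\\d$");
const fromLiteral = /^\d\d\d$/;

// Both forms produce the same pattern text.
console.log(fromString.source === fromLiteral.source); // true

// In a string literal "\d" is just "d", so single backslashes are lost.
const broken = new RegExp("^\d\d\d$");
console.log(broken.source);      // ^ddd$
console.log(broken.test("123")); // false
console.log(broken.test("ddd")); // true
```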
Special characters • The simplest form of regular expression is a single character that is not one of the regular expression special characters: • ^ $ \ . * + ? ( ) [ ] { } | • A special character is escaped by preceding it with a backslash. For example, \$$ represents the set of strings that end with a dollar sign. • A . matches any single character except a line terminator such as \n or \r (a tab is matched). • * is called the Kleene star; it matches zero or more repetitions of the preceding expression, so it can represent infinitely large sets of strings.
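The three ideas on this slide (escaping, the dot, and the Kleene star) can be tried out with a few test() calls. The sample strings and variable names are assumptions chosen for illustration.

```javascript
// \$ is a literal dollar sign; the second $ anchors it to the end.
const endsWithDollar = /\$$/;
console.log(endsWithDollar.test("cost in $")); // true
console.log(endsWithDollar.test("$5"));        // false

// . matches a tab but not a line feed.
const anyChar = /^.$/;
console.log(anyChar.test("\t")); // true
console.log(anyChar.test("\n")); // false

// a* matches zero or more a's, including the empty string.
const onlyAs = /^a*$/;
console.log(onlyAs.test(""));     // true
console.log(onlyAs.test("aaaa")); // true
console.log(onlyAs.test("ab"));   // false
```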
Escape Code • An escape code is a regular expression that matches any one character from a set of characters. • A list of escape codes: • \d – digit: 0 through 9 • \D – any character except those matched by \d • \s – space: any JavaScript white space or line terminator (space, tab, line feed, etc.) • \S – any character except those matched by \s • \w – "word" character: any letter (a through z and A through Z), digit, or underscore • \W – any character except those matched by \w
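To see which escape codes a given character belongs to, a small helper can test it against each one. The helper matchingCodes and the lookup table escapeCodes are hypothetical names introduced for this sketch.

```javascript
// Hypothetical lookup table: escape code text -> regular expression.
const escapeCodes = {
  "\\d": /\d/, "\\D": /\D/,
  "\\s": /\s/, "\\S": /\S/,
  "\\w": /\w/, "\\W": /\W/,
};

// Returns the escape codes that match the single character ch.
function matchingCodes(ch) {
  return Object.keys(escapeCodes).filter((code) => escapeCodes[code].test(ch));
}

console.log(matchingCodes("7").join(" ")); // \d \S \w
console.log(matchingCodes(" ").join(" ")); // \D \s \W
console.log(matchingCodes("_").join(" ")); // \D \S \w
console.log(matchingCodes("-").join(" ")); // \D \S \W
```

Every character matches exactly one code from each lowercase/uppercase pair, since the uppercase code is the complement of the lowercase one.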
White space in regular expression • A white space is significant within a regular expression • For example, we have a regular expression, such as ^\d\. \w$ • Does “3.A” match this regular expression? • Does “9. B” match this regular expression?
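The two questions can be checked directly; the pattern is written as a regular expression literal and the variable name is illustrative.

```javascript
// Digit, literal dot, one literal space, word character.
const spaced = /^\d\. \w$/;

console.log(spaced.test("3.A"));  // false: the space is missing
console.log(spaced.test("9. B")); // true
```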
Concatenation Operator • Simple regular expressions can be composed into more complex regular expressions using the concatenation operator, which is written by placing the expressions next to each other. • For instance, ^\d \s$ is a concatenated regular expression: a digit, then a space, then a white space character. • When you want to concatenate a regular expression with itself multiple times, you can use the quantifier shorthand notation. • \d{3} is equivalent to \d\d\d
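A brief sketch of both points: the concatenated pattern from the slide, and the quantifier shorthand compared with its long form. The sample strings are assumptions.

```javascript
const concatenated = /^\d \s$/;
console.log(concatenated.test("5 \t")); // true: digit, space, tab
console.log(concatenated.test("5\t ")); // false: second character must be a space

// \d{3} and \d\d\d accept and reject the same strings.
const shorthand = /^\d{3}$/;
const longhand = /^\d\d\d$/;
const samples = ["12", "123", "1234", "abc"];
const sameResults = samples.every(
  (s) => shorthand.test(s) === longhand.test(s)
);
console.log(sameResults); // true
```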
Union Operator • The union operator is represented by the pipe symbol | • For example, \d|\s represents the set consisting of all digit and white space characters. • Concatenation takes precedence over union, so \+|-\d|\s consists of +, the two-character strings made of - followed by a digit, and the white space characters.
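The precedence rule can be sketched by anchoring the slide's pattern and comparing it with a version where parentheses force the union to apply first. The parenthesized variant grouped is an assumption added for contrast.

```javascript
// Slide's pattern, wrapped in parentheses only so ^ and $ apply to the whole union.
const union = /^(\+|-\d|\s)$/;
console.log(union.test("+"));  // true
console.log(union.test("-5")); // true
console.log(union.test(" "));  // true
console.log(union.test("+5")); // false: + stands alone
console.log(union.test("-"));  // false: - needs a digit

// Grouping the signs changes the meaning: a sign followed by a digit.
const grouped = /^(\+|-)\d$/;
console.log(grouped.test("+5")); // true
console.log(grouped.test("-5")); // true
```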
Character class • It is tedious to use the union operator to represent the set of all lowercase letters. • JavaScript provides character classes that can be used for this purpose. • [a-z] • [A-Z] • [0-9] • How can \w be represented using a character class?
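One possible answer to the question is [a-zA-Z0-9_], which follows from the definition of \w given earlier. The sketch below checks that the two agree on a handful of illustrative characters.

```javascript
const wordClass = /^[a-zA-Z0-9_]$/;
const wordEscape = /^\w$/;

// Illustrative sample characters.
const chars = ["a", "Z", "5", "_", "-", " "];
const agree = chars.every((ch) => wordClass.test(ch) === wordEscape.test(ch));

console.log(agree);               // true
console.log(wordClass.test("_")); // true
console.log(wordClass.test("-")); // false
```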
Examples of regular expression • What does \d{3,6} represent? • How about (\+|-){0,1}\d? • How about \d*?
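These three patterns can be explored by anchoring each one with ^ and $ and testing a few strings. The anchors and sample strings are assumptions added so that each result reflects the whole string.

```javascript
// Between three and six digits.
const threeToSix = /^\d{3,6}$/;
console.log(threeToSix.test("12"));      // false
console.log(threeToSix.test("123456"));  // true
console.log(threeToSix.test("1234567")); // false

// An optional sign followed by one digit.
const optionalSign = /^(\+|-){0,1}\d$/;
console.log(optionalSign.test("5"));   // true
console.log(optionalSign.test("-5"));  // true
console.log(optionalSign.test("+-5")); // false

// Zero or more digits, so the empty string qualifies.
const anyDigits = /^\d*$/;
console.log(anyDigits.test(""));    // true
console.log(anyDigits.test("42"));  // true
console.log(anyDigits.test("4a2")); // false
```

Without the anchors, \d* matches inside every string, because zero digits can always be found at the first position.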