
Regular Expressions

Regular Expressions. Adapted from Javascript Regular Expressions by Bob Molnar Indiana University/IUPUI Streaming Media Laboratory http://wally.cs.iupui.edu/n341_05/. Goals. By the end of this unit you should … Understand what regular expressions are


Presentation Transcript


  1. Regular Expressions Adapted from Javascript Regular Expressions by Bob Molnar Indiana University/IUPUI Streaming Media Laboratory http://wally.cs.iupui.edu/n341_05/

  2. Goals By the end of this unit you should … • Understand what regular expressions are • Be able to use regular expressions to match text against a particular string pattern • Be able to use special regular expression characters to match multiple search terms against strings

  3. What is a Regular Expression? • A regular expression is a pattern of characters. • We use regular expressions to search for matches on particular text. • In JavaScript, we can use regular expressions by creating instances of the regular expression object, RegExp.

  4. The Regular Expression Constructor • We can declare a regular expression by using a constructor. General form: var regExpName = new RegExp("pattern", "flags"); • Example: var searchTermRE = new RegExp("s", "gi"); (search for the letter “s” globally, ignoring case)

  5. The Regular Expression Literal • Declaring with a regular expression literal: var searchTermRE = /X1X4/gi; • When declaring a regular expression literal, do NOT use quotation marks; instead, enclose the expression in a pair of forward slashes. • By convention, variables acting as regular expressions end with the suffix “RE”. Flags come after the second forward slash.
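The two declaration styles from slides 4 and 5 can be compared side by side. This is a minimal sketch; the variable names and the sample sentence are made up for illustration.

```javascript
// Two ways to build the same pattern (names here are illustrative only).
const fromConstructor = new RegExp("s", "gi"); // pattern and flags passed as strings
const fromLiteral = /s/gi;                     // pattern between slashes, flags after

const sample = "She sells seashells";

// Both objects carry the same pattern text and the same flags.
console.log(fromConstructor.source === fromLiteral.source); // true
console.log(fromConstructor.flags === fromLiteral.flags);   // true

// With "g" and "i", every "s" or "S" in the sample is found.
console.log(sample.match(fromLiteral).length);     // 6
console.log(sample.match(fromConstructor).length); // 6
```

The constructor form is mainly useful when the pattern is only known at run time (for example, when it comes from user input); the literal form is shorter when the pattern is fixed.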

  6. Literal Characters • A lot of the time, we use regular expressions to match specific patterns, like the word “java”: var firstRE = /java/; (would match “java” on its own and inside “javascript” or “javabeans”, but not “myJava”, because the capital “J” does not match without the ignore case flag) • Matching character for character is termed matching literals.

  7. Non-Printing Literal Characters • We also consider some non-printing characters as literals: • \t (tab character) • \n (newline character) • \0 (NUL character – null value)

  8. Metacharacters • Sometimes, we want to search not for specific patterns, but for parts of patterns. • Consider searching for all lines that end with the letter “s”. To do so, we’ll need to use metacharacters: var firstRE = /s$/; (matches any string that ends with “s”)

  9. What are Metacharacters? • Metacharacters are characters used to represent special patterns that don’t necessarily fit in the range of standard letters and numbers (A-Z; a-z; 0-9, etc.). • We often use symbols as metacharacters to indicate a special circumstance. • Some of these symbols include: $ . ^ * ?

  10. Metacharacters as Literals • What if I want to search for a literal symbol that is also used as a metacharacter? To search for a symbol as a literal and not as a metacharacter, we use the \ (backslash) to turn “off” the metacharacter property. • $ used as a metacharacter: var firstRE = /s$/; • $ used as a literal character: var firstRE = /\$/;
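The difference between $ as a metacharacter and \$ as a literal can be checked with RegExp.test(). A small sketch; the variable names and test strings are invented for illustration.

```javascript
const endsWithS = /s$/;  // $ as a metacharacter: "s" must be the last character
const dollarSign = /\$/; // \$ as a literal: an actual dollar sign anywhere

console.log(endsWithS.test("prices"));    // true
console.log(endsWithS.test("price $5"));  // false (the string ends with "5")
console.log(dollarSign.test("price $5")); // true
console.log(dollarSign.test("prices"));   // false (no dollar sign present)
```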

  11. Flags • When searching, flags can help refine or expand a search. • Flags modify a particular search to fit certain criteria. • There are three common flags: the global flag (g), the ignore case flag (i) and the multiline flag (m).

  12. The Global flag • In a regular expression without flags, JavaScript will return only the first instance of a search term: var mySearchRE = /X1X4/; (returns only the first instance of “X1X4”) • To modify the search to include all instances of “X1X4”, we would use the global flag: var mySearchRE = /X1X4/g; (returns all instances of “X1X4”)

  13. The Ignore Case flag • In a regular expression without flags, JavaScript only returns an exact match: var mySearchRE = /X1X4/; (returns only an instance of “X1X4”, but not “x1x4” or “x1X4”, etc.) • To modify the search to include instances of “X1X4”, regardless of case, we would use the ignore case flag: var mySearchRE = /X1X4/i; (returns an instance of “X1X4”, “x1x4”, “x1X4”, etc.)

  14. The Multiline flag • A single string may include newline characters. • The multiline flag makes ^ and $ match at the beginning or end of each line, not just the beginning or end of the whole string. To turn it on: var mySearchRE = /^X1X4/m;

  15. Combining Flags • We can also combine flags to expand our search: var mySearchRE = /X1X4/gi; (returns all instances of “x1x4”, “x1X4”, “X1x4” and “X1X4”)
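The effect of the global and ignore case flags, alone and combined, can be seen by running the same pattern over one string. This is a sketch with a made-up sample string; String.match() is used here only to display the results.

```javascript
const codes = "x1x4 X1X4 x1X4 X1X4"; // hypothetical sample text

// No flags: only the first exact-case match is returned.
console.log(codes.match(/X1X4/)[0]);    // "X1X4"
console.log(codes.match(/X1X4/).index); // 5

// g: every exact-case match.
console.log(codes.match(/X1X4/g));      // ["X1X4", "X1X4"]

// i: the first match regardless of case.
console.log(codes.match(/X1X4/i)[0]);   // "x1x4"

// gi: every match regardless of case.
console.log(codes.match(/X1X4/gi));     // ["x1x4", "X1X4", "x1X4", "X1X4"]
```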

  16. Searching for Matches Only at the Beginning of a Line • Consider the following two-line string (the second line begins at “The park guard”): Jimmy the Scot scooted his scooter through the Park. The park guard watched Jimmy do this. • The code: var mySearchRE = /^Jimmy/gm; (would only return “Jimmy” from the first line) • The ^ metacharacter says “look only for matches at the beginning of the string or line (multiline mode).”

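  17. Searching for Matches Only at the End of a Line • Consider the following two-line string (the second line begins at “The park guard”): Jimmy the Scot scooted his scooter through the Park. The park guard watched Jimmy do this. • The code: var mySearchRE = /his\.$/gm; (would only return “his.” from the end of the second line, as part of “this.”; the escaped period is needed because the line ends with “.”, so /his$/gm alone would find nothing) • The $ metacharacter says “look only for matches at the end of the string or line (multiline mode).”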
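The ^ and $ anchors from slides 16 and 17 can be tried on the sample string. The sketch below assumes the two sentences are separated by a newline character (\n), as the slides' references to a first and second line suggest. Note that both lines end with a period, so an end-of-line pattern has to include it.

```javascript
// Assumption: the sample text is two lines joined by "\n".
const story =
  "Jimmy the Scot scooted his scooter through the Park.\n" +
  "The park guard watched Jimmy do this.";

// ^ with the m flag: only the "Jimmy" that starts a line is found.
console.log(story.match(/^Jimmy/gm));  // ["Jimmy"]

// Without m, ^ matches only at the very start of the string.
console.log(story.match(/^The/g));     // null
console.log(story.match(/^The/gm));    // ["The"]

// $ with the m flag: the period must be part of the pattern here.
console.log(story.match(/his$/gm));    // null (each line ends with ".")
console.log(story.match(/his\.$/gm));  // ["his."]
console.log(story.match(/Park\.$/gm)); // ["Park."]
console.log(story.match(/Park\.$/g));  // null (not the end of the string)
```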

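  18. Using Boundaries • Consider the following two-line string (the second line begins at “The park guard”): Jimmy the Scot scooted his scooter through the Park. The park guard watched Jimmy do this. • To search for all instances of the word “the”, we could use the whitespace metacharacter (\s): var mySearchRE = /\sthe\s/gim; (matches only where “the” has a whitespace character on both sides, and the returned text includes that whitespace. Because \s also matches a newline, the “The” that begins the second line is still found, but a “the” at the very start of the string or directly next to punctuation would be missed)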

  19. Using Boundaries • Consider the following two-line string (the second line begins at “The park guard”): Jimmy the Scot scooted his scooter through the Park. The park guard watched Jimmy do this. • Instead of using a whitespace character, we can use the boundary (\b). The boundary metacharacter matches the position between a word character and a non-word character, so \b at the beginning of a pattern rejects matches that start in the middle of a word, and \b at the end of a pattern rejects matches that end in the middle of a word: var mySearchRE = /\bthe\b/gim;

  20. Using Boundaries (continued) • Our string: Jimmy the Scot scooted his scooter through the Park. The park guard watched Jimmy do this. • Code: var mySearchRE = /\bt/gim; • Matches every “t” that begins a word. Ignores “t” if it is in the middle or at the end of a word.
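The whitespace and boundary approaches from the three boundary slides can be compared directly. A sketch under the assumption that the sample text is two lines joined by "\n"; the variable names are illustrative.

```javascript
// Assumption: the sample text is two lines joined by "\n".
const story =
  "Jimmy the Scot scooted his scooter through the Park.\n" +
  "The park guard watched Jimmy do this.";

// \s requires whitespace on both sides and includes it in the result.
// The newline before "The" counts as whitespace.
const spaced = story.match(/\sthe\s/gi);
console.log(spaced); // [" the ", " the ", "\nThe "]

// \b matches a position, so the results contain only the word itself.
const bounded = story.match(/\bthe\b/gi);
console.log(bounded); // ["the", "the", "The"]

// At the very start of a string there is no whitespace for \s to match.
console.log("The park".match(/\sthe\s/i)); // null
console.log("The park".match(/\bthe\b/i)[0]); // "The"

// \bt finds each "t" that starts a word: the, through, the, The, this.
console.log(story.match(/\bt/gi).length); // 5
```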

  21. Searching for Multiple Patterns at the Same Time • Consider the following string: lop, mop, bop, sop, pop, gop, top, fop • To search for all instances that end with “op”, we would use the wildcard character (.), which matches any single character except a line break. The wildcard does not make the search global, so the global flag is still needed to return every match: var mySearchRE = /.op/g; (returns all of the words)
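Whether the wildcard needs the global flag can be checked on the slide's word list. A minimal sketch; String.match() is used here to show what is returned.

```javascript
const words = "lop, mop, bop, sop, pop, gop, top, fop";

// Without g, only the first match comes back.
console.log(words.match(/.op/)[0]); // "lop"

// With g, every three-character run ending in "op" is returned.
const all = words.match(/.op/g);
console.log(all.length); // 8
console.log(all); // ["lop", "mop", "bop", "sop", "pop", "gop", "top", "fop"]
```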

  22. Searching for Multiple Patterns at the Same Time • Consider the following string: lop, mop, bop, sop, pop, gop, top, fop • To search only for the instances that match “bop”, “lop” or “pop”, we would use brackets ([]) to list the allowed characters and exclude all others: var mySearchRE = /[blp]op/g;

  23. Searching for Multiple Patterns at the Same Time • Consider the following string: lop, mop, bop, sop, pop, gop, top, fop • We can also use ranges of letters in the brackets: var mySearchRE = /[a-m]op/g; (returns “bop”, “fop”, “gop”, “lop” and “mop”, but ignores all other words ending with “op”)

  24. Excluding Patterns • Consider the following string: lop, mop, bop, sop, pop, gop, top, fop • To search for all instances that end with “op” except those that begin with “b”, “l” or “p”, we use the not metacharacter (^): var mySearchRE = /[^blp]op/g; (returns all of the words except “bop”, “lop” and “pop”) • Inside brackets, the ^ symbol means “not” and DOES NOT mean the beginning of a line!

  25. Excluding Patterns • Our string: lop, mop, bop, sop, pop, gop, top, fop • We can also exclude ranges of letters in the brackets: var mySearchRE = /[^a-m]op/g; (returns all words except “bop”, “fop”, “gop”, “lop” and “mop”)
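The bracket patterns from the last four slides can be run against the same word list. This is a sketch; the global flag is included so that String.match() returns every match, in the order the words appear in the string.

```javascript
const words = "lop, mop, bop, sop, pop, gop, top, fop";

// A set: only b, l or p may come before "op".
console.log(words.match(/[blp]op/g));  // ["lop", "bop", "pop"]

// A range: any letter from a through m.
console.log(words.match(/[a-m]op/g));  // ["lop", "mop", "bop", "gop", "fop"]

// A negated set: anything except b, l or p.
console.log(words.match(/[^blp]op/g)); // ["mop", "sop", "gop", "top", "fop"]

// A negated range: anything except a through m.
console.log(words.match(/[^a-m]op/g)); // ["sop", "pop", "top"]
```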

  26. Other Metacharacters: ?, * and + • Each of these applies to the character immediately before it. • To match zero or one occurrence: var mySearchRE = /b?onk/; (matches “bonk” or “onk”) • To match zero or more occurrences: var mySearchRE = /b*onk/; (matches “bonk”, “onk” or “bbonk”) • To match one or more occurrences: var mySearchRE = /b+onk/; (matches “bonk” or “bbonk”, but not “onk”)

  27. Other Metacharacters: { } • Braces also apply to the character immediately before them. • To match a specific number of occurrences: var mySearchRE = /go{2}p/; (matches “goop”, but not “gop” or “gooop”) • To match between n and m occurrences: var mySearchRE = /go{1,3}p/; (matches “gop”, “goop” or “gooop” only)
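The quantifiers ?, *, + and { } can be compared with RegExp.test(). In this sketch the patterns are wrapped in ^ and $ (an addition for illustration, not part of the slides) so that the whole string has to match; without the anchors, a pattern such as /b?onk/ would also report a match inside “bbonk”, because it finds the “bonk” within it.

```javascript
// Helper (illustrative): test several words against one pattern.
const check = (re, list) => list.map((w) => re.test(w));

const onks = ["onk", "bonk", "bbonk"];
console.log(check(/^b?onk$/, onks)); // [true, true, false]  zero or one "b"
console.log(check(/^b*onk$/, onks)); // [true, true, true]   zero or more "b"
console.log(check(/^b+onk$/, onks)); // [false, true, true]  one or more "b"

const gops = ["gp", "gop", "goop", "gooop", "goooop"];
console.log(check(/^go{2}p$/, gops));   // [false, false, true, false, false]
console.log(check(/^go{1,3}p$/, gops)); // [false, true, true, true, false]

// A quantifier binds to the character just before it: g{2}op needs two g's.
console.log(/^g{2}op$/.test("goop")); // false
console.log(/^g{2}op$/.test("ggop")); // true
```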

  28. String.search() Method • The String.search() method gives us the character position (index number) of where the first match starts, or –1 if there is no match. • String.search() does not perform global searches and will ignore the “g” flag!
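A short sketch of String.search() follows. The sample string is taken from the earlier slides; the search term “zebra” is made up to show the no-match case.

```javascript
const line = "Jimmy the Scot scooted his scooter through the Park.";

// Index of the first match ("the" starts at position 6).
console.log(line.search(/the/));   // 6

// The g flag makes no difference: still the first position only.
console.log(line.search(/the/g));  // 6

// No match gives -1.
console.log(line.search(/zebra/)); // -1
```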

  29. Open the file called introRegExp_01.html

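  30. String.match() Method • The String.match() method returns an array containing the matches from a string, or null if there are none. • Unlike the String.search() method, the String.match() method honors the “g” flag: with it, the array contains every match; without it, the array describes only the first match.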
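How String.match() behaves with and without the global flag can be sketched as follows, assuming the two-line sample string from the earlier slides (joined by "\n"); “zebra” is a made-up term for the no-match case.

```javascript
// Assumption: the sample text is two lines joined by "\n".
const story =
  "Jimmy the Scot scooted his scooter through the Park.\n" +
  "The park guard watched Jimmy do this.";

// With g: an array of every match.
console.log(story.match(/Jimmy/g)); // ["Jimmy", "Jimmy"]

// Without g: only the first match, plus details such as its index.
const first = story.match(/Jimmy/);
console.log(first[0]);     // "Jimmy"
console.log(first.length); // 1
console.log(first.index);  // 0

// No match: null rather than an empty array.
console.log(story.match(/zebra/g)); // null
```

Because a failed search returns null, code that reads .length or indexes into the result usually needs to check for null first.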

  31. Open the file called introRegExp_02.html

  32. Summary • We can use a regular expression to search for a pattern of characters. • We can create a JavaScript regular expression by using the RegExp constructor or by creating a regular expression literal. continued …

  33. Summary • We can use the String.search() method to find the first occurrence of a regular expression. • We can use the String.match() method with the global flag to return an array of all occurrences of a regular expression.
