Python Regular Expressions


bishop

Presentation Transcript


  1. Python Regular Expressions: Easy text processing

  2. Regular Expression
     • A way of identifying certain string patterns
     • Formally, an RE is:
       • a letter or lambda (the empty string)
       • RE1 RE2 (the concatenation of two REs)
       • (RE or RE)
       • (RE)*
     • Why do you think they’re called Regular Expressions?
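As a rough illustration of how the formal rules map onto Python syntax, the sketch below checks concatenation, alternation (written with |), and the star operator using re.fullmatch. The variable names are made up for this example.

```python
import re

# Concatenation: RE1 followed by RE2.
concat = re.fullmatch(r"ab", "ab")

# Alternation: (RE or RE) is written with | in Python.
either = re.fullmatch(r"(a|b)", "b")

# Star: (RE)* means zero or more repetitions, so the empty string qualifies.
star_empty = re.fullmatch(r"(ab)*", "")
star_many = re.fullmatch(r"(ab)*", "ababab")
star_bad = re.fullmatch(r"(ab)*", "aba")  # incomplete final repetition

print(bool(concat), bool(either), bool(star_empty), bool(star_many), bool(star_bad))
```

re.fullmatch is used here so that the whole string has to fit the pattern, which is closest to the formal definition.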

  3. Python regex
     • Use the re module
       • import re
     • The special characters: . ^ $ * + ? { } [ ] \ | ( )
     • We’ll learn them one at a time…

  4. Character classes
     • [abc] means a or b or c
     • [a-c] is the same thing
     • [a-z] = any lowercase letter
     • [^579] = any character except 5, 7, or 9
     • For strings, use |: Shannon|Duvall matches either Shannon or Duvall
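A small sketch of the character classes above, using re.findall to list every character that each class accepts. The sample strings are invented for illustration.

```python
import re

# [a-c] accepts a, b, or c; every other character is skipped.
abc_chars = re.findall(r"[a-c]", "cabbage")

# [a-z] accepts lowercase letters only.
lower_chars = re.findall(r"[a-z]", "Py3")

# [^579] accepts anything except 5, 7, or 9.
not_579 = re.findall(r"[^579]", "15793")

# | chooses between whole strings rather than single characters.
name = re.search(r"Shannon|Duvall", "Dr. Duvall").group(0)

print(abc_chars, lower_chars, not_579, name)
```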

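  5. Metacharacters
     • \d any digit [0-9]
     • \D any non-digit [^0-9]
     • \s any whitespace character (space, tab, newline, and so forth)
     • \S any non-whitespace character
     • \w any alphanumeric character or underscore
     • \W any character that is not alphanumeric or underscore
     • \b any word boundary
     • . anything except newline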
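The metacharacters can be tried out with re.findall, as in this sketch. The sample text is made up for the example.

```python
import re

text = "Room 42, Floor 7"

digits = re.findall(r"\d", text)    # each single digit
words = re.findall(r"\w+", text)    # runs of alphanumeric characters
spaces = re.findall(r"\s", text)    # each whitespace character

# \b keeps "cat" from matching inside "concat" or "cats".
whole_word = re.findall(r"\bcat\b", "cat concat cats")

# . matches everything except the newline.
no_newline = re.findall(r".", "a\nb")

print(digits, words, len(spaces), whole_word, no_newline)
```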

  6. Repeat
     • * means 0 or more: ma*d matches md, mad, and maaaaad
     • + means 1 or more: ma+d matches mad and maaaaad but not md
     • ? means 0 or 1: ma?d matches md and mad only
     • {x,y} means between x and y repetitions: ma{1,3}d matches mad, maad, and maaad
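One way to check the repetition operators is to run each pattern against the same candidate words. The helper name matching and the word list are assumptions made for this sketch.

```python
import re

words = ["md", "mad", "maad", "maaad", "maaaaad"]

def matching(pattern):
    """Return the words that the pattern matches in full."""
    return [w for w in words if re.fullmatch(pattern, w)]

star = matching(r"ma*d")         # zero or more a's
plus = matching(r"ma+d")         # one or more a's
optional = matching(r"ma?d")     # zero or one a
bounded = matching(r"ma{1,3}d")  # between one and three a's

print(star, plus, optional, bounded, sep="\n")
```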

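  7. Repeating groups
     • [ab]* matches any run of a's and b's, such as a, b, bbb
     • (ab)* matches repetitions of the pair ab, such as ab, abab, ababab
     • Because * allows zero repetitions, both also match the empty string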
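The difference between repeating a character class and repeating a group shows up on a string such as "abba". This is a minimal sketch with an invented sample string.

```python
import re

# [ab]* repeats a choice of single characters, so any mix of a and b fits.
class_abba = re.fullmatch(r"[ab]*", "abba")

# (ab)* repeats the two-character sequence "ab", so "abba" does not fit.
group_abba = re.fullmatch(r"(ab)*", "abba")
group_abab = re.fullmatch(r"(ab)*", "abab")

print(bool(class_abba), bool(group_abba), bool(group_abab))
```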

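  8. More metacharacters
     • ^ outside of a character class matches the beginning of the string (and the beginning of each line when the re.MULTILINE flag is set)
     • $ matches the end of the string, or just before a trailing newline (and the end of each line with re.MULTILINE)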
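In Python, whether ^ and $ work per line depends on the re.MULTILINE flag. The sketch below compares both modes on a two-line string invented for the example.

```python
import re

text = "first line\nsecond line"

# Default mode: ^ anchors only at the very start of the string.
first_default = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also anchors right after each newline.
first_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# The same applies to $ at the end of the string versus the end of each line.
last_default = re.findall(r"\w+$", text)
last_multiline = re.findall(r"\w+$", text, re.MULTILINE)

print(first_default, first_multiline, last_default, last_multiline)
```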

  9. What can I do with them? Search
     • re.search(pattern, string, <flags>)
       • pattern is the regex
       • string is what you are searching in
       • flags are special modifiers, optional
     • This returns either None (which is falsy) or a Match object
     • When specifying the regex, use the r prefix to denote a “raw string”, so backslashes reach the regex engine unchanged
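To show what the optional flags argument does, this sketch runs the same search with and without re.IGNORECASE, one of the flags the re module provides. The variable names are illustrative.

```python
import re

line = "Cats are smarter than dogs"

# Without flags the search is case-sensitive, so "cats" is not found.
strict = re.search(r"cats", line)

# re.IGNORECASE lets "cats" match "Cats".
relaxed = re.search(r"cats", line, flags=re.IGNORECASE)

if strict is None:
    print("no case-sensitive match")
if relaxed:
    # A Match object records what was matched and where.
    print(relaxed.group(0), relaxed.start())
```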

  10. Search Example
      import re
      line = "Cats are smarter than dogs"
      if re.search(r'.*are.*than.*', line):
          print("yes")

  11. Groups
      • Using () in a regex creates a group that can be referenced later.
      • The string that matches the entire regex is said to be group 0.
      • Other groups are numbered, starting at 1.

  12. Grouping example
      import re
      m = re.search(r'(\w+) (\w+)', "Shannon Lynn Duvall")
      m.group(0)   # 'Shannon Lynn'
      m.group(1)   # 'Shannon'
      m.group(2)   # 'Lynn'

  13. Grouping Example
      • Would it match?
        m = re.search(r'(\w+) \1', "Shannon Shannon")
      • Space taken out:
        m = re.search(r'(\w+)\1', "Shannon Shannon")
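Running both searches answers the question. The sketch below prints what each one finds; the variable names are chosen for this example. The second result may be surprising: without the space, \1 must follow the group immediately, and the first place that happens is the doubled n inside "Shannon".

```python
import re

# With the space: a word, a space, then the same word again.
with_space = re.search(r"(\w+) \1", "Shannon Shannon")

# Without the space: some run of word characters immediately repeated.
no_space = re.search(r"(\w+)\1", "Shannon Shannon")

print(with_space.group(0))                  # the whole repeated name
print(no_space.group(0), no_space.span())   # the doubled letter and its position
```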

  14. Nested groups
      • Group numbers go from the outside in. Count the opening parentheses.
        m = re.search(r'(a(b)c)d', "abcd")
        m.group(0)   # 'abcd'
        m.group(1)   # 'abc'
        m.group(2)   # 'b'

  15. sub: search and replace
      • re.sub(regex, putIn, string, flags=<flags>)
        (pass flags by keyword, because the fourth positional argument of re.sub is count)
      • phone = "1-800-555-9090"
      • newPhone = re.sub(r'\D', "", phone)
      • What is newPhone?
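The question can be checked directly. This sketch runs the substitution from the slide and, as an extra illustration that is not in the original, a variant that swaps each non-digit for a dot instead of deleting it.

```python
import re

phone = "1-800-555-9090"

# \D matches every non-digit, so replacing with "" keeps only the digits.
new_phone = re.sub(r"\D", "", phone)

# Illustrative variant: replace each non-digit with a dot.
dotted = re.sub(r"\D", ".", phone)

print(new_phone)
print(dotted)
```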

  16. findall
      • Search for all matches and return them as a list
      • song = "12 drummers drumming, 11 pipers piping, 10 lords a leaping"
      • nums = re.findall(r'\d+', song)
      • nums is now ['12', '11', '10']

  17. split
      • Split a string, using the regex matches as the delimiters.
        verses = re.split(r'\d+', song)
        verses is ['', ' drummers drumming, ', ' pipers piping, ', ' lords a leaping']

  18. split with groups
      • Sometimes you want the delimiter to show up in the list. Use a group: the text matched by the group will be returned in the list.
        verses = re.split(r'(\d+)', song)
        verses is: ['', '12', ' drummers drumming, ', '11', ' pipers piping, ', '10', ' lords a leaping']

  19. Examples
      • You have a string that represents a poker hand:
        • a, k, q, j for ace, king, queen, jack
        • 1-9 for numbers 1-9
        • 0 for 10

  20. How would you:
      • Make sure a string is a valid hand?
      • Check for a pair of sevens?
      • Check for any pair?
      • Check for 3 of a kind?
      • Check for a full house?
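One possible set of answers is sketched below; it is not the only way to solve the exercise. It assumes a hand is exactly five rank characters with no suits (for example "a7q7k"), which the slides do not state. All function names are hypothetical. The full-house check sorts the hand first so that equal cards sit next to each other.

```python
import re

# Assumption: a hand is five rank characters, no suits, e.g. "a7q7k".
VALID_HAND = re.compile(r"[akqj0-9]{5}")

def is_valid_hand(hand):
    return VALID_HAND.fullmatch(hand) is not None

def has_pair_of_sevens(hand):
    # Two 7s with anything in between (also true for three or four 7s).
    return re.search(r"7.*7", hand) is not None

def has_any_pair(hand):
    # Capture one card, then require the same card again later.
    return re.search(r"(.).*\1", hand) is not None

def has_three_of_a_kind(hand):
    # At least three equal cards; this is also true for a full house.
    return re.search(r"(.).*\1.*\1", hand) is not None

def is_full_house(hand):
    # After sorting, the hand must look like XXXYY or XXYYY.
    # The lookahead (?!...) insists that Y differs from X.
    ordered = "".join(sorted(hand))
    pattern = r"(.)\1\1(?!\1)(.)\2|(.)\3(?!\3)(.)\4\4"
    return re.fullmatch(pattern, ordered) is not None

for hand in ["a7q7k", "77kk7", "akqj0"]:
    print(hand, is_valid_hand(hand), has_pair_of_sevens(hand),
          has_any_pair(hand), has_three_of_a_kind(hand), is_full_house(hand))
```

The backreference \1 carries most of the work here: it lets a pattern say "the same card again" without listing every rank.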
