1 / 17

CSE 303 Concepts and Tools for Software Development

CSE 303 Concepts and Tools for Software Development. Richard C. Davis UW CSE – 10/6/2006 Lecture 5 – Regular Expressions. Administravia. Homework 1 is due!. Tip of the Day. Viewing Multiple buffers in Emacs C-x 2 : Make two buffers visible C-x o : Switch cursor to “other” buffer

aletha
Download Presentation

CSE 303 Concepts and Tools for Software Development

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSE 303Concepts and Tools for Software Development Richard C. DavisUW CSE – 10/6/2006 Lecture 5 – Regular Expressions

  2. Administravia • Homework 1 is due! CSE 303 Lecture 5

  3. Tip of the Day • Viewing Multiple buffers in Emacs • C-x 2 : Make two buffers visible • C-x o : Switch cursor to “other” buffer • Careful! This also switches from mini-buffer! • C-x 1 : Make current buffer the only buffer CSE 303 Lecture 5

  4. Where are we? • Tons of info on the shell • Pseudo-programming language • Files, users, and processes • You know just enough to get around • Coming up • More powerful ideas CSE 303 Lecture 5

  5. Today • Regular Expressions • Find strings matching a pattern • Recurring theme in CS theory • Crisp, elegant, and useful idea • Powerful tool for computer science • Less educated hackers don’t tend to know • Other Utilities • find, diff, wc CSE 303 Lecture 5

  6. Types of Regular Expressions • “Globbing” = filename expansion • “Regular Expressions” = pattern matching • In CSE 322 (finite automata) • In grep • In egrep • Others… CSE 303 Lecture 5

  7. What are Regular Expressions? • Recursive definition: p = … • a, b, c : matches a single character • p1p2:matches ifp1comes just before p2 • (p1|p2):matches eitherp1or p2 • p*:matches 0 or more instances of p • Example • Match abcbdgef against bd(e|g)* CSE 303 Lecture 5

  8. Why This Language? • Amazing Facts • Such patterns are very easy to match • Space/time scale nicely with length of input and regexp • Know space needed before matching! • Can tell if two regexps match same strings • For more of this, see CSE 322 CSE 303 Lecture 5

  9. More Regular Expressions • Syntactic Conveniences • p+=pp*(one or more p’s) • p?=(|p) (zero or one p) • [zd-h] =z | d | e | f | g | h • [^A-Z] = … (any not in set) • . = … (any character) • p{n}= … (n instances of p) • p{n,}= … (n or more p’s) • p{n,m}= … (from n to n+m instances of p) CSE 303 Lecture 5

  10. More Regular Expressions • Other Conveniences • [:alnum:] = Letters and Numbers • [:alpha:] = Letters • [:digit:] = Numbers • [:lower:] = Lowercase letters • [:print:] = Printable characters • [:punct:] = Punctuation Marks • [:space:] = Whitespace CSE 303 Lecture 5

  11. grep finds patterns in files • grep p [file] • Reads each line from stdin (or file) • If match, print line on stdout • Technically, matches against .*p.* • Get rid of 1st.* with ^ • Get rid of 2nd.* with $ • Example: ^p$ matches whole line exactly CSE 303 Lecture 5

  12. Regular Expression Gotchas • Check syntax of each program • grep uses \| \+ \? \{ \} • egrep uses | + ? { } • Use quotes to avoid shell substitutions • Escape special characters with \ • \. and . are very different! • Rules different inside [ ] (\ is literal) CSE 303 Lecture 5

  13. Reusing Previous Matches • Can refer to matches of previous ( ) • \n refers to match of nth( ) • Example: Double-words ^([a-zA-Z]+)\1$ • Very useful when doing substitution • We’ll see this next time CSE 303 Lecture 5

  14. Other Utilities • Put these in your arsenal • find : search a directory tree for anything • Linux Pocket Guide gives nice summary • find . –type f – name “foo.*” – exec grep “regexp” “{}” \; - print • diff : compare two files or directories • wc : count the bytes, words, lines in a file CSE 303 Lecture 5

  15. Summary • Regular Expressions • Powerful and useful idea • Powerful tool for computer science • Less educated hackers don’t tend to know • Other Utilities • find • diff • wc CSE 303 Lecture 5

  16. Reading • Linux Pocket Guide • p71-74 (grep) • p66-68 (find) • p86-88 (diff) • p58 (wc) CSE 303 Lecture 5

  17. Next Time Other scripting tools • sed : For changing text files • awk : More powerful text editing • General-purpose scripting languages • perl • python • ruby CSE 303 Lecture 5

More Related