Appendix A: Regular Expressions • It’s All Greek to Me
Regular Expressions • A pattern that matches a set of one or more strings • May be a simple string, or contain wildcard characters or modifiers • Used by programs such as vim, grep, awk, and sed • Not the same as shell expansion
Components • Characters • Literals • Special Characters • Delimiters • Mark the beginning and end of a regular expression • Usually / • ' (but not really)
Simple Strings • Contain no special characters • Matches only the string • Ex: /foo/ matches: • foo • tomfoolery • bar.foo.com
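The simple-string slide can be tried out with a minimal sketch in Python, assuming Python's `re` module as a stand-in for the vim/grep engines the slides use. The strings `food` and `bar` are illustrative additions, not from the slide.

```python
import re

# /foo/ from the slide; Python patterns are plain strings, so no / delimiters.
pattern = re.compile(r"foo")

candidates = ["foo", "tomfoolery", "bar.foo.com", "food", "bar"]

# re.search succeeds when the pattern occurs anywhere inside the string.
matched = [s for s in candidates if pattern.search(s)]
print(matched)
```

Only `bar` is left out, because it is the only candidate that does not contain the literal text `foo`.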
Special Characters • Can match multiple strings • Represent zero or more characters • Always match the longest possible string (we’ll see examples in a bit)
Periods • Matches any single character • Ex: /.ing/ • I was talking • bling • he called ingred • Ex: /spar.ing/ • sparring • sparking
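As a rough check of the period examples, here is a short Python sketch (again assuming the `re` module behaves like the slide's engine for this pattern). The non-matching strings `ing` and `sparing` are added for contrast and are not from the slide.

```python
import re

dot_ing = re.compile(r".ing")
spar_ing = re.compile(r"spar.ing")

# Show which four characters each sample actually matches.
samples = ["I was talking", "bling", "he called ingred"]
found = {s: dot_ing.search(s).group() for s in samples}
print(found)

# The period must consume exactly one character, so these fail.
bare_ing = dot_ing.search("ing")        # nothing in front of "ing"
sparing = spar_ing.search("sparing")    # nothing between "spar" and "ing"

spar_hits = [s for s in ["sparring", "sparking", "sparing"] if spar_ing.search(s)]
print(spar_hits)
```

In `he called ingred` the period matches the space before `ing`, which is why that line counts as a match.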
Brackets • Define a character class • Match any one character in the class • If a caret (^) is the first character in the class, the class matches any character not listed • Most other special characters lose their special meaning inside a class
Brackets cont’d • Ex. /[jJ]ustin/ matches justin and Justin • Ex. /[A-Za-z]/ matches any single letter • Ex. /[0-9]/ matches any single digit • Ex. /[^a-z]/ matches any single character except a lowercase letter
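The character-class examples can be sketched in Python as follows. This assumes Python's `re` module, whose bracket syntax agrees with the slides for these cases; the sample strings are made up for illustration.

```python
import re

name = re.compile(r"[jJ]ustin")
letter = re.compile(r"[A-Za-z]")
digit = re.compile(r"[0-9]")
not_lower = re.compile(r"[^a-z]")

# A class matches exactly one character from the set.
name_hits = [s for s in ["justin", "Justin", "Dustin"] if name.search(s)]

first_letter = letter.search("42 apples").group()
first_digit = digit.search("room 7b").group()
first_non_lower = not_lower.search("abc-def").group()
all_lower = not_lower.search("abcdef")  # every character is a-z, so no match

# Inside brackets, . and * are ordinary characters.
literal_star = re.search(r"[.*]", "a*b").group()

print(name_hits, first_letter, first_digit, first_non_lower, all_lower, literal_star)
```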
Asterisks • Match zero or more occurrences of the previous character • To match any number of any characters, use /.*/ • Ex. /t.*ing/ • thing • this is really annoying
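A small Python sketch of the asterisk examples, assuming the `re` module as a stand-in for the slide's engine. The string `ting` is an added case showing that "zero occurrences" is allowed.

```python
import re

t_ing = re.compile(r"t.*ing")

# .* stretches as far as it can between the first t and the last ing.
thing = t_ing.search("thing").group()
sentence = t_ing.search("this is really annoying").group()

# Zero characters between t and ing is also acceptable.
ting = t_ing.search("ting").group()

# /.*/ on its own even matches the empty string.
empty_ok = re.fullmatch(r".*", "") is not None

print(thing, "|", sentence, "|", ting, "|", empty_ok)
```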
Plus Signs and Question Marks • Very similar to asterisks; both depend on the previous character • + matches one or more occurrences (not zero) • ? matches zero or one occurrence (no more) • Ex. /2+4?/ matches one or more 2’s followed by either zero or one 4 • 22224 and 2 match in full • 4 does not match; in 244 only the leading 24 matches • Part of the class of extended R.E.
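The /2+4?/ example can be checked with a short Python sketch. It assumes Python's `re` module, which treats + and ? the way extended regular expressions do. `re.fullmatch` is used to ask whether the whole string fits the pattern, and `re.search` to see what a substring match would pick up.

```python
import re

pattern = re.compile(r"2+4?")

# Whole-string checks: does the entire input fit the pattern?
whole = {s: pattern.fullmatch(s) is not None for s in ["22224", "2", "4", "244"]}
print(whole)

# Substring checks: what would a search-based tool such as grep latch onto?
no_two = pattern.search("4")              # there is no 2 to start from
partial = pattern.search("244").group()   # stops after the first 4

print(no_two, partial)
```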
Carets & Dollar Signs • If a regular expression starts with a ^, the string must be at the beginning of a line • If a regular expression ends with a $, the string must be at the end of a line • ^ and $ are referred to as anchors • Ex. /^T.*T$/ matches any line that starts and ends with T
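The anchor example can be sketched in Python, assuming `re` as a stand-in and testing one line at a time, as grep would. The sample lines are invented for illustration.

```python
import re

starts_and_ends_with_t = re.compile(r"^T.*T$")

lines = ["TEST", "Tea time", "THE CAT", "A TEST", "T"]

# ^ pins the match to the start of the line and $ to its end.
anchored = [line for line in lines if starts_and_ends_with_t.search(line)]
print(anchored)

# Without anchors, the same body matches anywhere inside the line.
unanchored = [line for line in lines if re.search(r"T.*T", line)]
print(unanchored)
```

A line consisting of a single `T` is rejected in both cases, because the pattern needs two separate T characters.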
Quoting Special Characters • If you want to use a special character literally, put a backslash in front of it • Ex. /and\/or/ matches and/or • Ex. /\\/ matches \ • Ex. /\**/ matches any number of asterisks
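The quoting examples translate to Python roughly as below. This is a sketch assuming the `re` module; Python has no / delimiters, so escaping the slash is harmless but not required there. Raw strings (`r"..."`) keep Python itself from interpreting the backslashes.

```python
import re

# /and\/or/ : the escaped slash is a literal slash.
and_or = re.search(r"and\/or", "yes and/or no").group()

# /\\/ : an escaped backslash matches one literal backslash.
backslash = re.search(r"\\", r"C:\temp").group()

# /\**/ : an escaped asterisk followed by a real *, so zero or more asterisks.
stars = re.search(r"\**", "***abc").group()
no_stars = re.search(r"\**", "abc").group()  # matches, but with zero asterisks

print(and_or, backslash, repr(stars), repr(no_stars))
```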
Longest Match • Regular expressions match the longest string possible in a line • Ex. I (Justin) like coffee (lots). • /(.*)/ • Matches (Justin) like coffee (lots) • /([^)]*)/ • Matches (Justin)
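The longest-match behavior can be reproduced with a brief Python sketch. One assumption to note: in Python's `re` syntax, bare parentheses create groups, so the literal parentheses from the slide are written with backslashes here.

```python
import re

text = "I (Justin) like coffee (lots)."

# .* is greedy: it runs to the last closing parenthesis on the line.
longest = re.search(r"\(.*\)", text).group()

# [^)]* cannot cross a closing parenthesis, so the match ends at the first one.
shortest = re.search(r"\([^)]*\)", text).group()

print(longest)
print(shortest)
```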
Boolean OR • You can match either of two distinct strings using OR (the pipe) • Ex. /CAT|DOG/ • Matches either CAT or DOG • Simpler expressions can be written using just a character class • E.g. /a[bc]/ instead of /ab|ac/ • Also part of extended R.E.
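A minimal Python sketch of the OR examples, assuming the `re` module (which supports the pipe without any extra flag). The sample strings are illustrative.

```python
import re

# The pipe picks whichever alternative appears; the match is case-sensitive.
animals = re.findall(r"CAT|DOG", "CAT chases DOG, not cat")
print(animals)

# /a[bc]/ and /ab|ac/ accept exactly the same strings.
samples = ["ab", "ac", "ad", "cab"]
with_class = [s for s in samples if re.search(r"a[bc]", s)]
with_pipe = [s for s in samples if re.search(r"ab|ac", s)]
print(with_class, with_pipe)
```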
Grouping • You can apply special characters to groups of characters in parentheses • Also called bracketing • A group matches the same strings as the unbracketed expression • But modifiers can then apply to the whole group • Ex. /\(duck\)*|\(goose\)/
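To see what grouping changes, here is a hedged Python sketch. It assumes Python's `re` syntax, where groups are written with plain parentheses rather than the `\(` and `\)` shown on the slide. `re.fullmatch` is used so that each test string must fit the pattern entirely.

```python
import re

def fits(pattern, text):
    """Return True when the whole text fits the pattern (illustrative helper)."""
    return re.fullmatch(pattern, text) is not None

# Without a group, * applies only to the final k.
ungrouped = {s: fits(r"duck*", s) for s in ["duckkk", "duckduck"]}

# With a group, * applies to the whole word duck.
grouped = {s: fits(r"(duck)*", s) for s in ["duckkk", "duckduck"]}

# The slide's example: any number of ducks, or a single goose.
slide = {s: fits(r"(duck)*|(goose)", s) for s in ["duckduck", "goose", "duckgoose"]}

print(ungrouped, grouped, slide)
```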
Using with vim • Use regular expressions for searching and substituting • Searching: • /string (forward) or ?string (backward) • Substituting: • :[address]s/string/replace[/g] • [address]: the lines to operate on; % covers every line • string can be an R.E.; replace can use & and \n to refer to the matched text • /g: global; replace all occurrences in the line, not just the first
Using with vim cont’d • [address] • n: line number • n[+/-]x: line n offset by x lines after (+) or before (-) • n1,n2 : from line n1 to n2 • . : alias for current line • $ : alias for last line in work buffer • % : alias for entire work buffer
vim examples • /^if( • /end\.$ • :%s/[Jj]ustin/Mr\. Awesome/g
Using with vim cont’d • Ampersand (&) • Alias for matched string when substituting • Ex: s/[A-Z][0-9]/_&_/ • Quoted digit (\n) • Refers to the nth quoted part, \( ... \), of the R.E. • Can be used to rearrange columns • Ex: s/\([^,]*\), \(.*\)/\2 \1/
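The substitution ideas above can be approximated outside vim with Python's `re.sub`. This is a sketch under stated assumptions, not vim's own behavior: Python writes groups as plain parentheses, spells the whole match `\g<0>` where vim uses `&`, and replaces every occurrence unless `count` is given. The sample strings are invented.

```python
import re

# Like :%s/[Jj]ustin/Mr. Awesome/g
renamed = re.sub(r"[Jj]ustin", "Mr. Awesome", "justin met Justin")

# Like s/[A-Z][0-9]/_&_/g : wrap each match in underscores.
wrapped = re.sub(r"[A-Z][0-9]", r"_\g<0>_", "seat B7 row C3")

# Like s/\([^,]*\), \(.*\)/\2 \1/ : swap the two comma-separated columns.
swapped = re.sub(r"([^,]*), (.*)", r"\2 \1", "Smith, John")

# count=1 mimics leaving off /g: only the first occurrence changes.
first_only = re.sub(r"o", "0", "foo boo", count=1)
every_one = re.sub(r"o", "0", "foo boo")

print(renamed, wrapped, swapped, first_only, every_one, sep="\n")
```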
Using with grep • To take advantage of extended regular expressions, use egrep or grep -E instead • Use single quotes as delimiters so the shell does not expand the pattern • Ex: • egrep '^T.*T$' myfile • Lists all lines in myfile that begin and end with T
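For readers who want to see what the egrep example does step by step, here is a rough Python imitation. It is only a sketch: `grep_file` is a hypothetical helper written for this illustration, the file contents are made up, and Python's `re` syntax is close to, but not identical with, grep's extended regular expressions.

```python
import os
import re
import tempfile

def grep_file(pattern, path):
    """Return the lines of the file that contain a match (hypothetical helper)."""
    rx = re.compile(pattern)
    hits = []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.rstrip("\n")  # test the line without its newline
            if rx.search(line):
                hits.append(line)
    return hits

# Build a throwaway file standing in for myfile.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "myfile")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("TEST\nToast\nTiT for TaT\nnot this\n")
    hits = grep_file(r"^T.*T$", path)

print(hits)
```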