Learn the fundamentals of regular expressions with grep command usage, various operators, extended expressions, exercises, and references for efficient string processing in Linux.
NCNU Linux User Group 2010 <Regular Expression> 王惟綸(Wei-Lun Wang) 2010/07/07
Outline • What’s a Regular Expression? • The Purpose • What’s grep? • Various Operators • Extended Regular Expressions • Exercises • References
What’s a Regular Expression? • A regular expression is a pattern that describes a set of strings. • Examples: X[2-7] = {X2, X3, X4, X5, X6, X7}; T[ae]ste? = {Taste, Tast, Teste, Test}
The Purpose • Regular expressions are used to process strings. With the aid of special characters, they make searching, replacing, and deleting text easy. • T[ae]ste? = {Taste, Tast, Teste, Test}: all four strings, Taste, Tast, Teste, and Test, can be found by searching for the single pattern “T[ae]ste?”.
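As a minimal sketch of the idea, the pattern can be tried against a handful of words. The word list and the variable names `words` and `matches` are made up for illustration; `-E` turns on the extended syntax so that `?` means "zero or one", and `-x` requires the whole line to match.

```shell
# Candidate words, one per line (sample data invented for this sketch).
words=$(printf '%s\n' Taste Tast Teste Test Toast Tests)

# -E: extended syntax, so "?" makes the preceding "e" optional.
# -x: the pattern must match the entire line, not just part of it.
matches=$(printf '%s\n' "$words" | grep -Ex 'T[ae]ste?')

printf '%s\n' "$matches"
# Prints:
# Taste
# Tast
# Teste
# Test
```

Toast is rejected because "o" is not in [ae], and Tests is rejected because `-x` leaves no room for the trailing "s".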
What’s grep? • The name comes from “global regular expression print”. • The grep command searches its input for the pattern specified by the Pattern parameter and writes each matching line to standard output. • -i : ignore the distinction between upper and lower case • -v : invert the match, printing the lines that do not match
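The two options can be compared side by side in a short sketch. The three input lines and the variable names are invented for this example.

```shell
# Sample input (invented for this sketch).
lines=$(printf '%s\n' 'Linux User Group' 'linux kernel' 'BSD')

# Default behaviour: the match is case-sensitive.
plain=$(printf '%s\n' "$lines" | grep 'linux')

# -i: upper and lower case are treated as the same.
nocase=$(printf '%s\n' "$lines" | grep -i 'linux')

# -v (combined with -i): print only the lines that do NOT match.
inverted=$(printf '%s\n' "$lines" | grep -i -v 'linux')

printf 'plain:    %s\n' "$plain"      # plain:    linux kernel
printf 'inverted: %s\n' "$inverted"   # inverted: BSD
```

Here `nocase` holds both "Linux User Group" and "linux kernel".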
Various Operators • [ ] matches any one of the characters listed inside the brackets. • [ - ] matches any one character within the given range of character codes. • [^ ] matches any one character that is not in the list. • ^ Matches the empty string at the beginning of a line. • $ Matches the empty string at the end of a line. • . Matches any single character. • * The preceding item will be matched zero or more times.
1. [ ] matches any one of the characters listed inside the brackets. th[ei] = {the, thi}
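2. [ - ] matches any one character within the given range. The range follows the character order of the current locale: LANG=C : 0 1 2 3 4 ... A B C D ... Z a b c d ... z; LANG=zh_TW.Big5 : 0 1 2 3 4 ... a A b B c C d D ... z Z. With the second ordering, a range such as [a-z] also covers the uppercase letters that fall between a and z.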
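A small sketch of range matching under the C locale, which is forced here with `LC_ALL=C` so the result does not depend on the machine's settings. The sample characters and variable names are invented; the `[[:lower:]]` character class is not covered in these slides and is shown only as one locale-independent alternative defined by POSIX.

```shell
# One character per line (sample data invented for this sketch).
sample=$(printf '%s\n' a B c D 5)

# In the C locale the order is 0-9, A-Z, a-z, so [a-z] is lowercase only.
lower=$(printf '%s\n' "$sample" | LC_ALL=C grep -x '[a-z]')

# Two ranges in one bracket expression: any ASCII letter.
letters=$(printf '%s\n' "$sample" | LC_ALL=C grep -x '[A-Za-z]')

# POSIX character class: lowercase letters, without relying on range order.
lower_class=$(printf '%s\n' "$sample" | LC_ALL=C grep -x '[[:lower:]]')

printf '%s\n' "$lower"
# Prints:
# a
# c
```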
3. [^ ] matches any one character that is not in the list. th[^ei] matches tha and tho, but not the or thi.
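The negated list and the operators ^, $, and . from the list above can be sketched together. The sample lines and variable names are invented, and `LC_ALL=C` is set so that [a-z] means lowercase ASCII only.

```shell
# Sample lines (invented for this sketch).
text=$(printf '%s\n' 'good' 'Food' '9 lives' 'end.')

# ^ anchors at line start; [^a-z] is one character that is NOT a-z.
not_lower_start=$(printf '%s\n' "$text" | LC_ALL=C grep '^[^a-z]')

# \. is a literal dot; $ anchors at line end.
ends_with_dot=$(printf '%s\n' "$text" | LC_ALL=C grep '\.$')

# An unescaped . stands for any single character.
any_ood=$(printf '%s\n' "$text" | LC_ALL=C grep '^.ood$')

printf '%s\n' "$not_lower_start"
# Prints:
# Food
# 9 lives
```

Note the two meanings of the caret: outside brackets it anchors the match to the start of the line, while as the first character inside brackets it negates the list.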
7. * The preceding item will be matched zero or more times. go* = {g, go, goo, gooo, …} goo* = {go, goo, gooo, …}
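The two sets above can be checked with a quick sketch. The input words and variable names are invented; `-x` forces a whole-line match so that each line is either in the set or not.

```shell
# Sample words (invented for this sketch).
input=$(printf '%s\n' g go goo gooo gad)

# go*  : "g" followed by zero or more "o".
zero_or_more=$(printf '%s\n' "$input" | grep -x 'go*')

# goo* : "go" followed by zero or more "o", so a bare "g" no longer matches.
one_then_more=$(printf '%s\n' "$input" | grep -x 'goo*')

printf '%s\n' "$one_then_more"
# Prints:
# go
# goo
# gooo
```

Without `-x`, the pattern go* would also select "gad", because the leading "g" alone already satisfies "zero or more o".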
Extended Regular Expressions • In basic regular expressions the metacharacters "?", "+", "{", "|", "(", and ")" lose their special meaning; instead use the backslashed versions "\?", "\+", "\{", "\|", "\(", and "\)". • To use the extended syntax without backslashes, run grep -E or egrep instead of grep. • + The preceding item will be matched one or more times. • ? The preceding item will be matched zero or one time. • | matches either the preceding item or the following item. • ( ) groups items so that an operator applies to the whole group. • {N} The preceding item is matched exactly N times. • {N,} The preceding item is matched N or more times. • {N,M} The preceding item is matched at least N times, but not more than M times.
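The difference between the basic and extended syntax can be sketched with the + operator. This assumes GNU grep, where the backslashed `\+` is supported in basic mode; the input and variable names are invented.

```shell
# Two sample lines: a literal "go+" and the word "goo" (invented data).
input=$(printf '%s\n' 'go+' 'goo')

# Basic syntax: an unescaped + is an ordinary character.
bre_literal=$(printf '%s\n' "$input" | grep -x 'go+')

# Basic syntax with a backslash: \+ means "one or more" (GNU grep assumed).
bre_escaped=$(printf '%s\n' "$input" | grep -x 'go\+')

# Extended syntax (-E): a plain + means "one or more".
ere=$(printf '%s\n' "$input" | grep -Ex 'go+')

printf 'basic, plain +: %s\n' "$bre_literal"   # basic, plain +: go+
printf 'extended +:     %s\n' "$ere"           # extended +:     goo
```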
1. + The preceding item will be matched one or more times. goo+ = {goo, gooo, goooo, …}
2. ? The preceding item will be matched zero or one time. goog? = {goog, goo}
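A short sketch of + and ? using `grep -E`. The word list and variable names are invented, and `-x` again requires the whole line to match.

```shell
# Sample words (invented for this sketch).
input=$(printf '%s\n' go goo gooo goog googg)

# goo+ : "go" followed by one or more "o".
plus=$(printf '%s\n' "$input" | grep -Ex 'goo+')

# goog? : "goo" followed by an optional single "g".
optional=$(printf '%s\n' "$input" | grep -Ex 'goog?')

printf '%s\n' "$plus"
# Prints:
# goo
# gooo
```

The variable `optional` ends up holding "goo" and "goog"; "googg" is rejected because ? allows at most one "g".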
3. | matches either the preceding item or the following item. goo|fav = {goo, fav}
4. ( ) groups items so that an operator applies to the whole group. f(oo|ee)d = {food, feed}
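Alternation and grouping can be sketched as follows, with invented input words and variable names.

```shell
# Sample words (invented for this sketch).
input=$(printf '%s\n' food feed fed fooeed goo fav)

# The parentheses limit the alternation to the middle of the word.
grouped=$(printf '%s\n' "$input" | grep -Ex 'f(oo|ee)d')

# Without parentheses, | separates the two complete alternatives.
either=$(printf '%s\n' "$input" | grep -Ex 'goo|fav')

printf '%s\n' "$grouped"
# Prints:
# food
# feed
```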
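5. {N} The preceding item is matched exactly N times. go\{2\} = {goo} go\{5\} = {gooooo} (The backslashed braces are the basic-syntax form for plain grep; with grep -E write go{2} and go{5}.)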
7. {N,M} The preceding item is matched at least N times, but not more than M times. go\{2,5\}g = {goog, gooog, goooog, gooooog} (With grep -E, write go{2,5}g.)
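The repetition counts can be sketched in both syntaxes. The input words (one to six "o"s between two "g"s) and the variable names are invented for illustration.

```shell
# Words with 1 to 6 "o" characters between the two "g"s (invented data).
input=$(printf '%s\n' gog goog gooog goooog gooooog goooooog)

# Basic syntax: the braces must be backslashed.
bre_range=$(printf '%s\n' "$input" | grep -x 'go\{2,5\}g')

# Extended syntax (-E): plain braces.
ere_range=$(printf '%s\n' "$input" | grep -Ex 'go{2,5}g')

# {N}: exactly N times.
exactly_two=$(printf '%s\n' "$input" | grep -Ex 'go{2}g')

# {N,}: N or more times.
five_or_more=$(printf '%s\n' "$input" | grep -Ex 'go{5,}g')

printf '%s\n' "$ere_range"
# Prints:
# goog
# gooog
# goooog
# gooooog
```

Both `bre_range` and `ere_range` hold the same four words, `exactly_two` holds only "goog", and `five_or_more` holds "gooooog" and "goooooog".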
Exercises • What does grep -n '^[^A-z] ' mean? • How do you find empty lines? • How do you find the string “[LUG2010]”? • Find all files under /etc that contain the symbol “*”, and show the matching contents.
References • http://linux.vbird.org/linux_basic/0330regularex.php • http://tldp.org/LDP/Bash-Beginners-Guide/html/chap_04.html • http://en.wikipedia.org/wiki/Regular_expression • http://www.regular-expressions.info/posix.html