CSC 352 - Unix Programming, Spring 2011. April 6, 2011, Week 11: a useful subset of regular expressions, grep and sed, parts of Chapter 11.
Motivation
• In assignment 4 you will write a shell script to inspect all regular files in and below a directory (including subdirectories) for a string pattern.
• To get a list of files containing a pattern, you can use the find command to find all regular files, and grep to search for the pattern.
• Your script will then iterate through this list of files and, one at a time, use sed to replace all occurrences of the pattern with a new string.
Finding files with pattern
• Start out with a manual find command.
    • find JavaLang -type f -print # looks for all regular files
    • find JavaLang -type f -name "*.java" -print # restricts the search by name
• Use grep with the above in back ticks to supply the file list.
    • grep interface `find JavaLang -type f -print`
    • grep -l interface `find JavaLang -type f -print`
    • grep -l interface `find JavaLang -type f -print 2>/dev/null`
• Iterate through the files in a for loop.
    • all=`find JavaLang -type f -print 2>/dev/null`
    • matches=`grep -l interface $all 2>/dev/null`
    • for file in $matches; do echo "FOUND FILE $file"; done
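The three steps above can be tried end to end with a small sketch. The temporary directory and the three sample files are made up for illustration; only the find, grep and for-loop lines come from the slide. It assumes that no path contains spaces, as the slide does.

```shell
#!/bin/sh
# Hypothetical demo tree; the file names below are invented for this sketch.
demo=$(mktemp -d)
mkdir -p "$demo/JavaLang/sub"
printf 'public interface Shape {}\n' > "$demo/JavaLang/Shape.java"
printf 'public class Circle {}\n'    > "$demo/JavaLang/sub/Circle.java"
printf 'interface notes\n'           > "$demo/JavaLang/sub/notes.txt"

# Step 1: list every regular file at or below JavaLang.
all=$(find "$demo/JavaLang" -type f -print 2>/dev/null)

# Step 2: keep only the names of files that contain the pattern.
# $all is deliberately unquoted so the shell splits it into one word per file.
matches=$(grep -l interface $all 2>/dev/null)

# Step 3: visit each matching file in turn.
count=0
for file in $matches
do
    echo "FOUND FILE $file"
    count=$((count + 1))
done
echo "$count files matched"

rm -rf "$demo"
```

The `$(...)` form used here behaves like the back ticks on the slide and is easier to nest.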
File names with spaces
• File names that contain spaces create problems for the above approach, because the shell splits the file list at every space.
• It is necessary to use the "-print0" option of find instead of "-print", and to pipe the stdout of find to xargs -0, in order to package space-containing file names up for downstream commands such as grep.
• We will not use directories and file names that contain spaces in project 4.
• Avoid spaces in directory and file names when setting up Unix source code and similar repositories.
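A minimal sketch of the -print0 and xargs -0 combination follows. The directory and file names are hypothetical, and the sketch assumes a find and xargs that support these two options (the GNU and BSD versions do).

```shell
#!/bin/sh
# Hypothetical tree whose directory and file names contain spaces.
demo=$(mktemp -d)
mkdir -p "$demo/My Docs"
printf 'interface A {}\n' > "$demo/My Docs/First File.java"
printf 'class B {}\n'     > "$demo/My Docs/Second File.java"

# -print0 ends each file name with a NUL byte instead of a newline, and
# xargs -0 splits its input on NUL only, so embedded spaces stay intact.
found=$(find "$demo" -type f -print0 | xargs -0 grep -l interface)
echo "$found"

rm -rf "$demo"
```

Only one file matches here, so `$found` holds a single path. A list of several such paths would still need care in a later for loop, which is one reason to avoid spaces in file names altogether.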
Using sed for substitution
• Replace the for loop in the previous example with this.
    • for file in $matches
    • do
    • sed -e "s/interface/thingy/g" "$file" > junk.tmp.txt
    • mv junk.tmp.txt "$file"
    • done
• Sed can substitute a string ("thingy" in this example) for a regular expression pattern.
• The "/g" portion of sed's substitute command "s/" says, "Do it globally, throughout each line."
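The loop above can be exercised on throwaway files as follows. This is a sketch under stated assumptions: the two sample files are invented, and the temporary file is named after each input file rather than junk.tmp.txt, which is a choice made here and not part of the slide.

```shell
#!/bin/sh
# Hypothetical sample files for this sketch.
demo=$(mktemp -d)
printf 'interface one\nan interface and another interface\n' > "$demo/a.txt"
printf 'no match here\n' > "$demo/b.txt"

matches=$(grep -l interface "$demo"/*.txt)

for file in $matches
do
    tmp="$file.tmp"
    # Write to a temporary file first. Redirecting straight back to "$file"
    # would truncate it before sed had read it.
    sed -e 's/interface/thingy/g' "$file" > "$tmp" && mv "$tmp" "$file"
done

result=$(cat "$demo/a.txt")
untouched=$(cat "$demo/b.txt")
echo "$result"

rm -rf "$demo"
```

The `&&` runs mv only if sed succeeds, so a failed substitution does not overwrite the original file.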
Grep command line (p. 295)
• Grep searches for a regular expression in stdin or in a list of files given on the command line.
• A regular expression is a string expression that describes a set of strings.
• Grep options include the following.
    • -i makes the match case insensitive.
    • -v shows only non-matching lines (filters out matches).
    • -l (ell) lists only the names of files that contain a match, each name once.
    • -n displays line numbers along with lines.
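The four options can be compared side by side on a tiny input. The file names and contents below are hypothetical and exist only for this sketch.

```shell
#!/bin/sh
# Hypothetical sample files.
demo=$(mktemp -d)
printf 'Alpha\nbeta\nALPHA beta\n' > "$demo/one.txt"
printf 'gamma\n' > "$demo/two.txt"

# -i: matches "Alpha" and "ALPHA beta" regardless of case.
insensitive=$(grep -i alpha "$demo/one.txt")

# -v: keeps only the lines that do NOT contain "beta".
inverted=$(grep -v beta "$demo/one.txt")

# -l: prints each matching file name once, not the matching lines.
names=$(cd "$demo" && grep -l beta one.txt two.txt)

# -n: prefixes each matching line with its line number and a colon.
numbered=$(grep -n beta "$demo/one.txt")

printf '%s\n' "$insensitive" "$inverted" "$names" "$numbered"
rm -rf "$demo"
```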
A few important patterns (p. 299)
pattern    meaning
*          0 or more occurrences of the previous character
.          any single character
[pqr]      any single character from the set p, q or r
[a-zA-Z]   any single character from the range a-z or A-Z
[^pqr]     any single character *not* from the set p, q or r
^          the start of the string being searched for a match
$          the end of the string being searched for a match
\          escapes the next character so it is treated as a regular character
These are the most useful regular expression patterns available to grep, sed, and for pattern-based searching in emacs and vi. The shell uses so-called "glob-style matching" for strings (*.java), which differs from the regular expressions (.*\.java) used by grep, sed, emacs and vi.
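Each pattern in the table can be checked against a short word list with grep. The word list and the helper function `match` are hypothetical names introduced for this sketch. `LC_ALL=C` is set as a precaution so that ranges such as [a-z] follow plain ASCII order.

```shell
#!/bin/sh
LC_ALL=C
export LC_ALL

# Hypothetical word list, one word per line.
words=$(printf 'cat\ncot\ncoat\nct\nCat9\na.java\nxjava\n')

# match PATTERN: print the words that contain a match for PATTERN.
match() {
    printf '%s\n' "$words" | grep -e "$1"
}

dot=$(match 'c.t')          # . is exactly one character between c and t
star=$(match 'co*t')        # o* is zero or more o's
set_=$(match 'c[ao]t')      # [ao] is one character, either a or o
upper=$(match '^[A-Z]')     # ^ anchors the match at the start
notlower=$(match '[^a-z]$') # $ anchors at the end; [^a-z] is any non-lowercase
anyjava=$(match '.java$')   # unescaped . also matches the x in xjava
dotjava=$(match '\.java$')  # \. matches only a literal period

printf '%s\n' "$dot" "$star" "$set_" "$upper" "$notlower" "$anyjava" "$dotjava"
```

The last two patterns show why the escape matters: `.java$` accepts both a.java and xjava, while `\.java$` accepts only a.java.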
Intro to sed
• Sed is based on early command-line editors in Unix (the ed editor), applied to a stream of lines as a filter.
• All we will cover this semester is using sed to replace occurrences of a pattern in a file with a string.
• sed -e 's/this/that/' # substitutes string "that" for pattern "this", only once per line, where "this" may contain patterns from the previous slide. Use single quotes in the sed command line if you don't want the shell to expand shell meta-characters such as $.
• sed -e 's/this/that/g' # global substitute for all occurrences of pattern "this" in each line.
• Sed can read stdin or a file, and writes to stdout.
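The difference between the two substitute forms, and the effect of single quotes, can be seen on one line of input. The sample text and variable names are made up for this sketch.

```shell
#!/bin/sh
# Hypothetical input line; single quotes keep the shell from expanding $5.
line='this and this cost $5'

# Without /g, only the first match on each line is replaced.
once=$(printf '%s\n' "$line" | sed -e 's/this/that/')

# With /g, every match on each line is replaced.
all=$(printf '%s\n' "$line" | sed -e 's/this/that/g')

# Inside single quotes the backslash and $ reach sed unchanged,
# so \$ matches a literal dollar sign.
dollars=$(printf '%s\n' "$line" | sed -e 's/\$5/five dollars/')

printf '%s\n' "$once" "$all" "$dollars"
```

Here sed reads stdin from the pipe. Naming a file after the sed command makes it read that file instead, and in both cases the result goes to stdout.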