Advanced UNIX
240-491 Special Topics in Comp. Eng. 1
Semester 2, 2000-2001
• Objectives
  • to discuss five useful filters: tr, grep, awk, sed, and find
4. Filters (Part II, Sobell)
1. tr
• format: tr [options] string1 [string2]
• tr reads its standard input and translates each character in string1 to the corresponding character in string2
Examples
$ echo 12abc3def4 | tr 'abcdef' 'xyzabc'
12xyz3abc4
$ echo 12abc3def4 | tr '[a-c][d-f]' '[x-z][a-c]'
12xyz3abc4
$ cat foo.txt | tr '[A-Z]' '[a-z]'
$ tr '\015' ' ' < file1 > file2
• \015 is carriage return
$ cat mail.txt | tr -s '\011' ' ' > new-mail.txt
• \011 is tab; a literal tab character could be typed instead
• -s means squeeze repeated characters of string2 into a single one in the output
$ echo Can you read this? | tr -d 'aeiou'
Cn y rd ths?
• -d means delete the characters in string1
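The -s and -d options above can be tried without any input files. This is a minimal sketch; the sample strings and the variable names `squeezed` and `stripped` are made up for illustration.

```shell
# Turn tabs into spaces, then squeeze runs of spaces down to one.
squeezed=$(printf 'a\t\tb   c\n' | tr -s '\011' ' ')
echo "$squeezed"

# Delete carriage returns outright instead of replacing them with spaces.
stripped=$(printf 'line one\r\n' | tr -d '\015')
echo "$stripped"
```

Deleting \015 with -d is often preferable to translating it to a space, since it leaves no trailing blank on each line.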
"rot13" Text
• Popular in the 1970s and 1980s.
$ echo Gur chapuyvar bs gur wbxr vf ... | tr '[N-Z][A-M][n-z][a-m]' '[A-M][N-Z][a-m][n-z]'
The punchline of the joke is ...
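The same rot13 mapping can also be written with plain ranges and no brackets. The sketch below assumes a tr that accepts ranges such as A-Z directly (true of POSIX tr in the C locale); the variable names are illustrative only.

```shell
# rot13 with plain character ranges.
# Applying the same translation twice restores the original text.
decoded=$(echo 'Gur chapuyvar bs gur wbxr' | tr 'A-Za-z' 'N-ZA-Mn-za-m')
echo "$decoded"
encoded=$(echo "$decoded" | tr 'A-Za-z' 'N-ZA-Mn-za-m')
echo "$encoded"
```

Because rot13 shifts by half the alphabet, one command serves as both encoder and decoder.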
2. grep
• Format: grep [options] pattern [file-list]
• Search one or more files, line by line, for a pattern (a regular expression). Actions taken depend on options.
Variants of grep
• grep: uses basic RE patterns
• fgrep: fast grep; the pattern can only be an ordinary string
• egrep: extended grep; the pattern can use full REs
grep options
• -c print a count of matching lines
• -i ignore case in pattern during search
• -l list filenames with match
• -n precede each matching line by a line number
• -v print lines that do not match pattern
Examples
File testa   File testb   File testc
aaabb        aaaaa        AAAAA
bbbcc        bbbbb        BBBBB
ff-ff        ccccc        CCCCC
cccdd        ddddd        DDDDD
dddaa
$ grep bb testa
aaabb
bbbcc
$ grep -v bb testa
ff-ff
cccdd
dddaa
$ grep -n bb testa
1:aaabb
2:bbbcc
$ grep bb *
testa:aaabb
testa:bbbcc
testb:bbbbb
$ grep -i bb *
testa:aaabb
testa:bbbcc
testb:bbbbb
testc:BBBBB
$ grep -i BB *
testa:aaabb
testa:bbbcc
testb:bbbbb
testc:BBBBB
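The -c and -l options from the option list are not exercised in the examples above. The following sketch recreates the three sample files in a scratch directory (created with mktemp -d, an assumption about the local system) and combines the options; the variable names are illustrative.

```shell
# Recreate the three sample files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' aaabb bbbcc ff-ff cccdd dddaa > "$dir/testa"
printf '%s\n' aaaaa bbbbb ccccc ddddd > "$dir/testb"
printf '%s\n' AAAAA BBBBB CCCCC DDDDD > "$dir/testc"

# -c: count matching lines rather than printing them.
count=$(grep -c bb "$dir/testa")
echo "$count"

# -i with -l: names of files containing a case-insensitive match.
files=$(cd "$dir" && grep -il bb test?)
echo "$files"

rm -rf "$dir"
```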
Fancier Patterns
$ grep 'fun..ion' file
$ grep -n '^#define' file
$ grep '^#de[a-z]*' file
$ egrep 'while|if' *.c
$ egrep '[0-9]+' *.c
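On current systems the egrep and fgrep behaviour is also available as grep -E and grep -F, which are the spellings POSIX specifies. A small sketch on made-up input (the C-like lines and variable names are purely illustrative):

```shell
# A few invented source lines to search.
src=$(printf '%s\n' 'while (x < 10)' 'y = 42;' 'if (y) z++;' 'return;')

# grep -E: extended REs, equivalent to egrep 'while|if'.
keywords=$(printf '%s\n' "$src" | grep -E 'while|if')
echo "$keywords"

# Count the lines containing at least one digit.
numbers=$(printf '%s\n' "$src" | grep -cE '[0-9]+')
echo "$numbers"

# grep -F: the pattern is an ordinary string, so '.' matches only a dot.
literal=$(printf '%s\n' 'a.c' 'abc' | grep -F 'a.c')
echo "$literal"
```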
3. awk
• format:
    awk program file-list
    awk -f program-file file-list
• awk is a pattern scanning and action processing language
• The action language is very like C.
Overview
3.1. Patterns & Actions
3.2. awk Processing Cycle
3.3. How awk Sees a Line
3.4. Pattern Expressions
3.5. ',' Range Operator
3.6. Many Built-in Functions
3.7. BEGIN and END
3.8. First awk Program File: pr_header
3.9. Action Language
3.10. Associative Arrays
3.1. Patterns & Actions
• An awk program consists of:
    pattern {action}
    pattern {action}
    :
3.2. awk Processing Cycle
1. Read next input line.
2. Apply all awk patterns sequentially.
3. If a pattern matches, do its action.
4. Go to step (1).
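The cycle above can be observed with a two-rule program. In this sketch the three input lines are made up (the third is chosen to match both rules), and `out` is just an illustrative variable name.

```shell
# Every pattern is tried against every input line, in program order,
# so a line that matches two patterns triggers both actions.
out=$(printf '%s\n' 'chevy nova' 'ford ltd' 'chevy ford' |
      awk '/chevy/ {print NR ": chevy rule"}
           /ford/  {print NR ": ford rule"}')
echo "$out"
```

Line 3 is reported twice, once per matching rule, before awk moves on to the next input line.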
Example
$ cat cars
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 10000
volvo gl 78 102 9850
ford ltd 83 15 10500
chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
ford thundbd 84 10 17000
toyota tercel 82 180 750
chevy impala 65 85 1550
ford bronco 83 25 9500
$ awk '/chevy/ {print}' cars
chevy nova 79 60 3000
chevy nova 80 50 3500
chevy impala 65 85 1550
$ awk '/chevy/' cars
chevy nova 79 60 3000
chevy nova 80 50 3500
chevy impala 65 85 1550
$ awk '/^h/' cars
honda accord 81 30 6000
3.3. How awk Sees a Line
• awk views each line as a record consisting of fields separated by spaces.
• Each field is referred to by a variable called $<number>:
  • $1, $2, $3, etc.
  • $0 refers to the whole line (record)
• The current line number is stored in NR
$ awk '{print $3, $1}' cars
77 plym
79 chevy
65 ford
:
83 ford
$ awk '/chevy/ {print $3, $1}' cars
79 chevy
80 chevy
65 chevy
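A quick way to see the record and field variables together is to print them for a single line. This sketch uses one line of the cars file; NF (the number of fields in the current record) is a standard awk variable that the slides do not mention, and `summary` is an illustrative name.

```shell
# One record from the cars file.
line='ford mustang 65 45 10000'

# NR: record number, NF: number of fields, $1: first field, $NF: last field.
summary=$(echo "$line" | awk '{print NR, NF, $1, $NF}')
echo "$summary"
```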
3.4. Pattern Expressions
• Format: variable OP pattern
• OP forms:
  • matching: ~ !~
  • arithmetic comparison: < <= == != >= >
  • boolean: && || !
$ awk '$1 ~ /h/' cars
chevy nova 79 60 3000
chevy nova 80 50 3500
honda accord 81 30 6000
chevy impala 65 85 1550
$ awk '$1 ~ /^h/' cars
honda accord 81 30 6000
$ awk '$2 ~ /^[tm]/ {print $3, $2, "$" $5}' cars
65 mustang $10000
84 thundbd $17000
82 tercel $750
$ awk '$3 ~ /5$/ {print $3, $1, "$" $5}' cars
65 ford $10000
65 fiat $450
65 chevy $1550
$ awk '$3 == 65' cars
ford mustang 65 45 10000
fiat 600 65 115 450
chevy impala 65 85 1550
$ awk '$5 <= 3000' cars
plym fury 77 73 2500
chevy nova 79 60 3000
fiat 600 65 115 450
toyota tercel 82 180 750
chevy impala 65 85 1550
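$ awk '$5 >= "2000" && $5 < "9000"' cars
plym fury 77 73 2500
chevy nova 79 60 3000
chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
toyota tercel 82 180 750
• quoted constants force a string comparison, so 450 and 750 match (they sort after "2000" character by character)
$ awk '$5 >= 2000 && $5 < 9000' cars
plym fury 77 73 2500
chevy nova 79 60 3000
chevy nova 80 50 3500
honda accord 81 30 6000
• unquoted constants give the numeric comparison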
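The difference between the two commands above comes down to whether the constant is quoted. A minimal sketch with a single value from the cars file (variable names are illustrative):

```shell
# Numeric comparison: 450 is less than 2000, so nothing is printed.
as_number=$(echo 450 | awk '$1 >= 2000 {print "match"}')
echo "numeric: [$as_number]"

# String comparison: "450" sorts after "2000" because '4' > '2'.
as_string=$(echo 450 | awk '$1 >= "2000" {print "match"}')
echo "string: [$as_string]"
```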
3.5. ',' Range Operator
• Format: pattern1 , pattern2
• Select a range of lines.
  • the first line of the range matches pattern1
  • the last line of the range matches pattern2
• May return several groups of lines
$ awk '/volvo/ , /fiat/' cars
volvo gl 78 102 9850
ford ltd 83 15 10500
chevy nova 80 50 3500
fiat 600 65 115 450
$ awk 'NR == 2 , NR == 4' cars
chevy nova 79 60 3000
ford mustang 65 45 10000
volvo gl 78 102 9850
$ awk '/chevy/ , /ford/' cars
chevy nova 79 60 3000
ford mustang 65 45 10000
chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
ford thundbd 84 10 17000
chevy impala 65 85 1550
ford bronco 83 25 9500
• three groups
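Two edge cases of the range operator are worth checking by hand: a range whose end pattern never matches, and a range whose start and end match on the same line. The sketch below uses made-up one-letter lines and illustrative variable names.

```shell
# pattern2 never matches: the range runs on to the end of the input.
open_ended=$(printf '%s\n' a b c d | awk '/b/ , /zzz/')
echo "$open_ended"

# pattern1 and pattern2 match the same line: the range is that one line.
single=$(printf '%s\n' a b c d | awk '/b/ , /b/')
echo "$single"
```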
3.6. Many Built-in Functions
• length(str): length of string str
• length: length of the current line
• split(string, array, delimiter): split string into parts based on the delimiter, and place them in array
    split("a bcd ef g1", arr, " ")
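The split call from the slide can be run inside a BEGIN block so that no input file is needed. In this sketch `n` and `result` are illustrative names; split returns the number of pieces and fills the array starting at index 1.

```shell
result=$(awk 'BEGIN {
    n = split("a bcd ef g1", arr, " ")   # n is the number of pieces
    print n, arr[1], arr[2], length(arr[2])
}')
echo "$result"
```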
3.7. BEGIN and END
• BEGIN {action}: executed before the first line is processed
• END {action}: executed after the last line is processed
$ awk 'END {print NR, "cars for sale."}' cars
12 cars for sale.
3.8. First awk Program File
$ cat pr_header
#
# pr_header
#
BEGIN {
    print "Make Model Year Miles Price"
    print "---------------------------------"
}
{print}
$ awk -f pr_header cars
Make Model Year Miles Price
---------------------------------
plym fury 77 73 2500
chevy nova 79 60 3000
:
:
chevy impala 65 85 1550
ford bronco 83 25 9500
redirect_out
$ cat redirect_out
/chevy/ {print > "chev.txt"}
/ford/ {print > "ford.txt"}
END {print "done."}
$ awk -f redirect_out cars
done.
$ cat chev.txt
chevy nova 79 60 3000
chevy nova 80 50 3500
chevy impala 65 85 1550
3.9. Action Language
• Very C like:
  • var = expr
  • if (cond) stat1 else stat2
  • while (cond) stat
  • for (expr1; cond; expr2) stat
  • printf "format", expr1, expr2, ...
  • { stat1 ; stat2; ... ; statN }
• User-defined variables do not need to be declared
• Long statements, conditions, expressions may need to be typed over several lines.
• Use '\' to hide the newline:
    if ($3 > 2000 && \
        $3 < 3000) print $3
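The C-like statements listed above combine in the usual way. A small sketch, assuming a made-up input line of three numbers; NF (number of fields) is a standard awk variable not covered in the slides, and `total` is an illustrative name.

```shell
total=$(echo '3 4 5' | awk '{
    sum = 0                       # no declaration needed
    for (i = 1; i <= NF; i++)     # loop over every field on the line
        sum += $i
    printf "%d fields, sum %d\n", NF, sum
}')
echo "$total"
```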
price_range
$ cat price_range
{
    if ($5 <= 5000) $5 = "inexpensive"
    else if ($5 > 5000 && $5 < 10000) \
        $5 = "please ask"
    else if ($5 >= 10000) $5 = "expensive"
    printf "%-10s %-8s 19%2d %5d %-12s\n", \
        $1, $2, $3, $4, $5
}
$ awk -f price_range cars
plym       fury     1977    73 inexpensive
chevy      nova     1979    60 inexpensive
:
:
ford       bronco   1983    25 please ask
summary
$ cat summary
BEGIN { yearsum = 0 ; costsum = 0
        newcostsum = 0 ; newcnt = 0 }
{ yearsum += $3 ; costsum += $5 }
$3 > 80 { newcostsum += $5 ; newcnt++ }
END { printf "Avg. car age: %3.1f yrs\n", \
          90 - (yearsum/NR)
      printf "Avg. car cost: $%7.2f\n", \
          costsum/NR
      printf "Avg. newer car cost: $%7.2f\n", \
          newcostsum/newcnt }
$ awk -f summary cars
Avg. car age: 13.2 yrs
Avg. car cost: $6216.67
Avg. newer car cost: $8750.00
3.10. Associative Arrays
• Arrays that use strings as indexes:
    array[string] = value
• Special for-loop for awk arrays:
    for (elem in array) action
manuf
$ cat manuf
{manuf[$1]++}
END { for (name in manuf) \
          print name, manuf[name] }
$ awk -f manuf cars
honda 1
fiat 1
volvo 1
ford 4
plym 1
chevy 3
toyota 1
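Associative arrays can accumulate values as well as counts. The sketch below totals the price column per make for four lines taken from the cars file; the array name `price`, the temporary file (created with mktemp, an assumption about the local system), and the variable names are illustrative. The output is piped through sort because the order of a for-in loop is not defined.

```shell
# Four records from the cars file, written to a scratch file.
cars=$(mktemp)
cat > "$cars" <<'EOF'
chevy nova 79 60 3000
ford mustang 65 45 10000
chevy nova 80 50 3500
ford ltd 83 15 10500
EOF

# Sum field 5 (price) per make, indexed by field 1.
totals=$(awk '{price[$1] += $5}
              END {for (make in price) print make, price[make]}' "$cars" | sort)
echo "$totals"
rm -f "$cars"
```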
Sorted Output
• Sort by first column (i.e. by name):
    $ awk -f manuf cars | sort
• Sort by second column (i.e. by number):
    $ awk -f manuf cars | sort -k2
  • older versions of sort write this as sort +1
4. sed
• Format: sed 'list of ed commands' file
• Read lines one at a time from the input file
  • apply ed commands in order to each line
  • write edited line to stdout
• ed is an old UNIX editor
  • vi without full-screen mode
  • did you think vi was tough :)
4.1. Search and Replace
• The 's' command searches for a pattern (a regular expression), and replaces it with the new string:
    's/pattern/new-string/g'
• 'g' means global (everywhere on line)
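The effect of the g flag is easiest to see on a line that contains the pattern twice. A minimal sketch on a made-up input string (variable names are illustrative):

```shell
# Without g, only the first match on each line is replaced.
first=$(echo 'UNIX and UNIX' | sed 's/UNIX/UNIX(TM)/')
echo "$first"

# With g, every match on the line is replaced.
every=$(echo 'UNIX and UNIX' | sed 's/UNIX/UNIX(TM)/g')
echo "$every"
```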
Examples
$ sed 's/UNIX/UNIX(TM)/g' file > new-file
$ sed 's/^/<TAB>/' file > new-file
• put a tab at the start of every line (no g needed)
• <TAB> stands for a literal tab character
$ sed 's/[ <TAB>][ <TAB>]*/\
/g' file > new-file
• replace every sequence of blanks or tabs with a newline
• this splits the input into 1 word/line
$ who
ad tty1 Sep 29 07:14
ron tty3 Sep 29 10:31
td tty4 Sep 29 08:36
$ who | sed 's/ .* / /'
ad 07:14
ron 10:31
td 08:36
$
• replace a blank and everything that follows it (as much as possible, including more blanks) up to the last blank
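The "as much as possible" behaviour can be compared with a pattern that cannot run across blanks. This sketch uses a made-up four-word line and a '-' replacement so the difference is visible; the variable names are illustrative.

```shell
# ' .* ' is greedy: it runs from the first blank to the last blank.
greedy=$(echo 'a b c d' | sed 's/ .* /-/')
echo "$greedy"

# ' [^ ]* ' cannot cross a blank, so it stops after the first word.
narrow=$(echo 'a b c d' | sed 's/ [^ ]* /-/')
echo "$narrow"
```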
More Information
• sed can use most ed commands, not just s
• See the entry on sed in Sobell, p.680-691