CSE4251 The Unix Programming Environment Lecture 10 • awk
Recap
• Regular expressions (lec-5)
  • symbols & rules to describe text patterns/filters
• Unix commands/utilities that support regular expressions
  • grep (fgrep, egrep) - search a file for a string or regular expression
  • sed - stream editor
  • awk (nawk) - pattern scanning and processing language
note: there are some minor differences between the regular expressions supported by these programs
awk history
• The name AWK
  • Initials of its designers: Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan
• First appeared in 1977; stable release in 1985
• In BSD, OS X: bawk or nawk
• GNU/Linux: gawk
$ which awk
/bin/awk
$ ls -l /bin/*awk
lrwxrwxrwx. 1 root root      4 Jul 2 2013 /bin/awk -> gawk
-rwxr-xr-x. 1 root root 382456 Jul 4 2012 /bin/gawk
awk basics
• basic function:
  • search files for lines that contain certain patterns
  • perform actions on those lines
• basic command format:
$ awk '{action}' file.txt
$ awk '/pattern/{action}' file.txt
$ awk '/pattern1/{action1} /pattern2/{action2}' file.txt
gamefile
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
southeast SE Patricia Heme 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
central CT Ann Stephens 5.7 .94 5 13
awk basics
• basic command examples:
$ awk '{print}' gamefile   #print all lines (no pattern constraints)
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
southeast SE Patricia Heme 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
central CT Ann Stephens 5.7 .94 5 13
awk basics
• basic command examples:
$ awk '{print $1}' gamefile   #print 1st field in all lines
northwest
western
southwest
southern
southeast
eastern
northeast
north
central
awk basics
• basic command examples:
$ awk '/north/{print $1}' gamefile   #print 1st field in lines containing north
northwest
northeast
north
awk basics
• basic command examples:
$ awk '/north/{print $1} /west/{print}' gamefile   #print 1st field in lines containing north; print whole lines containing west
northwest
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
northeast
north
• the northwest line matches both patterns, so both actions run on it
More concepts
• A line is called a record
• Text separated by a delimiter is called a field
• FS: input field separator (delimiter)
  • default is a space (awk then splits on any run of spaces or tabs)
• two ways to change the default delimiter
  • change via -F
  • change via setting FS
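More concepts
• Change delimiter to ":" via -F
$ awk -F: '/north/{print $1}' gamefile
northwest NW Charles Main 3.0 .98 3 34
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
  • gamefile contains no ":", so each line is a single field and $1 is the whole line
• Change delimiter to ":" via setting FS
$ awk '{FS=":"} /north/{print $1}' gamefile
northwest
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
  • FS set inside an action takes effect from the next line, so the first line is still split on spaces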
More concepts
• $0, $1, ... etc.
  • $0 : the whole line
  • $1 : the first field in a line
• NR : number of the current record
  • also the line number
• NF : number of fields in a line
More concepts
• E.g., print the line number, the first field, and the number of fields in the line; connect the output fields with "-"
$ awk '{print NR "-" $1 "-" NF}' gamefile
1-northwest-8
2-western-8
3-southwest-8
4-southern-8
5-southeast-8
6-eastern-8
7-northeast-9
8-north-8
9-central-8
Another example
• E.g., print the line number, the employee's first name, and the number of fields in the line; connect the output fields with "---"
$ cat employees.txt
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
$ awk '{print NR "---" $1 "---" NF}' employees.txt
1---Tom---5
2---Mary---5
3---Sally---5
4---Billy---5
awk comparison expressions
• Conditional expression
  • condition ? exp1 : exp2
• Logical operations
  • &&, ||, !
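The operators listed above can be tried on the employees.txt sample from the earlier slide. This is a minimal sketch: the thresholds are arbitrary values chosen for illustration, and the "high"/"low" labels are invented.

```shell
# Recreate the employees.txt sample shown earlier in these slides
cat > employees.txt <<'EOF'
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
EOF

# ||  : print the name when either condition holds (prints Mary, Sally)
awk '$3 < 1660 || $3 > 5000 {print $1}' employees.txt

# !   : negate a condition (prints Mary)
awk '!($5 > 100000) {print $1}' employees.txt

# ?:  : choose one of two strings per line (the labels are illustrative)
awk '{print $1, ($5 >= 500000 ? "high" : "low")}' employees.txt
```

The parentheses around the conditional expression keep it from being confused with the rest of the print statement.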
More examples
• awk '$3<3000' employees.txt
  • print lines where $3 is less than 3000
• awk '/Tom/{print "Hello, " $1}' employees.txt
  • find the line containing Tom, then print "Hello, Tom"
$ awk '$3<3000' employees.txt
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
$ awk '/Tom/{print "Hello, " $1}' employees.txt
Hello, Tom
More examples
• awk '/ly/{print $1}' employees.txt
  • print the first field (the name) of lines that contain ly
• awk '$1 !~ /ly$/{print $1}' employees.txt
  • print the names that do not end in ly
$ awk '/ly/{print $1}' employees.txt
Sally
Billy
$ awk '$1 !~ /ly$/{print $1}' employees.txt
Tom
Mary
More examples
• Conditional expression
$ cat needmax.txt
1 2
3 5
6 3
7 2
$ awk '{max=($1>$2)? $1 : $2; print max}' needmax.txt
2
5
6
7
More examples
• awk '$8 > 10 && $8 < 17' gamefile
• awk '$7==5{print $7+5}' gamefile
$ awk '$8 > 10 && $8 < 17' gamefile
southern SO Suan Chin 5.1 .95 4 15
central CT Ann Stephens 5.7 .94 5 13
$ awk '$7==5{print $7+5}' gamefile
10
10
10
10
Math operators
• awk '/southern/{print $5 + 10.56}' gamefile
• awk '/southern/{print $8 - 10}' gamefile
• awk '/southern/{print $8 / 2}' gamefile
• awk '/southern/{print $8 * 2}' gamefile
• awk '/northeast/{print $8 % 3}' gamefile
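The slide lists these commands without their output. A minimal sketch that runs them on just the two records they touch, assuming a hypothetical excerpt saved as gamefile.small (the file name is invented so that the full gamefile is not overwritten):

```shell
# gamefile.small: a two-line excerpt of gamefile (hypothetical file name)
cat > gamefile.small <<'EOF'
southern SO Suan Chin 5.1 .95 4 15
northeast NE AM Main Jr. 5.1 .94 3 13
EOF

awk '/southern/{print $5 + 10.56}' gamefile.small   # 15.66
awk '/southern/{print $8 - 10}' gamefile.small      # 5
awk '/southern/{print $8 / 2}' gamefile.small       # 7.5
awk '/southern/{print $8 * 2}' gamefile.small       # 30
awk '/northeast/{print $8 % 3}' gamefile.small      # 0
```

In the northeast record the name "AM Main Jr." occupies three fields, so $8 is 3 rather than 13, which is why the remainder is 0.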
More operators
• assignment operators: =, +=, -=, *=, /=, %=, ^=
• increment and decrement: ++, --
• awk '$3 == "Chris"{ $3 = "Christian"; print}' gamefile
  • if a line's 3rd field is "Chris", change it to "Christian" and print the line
• awk '{$7^=2; print $7}' gamefile
  • square the 7th field and print the 7th field
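No line in gamefile has "Chris" as its 3rd field, so the first command above prints nothing on that file. A minimal sketch of the same operators with a name that does occur, assuming a hypothetical two-line excerpt saved as gamefile.small; the replacement name "Charlie" and the running total are invented for illustration:

```shell
# gamefile.small: the first two lines of gamefile (hypothetical file name)
cat > gamefile.small <<'EOF'
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
EOF

# =  : assign to a field, then print the modified line
awk '$3 == "Charles"{ $3 = "Charlie"; print }' gamefile.small
# prints: northwest NW Charlie Main 3.0 .98 3 34

# ^= : square the 7th field in place (prints 9, then 25)
awk '{ $7 ^= 2; print $7 }' gamefile.small

# += : keep a running total of the 8th field (prints 34, then 57)
awk '{ total += $8; print total }' gamefile.small
```

Variables such as total start out empty and count as 0 in arithmetic, so no initialization is needed.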
More operators
• awk '{x=1; y=x++; print x, y}' gamefile
$ awk '{x=1;y=x++; print x,y}' gamefile
2 1
2 1
2 1
2 1
2 1
2 1
2 1
2 1
2 1
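The output above is "2 1" because x++ yields the old value of x before incrementing it. A minimal sketch contrasting it with the prefix form ++x, fed a single empty line with echo so that the action runs exactly once:

```shell
# postfix: y receives the old value of x, then x is incremented
echo | awk '{ x = 1; y = x++; print x, y }'   # 2 1

# prefix: x is incremented first, then y receives the new value
echo | awk '{ x = 1; y = ++x; print x, y }'   # 2 2
```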
BEGIN pattern
• The BEGIN pattern is followed by an action block that is executed before any lines from the input file are processed
• a BEGIN-only awk command can run without an input file
$ awk 'BEGIN{print "Hello" }'
Hello
BEGIN pattern
• Change delimiter to ":" via setting FS
$ awk '{FS=":"} /north/{print $1}' gamefile
northwest
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
$ awk 'BEGIN{FS=":"} /north/{print $1}' gamefile
northwest NW Charles Main 3.0 .98 3 34
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
• setting FS in BEGIN applies it to every line, including the first; without BEGIN, the first line is still split on spaces
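Because gamefile contains no colons, the commands above print whole lines rather than individual fields. A minimal sketch of the same two techniques on data that is actually colon-delimited, assuming a hypothetical file named colonfile (invented for illustration, not part of the original slides):

```shell
# colonfile: hypothetical colon-delimited variant of three gamefile records
cat > colonfile <<'EOF'
northwest:NW:Charles Main:3.0
southern:SO:Suan Chin:5.1
northeast:NE:AM Main Jr.:5.1
EOF

# -F: sets the separator before any line is read
awk -F: '/north/{print $3}' colonfile

# setting FS in a BEGIN block has the same effect
awk 'BEGIN{FS=":"} /north/{print $3}' colonfile
```

Both commands print "Charles Main" and "AM Main Jr.": with ":" as the separator, a name containing spaces stays in one field.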
END pattern
• The END pattern allows actions to be executed after all lines in the input file have been processed
• unlike a BEGIN-only command, a command with an END block still reads its input (a file or standard input) first
$ awk 'END{ print "The number of records is " NR}' employees.txt
The number of records is 4
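BEGIN, a main action, and END can be combined in one program. A minimal sketch on the employees.txt sample, treating the 5th field as the salary; the header text and the wording of the summary line are invented for illustration:

```shell
# Recreate the employees.txt sample shown earlier in these slides
cat > employees.txt <<'EOF'
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
EOF

# BEGIN runs once before the first line, END once after the last line
awk 'BEGIN { print "Name Salary" }
     { total += $5; print $1, $5 }
     END { printf "Total %d, average %.2f over %d records\n", total, total / NR, NR }' employees.txt
```

The last line of output is "Total 1558619, average 389654.75 over 4 records". printf is used for the average because a plain print would round it to six significant digits.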
Can use redirection & pipe in actions
• awk '$8 > 10 && $8 < 17 {print}' gamefile
• awk '$8 > 10 && $8 < 17 {print > "tmp.out"}' gamefile
$ awk '$8 > 10 && $8 < 17 {print}' gamefile
southern SO Suan Chin 5.1 .95 4 15
central CT Ann Stephens 5.7 .94 5 13
$ awk '$8 > 10 && $8 < 17 {print > "tmp.out"}' gamefile
$ cat tmp.out
southern SO Suan Chin 5.1 .95 4 15
central CT Ann Stephens 5.7 .94 5 13
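The slide shows redirection only. A minimal sketch of the pipe form, which sends print output to a shell command (here sort) instead of a file; it uses the employees.txt sample from the earlier slides:

```shell
# Recreate the employees.txt sample shown earlier in these slides
cat > employees.txt <<'EOF'
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
EOF

# Every first name is written to a single sort process, whose output
# appears when awk exits: Billy, Mary, Sally, Tom
awk '{ print $1 | "sort" }' employees.txt
```

The command string is in double quotes inside the awk program, just like the file name in the redirection example.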
awk in a script
• awk -f awk.script somefile
$ cat awk.script
#file: awk_first
/Tom/{print "Tom's birthday is " $4}
/Mary/{print NR, $0}   #print line number
/^Sally/{print "Hi, Sally. " $1 " has salary of $" $5 "."}
$ awk -f awk.script employees.txt
Tom's birthday is 5/12/66
2 Mary Adams 5346 11/4/63 28765
Hi, Sally. Sally has salary of $650000.
More to explore
• conditional statement
• loop
• arrays
• user-defined functions
Chapter 6 of Unix Shells by Example, 4th Edition, by Ellie Quigley
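As a small taste of these four features, here is a minimal sketch that redoes the needmax.txt example from the conditional-expression slide. The function name max and the array name m are invented for illustration, and this is only one of many ways to write it:

```shell
# Recreate the needmax.txt sample shown earlier in these slides
cat > needmax.txt <<'EOF'
1 2
3 5
6 3
7 2
EOF

awk '
function max(a, b) {            # user-defined function
    if (a > b) return a         # conditional statement
    return b
}
{ m[NR] = max($1, $2) }         # array indexed by line number
END {
    for (i = 1; i <= NR; i++)   # loop over the stored values
        print i, m[i]
}' needmax.txt
```

It prints each line number followed by the larger of the two values on that line: "1 2", "2 5", "3 6", "4 7".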