
Lecture 10



  1. Lecture 10: Introduction to AWK (COP 3344, Introduction to UNIX)

  2. What is AWK
     • An important early text-manipulation language
     • Created by Al Aho, Peter Weinberger, and Brian Kernighan
     • A Unix utility that processes text files viewed as lines (records) arranged in columns (fields)
     • awk reads each line of input (from standard input or a set of files), splits it into fields on whitespace by default, and processes it; the field separator need not be whitespace and can be set to a specified character
     • Other flavors of awk also exist, such as nawk and gawk

  3. Awk Command Structure
     • awk [options] 'program' [file(s)]
     • awk [options] -f programfile [file(s)]
     • A program consists of one or more pairs of the form: pattern { procedure }
     • BEGIN and END constructs can also be used
     • An important option is -Fc, where c is the field separator to use. For example, awk -F: ... indicates that the separator is ":"
     • Example: awk -F: '/this/ { print $2 }' file1
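The two invocation forms above can be tried with a minimal sketch like the following. It assumes a POSIX shell with awk and mktemp available; the colon-separated sample input and the temporary program file are made up for illustration.

```shell
# Hypothetical colon-separated input.
input='this:alpha:1
that:beta:2
this:gamma:3'

# Form 1: program given inline; -F: sets the field separator to ":".
inline=$(printf '%s\n' "$input" | awk -F: '/this/ { print $2 }')
echo "$inline"

# Form 2: the same program read from a file with -f.
prog=$(mktemp)
printf '%s\n' '/this/ { print $2 }' > "$prog"
fromfile=$(printf '%s\n' "$input" | awk -F: -f "$prog")
rm -f "$prog"
echo "$fromfile"
```

Both forms print the second field of the lines containing "this", so the two results should be identical.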

  4. Awk Program Processing
     • awk scans each input line for each pattern; when a match occurs, the actions defined by the associated procedure are executed. The general form of a program is:
         BEGIN { initial statements }
         pattern { procedure }
         pattern { procedure }
         END { final statements }
     • If the pattern is missing, the procedure is applied to every line
     • If the procedure is missing, the matched lines are written to standard output
     • Fields are referred to by the variables $1, $2, …, $n. $0 refers to the entire record (the line).
     • The BEGIN statements run before any input is read; the END statements run after all input has been processed.
     • Most programs contain only one pattern { procedure } pair
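As a rough illustration of the processing order described above, the sketch below combines a BEGIN block, a rule with no pattern, a rule with no procedure, and an END block. The two-line input is invented for the example.

```shell
out=$(printf 'a b\nc d e\n' | awk '
BEGIN { print "start" }            # runs once, before any input is read
      { print NR ": " $1 }         # no pattern: applied to every line
NF > 2                             # no procedure: matching lines are printed
END   { print "lines read:", NR }  # runs once, after the last line
')
echo "$out"
```

For each input line the rules are tried in the order they are written, which is why the second line is reported by the unpatterned rule before it is echoed by NF > 2.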

  5. awk patterns
     • awk patterns can take the following forms:
         /regular expression/
         relational expression
         field-matching expression
     • Example patterns:
         /this/
         /^alpha*/     # the * applies only to the final a, so this matches any line starting with alph
         NF > 2
         $1 == $2
         $1 ~ /m$/
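To see how each example pattern selects lines, here is a small sketch that prints the line numbers each one matches. The four-line sample data is hypothetical and chosen so that every pattern matches something different.

```shell
data='alpha one
alph two three
beta beta
zoom x y'

re=$(printf '%s\n' "$data"  | awk '/^alpha*/ { print NR }')   # regular expression
rel=$(printf '%s\n' "$data" | awk 'NF > 2    { print NR }')   # relational: field count
eq=$(printf '%s\n' "$data"  | awk '$1 == $2  { print NR }')   # relational: compare two fields
fm=$(printf '%s\n' "$data"  | awk '$1 ~ /m$/ { print NR }')   # field match: first field ends in m

echo "regex: $re"
echo "NF > 2: $rel"
echo "\$1 == \$2: $eq"
echo "\$1 ~ /m\$/: $fm"
```

Note that /^alpha*/ also selects the line beginning with "alph", because the * makes the final "a" optional.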

  6. Example pattern-procedures
     • Print the second field of each line:
         { print $2 }
     • Print the first field of all lines that contain the pattern alpha:
         /alpha/ { print $1 }
     • Print all records containing more than two fields:
         NF > 2
     • Add the numbers in the second column when the first field is exactly the word "add":
         $1 ~ /^add$/ { total += $2 }
         END { print "total is", total }
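The last example can be run end to end with a sketch like this one. The input lines are invented; the "adder" line is included only to show that the anchored expression /^add$/ requires the first field to be exactly "add".

```shell
total=$(printf 'add 5\nsub 3\nadd 7\nadder 100\n' |
    awk '$1 ~ /^add$/ { total += $2 }       # accumulate only for "add" lines
         END { print "total is", total }')  # report once all input is read
echo "$total"
```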

  7. awk Regular Expressions
     • Regular expressions are formed in the same way as for extended grep (egrep). All of its operators are available
     • Note that regular expressions must be placed between slashes: /<regular expression>/
     • Examples:
         /D[Rr]\./        # matches any line containing DR. or Dr.
         /^alpha/         # matches any line starting with alpha
         /^[a-zA-Z]+/     # matches any line starting with a sequence of one or more letters
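A short sketch of two of these expressions in use, assuming a POSIX awk; the sample lines are hypothetical.

```shell
lines='DR. Smith
Dr. Jones
Drs Lee
alpha1 beta
9lives'

# A bare pattern prints the matching lines; \. is a literal dot, so "Drs" is not matched.
dr=$(printf '%s\n' "$lines" | awk '/D[Rr]\./')

# Count the lines that start with one or more letters.
letters=$(printf '%s\n' "$lines" | awk '/^[a-zA-Z]+/ { n++ } END { print n }')

echo "$dr"
echo "$letters"
```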

  8. awk Relational Expressions
     • Relational expressions can consist of strings, numbers, arithmetic and string operators, relational operators, user-defined variables, and predefined variables:
         $1, …, $n are the fields of the record
         $0 is the entire line
         NF is the number of fields in the current line
         NR is the number of the current line
         FS is the field separator
         FILENAME is the current filename
     • Many relational operators are available, and expressions can be combined with && and ||:
         NF > 5 && $1 == $2
         /while/ || /do/
     • Note: variables are assigned with the "=" operator:
         FS = ","
         total = 5
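The predefined variables can be inspected with a sketch along these lines. It assumes mktemp is available; the comma-separated sample file is made up, and the combined test uses NF > 1 rather than NF > 5 only so that the small sample has a matching line.

```shell
tmp=$(mktemp)
printf 'a,b,c\nx,x\nwhile,1,2,3,4,5\n' > "$tmp"

# FS assigned in BEGIN so it takes effect before the first line is split.
vars=$(awk 'BEGIN { FS = "," } { print NR, NF, $1 }' "$tmp")

# Two relational tests combined with &&.
both=$(awk -F, 'NF > 1 && $1 == $2 { print $0 }' "$tmp")

# Two regular-expression patterns combined with ||.
either=$(awk -F, '/while/ || /do/ { print NR }' "$tmp")

# FILENAME holds the name of the file currently being read.
fname=$(awk 'NR == 1 { print FILENAME }' "$tmp")
rm -f "$tmp"

echo "$vars"
echo "$both"
echo "$either"
```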

  9. awk field matching expressions
     • Field-matching expressions check whether a regular expression matches ("~") or does not match ("!~") a field.
     • Examples:
         $1 ~ /D[Rr]\./     # first field contains DR. or Dr.
         $1 !~ /From/       # first field does not contain From
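A minimal sketch of both operators on invented input. Because the expression is not anchored, "From:" still counts as containing From and is excluded by the !~ test.

```shell
mail='From alice
To bob
Dr. Who
From: carol'

is_dr=$(printf '%s\n' "$mail"    | awk '$1 ~ /D[Rr]\./ { print $2 }')   # first field matches
not_from=$(printf '%s\n' "$mail" | awk '$1 !~ /From/   { print $1 }')   # first field does not match

echo "$is_dr"
echo "$not_from"
```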

  10. awk procedures
     • An awk procedure specifies the processing of a line that matches a given pattern. A procedure is enclosed in "{" and "}" and consists of statements separated by semicolons or newlines.
     • awk is a full programming language and contains control statements (such as do, while, for, if, break, and continue)
     • Note that BEGIN can be used to initialize variables and END can be used for post-processing after all records have been processed
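The sketch below shows a multi-statement procedure with a for loop and an if/else, plus BEGIN and END blocks. The input numbers and the threshold of 20 are arbitrary choices for the illustration.

```shell
report=$(printf '3 4 5\n10 20\n\n7\n' | awk '
BEGIN { big = 0; small = 0 }       # initialize counters before any input
NF > 0 {                           # skip empty lines
    sum = 0
    for (i = 1; i <= NF; i++)      # loop over the fields of this line
        sum += $i
    if (sum > 20) big++; else small++
    print NR, sum
}
END { print "big=" big, "small=" small }   # summary after the last line
')
echo "$report"
```

The empty third line is still counted by NR, so the last data line is reported as line 4.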

  11. awk examples
     • Print the second and first fields, in that order, of each line that contains the string this:
         awk '/this/ { print $2, $1 }' file1
     • Sum the values in the second column of every line that contains add, and print the final sum:
         awk 'BEGIN { sum=0 } /add/ { sum += $2 } \
              END { print sum }' file2
     • Illustrating if statements and the or operator:
         awk '/green/ || /yellow/ \
              { if ($1=="green") print $1 ; \
                else if ($1=="yellow") print "SLOW DOWN"; }' \
              file3
