Revision Lecture Mauro Jaskelioff
AWK Program Structure
• AWK programs consist of patterns and procedures:
    Pattern_1 { Procedure_1 }
    Pattern_2 { Procedure_2 }
    Pattern_3 { Procedure_3 }
    … …
    Pattern_n { Procedure_n }
• Additionally, a program can contain function definitions (but we don’t need to worry about them now)
Example program
    BEGIN { FS = ":"
            print "Example v0.1" }
    $7 ~ /bash/ { print $1 " uses bash" }
    $4 == 0 { print "user " $1 " belongs to the root group" }
    { print "--------------------------------" }
• Don’t mind the details! Try to recognize the general structure described on the previous slide.
AWK Input
• AWK input consists of records and fields
• Records are separated by a record separator, RS
• By default RS is a newline, so each record is a line of input
• Each record consists of zero or more fields, separated by a field separator, FS
• By default FS is whitespace: fields are separated by runs of spaces and tabs
• The current record is $0. Its fields are $1, $2, …
Example of inputs
Consider the following input file:
    Red,255 0 0
    Green,0 255 0
    Blue,0 0 255
• With the default RS and the default FS, if $0="Red,255 0 0" then $1="Red,255", $2="0" and $3="0"
• With FS=",", if $0="Red,255 0 0" then $1="Red" and $2="255 0 0"
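The two splitting behaviours can be checked from the shell. This is a minimal sketch: the variable names (input, default_fs, comma_fs) are made up for illustration, and the data is the three-line file from the slide.

```shell
# Sample data: the three-line input file from the slide.
input='Red,255 0 0
Green,0 255 0
Blue,0 0 255'

# Default FS (whitespace): the comma stays inside $1.
default_fs=$(printf '%s\n' "$input" | awk '{ print $1 "|" $2 "|" $3 }')

# FS set to a comma with -F: each record splits into two fields.
comma_fs=$(printf '%s\n' "$input" | awk -F, '{ print $1 "|" $2 }')

printf '%s\n\n%s\n' "$default_fs" "$comma_fs"
```

The "|" is only there to make the field boundaries visible in the output.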
AWK’s Main loop (simplified)
    for each input record r do
        parse r
        for each pattern pat_i do
            if r matches pat_i then
                execute proc_i
Patterns
A pattern can be:
• A relational expression
    • Use relational operators, e.g. $1 > $2
        awk -F: '$1 > $2 {print $0}' /etc/passwd
    • Can do numeric or string comparisons
        awk -F: '$1=="gdm" {print $0}' /etc/passwd
• An empty pattern
        awk -F: '{print $0}' /etc/passwd
    • Always true
    • Equivalent to a true expression. For example, the command above is the same as:
        awk -F: '1 < 2 {print $0}' /etc/passwd
Patterns (2)
• A pattern-matching expression
    • Built from, e.g., quoted strings, numbers, operators, defined variables…
    • ~ means match, !~ means don’t match
        awk -F: '$1 ~ /.dm.*/ {print $0}' /etc/passwd
        awk -F: '$0 ~ /^...:/ {print $0}' /etc/passwd
        awk -F: '$1 !~ /^g/ {print $0}' /etc/passwd
• /regular expression/
    • Equivalent to $0 ~ /regular expression/
        awk -F: '/^...:/ {print $1}' /etc/passwd
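To try the match operators without depending on a real /etc/passwd, here is a small sketch. The three passwd-style lines are invented sample data, and the variable names are only for illustration.

```shell
# Hypothetical passwd-style records (user:password:uid:gid).
users='root:x:0:0
gdm:x:42:42
gnats:x:41:41'

# ~ : first field contains any character followed by "dm".
matched=$(printf '%s\n' "$users" | awk -F: '$1 ~ /.dm.*/ { print $1 }')

# !~ : first field does not start with "g".
not_g=$(printf '%s\n' "$users" | awk -F: '$1 !~ /^g/ { print $1 }')

# Bare /regex/ is matched against $0: three characters, then a colon.
three=$(printf '%s\n' "$users" | awk -F: '/^...:/ { print $1 }')

printf '%s\n%s\n%s\n' "$matched" "$not_g" "$three"
```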
Special patterns
• Two special patterns:
• BEGIN
    • Specifies procedures that take place before the first input record is processed
        awk 'BEGIN {print "Version 1.0"}' dataFile
• END
    • Specifies procedures that take place after the last input record is read
        awk 'END {print "end of data"}' dataFile
• This means we need to refine the description of the main loop (see next slide)
AWK’s refined Main loop
    for each BEGIN pattern do
        execute corresponding procedure
    for each input record r do
        parse r
        for each pattern pat_i do
            if r matches pat_i then
                execute proc_i
    for each END pattern do
        execute corresponding procedure
• The middle loop (for each input record) is the previous version of the main loop
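The loop can be traced with a tiny program. This is a sketch on made-up two-line input; note that a record is tested against every pattern in order, so the second record triggers two procedures.

```shell
# Two hypothetical input records: 5 and 20.
trace=$(printf '5\n20\n' | awk '
BEGIN    { print "begin" }                    # runs once, before any input
$1 > 10  { print NR ": big" }                 # relational pattern
         { print NR ": seen " $0 }            # empty pattern, matches every record
END      { print "end after " NR " records" } # runs once, after the last record
')

printf '%s\n' "$trace"
```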
Procedures
• Procedures consist of the usual assignment, conditional, and looping statements found in most languages.
• Statements are separated by newlines or semicolons and are contained within curly brackets { }
• A procedure can be omitted. A pattern with no procedure prints $0; an explicitly empty procedure { } does nothing.
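The difference between leaving the procedure out and writing empty braces can be checked directly. A minimal sketch on two made-up input lines:

```shell
# Pattern with no procedure at all: the default action prints $0.
omitted=$(printf 'a\nb\n' | awk 'NR == 2')

# Pattern with empty braces: the procedure runs but prints nothing.
empty=$(printf 'a\nb\n' | awk 'NR == 2 { }')

printf 'omitted: [%s]\nempty: [%s]\n' "$omitted" "$empty"
```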
awk Built-in Variables
• awk has a number of built-in variables:
    • FILENAME - current filename
    • FS - Field separator
    • NF - Number of fields in the current record
    • NR - Number of the current record
    • RS - Record separator
    • $0 - Entire input record
    • $n - nth field in the current record
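A short sketch showing NR and NF changing from record to record. The input lines are invented, and $NF (the field whose number is NF, i.e. the last field) is used here as an instance of $n.

```shell
# Two hypothetical colon-separated records with different field counts.
report=$(printf 'root:x:0\ngdm:x:42:42\n' |
    awk -F: '{ print NR ": " NF " fields, last is " $NF }')

printf '%s\n' "$report"
```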
Control Structures
• if (condition) statement
• if (condition) statement else statement
• for (expr1; expr2; expr3) statement
• for (index in array) statement
    • More about this when we review arrays.
• while (condition) statement
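As a quick illustration of the if/else form inside a procedure, here is a sketch on two made-up numeric records. The threshold 10 and the labels are arbitrary.

```shell
labels=$(printf '3\n12\n' | awk '{
    if ($1 > 10)
        print $1, "large"
    else
        print $1, "small"
}')

printf '%s\n' "$labels"
```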
For-While equivalence
    for (expr1; expr2; expr3) statement
is equivalent to:
    expr1; while (expr2) { statement; expr3 }
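The equivalence can be seen by summing 1 to 5 both ways. This is a sketch; the variable names are arbitrary, and both programs run entirely in BEGIN so no input is read.

```shell
# for version: expr1 is "i = 1", expr2 is "i <= 5", expr3 is "i++".
for_sum=$(awk 'BEGIN {
    s = 0
    for (i = 1; i <= 5; i++)
        s += i
    print s
}')

# while version: expr1 moves before the loop, expr3 to the end of the body.
while_sum=$(awk 'BEGIN {
    s = 0
    i = 1
    while (i <= 5) {
        s += i
        i++
    }
    print s
}')

printf '%s %s\n' "$for_sum" "$while_sum"
```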
Arrays in awk
• awk has arrays with elements subscripted with strings (associative arrays)
• Assign arrays in one of two ways:
    • Name them in an assignment statement
        myArray[i]=n++
        myArray["Red"]="255 0 0"
    • Use the split(str,arr,fs) function, which splits the string str into elements of the array arr, using the field separator fs. It returns the number of elements created.
        n=split(input, words, "")
Example of split
    m=split("Blue 0 0 255",colors," ")
results in:
    m ← 4
    colors[1] ← "Blue"
    colors["2"] ← "0"
    colors[3] ← "0"
    colors["4"] ← "255"
• Since indexes are really strings, it's legal to write them enclosed in quotes
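The same call can be run as is. A minimal sketch that prints the return value and then reads the elements back, mixing quoted and unquoted indexes as on the slide:

```shell
split_out=$(awk 'BEGIN {
    m = split("Blue 0 0 255", colors, " ")
    print m
    # Quoted and unquoted subscripts refer to the same elements.
    print colors[1], colors["2"], colors[3], colors["4"]
}')

printf '%s\n' "$split_out"
```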
Reading elements in an array
• Using a for loop:
        for (index in array) print array[index]
    • Since indexes are strings, this is the only way to loop through all elements of an array
• Using the operator in:
        if (index in array) ...
    • We use this to test if an index exists.
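Both uses of in, sketched on a small made-up array. Two assumptions to note: the loop variable is called name because index is the name of a built-in awk function and cannot be used as a variable in real code, and the loop only counts elements because for-in visits them in no guaranteed order.

```shell
lookup=$(awk 'BEGIN {
    rgb["Red"] = "255 0 0"
    rgb["Green"] = "0 255 0"

    n = 0
    for (name in rgb)      # visits every index, order not guaranteed
        n++

    # Membership test: does not create the element it tests for.
    has_red = ("Red" in rgb) ? "yes" : "no"
    has_blue = ("Blue" in rgb) ? "yes" : "no"

    print n, has_red, has_blue
}')

printf '%s\n' "$lookup"
```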