4. Semantic Processing and Attribute Grammars
Semantic Processing
The parser checks only the syntactic correctness of a program.
Tasks of semantic processing
• Symbol table handling
  - Maintaining information about declared names
  - Maintaining information about types
  - Maintaining scopes
• Checking context conditions
  - Scoping rules
  - Type checking
• Invocation of code generation routines
Semantic actions are integrated into the parser and are described with attribute grammars.
Semantic Actions
So far: analysis of the input
  Expr = Term { "+" Term }.
  The parser checks whether the input is syntactically correct.
Now: translation of the input (semantic processing)
  e.g.: we want to count the terms in the expression
  Expr = Term              (. int n = 1; .)
         { "+" Term        (. n++; .)
         }                 (. Console.WriteLine(n); .) .
Semantic actions
• arbitrary C# statements between (. and .)
• are executed by the parser at the position where they occur in the grammar
"Translation" here:
  1+2+3  →  3
  47+1   →  2
  909    →  1
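The term-counting grammar above can be tried out with a small hand-written parser. The following is only a sketch in Python (not the C#-style notation of the slides); the function name count_terms and the regex-based tokenizer are assumptions made for illustration, and characters that are neither digits nor "+" are silently skipped.

```python
import re

def count_terms(text):
    """Parse Expr = Term { "+" Term } and return the number of terms."""
    tokens = re.findall(r"\d+|\+", text)   # hypothetical minimal scanner
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("number expected")
        pos += 1

    term()
    n = 1                                  # (. int n = 1; .)
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1                           # consume "+"
        term()
        n += 1                             # (. n++; .)
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return n                               # the slide prints n instead
```

Each commented line marks the place where the parser would execute the corresponding semantic action.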
Attributes
Syntax symbols can return values (a kind of output parameter)
  Term<int val>    Term returns its numeric value as an output attribute
Attributes are useful in the translation process
  e.g.: we want to compute the sum of the terms
  Expr                     (. int sum, val; .)
  = Term<sum>
    { "+" Term<val>        (. sum += val; .)
    }                      (. Console.WriteLine(sum); .) .
"Translation" here:
  1+2+3  →  6
  47+1   →  48
  909    →  909
Input Attributes
Nonterminal symbols can also have input attributes (parameters that are passed from the "calling" production)
  Expr<bool printHex>    printHex: print the result of the addition in hexadecimal (otherwise in decimal)
Example
  Expr<bool printHex>      (. int sum, val; .)
  = Term<sum>
    { "+" Term<val>        (. sum += val; .)
    }                      (. if (printHex) Console.WriteLine("{0:X}", sum);
                              else Console.WriteLine("{0:D}", sum); .) .
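Input and output attributes map naturally onto parameters and return values. The sketch below is an assumption-laden Python rendering of the Expr<bool printHex> production: the names sum_expr and print_hex are hypothetical, the output attribute of Term becomes a return value (Python has no out parameters), and the result is returned as a string rather than printed.

```python
import re

def sum_expr(text, print_hex):
    """Expr<bool printHex>: add up the terms and format the sum."""
    tokens = re.findall(r"\d+|\+", text)   # hypothetical minimal scanner
    pos = 0

    def term():
        # Term<int val>: the output attribute is modelled as the return value
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("number expected")
        pos += 1
        return int(tokens[pos - 1])

    total = term()                         # Term<sum>
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        total += term()                    # (. sum += val; .)
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    # the input attribute decides how the result is presented
    return format(total, "X") if print_hex else format(total, "d")
```

For example, sum_expr("47+1", False) yields "48", while the same input with print_hex set to True yields "30".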
Attribute Grammars
Notation for describing translation processes. An attribute grammar consists of three parts:
1. Productions in EBNF
   Expr = Term { "+" Term }.
2. Attributes (parameters of syntax symbols)
   Term<int val>          output attributes (synthesized): yield the translation result
   Expr<bool printHex>    input attributes (inherited): provide context from the caller
3. Semantic actions
   (. ... arbitrary C# statements ... .)
Example
ATG for processing declarations
  VarDecl                  (. Struct type; .)
  = Type<type>
    IdentList<type> ";" .
  IdentList<Struct type>
  = ident                  (. Tab.Insert(token.str, type); .)
    { "," ident            (. Tab.Insert(token.str, type); .)
    } .
This is translated to parsing methods as follows
  static void VarDecl () {
    Struct type;
    Type(out type);
    IdentList(type);
    Check(Token.SEMICOLON);
  }
  static void IdentList (Struct type) {
    Check(Token.IDENT);
    Tab.Insert(token.str, type);
    while (la == Token.COMMA) {
      Scan();
      Check(Token.IDENT);
      Tab.Insert(token.str, type);
    }
  }
ATGs are shorter and more readable than parsing methods.
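The declaration example can also be sketched in Python under a few stated assumptions: the symbol table is a plain dictionary instead of the slides' Tab class, the only type names are "int" and "char", and the helper names var_decl and check are hypothetical.

```python
import re

def var_decl(text, table):
    """VarDecl = Type IdentList ";" : enter every declared name into table."""
    tokens = re.findall(r"[A-Za-z_]\w*|[,;]", text)   # hypothetical scanner
    pos = 0

    def check(accepts, what):
        # consume one token if the predicate accepts it, else report an error
        nonlocal pos
        if pos >= len(tokens) or not accepts(tokens[pos]):
            raise SyntaxError(what + " expected")
        pos += 1
        return tokens[pos - 1]

    def is_ident(tok):
        return tok[0].isalpha() or tok[0] == "_"

    # Type<type>: output attribute, here simply the type name (assumption)
    typ = check(lambda t: t in ("int", "char"), "type")
    # IdentList<type>: the type is passed down as an input attribute
    table[check(is_ident, "identifier")] = typ        # Tab.Insert(token.str, type)
    while pos < len(tokens) and tokens[pos] == ",":
        pos += 1
        table[check(is_ident, "identifier")] = typ    # Tab.Insert(token.str, type)
    check(lambda t: t == ";", '";"')
```

Calling var_decl("int a, b, c;", table) with an empty dictionary leaves all three names mapped to "int".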
Example: Processing of Constant Expressions
input: 3 * (2 + 4)
desired result: 18
Grammar
  Expr = Term { "+" Term | "-" Term }.
  Term = Factor { "*" Factor | "/" Factor }.
  Factor = number | "(" Expr ")".
ATG
  Expr<int val>            (. int val1; .)
  = Term<val>
    { "+" Term<val1>       (. val += val1; .)
    | "-" Term<val1>       (. val -= val1; .)
    }.
  Term<int val>            (. int val1; .)
  = Factor<val>
    { "*" Factor<val1>     (. val *= val1; .)
    | "/" Factor<val1>     (. val /= val1; .)
    }.
  Factor<int val>          (. int val1; .)
  = number                 (. val = t.val; .)
  | "(" Expr<val> ")".
(Figure: syntax tree for 3 * ( 2 + 4 ) decorated with val attributes. The factors yield 3, 2 and 4, the parenthesized Expr yields 6, and the enclosing Term and Expr yield 18.)
Transforming an ATG into a Parser
Production
  Expr<int val>            (. int val1; .)
  = Term<val>
    { "+" Term<val1>       (. val += val1; .)
    | "-" Term<val1>       (. val -= val1; .)
    }.
Parsing method
  static void Expr (out int val) {
    int val1;
    Term(out val);
    for (;;) {
      if (la == Token.PLUS) {
        Scan();
        Term(out val1);
        val += val1;
      } else if (la == Token.MINUS) {
        Scan();
        Term(out val1);
        val -= val1;
      } else break;
    }
  }
Mapping
  input attribute     →  parameter
  output attribute    →  out parameter
  semantic actions    →  embedded C# code
Terminal symbols have no input attributes. In our form of ATGs they also have no output attributes, but their value is computed from token.str or token.val.
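To see the whole constant-expression ATG at work, here is a minimal recursive-descent sketch in Python. It is an illustration under assumptions, not the course's parser: the name evaluate and the regex scanner are hypothetical, output attributes are modelled as return values, and "/" is mapped to Python's floor division, which differs from C# integer division for negative operands.

```python
import re

def evaluate(text):
    """Evaluate a constant expression following the Expr/Term/Factor ATG."""
    tokens = re.findall(r"\d+|[-+*/()]", text)   # hypothetical scanner
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def scan():
        nonlocal pos
        pos += 1

    def expr():                       # Expr<int val>
        val = term()
        while peek() in ("+", "-"):
            op = peek()
            scan()
            val1 = term()
            val = val + val1 if op == "+" else val - val1
        return val

    def term():                       # Term<int val>
        val = factor()
        while peek() in ("*", "/"):
            op = peek()
            scan()
            val1 = factor()
            val = val * val1 if op == "*" else val // val1
        return val

    def factor():                     # Factor<int val>
        tok = peek()
        if tok is not None and tok.isdigit():
            scan()
            return int(tok)           # (. val = t.val; .)
        if tok == "(":
            scan()
            val = expr()
            if peek() != ")":
                raise SyntaxError('")" expected')
            scan()
            return val
        raise SyntaxError('number or "(" expected')

    result = expr()
    if peek() is not None:
        raise SyntaxError("unexpected token")
    return result
```

Every parsing function has the same shape as the Expr method above: the loop mirrors the EBNF iteration and each branch carries its semantic action.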
Example: Sales Statistics
ATGs can also be used in areas other than compiler construction.
Example: given a file with sales numbers
  File = { Article }.
  Article = Code { Amount } "END".
  Code = number.
  Amount = number.
Input, for example:
  3451 2 5 3 7 END
  3452 4 8 1 END
  3453 1 1 END
  ...
Desired output:
  3451 17
  3452 13
  3453 2
  ...
Whenever the input is syntactically structured, ATGs are a good notation to describe its processing.
ATG for the Sales Statistics
  File                     (. int code, amount; .)
  = { Article<code, amount>   (. Write(code + " " + amount); .)
    }.
  Article<int code, int amount>
  = Value<code>            (. amount = 0; .)
    {                      (. int x; .)
      Value<x>             (. amount += x; .)
    }
    "END".
  Value<int x>
  = number                 (. x = token.val; .) .
Parser code
  terminal symbols: number, end, eof
  static void File () {
    int code, amount;
    while (la == number) {
      Article(out code, out amount);
      Write(code + " " + amount);
    }
  }
  static void Article (out int code, out int amount) {
    Value(out code);
    amount = 0;            // the out parameter must be initialized before it is accumulated
    while (la == number) {
      int x;
      Value(out x);
      amount += x;
    }
    Check(end);
  }
  static void Value (out int x) {
    Check(number);
    x = token.val;
  }
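The sales statistics translation is small enough to run end to end. The following Python sketch assumes whitespace-separated tokens and returns the (code, total) pairs as a list instead of writing them; the function name sales_totals is hypothetical.

```python
def sales_totals(text):
    """File = { Article }. Return a list of (code, total amount) pairs."""
    tokens = text.split()              # numbers and "END", separated by blanks
    pos = 0
    result = []
    while pos < len(tokens) and tokens[pos].isdigit():       # { Article }
        code = int(tokens[pos])                               # Value<code>
        pos += 1
        amount = 0                                            # start of the sum
        while pos < len(tokens) and tokens[pos].isdigit():    # { Value<x> }
            amount += int(tokens[pos])                        # (. amount += x; .)
            pos += 1
        if pos >= len(tokens) or tokens[pos] != "END":
            raise SyntaxError('"END" expected')
        pos += 1
        result.append((code, amount))                         # Write(code, amount)
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return result
```

Applied to the sample input of the example, it produces the totals 17, 13 and 2 for the articles 3451, 3452 and 3453.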
Example: Image Description Language
A polygon is described by:
  POLY (10,40) (50,90) (40,45) (50,0) END
Input syntax:
  Polygon = "POLY" Point { Point } "END".
  Point = "(" number "," number ")".
We want a program that reads the input and draws the polygon.
We use "Turtle Graphics" for drawing
  Turtle.start(p);   sets the turtle (pen) to point p
  Turtle.move(q);    moves the turtle to q, drawing a line
ATG
  Polygon                  (. Pt p, q; .)
  = "POLY"
    Point<p>               (. Turtle.start(p); .)
    { Point<q>             (. Turtle.move(q); .)
    }
    "END"                  (. Turtle.move(p); .) .
  Point<Pt p>              (. int x, y; .)
  = "(" number             (. x = t.val; .)
    "," number             (. y = t.val; .)
    ")"                    (. p = new Pt(x, y); .) .
(Figure: the resulting polygon with the corner points (10,40), (50,90), (40,45) and (50,0).)
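Without a drawing surface, the effect of the turtle calls can still be observed by recording the line segments they would draw. The Python sketch below does this; the name polygon_segments, the tuple representation of points and the regex scanner are assumptions, and the turtle is replaced by a list of (from, to) pairs.

```python
import re

def polygon_segments(text):
    """Polygon = "POLY" Point { Point } "END". Return the segments to draw."""
    tokens = re.findall(r"POLY|END|\d+|[(),]", text)   # hypothetical scanner
    pos = 0

    def check(expected):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != expected:
            raise SyntaxError(expected + " expected")
        pos += 1

    def number():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("number expected")
        pos += 1
        return int(tokens[pos - 1])

    def point():                       # Point<Pt p>
        check("(")
        x = number()
        check(",")
        y = number()
        check(")")
        return (x, y)

    segments = []
    check("POLY")
    p = point()
    current = p                        # Turtle.start(p)
    while pos < len(tokens) and tokens[pos] == "(":
        q = point()
        segments.append((current, q))  # Turtle.move(q)
        current = q
    check("END")
    segments.append((current, p))      # Turtle.move(p) closes the polygon
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return segments
```

For the sample input this yields four segments, the last one leading from (50,0) back to the start point (10,40).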
Example: Transform Infix to Postfix Expressions
Arithmetic expressions in infix notation are to be transformed to postfix notation
  3 + 4 * 2    →  3 4 2 * +
  (3 + 4) * 2  →  3 4 + 2 *
ATG
  Expr =
    Term
    { "+" Term             (. Write("+"); .)
    | "-" Term             (. Write("-"); .)
    }.
  Term =
    Factor
    { "*" Factor           (. Write("*"); .)
    | "/" Factor           (. Write("/"); .)
    }.
  Factor =
    number                 (. Write(token.val); .)
    | "(" Expr ")".
(Figure: syntax tree for 3 + 4 * 2. The semantic actions are executed in the order Write 3, Write 4, Write 2, Write *, Write +.)
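The postfix translation needs no attributes at all, only semantic actions that emit output. A minimal Python sketch, assuming a regex scanner and collecting the written symbols in a list (the name to_postfix is hypothetical):

```python
import re

def to_postfix(text):
    """Translate an infix expression to postfix using semantic actions only."""
    tokens = re.findall(r"\d+|[-+*/()]", text)   # hypothetical scanner
    pos = 0
    out = []                                     # stands in for Write(...)

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def scan():
        nonlocal pos
        pos += 1

    def expr():
        term()
        while peek() in ("+", "-"):
            op = peek()
            scan()
            term()
            out.append(op)           # operator is written after both operands

    def term():
        factor()
        while peek() in ("*", "/"):
            op = peek()
            scan()
            factor()
            out.append(op)

    def factor():
        tok = peek()
        if tok is not None and tok.isdigit():
            scan()
            out.append(tok)          # (. Write(token.val); .)
        elif tok == "(":
            scan()
            expr()
            if peek() != ")":
                raise SyntaxError('")" expected')
            scan()
        else:
            raise SyntaxError('number or "(" expected')

    expr()
    if peek() is not None:
        raise SyntaxError("unexpected token")
    return " ".join(out)
```

The operator is emitted only after its right operand has been parsed, which is exactly what moves it behind both operands.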
ATGs according to Donald Knuth (1968)
ATGs so far: procedural descriptions (translation algorithms)
• Every production is processed from left to right
• In doing so, attributes are computed and semantic actions are executed
Idea
• Nonterminal symbols have attributes
• static properties (do not change after their evaluation)
• examples: type of an expression, address of a variable, ...
Attributation
The syntax tree is traversed (possibly several times up and down) until all attributes have been computed.
(Figure: scanner and parser build a syntax tree of nonterminal (NT) and terminal (T) nodes; the attributation step turns it into a "decorated" syntax tree whose nodes carry attributes.)
Attribute Evaluation Rules
• For every production they define ...
  - the input attributes of all symbols on the right-hand side of the production
  - the output attributes of the symbol on the left-hand side of the production
  - any context conditions if necessary
Example
  Production p:  A<a, b> = B<c, d> C<e, f> .
  (a, c, e are input attributes; b, d, f are output attributes)
  Production p represents a section of the syntax tree: node A with the children B and C.
  We must define all attributes that leave p (i.e. b, c, e).
  Attribute evaluation rules R(p), e.g.:
    c = a;
    e = d + foo(a);
    b = d + f;
  Context condition CC(p), e.g.:
    d >= f
Example: Computing the Value of a Hex Number
Grammar (must be in BNF so that we can build a syntax tree)
  Number = Digits.        // decimal number
  Number = Digits "H".    // hexadecimal number
  Digits = hex.           // hex ... 0..9, A..F
  Digits = Digits hex.
Attributes
  Number<val>    Digits<base, val>    hex<val>
Syntax tree for the input: 1BH
(Figure: Number derives Digits "H"; the upper Digits derives Digits hex, with hex = B; the lower Digits derives hex = 1. The attributes have not been evaluated yet.)
Attribute Evaluation Rules
Production 1
  Number<val> = Digits<base, val>.
    Digits.base = 10;
    Number.val = Digits.val;
Production 2
  Number<val> = Digits<base, val> "H".
    Digits.base = 16;
    Number.val = Digits.val;
Production 3
  Digits<base, val> = hex<val>.
    Digits.val = hex.val;
    CC: Digits.base == 10 && 0 <= hex.val <= 9 || Digits.base == 16 && 0 <= hex.val <= 15
Production 4
  Digits<base, val> = Digits1<base, val> hex<val>.
    Digits1.base = Digits.base;
    Digits.val = Digits1.val * Digits.base + hex.val;
    CC: Digits.base == 10 && 0 <= hex.val <= 9 || Digits.base == 16 && 0 <= hex.val <= 15
Attributation of the Tree
Input: 1BH. For every production we check which attribute evaluation rules are ready to be executed.
The scanner fills the attribute values of the terminal symbols: hex.val = 1 for "1" and hex.val = 11 for "B".
The tree is traversed top-down
  1. Production 2 (Number<val> = Digits<base, val> "H".)
     Digits.base = 16 can be executed; Number.val = Digits.val has to wait until Digits.val is known.
  2. Production 4 (Digits<base, val> = Digits1<base, val> hex<val>.)
     Digits1.base = Digits.base = 16; Digits.val has to wait until Digits1.val is known.
  3. Production 3 (Digits<base, val> = hex<val>.)
     Digits.val = hex.val = 1; the context condition holds (base 16, 0 <= 1 <= 15).
The tree is traversed bottom-up
  4. Production 4
     Digits.val = Digits1.val * Digits.base + hex.val = 1 * 16 + 11 = 27; the context condition holds (base 16, 0 <= 11 <= 15).
  5. Production 2
     Number.val = Digits.val = 27.
All attributes have been computed: end of the tree traversal.
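The two sweeps of this walk-through can be reproduced on an explicit tree. The Python sketch below is an illustration only: nodes are plain dictionaries, build_tree replaces a real scanner and parser, and the function names are hypothetical. The top-down loop applies the rules for the inherited attribute base, the bottom-up loop applies the rules and context conditions for the synthesized attribute val.

```python
def build_tree(text):
    """Build the syntax tree for Number = Digits | Digits "H" (no attributes yet)."""
    is_hex = text.endswith("H")
    chars = text[:-1] if is_hex else text
    if not chars:
        raise SyntaxError("digit expected")
    digits = None
    for ch in chars:
        hex_leaf = {"val": int(ch, 16)}    # the scanner fills terminal attributes
        # Digits = hex (sub is None)  or  Digits = Digits1 hex (sub is Digits1)
        digits = {"sub": digits, "hex": hex_leaf, "base": None, "val": None}
    return {"digits": digits, "is_hex": is_hex, "val": None}

def attribute(number):
    """Decorate the tree: base flows down, val flows up. Returns Number.val."""
    # productions 1 and 2: Digits.base = 10 or 16
    number["digits"]["base"] = 16 if number["is_hex"] else 10
    # top-down sweep, production 4: Digits1.base = Digits.base
    chain = []
    node = number["digits"]
    while node is not None:
        chain.append(node)
        if node["sub"] is not None:
            node["sub"]["base"] = node["base"]
        node = node["sub"]
    # bottom-up sweep, productions 3 and 4 with their context conditions
    for node in reversed(chain):
        h = node["hex"]["val"]
        if not 0 <= h < node["base"]:
            raise ValueError("context condition violated")
        if node["sub"] is None:
            node["val"] = h                                  # production 3
        else:
            node["val"] = node["sub"]["val"] * node["base"] + h   # production 4
    number["val"] = number["digits"]["val"]   # productions 1 and 2
    return number["val"]
```

For the input 1BH the sweeps compute base = 16 on both Digits nodes and then val = 1 and val = 27, as in the walk-through; the input 1B (without the "H") is syntactically valid but violates the context condition of production 4.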
Definition of ATGs According to Knuth
Attribute Grammar
  ATG = (CFG, A, R, CC)
    CFG ... context-free grammar
    A ... set of attributes
    R ... set of attribute evaluation rules
    CC ... set of context conditions
Context-free Grammar
  CFG = (T, N, P, S)
    T ... terminal symbols
    N ... nonterminal symbols
    P ... productions
    S ... start symbol
Attributes
  A(X) ... attributes of the symbol X (written as X.a, X.b, ...)
  AS(X) ... output attributes of X (synthesized)
  AI(X) ... input attributes of X (inherited)
Attribute evaluation rules
  R(p) ... attribute evaluation rules for production p: X0 = X1 ... Xn
  R(p) = {Xi.a = f(Xj.b, ..., Xk.c)} for all AS of the left-hand side and all AI of the right-hand side of p
Context conditions
  CC(p) ... context conditions of production p: X0 = X1 ... Xn
  in the form of a Boolean expression B(Xi.a, ..., Xj.b)
  check the "static semantics", i.e. whether the input is semantically correct
Complete ATGs
Definition
An ATG is called complete if for all productions p: X = Y1 ... Yn the following condition holds:
all AS(X) and all AI(Yi) are computed in R(p)
Well-defined ATGs (WAGs)
Definition
An ATG is called well-defined (WAG)
• if the ATG is complete and
• if the relations between attributes are non-circular in every possible syntax tree
In other words: We can find an attribute evaluation order for every possible syntax tree.
Example
(Figure: two attributed syntax trees with their attribute dependencies. In the first the dependencies are acyclic, so it is well-defined; in the second they form a cycle, so it is circular and not well-defined.)
Checking for well-definedness is expensive: the circularity test needs exponential time in the worst case. However, there are subclasses of WAG for which this check can be simplified.
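For one given syntax tree, finding an evaluation order amounts to topologically sorting the attribute dependency graph. The Python sketch below shows only this single-tree case (the full well-definedness test must consider every possible tree); the function name evaluation_order and the sample dependency graphs are assumptions, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter, CycleError

def evaluation_order(deps):
    """deps maps each attribute to the set of attributes it depends on.

    Returns one valid evaluation order, or None if the dependencies
    of this particular tree are circular.
    """
    try:
        # static_order lists every attribute after all of its predecessors
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None

# Hypothetical dependencies for one tree: c = f(a), e = f(d, a), b = f(d, f),
# where d is computed below B from c and f is computed below C from e.
acyclic = {"c": {"a"}, "d": {"c"}, "e": {"a", "d"}, "f": {"e"}, "b": {"d", "f"}}

# Hypothetical circular case: x needs y and y needs x.
circular = {"x": {"y"}, "y": {"x"}}
```

For the acyclic graph any returned order lists a before c and both d and f before b; for the circular graph the function returns None.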
Ordered ATGs (OAGs)
Definition
An ATG is called ordered (OAG) if a fixed attribute evaluation order can be specified for every production, regardless of its context in the syntax tree.
Attributation code of a production p can be specified by the following operations:
  comp i ... execute attribute evaluation rule i from R(p)
  up ... go to the father in the syntax tree
  down i ... go to son i in the syntax tree
Example
(Figure: node A with the attributes a1, a2, a3, a4 and the sons B with b1, b2 and C with c1, c2.)
Attributation code
  comp (b1 = a1)
  down B
  comp (a3 = b2)
  up
  comp (c1 = a2)
  down C
  comp (a4 = c2)
  up
Counter-Example
ATG which is not ordered
(Figure: the same production, node A with the attributes a1, a2, a3, a4 and the sons B with b1, b2 and C with c1, c2, shown in two different contexts.)
Attributation code in the first context
  comp (b1 = a1)
  down B
  comp (a3 = b2)
  up
  comp (c1 = a2)
  down C
  comp (a4 = c2)
  up
Attributation code in the second context
  comp (c1 = a2)
  down C
  comp (a4 = c2)
  up
  comp (b1 = a1)
  down B
  comp (a3 = b2)
  up
The attributation order depends on where A occurs in the syntax tree.
Checking whether a grammar is an OAG can be done in polynomial time.
L-Attributed ATGs (LAGs)
Definition
An ATG is called L-attributed (LAG) if all attributes in the syntax tree can be evaluated in a single sweep (down and up, left to right).
In other words: if the attributes can be computed during syntax analysis.
For LAGs it is not even necessary to build a syntax tree (this corresponds to our procedural ATGs).
Example
(Figure: node A with the attributes a1, a2; its sons B with b1, b2 and C with c1, c2; below them D with d1, d2 and E with e1, e2. The attribute dependencies run from left to right only.)
Information from earlier parts of a program can be propagated to later parts, but not vice versa.
Relations Between Classes of ATGs
  LAG ⊂ OAG ⊂ WAG ⊂ ATG
In other words
• every LAG is ordered
• every OAG is well-defined
WAG is more powerful than OAG.
OAG is more powerful than LAG.