500 likes | 706 Views
Chap. 7, Syntax-Directed Compilation. J. H. Wang Dec. 6, 2011. Outline. Overview Bottom-Up Syntax-Directed Translation Top-Down Syntax-Directed Translation Abstract Syntax Trees AST Design and Construction AST Structures for Left and Right Values Design Patterns for ASTs. Overview.
E N D
Chap. 7, Syntax-Directed Compilation J. H. Wang Dec. 6, 2011
Outline • Overview • Bottom-Up Syntax-Directed Translation • Top-Down Syntax-Directed Translation • Abstract Syntax Trees • AST Design and Construction • AST Structures for Left and Right Values • Design Patterns for ASTs
Overview • Syntax-directed translation • The work performed by a compiler while parsing • Syntactic actions • Semantic actions
Semantic Actions and Values • Semantic actions • Associated code sequence that will execute when the production is applied • Semantic values • For production A -> X1…Xn, a semantic value for each symbol • Terminals: values originate from the scanner • Nonterminals: to compute a value for A based on the values assigned to X1…Xn
Synthesized and Inherited Attributes • Synthesized attributes • Attributes flow from the leaves of a derivation tree toward its root • Ex.: evaluating expressions (Fig. 7.1) • Inherited attributes • Attribute values pass from parent to child • Ex.: counting the position of each x in a string (Fig. 7.2)
Bottom-Up Syntax-Directed Translation • A->X1…Xn • For yacc and Bison • Xi: $i • A: $0 • For JavaCUP • X:val • Two stacks in bottom-up parser • Syntactic stack (parse stack): to manipulate terminal and nonterminal symbols • Semantic stack: to manipulate semantic values of these symbols
Example • Ex.: 4 3 1 $ • (Fig. 7.3) • Semantic values for nonterminal symbols: computed by semantic actions • Semantic values for terminal symbols: established by the scanner • Ex.: o 4 3 1 $ • Base-8 (octal): (Fig. 7.4) • Problem: the information required at a semantic action is not available from below • Semantic actions allowed only on reductions
Rule Cloning • A similar sequence of input symbols should be treated differently depending on the context • Ex.: (Fig. 7.5) • Redundancy in productions
Forcing Semantic Actions • Introducing unit productions of the form AX • Semantic actions can be associated with the reduction of AX • If a semantic action is desired between two symbols Xm and Xn, • a production of the form A can be introduced • Ex.: (Fig. 7.6)
Another example for assigning the base with the symbol x • (Fig. 7.7)
Aggressive Grammar Restructuring • Reasons to avoid using global variables • Grammar rules are often invoked recursively, and the global variables can introduce unwanted interactions • Global variables can make semantic actions difficult to write and maintain • Global variables may require setting or resetting
More robust solution • Sketch the parse tree without global variables • Revise the grammar to achieve the desired parse tree • Verify the revised grammar is still suitable for parser construction (e.g. LALR(1)) • Verify the revised grammar still generates the same language • (Fig. 7.8) • Keep the base in the semantic values • Propagate the value up the parse tree
Top-Down Syntax-Directed Translation • Using the recursive-descent parsers • Semantic actions can be written directly into the parser • Ex.: Lisp-like expressions (Fig. 7.9) • ( plus 31 ( prod 10 2 20 ) ) $ • Inherited values: parameters passed into a method • Synthesized values: returned by methods • (Fig. 7.10)
Abstract Syntax Trees • The central data structure for all post-parsing activities • AST must be concise • AST must be sufficiently flexible • Concrete vs. abstract trees • (Fig. 7.3 & 7.4) • (Fig. 7.11)
An Efficient AST Data Structure • Considering • AST is typically constructed bottom-up • Lists of siblings are typically generated by recursive rules • Some AST nodes have a fixed number of children, but some may require an arbitrarily large number of children • (Fig. 7.12)
Each node is of constant size • Each node points to its next (right) sibling, and also its leftmost sibling • Each node n points to its leftmost child • Each node points to its parent
Infrastructure for Creating ASTs • Methods to create and manage AST nodes (Fig. 7.13) • MakeNode(t) • MakeNode(int n), MakeNode(Symbol s), MakeNode(Operator o), MakeNode() • X.MakeSiblings(y) • X.AdoptChildren(y) • MakeFamily(op, kid1, … kidn)
AST Design and Construction • Important forces that influence the design of an AST • It should be possible to unparse (i.e. reconstitute) an AST • AST must hold sufficient information • The implementation of an AST should be decoupled from the essential information represented within the AST • Different views from different phases of a compiler
Process of the design of an appropriate AST structure • An unambiguous grammar for L is devised • An AST for L is devised • Semantic actions are placed in the grammar to construct the AST • Passes of the compiler are designed • Ex. (Fig. 7.14)
Design • Assignment statements • Fig. 7.15(a) • If statements • Fig. 7.15(c) • While statements • Fig. 7.15(d) • Block of statements • Fig. 7.15(e) • Plus operations • Fig. 7.15(b)
Construction • Semantic actions are added into the parse to construct the AST structure • (Fig. 7.17) • Example: (Fig. 7.18 & 7.19) • if (x + y) { while (z) z = z +1 od; x = 8 }else z = 7 fi $
AST Structure for Left and Right Values • Identifier • Value • Location (address) • Ex: x=y • y: value of y (right value, or R-value) • x: location of x (left value, or L-value) • Ex: • *p = 0; • x = &p;
Given the location of an identifier, we can always find the value by indirection • But we cannot find an identifier’s location from its value • We always regard an identifier as representing its location • Explicit deref nodes will be placed to show exactly where indirection occurs • (Fig. 7.20, 7.21, & 7.22)
Design Patterns for ASTs • (omitted)