280 likes | 290 Views
This text explores the concept of syntax-directed translation in compiler design, including attribute grammars, semantic rules, and the traversal algorithm.
E N D
The Structure of our Compiler Revisited Lexical analyzer Syntax-directedtranslator Characterstream Tokenstream Javabytecode Lex specification Yacc specificationwith semantic rules JVM specification
Syntax-Directed Definitions • A syntax-directed definition (or attribute grammar) binds a set of semantic rules to productions • Terminals and nonterminals have attributes holding values set by the semantic rules • A depth-first traversal algorithm traverses the parse tree thereby executing semantic rules to assign attribute values • After the traversal is complete the attributes contain the translated form of the input
Example Attribute Grammar Production Semantic Rule L EnE E1+TE TT T1*FT FF (E) F digit print(E.val)E.val:= E1.val + T.valE.val:= T.valT.val:= T1.val * F.valT.val:= F.valF.val:= E.valF.val:=digit.lexval Note: all attributes inthis example are ofthe synthesized type
Example Annotated Parse Tree L E.val= 16 E.val= 14 T.val = 2 F.val = 5 E.val= 9 T.val= 5 T.val = 9 F.val = 5 F.val = 9 Note: all attributes inthis example are ofthe synthesized type 9 + 5 + 2 n
Annotating a Parse Tree With Depth-First Traversals procedure visit(n : node);begin for each child m of n, from left to right dovisit(m); evaluate semantic rules at node nend
Depth-First Traversals (Example) L print(16) E.val= 16 E.val= 14 T.val = 2 F.val = 5 E.val= 9 T.val= 5 T.val = 9 F.val = 5 F.val = 9 Note: all attributes inthis example are ofthe synthesized type 9 + 5 + 2 n
Attributes • Attribute values typically represent • Numbers (literal constants) • Strings (literal constants) • Memory locations, such as a frame index of a local variable or function argument • A data type for type checking of expressions • Scoping information for local declarations • Intermediate program representations
Synthesized Versus Inherited Attributes • Given a productionAthen each semantic rule is of the formb := f(c1,c2,…,ck)where f is a function and ci are attributes of A and , and either • b is a synthesized attribute of A • b is an inherited attribute of one of the grammar symbols in
Synthesized Versus Inherited Attributes (cont’d) inherited Production Semantic Rule D T LT int…L id L.in:= T.typeT.type:=‘integer’…… := L.in synthesized
S-Attributed Definitions • A syntax-directed definition that uses synthesized attributes exclusively is called an S-attributed definition (or S-attributed grammar) • A parse tree of an S-attributed definition is annotated with a single bottom-up traversal • Yacc/Bison only support S-attributed definitions
Dependency Graphs with Cycles? • Edges in the dependency graph determine the evaluation order for attribute values • Dependency graphs cannot be cyclic A.a := f(X.x)X.x := f(Y.y)Y.y := f(A.a) A.a X.x Y.y Error: cyclic dependence
Example Annotated Parse Tree D T.type = ‘real’ L.in = ‘real’ real L.in = ‘real’ , id3.entry L.in = ‘real’ , id2.entry id1.entry
Example Annotated Parse Tree with Dependency Graph D T.type = ‘real’ L.in = ‘real’ real L.in = ‘real’ , id3.entry L.in = ‘real’ , id2.entry id1.entry
Evaluation Methods • Parse-tree methods determine an evaluation order from a topological sort of the dependence graph constructed from the parse tree for each input • Rule-base methods the evaluation order is pre-determined from the semantic rules • Oblivious methods the evaluation order is fixed and semantic rules must be (re)written to support the evaluation order (for example S-attributed definitions)
Concrete and Abstract Syntax Trees • A parse tree is called a concrete syntax tree • An abstract syntax tree (AST) is defined by the compiler writer as a more convenient intermediate representation E + E + T id * T T * id id id id id Concrete syntax tree Abstract syntax tree
The Structure of our Compiler Revisited Syntax-directedstatic checker Lexical analyzer Characterstream Tokenstream Javabytecode Syntax-directedtranslator Lex specification Yacc specification JVM specification Typechecking Codegeneration
Static versus Dynamic Checking • Static checking: the compiler enforces programming language’s static semantics • Program properties that can be checked at compile time • Dynamic semantics: checked at run time • Compiler generates verification code to enforce programming language’s dynamic semantics
Static Checking • Typical examples of static checking are • Type checks • Flow-of-control checks • Uniqueness checks • Name-related checks
Type Checking, Overloading, Coercion, Polymorphism class X { virtual int m(); } *x; class Y: public X { virtual int m(); } *y; int op(int), op(float); int f(float); int a, c[10], d; d = c + d; // FAIL *d = a; // FAIL a = op(d); // OK: static overloading (C++) a = f(d); // OK: coersion of d to float a = x->m(); // OK: dynamic binding (C++) vector<int> v; // OK: template instantiation
Flow-of-Control Checks myfunc(){ … break; // ERROR } myfunc(){ … switch (a) { case 0: … break; // OK case 1: … } } myfunc(){ … while (n) { … if (i>10) break; // OK } }
Uniqueness Checks myfunc(){ int i, j, i; // ERROR … } cnufym(int a, int a) // ERROR { … } struct myrec{ int name; }; struct myrec // ERROR{ int id; };
Name-Related Checks LoopA: for (int I = 0; I < n; I++) { … if (a[I] == 0) break LoopB; // Java labeled loop … }
One-Pass versus Multi-Pass Static Checking • One-pass compiler: static checking in C, Pascal, Fortran, and many other languages is performed in one pass while intermediate code is generated • Influences design of a language: placement constraints • Multi-pass compiler: static checking in Ada, Java, and C# is performed in a separate phase, sometimes by traversing a syntax tree multiple times
Type Expressions • Type expressions are used in declarations and type casts to define or refer to a type • Primitive types, such as int and bool • Type constructors, such as pointer-to, array-of, records and classes, templates, and functions • Type names, such as typedefs in C and named types in Pascal, refer to type expressions
Graph Representations for Type Expressions int *f(char*,char*) fun fun args pointer args pointer pointer pointer int pointer int char char char Tree forms DAGs
Cyclic Graph Representations Source program struct Node{ int val; struct Node *next; }; struct next val pointer int Internal compiler representation of the Node type: cyclic graph