280 likes | 286 Views
This article explores syntax-directed translation in the context of compiler design, discussing lexical analyzers, token streams, and Java bytecode. It also covers attribute grammars, semantic rules, and the traversal algorithm used to assign attribute values to parse tree nodes.
E N D
The Structure of our Compiler Revisited Lexical analyzer Syntax-directedtranslator Characterstream Tokenstream Javabytecode Lex specification Yacc specificationwith semantic rules JVM specification
Syntax-Directed Definitions • A syntax-directed definition (or attribute grammar) binds a set of semantic rules to productions • Terminals and nonterminals have attributes holding values set by the semantic rules • A depth-first traversal algorithm traverses the parse tree thereby executing semantic rules to assign attribute values • After the traversal is complete the attributes contain the translated form of the input
Example Attribute Grammar Production Semantic Rule L EnE E1+TE TT T1*FT FF (E) F digit print(E.val)E.val:= E1.val + T.valE.val:= T.valT.val:= T1.val * F.valT.val:= F.valF.val:= E.valF.val:=digit.lexval Note: all attributes inthis example are ofthe synthesized type
Example Annotated Parse Tree L E.val= 16 E.val= 14 T.val = 2 F.val = 5 E.val= 9 T.val= 5 T.val = 9 F.val = 5 F.val = 9 Note: all attributes inthis example are ofthe synthesized type 9 + 5 + 2 n
Annotating a Parse Tree With Depth-First Traversals procedure visit(n : node);begin for each child m of n, from left to right dovisit(m); evaluate semantic rules at node nend
Depth-First Traversals (Example) L print(16) E.val= 16 E.val= 14 T.val = 2 F.val = 5 E.val= 9 T.val= 5 T.val = 9 F.val = 5 F.val = 9 Note: all attributes inthis example are ofthe synthesized type 9 + 5 + 2 n
Attributes • Attribute values typically represent • Numbers (literal constants) • Strings (literal constants) • Memory locations, such as a frame index of a local variable or function argument • A data type for type checking of expressions • Scoping information for local declarations • Intermediate program representations
Synthesized Versus Inherited Attributes • Given a productionAthen each semantic rule is of the formb := f(c1,c2,…,ck)where f is a function and ci are attributes of A and , and either • b is a synthesized attribute of A • b is an inherited attribute of one of the grammar symbols in
Synthesized Versus Inherited Attributes (cont’d) inherited Production Semantic Rule D T LT int…L id L.in:= T.typeT.type:=‘integer’…… := L.in synthesized
S-Attributed Definitions • A syntax-directed definition that uses synthesized attributes exclusively is called an S-attributed definition (or S-attributed grammar) • A parse tree of an S-attributed definition is annotated with a single bottom-up traversal • Yacc/Bison only support S-attributed definitions
Dependency Graphs with Cycles? • Edges in the dependency graph determine the evaluation order for attribute values • Dependency graphs cannot be cyclic A.a := f(X.x)X.x := f(Y.y)Y.y := f(A.a) A.a X.x Y.y Error: cyclic dependence
Example Annotated Parse Tree D T.type = ‘real’ L.in = ‘real’ real L.in = ‘real’ , id3.entry L.in = ‘real’ , id2.entry id1.entry
Example Annotated Parse Tree with Dependency Graph D T.type = ‘real’ L.in = ‘real’ real L.in = ‘real’ , id3.entry L.in = ‘real’ , id2.entry id1.entry
Evaluation Methods • Parse-tree methods determine an evaluation order from a topological sort of the dependence graph constructed from the parse tree for each input • Rule-base methods the evaluation order is pre-determined from the semantic rules • Oblivious methods the evaluation order is fixed and semantic rules must be (re)written to support the evaluation order (for example S-attributed definitions)
Concrete and Abstract Syntax Trees • A parse tree is called a concrete syntax tree • An abstract syntax tree (AST) is defined by the compiler writer as a more convenient intermediate representation E + E + T id * T T * id id id id id Concrete syntax tree Abstract syntax tree
The Structure of our Compiler Revisited Syntax-directedstatic checker Lexical analyzer Characterstream Tokenstream Javabytecode Syntax-directedtranslator Lex specification Yacc specification JVM specification Typechecking Codegeneration
Static versus Dynamic Checking • Static checking: the compiler enforces programming language’s static semantics • Program properties that can be checked at compile time • Dynamic semantics: checked at run time • Compiler generates verification code to enforce programming language’s dynamic semantics
Static Checking • Typical examples of static checking are • Type checks • Flow-of-control checks • Uniqueness checks • Name-related checks
Type Checking, Overloading, Coercion, Polymorphism class X { virtual int m(); } *x; class Y: public X { virtual int m(); } *y; int op(int), op(float); int f(float); int a, c[10], d; d = c + d; // FAIL *d = a; // FAIL a = op(d); // OK: static overloading (C++) a = f(d); // OK: coersion of d to float a = x->m(); // OK: dynamic binding (C++) vector<int> v; // OK: template instantiation
Flow-of-Control Checks myfunc(){ … break; // ERROR } myfunc(){ … switch (a) { case 0: … break; // OK case 1: … } } myfunc(){ … while (n) { … if (i>10) break; // OK } }
Uniqueness Checks myfunc(){ int i, j, i; // ERROR … } cnufym(int a, int a) // ERROR { … } struct myrec{ int name; }; struct myrec // ERROR{ int id; };
Name-Related Checks LoopA: for (int I = 0; I < n; I++) { … if (a[I] == 0) break LoopB; // Java labeled loop … }
One-Pass versus Multi-Pass Static Checking • One-pass compiler: static checking in C, Pascal, Fortran, and many other languages is performed in one pass while intermediate code is generated • Influences design of a language: placement constraints • Multi-pass compiler: static checking in Ada, Java, and C# is performed in a separate phase, sometimes by traversing a syntax tree multiple times
Type Expressions • Type expressions are used in declarations and type casts to define or refer to a type • Primitive types, such as int and bool • Type constructors, such as pointer-to, array-of, records and classes, templates, and functions • Type names, such as typedefs in C and named types in Pascal, refer to type expressions
Graph Representations for Type Expressions int *f(char*,char*) fun fun args pointer args pointer pointer pointer int pointer int char char char Tree forms DAGs
Cyclic Graph Representations Source program struct Node{ int val; struct Node *next; }; struct next val pointer int Internal compiler representation of the Node type: cyclic graph