Semantic Analysis (Symbol Table and Type Checking) • Chapter 5
The Compiler So Far • Lexical analysis • Detects inputs with illegal tokens • Parsing • Detects inputs with ill-formed parse trees • Semantic analysis (contextual Analysis) • Catches all remaining errors
What’s Wrong? • Example 1: int y = x + 3; • Example 2: String y = "abc"; y++;
Why a Separate Semantic Analysis? • Parsing cannot catch some errors • Some language constructs are not context-free • Example: every variable that is used must have been declared (i.e., be in scope) • ex: { int x; { .. { .. x .. } .. } .. } • Example: a method must be invoked with arguments of the proper number and types (i.e., typing) • ex: int f(int, int) {…} called as f('a', 2.3, 1)
More problems require semantic analysis • Is x a scalar, an array, or a function? • Is x declared before it is used? • Is x defined before it is used? • Are any names declared but not used? • Which declaration of x does this reference? • Is an expression type-consistent? • Does the dimension of a reference match the declaration? • Where can x be stored? (heap, stack, . . . ) • Does *p reference the result of a malloc()? • Is an array reference in bounds? • Does function foo produce a constant value?
Why is semantic analysis hard? • need non-local information • answers depend on values, not on syntax • answers may involve computation
Symbol Tables • Symbol tables are also called environments • They map IDs to types and locations • Definition: insert the ID into the table • Use: look up the ID • Scope • Where the IDs are "visible" • Ex: formal parameters and local variables in MiniJava -> inside the method where they are defined • (private) variables in a class -> inside the class • (public) methods -> visible anywhere (unless overridden)
Environments • A set of bindings ( -> )
                                     // Initial Env s0
class C {
  int a; int b; int c;               // Env s1 = s0 + {a -> int, b -> int, c -> int}
  public void m() {
    System.out.println(a+c);
    int j = a+b;                     // Env s2 = s1 + {j -> int}
    String a = "hello";              // Env s3 = s2 + {a -> String}
    System.out.println(a);
Environments (Cont’d)
                                     // Env s3 = s2 + {a -> String}
    System.out.println(a);
    System.out.println(a);
    System.out.println(a);
  }                                  // back to Env s1
}                                    // back to Env s0
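The environments s0 to s3 above can be modeled as immutable values, where extending an environment never changes the one it was built from. The following is a minimal sketch under that assumption; the names Env, extend, and lookup are hypothetical and types are kept as plain strings for brevity.

```java
import java.util.Optional;

// Hypothetical immutable environment: each extension creates a new node
// that points at the environment it extends, which stays unchanged.
final class Env {
    private final String name;  // bound identifier (null only for EMPTY)
    private final String type;  // its type, kept as a string for brevity
    private final Env outer;    // the environment being extended

    static final Env EMPTY = new Env(null, null, null);

    private Env(String name, String type, Env outer) {
        this.name = name;
        this.type = type;
        this.outer = outer;
    }

    // s + {name -> type}: returns a new environment, leaves this one intact.
    Env extend(String name, String type) {
        return new Env(name, type, this);
    }

    // The most recent binding wins, so inner declarations shadow outer ones.
    Optional<String> lookup(String id) {
        for (Env e = this; e != EMPTY; e = e.outer) {
            if (e.name.equals(id)) return Optional.of(e.type);
        }
        return Optional.empty();
    }
}

class EnvDemo {
    public static void main(String[] args) {
        Env s0 = Env.EMPTY;
        Env s1 = s0.extend("a", "int").extend("b", "int").extend("c", "int");
        Env s2 = s1.extend("j", "int");
        Env s3 = s2.extend("a", "String");
        System.out.println(s3.lookup("a").get());       // String (shadows the field)
        System.out.println(s1.lookup("a").get());       // int (s1 is untouched)
        System.out.println(s1.lookup("j").isPresent()); // false
    }
}
```

Leaving a scope then costs nothing: the checker simply goes back to using s1 or s0.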
Implementing Environments • Functional Style • Keep the previous env and create a new one • To restore, discard the new one and go back to the old one • Imperative Style • Destructively update the env (symbol table) • Undo: needs an "undo stack"
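The imperative style can be sketched as one mutable map plus an undo stack that remembers what each update overwrote. This is only an illustration: ScopedTable and its method names are hypothetical, types are plain strings, and endScope assumes a matching beginScope was called.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical imperative symbol table: destructive updates plus an undo stack.
class ScopedTable {
    // A key together with the binding it had before being overwritten
    // (oldType is null if the key was previously unbound).
    private static final class Undo {
        final String key;
        final String oldType;
        Undo(String key, String oldType) { this.key = key; this.oldType = oldType; }
    }

    // Sentinel pushed by beginScope to mark where a scope starts.
    private static final Undo MARK = new Undo(null, null);

    private final Map<String, String> table = new HashMap<>();
    private final Deque<Undo> undo = new ArrayDeque<>();

    void beginScope() { undo.push(MARK); }

    // Destructively binds id, remembering the previous binding for undo.
    void put(String id, String type) {
        undo.push(new Undo(id, table.put(id, type)));
    }

    String get(String id) { return table.get(id); }

    // Pops undo records back to the most recent marker, restoring old bindings.
    void endScope() {
        while (true) {
            Undo u = undo.pop();
            if (u == MARK) break;
            if (u.oldType == null) table.remove(u.key);
            else table.put(u.key, u.oldType);
        }
    }
}

class ScopedTableDemo {
    public static void main(String[] args) {
        ScopedTable t = new ScopedTable();
        t.put("a", "int");
        t.beginScope();
        t.put("a", "String");
        t.put("j", "int");
        System.out.println(t.get("a")); // String
        t.endScope();
        System.out.println(t.get("a")); // int
        System.out.println(t.get("j")); // null
    }
}
```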
Multiple Symbol Tables : ML-style
structure M = struct
  structure E = struct
    val a = 5
  end
  structure N = struct          (* checked in s0 + s2 *)
    val b = 10
    val a = E.a + b
  end
  structure D = struct          (* checked in s0 + s2 + s4 *)
    val d = E.a + N.a
  end
end                             (* result: s7 *)
Initial Env s0 • s1 = {a -> int} • s2 = {E -> s1} • s3 = {b -> int, a -> int} • s4 = {N -> s3} • s5 = {d -> int} • s6 = {D -> s5} • s7 = s2 + s4 + s6
Multiple Symbol Tables : Java-style
package M;
class E { static int a = 5; }                             // checked in s7
class N { static int b = 10; static int a = E.a + b; }    // checked in s7
class D { static int d = E.a + N.a; }                     // checked in s7
Initial Env s0 • s1 = {a -> int} • s2 = {E -> s1} • s3 = {b -> int, a -> int} • s4 = {N -> s3} • s5 = {d -> int} • s6 = {D -> s5} • s7 = s2 + s4 + s6 • Java allows forward references, so every class is checked in the full environment s7
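Implementation – Imperative Symbol Table (inefficient nondestructive update) • Using a hash table • Update: s’ = s + {d |-> t4} • Undo: remove the binding for d • [Figure: hash table holding the bindings a |-> t1, b |-> t3, c |-> t2 and the new binding d |-> t4] • See Appel, Program 5.2 (p. 106)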
Implementation – Functional Symbol Table • Efficient functional approach • s’ = s + {a |-> t} returns a new table [s + {a |-> t}] and leaves s unchanged • If implemented with a Hashtable, we would have to copy O(n) buckets for each scope • Is this a good idea?
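Implementation - Tree • m1 = { bat |-> 1, camel |-> 2, dog |-> 3 } • m2 = {m1 + emu |-> 42} • How could this be implemented? • Want m2 from m1 without copying all n nodes: only the nodes on the path to the new entry are copied, which is O(log n) when the tree is balanced • [Figure: binary search tree m1 with root dog |-> 3, left child bat |-> 1, and camel |-> 2 below bat; m2 has a new copy of the root dog |-> 3 with right child emu |-> 42 and shares the bat/camel subtree with m1]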
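The tree idea can be sketched as a persistent binary search tree in which an insertion copies only the nodes on the path from the root to the new entry. PTree is a hypothetical name, the tree is deliberately left unbalanced to keep the sketch short, and null stands for the empty tree.

```java
// Hypothetical persistent (immutable) binary search tree from names to ints.
final class PTree {
    final String key;
    final int value;
    final PTree left, right;

    PTree(String key, int value, PTree left, PTree right) {
        this.key = key;
        this.value = value;
        this.left = left;
        this.right = right;
    }

    // Returns a new tree containing key |-> value; the tree t is not modified.
    // Only the nodes on the search path are copied, all others are shared.
    static PTree insert(PTree t, String key, int value) {
        if (t == null) return new PTree(key, value, null, null);
        int c = key.compareTo(t.key);
        if (c < 0) return new PTree(t.key, t.value, insert(t.left, key, value), t.right);
        if (c > 0) return new PTree(t.key, t.value, t.left, insert(t.right, key, value));
        return new PTree(key, value, t.left, t.right); // rebinding an existing key
    }

    // Returns the bound value, or null if key is unbound.
    static Integer lookup(PTree t, String key) {
        while (t != null) {
            int c = key.compareTo(t.key);
            if (c == 0) return t.value;
            t = (c < 0) ? t.left : t.right;
        }
        return null;
    }
}

class PTreeDemo {
    public static void main(String[] args) {
        PTree m1 = PTree.insert(PTree.insert(PTree.insert(null, "dog", 3), "bat", 1), "camel", 2);
        PTree m2 = PTree.insert(m1, "emu", 42);
        System.out.println(PTree.lookup(m2, "emu")); // 42
        System.out.println(PTree.lookup(m1, "emu")); // null: m1 is unchanged
        System.out.println(m2.left == m1.left);      // true: bat/camel subtree is shared
    }
}
```

Guaranteeing logarithmic depth would need a balanced variant (for example a red-black tree), which this sketch omits.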
Symbols vs. Strings as table keys • Symbol: a wrapper for Strings • Symbol representation • Comparing symbols for equality is fast • Extracting an integer hash key is fast • Comparing two symbols for "greater-than" is fast • Properties: for Symbols s1, s2: s1 == s2 iff s1.equals(s2) iff s1.string == s2.string
public class Symbol {
  public String toString();
  public static Symbol getSymbol(String n);
}
symbol.Symbol
import java.util.Hashtable;
import java.util.Map;

public class Symbol {
  public String name;
  // Symbol cannot be constructed directly
  private Symbol(String n) { name = n; }
  public String toString() { return name; }
  // One Symbol per distinct (interned) string
  private static Map map = new Hashtable();
  public static Symbol getSymbol(String n) { // or symbol(..) in book
    String u = n.intern();
    Symbol s = (Symbol) map.get(u);
    if (s == null) {
      s = new Symbol(u);
      map.put(u, s);
    }
    return s;
  }
}
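The benefit of interning is that two symbols for the same name are the same object, so equality is a pointer comparison. A small self-contained illustration follows; Sym is a hypothetical generic-typed variant of the Symbol class, not the book's code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant of Symbol using a typed map.
final class Sym {
    private static final Map<String, Sym> pool = new HashMap<>();
    private final String name;

    private Sym(String name) { this.name = name; }

    // Returns the unique Sym for this name, creating it on first use.
    static Sym get(String name) {
        return pool.computeIfAbsent(name, Sym::new);
    }

    @Override public String toString() { return name; }
}

class SymDemo {
    public static void main(String[] args) {
        Sym a = Sym.get("x");
        Sym b = Sym.get(new String("x")); // a distinct String object, same contents
        System.out.println(a == b);          // true: one Sym per name
        System.out.println(a == Sym.get("y")); // false
    }
}
```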
Symbol Table Implementation (efficient destructive update) • Using a hash table • [Figure: hash table whose buckets hold chains of Binder objects for the bindings of a, b, and c; top: Symbol; marker: Binder]
Some sample program(I)
/**
 * The Table class is similar to java.util.Dictionary,
 * except that each key must be a Symbol and there is
 * a scope mechanism.
 */
public class Table {
  private java.util.Dictionary dict = new java.util.Hashtable();
  private Symbol top;    // most recently bound symbol in the current scope
  private Binder marks;  // stack of scope markers
  public Table() {}
Some sample program(II)
  /**
   * Gets the object associated with the specified
   * symbol in the Table.
   */
  public Object get(Symbol key) {
    Binder e = (Binder) dict.get(key);
    if (e == null) return null;
    else return e.value;
  }
  /**
   * Puts the specified value into the Table,
   * bound to the specified Symbol.
   */
  public void put(Symbol key, Object value) {
    dict.put(key, new Binder(value, top, (Binder) dict.get(key)));
    top = key;
  }
Some sample program(III)
  /**
   * Remembers the current state of the Table.
   */
  public void beginScope() {
    marks = new Binder(null, top, marks);
    top = null;
  }
  /**
   * Restores the table to what it was at the most recent
   * beginScope that has not already been ended.
   */
  public void endScope() {
    while (top != null) {
      Binder e = (Binder) dict.get(top);
      if (e.tail != null) dict.put(top, e.tail);
      else dict.remove(top);
      top = e.prevtop;
    }
    top = marks.prevtop;
    marks = marks.tail;
  }
}
Some sample program(IV)
package Symbol;
class Binder {
  Object value;    // the bound value
  Symbol prevtop;  // symbol bound just before this one
  Binder tail;     // binding shadowed by this one, if any
  Binder(Object v, Symbol p, Binder t) {
    value = v; prevtop = p; tail = t;
  }
}
Type-Checking in MiniJava • Binding for type-checking in MiniJava • Variable and formal parameter • Var name <-> type of variable • Method • Method name <-> result type, parameters( including position information), local variables • Class • Class name <-> variables, method declaration, parent class
Symbol Table: example • See Figure 5.7 on page 111 • Primitive types • int -> IntegerType() • boolean -> BooleanType() • Other types • int[] -> IntArrayType() • Class -> IdentifierType(String s)
A MiniJava Program and its symbol table (Figure 5.7)
class B {
  C f; int[] j; int q;
  public int start(int p, int q) { int ret; int a; /* … */ return ret; }
  public boolean stop(int p) { /* … */ return false; }
}
class C { /* … */ }
Symbol table (classes B and C):
B • FIELDS: f C, j int[], q int • METHODS: start int (PARAMS: p int, q int; LOCALS: ret int, a int), stop boolean (PARAMS: p int; LOCALS: ….)
SymbolTable : Real Story
class SymbolTable {
  public SymbolTable();
  public boolean addClass(String id, String parent);
  public Class getClass(String id);
  public boolean containsClass(String id);
  public Type getVarType(Method m, Class c, String id);
  public Method getMethod(String id, String classScope);
  public Type getMethodType(String id, String classScope);
  public boolean compareTypes(Type t1, Type t2);
}
Be careful! • getVarType(Method m, Class c, String id) • In c.m, find variable id • Precedence: • Local variable in the method • Parameter in the parameter list • Variable in the class • Variable in the parent class • getMethod(), getMethodType() • The method may be defined in a parent class • compareTypes() • Primitive types (int, boolean, IntArrayType): must match exactly • IdentifierType: subtyping applies
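The lookup precedence for getVarType can be sketched as follows. This is not the course's SymbolTable: ClassInfo, MethodInfo, and Lookup are hypothetical stand-ins, types are plain strings, and the class hierarchy is assumed to be acyclic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the Class and Method entries of the symbol table.
class ClassInfo {
    final String parent; // parent class name, or null if there is none
    final Map<String, String> fields = new HashMap<>();
    ClassInfo(String parent) { this.parent = parent; }
}

class MethodInfo {
    final Map<String, String> params = new HashMap<>();
    final Map<String, String> locals = new HashMap<>();
}

class Lookup {
    final Map<String, ClassInfo> classes = new HashMap<>();

    // Resolves id in method m of class className; returns null if undeclared.
    String varType(MethodInfo m, String className, String id) {
        if (m != null) {
            if (m.locals.containsKey(id)) return m.locals.get(id); // 1. local variable
            if (m.params.containsKey(id)) return m.params.get(id); // 2. parameter
        }
        String c = className;
        while (c != null) {                                        // 3. class, 4. parents
            ClassInfo info = classes.get(c);
            if (info == null) break; // unknown class: give up
            if (info.fields.containsKey(id)) return info.fields.get(id);
            c = info.parent;
        }
        return null;
    }
}

class LookupDemo {
    public static void main(String[] args) {
        Lookup st = new Lookup();
        ClassInfo a = new ClassInfo(null);
        a.fields.put("h", "boolean");
        ClassInfo b = new ClassInfo("A");
        b.fields.put("a", "boolean");
        b.fields.put("g", "int[]");
        st.classes.put("A", a);
        st.classes.put("B", b);
        MethodInfo m = new MethodInfo();
        m.params.put("p", "int");
        m.locals.put("a", "int");
        System.out.println(st.varType(m, "B", "a")); // int (local hides the field)
        System.out.println(st.varType(m, "B", "h")); // boolean (inherited from A)
        System.out.println(st.varType(m, "B", "z")); // null (undeclared)
    }
}
```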
SymbolTable : Class
class Class {
  public Class(String id, String parent);
  public String getId();
  public Type type();
  public String parent();
  public boolean addMethod(String id, Type type);
  public Method getMethod(String id);
  public boolean containsMethod(String id);
  public boolean addVar(String id, Type type);
  public Variable getVar(String id);
  public boolean containsVar(String id);
}
SymbolTable : Variable
class Variable {
  public Variable(String id, Type type);
  public String id();
  public Type type();
}
SymbolTable : Method
class Method {
  public Method(String id, Type type);
  public String getId();
  public Type type();
  public boolean addParam(String id, Type type);
  public Variable getParamAt(int i);
  public Variable getParam(String id);
  public boolean containsParam(String id);
  public boolean addVar(String id, Type type);
  public Variable getVar(String id);
  public boolean containsVar(String id);
}
Type-Checking : Two Phases • Build Symbol Table • Type-check statements and expressions
public class Main {
  public static void main(String[] args) {
    try {
      Program root = new MiniJavaParser(System.in).Program();
      // Phase 1: collect all declarations
      BuildSymbolTableVisitor v1 = new BuildSymbolTableVisitor();
      v1.visit(root);
      // Phase 2: check statements and expressions against the table
      new TypeCheckVisitor(v1.getSymTab()).visit(root);
    } catch (ParseException e) {
      System.out.println(e.toString());
    }
  }
}
BuildSymbolTableVisitor(); • See Program 5.8 on page 112
public class BuildSymbolTableVisitor extends TypeDepthFirstVisitor {
  ….
  private Class currClass;
  private Method currMethod;
  ……
  // Type t;
  // Identifier i;
  public Type visit(VarDecl n) {
    Type t = visit(n.t);
    String id = n.i.toString();
BuildSymbolTableVisitor(); - Cont’d
    if (currMethod == null) {
      // Not inside a method: the declaration is a field of the class
      if (!currClass.addVar(id, t)) {
        error.complain(id + " is already defined in " + currClass.getId());
      }
    } else {
      // Inside a method: the declaration is a local variable
      if (!currMethod.addVar(id, t)) {
        error.complain(id + " is already defined in "
                       + currClass.getId() + "." + currMethod.getId());
      }
    }
    return null;
  }
BuildSymbolTableVisitor() : TypeVisitor()
public Type visit(MainClass n);
public Type visit(ClassDeclSimple n);
public Type visit(ClassDeclExtends n);
public Type visit(VarDecl n);
public Type visit(MethodDecl n);
public Type visit(Formal n);
public Type visit(IntArrayType n);
public Type visit(BooleanType n);
public Type visit(IntegerType n);
public Type visit(IdentifierType n);
TypeCheckVisitor(SymbolTable); • See Program 5.9 on page 113
package visitor;
import syntaxtree.*;
public class TypeCheckVisitor extends DepthFirstVisitor {
  static Class currClass;
  static Method currMethod;
  static SymbolTable symbolTable;
  public TypeCheckVisitor(SymbolTable s) {
    symbolTable = s;
  }
TypeCheckVisitor(SymbolTable); - Cont’d
  // Identifier i;
  // Exp e;
  public void visit(Assign n) {
    Type t1 = symbolTable.getVarType(currMethod, currClass, n.i.toString());
    Type t2 = n.e.accept(new TypeCheckExpVisitor(symbolTable));
    if (!symbolTable.compareTypes(t1, t2)) {
      error.complain("Type error in assignment to " + n.i.toString());
    }
  }
TypeCheckExpVisitor(SymbolTable)
package visitor;
import syntaxtree.*;
public class TypeCheckExpVisitor extends TypeDepthFirstVisitor {
  // Exp e1,e2;
  public Type visit(Plus n) {
    if (!(n.e1.accept(this) instanceof IntegerType)) {
      error.complain("Left side of Plus must be of type integer");
    }
    if (!(n.e2.accept(this) instanceof IntegerType)) {
      error.complain("Right side of Plus must be of type integer");
    }
    return new IntegerType();
  }
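To see the expression visitor run end to end, here is a self-contained sketch with a tiny AST. Everything in it (Exp, ExpVisitor, Ty, ExpChecker, and the four node classes) is a hypothetical stand-in for the MiniJava syntaxtree and visitor packages, and errors are collected in a list instead of being passed to error.complain.

```java
import java.util.ArrayList;
import java.util.List;

enum Ty { INT, BOOLEAN }

interface Exp { <R> R accept(ExpVisitor<R> v); }

interface ExpVisitor<R> {
    R visit(IntLit n);
    R visit(BoolLit n);
    R visit(Plus n);
    R visit(LessThan n);
}

final class IntLit implements Exp {
    final int value;
    IntLit(int value) { this.value = value; }
    public <R> R accept(ExpVisitor<R> v) { return v.visit(this); }
}

final class BoolLit implements Exp {
    final boolean value;
    BoolLit(boolean value) { this.value = value; }
    public <R> R accept(ExpVisitor<R> v) { return v.visit(this); }
}

final class Plus implements Exp {
    final Exp e1, e2;
    Plus(Exp e1, Exp e2) { this.e1 = e1; this.e2 = e2; }
    public <R> R accept(ExpVisitor<R> v) { return v.visit(this); }
}

final class LessThan implements Exp {
    final Exp e1, e2;
    LessThan(Exp e1, Exp e2) { this.e1 = e1; this.e2 = e2; }
    public <R> R accept(ExpVisitor<R> v) { return v.visit(this); }
}

// Computes the type of each expression bottom-up and records every mismatch.
class ExpChecker implements ExpVisitor<Ty> {
    final List<String> errors = new ArrayList<>();

    public Ty visit(IntLit n) { return Ty.INT; }
    public Ty visit(BoolLit n) { return Ty.BOOLEAN; }

    public Ty visit(Plus n) {
        requireInt(n.e1, "Left side of Plus");
        requireInt(n.e2, "Right side of Plus");
        return Ty.INT; // report the intended type even after an error
    }

    public Ty visit(LessThan n) {
        requireInt(n.e1, "Left side of LessThan");
        requireInt(n.e2, "Right side of LessThan");
        return Ty.BOOLEAN;
    }

    private void requireInt(Exp e, String what) {
        if (e.accept(this) != Ty.INT) errors.add(what + " must be of type integer");
    }
}

class ExpCheckerDemo {
    public static void main(String[] args) {
        // 1 + (2 < true): two errors, one in LessThan and one in Plus
        Exp e = new Plus(new IntLit(1), new LessThan(new IntLit(2), new BoolLit(true)));
        ExpChecker checker = new ExpChecker();
        Ty t = e.accept(checker);
        System.out.println(t);
        checker.errors.forEach(System.out::println);
    }
}
```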
TypeCheckVisitor : Visitor()
public void visit(MainClass n);
public void visit(ClassDeclSimple n);
public void visit(ClassDeclExtends n);
public void visit(MethodDecl n);
public void visit(If n);
public void visit(While n);
public void visit(Print n);
public void visit(Assign n);
public void visit(ArrayAssign n);
TypeCheckExpVisitor() : TypeVisitor()
public Type visit(And n);            // boolean
public Type visit(LessThan n);       // boolean
public Type visit(Plus n);           // int
public Type visit(Minus n);
public Type visit(Times n);
public Type visit(ArrayLookup n);    // int
public Type visit(ArrayLength n);    // int
public Type visit(Call n);           // result type
public Type visit(IntegerLiteral n); // int
public Type visit(True n);           // boolean
public Type visit(False n);
public Type visit(IdentifierExp n);  // symbol table lookup
public Type visit(This n);           // current class
public Type visit(NewArray n);       // int[]
public Type visit(NewObject n);      //
public Type visit(Not n);            // boolean
Overloading of Operators, …. • When operators are overloaded, the compiler must explicitly generate the code for the type conversion. • 2 + 2, 2.0 + 3.4, 2.4 + 4 • "abc" + 4 • For an assignment statement, both sides must have the same type. When class extension is allowed, the right-hand side may instead be a subtype of the left-hand side. • long x = (int) y + 3 • Person p = new Student();
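The assignment rule with class extension can be sketched as a walk up the parent chain of the right-hand side's class. Subtyping, parentOf, and assignable are hypothetical names, types are plain strings, and the hierarchy is assumed to be acyclic; implicit numeric conversions such as int to long are deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

class Subtyping {
    // class name -> parent class name (null for a class without a parent)
    final Map<String, String> parentOf = new HashMap<>();

    static boolean isPrimitive(String t) {
        return t.equals("int") || t.equals("boolean") || t.equals("int[]");
    }

    // True if a value of type rhs may be assigned to a variable of type lhs.
    boolean assignable(String lhs, String rhs) {
        if (lhs.equals(rhs)) return true;                      // same type
        if (isPrimitive(lhs) || isPrimitive(rhs)) return false; // no subtyping here
        // Walk up from rhs's parent looking for lhs.
        for (String c = parentOf.get(rhs); c != null; c = parentOf.get(c)) {
            if (c.equals(lhs)) return true;
        }
        return false;
    }
}

class SubtypingDemo {
    public static void main(String[] args) {
        Subtyping s = new Subtyping();
        s.parentOf.put("Person", null);
        s.parentOf.put("Student", "Person");
        System.out.println(s.assignable("Person", "Student")); // true
        System.out.println(s.assignable("Student", "Person")); // false
        System.out.println(s.assignable("int", "boolean"));    // false
    }
}
```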
Method Calls e.m(…) • Look up the method in the SymbolTable to get its parameter list and result type • Find m in the class of e • The parameter types must be matched against the actual arguments. • The result type becomes the type of the method call as a whole. • Etc, etc, …….
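TypeChecking method call
// Exp e; Identifier i; ExpList el;
public Type visit(Call n) {
  // The receiver must have a class type
  Type rcvType = n.e.accept(this);
  if (!(rcvType instanceof IdentifierType)) error.complain(…);
  // Find the method in the receiver's class
  Method m = symbolTable.getMethod(n.i.toString(), rcvType.toString());
  // getParamSize() is used as on the original slide; the Method class shown earlier does not list it
  if (n.el.size() != m.getParamSize()) error.complain(…);
  // Each actual argument must match the corresponding formal parameter
  for (int i = 0; i < n.el.size(); i++) {
    Type acType = n.el.get(i).accept(this);
    Type fmType = m.getParamAt(i).type();
    if (!symbolTable.compareTypes(acType, fmType)) error.complain(…);
  }
  return m.type();
}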
Error Handling • For a type error or an undeclared identifier, the checker should print an error message. • And it must go on….. • Recovery from type errors? • Proceed as if the program were correct. • Not a big deal in our homework. • Example: • int i = new C(); • int j = i + 1; • We still need to insert i into the symbol table as an integer so that the rest can be type-checked.
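The recovery rule from the example can be sketched as follows: a bad initializer is reported, but the variable is still bound to its declared type so that later uses check cleanly. DeclChecker and its methods are hypothetical, and types are plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DeclChecker {
    final Map<String, String> env = new HashMap<>();
    final List<String> errors = new ArrayList<>();

    // Checks "declared name = <expression of type initType>;".
    void declare(String declared, String name, String initType) {
        if (!declared.equals(initType)) {
            errors.add("Type error in initializer of " + name
                    + ": expected " + declared + " but found " + initType);
        }
        // Recovery: bind the name to its declared type whether or not
        // the initializer was well-typed.
        env.put(name, declared);
    }

    // Type of "id + 1": always int, with an error if id is not a declared int.
    String plusOne(String id) {
        String t = env.get(id);
        if (t == null) errors.add("Undeclared identifier " + id);
        else if (!t.equals("int")) errors.add("Left side of Plus must be of type integer");
        return "int";
    }
}

class DeclCheckerDemo {
    public static void main(String[] args) {
        DeclChecker c = new DeclChecker();
        c.declare("int", "i", "C");              // int i = new C();  -> one error
        c.declare("int", "j", c.plusOne("i"));   // int j = i + 1;    -> no further error
        c.errors.forEach(System.out::println);
        System.out.println(c.errors.size());     // 1
    }
}
```

Without the recovery step, the second declaration would produce a spurious "undeclared identifier" message for i.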