This text discusses semantic analysis and legality checks in programming languages, including disambiguation, name resolution, type resolution, and overload resolution. It also explores attributes and attribute computation in creating a formal model for representing semantic information.
Semantic Analysis
• Legality checks: check that the program obeys all rules of the language that are not described by a context-free grammar
• Disambiguation: name resolution, type resolution, overload resolution
• Expanded intermediate representation: annotate the tree to guide subsequent phases
A formal model: attributes
• Semantic information can be represented by computed values attached to an AST node:
  • For an identifier: the corresponding entity
  • For a static expression: its computed value
  • For a function: the return type
  • For a record: the component names and types
  • For a derived type: its parent type
• All of it is implicit in the original tree. Attributes provide compact, efficient representations.
Attribute Computation
• The value of an attribute at a node can be computed from the values of attributes at immediate neighbor nodes
• The computation is keyed to the production in which the node appears
• Production: Assignment => Var := Lit ;
• Equation: Type(Lit) = Type(Var)
• The type of the literal is inherited from the variable that is the left-hand side of the assignment
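
The equation for the assignment production can be sketched as a small function attached to that production. This is a minimal Python illustration, not how any particular compiler does it; the class names `Var`, `Lit`, `Assignment` and the field `type_name` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Var:
    name: str
    type_name: str  # known from the variable's declaration


@dataclass
class Lit:
    text: str
    type_name: Optional[str] = None  # inherited attribute, unknown until evaluated


@dataclass
class Assignment:
    var: Var
    lit: Lit


def evaluate_assignment(node: Assignment) -> None:
    # Equation keyed to the production  Assignment => Var := Lit ;
    # Type(Lit) = Type(Var): the literal takes its type from its sibling.
    node.lit.type_name = node.var.type_name


stmt = Assignment(Var("X", "Integer"), Lit("15"))
evaluate_assignment(stmt)
print(stmt.lit.type_name)  # prints: Integer
```

The literal carries no type of its own; the value only appears once the equation for the enclosing production has been applied.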
Inherited and synthesized attributes
• Production: N => A B C
• An attribute of non-terminal N that is computed from the attributes of A, B, C is synthesized
• An attribute of A that is computed from an attribute of N is inherited
• An attribute of A that is computed from attributes of B, C is also inherited (“has to go through N to reach A”)
Attribute grammars
• General formalism: define all context-dependent aspects as attributes. Provide equations for each attribute defined for each non-terminal.
• There are no restrictions on dependencies: an attribute can depend on any attribute of other symbols appearing in the production
• Semantic analysis is the computation of all attributes at each node of the AST
• Attribute grammars are universal (Turing-equivalent)
The dream of full automation
• Can define all aspects of the language with attribute grammars
• Given a language for attributes, we can build an attribute evaluator, like a parser generator.
• Attribute grammar + attribute evaluator = automatically generated compiler
• However:
  • Equations may be circular
  • Detecting circularity is exponential
  • In practice, the resulting compiler is too large / slow
• Attributes are a powerful concept, not a universal tool.
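
To make the circularity problem concrete, the sketch below checks the attribute dependencies of one decorated tree, which is a much easier problem than the grammar-level test the slide calls exponential (that test must cover every tree the grammar can derive). It uses Python's standard `graphlib` module; the function name `evaluation_order` and the attribute labels are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def evaluation_order(depends_on):
    """Return an order in which attribute instances can be evaluated.

    depends_on maps each attribute instance to the set of instances its
    equation reads. Returns None when the equations are circular.
    """
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError:
        return None


# Type(Lit) = Type(Var): Lit.type can be computed once Var.type is known.
acyclic = {"Lit.type": {"Var.type"}, "Var.type": set()}
print(evaluation_order(acyclic))   # ['Var.type', 'Lit.type']

# Two equations that read each other cannot be evaluated in any order.
circular = {"A.x": {"B.y"}, "B.y": {"A.x"}}
print(evaluation_order(circular))  # None
```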
Synthesized attributes
• Most useful attributes are synthesized, i.e. computed bottom-up.
• Example: numeric value of a base-2 representation (Str1 is the Bit_String on the left of the production, Str2 the one on the right):
    Bit        => '0'               Val(Bit) = 0
    Bit        => '1'               Val(Bit) = 1
    Bit_String => Bit               Val(Str) = Val(Bit)
    Bit_String => Bit_String Bit    Val(Str1) = 2 * Val(Str2) + Val(Bit)
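
The four equations above translate directly into a recursive, bottom-up evaluator. The following Python sketch is one possible encoding; the classes `Bit` and `BitString` and the helper `parse` are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bit:
    char: str  # '0' or '1'


@dataclass
class BitString:
    prefix: Optional["BitString"]  # None for the production Bit_String => Bit
    bit: Bit


def val_bit(b: Bit) -> int:
    # Bit => '0' gives 0, Bit => '1' gives 1
    return 1 if b.char == "1" else 0


def val_str(s: BitString) -> int:
    if s.prefix is None:
        # Bit_String => Bit
        return val_bit(s.bit)
    # Bit_String => Bit_String Bit
    return 2 * val_str(s.prefix) + val_bit(s.bit)


def parse(text: str) -> BitString:
    """Build the left-recursive tree for a non-empty string of bits."""
    if not text or any(ch not in "01" for ch in text):
        raise ValueError("expected a non-empty string of 0s and 1s")
    tree = None
    for ch in text:
        tree = BitString(tree, Bit(ch))
    return tree


print(val_str(parse("1011")))  # prints: 11
```

Each value depends only on the values computed for the children, which is what makes the attribute synthesized.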
Inherited attributes
• Inherited attributes describe context-dependent properties: visibility, typing.
• Inherited attributes are computed top-down, usually as a separate pass over the AST
• The most important inherited attribute is the visibility environment, aka the symbol table.
• It is typically represented as a global data structure, not as an attribute that is propagated from node to node.
Definitions and uses
• A declaration introduces an entity: X : Integer;
• The node for X is its defining occurrence
• A subsequent occurrence of X in the current scope is a use of X: X := 15;
• The use-occurrence must indicate that this is the X defined above
• The set of defining occurrences constitutes the symbol table.
Attributes of entities
• The defining occurrence is a symbol table entry.
• It holds all useful information about an entity:
  • Type (another entity)
  • Size (numeric value: may be known statically)
  • Scope (another entity)
  • Name (pointer into names table)
  • Homonym (previous entity with same name)
  • Etc. (in GNAT, > 20 assorted fields, described in Einfo)
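
A symbol table entry of this kind can be sketched as a record whose fields point at other entities. The Python below is only an illustration of the idea: the field names, the `declare` helper, and the 32-bit size are assumptions, and it does not reproduce GNAT's actual Einfo layout.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entity:
    name: str
    etype: Optional["Entity"] = None    # Type: another entity
    scope: Optional["Entity"] = None    # Scope: the enclosing entity
    size: Optional[int] = None          # Size in bits, when known statically
    homonym: Optional["Entity"] = None  # previous entity with the same name


# Names table: maps a name to the most recently declared entity with that name.
names_table: Dict[str, Entity] = {}


def declare(name, etype=None, scope=None, size=None) -> Entity:
    # The new entity is linked in front of any earlier entity of the same name.
    ent = Entity(name, etype, scope, size, homonym=names_table.get(name))
    names_table[name] = ent
    return ent


integer = declare("Integer", size=32)  # assumed size, for illustration only
outer = declare("Outer")
x_outer = declare("X", etype=integer, scope=outer, size=32)
inner = declare("Inner", scope=outer)
x_inner = declare("X", etype=integer, scope=inner, size=32)

print(names_table["X"].scope.name)          # prints: Inner
print(names_table["X"].homonym.scope.name)  # prints: Outer
```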
Type entities and their attributes
• Numeric types: low_bound, high_bound
  • Static expressions of related numeric type
• Array types: list of index types, component type
  • Previously declared entities
  • Index bounds are expressions of the index type
• Record types: list of components, variants
  • Entities appearing in component declarations
  • Variants indexed by values of discriminants
• Flags: type is limited, type has tasks, type is packed, etc. (in GNAT, > 160 misc. predicates)
Attributes of program unit entities
• All entities that contain local declarations have an attached list of local entities (in GNAT: First_Entity, Last_Entity)
• Procedures: names and types of formals
• Functions: names/types of formals, return type
• Packages: separate lists of visible entities and private entities
• Tasks: visible entries (operations), private data
Attributes of identifiers
• For a variable: the entity denoted by the identifier
  • Value, if the entity is a static constant
• For a function:
  • Set of possible interpretations (if overloaded)
  • Single final interpretation (after resolution)
• For all: Type (redundant but convenient)
Attributes of Expressions
• Possible types (if constituents are overloaded)
• Type (after resolution)
• Is_Static_Expression
• Expr_Value (if static)
• Raises_Constraint_Error (may be known)
• In GNAT, described in Sinfo.
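
Two of these attributes, Is_Static_Expression and Expr_Value, can be computed together by folding constants bottom-up. The sketch below borrows the attribute names from the slide but is otherwise hypothetical: the node classes, the `static_value` field, and the restriction to `+`, `-`, `*` are assumptions made to keep it short.

```python
import operator
from dataclasses import dataclass
from typing import Optional


@dataclass
class Literal:
    value: int


@dataclass
class Name:
    ident: str
    static_value: Optional[int] = None  # set when the entity is a static constant


@dataclass
class BinOp:
    op: str
    left: object
    right: object


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def expr_value(node) -> Optional[int]:
    """Return the value of a static expression, or None if it is not static."""
    if isinstance(node, Literal):
        return node.value
    if isinstance(node, Name):
        return node.static_value
    left = expr_value(node.left)
    right = expr_value(node.right)
    if left is None or right is None:
        return None  # one operand is not static, so neither is the whole expression
    return OPS[node.op](left, right)


def is_static_expression(node) -> bool:
    return expr_value(node) is not None


length = Name("Length", static_value=5)
count = Name("Count")  # an ordinary variable, value unknown at compile time

print(expr_value(BinOp("*", Literal(2), BinOp("+", length, Literal(1)))))  # prints: 12
print(is_static_expression(BinOp("+", count, Literal(1))))                 # prints: False
```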
Bottom-up / top-down processing
• With recursion, the two are very similar:

    procedure Analyze (N : Node_Id) is
    begin
       --  Bottom-up:
       --    analyze each child of N
       --    compute local attributes
       null;
    end Analyze;

    procedure Resolve (N : Node_Id; Typ : Entity_Id) is
    begin
       --  Top-down:
       --    compute local attributes
       --    resolve each child of N with information from N
       null;
    end Resolve;
Name Resolution
• Compute the entity denoted by each identifier.
• Apply the visibility rules of the language:
  • For a block-structured language, examine the local scope first.
  • If not found, look at enclosing scopes
  • If not found, look at scopes in context (with_clauses, use_clauses)
  • If not found, look at implicit rules for operators
• Entities with the same name are linked in a homonym chain
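
The lookup order described above can be sketched as a walk outward through nested scopes, followed by the scopes made visible by context clauses. This Python version is a simplification under stated assumptions: the `Scope` class and `resolve_name` function are hypothetical, each scope holds at most one entity per name, and the implicit operator rule is left out.

```python
class Scope:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # enclosing scope, or None at the outermost level
        self.entities = {}    # identifier -> description of the entity

    def declare(self, ident, info):
        self.entities[ident] = info


def resolve_name(ident, scope, context=()):
    """Return (scope name, entity) for ident, or None if it is not visible."""
    # 1. Local scope first, then each enclosing scope.
    current = scope
    while current is not None:
        if ident in current.entities:
            return current.name, current.entities[ident]
        current = current.parent
    # 2. Scopes named in the context (with_clauses, use_clauses).
    for ctx in context:
        if ident in ctx.entities:
            return ctx.name, ctx.entities[ident]
    return None


text_io = Scope("Text_IO")
text_io.declare("Put_Line", "procedure")

outer = Scope("Outer")
outer.declare("X", "Integer variable")

inner = Scope("Inner", parent=outer)
inner.declare("X", "Boolean variable")  # hides the outer X
inner.declare("Y", "Float variable")

print(resolve_name("X", inner))                             # ('Inner', 'Boolean variable')
print(resolve_name("X", outer))                             # ('Outer', 'Integer variable')
print(resolve_name("Put_Line", inner, context=[text_io]))   # ('Text_IO', 'procedure')
print(resolve_name("Z", inner, context=[text_io]))          # None
```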
Type resolution
• Bottom-up pass: compute the possible interpretations of each constituent, and their types
• X + Y: if X and Y have the same numeric type, the node has the type of X
• A (J): if A is of an array type and J has the proper type for an index, the node has the component type of the type of A
• F (X, Y, Z): if F is a function and X, Y, Z have the proper types for its formals, the node has the return type of F.
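
The three rules above can be sketched as one recursive function that derives the type of a node from the types of its constituents. This is a deliberately small Python model without overloading; the tuple encoding of nodes and types and the name `type_of` are assumptions made for the example.

```python
NUMERIC = {"Integer", "Float"}


def type_of(node, env):
    """Compute the type of an expression node from the types of its parts.

    Scalar types are strings; an array type is ("array", index, component);
    a function type is ("function", (formal types...), return type).
    """
    kind = node[0]
    if kind == "name":
        return env[node[1]]
    if kind == "add":  # X + Y
        left, right = type_of(node[1], env), type_of(node[2], env)
        if left == right and left in NUMERIC:
            return left
        raise TypeError("operands of + must have the same numeric type")
    if kind == "index":  # A (J)
        array_type = type_of(node[1], env)
        if array_type[0] == "array" and type_of(node[2], env) == array_type[1]:
            return array_type[2]
        raise TypeError("invalid indexed component")
    if kind == "call":  # F (X, Y, ...)
        func_type = env[node[1]]
        actuals = tuple(type_of(arg, env) for arg in node[2])
        if func_type[0] == "function" and actuals == func_type[1]:
            return func_type[2]
        raise TypeError("actuals do not match the formals")
    raise ValueError("unknown node kind: " + str(kind))


env = {
    "X": "Integer",
    "Y": "Integer",
    "J": "Integer",
    "A": ("array", "Integer", "Float"),
    "F": ("function", ("Integer", "Integer"), "Boolean"),
}

print(type_of(("add", ("name", "X"), ("name", "Y")), env))     # prints: Integer
print(type_of(("index", ("name", "A"), ("name", "J")), env))   # prints: Float
print(type_of(("call", "F", [("name", "X"), ("name", "Y")]), env))  # prints: Boolean
```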
Overload Resolution
• If a constituent is overloaded, the context must impose a single type for resolution.

    function Convert (x : integer) return integer;
    function Convert (x : integer) return complex;
    function Convert (x : integer) return float;
    …
    Var := Convert (5);

• Compute the possible interpretations of Convert, then resolve with the type of Var.
• Need to manipulate sets of names for types.
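
The two steps for `Convert` can be sketched as follows: first collect every interpretation whose formals match the actuals, then let the expected type select one. The Python names (`Interp`, `possible_interpretations`, `resolve`) are hypothetical, and since the slide does not give the type of `Var`, the example assumes it is `float`.

```python
from collections import namedtuple

Interp = namedtuple("Interp", ["formals", "result"])

# The three declarations of Convert from the slide.
DECLARATIONS = {
    "Convert": [
        Interp(("integer",), "integer"),
        Interp(("integer",), "complex"),
        Interp(("integer",), "float"),
    ]
}


def possible_interpretations(name, actual_types):
    """Bottom-up step: keep every declaration whose formals match the actuals."""
    wanted = tuple(actual_types)
    return [i for i in DECLARATIONS.get(name, []) if i.formals == wanted]


def resolve(candidates, expected_type):
    """Top-down step: the context must select exactly one interpretation."""
    matches = [i for i in candidates if i.result == expected_type]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise TypeError("no interpretation yields " + expected_type)
    raise TypeError("ambiguous call")


candidates = possible_interpretations("Convert", ["integer"])
print(sorted(i.result for i in candidates))  # ['complex', 'float', 'integer']

# Assumption: Var is declared with type float.
print(resolve(candidates, "float").result)   # prints: float
```

The intermediate result is a set of candidate types, not a single type, which is why the analysis has to manipulate sets.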
Finding a single interpretation
• For a procedure call: proc (f (x), g (y), h (z));
• proc, f, g, and h may be overloaded.
• There must be a single interpretation of proc whose formal parameters are compatible with one of the possible interpretations of each of f, g, h.
• Once proc is identified, resolve f with the type of its first formal, g with the type of the second, etc.
• If more than one interpretation: ambiguous call
• If none: illegal call
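
The selection rule for the call can be sketched by representing each interpretation of `proc` as a tuple of formal types and each actual as the set of result types its interpretations can produce. The function and exception names are hypothetical, and the sketch ignores the case where one actual has two interpretations with the same result type.

```python
class ResolutionError(Exception):
    pass


def resolve_proc_call(proc_interps, actual_type_sets):
    """Pick the single interpretation of proc compatible with the actuals.

    proc_interps: list of tuples of formal types, one per interpretation.
    actual_type_sets: for each actual, the set of types it may have.
    Returns the chosen formals; each actual is then resolved with the
    type of the corresponding formal.
    """
    matches = [
        formals
        for formals in proc_interps
        if len(formals) == len(actual_type_sets)
        and all(t in types for t, types in zip(formals, actual_type_sets))
    ]
    if not matches:
        raise ResolutionError("illegal call")
    if len(matches) > 1:
        raise ResolutionError("ambiguous call")
    return matches[0]


proc_interps = [
    ("integer", "float", "boolean"),
    ("float", "float", "float"),
]
# Possible result types of f (x), g (y), h (z) after the bottom-up pass.
actuals = [{"integer", "complex"}, {"float"}, {"boolean", "integer"}]

print(resolve_proc_call(proc_interps, actuals))
# prints: ('integer', 'float', 'boolean')
```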
Analysis and expansion
• Expansion replaces portions of the AST with semantically equivalent portions for which it is easier to generate code
• New tree fragments must be decorated with semantic information, so expansion and analysis are mutually recursive
Expansion: aggregates

    Length : integer := 5;
    type Arr is array (1 .. Length) of integer;
    X : integer := 22;
    Thing : Arr := (X, others => 42);   -- complex construct

Aggregate expands into:

    Thing : Arr;
    Thing (1) := X;
    Temp := 2;                          -- Temp is created by the compiler; index 1 is already set
    while Temp <= Length loop
       Thing (Temp) := 42;
       Temp := Integer'Succ (Temp);
    end loop;
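
The effect of the expanded code can be checked with a small simulation: positional components are assigned first, and the loop for the `others` choice starts at the first index that no positional component covers. This is a Python sketch of the resulting values only, with a hypothetical function name; it does not model how a compiler rewrites the tree.

```python
def expand_aggregate(low, high, positional, others):
    """Simulate the assignments generated for (positional..., others => value)."""
    thing = {}
    index = low
    # One assignment per positional component: Thing (1) := X; ...
    for value in positional:
        thing[index] = value
        index += 1
    # Loop generated for the others choice, starting after the positional part.
    temp = index
    while temp <= high:
        thing[temp] = others
        temp += 1
    return [thing[i] for i in range(low, high + 1)]


# Thing : Arr := (X, others => 42) with X = 22 and Length = 5
print(expand_aggregate(1, 5, [22], 42))  # prints: [22, 42, 42, 42, 42]
```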