Chapter 5 Basic Semantics
K. Louden, Programming Languages

Ways to specify semantics
• Language reference manual – commonly used, but lacks precision
• By a defining translator – questions cannot be answered in advance, only by trying it
• By formal definition (denotational semantics) – complex and abstract
• A fundamental step in defining semantics is to describe the meaning of each identifier.

Attributes
• Properties of language entities, especially identifiers used in a program.
• Important examples:
  • Value of an expression
  • Data type of an identifier
  • Maximum number of digits in an integer
  • Location of a variable
  • Code body of a function or method
• Declarations ("definitions") bind attributes to identifiers.
• Different declarations may bind the same identifier to different sets of attributes.

Binding times can vary widely:
• Value of an expression: during execution or during translation (constant expression).
• Data type of an identifier: translation time (Java) or execution time (Smalltalk, Perl).
• Maximum number of digits in an integer: language definition time or language implementation time.
• Location of a variable: load or execution time.
• Code body of a function or method: translation time or link time or execution time.

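As a small illustration of the first and third bullets, the Java sketch below contrasts a constant expression, which the compiler can evaluate during translation, with a value that is only known during execution. The class and method names (BindingTimes, secondsIn) are made up for this example.

```java
public class BindingTimes {
    // Constant expression: the compiler can evaluate it during translation.
    static final int SECONDS_PER_DAY = 60 * 60 * 24;

    // Depends on the argument, so the value is only bound during execution.
    static int secondsIn(int days) {
        return days * SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        System.out.println(SECONDS_PER_DAY);     // 86400
        System.out.println(secondsIn(2));        // 172800
        // The range of int is fixed by the Java language definition.
        System.out.println(Integer.MAX_VALUE);   // 2147483647
    }
}
```

In Java the size of int is a language definition time binding; in C the corresponding limit is left to the implementation.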
Two general categories of binding: static (prior to execution) and dynamic
• Interpreters will perform most bindings dynamically
• The concern is the earliest time at which an attribute COULD be bound, not when it actually is
• Possible times
  • Language definition (static)
  • Language implementation (static)
  • Translation time (static)
  • Link time (static)
  • Load time (static)
  • Execution time (dynamic)

Many of the most important and subtle differences between languages involve binding times.
• Simple changes to the language (say, adding recursion) may drastically change binding times.
• Languages in which types are dynamically bound are dramatically different from those in which types are statically bound.
• For example: Dynamic Type Binding
  • Flexibility: can write a generic sort routine where the type isn't specified
  • Error detection is diminished
  • It is expensive, as type checking must be done at run time; often implemented using pure interpreters

Classes of Binding Times (listed from late to early)
1. Execution time (late binding). Variables to their values. Variables to a particular storage location (termed dynamic storage allocation).
  • At entry to a subprogram or block. Example: formal to actual parameters and formal parameters to actual locations.
  • At arbitrary points during execution. Example: variables to values. In some languages, variables to types; in Prolog, for instance, the type of a variable is determined dynamically.
2. Load time: globals bound to locations.
3. Link time: body of an external function bound to a call instruction.
4. Compile time (early binding).
  • Bindings chosen by the programmer: variable names, types.
  • Bindings chosen by the translator. Example: variables to storage (load time) (termed static storage allocation). Example: the particular machine instruction for a statement. Example: initial values of variables (if none specified). Example: in C, a declaration defines the type but allocates no space.
5. Language implementation time. Example: association of enumerated type values with integers. Example: maxint.
6. Language definition time, probably the most important binding time. Example: structure of the language fixed, set of all basic data types, set of statements: syntax and semantics fixed, predefined types.

Symbol table and environment
• A dictionary or table is used to maintain the identifier/attribute bindings.
• It can be maintained either during translation (symbol table) or execution (environment) or both.
• Pre-translation entities are entered into the initial or default table.
• If both are maintained, the environment can usually dispense with names, keeping track only of locations (names are maintained implicitly).

Declarations bind identifiers to attributes. Examples of declarations (C)
• int x = 0; Explicitly specifies data type and initial value. Implicitly specifies scope (see next slide) and location in memory.
• int f(double); Explicitly specifies type (double → int). Implicitly specifies nothing else: needs another declaration specifying code.
• In C the former is called a definition (it specifies all potential attributes); the latter is simply a declaration (other attributes, such as the function body, need to be specified later).

Scope
• The scope of a declaration is the region of the program to which the bindings established by the declaration apply.
• Informally, the scope of a variable is the range of statements in which the variable is visible.
• A variable is visible in a statement if it can be referenced in that statement. (Scope holes are caused by new declarations of the same name.)
• In a block-structured language, the scope is typically the code from the end of the declaration to the end of the "block" (indicated by braces {…} in C and Java) in which the declaration occurs.
• Scope can extend backwards to the beginning of the block in certain cases (class declarations in Java and C++, top-level declarations in Scheme).

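The backward extension of scope in a Java class can be seen in a few lines. This is only a sketch; the names BackwardScope, twice and later are invented for the illustration.

```java
public class BackwardScope {
    // The method body refers to a field that is declared further down:
    // the scope of a class member covers the whole class body.
    static int twice() {
        return 2 * later;
    }

    static int later = 21;

    public static void main(String[] args) {
        // A local variable, by contrast, is in scope only after its declaration.
        int local = twice();
        System.out.println(local);   // 42
    }
}
```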
Lexical vs. dynamic scope
• Scope is maintained by the properties of the lookup operation in the symbol table or environment.
• If scope is managed statically (prior to execution), the language is said to have static or lexical scope ("lexical" because it follows the layout of the code in the file).
• If scope is managed directly during execution, then the language is said to have dynamic scope.
• It is possible to maintain lexical scope during execution, via static links in the call stack.

Java scope example
    public class Test {
        public static int x = 2;
        public static void f() {
            System.out.println(x);
        }
        public static void main(String[] args) {
            int x = 3;
            f();
        }
    }
• This prints 2, but under dynamic scope it would print 3 (the most recent declaration of x in the execution path is found).

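Java itself has no dynamic scope, so the claim about printing 3 can only be checked by emulation. The sketch below keeps an explicit run-time stack of bindings per name and looks names up at call time; the helper names (env, bind, unbind, lookup, mainThenF) are hypothetical and not part of the original example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class DynamicScopeDemo {
    // Hypothetical run-time environment: a stack of active bindings per name.
    static final Map<String, Deque<Integer>> env = new HashMap<>();

    static void bind(String name, int value) {
        env.computeIfAbsent(name, k -> new ArrayDeque<>()).push(value);
    }

    static void unbind(String name) {
        env.get(name).pop();
    }

    // Dynamic lookup: the most recent binding that is still active.
    static int lookup(String name) {
        return env.get(name).peek();
    }

    // Counterpart of Test.f(): reads whichever x is current at call time.
    static int f() {
        return lookup("x");
    }

    // Counterpart of Test.main(): declares its own x, then calls f.
    static int mainThenF() {
        bind("x", 3);
        try {
            return f();
        } finally {
            unbind("x");   // main's x disappears when main returns
        }
    }

    public static void main(String[] args) {
        bind("x", 2);                      // the class-level x
        System.out.println(f());           // 2: only the outer x is active
        System.out.println(mainThenF());   // 3: main's x is the most recent
        System.out.println(f());           // 2 again, main's x has been removed
    }
}
```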
Dynamic scope evaluated
• Almost all languages use lexical scope.
• With dynamic scope the meaning of a variable cannot be known until execution time, so references to it cannot be checked statically.
• Originally used in Lisp. Scheme could still use it, but doesn't. Some languages retain dynamically scoped features: VBScript, JavaScript, Perl (older versions).
• Lisp inventor (McCarthy) now calls it a bug.
• Still useful as a pedagogical tool to understand the workings of scope.

Symbol table structure – how to handle multiple uses of the same name?
• A table of little stacks of declarations under each name. For example, the table for the Test class of the Java scope example would look as follows inside main (using lexical scope):
[Figure: table of little stacks for the Test class]

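A table of little stacks can be sketched in Java as a map from each name to a stack of declarations. This is a minimal sketch under assumptions: the class TableOfStacks, its method names, and the attribute strings are invented here, and real symbol tables store structured attributes rather than strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class TableOfStacks {
    // One little stack of declarations under each name (innermost on top).
    private final Map<String, Deque<String>> table = new HashMap<>();

    void declare(String name, String attributes) {
        table.computeIfAbsent(name, k -> new ArrayDeque<>()).push(attributes);
    }

    // Called for each name declared in a scope when that scope is left.
    void remove(String name) {
        table.get(name).pop();
    }

    // Returns the innermost declaration of the name, or null if there is none.
    String lookup(String name) {
        Deque<String> stack = table.get(name);
        return (stack == null || stack.isEmpty()) ? null : stack.peek();
    }

    public static void main(String[] args) {
        TableOfStacks t = new TableOfStacks();
        t.declare("x", "static int, class Test");
        t.declare("f", "static void method");
        t.declare("main", "static void method");
        t.declare("args", "String[] parameter of main");
        t.declare("x", "int, local to main");
        System.out.println(t.lookup("x"));   // int, local to main
        t.remove("x");                       // leaving main
        t.remove("args");
        System.out.println(t.lookup("x"));   // static int, class Test
    }
}
```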
Symbol table structure (2)
• Alternatively, a stack of little tables, one for each scope. For example, the previous example would look as follows (lexical scope):
[Figure: stack of little tables for the Test class; labels: "Can be deleted after leaving f", "Current table inside main"]

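The stack-of-little-tables organization can be sketched the same way: one map per scope, pushed on entry and popped on exit, with lookup searching from the innermost table outwards. The class StackOfTables and its method names are assumptions made for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class StackOfTables {
    // One little table per open scope; the innermost scope is on top.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void enterScope() {
        scopes.push(new HashMap<>());
    }

    // Leaving a scope discards its whole table at once.
    void exitScope() {
        scopes.pop();
    }

    void declare(String name, String attributes) {
        scopes.peek().put(name, attributes);
    }

    // Searches from the innermost scope outwards; null if the name is unknown.
    String lookup(String name) {
        for (Map<String, String> scope : scopes) {   // iterates top of stack first
            String attributes = scope.get(name);
            if (attributes != null) {
                return attributes;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        StackOfTables s = new StackOfTables();
        s.enterScope();                              // class Test
        s.declare("x", "static int, class Test");
        s.declare("f", "static void method");
        s.declare("main", "static void method");
        s.enterScope();                              // body of main
        s.declare("args", "String[] parameter of main");
        s.declare("x", "int, local to main");
        System.out.println(s.lookup("x"));   // int, local to main
        System.out.println(s.lookup("f"));   // static void method
        s.exitScope();                       // leaving main
        System.out.println(s.lookup("x"));   // static int, class Test
    }
}
```

Because each scope is a separate table, an outer scope can be kept around and consulted later, which is the versatility the comparison of the two structures refers to.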
Symbol table construction
• The symbol table is constructed as declarations are encountered (insert operation).
• Lookups occur as names are encountered.
• In lexical scope, either each lookup is performed against the symbol table as built up to that point (declaration before use, as in C), or all lookups are delayed until the symbol table is fully constructed and then performed (Java class: scope applies backwards to the beginning of the class).
• In dynamic scope, links are needed to tell you which declarations to use.

Symbol table structure evaluated
• Which organization is better?
• Table of little stacks is simpler (C, Pascal).
• Stack of little tables is more versatile, and helpful when you need to recover outer scopes from within inner ones or from elsewhere in the code (Ada, Java, C++).
• Normally, no specific table structure is part of a language specification: any structure that provides the appropriate properties will do.

Overloading
• Overloading is a property of symbol tables that allows them to successfully handle declarations that use the same name within the same scope.
• It is the job of the symbol table to pick the correct choice from among the declarations for the same name in the same scope. This is called overload resolution.
• Overloading typically applies only to functions or methods.
• Overloading must be distinguished from dynamic binding in an OO language.

An example in Java:
    public class Overload {
        public static int max(int x, int y) {
            return x > y ? x : y;
        }
        public static double max(double x, double y) {
            return x > y ? x : y;
        }
        public static int max(int x, int y, int z) {
            return max(max(x, y), z);
        }
        public static void main(String[] args) {
            System.out.println(max(1, 2));     // int version
            System.out.println(max(1, 2, 3));  // three-argument version
            System.out.println(max(4, 1.3));   // double version: 4 is converted to 4.0
        }
    }
• Adding more max functions that mix double and int parameters is ok. But adding ones that differ only in return type (double vs. int) is not: Java resolves overloads from the parameter types alone.

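The difference between overload resolution and dynamic binding can be made concrete with a small Java sketch. The classes Shape and Circle and the method names are invented for this illustration: the overload is selected from the static (declared) type of the argument, while the overridden method is selected from the run-time class of the object.

```java
public class OverloadVsDispatch {
    public static class Shape {
        public String name() { return "shape"; }
    }

    public static class Circle extends Shape {
        @Override
        public String name() { return "circle"; }
    }

    // Two overloads: resolved at translation time from the argument's static type.
    public static String describe(Shape s) { return "describe(Shape)"; }
    public static String describe(Circle c) { return "describe(Circle)"; }

    public static void main(String[] args) {
        Shape s = new Circle();            // static type Shape, run-time class Circle
        System.out.println(describe(s));   // describe(Shape): overload resolution
        System.out.println(s.name());      // circle: dynamic binding
    }
}
```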
Allocation
• The environment can be constructed entirely statically (Fortran): all variables and functions have fixed locations for the duration of execution.
• It can also be entirely dynamic: functional languages like Scheme and ML.
• Names of constants may represent purely compile-time quantities.
• Most languages use a mix: C, C++, Java, Ada.
• A mixed environment consists of three components:
  • A fixed area for static allocation
  • A stack area for LIFO allocation (usually the processor stack)
  • A "heap" area for on-demand dynamic allocation (with or without garbage collection)

[Figure: Typical environment organization (possibly C). Figure 5.25, p. 165; © 2003 Brooks/Cole - Thomson Learning]

The Runtime Stack
• Used for:
  • Procedure/function/method calls
  • Temporaries
  • Local variables
• Temporaries: intermediate results that cannot be kept in registers; not considered here further.
• Procedure calls: parameters and return values.
• Local variables: part of calls, but can be considered independently, showing LIFO behavior for nested scopes (next slide).

Example of stack-based allocation in C within a procedure:
    A: { int x;
         char y;
         B: { double x;
              int a;
         } /* end B */
         C: { char y;
              int b;
              D: { int x;
                   double y;
              } /* end D */
         } /* end C */
    } /* end A */
[Figure labels: Point #1, Point #2]

An alternative: a flat local space
• All locations can be determined at compile time.
• All local variables are allocated at once. This wastes some space, but not critically.
• The primary structure of the stack is still the call structure: a complete record of a call on the stack is called an activation record or frame, and the stack is referred to as the call stack. (Chapter 8)
• Java promotes a flat space by forbidding nested redeclarations, but this is not an essential property: a symbol table can easily distinguish nested declarations as A.x, A.y, A.B.x, A.B.a, etc.

Heap Allocation
• In "standard" languages (C, C++, Java) heap allocation requires a special operation: new (malloc in C).
• Any kind of data can be allocated on the heap in C/C++; in Java all objects and only objects are allocated on the heap.
• Even with heap allocation available in Java and C/C++, the stack is still used to represent calls.
• In C/C++, deallocation is typically by hand (free in C, delete and destructors in C++), but it is hard to do right.
• Java uses a garbage collector that periodically sweeps the heap looking for data that can no longer be accessed by the program and returns it to free space.

Lifetime
• The lifetime of a program entity is the duration of its allocation in the environment.
• Allocation is static when the lifetime is the duration of the entire program execution.
• Lifetime is related to but not identical to scope. With scope holes, lifetime can extend to regions of the program where the program entity is not accessible.
• It is also possible for scope to exceed lifetime when a language allows locations to be manipulated directly (for example, through manual deallocation). This is of course very dangerous!

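A scope hole is easy to produce in Java. In the sketch below (class and method names are made up), the static field x stays allocated for the whole execution, yet inside the methods a local x hides it; the field is still alive and can be reached through its qualified name.

```java
public class ScopeHole {
    static int x = 2;   // static allocation: lives for the whole execution

    static int viaSimpleName() {
        int x = 3;      // the local x creates a scope hole for the field x
        return x;       // refers to the local
    }

    static int viaQualifiedName() {
        int x = 3;            // the field is hidden here as well...
        return ScopeHole.x;   // ...but it still exists and is reachable by qualification
    }

    public static void main(String[] args) {
        System.out.println(viaSimpleName());      // 3
        System.out.println(viaQualifiedName());   // 2
    }
}
```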
Variables and Constants
• A variable is an object whose stored value can change during execution.
• In x = y we want the value of y but the address of x.
  • The address is referred to as the l-value and the value as the r-value.
• A constant is an object whose value does not change throughout its lifetime.
• Constants are often confused with literals: constants have names, literals do not.

Constants
• Compile-time constant in Java: static final int zero = 0;
• Load-time constant in Java: static final Date now = new Date();
• Dynamic constant in Java: any non-static final assigned in a constructor.
• Java takes a very general view of constants, since it is not very worried about getting rid of them during compilation.
• C takes a much stricter view of constants, essentially forcing them to be capable of elimination during compilation.

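The three kinds of Java constants can be put side by side in one class. The fields zero and now come from the bullets above; the class name ConstantKinds and the field id are assumptions added for the sketch.

```java
import java.util.Date;

public class ConstantKinds {
    // Compile-time constant: the value is known during translation.
    static final int zero = 0;

    // Load-time constant: computed once, when the class is initialized.
    static final Date now = new Date();

    // Dynamic constant: fixed separately for each object, in the constructor.
    final int id;

    ConstantKinds(int id) {
        this.id = id;
    }

    public static void main(String[] args) {
        ConstantKinds a = new ConstantKinds(1);
        ConstantKinds b = new ConstantKinds(2);
        System.out.println(zero);          // 0
        System.out.println(a.id + b.id);   // 3: each object has its own constant
        System.out.println(now);           // the time at which the class was initialized
    }
}
```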
Aliases
• An alias occurs when the same object is bound to two different names at the same time. This is fairly common with Java objects.
• Side effect: a change in value that persists after the statement or call that caused it has finished executing.
• Many side effects are intentional
  • x = 5: the intent was to change x
  • Obj.setT(2): the intent was to set the value of t
  • Swap(a,b): the intent was to change values
• Sometimes a side effect is not intentional
  • Sqrt(x): what if it set x to zero?
  • T = 0: what if an aliased variable was also changed?

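An unintended change through an alias might look like the following in Java. The Counter class and the method aliasEffect are hypothetical; setT echoes the Obj.setT(2) call in the list above.

```java
public class AliasDemo {
    public static class Counter {
        public int t;
        public void setT(int value) { t = value; }
    }

    // Returns the value seen through the name a after changing the object through b.
    public static int aliasEffect() {
        Counter a = new Counter();   // a.t starts at 0
        Counter b = a;               // b is an alias: both names refer to one object
        b.setT(2);                   // side effect through b...
        return a.t;                  // ...is visible through a
    }

    public static void main(String[] args) {
        System.out.println(aliasEffect());   // 2
    }
}
```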
Dangling References and Garbage
• A dangling reference is a location that has been deallocated from the environment, but is still accessible within the program. Dangling references are impossible in a garbage-collected environment with no direct access to addresses.
• Garbage is memory that is still allocated in the environment but has become inaccessible to the program. Garbage can be a problem in a non-garbage-collected environment, but is much less serious than dangling references.