Programming Languages, Lecture Note #6, 1st semester 2012. Yongjoo Cho (조용주), ycho@smu.ac.kr. Control Abstraction.
Control Abstraction • Data abstraction • Primary purpose is to represent information • Control abstraction • Principal purpose is to perform a well-defined operation • Subroutines are the principal mechanism for control abstraction in most programming languages • Most subroutines are parameterized: the caller passes arguments that influence the subroutine’s behavior or provide it with data on which to operate • The arguments in a call are called actual parameters; they are mapped to the subroutine’s formal parameters
Review of Stack Layout – Stack-based Allocation • Central stack for • Arguments (parameters) and return values • Local variables • Temporaries • Other bookkeeping information • Return PC (program counter), saved registers, debugging information • Why a stack? • To allocate space for recursive routines (not necessary in early versions of FORTRAN, which have no recursion) • A language that permits recursion cannot use static locations to store a subroutine’s local variables, because several activations of the same subroutine may be live at once • To reuse space (in all programming languages)
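The need for per-activation storage can be illustrated with a minimal C++ sketch. The function names sum_stack and sum_static are hypothetical and not part of the lecture; the second function imitates static allocation of a local by declaring it static.

```cpp
// Hypothetical illustration: sums 1..n recursively.
// 'mine' is an ordinary local, so each activation gets its own copy
// in its own stack frame.
int sum_stack(int n) {
    int mine = n;
    if (n == 0) return 0;
    int rest = sum_stack(n - 1);
    return mine + rest;
}

// Same algorithm, but 'mine' lives in one static location shared by
// every activation, as it would without a stack.
int sum_static(int n) {
    static int mine;
    mine = n;
    if (n == 0) return 0;
    int rest = sum_static(n - 1);  // the nested calls overwrite 'mine'
    return mine + rest;            // 'mine' no longer holds this call's n
}
```

With these definitions, sum_stack(3) yields 6, while sum_static(3) yields 0 because the innermost call leaves 0 in the shared location.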
Stack-based Allocation • Local variables and arguments are assigned fixed offsets from the stack pointer or frame pointer at compile time • Maintenance of the stack is the responsibility of the calling sequence and the subroutine prologue and epilogue • Space is saved by putting as much as possible in the prologue and epilogue • Time may be saved by • putting work in the caller instead, or • combining what is known in both places (interprocedural optimization)
Contents of a Stack Frame • Bookkeeping • Return PC • Dynamic link (saved frame pointer) • Saved registers • Line number • Saved display entries • Static link • Arguments and returns • Local variables • Temporaries
Review of Stack Layout • At any given time, stack pointer register contains the address of either the last used location at the top of the stack or the first unused location, depending on the convention • The frame pointer register contains an address within the frame • Objects in the frame are accessed via displacement addressing with respect to the frame pointer
Review of Stack Layout • If the size of an object (e.g., a local array) is not known at compile time, then the object is placed in a variable-size area at the top of the frame; its address and dope vector are stored in the fixed-size portion of the frame at a statically known offset from the frame pointer • If the size of an argument is not known at compile time, it is handled in the same way as an object of unknown size
Review of the Stack Layout • In a language with nested subroutines and static scoping (e.g., Pascal, Ada, ML, Common Lisp, or Scheme), objects in surrounding subroutines, which are neither local nor global, can be found by maintaining a static chain.
Review of the Stack Layout • Each stack frame contains a reference to the frame of the lexically surrounding subroutine – static link • The saved value of the frame pointer, which will be restored on subroutine return – dynamic link • The static and dynamic links may or may not be the same, depending on whether the current routine was called by its lexically surrounding routine, or by some other routine nested in that surrounding routine • In fig. 8.1, if subroutine D is called directly from B, then clearly B’s frame will already be on the stack • When control enters B (placing B’s frame on the stack), D comes into view • D can be called by C or by any other routine (not shown) that is nested inside of C or D, but only because these are also within B
Calling Sequence (8.2) • Calling sequence • The code executed by the caller immediately before and after a subroutine call • Sometimes refers to the combined operations of the caller, the prologue, and the epilogue • Prologue • Code executed at the beginning of the subroutine • Epilogue • Code executed at the end of the subroutine
Calling Sequence • Tasks that must be accomplished while calling the subroutine • Passing parameters • Saving the return address • Changing the program counter • Changing the stack pointer to allocate space • Saving registers that may contain important values • Changing the frame pointer to refer to the new frame • Executing initialization code for any objects in the new frame that require it
Calling Sequence • Tasks that must be executed on the way out from the subroutine • Passing return parameters or function values • Executing finalization code for any local objects that require it • Deallocating the stack frame (restoring the stack pointer) • Restoring saved registers • Restoring the frame pointer and the program counter • Some of the tasks (e.g., passing parameters) must be performed by the caller • Most of the tasks can be performed either by the caller or the callee
Calling Sequence • In general, we want to make the callee do as much work as possible • Tasks performed in the callee appear only once in the target program, but tasks performed in the caller appear at every call site
Saving and Restoring Registers • The ideal approach is to save precisely those registers that are both in use in the caller and needed for other purposes in the callee • A simpler solution is for the caller to save all registers that are in use, or for the callee to save all registers that it will overwrite • Calling sequence conventions • Registers not reserved for special purposes are divided into two sets of approximately equal size • One set is the caller’s responsibility (caller-saves); the other is the callee’s responsibility (callee-saves) • The compiler uses callee-saves registers for local variables and other long-lived values whenever possible • It uses the caller-saves set for transient values, which are less likely to be needed across calls • As a result, the caller-saves registers are seldom saved by either party
Maintaining the Static Chain • In languages with nested subroutines, the work of maintaining the static chain must be performed by the caller, because it depends on the caller’s lexical nesting depth • The standard approach is for the caller to compute the callee’s static link and to pass it as an extra, hidden parameter • Case 1: the callee is nested (directly) inside the caller • The callee’s static link should refer to the caller’s frame • Thus, the caller passes its own frame pointer as the callee’s static link • Case 2: the callee is k >= 0 scopes “outward”, closer to the outer level of lexical nesting • All scopes that surround the callee also surround the caller • The caller dereferences its own static link k times and passes the result as the callee’s static link
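The two cases can be sketched in C++ over simulated frames. This is only an illustration: the Frame struct and both function names are hypothetical, and real compilers emit this logic as a few load instructions at each call site rather than as library calls.

```cpp
// Hypothetical simulated stack frame; real frames hold much more.
struct Frame {
    const char* name;
    Frame* static_link;    // frame of the lexically surrounding routine
    Frame* dynamic_link;   // caller's frame (the saved frame pointer)
};

// Case 1: the callee is nested directly inside the caller,
// so the caller passes its own frame as the static link.
Frame* static_link_for_nested_callee(Frame* caller) {
    return caller;
}

// Case 2: the callee is k >= 0 scopes outward from the caller.
// Start from the caller's own static link and follow the chain k times.
// With k == 0 (callee declared at the caller's level) the callee simply
// receives the caller's static link.
Frame* static_link_for_outer_callee(Frame* caller, int k) {
    Frame* link = caller->static_link;
    for (int i = 0; i < k; ++i) {
        link = link->static_link;
    }
    return link;
}
```

As a usage example, assume routine A contains B and E, and B contains C and D. If A calls E, E calls B, B calls D, and D calls C, then B receives E's static link (A's frame), D receives B's frame, and C receives D's static link (B's frame).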
Typical Calling Sequence • The caller’s part of the calling sequence may operate as follows; the caller • Saves any caller-saves registers whose values will be needed after the call • Computes the values of arguments and moves them into the stack or registers • Computes the static link (if this is a language with nested subroutines) and passes it as an extra, hidden argument • Uses a special subroutine call instruction to jump to the subroutine, simultaneously passing the return address on the stack or in a register
Typical Calling Sequence • Prologue • Allocates a frame by subtracting an appropriate constant from the sp • Saves the old frame pointer into the stack, and assigns it an appropriate new value • Saves any callee-saves registers that may be overwritten by the current routine (including the static link and return address, if they were passed in registers) • Epilogue • Moves the return value (if any) into a register or a reserved location in the stack • Restores callee-saves registers if needed • Restores the fp and the sp • Jumps back to the return address
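The frame manipulation in the prologue and epilogue can be sketched on a simulated stack. This is a simplified model under several assumptions: the Machine type and all function names are hypothetical, the stack grows downward, the return address and register saving are omitted, and the old frame pointer is pushed before the locals are allocated.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical machine with a small downward-growing stack.
struct Machine {
    std::vector<long> mem = std::vector<long>(64, 0);
    std::size_t sp = 64;   // index of the last used slot (64 = empty)
    std::size_t fp = 64;
    void push(long v) { mem[--sp] = v; }
    long pop() { return mem[sp++]; }
};

// Prologue: save the old fp, point fp at the new frame, allocate locals.
void prologue(Machine& m, std::size_t n_locals) {
    m.push(static_cast<long>(m.fp));   // dynamic link
    m.fp = m.sp;
    m.sp -= n_locals;                  // "subtract a constant from the sp"
}

// Epilogue: deallocate the locals and restore the caller's fp.
void epilogue(Machine& m) {
    m.sp = m.fp;
    m.fp = static_cast<std::size_t>(m.pop());
}

// A complete call of a routine that returns its argument plus one.
long call_add_one(Machine& m, long arg) {
    m.push(arg);                              // caller: pass the argument
    prologue(m, 1);                           // callee: build the frame
    m.mem[m.fp - 1] = m.mem[m.fp + 1] + 1;    // local := argument + 1
    long result = m.mem[m.fp - 1];            // return value "in a register"
    epilogue(m);                              // callee: tear down the frame
    m.pop();                                  // caller: discard the argument
    return result;
}
```

The argument and the local are both addressed as fixed displacements from fp (+1 and -1 here), and after the call both sp and fp hold the values they had before it.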
Displays (8.2.1) • Disadvantage of static chains • Requires many memory references • E.g., to access an object in a scope k levels out requires that the static chain be dereferenced k times, which requires k + 1 memory accesses • The display is designed to solve this issue • Display • A small array that replaces the static chain • The jth element of the display contains a reference to the frame of the most recently active subroutine at lexical nesting level j • If the current subroutine is at nesting level i, an object k levels out can be found at a statically known offset from the address stored in element j = i - k of the display • Displays are not common these days
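A display lookup can be sketched as a single array index. The Display and Frame types below are hypothetical, and the fixed size of 8 levels is an arbitrary assumption; enter returns the overwritten entry so that it can be saved in the new frame and restored on exit.

```cpp
#include <array>

// Hypothetical frame; only a payload is needed for the sketch.
struct Frame {
    int value;
};

struct Display {
    // slot[j]: frame of the most recently active routine at nesting level j
    std::array<Frame*, 8> slot{};

    // On entry to a routine at 'level': install its frame and hand back
    // the old entry so the routine can save it.
    Frame* enter(int level, Frame* frame) {
        Frame* saved = slot[level];
        slot[level] = frame;
        return saved;
    }

    // On exit: restore the entry that was saved on entry.
    void leave(int level, Frame* saved) {
        slot[level] = saved;
    }

    // From a routine at level i, find the frame k levels out: j = i - k.
    Frame* frame_k_levels_out(int i, int k) const {
        return slot[i - k];
    }
};
```

Unlike a static chain walk, frame_k_levels_out costs the same for every k.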
Case Studies: C on the MIPS; Pascal on the x86 • CISC and RISC conventions • CISC machines • Tend to pass arguments on the stack • Usually dedicate a register to the frame pointer • Often rely on special purpose instructions to implement parts of the calling sequence • Most machines provide push and pop instructions that combine a store or load with automatic update of the stack pointer • RISC machines • Tend to pass arguments in registers • Do not usually allocate a register to the frame pointer • Simpler instructions
Register Windows (8.2.3) • As an alternative to saving and restoring registers on subroutine calls and returns, the original Berkeley RISC machines introduced “register windows” • The idea is to map the ISA’s limited set of register names onto some subset (window) of a much larger collection of physical registers, and to change the mapping when making subroutine calls • Old and new mappings overlap a bit, allowing arguments to be passed (and function results returned) in the intersection • Used in the Sun SPARC and the Intel IA-64 (Itanium)
In-Line Expansion (8.2.4) • Many language implementations allow certain subroutines to be expanded in-line at the point of call • In-line expansion avoids a variety of overheads • Space allocation • Branch delays from the call and return • Maintaining the static chain or display • Saving and restoring registers • In many implementations the compiler chooses which subroutines to expand in-line and which to compile conventionally • Some languages let the programmer suggest that particular routines be expanded in-line • In C++ and C99, the keyword inline can be prefixed to a function declaration
    inline int max(int a, int b) { return a > b ? a : b; }
In-Line Expansion • In Ada, the suggestion is made with a significant comment, or pragma
    function max(a, b : integer) return integer is
    begin
        if a > b then return a; else return b; end if;
    end max;
    pragma inline(max);
• In-line expansion is preferable to macros • In-line expansion can increase code size • Often not suitable for recursive calls
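One reason in-line functions are preferable to macros is that a macro substitutes the argument text, so an argument with side effects may be evaluated more than once. A minimal C++ sketch; MAX_MACRO, max_inline, and the two counter functions are hypothetical names chosen for the illustration.

```cpp
// Textual substitution: each use of 'a' or 'b' re-evaluates the argument.
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

// A real function: each argument is evaluated exactly once.
inline int max_inline(int a, int b) { return a > b ? a : b; }

// Returns the value of the counter after one "max" through the macro.
int counter_after_macro() {
    int i = 5;
    int r = MAX_MACRO(i++, 3);   // expands to ((i++) > (3) ? (i++) : (3))
    (void)r;
    return i;                    // i was incremented twice
}

// Returns the value of the counter after one "max" through the function.
int counter_after_inline() {
    int i = 5;
    int r = max_inline(i++, 3);
    (void)r;
    return i;                    // i was incremented once
}
```

Here counter_after_macro() returns 7 and counter_after_inline() returns 6. The in-line function also has its arguments type-checked like those of any other function.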
Parameter Modes (8.3.1) • Suppose x is a global variable in a language with a value model of variables, and we want to pass x as a parameter to subroutine p: p(x); • From an implementation point of view, we have two ways to pass it • Call by value • Provide p with a copy of x’s value • Call by reference • Pass x’s address
Parameter Modes
    x : integer    -- global
    procedure foo(y : integer)
        y := 3
        print x
    …
    x := 2
    foo(x)
    print x
• By value • No visible effect, the program prints 2 twice • By reference • The assignment inside foo changes x, because y is just a local name for x • The program prints 3 twice
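The pseudocode above can be tried in C++, which supports both modes. This is a sketch: foo_by_value and foo_by_reference are hypothetical names, and instead of printing, each function returns the value of x that "print x" inside foo would show.

```cpp
int x = 0;   // global, as in the pseudocode

// y is a copy of the actual parameter; assigning to it leaves x alone.
int foo_by_value(int y) {
    y = 3;
    return x;   // what "print x" inside foo would show
}

// y is another name for the actual parameter; assigning to it changes x.
int foo_by_reference(int& y) {
    y = 3;
    return x;   // what "print x" inside foo would show
}
```

After x = 2, foo_by_value(x) returns 2 and leaves x at 2, while foo_by_reference(x) returns 3 and leaves x at 3.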
Variations on Value and Reference Parameters • In Pascal • Parameters are passed by value by default • The keyword var allows passing by reference • In C • Parameters are always passed by value • An array is passed as a pointer to its first element, so the callee can modify the caller’s array • In Fortran • All parameters are passed by reference, but an actual parameter need not be an l-value • If an expression appears as an argument, the compiler creates a temporary variable to hold its value and passes the temporary by reference
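The C rule for arrays can be seen in a short sketch that is valid in both C and C++. The function names set_first and set_scalar are hypothetical.

```cpp
// 'a' is declared as an array but is really a pointer to the first
// element of the caller's array, so the store is visible to the caller.
void set_first(int a[], int v) {
    a[0] = v;
}

// 'n' is a copy of the caller's value; the store affects only the copy.
void set_scalar(int n, int v) {
    n = v;
    (void)n;
}
```

Calling set_first(arr, 9) changes arr[0] in the caller, whereas set_scalar(n, 9) leaves the caller's n unchanged.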
Call by Sharing • Languages that use a reference model of variables, such as Smalltalk, Lisp, ML, and Clu • The actual parameter is already a reference to an object • They provide a single parameter-passing mode in which the actual and formal parameters refer to the same object • Usually implemented by passing an address • Immutable objects can be passed by value instead • In Java • Parameters of primitive types are passed by value • Object parameters are passed by sharing • In C# • Parameters are passed by value by default • Call by reference is accomplished with a ref or out parameter
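C++ has no built-in call by sharing, but the behavior can be approximated by passing a pointer by value, which is roughly what Java does for object parameters. In this sketch the List alias and both function names are hypothetical.

```cpp
#include <string>
#include <vector>

using List = std::vector<std::string>;

// The formal and the actual refer to the same object, so changes made
// through the formal are visible to the caller.
void mutate_shared(List* items) {
    items->push_back("added");
}

// Rebinding the formal to a different object does not affect the caller:
// only the callee's copy of the reference changes.
void rebind_formal(List* items) {
    List fresh;
    items = &fresh;
    items->push_back("lost");
}
```

This is the difference from call by reference: the callee can modify the shared object but cannot make the caller's variable refer to a different object.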
The Ambiguity of Call by Reference • Two reasons why a programmer may choose call by reference over call by value • If the called routine is supposed to change the value of an actual parameter, the programmer must pass it by reference; to ensure that the called routine cannot modify the parameter, the programmer can pass it by value • Implementation of value parameters requires copying actuals to formals, a potentially time-consuming operation when the arguments are large; reference parameters can be implemented simply by passing an address • As a result, programmers may pass an argument by reference for speed when passing by value would be semantically more appropriate
Read-Only Parameters • Modula-3 • Provides a READONLY parameter mode that combines the efficiency of reference parameters with the safety of value parameters • Small READONLY parameters are generally implemented by passing a value • Larger READONLY parameters are implemented by passing an address • As in Fortran, the Modula-3 compiler creates a temporary variable to hold the value of any built-up expression passed as a large READONLY parameter
Read-Only Parameters • C • Provides an equivalent through the keyword const
    void append_to_log(const huge_record *r) { … }
    …
    append_to_log(&my_record);
• One problem with parameter modes, and with the READONLY mode in particular • They tend to confuse the key pragmatic issue (does the implementation pass a value or a reference?) with two semantic issues • Is the callee allowed to change the formal parameter? • If so, will the changes be reflected in the actual parameter?
Parameter Modes in Ada • Ada provides three parameter-passing modes • in • Parameters pass information from the caller to the callee • Can be read by the callee but not written • out • Pass information from the callee to the caller • In Ada 83, they can be written by the callee but not read • in out • Pass information in both directions • Can be both read and written • Changes to out or in out parameters change the actual parameter
Parameter Modes in Ada • For parameters of scalar and pointer types, all three modes are implemented by copying values • in • Call by value • out • Call by result (the value of the formal parameter is copied into the actual parameter when the subroutine returns) • in out • Call by value/result • In most languages, the two mechanisms (copying values and passing an address) would lead to different semantics: changes made to an in out parameter that is passed as an address affect the actual parameter immediately, whereas changes to one passed by value/result take effect only when the subroutine returns
Parameter Modes in Ada
    x : integer    -- global
    procedure foo(y : in out integer)
        y := 3
        print x
    …
    x := 2
    foo(x)
    print x
• If y is passed by reference, the program will print 3 twice • If y is passed by value/result, it will print 2 and then 3
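The two outcomes can be reproduced in C++ by writing the copy-in and copy-out steps of value/result by hand. This is a sketch under assumptions: global_x stands in for x, all function names are hypothetical, and the "printed" values are collected in a vector instead of being written to output.

```cpp
#include <vector>

int global_x = 0;   // stands in for the global x of the pseudocode

// By reference: y aliases the actual parameter.
void foo_reference(int& y, std::vector<int>& printed) {
    y = 3;
    printed.push_back(global_x);   // "print x" inside foo
}

// By value/result: copy in on entry, copy out on return.
void foo_value_result(int& actual, std::vector<int>& printed) {
    int y = actual;                // copy in
    y = 3;
    printed.push_back(global_x);   // "print x" inside foo: not yet updated
    actual = y;                    // copy out
}

// Runs the program from the pseudocode and returns what it would print.
std::vector<int> run(bool by_reference) {
    std::vector<int> printed;
    global_x = 2;
    if (by_reference) {
        foo_reference(global_x, printed);
    } else {
        foo_value_result(global_x, printed);
    }
    printed.push_back(global_x);   // "print x" after the call
    return printed;
}
```

run(true) collects 3 and 3, while run(false) collects 2 and then 3, matching the two cases on the slide.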
References in C++ • Reference parameters are specified by preceding their name with an ampersand in the header of the function:
    void swap(int &a, int &b) { int t = a; a = b; b = t; }
• In the code of this swap routine, a and b are ints, not pointers to ints • No dereferencing is required • The caller passes as arguments the variables whose values are to be swapped, rather than passing their addresses • A C++ parameter can be declared const to ensure that it is not modified • For large types, const reference parameters provide the same combination of speed and safety as READONLY
References in C++ • Any variable can be declared to be a reference
    int i;
    int &j = i;
    …
    i = 2;
    j = 3;
    cout << i;    // prints 3
• Here j is a reference to (an alias for) i • The initializer in the declaration is required; it identifies the object for which j is an alias • It is not possible to change the object to which j refers; it will always refer to i
References in C++ • Uses of references in C++ • Parameters (pass by reference) • Function returns • Some objects (file buffers, for example) do not support a copy operation, and therefore cannot be passed or returned by value • It is possible to return a pointer instead, but the subsequent dereferencing operations can be cumbersome • The overloaded << and >> operators return a reference to their first argument, which can in turn be passed to subsequent << or >> operations
    cout << a << b << c;
is short for
    ((cout.operator<<(a)).operator<<(b)).operator<<(c);
References in C++ • Without references, << and >> would have to return a pointer to their stream:
    ((cout.operator<<(a))->operator<<(b))->operator<<(c);
or
    (*(*(cout.operator<<(a))).operator<<(b)).operator<<(c);
• This change would spoil the cascading syntax of the operator form
    *(*(cout << a) << b) << c;
• Algol 68 also provided the capability of returning references from functions, which is useful for operator overloading
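The cascading style depends only on each operator returning a reference to its left operand. A minimal sketch with a hypothetical Log class (not a standard library type) that accumulates text instead of writing to a stream:

```cpp
#include <string>

// Hypothetical stream-like class for illustration.
class Log {
    std::string text_;

public:
    // Returning *this by reference lets the result be the left operand
    // of the next << without copying the Log or dereferencing a pointer.
    Log& operator<<(const std::string& s) {
        text_ += s;
        return *this;
    }

    Log& operator<<(int n) {
        text_ += std::to_string(n);
        return *this;
    }

    const std::string& text() const { return text_; }
};
```

With this class, out << "a" << 1 << "b" appends all three items to the same Log object, exactly as the cascaded cout expression does.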
Closures as Parameters • A closure (a reference to a subroutine, together with its referencing environment) may be passed as a parameter • The parameter is declared to be a subroutine (sometimes called a formal subroutine)
    procedure apply_to_A(function f(n : integer) : integer;
                         var A : array [low..high : integer] of integer);
    var i : integer;
    begin
        for i := low to high do A[i] := f(A[i]);
    end;
• Early versions of Pascal did not include the full header of the subroutine parameter in the header of the routine, which made strict type checking difficult or impossible
Closures as Parameters • Fortran 77 allows a subroutine to be passed as a parameter but cannot check statically for consistent use • Fortran 90 allows (but does not require) the programmer to specify the parameter’s interface • Several languages provide first-class subroutine types, supporting not only subroutine parameters, but also subroutine variables • In Modula-2
    TYPE int_to_int = PROCEDURE(INTEGER) : INTEGER;
    PROCEDURE apply_to_A(f : int_to_int; A : ARRAY OF INTEGER);
    VAR i : CARDINAL;    (* unsigned integer *)
    BEGIN
        FOR i := 0 TO HIGH(A) DO
            A[i] := f(A[i]);
        END;
    END apply_to_A;
Closures as Parameters • C/C++ support pointers to subroutines, both as parameters and as variables
    void apply_to_A(int (*f)(int), int A[], int A_size) {
        int i;
        for (i = 0; i < A_size; i++) A[i] = f(A[i]);
    }
• In C, a function name used as a value is automatically converted to a pointer to that function, and a function pointer such as f need not be dereferenced explicitly to call it
Closures as Parameters • Functional languages • Subroutines are often passed as parameters
    (define apply-to-L
      (lambda (f l)
        (if (null? l) '()
            (cons (f (car l)) (apply-to-L f (cdr l))))))
• Since Scheme (or Lisp) is not statically typed, f’s type need not be specified
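A plain C function pointer carries no referencing environment, so it is not a full closure. In C++ (C++11 and later) a lambda that captures variables is one. The sketch below reuses the name apply_to_A from the slides but with a different, hypothetical signature based on std::function and std::vector; make_adder is also a hypothetical helper.

```cpp
#include <functional>
#include <vector>

// Applies f to every element of A in place.
void apply_to_A(const std::function<int(int)>& f, std::vector<int>& A) {
    for (int& element : A) {
        element = f(element);
    }
}

// Returns a closure: the code "n + delta" together with the captured
// value of delta from the environment in which the lambda was created.
std::function<int(int)> make_adder(int delta) {
    return [delta](int n) { return n + delta; };
}
```

Calling apply_to_A(make_adder(10), A) adds 10 to each element; the value 10 travels with the function even though make_adder has already returned.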
Special Purpose Parameters (8.3.3) • Conformant Arrays (Open Arrays) • A formal array parameter whose shape is finalized at run time (in a language that usually determines shape at compile time) • Modula-2
    TYPE int_to_int = PROCEDURE(INTEGER) : INTEGER;
    PROCEDURE apply_to_A(f : int_to_int; A : ARRAY OF INTEGER);
    VAR i : CARDINAL;    (* unsigned integer *)
    BEGIN
        FOR i := 0 TO HIGH(A) DO
            A[i] := f(A[i]);
        END;
    END apply_to_A;
• C/C++
    void apply_to_A(int (*f)(int), int A[], int A_size) {
        int i;
        for (i = 0; i < A_size; i++) A[i] = f(A[i]);
    }
Special Purpose Parameters • Default (Optional) Parameters • The principal use of dynamic scope is to change the default behavior of a subroutine, which can also be accomplished with default parameters • A default parameter is one that need not be supplied by the caller • If the caller omits the argument, a preestablished default value is used instead
Special Purpose Parameters • One common use of default parameters is in I/O library routines • In Ada, the put routine for integers has the following declaration in the text_IO library package
    type field is integer range 0..integer'last;
    type number_base is integer range 2..16;
    default_width : field := integer'width;
    default_base : number_base := 10;
    procedure put(item : in integer;
                  width : in field := default_width;
                  base : in number_base := default_base);
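C++ also supports default parameters, so a rough analogue of put can be sketched. This is not the Ada routine: it returns the formatted text instead of printing it, it omits Ada's base prefix notation, and the default width of 1 is an arbitrary assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical analogue of Ada's put: formats 'item' in the given base
// (2..16), right-justified in a field of at least 'width' characters.
std::string put(int item, std::size_t width = 1, unsigned base = 10) {
    const char digits[] = "0123456789ABCDEF";
    const bool negative = item < 0;
    // Use unsigned arithmetic so the most negative int does not overflow.
    unsigned magnitude = negative ? 0u - static_cast<unsigned>(item)
                                  : static_cast<unsigned>(item);
    std::string text;
    do {
        text.insert(text.begin(), digits[magnitude % base]);
        magnitude /= base;
    } while (magnitude != 0);
    if (negative) {
        text.insert(text.begin(), '-');
    }
    if (text.size() < width) {
        text = std::string(width - text.size(), ' ') + text;
    }
    return text;
}
```

The caller may omit trailing arguments: put(37) uses both defaults, put(37, 5) overrides only the width, and put(37, 1, 8) gives the octal form. Because C++ parameters are positional, the base cannot be overridden without also supplying the width.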
Special Purpose Parameters • Named Parameters • So far, we have been assuming that parameters are positional: the first actual parameter corresponds to the first formal parameter, the second actual to the second formal, and so on • In some languages, such as Ada, Common Lisp, Fortran 90, Modula-3, and Python, parameters can be named
    put(item => 37, base => 8);
    put(base => 8, item => 37);    -- order doesn't matter
Special Purpose Parameters • Named parameter notation has the advantage of documenting the purpose of each parameter
    format_page(columns => 2,
        window_height => 400, window_width => 200,
        header_font => Helvetica, body_font => Times,
        title_font => Times_Bold, header_point_size => 10,
        body_point_size => 11, title_point_size => 13,
        justification => true, hyphenation => false,
        page_num => 3, paragraph_indent => 18,
        background_color => white);
Special Purpose Parameters • Variable Numbers of Arguments • Lisp, Python, and C allow users to define subroutines that take a variable number of arguments • In C,
    #include <stdarg.h>
    int printf(char *format, ...) {
        va_list args;
        va_start(args, format);
        …
        char *cp = va_arg(args, char *);
        …
        double dp = va_arg(args, double);
        …
        va_end(args);
    }
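A complete, if minimal, use of the same macros is a function that sums its arguments. The name sum_ints and the convention of passing the argument count first are assumptions made for this sketch; the macros themselves give the callee no way to discover how many arguments were passed or what their types are.

```cpp
#include <cstdarg>

// Sums 'count' int arguments that follow the count itself.
int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);             // start after the last named parameter
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);    // the caller must really pass ints
    }
    va_end(args);
    return total;
}
```

For example, sum_ints(3, 1, 2, 3) returns 6. Arguments narrower than int, such as char, are promoted to int when passed through the ellipsis, so they must be fetched with va_arg(args, int).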