Subroutines and Control Abstraction

Subroutines and Control Abstraction

Control Abstraction • Abstraction • associate a name N to a program part P • name describes the purpose or function of P • we can use N instead of P (implementation) • Control abstraction • P is a well-defined operation • Data abstraction • P represents information (often with operations to access & modify that information)

Subroutines • Principal mechanism of control abstraction • subroutine performs some operation on behalf of a caller • caller waits for the subroutine to finish • subroutine may be parameterized • caller passes arguments (actual parameters) • influence the behavior of the subroutine • pass data to operate with • arguments are mapped to subroutine’s formal parameters • subroutine may return a value • functions & procedures

Chapter contents... • Review of the stack layout • Calling sequences • maintaining the stack • static chains & display to access nonlocals • subroutine inlining • closures • implementation examples • Parameter passing • mode determines • how arguments are passed and • how subroutine operations affect them • conformant array parameters, named & default parameters • variable number of arguments • function return mechanisms

...Chapter contents • Generic subroutines and modules • Exception handling • mechanism to ‘pop out’ of a nested context without returning • recovery happens in the calling context • Coroutines • a control abstraction other than a subroutine • useful for iterators, simulation, server programs

Review of stack layout • Stack frame / activation record • arguments, return values • bookkeeping information • return address • saved register values • local variables, temporaries • static part • variable-sized part

Accessing stack data • Hardware support • stack pointer register SP: top of the stack • frame pointer register FP: address within the current frame • Objects of the frame are accessed • using FP and a static displacement (offset) or • using address & dope vector (which are in the static part) • no variable-sized objects  all objects have a static offset  one can use SP instead of FP (and save one register) • Variable-sized arguments? • method 1 • store below current frame • use address/dope vector in argument area • method 2 • pass just the address (to caller object) & dope vector • copy object to ‘normal’ variable-sized area when subroutine is entered

Nested routines & static scoping • Pascal, Modula, Ada • Note: many voices against the need of these • development of object-oriented programming • C works just fine without them • Accessing non-local objects • maintain static chain in frames (next slide) • each stack frame contains a static link • reference to the frame of the last activation of the lexically enclosing subroutine • by analogy, saved value of FP = dynamic link • reference to the frame of the caller • used to reclaim stack data • may (or may not) be the same as the static link

Example • Figure slide -1 • Call from a lexically surrounding routine • C is called from B • we know that B must be active (and has a frame in the stack) • How else can C be called? • C gets visible only when control enters B •  C is visible only from B & D (and routines declared in C & D) •  whatever routine P calls C, it must have a frame in the stack

Display • Static chains may (in theory) be long • accessing an object k levels out requires the dereferencing of k static links •  k+1 memory accesses • Display / display table • static chain embedded into an array • an entry for each lexical depth of the program • Display[j] = FP of the last activation of a routine declared at depth j • Using display • caller nested i levels deep • object nested k levels out of caller • take frame Display[i-k] & use (compile-time constant) offset

Display or static chain? • Most programs are only 2 or 3 levels deep • static chains are short • If a non-local object X is used often • address calculation (arithmetic expression) of the frame of X, say FX, appears often in the code  common subexpression optimization automatically loads FX to some register  dereferencing is done only once • Cost of maintaining display • slightly higher than maintaining static links • Closures • easy to represent with static links • whole display has to be copied (if used) • some optimizations are possible • compilers that use a display have a limit on the depth of nesting

Calling sequences... • Maintenance of the call stack • code immediately before & after a call • prologue: code at the beginning of a subroutine • epilogue: code at the end of a subroutine • all above = ‘calling sequence’ • Tasks to do ‘on the way in’ • pass parameters • save return address • update program counter • update stack pointer (to allocate space) • save registers (including the frame pointer) • only those that are important and may be overwritten by the called routine (= callee) • update frame pointer (to point to the new frame) • initialize local data objects

...Calling sequences • Tasks to do ‘on the way out’ • pass return parameters & function values • finalize local objects • deallocate frame (restore SP) • restore other saved registers • restore program counter (PC) • Division of the labor • some tasks can be done only by the caller • passing parameters in general, things that may be different for different calls • most can be done by either one • the more work in the callee the less space we need for the code

Saving registers • Ideally, save only those that • are used by the caller and • are overwritten by the callee • hard to track in separate compilation • Simpler solution • caller saves all registers that are in use, or • callee saves all registers it will overwrite • Compromise • divide (data) registers into ‘caller-saves’ & ‘callee saves’ • callee can assume there is nothing of interest in caller-saves • caller can assume no callee destroys callee-saves registers • compiler allocates • callee-saves registers for long-term data • caller-saves for temporary data •  caller-saves are seldom saved at all (caller knows that they contain junk)

Maintaining static chain • Caller’s responsibility • links depend on the lexical nesting depth of the caller • Standard approach • compute the static link of the callee • callee directly inside caller: own FP • callee is k levels outward: follow k static links • pass it as an extra parameter • Maintaining displays • callee at level j  • save Display[j] in the stack • replace Display[j] with callee’s FP • why does it work: page 433 • Leaf routines • routines that make no subroutine calls • no need to update Display for these

Implementing closures • Display scheme breaks for closures • use 2 entry points for each subroutine • normal call • via closure call • save Display[1..j] into stack • replace those with ones stored in the closure • separate return code for closure calls • restore Display[1..j]

Cost of maintenance • Static chains • call: k >= 0 load instructions, 1 store • return: no extra operations • Display • 1 load & 1 store in prologue • 1 load & 1 store in epilogue • No work for leaf routines

Case study: C on MIPS • Hardware support • ra: register containing return address • jal: (jump and link) sets ra • sp, fp • Notes • a simple language on a simple machine • all stack object sizes known  • separate fp is not strictly needed (sp suffices) • GNU gcc uses it anyway • uniformity (gcc is highly portable) • makes it possible to allocate space dynamically from the stack (alloca library function)

Stack frame... • Example of a stack frame (slide +1) • Argument passing • assembled at the top of the frame (using sp) • build area is large enough to hold the largest argument list • no need to ‘push’ in the traditional sense (space is already allocated and sp does not change) • optimization • first 4 scalar arguments are passed in machine registers • space is reserved in stack for all arguments • register arguments are saved to stack if needed • e.g. we must pass a pointer to the argument

...Stack frame • Allocate temporary space from stack •  sp grows, fp stays • C: alloca library routine • fast to implement, automatic reclaiming • see slide +1 • Languages with nested subroutines • not C • use some register to pass the static link (r2)

Returning values • Scalar values (and pointers) • use some register (r2, f0) • Structures • store to a given memory address • address is passed in a ‘hidden’ register parameter (r4) • possible cases • x = foo(...)  address of x is passed to foo • p(...,foo(...),...) • pass an address to the build area (argument list of p) • overwrites arguments of foo but they are not in use when returning from foo • x = foo(...).a + y  use temporary variables

gcc calling sequence... • Caller • save “caller-save” registers in temporary variables • only those whose value is still needed after the call • put (up to 4) scalar arguments into registers • put remaining arguments (if any) into the build area • perform jal instruction (sets ra & jumps) • Callee (prologue) • subtract frame size from sp (stack grows downwards) • note: argument list belongs to the caller’s frame • save fp, ra (if not a leaf routine) • save callee-save registers • only those whose values may change before returning

...gcc calling sequence • Callee (epilogue) • place return value (r2, f0, memory address) • copy fp into sp (deallocate alloca space) • restore saved registers (using sp) • add frame size to sp (deallocate frame) • jump to ra • Caller (at return) • move return values to wherever needed • caller-save registers are restored lazily • when values are needed for the first time

Optimizations & debugging • Optimizations • many parts of the calling sequence can be omitted • e.g. no caller-saves • many leaf routines do not use the stack at all • everything happens inside registers • Debugger support • compiler places information in the symbol table • starting & ending address of routines • size of the frame • whether sp or fp is used for object access • which register holds return address • which registers are saved (callee-saves)

Inline expansion • Alternative to stack-based calling • Expand routine body at the place of the call • avoids various overheads • space allocation • branching to and from subroutine • saving and restoring registers (not always) • code improvement possible ‘over routine boundaries’ • Language design • compiler chooses which calls to expand • C++: keyword inline • only a suggestion to the compiler • Ada: compilation pragmas • pragma inline

Inline expansion & macros • In-line expansion • just an implementation technique • semantic of the program is not touched • Macros • side-effects in arguments are evaluated at each argument occurrence #define MAX(a,b) ((a) > (b) ? (a) : (b)) MAX (x++, y++) • only expressions can be used to ‘return’ values •  no loops etc can be used in macros

Inline expansion: discussion • Code speed increases •  programmers can use good programming style and still get good performance • e.g. class member functions to access/update instance data •  inline expansion may be a necessity for o-o languages • at least if we want programmers to write good programs • Code size increases • Recursive routines? • expand once • filter out the first special cases • nested calls are compiled ‘normally’ • example case: hash table lookup • most chains in a table are only one element long • nested call is often avoided

Parameter passing • Use of subroutine parameters • control behavior • provide data to operate on • parameters make subroutines more abstract • Formal parameters • names in the declaration of a subroutine • Actual parameters, arguments • variables & expressions in subroutine calls

Section contents • Parameter-passing modes • values, references & closures • Additional mechanisms • conformant array parameters • missing & default parameters • named parameters • variable-length argument lists • Returning values (from functions)

Subroutine call notation • Prefix • most commonly used: p(a,b,c) • Lisp: (p a b c) • Infix • functions specified to be ‘operators’ infixr 8 tothe; (* exponentiation *) fun x tothe 0 = 1.0 | x tothe n = x * (x tothe (n-1)); (* assume n>= 0 *) • Mixfix • Smalltalk: arguments & function name interleaved • Note on uniformity • Lisp & Smalltalk functions are like ‘standard’ control structures •  more natural to introduce ‘own’ control abstractions if a > b then max := a else max := b; (* Pascal *) (if (> a b) (setf max a) (setf max b)) ; Lisp (a > b) ifTrue: [max <_a] ifFalse: [max<- b]. ”Smalltalk”

Parameter modes • Semantic rules governing parameter passing • determine relationship between the actual & formal parameters • single set of rules that apply to all parameters • C, Fortran, ML, Lisp • two or more sets of rules • each corresponding to some mode • Pascal, Modulas, Ada • heavily influenced by implementation issues

Call by value & call by reference • Example: global x, call p(x) • what is passed to p? • call by value: a copy of x • x & the copy are independent of each other • call by reference: the address of x • formal parameter is an alias for x • most languages require that ‘x’ must have an l-value • Fortran 90 makes one if it doesn’t

Value model languages • Pascal • by value: default mode • by reference • use keyword VAR in the formal parameter list • C • all parameters are passed by value • exception: array is passed as a pointer aliases must be created explicitly • declare a formal pointer parameter use address operator on the actual parameter void swap (int *a, int *b) {int t = *a, *a = *b, *b = t;} ... swap (&v1, &v2); • Fortran • all variables are passed by reference

Reference model languages • Actual parameter is already a reference •  sensible to define only a single passing mode • Clu: call by sharing • actual & formal parameter refer to the same object • Implementation • as an address  pass the address • as a value (immutable objects)  copy value • Java: primitive values are copied, class objects shared

Function returns • restrictions on the types of objects that can be returned • Algol 60, Fortran: a scalar value • Pascal, early Modula-2: scalar or pointer • Algol 68, Ada, C, some Pascal: composite type values • Modula-3, Ada 95: a subroutine, implemented as a closure • Lisp, ML: closure

Function return syntax • Lisp, ML, Algol 68 • no distinction between statements and expressions • value of the function is the value of its body • Algol 60, Fortran, Pascal function := expression • problems with nested declarations • more recent languages return expression • immediate termination of the subroutine • if the function still has something to do, then place the return value into a temporary variable rtn := expression ... return rtn

...Function return • Fortran function name := ... return • Ada example • return expression • SR example • result of a function has its own name

Subroutines and Control Abstraction