CSC 3210 Computer Organization and Programming. Chapter 7: Subroutines. D.M. Rasanjalee Himali.
Outline • Introduction • Open Subroutines • Register Saving • Subroutine Linkage • Arguments to Subroutines • Examples • Leaf Subroutines • Pointers as Arguments to Subroutines
Introduction • In programming there is frequently a need either to repeat a computation or to repeat the computation with different arguments. • It is possible to repeat a computation by means of a subroutine. • Subroutines may be either open or closed.
Introduction • An open subroutine is handled by the text editor or by the macro preprocessor: the required code is inserted wherever it is needed in the program. • A closed subroutine is one in which the code appears only once in the program; • whenever it is needed, a jump to the code is executed, and • when it completes, a return is made to the instruction occurring after the jump instruction. • Arguments to closed subroutines may be placed in registers or on the stack.
Introduction • Execution of the subroutine should not change the state of the machine, except possibly for the condition codes. • That is, any registers that the subroutine uses must first be saved and then restored after the subroutine completes execution. • Arguments to subroutines are normally local variables of the subroutine, and generally, the subroutine is free to change them.
Register Saving • Almost any computation will involve the use of registers. • The SPARC architecture provides for a register file with a mapping register that indicates the active registers. • Typically, 128 registers are provided, with the programmer having access to the eight global registers, and only 24 of the mapped registers at any one time. • The save instruction changes the register mapping so that new registers are provided. • A similar instruction, restore, restores the register mapping on subroutine return.
Register Saving • save: 1. Reserve 24 new registers (8 in | 8 local | 8 out); the 8 global registers are common to all subroutines. 2. Reserve stack memory (96 bytes in this case). • restore: 1. Release the reserved registers. 2. Release the stack memory (96 bytes in this case). • Register file: each register set is 16 registers, so 8 * 16 = 128 registers, plus the 8 global registers.
S1() {
    save %sp, -96, %sp
    ----
    S2()
    ----
    restore
}
S2() {
    save %sp, -96, %sp
    ----
    restore
}
[Figure sequence: memory and register file while S1 (save %sp, -96, %sp) calls S2 (save %sp, -64, %sp). Before execution, %sp marks the top of the stack. After the save in S1, CWP selects a new register set, a 96-byte frame lies between %sp and %fp. After the save in S2, a further 64-byte frame is added below it and %fp marks the start of the 96-byte frame of S1. Each restore releases the frame and register set again, ending in the initial state.]
Register Saving • The 32 registers are divided into four groups: • in, local, out, and global • The eight global registers, %g0 to %g7, are NOT mapped and are global to all subroutines. • The in registers hold the arguments passed to a closed subroutine, • The local registers are for a subroutine’s local variables, • The out registers are used to pass arguments to subroutines that are called by the current subroutine. • The in, local, and out registers are mapped.
Register Saving • When the save instruction is executed • the out registers become the in registers, and • a new set of local and out registers is provided. • The mapping pointer into the register file is changed by 16 registers
Register Saving • The current register set is indicated by the current window pointer, “CWP,” a machine register. • The last free register set is marked by the window invalid bit in the “WIM,” another machine register. • Each register set contains 16 registers; • the number of register sets is implementation dependent. • Here there are 8 x 16 hardware registers, and the set selected is controlled by the cwp. • When the save instruction is executed, the prior subroutine’s register contents remain unchanged until a restore instruction is executed, resetting the cwp.
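The overlap between one subroutine's out registers and the next subroutine's in registers can be sketched as a small C model. This is only an illustration, not SPARC code: the names RegFile, reg, save_window and restore_window are invented here, eight windows are assumed to match the 8 x 16 figure, the cwp is assumed to step down by one on save, and window traps and the hardwired zero in %g0 are left out.

```c
#include <assert.h>
#include <stdint.h>

#define NWINDOWS 8   /* assumed, matching the 8 x 16 register file above */

typedef struct {
    uint32_t globals[8];                /* %g0-%g7, shared by all windows */
    uint32_t windowed[NWINDOWS * 16];   /* 8 ins + 8 locals per window    */
    int cwp;                            /* current window pointer         */
} RegFile;

/* Pointer to register r as seen from the current window.
   r 0-7 = %g0-%g7, 8-15 = %o0-%o7, 16-23 = %l0-%l7, 24-31 = %i0-%i7. */
uint32_t *reg(RegFile *rf, int r)
{
    assert(r >= 0 && r < 32);
    if (r < 8)
        return &rf->globals[r];
    if (r < 16) {
        /* The outs are physically the ins of the window a save selects. */
        int next = (rf->cwp + NWINDOWS - 1) % NWINDOWS;
        return &rf->windowed[next * 16 + (r - 8)];
    }
    if (r < 24)
        return &rf->windowed[rf->cwp * 16 + 8 + (r - 16)];
    return &rf->windowed[rf->cwp * 16 + (r - 24)];
}

/* save: move the mapping by one register set (16 registers). */
void save_window(RegFile *rf)
{
    rf->cwp = (rf->cwp + NWINDOWS - 1) % NWINDOWS;
}

/* restore: move the mapping back to the caller's register set. */
void restore_window(RegFile *rf)
{
    rf->cwp = (rf->cwp + 1) % NWINDOWS;
}
```

Writing the caller's %o6 (register 14), calling save_window, and then reading %i6 (register 30) yields the same value, while the caller's local registers are untouched until restore_window makes them visible again.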
Register Saving • If a further five subroutine calls are made without any returns, the situation in Figure 7.3 exists. • The out registers being used are from the invalid register window marked by the wim bit. • [Figure panels: one additional subroutine call; after a further 5 subroutine calls without return; after a further 6 subroutine calls without return (hardware trap).]
Register Saving • If more than five additional subroutine calls are made, a hardware trap occurs. • Its effect is to move the 16 registers from window set seven onto the stack where the stack pointer of register window seven is pointing. • The trap handler may use the local registers of the invalid window. • The cwp and wim pointers are moved as shown in Figure 7.2.
Register Saving • Register window mapping explains the process by which the stack pointer becomes the frame pointer. • The stack pointer is register %o6, which, after a save, becomes %i6, the frame pointer.
Register Saving • The save and restore instructions are both also add instructions. • However, the source registers are always from the current register set, and the destination register is always in the new register set. • Thus the instruction save %sp, -64, %sp subtracts 64 (4 bytes per register * 16 registers per register set) from the current stack pointer but stores the result into the new stack pointer, leaving the old stack pointer contents unchanged. • After the save instruction is executed, the old, unchanged stack pointer becomes the new frame pointer.
Register Saving • The restore instruction restores the register window set. • In doing so, a register window can underflow if the cwp is moved to the window marked by the wim. • When this happens the window trap routine restores the registers from the stack and resets the pointers. • The restore instruction is also an add instruction and is frequently used as the final add instruction in a subroutine.
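The way a save acts as an add across two register sets can be modelled in a few lines of C. This is a sketch under stated assumptions, not SPARC code: the Window struct and save_frame are hypothetical names, only %sp and %fp are tracked, and the alignment check reflects the convention that stack frames stay double-word aligned.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t sp;   /* %o6 in this register set */
    uint32_t fp;   /* %i6 in this register set */
} Window;

/* Model of "save %sp, imm, %sp": the sum is formed from the caller's %sp
   and written to the callee's %sp. The caller's %sp is not modified and
   is what the callee sees as %fp. */
Window save_frame(const Window *caller, int32_t imm)
{
    assert((imm & 7) == 0);   /* frames are kept double-word aligned */
    Window callee;
    callee.fp = caller->sp;                   /* old %o6 is now %i6 */
    callee.sp = caller->sp + (uint32_t)imm;   /* the add done by save */
    return callee;
}
```

With a caller whose %sp is 0x2000 and an immediate of -64, the callee gets %sp = 0x1FC0 and %fp = 0x2000, and the caller's own %sp still reads 0x2000.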
Subroutine Linkage • To branch to the first instruction of a subroutine, a ba instruction might be used. • Unfortunately, if it is used there is no way of returning to the point where the subroutine was called. • The SPARC architecture supports two instructions for linking to subroutines: jmpl and call. • Both instructions may be used to store the address of the instruction that called the subroutine into register %o7. • Question: What is the return address of the subroutine with no save instruction executed at the beginning? • Question: What is the return address of the subroutine with a save instruction executed at the beginning?
Subroutine Linkage • As the instruction following the instruction that called the subroutine will also be executed, the return from a subroutine is to %o7 + 8, which is the address of the next instruction to be executed in the main program. • If a save instruction is executed at the beginning of the subroutine, the contents of %o7 will become the contents of %i7 and the return will have to be to %i7 + 8.
Subroutine Linkage • If the subroutine name is known at assembly time, the call instruction may be used to link to a subroutine. • The call instruction has as operand the label at the entry to the subroutine and transfers control to that address. • It also stores the current value of the program counter, %pc, into %o7. • Like any instruction that changes the %pc, the call instruction is always followed by a delay slot instruction. The delay slot instruction of a call may not be annulled.
Subroutine Linkage • If the address of the subroutine is computed, it must be loaded into a register. • If this is done, the jmpl instruction is used to call the subroutine. • Like most other instructions, the jmpl instruction has two source arguments and a destination register. • The source may be a register and a constant or two registers. • The address of the subroutine is the sum of the register contents or the sum of the register and the constant. • It is this address to which the transfer takes place.
Subroutine Linkage • Like all branching instructions, jmpl is followed by a delay slot instruction. • The address of the jmpl instruction is stored in the destination register. • Thus, to call a subroutine whose address is in register %o0, storing the return address into %o7, we would write: jmpl %o0, %o7 (%o0 holds the destination address in the called subroutine; %o7 receives the return address in the calling subroutine). • The assembler recognizes call %o0 as jmpl %o0, %o7. • You may use call for both types of subroutine calls.
Subroutine Linkage • The return from a subroutine also makes use of the jmpl instruction. • In this case we need to return to %i7 + 8, and the assembler recognizes the mnemonic ret for: jmpl %i7 + 8, %g0. • The call to a subroutine is then a call instruction followed by its delay slot instruction. • At the entry of the subroutine a save instruction is executed, and the return is a ret followed by a restore in its delay slot. • Note: the save instruction in the called subroutine causes %o6 in the calling subroutine to map to %i6 in the called subroutine.
Subroutine Linkage • The ret instruction is expanded by the assembler to jmpl %i7 + 8, %g0. • The restore instruction: • is normally used to fill the delay slot of the ret instruction, • restores the register window set, and • can be the final add instruction in a subroutine.
Arguments to Subroutines • Arguments to subroutines can • follow in-line after the call instruction, • be on the stack, or • be located in registers.
Arguments to Subroutines • 1. Arguments follow in-line after the call instruction: • For example, a Fortran routine to add two numbers, 3 and 4, together would be called with the two values stored as words directly after the call instruction and its delay slot, and handled by a subroutine that, after its save, loads them from %i7 + 8 and %i7 + 12. • Note that the return is to %i7 + 16, jumping over the arguments. • This type of argument passing is very efficient but is limited. Recursive calls are not possible, nor is it possible to compute any of the arguments.
Arguments to Subroutines • 2. Arguments placed on the stack: • Placing arguments onto the stack is very general but time consuming: each argument must be stored on the stack before the subroutine may be called. • It allows us complete flexibility to compute arguments, pass any number of arguments, and support recursive calls.
Arguments to Subroutines • 3. Arguments placed in registers: • We can use registers %o0 to %o5 (6 registers) to pass six values to the new subroutine (where they will appear in registers %i0 to %i5). • Any further arguments have to be stored on the stack, so the save instruction at the start of the function has to reserve space accordingly. • After execution of a save instruction the arguments will be in the first six in registers, %i0-%i5.
Arguments to Subroutines • The convention established in the SPARC architecture is to pass the first six arguments in the first six out registers, %o0-%o5, with any additional arguments placed on the stack. • However, space is always reserved for the first six arguments on the stack even though they are not there (similar to the space reserved for register saving). • In fact, the space is reserved even if there are no arguments at all. • Each argument occupies ONE WORD on the stack or in a register, so byte arguments must be moved into word quantities before they are passed to a subroutine.
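The address arithmetic behind in-line arguments can be checked with a short C sketch. It is a model only: the function names are invented for illustration, and call_addr stands for the address of the call instruction as stored in %o7 (seen as %i7 after a save).

```c
#include <assert.h>
#include <stdint.h>

/* Address of the k-th in-line argument word (k counts from 0).
   The call occupies 4 bytes and its delay slot another 4. */
uint32_t inline_arg_address(uint32_t call_addr, unsigned k)
{
    assert((call_addr & 3) == 0);   /* instructions are word aligned */
    return call_addr + 8 + 4 * k;
}

/* Address to return to when n argument words follow the delay slot. */
uint32_t inline_return_address(uint32_t call_addr, unsigned n)
{
    assert((call_addr & 3) == 0);
    return call_addr + 8 + 4 * n;
}
```

For a call at 0x1000 with two in-line words, the arguments sit at 0x1008 and 0x100C and the return goes to 0x1010, which is the %i7 + 16 mentioned above; with no in-line words the result is the usual %i7 + 8.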
Arguments to Subroutines • The arguments are located on the stack, after the 64 bytes reserved for register window saving. • However, immediately after the 64 bytes reserved for register window saving, there is a pointer to where a structure may be returned (this is discussed in Section 7.7). • Thus, the structure return pointer will be at %sp + 64 and the first argument, if it were on the stack, at %sp + 68.
Arguments to Subroutines • As we have seen in the previous examples, 64 bytes are reserved on the stack for register window saving. • A further 4 bytes are now needed for a pointer to an address where a structure may be returned by the function. • After that, 24 bytes are reserved by convention for the first six arguments. • After that, more space can be reserved for local variables on the stack. • The typical save instruction therefore has to reserve at least 64 + 4 + 24 = 92 bytes, rounded so that the stack stays double-word aligned.
Arguments to Subroutines • The save instruction provides: • Space for saving the register window set, if necessary • A structure pointer • A place to save 6 arguments • Space for any local variables • while keeping the %sp aligned on a double-word boundary.
Arguments to Subroutines • If we had a subroutine vector with local variables: • then the save instruction would be: • resulting in 104 bytes being subtracted from the stack pointer.
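The size passed to save can be worked out mechanically. The C sketch below is an illustration of the arithmetic only; save_operand and the enum names are hypothetical, and the expression mirrors the (-92 + -locals) & -8 form used in the example subroutine later in this chapter.

```c
#include <assert.h>

enum {
    WINDOW_SAVE_BYTES = 64,   /* 16 registers x 4 bytes        */
    STRUCT_PTR_BYTES  = 4,    /* structure return pointer      */
    ARG_SAVE_BYTES    = 24    /* six arguments x 4 bytes       */
};

/* Value added to %sp by save: the negated frame size, rounded so that
   %sp stays double-word aligned. */
int save_operand(int local_bytes)
{
    assert(local_bytes >= 0);
    int fixed = WINDOW_SAVE_BYTES + STRUCT_PTR_BYTES + ARG_SAVE_BYTES; /* 92 */
    return (-fixed - local_bytes) & -8;
}
```

With no local variables the operand is -96. Any local area of 5 to 12 bytes gives -104, the figure quoted above; the 12 used in the check is just one such value, not the size from the original slide.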
Arguments to Subroutines • Structure pointer and space to save the called routine’s arguments are all accessed positively with respect to the stack pointer • The subroutine’s arguments are located positively with respect to the frame pointer. • Local variables are accessed negatively with respect to the frame pointer.
Arguments to Subroutines • The argument offsets are logically defined as follows. • Notice the positive offsets (w.r.t. %sp)!
define(arg1_s, 68)
define(arg2_s, 72)
define(arg3_s, 76)
define(arg4_s, 80)
define(arg5_s, 84)
define(arg6_s, 88)
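The offsets follow one rule: skip the 64-byte window save area and the 4-byte structure pointer, then step one word per argument. A small C helper (arg_offset is a name made up for this sketch) reproduces the table and extends it past the sixth argument.

```c
#include <assert.h>

/* Stack offset of argument n (counting from 1), relative to %sp in the
   caller or to %fp in the callee after its save. */
int arg_offset(int n)
{
    assert(n >= 1);
    return 64 + 4 + 4 * (n - 1);   /* save area, structure pointer, earlier arguments */
}
```

Arguments 1 and 6 land at 68 and 88, matching the definitions above, and arguments 7 and 8 land at 92 and 96.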
Example – Called Subroutine • Let us look at an example. We will express the algorithm in C as follows :
!incoming arguments
define(a_r, i0)
define(b_r, i1)
define(c_r, i2)
!automatic variables
define(x_s, -4)
define(y_s, -8)
define(ary_s, -264)
!register variables
define(i_r, l0)
define(j_r, l1)

        .global example
example: save   %sp, (-92 + -264) & -8, %sp
        add     %a_r, %b_r, %o0         !x = a + b
        st      %o0, [%fp + x_s]
        add     %c_r, 64, %i_r          !i = c + 64
        add     %a_r, %c_r, %o0         !ary[i] = c + a
        sll     %i_r, 1, %o1            !byte offset of halfword i
        add     %fp, ary_s, %o2         !address of ary[0]
        sth     %o0, [%o1 + %o2]
        ld      [%fp + x_s], %o0        !y = x * a
        call    .mul
        mov     %a_r, %o1               !delay slot: second operand
        st      %o0, [%fp + y_s]
        ld      [%fp + x_s], %o0        !j = x + i
        add     %i_r, %o0, %j_r
        ld      [%fp + x_s], %o0        !return x + y
        ld      [%fp + y_s], %o1
end_example:
        ret
        restore %o0, %o1, %o0           !result in caller's o0
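The C source for this example did not survive on the slide. Reading the assembly back suggests something like the following; treat it as a reconstruction, not the original listing. The parameter types are assumed to be int, the array length of 128 is inferred from the 256 bytes between ary_s and y_s together with the halfword store, and the bounds check is an addition that the assembly does not perform.

```c
#include <assert.h>

int example(int a, int b, int c)
{
    int x, y;             /* x_s = -4, y_s = -8                     */
    short ary[128];       /* ary_s = -264, 128 halfwords            */
    register int i, j;    /* i_r = %l0, j_r = %l1                   */

    x = a + b;
    i = c + 64;
    assert(i >= 0 && i < 128);   /* guard added for this sketch only */
    ary[i] = (short)(c + a);
    y = x * a;                   /* done by .mul in the assembly     */
    j = x + i;
    (void)j;                     /* computed but never used          */
    (void)ary;
    return x + y;
}
```

Calling example(1, 2, 3) gives x = 3 and y = 3, so the function returns 6.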
Return Values • Subroutines that return a value are called functions. • A function in C and C++ can also return a structure. • The value returned by a function is always returned in register %o0 of the calling program. • If a save instruction has been executed in the called function, the value must be in %i0 of the called function before the restore instruction is executed, since that register is %o0 of the caller.
Subroutines with Many Arguments • Arguments beyond the sixth are passed on the stack. • In this case we must first make room for the arguments by subtracting from the stack pointer. • For example, to call a subroutine with eight arguments: • which returns the sum: • We first have to make room for arguments seven and eight, which will go on the stack making sure that the stack is still double word aligned.
Subroutines with Many Arguments • Calling Subroutine: • The seventh and eighth arguments will go onto the stack at %sp + 92 and at %sp + 96, respectively. We can then pass the arguments as follows. • Notice the positive offsets of the additional arguments w.r.t. %sp.
Calling sub() { -- -- foo(1,2,3,4,5,6,7,8) -- -- }   (two additional arguments: 7 and 8)
        ! Make space for two additional args on stack for foo
        add     %sp, -2*4 & -8, %sp
        ! Store additional args on the stack
        mov     7, %o0                  !load arg 7 with its value
        st      %o0, [%sp + 92]
        mov     8, %o0                  !load arg 8 with its value
        st      %o0, [%sp + 96]
        ! Load first 6 args into the out registers (the in registers of foo)
        mov     6, %o5
        mov     5, %o4
        mov     4, %o3
        mov     3, %o2
        mov     2, %o1
        mov     1, %o0
        ! Call foo subroutine
        call    foo
        nop
        ! Release space on stack reserved for additional args
        sub     %sp, -2*4 & -8, %sp
Subroutines with Many Arguments • Called Subroutine: • Inside foo the arguments may be accessed as follows. • Notice the positive offsets of the additional arguments w.r.t. %fp.
foo(int: a1,a2,a3,a4,a5,a6,a7,a8) { return a1+a2+a3+a4+a5+a6+a7+a8 }
!define incoming argument offsets
define(a1_r, i0)
define(a2_r, i1)
define(a3_r, i2)
define(a4_r, i3)
define(a5_r, i4)
define(a6_r, i5)
define(a7_s, 92)
define(a8_s, 96)

        .global foo
foo:    save    %sp, -96, %sp
        ld      [%fp + a8_s], %o0       !8th argument
        ld      [%fp + a7_s], %o1       !7th argument
        add     %o0, %o1, %o0
        add     %a6_r, %o0, %o0         !6th argument
        add     %a5_r, %o0, %o0         !5th argument
        add     %a4_r, %o0, %o0         !4th argument
        add     %a3_r, %o0, %o0         !3rd argument
        add     %a2_r, %o0, %o0         !2nd argument
        ret
        restore %a1_r, %o0, %o0         !1st argument
[Figure: memory and register file before the call to foo (arguments 1 to 6 in the out registers of the calling subroutine, arguments 7 and 8 at %sp + 92 and %sp + 96) and after the call but before foo returns (arguments 1 to 6 in the in registers of foo, arguments 7 and 8 at %fp + 92 and %fp + 96).]
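The stack adjustment made before the call generalizes to any argument count. The following C sketch models the -2*4 & -8 expression; extra_arg_adjust is a hypothetical helper name, and all arguments are assumed to be one word each.

```c
#include <assert.h>

/* Value added to %sp before a call with nargs word-sized arguments:
   4 bytes for each argument past the sixth, rounded so that %sp stays
   double-word aligned. Zero when six or fewer arguments are passed. */
int extra_arg_adjust(int nargs)
{
    assert(nargs >= 0);
    int extra = nargs > 6 ? nargs - 6 : 0;
    return (-extra * 4) & -8;
}
```

Eight arguments give -8, as in the call to foo; seven arguments also need -8 because of the rounding, and nine need -16.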
Leaf Subroutines • A leaf routine is one that does not call any other routines. • For a leaf routine the register usage is restricted as follows: • The leaf routine may only use the first six out registers and the global registers %g0 and %g1. • A leaf routine does not execute either a save or a restore instruction but simply uses the calling subroutine’s register set, observing the restrictions listed above. • The elimination of register saving and restoring makes calling a leaf routine very efficient. • The .mul routine is a leaf routine.
Leaf Subroutines • A leaf routine is called in the same manner as a regular subroutine, placing the return address into %o7. • As a save instruction is not executed, the return address for a leaf routine is %o7 + 8, not %i7 + 8. • To return from a leaf subroutine, we use the retl instruction. • The assembler recognizes retl for: jmpl %o7 + 8, %g0.
Leaf Subroutines • The subroutine foo should have been written as a leaf routine as follows:
!define incoming argument offsets
define(a1_r, o0)
define(a2_r, o1)
define(a3_r, o2)
define(a4_r, o3)
define(a5_r, o4)
define(a6_r, o5)
define(a7_s, 92)
define(a8_s, 96)

        .global foo
foo:    add     %a2_r, %a1_r, %o0       !o0 = 1st + 2nd
        add     %a3_r, %o0, %o0         !o0 += 3rd
        add     %a4_r, %o0, %o0         !o0 += 4th
        add     %a5_r, %o0, %o0         !o0 += 5th
        add     %a6_r, %o0, %o0         !o0 += 6th
        ld      [%sp + a7_s], %o1
        add     %o1, %o0, %o0           !o0 += 7th
        ld      [%sp + a8_s], %o1
        add     %o1, %o0, %o0           !o0 += 8th
end_foo:
        retl
        nop