Dive into the world of calling conventions and learn how parameters are passed, control is managed, and functions interact with other code and the OS. Gain a deeper understanding of the importance of calling conventions in computer programming.
Programs – Calling Conventions CS/COE 0449 (term 2184) Jarrett Billingsley
Class announcements • hhhiiiiiiiiiiiii • exams back thursdayyyy okayyyyyyy • we're gonna go down the rabbit hole today • stay with me • this stuff will be really useful to know for project 2 CS449 (2184)
Calling Conventions
So I just met you I'm sorry • we talked about the stack, how it's used to support function calls • when you call a function, we push an AR onto the stack • when it returns, its AR is popped • remember what's in the AR? • what about the rest of the machinery? • how do those parameters get in there? • how does the stack pointer get moved around? • how are values returned from the function? • how does control get into/out of the function? • what responsibilities/protocols do functions have to follow to "play nice" with other code and with the OS? • these and other questions are answered by calling conventions
WHYYYYYY???????? • why does all this stuff matter? • everything your computer does is based on every function following this honor system • isn't that reassuring :^) • if anyone messes up the convention, your program crashes • even if you, the programmer, aren't writing assembly… • the compiler has to • the compiler has to know how to call functions
Like I said at the beginning... • CPUs today = C machines • so, they have built-in mechanisms to call and return from functions • call instructions do two things: • save the return address (address of the instruction after the call) • jump to the function being called • MIPS puts the return address in the ra register • x86's call pushes the return address onto the stack • return instructions do the opposite • MIPS uses "jump to register" • x86's ret pops it off the stack, then jumps to it
Passing parameters • here's where things get really wild • how do you pass parameters into functions in MIPS? • the a registers • but there are only 4 • what if you have more than 4 params? • you use the stack!! (maybe you learned that or maybe not) • x86-32 is............................................... ;))))))))))
x86-32 registers • there are 8 registers, but you have 6 "general purpose" registers • hahaha 6 • the other two, esp and ebp, are used to manage the stack • we'll look at the cdecl convention
Parameter passing with cdecl • all parameters are pushed onto the stack in reverse order: the last parameter is pushed first • no really – it makes sense! • what direction does the stack grow? • if we call f(1, 2, 3), a cdecl function call might look like either of these:
    push 3
    push 2
    push 1
    call f
or:
    sub esp, 16
    mov DWORD PTR [esp+8], 3
    mov DWORD PTR [esp+4], 2
    mov DWORD PTR [esp], 1
    call f
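To see why pushing in reverse order "makes sense", here is a minimal sketch in C that models the stack as an array. Everything in it (ToyStack, toy_push, toy_peek, toy_push_args_for_f) is a hypothetical name made up for illustration, not real x86 state or any real API. Because the stack grows toward lower addresses, pushing the last argument first leaves the first argument at [esp].

```c
#include <assert.h>
#include <stdint.h>

/* A toy model of a 32-bit stack. All names here are hypothetical. */
#define STACK_WORDS 16

typedef struct {
    uint32_t mem[STACK_WORDS]; /* one slot per 4-byte word */
    uint32_t esp;              /* byte offset of the top of the stack */
} ToyStack;

/* An empty stack: esp starts just past the highest address. */
void toy_init(ToyStack *s) {
    s->esp = STACK_WORDS * 4;
}

/* push: the stack grows toward lower addresses, so move esp down first. */
void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp >= 4);
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* Read the 4-byte word at [esp + offset]. */
uint32_t toy_peek(const ToyStack *s, uint32_t offset) {
    assert(s->esp + offset < STACK_WORDS * 4);
    return s->mem[(s->esp + offset) / 4];
}

/* Model the caller's side of f(1, 2, 3): the last argument is pushed first. */
void toy_push_args_for_f(ToyStack *s) {
    toy_push(s, 3);
    toy_push(s, 2);
    toy_push(s, 1);
}
```

After toy_init and toy_push_args_for_f, toy_peek(&s, 0) reads the first argument, toy_peek(&s, 4) the second, and toy_peek(&s, 8) the third, which matches the [esp], [esp+4], [esp+8] layout in the mov version of the call.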
Peeking under the hood • first let's write a little program to call a function • now let's compile it, using these options: • -g includes debugging info in the executable • this makes it easier to debug the program in gdb • -m32 compiles to 32-bit x86 • if we leave this off we get x64… not what we're talking about • now we can run it in gdb • gdb <program name> • disas main • oh god. oh no. oH GOD this looks horrible
One machine language, two assembly syntaxes • idk why this happened but x86 has two ways of writing assembly • the GNU tools default to AT&T, which is this monstrosity we're seeing: movl $0x3,0x8(%esp) • the Much Better One is… INTEL SYNTAX: mov DWORD PTR [esp+0x8],0x3 • let's make a file called ~/.gdbinit • ~ is short for your home directory (the directory you start in) • .gdbinit is a settings file for gdb • in it, let's write set disassembly-flavor intel • save, and let's try again
Peeking under the much easier-to-read hood • now let's see how main calls f with disas /m main • /m shows the C source if it's available • won't work on your project ;))))))))))) • let's look inside f to see how it computes its return value • when it accesses the arguments, it uses [ebp+offs] • what the heck is ebp? • and why is the first argument at offset 8 instead of 0?
The base pointer register and { } • esp is the stack pointer: it marks the bottom of the AR • ebp – "base pointer" – marks the top of the AR • when we first came into f, the call instruction had just pushed the return address, so esp points at the return address, with the arguments right above it • then we have this weird sequence that { does:
    push ebp
    mov ebp, esp
• push saves the old value of ebp • mov makes ebp point to that old ebp • and now.... uh........ um.......... what did that do?
It's a linked list of ARs! • ebp is the pointer to the head of a linked list • every AR stores a pointer to the top of the AR of the function that called it • when the function is about to return, it does pop ebp • this "unlinks" the AR from the list • but if we put a local variable in f we'll see this instead: leave • wha...
The leave instruction • when we entered the function, we did:
    push ebp
    mov ebp, esp
• the x86 leave instruction is functionally identical to:
    mov esp, ebp
    pop ebp
• which is the inverse • there is enter too, which does the same thing as the push-mov • x86 is very silly • why did the compiler use leave but not enter? ¯\_(ツ)_/¯
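The prologue and leave can be traced by hand with a small model. The following is a rough sketch in C, assuming a made-up ToyCpu with only a stack, esp, and ebp. All of the names (ToyCpu, cpu_prologue, cpu_leave, cpu_demo_call) and the fake return address are hypothetical. It shows two things from the slides: after the prologue the first argument really is at [ebp+8], and leave undoes the prologue no matter how much space was carved out for locals.

```c
#include <assert.h>
#include <stdint.h>

/* A toy machine with a 64-byte stack. All names here are hypothetical. */
#define STACK_BYTES 64

typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp; /* stack pointer, as a byte offset into mem */
    uint32_t ebp; /* base pointer, as a byte offset into mem */
} ToyCpu;

void cpu_init(ToyCpu *c) {
    c->esp = STACK_BYTES; /* empty stack */
    c->ebp = STACK_BYTES; /* pretend this is the caller's base pointer */
}

void cpu_push(ToyCpu *c, uint32_t value) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

uint32_t cpu_pop(ToyCpu *c) {
    assert(c->esp + 4 <= STACK_BYTES);
    uint32_t value = c->mem[c->esp / 4];
    c->esp += 4;
    return value;
}

uint32_t cpu_load(const ToyCpu *c, uint32_t addr) {
    assert(addr + 4 <= STACK_BYTES);
    return c->mem[addr / 4];
}

/* push ebp ; mov ebp, esp */
void cpu_prologue(ToyCpu *c) {
    cpu_push(c, c->ebp);
    c->ebp = c->esp;
}

/* leave, modeled as: mov esp, ebp ; pop ebp */
void cpu_leave(ToyCpu *c) {
    c->esp = c->ebp;
    c->ebp = cpu_pop(c);
}

/* Model one call to a one-argument function; returns what it sees at [ebp+8]. */
uint32_t cpu_demo_call(ToyCpu *c, uint32_t first_arg) {
    cpu_push(c, first_arg);   /* caller pushes the argument */
    cpu_push(c, 0xDEADBEEF);  /* call: pushes a (fake) return address */
    cpu_prologue(c);          /* [ebp] = old ebp, [ebp+4] = return address */
    c->esp -= 16;             /* sub esp, 0x10: space for locals */
    uint32_t seen = cpu_load(c, c->ebp + 8);
    cpu_leave(c);             /* discards the locals and restores the old ebp */
    (void)cpu_pop(c);         /* ret: pops the return address */
    return seen;
}
```

In this model, the offset 8 comes from the two words sitting between ebp and the argument: the saved ebp at [ebp] and the return address at [ebp+4]. When cpu_demo_call returns, ebp has its original value, while esp is still 4 bytes below where it started because the argument has not been removed from the stack.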
Space for locals • when we added a local to f, the prologue changed a bit • the prologue is the sequence of instructions that sets up the AR • now there's sub esp, 0x10 • this is allocating space for the local variables • it's allocating 16 bytes for alignment reasons • (modern versions of x86 like the stack to be aligned like this. caches. SSE. etc.) • so arguments are [ebp+offs] • …and local variables are [ebp-offs] • sometimes it accesses locals with [esp+offs], sometimes it doesn't • I don't really know why!
Returning • notice what the return statement assembles to • it puts a value in eax • this is like putting a value in v0 in MIPS • what if the value is bigger than 4 bytes? what if it's a float? • SHHHH shhhhhhhhhh you don't wanna know shhhhhhhhhhhhhhhhh • then there's the function epilogue which we saw before • and finally, the ret instruction
Being a good roommate • when main calls f, it assumes that some registers will be unchanged • how does MIPS handle this? remember s and t registers? • in cdecl, ebx, edi, and esi are saved registers • a function must push them in the prologue and pop them in the epilogue in order not to smash their values • eax, ecx, and edx are free to be changed without saving them • you'll see these used for temporary values alllllllllll the time
Loose ends • let's look at the stack, esp, and ebp through this whole process: [four stack diagrams showing esp and ebp: in main, about to call f; just got into f; in f, after the prologue; just did ret, back in main] • uhh, is anyone gonna clean up this mess??
Who cleans the stack? • it's always important to make sure the stack stays balanced: • put things back the way they were when you finish • but in cdecl, the stack pointer always "lags behind" • the caller is responsible for putting esp back, but it can be lazy • the compiler is being clever in main and relying on leave to clean up those arguments still sitting on the stack • why? C has this weird feature that cdecl was created to support... • variadic functions.
printf is very confused • have you ever really looked at the docs for printf? int printf(const char* format, ...); • so there's an argument and then... maybe it's just being cryptic... • let's see what a call to printf looks like • how does printf know, exactly, how many things you passed it? • it... doesn't • the only information printf has about the parameters is the format string you give it as the first parameter • this is why the caller is responsible for cleaning the stack: because the callee has no idea how many parameters it was passed
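The same "the callee has no idea" situation shows up in any variadic function you write yourself. Here is a minimal sketch using the standard <stdarg.h> macros. The function sum_ints is a hypothetical example, not something from the slides. Like printf with its format string, it can only find out how many extra arguments exist from what the caller tells it in the first parameter.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: adds up `count` int arguments.
   The callee cannot check `count`; it simply trusts the caller. */
int sum_ints(int count, ...) {
    assert(count >= 0);

    va_list args;
    va_start(args, count);      /* start reading after the last named parameter */

    int total = 0;
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int); /* fetch the next argument as an int */
    }

    va_end(args);
    return total;
}
```

A call like sum_ints(3, 1, 2, 3) adds the three values after the count. If the caller passes a count larger than the number of arguments actually supplied, the behavior is undefined, which is the same trap as giving printf a format string with more specifiers than arguments.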