Advanced Programming in the Unix Environment. Ch 7. Process Environment. The Environment of a Unix Process: Program Startup & Termination, Command Line Arguments, Environment Variables, Memory Layout of a C program, Memory Allocation, setjmp and longjmp functions.
Advanced Programming in the Unix Environment Ch 7. Process Environment System programming
The Environment of a Unix Process
• Program Startup & Termination
• Command Line Arguments
• Environment Variables
• Memory Layout of a C program
• Memory Allocation
• setjmp and longjmp functions
What is a process?
• The process is the OS's abstraction for execution
  - the unit of execution
  - the dynamic (active) execution context
  - in contrast, a program is static: just a bunch of bytes
• A process is often called a job or task
• What is the difference between a process and a program?
[Figure: From Program to a Living Process]
What happens when we compile and execute?
    #include <stdio.h>
    /* print_array(), selection_sort() and insertion_sort() are assumed
       to be defined elsewhere in array_sort.c (not shown on the slide) */
    int main(void)
    {
        int array[] = {2,5,1,4,3};
        print_array(array);
    #ifdef SELECTION_SORT
        selection_sort(0, 4, array);
        printf("Selection sorting....\n");
    #else
        insertion_sort(0, 4, array);
        printf("Insertion sorting....\n");
    #endif
        print_array(array);
        return 0;
    }
$ gcc array_sort.c      produces a.out (the executable, i.e. the program file)
$ ./a.out               running the program creates a process
Process Startup
• A special start-up routine is called before main()
• It is specified in the executable program file as the program's starting address at link time
• The start-up routine takes values from the kernel:
  - command-line arguments & environment
• It sets things up and then calls main(argc, argv)
    int main(int argc, char *argv[]);
[Figure: the kernel starts the C start-up code through the exec system call; the start-up code calls main, and main returns to the start-up code]
Process Termination
• Normal termination
  - Return from main: returns to the C start-up code, which calls exit()
    - E.g. exit(main(argc, argv))
  - Calling _exit or _Exit: returns to the kernel immediately
    - _Exit() is equivalent to _exit()
  - Calling exit: performs cleanup and then returns to the kernel
• Abnormal termination
  - Calling abort (Chapter 10)
  - Terminated by a signal (Chapter 10)
Normal Process Termination
    #include <stdlib.h>
    void exit(int status);
    void _Exit(int status);
    #include <unistd.h>
    void _exit(int status);
• _exit() / _Exit(): return to the kernel immediately
• exit(): performs cleanup processing and then returns to the kernel
  - Clean shutdown of the standard I/O library: fclose is called for all open streams, so all buffered output is flushed
• exit status
  - All three exit functions provide an exit status to the process that executed the program
  - ex) exit(0) (or return(0) in the main function) returns an exit status of 0
  - Most Unix shells provide a way to examine the exit status of a process
• undefined exit status
  - exit called without an exit status, main returning without a return value, or main falling off its end (an implicit return; under C99 this case is defined to behave like return 0)
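Exit handlers
• exit handler
  - a function called when a program exits
• atexit()
  - registers exit handlers (ISO C guarantees room for at least 32 functions per process)
  - registered functions are called in reverse order of registration; a function registered more than once is called once per registration
    #include <stdlib.h>
    int atexit(void (*func)(void));
    returns: 0 if OK, nonzero on error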
Example of exit handlers
    #include <stdio.h>
    #include <stdlib.h>   /* atexit */
    static void my_exit1(void) { printf("first exit handler\n"); }
    static void my_exit2(void) { printf("second exit handler\n"); }
    int main(void)
    {
        if (atexit(my_exit2) != 0)
            printf("can't register my_exit2");
        if (atexit(my_exit1) != 0)
            printf("can't register my_exit1");
        if (atexit(my_exit1) != 0)   /* registered twice, so called twice */
            printf("can't register my_exit1");
        printf("main is done\n");
        return(0);
    }
$ ./a.out
main is done
first exit handler
first exit handler
second exit handler
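Program Starting and Termination
[Figure: how a C program is started and terminated. The kernel starts the user process through exec, which enters the C start-up routine. The start-up routine calls the main function, which calls user functions; each returns to its caller. The start-up routine, main, and user functions can all call exit (which does not return). The exit function calls each exit handler and the standard I/O cleanup, each of which returns, and then calls _exit to enter the kernel. User functions and main can also call _exit directly.]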
atexit examples
• atexit1.c
• atexit2.c
• atexit3.c
Command-Line Arguments
• The process that does the exec can pass command-line arguments to the new program (as Unix shells do)
    #include <stdio.h>
    int main(int argc, char *argv[])
    {
        int i;
        for (i = 0; i < argc; i++)
            printf("argv[%d]= %s\n", i, argv[i]);
        return 0;
    }
$ ./a.out kku "computer engineering" 2013
argv[0]= ./a.out
argv[1]= kku
argv[2]= computer engineering
argv[3]= 2013
• argv[argc] == NULL (guaranteed by ANSI C and POSIX.1), so the loop can also be written as:
    ...
    /* while (argv[i] != NULL) or */
    for (i = 0; argv[i] != NULL; i++)
        printf("argv[%d]: %s\n", i, argv[i]);
echo_reverse.c
• Print each string given as a command-line argument in reverse.
$ gcc echo_reverse.c
$ ./a.out hello 2013 "computer engineering"
argv[0] = tuo.a/.
argv[1] = olleh
argv[2] = 3102
argv[3] = gnireenigne retupmoc
addnum.c
• Add up all the numbers given on the command line
• Treat every number as an integer (int)
• Use the atoi() function.
• Example
$ ./a.out 4 51 42 5
sum=102
Environment Variables
• Environment variables have the form name=value
• Interpretation of environment variables is up to the various applications
• Normally set in a shell start-up file to control the shell's actions
    HOME=/home/sikim
    PATH=.:/home/sikim/bin:/usr/local/bin:/bin:/usr/bin:
    CDPATH=.:/home/sikim:/home/sikim/work:
    SHELL=/bin/bash
    USER=sikim
    MANPATH=.:/home/sikim/man:/usr/share/man:
Environment List
• Each program is passed an environment list (the list of environment variables)
• extern char **environ: global variable for the environment list
• Environment list: contains pointers to environment strings
• Use getenv and putenv to access an individual environment variable
• Use the environ pointer to go through the entire environment
[Figure: the environment pointer environ points to the environment list, an array of pointers terminated by NULL; each pointer refers to one environment string: HOME=/home/run\0, PATH=/bin\0, SHELL=/bin/bash\0, USER=run\0]
Environment Variables (cont'd)
• getenv: fetches a specific value from the environment
• putenv: takes a string of the form "name=value" and puts it in the environment list
  - the string itself becomes part of the environment, so it must not live in an automatic variable
• setenv: sets name to value
  - if name already exists, the existing definition is replaced (rewrite != 0) or left unchanged (rewrite == 0)
• unsetenv: removes any definition of name
    #include <stdlib.h>
    char *getenv(const char *name);
    returns: pointer to value associated with name, NULL if not found
    int putenv(char *str);
    int setenv(const char *name, const char *value, int rewrite);
    int unsetenv(const char *name);
    returns: 0 if OK, nonzero on error
env_test.c
    #include <stdio.h>
    #include <stdlib.h>   /* putenv, getenv, setenv */
    int main(void)
    {
        putenv("DEPT=software");
        printf("%s\n", getenv("DEPT"));
        setenv("DEPT", "computer engineering", 1);   /* rewrite != 0: replace */
        printf("\n");
        extern char **environ;
        int i;
        for (i = 0; environ[i]; i++)   /* the list ends with a NULL pointer */
            printf("%s\n", environ[i]);
        return 0;
    }
$ ./a.out
software

HOSTNAME=fc5vm
TERM=xterm
SHELL=/bin/bash
…
DEPT=computer engineering
Memory Layout of a C program
• Memory segments of a C program
  - Text segment
    - Instruction code
    - Usually read-only
  - Initialized data segment
  - Uninitialized data segment (called "bss")
    - Initialized to 0 by the kernel before the program starts
    - bss: "block started by symbol"
  - Stack
    - Automatic variables, temporary variables, return addresses, caller's environment (registers)
  - Heap
    - Dynamically allocated memory
[Figure: typical layout from high address to low address: command-line arguments and environment variables; stack; heap; uninitialized data (bss), initialized to zero by exec; initialized data, read from the program file by exec; text]
Memory Layout of a C program
• Memory image of a C program
    #include <stdlib.h>   /* malloc */
    long array[100];
    long bufsize = 100;
    char *f1(void);       /* declared before use in main */
    int main(void)
    {
        int i;
        char *buf;
        i = 10;
        buf = f1();
        return(0);
    }
    char *f1(void)
    {
        char *ptr;
        ptr = malloc(bufsize);
        return ptr;
    }
[Figure: where each object lives
  Stack: main's frame (return address, i, buf), then f1's frame (return address, ptr)
  Heap: the 100-byte memory block returned by malloc
  Uninitialized data: array[100]
  Initialized data: bufsize = 100
  Text: the code of main and f1]
memory.c
    #include <stdio.h>
    #include <stdlib.h>
    long array[100];
    long bufsize = 100;
    void f1(void) { }
    int main(void)
    {
        int i;
        char *buf = malloc(bufsize);
        /* %p expects a void *, so each address is cast explicitly */
        printf("&i= %p\n", (void *)&i);                /* stack */
        printf("buf= %p\n", (void *)buf);              /* heap */
        printf("&bufsize= %p\n", (void *)&bufsize);    /* initialized data */
        printf("&array[0]= %p\n", (void *)&array[0]);  /* bss */
        printf("main= %p\n", (void *)main);            /* text */
        printf("f1= %p\n", (void *)f1);                /* text */
        free(buf);
        return 0;
    }
$ ./a.out
&i= 0xbf9d2fcc
buf= 0x8e77008
&bufsize= 0x804965c
&array[0]= 0x8049680
main= 0x80483b9
f1= 0x80483b4
Memory Allocation Functions
• dynamic allocation of memory from the heap
• the returned pointer is suitably aligned (i.e. it follows the most restrictive alignment requirement of the system)
  - ex) returned addresses are multiples of 4 (the exact multiple is system dependent)
• the library manages a memory pool (details vary between implementations)
  - free block list
  - freed space is kept in the malloc pool, not returned to the kernel
  - first fit algorithm for allocation
  - allocates extra space for bookkeeping: size, next allocated block, etc.
  - expands the heap with sbrk
    #include <stdlib.h>
    void *malloc(size_t size);
    void *calloc(size_t nobj, size_t size);
    void *realloc(void *ptr, size_t newsize);
    all three return: non-null pointer if OK, NULL on error
    void free(void *ptr);
Memory Allocation Functions
• malloc: allocates the specified number of bytes of memory
• calloc: allocates space for the specified number of objects of the specified size
  - the space is initialized to all 0 bits
• realloc: changes the size of previously allocated memory
  - if newsize > oldsize, the existing data may have to be copied to a new area
• Catastrophic but hard-to-find mistakes
  - Writing beyond the boundary of an allocated block, which overwrites the record-keeping area
  - Freeing a memory block that has already been freed
  - Calling free with a pointer that did not come from one of the three alloc functions
  - Not freeing allocated memory (memory leak)
setjmp and longjmp
• Jump across function calls (nonlocal goto)
  - Useful for dealing with error conditions in deeply nested function calls and with interrupts encountered in a low-level subroutine of a program
• setjmp stores in env the information needed to return to the setjmp point
  - returns 0 if called directly
  - returns nonzero (val) if returning from a call to longjmp
• longjmp uses env to jump back to the setjmp point, with val as setjmp's return value
  - Several longjmp calls can use the same setjmp location with different val values
  - env is normally a global variable, since it must be referenced from another function
    #include <setjmp.h>
    int setjmp(jmp_buf env);
    returns: 0 if called directly, nonzero if returning from a call to longjmp
    void longjmp(jmp_buf env, int val);
setjmp and longjmp
    #include <setjmp.h>
    #include <stdio.h>
    #include <stdlib.h>
    #define TOK_ADD 5
    #define MAXLINE 4096   /* assumed value; the slide does not define it */
    jmp_buf jumpbuffer;    /* global variable! */
    void do_line(char *line);
    void cmd_add(void);
    int get_token(void);   /* assumed to be defined elsewhere (not shown) */
    int main(void)
    {
        char line[MAXLINE];
        while (fgets(line, MAXLINE, stdin) != NULL)   /* main line */
            do_line(line);
        exit(0);
    }
    void do_line(char *line)
    {
        ...
        cmd_add();
        ...
    }
    void cmd_add(void)
    {
        int token;
        token = get_token();
        if (token < 0) {
            /* error! need to go back to the main line; what to do? */
        }
        /* rest of processing for this command */
    }
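setjmp and longjmp (the same program, now using setjmp in main and longjmp in cmd_add)
    #include <setjmp.h>
    #include <stdio.h>
    #include <stdlib.h>
    #define TOK_ADD 5
    #define MAXLINE 4096   /* assumed value; the slide does not define it */
    jmp_buf jumpbuffer;    /* global variable! */
    void do_line(char *line);
    void cmd_add(void);
    int get_token(void);   /* assumed to be defined elsewhere (not shown) */
    int main(void)
    {
        char line[MAXLINE];
        if (setjmp(jumpbuffer) != 0)   /* nonzero: we came back via longjmp */
            printf("error\n");
        while (fgets(line, MAXLINE, stdin) != NULL)   /* main line */
            do_line(line);
        exit(0);
    }
    void do_line(char *line)
    {
        ...
        cmd_add();
        ...
    }
    void cmd_add(void)
    {
        int token;
        token = get_token();
        if (token < 0) {
            /* error! go back to the main line */
            longjmp(jumpbuffer, 1);
        }
        /* rest of processing for this command */
    }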
setjmp and longjmp mechanism: save and restore some registers (sp, ip, …)
• setjmp saves sp, ip and other registers
• longjmp restores sp, ip and the registers
[Figure: the stack, growing downward from the top of stack: stack frame for main (where setjmp was called), stack frame for do_line, stack frame for cmd_add (where longjmp is called)]
Automatic, Register & Volatile Variables
• Automatic variables: variables internal to a function
• Register variables: advise the compiler to keep the specified variables in machine registers; this is not guaranteed, so do not rely on it
• Volatile variables: force the compiler to suppress optimizations on them
  - every access reads or writes the variable "right away"
  - stored in system memory, not in CPU registers
• Problems with automatic variables and register variables when using setjmp and longjmp
  - After longjmp, the values of these variables are implementation dependent
  - A variable held in a register can be rolled back (restored to the value it had when setjmp was called)
  - Volatile variables are not rolled back
  - Global and static variables are left alone
Problems with Register Variables
    #include <setjmp.h>
    #include <stdio.h>
    #include <stdlib.h>
    static jmp_buf jumpbuffer;
    static void f1(int, int, int);
    static void f2(void);
    int main(void)
    {
        int count;
        register int val;
        volatile int sum;
        count = 2; val = 3; sum = 4;
        if (setjmp(jumpbuffer) != 0) {
            printf("after longjmp: count=%d, val=%d, sum=%d\n", count, val, sum);
            exit(0);
        }
        count = 97; val = 98; sum = 99;   /* after setjmp, before longjmp */
        f1(count, val, sum);              /* never returns */
    }
Problems with Register Variables
    static void f1(int i, int j, int k)
    {
        printf("in f1(): count=%d, val=%d, sum=%d\n", i, j, k);
        f2();
    }
    static void f2(void)
    {
        longjmp(jumpbuffer, 1);
    }
Problems with Register Variables
• Why does this happen?
  - Variables stored in memory have their values as of the time of the longjmp, while variables held in CPU and floating-point registers are restored to the values they had when setjmp was called
  - However, this is system dependent!
  - With optimization, val and count are kept in registers, while sum is stored in memory because it is a volatile variable
$ gcc testjmp.c        (compiled without any optimization; no rollback happens)
$ ./a.out
in f1(): count=97, val=98, sum=99
after longjmp: count=97, val=98, sum=99
$ gcc -O testjmp.c     (compiled with optimization; partially rolled back)
$ ./a.out
in f1(): count=97, val=98, sum=99
after longjmp: count=2, val=3, sum=99
Problems with Register Variables
• gcc testjmp.c: all variables are stored in memory
  [Figure: the stack frame for main holds count=97, val=98, sum=99; below it are the stack frames for f1 and f2; longjmp restores sp, ip and registers, and the values in memory are untouched]
• gcc -O testjmp.c: some variables are stored in registers
  [Figure: the stack frame for main holds only sum=99; count = 2 and val = 3 come back with the registers that longjmp restores; below main are the stack frames for f1 and f2]
Problems with Automatic Variables
• Never let a reference to an automatic variable escape the function that declares it
    FILE *open_data(void)
    {
        FILE *fp;
        char databuf[BUFSIZE];   /* automatic variable */
        fp = fopen("pail.irum", "r");
        setvbuf(fp, databuf, _IOLBF, BUFSIZE);   /* Don't do this */
        return fp;   /* databuf will be lost after return */
    }
    POINT *addPoint(POINT op1, POINT op2)
    {
        POINT res;
        res.x = op1.x + op2.x;
        res.y = op1.y + op2.y;
        return &res;   /* res will be lost after return */
    }
addpoint.c - code with a bug
    #include <stdio.h>
    struct point {
        int x;
        int y;
    };
    typedef struct point POINT;
    POINT *addPoint(POINT *p1, POINT *p2)
    {
        POINT res;
        res.x = p1->x + p2->x;
        res.y = p1->y + p2->y;
        return &res;   /* bug: address of an automatic variable */
    }
    int main(void)
    {
        POINT a = {10, 3};
        POINT b = {1, -1};
        POINT *cp;
        cp = addPoint(&a, &b);
        printf("Now cp(%p) is not a valid address\n", (void *)cp);
        printf("cp=(%d, %d)\n", cp->x, cp->y);   /* undefined behavior */
        return 0;
    }
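addpoint.c - solutions
• The callee allocates memory dynamically
  - addPoint() allocates memory dynamically, computes the third point into it, and returns it.
  - main() must call free() later
• The caller provides the memory
  - main() provides the memory that will hold the third point
  - addPoint() must be changed to take the address of the third point
• Use a global variable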