Unix System Interface Programming Part 6.1 – System Interface Overview Prepared by Xu Zhenya( xzy@buaa.edu.cn ) Draft – Xu Zhenya( 2002/10/01 ) Rev1.0 – Xu Zhenya( 2002/10/10 )
Agenda • 1. System Interface Programming Standards • 2. How is a C program started? • 3. System calls and library functions • 4. String functions • 5. Error handling • 6. Memory management • 7. Man pages & API • 8. Debugging programs • 9. Summary • Appendix: • Understanding the O.S. Kernel • Recommended Books
Standards (2) • X/Open Common Applications Environment (CAE) Portability Guide Issue 3 (XPG3) and Issue 4 (XPG4) • SUS (Single UNIX Specification), SUSv2 • XNS4: Networking Services Issue 4 • ILP32 and LP64 programming environments • Notes: • 1. The developers of SVID3 (UNIX System Laboratories) are no longer in business, and that specification defers to POSIX and X/Open CAE. • 2. Utilities: some standard utilities conflict with historical Solaris utilities • Put /usr/xpg4/bin in PATH ahead of the other utility directories to get the standard-conforming versions.
Standards (4) • POSIX includes the following standards: • 1003.1: System Interface (POSIX.1) • POSIX.1 has adopted virtually all ANSI C library calls. • However, POSIX.1 has not adopted the operations on wide characters and multi-byte characters (as used for Chinese). • 1003.1b: Real-time extensions • 1003.1c: User-level threads (pthreads, the POSIX threads library) • 1003.1g: Networking standards • 1003.2: Shells and utilities
Standards (5) • Feature test macros • __EXTENSIONS__: gives the application access to all interfaces and headers that do not conflict with the specified standard.
Standards (6) • Case Study • Solaris: /usr/include/sys/feature_tests.h • Linux: /usr/include/features.h • Compiler options • See the C/C++ manual; for GNU C/C++, see the links page.
#if ( __STDC__ == 0 && !defined(_POSIX_C_SOURCE) && \
      !defined(_XOPEN_SOURCE)) || \
    (defined(_XOPEN_SOURCE) && _XOPEN_VERSION - 0 >= 4) || \
    defined(__EXTENSIONS__)
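A minimal sketch of how an application uses a feature test macro, assuming a POSIX system. The value 200112L (POSIX.1-2001) and the helper names compiled_posix_version and runtime_posix_version are chosen for illustration only; they do not come from the slides.

```c
/* Request POSIX.1-2001 interfaces. A feature test macro only takes
 * effect if it is defined before the first system header is included. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <unistd.h>

/* POSIX version the headers advertise at compile time (hypothetical helper). */
long compiled_posix_version(void)
{
#ifdef _POSIX_VERSION
    return _POSIX_VERSION;
#else
    return -1L;
#endif
}

/* POSIX version reported by the running system (hypothetical helper). */
long runtime_posix_version(void)
{
    return sysconf(_SC_VERSION);
}
```

The same request can usually be made on the compiler command line with -D_POSIX_C_SOURCE=200112L instead of editing the source.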
How is a C program started? (3) LINK EDITOR MEMORY MAP (ELF): ld -m -o who who.o
How is a C program started? (4) • Command line arguments & environment variables • => Limits on the combined length of the arguments and the environment • sysconf( _SC_ARG_MAX ) • exec(): copies the argument list and the environment variables after setting up the user’s stack ( crt_init() ) • getenv(), setenv() & putenv() • Environment table • extern char ** environ; • int main( int argc, char *argv[], char *env[] );
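The environment table and the argument-size limit can be inspected from user code. The following is a sketch under the assumption of a POSIX system; env_count, env_lookup and arg_max are hypothetical helper names introduced here.

```c
#include <stddef.h>
#include <stdlib.h>   /* getenv(), setenv(): used by callers, see the note below */
#include <string.h>
#include <unistd.h>

extern char **environ;   /* NULL-terminated array of "NAME=value" strings */

/* Count the entries in the environment table. */
size_t env_count(void)
{
    size_t n = 0;
    if (environ != NULL)
        for (char **p = environ; *p != NULL; p++)
            n++;
    return n;
}

/* Look a variable up by walking environ directly, which is roughly what
 * getenv() does. Returns the value, or NULL if the name is not set. */
const char *env_lookup(const char *name)
{
    size_t len = strlen(name);
    if (environ == NULL)
        return NULL;
    for (char **p = environ; *p != NULL; p++)
        if (strncmp(*p, name, len) == 0 && (*p)[len] == '=')
            return *p + len + 1;
    return NULL;
}

/* Upper bound, in bytes, on the arguments plus environment passed to exec(). */
long arg_max(void)
{
    return sysconf(_SC_ARG_MAX);
}
```

After setenv("NAME", "value", 1), both getenv("NAME") and env_lookup("NAME") should return "value", because setenv() updates the same table that environ points to.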
System calls & library calls ( 2 ) • 1. malloc & sbrk • 2. time & date: time() • 3. We can conclude that: • There are only a few system calls (256 or fewer) • Applications either make a system call directly or call libc, which in turn makes the system call • Making a syscall • When a system call parameter is a pointer to a data object, we must allocate space for the object and pass its address in the call. Example: time( time_t * p_time )
System calls & library calls ( 3 ) • Making a library call • For a return value or parameter declared as a pointer type, there are three possibilities: • The library does not allocate space: • The same as for system calls: the caller allocates • The library allocates space statically: • Copy the object the pointer refers to before a later call overwrites it: ctime(3C) • The library allocates space dynamically: • We MUST remember to free the space: strdup() • Read the man pages carefully.
System calls & library calls ( 4 )
12 time( tptr );
0x08050820: main       : pushl %ebp
0x08050821: main+0x0001: movl %esp,%ebp
0x08050823: main+0x0003: subl $8,%esp /* time_t *tptr */
0x08050826: main+0x0006: movl -8(%ebp),%eax
0x08050829: main+0x0009: pushl %eax
0x0805082a: main+0x000a: call time [PLT] <0x80506dc>
0x0805082f: main+0x000f: addl $4,%esp
13 printf( "Machine time in sec = %d\n", *tptr );
0x08050832: main+0x0012: movl -8(%ebp),%eax
0x08050835: main+0x0015: movl (%eax),%eax
0x08050837: main+0x0017: pushl %eax
0x08050838: main+0x0018: pushl $0x8050900
0x0805083d: main+0x001d: call printf [PLT] <0x80506ec>
0x08050842: main+0x0022: addl $8,%esp
System calls & library calls ( 5 ) • ctime.c
static char cbuf[26];

char *
ctime( t )
const time_t *t;
{
    return ( asctime( localtime( t ) ) );
}

char *
asctime( t )
const struct tm *t;
{
    register char *cp;
    /* …… */
    return( cbuf );
}
System calls & library calls ( 6 )
struct sysent {
    char            sy_narg;        /* total number of arguments */
#ifdef _LP64
    unsigned short  sy_flags;       /* various flags as defined below */
#else
    unsigned char   sy_flags;       /* various flags as defined below */
#endif
    int             (*sy_call)();   /* argp, rvalp-style handler */
    krwlock_t       *sy_lock;       /* lock for loadable system calls */
    int64_t         (*sy_callc)();  /* C-style call hander or wrapper */
};
#define NSYSCALL 256 /* number of system calls */
extern struct sysent sysent[];
#ifdef _SYSCALL32_IMPL
extern struct sysent sysent32[];
#endif
/* See /usr/include/sys/systm.h & /etc/name_to_sysnum */
System calls & library calls ( 7 ) • fork:
__asm__("
    movl $0x2, %eax     // SYS_fork, see /etc/name_to_sysnum
    lcall $0x27, $0x0
");
• Solaris Kernel • 1. Trap into the kernel • 2. Enter the common trap handler • Save context: the pointer to the CPU structure, the return address • Save the syscall arguments into LWP_CB • If ( t_pre_sys ) then do the pre-action /* truss, micro-state */ • Call sysent[ 2 ].sy_callc(), i.e. sys_fork(); return • Check whether a signal is pending • If ( t_post_sys ) then do the post-action • Set the return value or errno • Return • Fast system calls: gethrtime(), gethrvtime() & gettimeofday() • Performance: they pass values in registers
System calls & library calls ( 8 ) • Misc: Tools used to trace the system calls made by a process: • The tool executes the specified command and produces a trace of the system calls performed by that command, the signals the command receives, and the machine faults the command incurs. • Solaris: truss • Linux: strace
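To see the difference between a libc wrapper and the underlying system call, a program can issue a call by number. This sketch assumes Linux with glibc, where syscall() and SYS_getpid are available; it does not reproduce the Solaris lcall path shown on the earlier slide, and raw_getpid is a hypothetical name.

```c
#include <sys/syscall.h>   /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>        /* syscall(), getpid() */

/* Issue the getpid system call by number, bypassing the libc wrapper. */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Running a program that calls raw_getpid() under strace should show a getpid entry in the trace, and the returned value should equal getpid().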
String Functions • 1. Textbook, p129, Table 6-2 • 2. Parsing an input string: • strpbrk, strstr, strspn, strcspn, strtok, index, rindex • 3. Memory functions: • memcpy, memmove, memccpy, memchr, memcmp, memset • Used to copy structures, although ANSI C compilers also support structure assignment. • Don’t use memcmp() to compare data structures: padding bytes have unspecified values • 4. String conversion functions: • strtol, strtoll, atol, atoll, atoi • 5. Byte string functions: • bzero, bcopy, etc. • 6. Notes: • Buffer overflow => security: prefer the functions that take a size_t count • For programs in C++, we should use its native string class & methods.
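Two of the notes above, checked string conversion and bounded copying, can be sketched as follows. parse_long and copy_bounded are hypothetical helpers; strtol() and snprintf() are the standard calls they rely on.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Convert a decimal string to long, rejecting empty input, trailing
 * characters and overflow. Returns 0 on success, -1 on failure. */
int parse_long(const char *s, long *out)
{
    char *end;
    long v;

    errno = 0;                 /* strtol() reports overflow only via errno */
    v = strtol(s, &end, 10);
    if (end == s)              /* no digits were consumed */
        return -1;
    if (*end != '\0')          /* trailing characters such as "12abc" */
        return -1;
    if (errno == ERANGE)       /* value does not fit in a long */
        return -1;
    *out = v;
    return 0;
}

/* Copy src into dst without ever writing past dstlen bytes.
 * Returns 0 if the whole string fit, -1 if it was truncated. */
int copy_bounded(char *dst, size_t dstlen, const char *src)
{
    int n;

    if (dstlen == 0)
        return -1;
    n = snprintf(dst, dstlen, "%s", src);   /* always NUL-terminates */
    return (n < 0 || (size_t)n >= dstlen) ? -1 : 0;
}
```

Unlike atoi(), which cannot report errors, parse_long() lets the caller tell "0" apart from invalid input.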
Dynamic Memory Allocation • 1. Memory allocation functions: • malloc, free, realloc, calloc, alloca • Heap & stack (automatic storage class) • => Memory leaks: CASE tools • => Alignment: malloc() & compiler support • 2. Resource management • Resources include many things in our code: memory, file descriptors, locks • Resource management: allocate & free, lock & unlock, state management, etc. • Open issues: • Who should allocate and free them? In particular, who frees them? • => Answer: allocate and free resources at the same level • In multithreaded programs this is very difficult, so be careful. • Here a C++ class can serve as the basic unit for managing resources. • In some situations we have to implement our own memory pool for central management: performance, debugging, etc.
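One way to follow the "allocate and free at the same level" rule in plain C is to pair every allocating function with a releasing one in the same module. The growable buffer below is a sketch of that idea; struct intbuf and its functions are hypothetical names, and the doubling growth policy is an arbitrary choice.

```c
#include <stddef.h>
#include <stdlib.h>

struct intbuf {
    int    *data;   /* owned by the buffer; released only by intbuf_free() */
    size_t  len;    /* number of elements in use */
    size_t  cap;    /* number of elements allocated */
};

void intbuf_init(struct intbuf *b)
{
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}

/* Append one value, growing the storage if needed.
 * Returns 0 on success, -1 if memory could not be allocated. */
int intbuf_push(struct intbuf *b, int value)
{
    if (b->len == b->cap) {
        size_t new_cap = b->cap ? b->cap * 2 : 8;
        /* Assign realloc()'s result to a temporary: on failure it returns
         * NULL and the old block is still valid, so we must not lose it. */
        int *tmp = realloc(b->data, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        b->data = tmp;
        b->cap = new_cap;
    }
    b->data[b->len++] = value;
    return 0;
}

/* Release the storage and return the buffer to its initial state. */
void intbuf_free(struct intbuf *b)
{
    free(b->data);
    intbuf_init(b);
}
```

Callers never call malloc() or free() on b->data themselves, so ownership stays in one place.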
Error Handling ( 1 ) • 1. Error handling functions: • perror, strerror • Notes: • The return value & “errno” • System calls & library functions • When system calls fail they (1) return -1 and (2) set errno. • Library calls might set errno when an error occurs. Check the man pages. • 2. Terminating a process: • exit(), abort() • Signals => abnormal termination.
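The return-value-plus-errno convention can be sketched with open(). The wrapper name open_readonly is hypothetical; the point it illustrates is that errno should be saved right after the failing call, because later library calls may change it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Open a file read-only. On failure, print a diagnostic, leave errno set
 * to the original error code and return -1, as the system call does. */
int open_readonly(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int saved = errno;   /* save first: fprintf() may overwrite errno */
        fprintf(stderr, "open(%s): %s\n", path, strerror(saved));
        errno = saved;       /* restore it for the caller */
    }
    return fd;
}
```

perror(path) would print a similar message in one call; strerror() is used here because it lets the program format the message itself.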
Error Handling ( 2 ) • 3. Discussion about error handling • 1. We should handle errors at the high level and detect them at the low level => but how do we return the error information? • In Unix, return -1 and set errno • What if there is more information than one integer value can carry? • In C++, the STL pair is one choice. • In multithreaded programs, TLS (thread-local storage) is one choice. • 2. Use exceptions in C++ carefully: • Performance issues • Control flow becomes harder to follow, and our code fills up with try blocks. • Memory may not be freed. (In Java, the GC takes care of it. Maybe that is why Java uses exceptions everywhere?)
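As one possible reading of the TLS suggestion, a low-level routine can keep returning -1 while recording a richer, per-thread error record that the high-level code reads later. This is only a sketch: it assumes a C11 compiler (for _Thread_local), and the names err_info, last_error, parse_port and the ERR_ codes are all hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

enum { ERR_NONE = 0, ERR_SYNTAX = 1, ERR_RANGE = 2 };

struct err_info {
    int  code;       /* one of the ERR_ values above */
    char msg[96];    /* human-readable detail */
};

/* One record per thread, so concurrent failures do not overwrite each other. */
static _Thread_local struct err_info last_err;

/* Record the error details and return -1, in the Unix style. */
static int fail(int code, const char *what, const char *input)
{
    last_err.code = code;
    snprintf(last_err.msg, sizeof last_err.msg, "%s: \"%s\"", what, input);
    return -1;
}

/* High-level code calls this after a function has returned -1. */
const struct err_info *last_error(void)
{
    return &last_err;
}

/* Low-level routine: detects the error but does not decide how to handle it. */
int parse_port(const char *s, int *port)
{
    char *end;
    long v = strtol(s, &end, 10);

    if (end == s || *end != '\0')
        return fail(ERR_SYNTAX, "not a number", s);
    if (v < 1 || v > 65535)
        return fail(ERR_RANGE, "port out of range", s);
    *port = (int)v;
    return 0;
}
```

Like errno, the record is only meaningful immediately after a call has reported failure.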
Man Pages & API (1) • 1. Xman, Answerbook2, etc • 2. Discussion: • UML • Syntax: • Signature: internal & external representation • Type check • Semantics • Parameters:value/reference • Resource?
Man Pages & API (2) • CORBA IDL: the famous example, ATM
interface BankServer {
    ::xaction HandleTransaction( inout ::xaction Transaction );
    long BankID();
};
interface Customers {
    exception CustomerException { string s; };
    // Get the data packet for a single customer
    any GetCustomers( in boolean metadata );
    // Apply a delta packet to the customer table
    any ApplyCustomerUpdates( in any Delta, out long ErrorCount );
};
How to document our own code?(1) • documenting our assumptions, our approach, and our reasons for choosing the approach we did. • Donald Knuth once observed that we should be able to read a well-written program just as we read a well-written book. • "Self-Documenting Code," Chapter 19 by Steve McConnell • We also need to keep our comments coordinated with the code. • Each function or method needs a sentence or two that clarifies the following information: • What the routine does • What assumptions the routine makes • What each input parameter is expected to contain • What each output parameter is expected to contain on success and failure • Each possible return value
How to document our own code?(2) • Each part of the function that isn't completely obvious from the code needs a sentence or two that explains what it's doing. • Any interesting algorithm deserves a complete description. • Any nontrivial bugs we've fixed in the code need to be commented with the bug number and a description of what we fixed. • Well-placed trace statements, assertions, and good naming conventions can also serve as good comments and provide excellent context to the code. • Comment as if we were going to be the one maintaining the code in five years.
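Applied to a small C routine, the checklist above might look like the following. The function sum_positive is a made-up example, not taken from the slides; the assert documents a precondition in executable form.

```c
#include <assert.h>
#include <stddef.h>

/*
 * sum_positive - add up the strictly positive elements of an array.
 *
 * Assumptions: "values" points to at least "count" readable ints, and
 *              the sum fits in a long.
 * Parameters:  values [in]  array to scan; may be NULL only if count is 0
 *              count  [in]  number of elements in "values"
 *              total  [out] receives the sum on success; untouched on failure
 * Returns:     0 on success, -1 if "total" is NULL.
 */
int sum_positive(const int *values, size_t count, long *total)
{
    long sum = 0;

    assert(count == 0 || values != NULL);   /* precondition, checked in debug builds */

    if (total == NULL)
        return -1;

    for (size_t i = 0; i < count; i++)
        if (values[i] > 0)
            sum += values[i];

    *total = sum;
    return 0;
}
```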
How to document our own code?(3) • Case Study
int32_t terminal_mngr::configure ( sal_himkey * p_conf )
Reads the configuration information from "p_conf", creates the terminals, configures them, and adds them to the terminal table. One example of the HiM configuration format is defined in conf/term1234_conf.orig. • The caller of this member method is "login" • Executed in the main thread
Parameters: p_conf : sal_himkey [in] pointer to the configuration information.
Returns: 0 if the configuration loads successfully, otherwise a non-zero error value.
Error Codes • ERRID_TERMINAL_MNGR_CONF_NR_TERM • invalid nr_terms in the configuration parameters • ERRID_TERMINAL_MNGR_CONF_NOSUBKEY • no subkey for one terminal • ERRID_TERMINAL_MNGR_CONF_INVALIDTYPE • invalid type for one terminal • ERRID_TERMINAL_MNGR_CONF_CREATE_INSTANCE • failed to create the instance for a terminal
Debugging programs (1) • 1. Debuggers on Unix (Linux here) • Backend: gdb (see the “links” page for details) • Frontend: Insight, DDD, KDbg • Notes: • Commercial development environments: WorkShop for Solaris • Other CASE tools: runtime memory checking, etc. • 2. Suggestions: controlling the process • The debugger can answer all our debugging questions as long as we ask it the right questions. • Have one or more hypotheses in mind (something we want to prove or disprove) before reaching for the debugger.
Debugging programs (2) 3. Debugging Process
Summary • In our programming practice, the following issues are the most important: • Interface • Error handling • Resource management (memory) • Document the code and add many “assert” calls to it • Control our debugging process • Refactoring
Exercise: Part6-1 • Read the homework page to get details. • Makefile, C/C++ Compilers, and the debugger (gdb) • Sample Output:
[upe@linux exercise1]$ ./shell
myshell -> who am i
[0] : who
[1] : am
[2] : i
myshell -> I am UPE
[0] : I
[1] : am
[2] : UPE
myshell -> exit
[upe@linux exercise1]$
Appendix - A • Understanding O.S. Kernel – Process and its environment
Understanding “process” – its environment • 1. A process is a VM implemented by the kernel to run an executable program. • 2. Defining the VM: think of it as a computer • Instruction set: the user-level instructions & system calls • Memory: address space • I/O subsystem: file system • Interrupt: signal, asynchronous event • Process-to-process communication: IPC • Misc: protection mechanisms, debugging, etc. • 3. Executable programs: produced by the linker, see the map file • Text, data segments (bss, ro, etc.), stack, heap • The image after it is loaded into memory by exec()
Understanding “process” – its environment • 4. The first step: the process control block ( 1 ) • Hardware context: registers, flags, etc. ( user & kernel stack pointers !!! ) • Address space: text, data, stack, heap • File system: hierarchical name management (current directory, umask, etc.), open file descriptors ( object references ) • Signal management: signal handlers, blocked & ignored signals, pending signals • Current terminal: input and output text • Identification: process ID • 5. The O.S. is a hardware resource manager for many VMs. (2) • Scheduling class, priority management • State management/control: the famous state machine • 6. Relationship management ( 3 ) • Relationships among VMs: family ( parent and child ), process groups • VM and its owner ( users )/credentials: user ID, group ID
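Several of the fields kept in the process control block are visible from user space through standard calls. The sketch below collects a few of them; struct proc_view and proc_view_fill are hypothetical names, and the fallback PATH_MAX value of 4096 is an assumption for systems that do not define it.

```c
#include <limits.h>
#include <sys/stat.h>    /* umask() */
#include <sys/types.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 4096    /* assumed fallback, see the note above */
#endif

/* A user-space snapshot of some per-process kernel state. */
struct proc_view {
    pid_t  pid, ppid, pgid;   /* identification and family relationships */
    uid_t  uid;               /* credentials */
    gid_t  gid;
    mode_t mask;              /* file creation mask */
    char   cwd[PATH_MAX];     /* current directory */
};

/* Fill in the snapshot. Returns 0 on success, -1 if getcwd() fails. */
int proc_view_fill(struct proc_view *v)
{
    v->pid  = getpid();
    v->ppid = getppid();
    v->pgid = getpgrp();
    v->uid  = getuid();
    v->gid  = getgid();
    v->mask = umask(0);       /* umask() can only be read by setting it... */
    umask(v->mask);           /* ...so restore the old value immediately */
    return getcwd(v->cwd, sizeof v->cwd) != NULL ? 0 : -1;
}
```

Note that the read-and-restore of the umask is not atomic, so in a multithreaded program another thread could create a file in between.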
Understanding “process” – its environment • 7. It’s very important to understand the runtime scenario: • How to enter the kernel? Interrupt, trap and exception • How to leave the kernel? Signal processing, scheduling • Synchronization: • among processes: lock ( mutex ), sleep/wakeup • between the interrupt handlers and the rest of the kernel: the interrupt level • 7. Now we can go further, to multiprocessors (SMP) (4) • Threads & process: now two contexts (control blocks) • The PCB should include thread management: • thread list • CPU context, scheduling parameters and stack pointer (SP) should move to the thread’s context
Understanding “process” – its environment • 8. Solaris’ thread and process model: • Process, the user-level thread, LWP and kernel thread • Why choose this model: • Flexibility and high performance • Book, p7-10 • Concurrency and parallelism • ULT: requires no kernel resources, fast context switch, user-level scheduling • LWP: real parallelism • Questions: • Too complicated for tuning and programming • How are the scheduling attributes of a user-level thread (Pthread) mapped? • Scheduler activations & SIGWAITING • Need more consideration here!!!! • user-level scheduling • Computation-bound thread
Understanding “process” – its environment • Modeling the concepts: process, LWP and kthread • Kthreads own the scheduling parameters specific to their class: TS, RT, IA, SYS • Process control block • Program: vnode, arg & env • Lock: fine-grained lock for threads • Cred: • Address space: including “segments”, segment driver, page table • Pid, pgid: • P_child: family relationship • Signal: signal queue, signal handler vector, masks of ignored and blocked signals • u-area: file descriptors • See: /usr/include/sys/proc.h, /usr/include/sys/user.h
Understanding “process” – its environment • LWP control block: /usr/include/sys/klwp.h • An LWP identifier • A signal mask that tells the kernel which signals will be accepted • Saved values of user-level registers (when the LWP is not running) • System call arguments, results, and error • Resource usage and profiling data • Pointer to the corresponding kernel thread • Kthread control block: /usr/include/sys/thread.h • Reference to its own scheduling parameters, specific to the class • A kernel stack • Pointers to the CPU structure: affinity, binding • Flags: t_schedflag, t_state, t_preempt • Link pointers to other kthreads
Appendix - B • Recommended books
Appendix - B • System Interfaces Programming • The Practice of Programming, Brian Kernighan • Advanced Programming in the UNIX Environment, Richard Stevens, Addison-Wesley, 1993 • UNIX Network Programming Volume 2: Interprocess Communications, Richard Stevens, Addison-Wesley • UNIX System Programming, Keith Haviland, Dina Gray and Ben Salama, 2nd ed., Addison-Wesley, 1999. • Interprocess Communications in UNIX, John Gray, Prentice Hall, 1997. • Programming with UNIX Threads, Charles J. Northrup, John Wiley & Sons, Inc., 1997. • Practical UNIX Programming: A Guide to Concurrency, Communication, and Multithreading, Kay Robbins and Steven Robbins, Prentice Hall, 1996. • A Practical Guide to the UNIX System, Mark Sobell, 3rd ed., Benjamin Cummings, 1995. • UNIX for Programmers and Users, Graham Glass & King Ables, 2nd ed., Prentice Hall, 1999. • The following three books on the Unix kernel are absolute classics. • The Design of the UNIX Operating System, Maurice Bach, Prentice Hall, 1986. • The Design and Implementation of the 4.4 BSD Operating System, Marshall McKusick, Keith Bostic, Michael Karels and John Quarterman, Addison-Wesley, 1996. • UNIX Internals: The New Frontiers, Uresh Vahalia, Prentice Hall, 1996.