
CPS 310 Processes and Unix Processes

CPS 310 Processes and Unix Processes. Jeff Chase, Duke University, http://www.cs.duke.edu/~chase/cps310. The story so far: process and kernel. A (classical) OS lets us run programs as processes. A process is a running program instance (with a thread).


Presentation Transcript


  1. CPS 310 Processes and Unix Processes Jeff Chase Duke University http://www.cs.duke.edu/~chase/cps310

  2. The story so far: process and kernel • A (classical) OS lets us run programs as processes. A process is a running program instance (with a thread). • Program code runs with the CPU core in untrusted user mode. • Processes are protected/isolated. • Virtual address space is a “fenced pasture”. • Sandbox: can’t get out. Lockbox: nobody else can get in. • The OS kernel controls everything. • Kernel code runs with the core in trusted kernel mode.

  3. Processes and the kernel Programs run as independent processes. Each process has a private virtual address space and one or more threads. Threads enter the kernel for OS services through protected system calls and faults; the kernel reaches back into a process through upcalls (e.g., signals). The protected OS kernel mediates access to shared resources. The kernel code and data are protected from untrusted processes.

  4. Upcall example: Unix signals • Signals are asynchronous notifications to a user process that some event of interest to it has occurred. • A process may register signal handlers for various events relating to the process. The signal handlers are procedures in user space. • To deliver a signal, the kernel redirects a user thread to execute a selected registered signal handler in user mode. • Unix signals take a default action if no handler is registered. • E.g., segmentation fault → die. Other actions: ignore, stop.
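The register-then-deliver sequence on this slide can be sketched with the POSIX `sigaction` and `raise` calls. This is a minimal illustration, not code from the course: the names `on_usr1`, `got_signal`, and `signal_demo` are hypothetical, and the process sends the signal to itself so that delivery is easy to observe.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction under strict -std modes */
#include <signal.h>
#include <string.h>

/* Written by the handler; sig_atomic_t is safe to store from a signal handler. */
static volatile sig_atomic_t got_signal = 0;

/* The upcall target: a user-space procedure the kernel redirects a thread to. */
static void on_usr1(int signo) {
    got_signal = signo;
}

/* Register a handler for SIGUSR1, send SIGUSR1 to ourselves, and return the
   signal number the handler observed (or -1 on error). */
int signal_demo(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) < 0)
        return -1;
    if (raise(SIGUSR1) != 0)   /* the handler runs before raise() returns */
        return -1;
    return got_signal;
}
```

Without the `sigaction` call, the default action for SIGUSR1 would terminate the process, which is the "default action if no handler is registered" case.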

  5. Processes and their threads [Figure labels: virtual address space, main thread with its stack, other threads (optional).] Each process has a virtual address space (VAS): a private name space for the virtual memory it uses. The VAS is both a “sandbox” and a “lockbox”: it limits what the process can see/do, and protects its data from others. Each process has a main thread bound to the VAS, with a stack. If we say a process does something, we really mean its thread does it. The kernel can suspend/restart a thread wherever and whenever it wants. On real systems, a process can have multiple threads. We presume that they can all make system calls and block independently.

  6. The theater analogy Running a program is like performing a play: the program is the script, the threads perform it, and the address space (virtual memory) is the stage. [lpcox]

  7. The sheep analogy [Figure labels: address space (“fenced pasture”), code and data, thread.]

  8. The core-and-driver analogy The machine has a bank of CPU cores for threads to run on. The OS allocates cores to threads (the “drivers”). Cores are hardware. They go where the driver tells them. The OS can force a switch of drivers at any time. [Figure labels: Core #1, Core #2.]

  9. Threads drive cores

  10. What was the point of that whole thing with the electric sheep actors? • A process is a running program. • A running program (a process) has at least one thread (“main”), but it may (optionally) create other threads. • The threads execute the program (“perform the script”). • The threads execute on the “stage” of the process virtual memory, with access to a private instance of the program’s code and data. • A thread can access any virtual memory in its process, but is contained by the “fence” of the process virtual address space. • Threads run on cores: a thread’s core executes instructions for it. • Sometimes threads idle to wait for a free core, or for some event. Sometimes cores idle to wait for a ready thread to run. • The operating system kernel shares/multiplexes the computer’s memory and cores among the virtual memories and threads.

  11. More analogies: threads and stacks • Threads drive their cores on paths across the stage. • Each thread chooses its own path. (Determined by its program.) • But they must leave some “bread crumbs” to find their way back on the return! • Where does a thread leave its crumbs? On the stack! • Call frames with local variables • Return addresses This means that each thread must have its own stack, so that their crumbs aren’t all mixed together.

  12. Kernel Stacks and Trap/Fault Handling Threads execute user code on a user stack in user space (the process virtual address space). Each thread has a second kernel stack in kernel space (VM accessible only in kernel mode). System calls and faults run in kernel mode on a kernel stack for the current thread. Kernel code running in P’s process context has access to P’s virtual memory. The syscall (trap) handler makes an indirect call through the system call dispatch table to the handler registered for the specific system call.
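The "indirect call through the system call dispatch table" can be modeled in ordinary user-space C as an array of function pointers indexed by call number. This is only a toy model under stated assumptions: the handlers `sys_double` and `sys_negate` and the entry point `toy_trap` are hypothetical names, and a real kernel's table, calling convention, and error codes differ.

```c
#include <stddef.h>

/* Every handler in this toy model takes one argument and returns a result. */
typedef long (*syscall_fn)(long arg);

static long sys_double(long arg) { return 2 * arg; }
static long sys_negate(long arg) { return -arg; }

/* The dispatch table: the call number is an index into this array. */
static const syscall_fn dispatch_table[] = { sys_double, sys_negate };

/* Stand-in for the trap handler: validate the number, then make an indirect
   call through the table. Returns -1 for an unknown call number (in this toy,
   that value is not distinguishable from a handler legitimately returning -1). */
long toy_trap(size_t number, long arg) {
    size_t count = sizeof dispatch_table / sizeof dispatch_table[0];
    if (number >= count)
        return -1;
    return dispatch_table[number](arg);
}
```

The bounds check matters: the call number comes from untrusted user code, so the handler must never index past the end of the table.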

  13. Thread context • Each thread has a context (exactly one). • Context == values in the thread’s registers • Including a (protected) identifier naming its VAS. • And a pointer to thread’s stack in VAS/memory. • Each core has a context (at least one). • Context == a register set that can hold values. • The register set is baked into the hardware. • A core can change “drivers”: context switch. • Save running thread’s register values into memory. • Load new thread’s register values from memory. • Enables time slicing or time sharing of machine. [Figure: a CPU core’s register set, R0 through Rn plus PC and SP.]

  14. Two threads [Figure: virtual memory holding program code, library, data, and two stacks. One thread is running on the CPU core, whose registers (R0 through Rn, PC, SP) hold its context; the other is “on deck” and ready to run, with its register values saved in memory.]

  15. Thread context switch Running code can suspend the current thread just by saving its register values in memory. Load them back to resume it at any time. [Figure: one thread is switched out and another switched in: 1. save registers of the outgoing thread to memory; 2. load registers of the incoming thread from memory.]
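Save-registers and load-registers can be tried out at user level with the `getcontext`, `makecontext`, and `swapcontext` calls from `<ucontext.h>`. This sketch is an assumption-laden stand-in for what a kernel does, not kernel code: it assumes a platform that still ships the ucontext functions (e.g., Linux with glibc), and `worker`, `trace`, and `context_switch_demo` are hypothetical names.

```c
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;
static char worker_stack[64 * 1024];   /* the second "thread" needs its own stack */
static int trace[4];
static int trace_len = 0;

static void worker(void) {
    trace[trace_len++] = 2;               /* running on worker_stack */
    swapcontext(&worker_ctx, &main_ctx);  /* save our registers, load main's */
    trace[trace_len++] = 4;               /* resumed exactly where we left off */
}

/* Switch back and forth between two contexts; returns the number of trace
   entries recorded (4 when every switch happened), or -1 on error. */
int context_switch_demo(void) {
    trace_len = 0;
    if (getcontext(&worker_ctx) < 0)
        return -1;
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;       /* context to resume when worker() returns */
    makecontext(&worker_ctx, worker, 0);

    trace[trace_len++] = 1;
    swapcontext(&main_ctx, &worker_ctx);  /* switch in the worker */
    trace[trace_len++] = 3;
    swapcontext(&main_ctx, &worker_ctx);  /* resume the worker a second time */
    return trace_len;
}
```

After the call, `trace` holds 1, 2, 3, 4: each side continues from the point where its registers were last saved.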

  16. More analogies: context/switching Page links and back button navigate a “stack” of pages in each tab. Each tab has its own stack. One tab is active at any given time. You create/destroy tabs as needed. You switch between tabs at your whim. Similarly, each thread has a separate stack. The OS switches between threads at its whim. One thread is active per CPU core at any given time.

  17. What causes a context switch? There are three possible causes: • Preempt (yield). The thread has had full use of the core for long enough. It has more to do, but it’s time to let some other thread “drive the core”. • E.g., timer interrupt, quantum expired → OS forces yield • Thread enters Ready state, goes into pool of runnable threads. • Exit. Thread is finished: “park the core” and die. • Block/sleep/wait. The thread cannot make forward progress until some specific occurrence takes place. • Thread enters Blocked state, and just lies there until the event occurs. (Think “stop sign” or “red light”.)

  18. Thread states and transitions Scheduler governs these transitions. [Figure: three states: running (“driving a car”), ready (“requesting a car”), and blocked (“waiting for someplace to go”). Transitions: dispatch (ready → running), yield/preempt (running → ready), sleep (running → blocked, e.g., on wait, STOP, read, write, listen, receive, etc.), wakeup (blocked → ready).] Sleep and wakeup are internal primitives. Wakeup adds a thread to the scheduler’s ready pool: a set of threads in the ready state.

  19. Thread states and transitions We will presume that these transitions occur only in kernel mode. This is true in classical Unix and in systems with pure kernel-based threads. Before a thread can sleep, it must first enter the kernel via trap (syscall) or fault. Before a thread can yield, it must enter the kernel, or the core must take an interrupt to return control to the kernel. On entry to the running state, kernel code decides if/when/how to enter user mode, and sets up a suitable context, e.g., for initial start, return from fault or syscall, or to deliver a signal. [Figure: the same state diagram: running, ready, blocked, with transitions dispatch, yield/preempt, sleep, wakeup.]
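The state diagram on these two slides can be written down as a small transition function. This is an illustrative sketch only: the enum names and `next_state` are hypothetical, and the exited state is added here to cover the "Exit" cause from the context-switch slide.

```c
typedef enum { ST_READY, ST_RUNNING, ST_BLOCKED, ST_EXITED } thread_state;
typedef enum { EV_DISPATCH, EV_PREEMPT, EV_SLEEP, EV_WAKEUP, EV_EXIT } thread_event;

/* Returns the state that follows `s` on event `e`, or -1 if the diagram has
   no such transition (e.g., a blocked thread cannot be dispatched directly). */
int next_state(thread_state s, thread_event e) {
    switch (s) {
    case ST_READY:
        return e == EV_DISPATCH ? ST_RUNNING : -1;
    case ST_RUNNING:
        if (e == EV_PREEMPT) return ST_READY;    /* yield or quantum expired */
        if (e == EV_SLEEP)   return ST_BLOCKED;  /* waiting for some event */
        if (e == EV_EXIT)    return ST_EXITED;
        return -1;
    case ST_BLOCKED:
        return e == EV_WAKEUP ? ST_READY : -1;   /* wakeup joins the ready pool */
    default:
        return -1;                               /* exited threads never move */
    }
}
```

Note that wakeup leads to ready, not running: a woken thread still has to wait for the scheduler to dispatch it onto a core.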

  20. More analogies See the student art in the solutions for the 13f final exam, which is posted in the exam archive.

  21. More of my favorite student art See the student art in the solutions for the 13f final exam, which is posted in the exam archive.

  22. Process management • OS offers system call APIs for managing processes. • Create processes (children) • Control processes • Monitor process execution • “Join”: wait for a process to exit and return a result • “Kill”: send a signal to a process • Establish interprocess communication (IPC: later) • Launch a program within a process • We study the Unix process abstraction as an example. • Illustrative and widely used for 40+ years! • Use it to build your own shell.

  23. Unix: A lasting achievement? “Perhaps the most important achievement of Unix is to demonstrate that a powerful operating system for interactive use need not be expensive…it can run on hardware costing as little as $40,000.” D. M. Ritchie and K. Thompson, The UNIX Time-Sharing System, 1974. [Figure: DEC PDP-11/24, http://histoire.info.online.fr/pdp11.html]

  24. The essence of Unix process “fork” fork Oh Ghost of Walt, please don’t sue me.

  25. Unix fork/exec/exit/wait syscalls • int pid = fork(); • Create a new process that is a clone of its parent, running the same program. • exec*("program" [, argvp, envp]); • Overlay the calling process with a new program, and transfer control to it, passing arguments and environment. • exit(status); • Exit with status, destroying the process. • int pid = wait*(&status); • Wait for exit (or other status change) of a child, and “reap” its exit status. • Recommended: use waitpid(). [Figure: timeline. The parent forks a child; in the child, the parent program initializes the child context and calls exec, and the child later exits; the parent calls wait.]

  26. The Shell • Users may select from a range of command interpreter (“shell”) programs available. (Or even write their own!) • csh, sh, ksh, tcsh, bash: choose your flavor… • Shells execute commands composed of program filenames, args, and I/O redirection symbols. • Shell uses fork, exec, wait, etc., etc. • Can coordinate multiple child processes that run together as a process group or job. • Shells can run files of commands (scripts) for more complex tasks, e.g., by redirecting I/O channels (descriptors). • Shell behavior is guided by environment variables, e.g., $PATH • Parent may control/monitor all aspects of child execution.

  27. Postnote • The following slides were used in recitation on 1/27. • They fill in details on Unix fork/exec/exit/wait, variations and how they evolved, usage, and the power of combining these simple concepts. • They also illustrate fundamental threading concepts: thread create/exit/join, physical and logical concurrency, blocking/sleeping and wakeup, and parallelism. • You should be sure to understand all the concepts, but the details of the system calls and C/Unix programming model are for illustration only. • All of the C examples are available on the course web.

  28. Unix fork/exit syscalls • int pid = fork(); • Create a new process that is a clone of its parent. Return child process ID (pid) to parent, return 0 to child. • exit(status); • Exit with status, destroying the process. • Status is returned to the parent. • Note: this is not the only way for a process to exit! [Figure: timeline of a parent forking a child, each with its own data and each eventually exiting; example pids 5587 and 5588.]

  29. exit syscall (original concept)

  30. fork The fork syscall returns twice: It returns a zero in the context of the new child process. It returns the new child process ID (pid) in the context of the parent. int pid; int status = 0; if (pid = fork()) { /* parent */ ….. } else { /* child */ ….. exit(status); }
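A runnable variant of the fragment above, as a hedged sketch: it adds the error case (fork returns -1 on failure, which the `if (pid = fork())` test would otherwise treat as "parent") and has the parent reap the child's status. The function name `fork_and_reap` is hypothetical. The child uses `_exit` rather than `exit` so that it does not flush stdio buffers inherited from the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with `child_status`. Returns the exit
   status the parent reaps, or -1 if fork/waitpid fails or the child did not
   exit normally. */
int fork_and_reap(int child_status) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;               /* fork failed: no child was created */
    if (pid == 0)
        _exit(child_status);     /* child: fork returned 0 here */

    /* Parent: fork returned the child's pid here. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Only the low 8 bits of the exit status reach the parent, so values passed to `fork_and_reap` should stay in the range 0 to 255.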

  31. A simple program: sixforks … int main(int argc, char* argv[]) { fork(); fork(); fork(); fork(); fork(); fork(); printf("Process %d exiting.\n", getpid()); } getpid syscall: get the process ID of the current process. How many processes are created by these six forks? chase$ cc -o sixforks sixforks.c chase$ ./sixforks ??? chase$

  32. A simple program: sixforks … int main(int argc, char* argv[]) { fork(); fork(); fork(); fork(); fork(); fork(); printf("Process %d exiting.\n", getpid()); } chase$ cc -o sixforks sixforks.c chase$ ./sixforks Process 15191 exiting. Process 15200 exiting. Process 15195 exiting. Process 15194 exiting. Process 15197 exiting. Process 15202 exiting. Process 15193 exiting. Process 15198 exiting. Process 15215 exiting. Process 15217 exiting. Process 15218 exiting. Process 15203 exiting. chase$ Process 15212 exiting. Process 15196 exiting. Process 15222 exiting. Process 15213 exiting. Process 15221 exiting. Process 15224 exiting. Process 15206 exiting. Process 15216 exiting. Process 15205 exiting. Process 15207 exiting. Process 15201 exiting. Process 15214 exiting. Process 15225 exiting. Process 15199 exiting. Process 15226 exiting. Process 15208 exiting. Process 15229 exiting. Process 15220 exiting. Process 15209 exiting. Process 15232 exiting. Process 15219 exiting. Process 15233 exiting. Process 15223 exiting. Process 15210 exiting. Process 15234 exiting. Process 15228 exiting. Process 15192 exiting. Process 15230 exiting. Process 15211 exiting. Process 15227 exiting. Process 15239 exiting. Process 15231 exiting. Process 15242 exiting. Process 15243 exiting. Process 15240 exiting. Process 15236 exiting. Process 15241 exiting. Process 15244 exiting. Process 15247 exiting. Process 15235 exiting. Process 15245 exiting. Process 15250 exiting. Process 15248 exiting. Process 15249 exiting. Process 15204 exiting. Process 15238 exiting. Process 15251 exiting. Process 15237 exiting. Process 15252 exiting. Process 15253 exiting. Process 15246 exiting. Process 15254 exiting.

  33. sixforks: some questions • What if I want to create six children, but I don’t want my children to have children of their own? • What if I want the program to print the total number of processes created? How? (Other than by having the program do the math.) • How much memory does this program use? How many pages? • How does this test system assign process IDs? • Why do the process IDs print out of order?
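On the first two questions: every process that reaches a fork() doubles, so six unguarded forks leave 2^6 = 64 processes, 63 of them new. One way to get exactly six children with no grandchildren is to have each child leave before it reaches the next fork, and to count successful forks in the parent. The sketch below assumes that approach; `spawn_flat` is a hypothetical name, not part of the course code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create `n` children, none of which forks again. Returns the number of
   children the parent created and reaped. */
int spawn_flat(int n) {
    int created = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;               /* fork failed: stop creating children */
        if (pid == 0)
            _exit(0);            /* child exits here, never reaching the next fork */
        created++;               /* only the parent counts */
    }

    int reaped = 0;
    while (reaped < created && wait(NULL) > 0)
        reaped++;                /* collect each child's exit status */
    return reaped;
}
```

The count lives in the parent alone. Each child has a private copy of `created`, so a counter incremented in the children would never be visible to the parent without some form of IPC.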

  34. fork (original concept)

  35. fork in action today void dofork() { int cpid = fork(); if (cpid < 0) { perror("fork failed: "); exit(1); } else if (cpid == 0) { child(); } else { parent(cpid); } } Fork is conceptually difficult but syntactically clean and simple. I don’t have to say anything about what the new child process “looks like”: it is an exact clone of the parent! The child has a new thread executing at the same point in the same program. The child is a new instance of the running program: it has a “copy” of the entire address space. The “only” change is the process ID and return code cpid! The parent thread continues on its way. The child thread continues on its way.

  36. A simple program: forkdeep int count = 0; int level = 0; void child() { level++; /* output pids */ if (level < count) dofork(); if (level == count) sleep(3); /* pause 3 secs */ } void parent(int childpid) { /* output pids */ /* wait for child to finish */ } int main(int argc, char *argv[]) { count = atoi(argv[1]); dofork(); /* output pid */ } We’ll see later where arguments come from. [Figure labels: level==1, level==2.]

  37. chase$ ./forkdeep 4 30866-> 30867 30867 30867-> 30868 30868 30868-> 30869 30869 30869-> 30870 30870 30870 30869 30868 30867 30866 chase$ chase$ ./forkdeep 3 11496-> 11498 11498 11498-> 11499 11499 11499-> 11500 11500 11500 11499 11498 11496 chase$
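The nesting that forkdeep prints (each child forks the next, and each parent waits for its own child) can also be checked without reading pids. The following is a variation on forkdeep, not the course's code: each child reports the length of the chain below it through its exit status. The name `fork_chain` is hypothetical, and because exit statuses are 8 bits the sketch assumes a count well under 255.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Build a chain of `count` nested processes: the caller forks a child, which
   forks its own child, and so on. Returns the chain length observed by the
   caller, or -1 on error. */
int fork_chain(int count) {
    if (count <= 0)
        return 0;                        /* bottom of the chain: nothing to fork */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: extend the chain, then report 1 + (length below me). */
        int below = fork_chain(count - 1);
        _exit(below < 0 ? 255 : below + 1);   /* 255 marks an error */
    }

    /* Parent: sleep in waitpid until the whole chain below has exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    int len = WEXITSTATUS(status);
    return len == 255 ? -1 : len;
}
```

As in the forkdeep output above, the deepest process finishes first and the original parent finishes last, because every parent is blocked in wait until its child exits.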

  38. wait Process states (i.e., states of the main thread of the process): wait puts the parent to “sleep”; the child’s exit triggers the “wakeup”. Note: in modern systems the wait syscall has many variants and options.

  39. wait today • Parent uses wait to sleep until the child exits; wait returns child pid and status. • Wait variants allow wait on a specific child, or notification of stops and other “signals”. • Recommended: use waitpid(). int pid; int status = 0; if (pid = fork()) { /* parent */ ….. pid = wait(&status); } else { /* child */ ….. exit(status); }

  40. A simple program: parallel … int main(int argc, char* argv[]) { /* for i = 1 to N */ dofork(); /* for i = 1 to N */ wait(…); } void child() { BUSYWORK {x = v;} exit(0); } … Parallel creates N child processes and waits for them all to complete. Each child performs a computation that takes, oh, 10-15 seconds, storing values repeatedly to a global variable, then it exits. How does N affect completion time? chase$ cc -o parallel parallel.c chase$ ./parallel ??? chase$
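A compact, runnable approximation of the parallel experiment is sketched below. It is not the course's parallel.c: the names `busywork` and `run_parallel` are hypothetical, the iteration count is a parameter rather than a fixed 10-15 seconds of work, and timing uses `clock_gettime(CLOCK_MONOTONIC, …)`.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* The "single global variable": after fork, each child has a private copy. */
static volatile long x;

static void busywork(long iters) {
    for (long v = 0; v < iters; v++)
        x = v;                            /* store repeatedly to the global */
}

/* Fork `n` children that each do `iters` stores, wait for all of them, and
   return the elapsed wall-clock time in milliseconds (or -1 on error). */
long run_parallel(int n, long iters) {
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            busywork(iters);
            _exit(0);
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        if (wait(NULL) < 0)
            return -1;
    if (started != n)
        return -1;                        /* some fork failed */

    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1000L + (t1.tv_nsec - t0.tv_nsec) / 1000000L;
}
```

Calling `run_parallel` for increasing values of `n` and plotting the results gives the kind of curve the following slides ask about; with more children than cores, completion time should grow roughly in proportion to the total work.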

  41. A simple program: parallel [Figure: completion time (ms) versus N (# of children), one line for each of three different machines.]

  42. Parallel: some questions • Which machine is fastest? • How does the total work grow as a function of N? • Does completion time scale with the total work? Why? • Why are the lines flatter for low values of N? • How many cores do these machines have? • Why is the timing roughly linear, even for “odd” N? • Why do the lines have different slopes? • Why would the completion time ever drop with higher N? • Why is one of the lines smoother than the other two? • Can we filter out the noise? • Do the processes contend for the single global variable?

  43. But how do I run a new program in my child process? • The child, or any process really, can replace its program in midstream. • exec* system call: “forget everything in my address space and reinitialize my entire address space with stuff from a named program file.” • A successful exec system call never returns: the new program executes in the calling process until it dies (exits). • The code from the parent program runs in the child process and controls its future. The parent program selects the child program that the child process will run (via exec), and sets up its connections to the outside world. The child program doesn’t even know how those connections are set up! • But don’t forget to check error status from exec*! If exec* fails, it returns an error to the parent program code that called it, which is still running in the child process.
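The failure path in the last bullet can be sketched as follows. This is an illustration under assumptions: `run_program` is a hypothetical helper, it uses `execv`, and it reports an exec failure with exit status 127, a convention borrowed from shells rather than anything the slides prescribe.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at `path` with arguments `argv` in a child process and
   return its exit status. Returns 127 if exec failed in the child, and -1 if
   fork/waitpid failed or the child did not exit normally. */
int run_program(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(path, argv);
        /* Reached only if exec failed: the parent program's code is still
           running here, in the child, and must report the error itself. */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The tests for this sketch assume `/bin/sh` exists, as it does on typical Unix systems: running `sh -c "exit 3"` yields 3, while a nonexistent path yields 127.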

  44. Running a program When a program launches, the OS creates a process to run it, with a main thread to execute the code, and a virtual memory to store the running program’s code and data. [Figure: in Unix, fork/exec maps the sections of the program file (code or “text”, constants, initialized data) into segments of the process virtual memory, where the thread runs.]

  45. exec (original concept)

  46. A simple program: forkexec … int main(int argc, char *argv[]) { int status; int rc = fork(); if (rc < 0) { perror("fork failed: "); exit(1); } else if (rc == 0) { printf("I am a child: %d.\n", getpid()); argv++; execve(argv[0], argv, 0); /* NOTREACHED */ } else { waitpid(rc, &status, 0); printf("Child %d exited with status %d.\n", rc, WEXITSTATUS(status)); } } Notes: Always check return from syscalls and show any errors! The rc == 0 branch is the parent program running in the child process. A successful exec* never returns to the calling program. The parent reaps the exit status (return value) from the child via exit/wait.

  47. A simple program: prog0 … int main() { printf("Hi from %d!\n", getpid()); exit(72); } exit syscall: pass the exit status (return value) to the parent via exit/wait. getpid syscall: get the process ID of the current process. chase$ cc -o forkexec forkexec.c chase$ cc -o prog0 prog0.c chase$ ./forkexec prog0 I am a child: 11384. Hi from 11384! Child 11384 exited with status 72. chase$

  48. Exec setup (ABI) The details aren’t important. The point is: The exec system call sets up the VAS of the calling process to execute a named program. Exec passes two arrays of strings to the new program’s main(): an array of arguments and an array of named environment variables. It stages the argv/env arrays in the VAS before returning to user mode to start execution at main(). [Figure source: System V Application Binary Interface, AMD64 Architecture Processor Supplement.]
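To see both arrays arrive in the new program, one option is to exec a shell with an explicit environment and let it echo a variable back through its exit status. This is a hedged sketch: `run_with_env` is a hypothetical helper, and it assumes `/bin/sh` is present.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Exec /bin/sh in a child with the given argument array and environment
   array, and return the child's exit status (127 if exec failed, -1 on other
   errors). Both arrays must be NULL-terminated. */
int run_with_env(char *const argv[], char *const envp[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve("/bin/sh", argv, envp);   /* passes both arrays to the new program */
        _exit(127);                      /* only reached if execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, with `argv = {"sh", "-c", "exit $CODE", NULL}` and `envp = {"CODE=5", NULL}`, the shell reads CODE from the environment staged by exec and exits with status 5. The child sees only the variables in `envp`; nothing is inherited implicitly.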

  49. Simple I/O: args and printf #include <stdio.h> int main(int argc, char* argv[]) { int i; printf("arguments: %d\n", argc); for (i=0; i<argc; i++) { printf("%d: %s\n", i, argv[i]); } } chase$ cc -o prog1 prog1.c chase$ ./forkexec prog1 arguments: 1 0: prog1 child 19178 exited with status 0 chase$ ./forkexec prog1 one 2 3 arguments: 4 0: prog1 1: one 2: 2 3: 3 Child 19181 exited with status 0.

  50. Environment variables and property lists • The environment variable array is a property list. • The property list construct is very common and useful! • Also commonly used for configuration files. • It goes by various names: Java plist, Windows Registry, INI files • Each element of the list is a string: “NAME=VALUE”. • The standard library has primitives to look up the VALUE corresponding to a NAME. • In Unix systems: standard environment variables are handed down through the shell: they give programs lots of information about the environment. • The parent specifies them to the exec* syscall.
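The lookup primitive is simple enough to write by hand. The sketch below scans a NULL-terminated array of “NAME=VALUE” strings, which is the same layout exec hands to a program; `plist_lookup` is a hypothetical name, and the standard library's `getenv` performs the equivalent lookup on the process's own environment.

```c
#include <stddef.h>
#include <string.h>

/* Find NAME in a NULL-terminated array of "NAME=VALUE" strings. Returns a
   pointer to the VALUE part, or NULL if NAME is not present. */
const char *plist_lookup(char *const plist[], const char *name) {
    size_t n = strlen(name);
    for (size_t i = 0; plist[i] != NULL; i++) {
        /* Match the whole name, followed immediately by '='. */
        if (strncmp(plist[i], name, n) == 0 && plist[i][n] == '=')
            return plist[i] + n + 1;
    }
    return NULL;
}
```

Checking for the `=` right after the name keeps a query for `PAT` from matching an entry for `PATH`.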
