390 likes | 414 Views
Learn about processes, their definitions and key abstractions, memory management, context switching, multiprocessing, and exceptional control flow in computer science.
E N D
Lecture 18: Processes CS 105 April 1, 2019
Definition: A program is a file containing code + data that describes a computation • Definition: A process is an instance of a running program. • One of the most profound ideas in computer science • Not the same as “program” or “processor” • Process provides each program with two key abstractions: • Private address space • Each program seems to have exclusive use of main memory. • Provided by kernel mechanism called virtual memory • Logical control flow • Each program seems to have exclusive use of the CPU • Provided by kernel mechanism called context switching Processes Memory Stack Heap Data Code CPU Registers
Computer runs many processes simultaneously • Applications for one or more users • Web browsers, email clients, editors, … • Background tasks • Monitoring network & I/O devices Multiprocessing: The Illusion Memory Memory Memory Stack Stack Stack Heap Heap … Heap Data Data Data Code Code Code CPU CPU CPU Registers Registers Registers
Running program “top” on Mac • System has 123 processes, 5 of which are active • Identified by Process ID (PID) Multiprocessing Example
Single processor executes multiple processes concurrently • Process executions interleaved (multitasking) • Address spaces managed by virtual memory system • Register values for nonexecuting processes saved in memory Multiprocessing: The (Traditional) Reality Memory Stack Stack Stack Heap Heap Heap … Data Data Data Code Code Code Saved registers Saved registers Saved registers CPU Registers
Save current registers in memory Multiprocessing: The (Traditional) Reality Memory Stack Stack Stack Heap Heap Heap … Data Data Data Code Code Code Saved registers Saved registers Saved registers CPU Registers
Save current registers in memory Schedule next process for execution Multiprocessing: The (Traditional) Reality Memory Stack Stack Stack Heap Heap Heap … Data Data Data Code Code Code Saved registers Saved registers Saved registers CPU Registers
Multiprocessing: The (Traditional) Reality Memory Stack Stack Stack Heap Heap Heap … Data Data Data Code Code Code Saved registers Saved registers Saved registers CPU Registers • Save current registers in memory • Schedule next process for execution • Load saved registers and switch address space (context switch)
Control flows for concurrent processes are physically disjoint in time However, we can think of concurrent processes as running in parallel with each other User View of Concurrent Processes Process A Process B Process C Time
Each process is a logical control flow. • Two processes run concurrently (are concurrent) if their flows overlap in time • Otherwise, they are sequential • Examples (running on single core): • Concurrent: A & B, A & C • Sequential: B & C Concurrent Processes Process A Process B Process C Time
Processes are managed by a shared chunk of memory-resident OS code called the kernel • Important: the kernel is not a separate process, but rather runs as part of some existing process. • Control flow passes from one process to another via a context switch Context Switching Process A Process B user code context switch kernel code Time user code context switch kernel code user code
To implement a context switch, OS maintains a PCB for each process containing: • location in memory • register values • PC, SP, eflags/status register • location of executable on disk • page tables • which user is executing this process • process identifier (pid) • process privilege level • process arguments (for identification with ps) • process status • scheduling information ... and more! Process Control Block (PCB)
Multicore processors • Multiple CPUs on single chip • Share main memory (and some of the caches) • Each can execute a separate process • Scheduling of processors onto cores done by kernel Multiprocessing: The (Modern) Reality Memory Stack Stack Stack Heap Heap Heap … Data Data Data Code Code Code Saved registers Saved registers Saved registers CPU CPU Registers Registers
Exists at all levels of a computer system • Low level mechanisms • 1. Exceptions • Change in control flow in response to a system event (i.e., change in system state) • Implemented using combination of hardware and OS software • Higher level mechanisms • 2. Process context switch • Implemented by OS software and hardware timer • 3. Signals • Implemented by OS software • 4. Nonlocal jumps: setjmp() and longjmp() • Implemented by C runtime library Exceptional Control Flow
An exception is a transfer of control to the OS kernel in response to some event (i.e., change in processor state) • Kernel is the memory-resident part of the OS • Examples of events: Divide by 0, arithmetic overflow, page fault, I/O request completes, typing Ctrl-C Exceptions User code Kernel code Exception Event I_current Exception processing by exception handler I_next • Return to I_current • Return to I_next • Abort
Each type of event has a unique exception number k k = index into exception table (a.k.a. interrupt vector) Handler k is called each time exception k occurs Exception Tables Exception numbers Code for exception handler 0 Exception Table Code for exception handler 1 0 1 Code for exception handler 2 2 ... ... n-1 Code for exception handler n-1
Caused by events external to the processor • Indicated by setting the processor’s interrupt pin • Handler returns to “next” instruction • Examples: • Timer interrupt • Every few ms, an external timer chip triggers an interrupt • Used by the kernel to take back control from user programs • I/O interrupt from external device • Hitting Ctrl-C at the keyboard • Arrival of a packet from a network • Arrival of data from a disk Asynchronous Exceptions (Interrupts)
Caused by events that occur as a result of executing an instruction: • Traps • Intentional • Examples: system calls, breakpoint traps, special instructions • Returns control to “next” instruction • Faults • Unintentional but possibly recoverable • Examples: page faults (recoverable), protection faults (unrecoverable), floating point exceptions • Either re-executes faulting (“current”) instruction or aborts • Aborts • Unintentional and unrecoverable • Examples: illegal instruction, parity error, machine check • Aborts current program Synchronous Exceptions
From a programmer’s perspective, we can think of a process as being in one of three states • Running • Process is either executing, or waiting to be executed and will eventually be scheduled (i.e., chosen to execute) by the kernel • Stopped • Process execution is suspended and will not be scheduled until further notice • Terminated • Process is stopped permanently Process Status
Parent process creates a new running child process by calling fork • int fork(void) • Returns 0 to the child process, child’s PID to parent process • Child is almost identical to parent: • Child get an identical (but separate) copy of the parent’s virtual address space. • Child gets identical copies of the parent’s open file descriptors • Child has a different PID than the parent • fork is interesting (and often confusing) because it is called oncebut returns twice Creating Processes
pid_tgetpid(void) • Returns PID of current process • pid_tgetppid(void) • Returns PID of parent process Obtaining Process IDs
Process becomes terminated for one of three reasons: • Receiving a signal whose default action is to terminate (next lecture) • Returning from the main routine • Calling the exit function • void exit(int status) • Terminates with an exit status of status • Convention: normal return status is 0, nonzero on error • Another way to explicitly set the exit status is to return an integer value from the main routine • exit is called once but never returns. Terminating Processes
fork Example • Call once, return twice • Concurrent execution • Can’t predict execution order of parent and child • Duplicate but separate address space • x has a value of 1 when fork returns in parent and child • Subsequent changes to x are independent • Shared open files • stdout is the same in both parent and child intmain() { pid_tpid; intx = 1; pid = Fork(); if (pid == 0) { /* Child */ printf("child : x=%d\n", ++x); exit(0); } /* Parent */ printf("parent: x=%d\n", --x); exit(0); } fork.c
A process graph is a useful tool for capturing the partial ordering of statements in a concurrent program: • Each vertex is the execution of a statement • a -> b means a happens before b • Edges can be labeled with current value of variables • printf vertices can be labeled with output • Each graph begins with a vertex with no inedges • Any topological sort of the graph corresponds to a feasible total ordering. • Total ordering of vertices where all edges point from left to right Modeling fork with Process Graphs
Process Graph Example intmain() { pid_tpid; intx = 1; pid = Fork(); if (pid == 0) { /* Child */ printf("child : x=%d\n", ++x); exit(0); } /* Parent */ printf("parent: x=%d\n", --x); exit(0); } child: x=2 Child printf exit parent: x=0 x==1 Parent exit main fork printf fork.c
Original graph: Relabled graph: Interpreting Process Graphs child: x=2 printf exit e f Feasible total ordering: parent: x=0 x==1 exit main fork printf a b c d a b e c f d Infeasible total ordering: a b f c e d
fork Example: Two consecutive forks voidfork2() { printf("L0\n"); fork(); printf("L1\n"); fork(); printf("Bye\n"); } Bye forks.c printf Bye L1 L0 Bye L1 Bye L1 Bye Bye L0 L1 Bye Bye L1 Bye Bye Which of these outputs are feasible? printf fork printf Bye printf Bye L1 L0 printf fork printf fork printf
fork Example: Nested forks in parent voidfork4() { printf("L0\n"); if (fork() != 0) { printf("L1\n"); if (fork() != 0) { printf("L2\n"); } } printf("Bye\n"); } forks.c Bye Bye printf printf Which of these outputs are feasible? L0 L1 Bye Bye L2 Bye L0 Bye L1 Bye Bye L2 L0 L1 L2 Bye printf printf fork printf printf fork
fork Example: Nested forks in children L2 Bye voidfork5() { printf("L0\n"); if (fork() == 0) { printf("L1\n"); if (fork() == 0) { printf("L2\n"); } } printf("Bye\n"); } printf printf Bye L1 printf fork printf L0 Bye printf printf fork forks.c Which of these outputs are feasible? L0 Bye L1 L2 Bye Bye L0 Bye L1 Bye Bye L2
Non-terminating Child voidfork8() { if (fork() == 0) { /* Child */ printf("Running Child, PID = %d\n", getpid()); while (1) ; /* Infinite loop */ } else { printf("TerminatingParent, PID = %d\n", getpid()); exit(0); } }
voidfork8() { if (fork() == 0) { /* Child */ printf("Running Child, PID = %d\n", getpid()); while (1) ; /* Infinite loop */ } else { printf("TerminatingParent, PID = %d\n", getpid()); exit(0); } } linux> ./forks 8 Terminating Parent, PID = 6675 Running Child, PID = 6676 linux> ps PID TTY TIME CMD 6585 ttyp9 00:00:00 tcsh 6676 ttyp9 00:00:06 forks 6677 ttyp9 00:00:00 ps linux> kill 6676 linux>ps PID TTY TIME CMD 6585 ttyp9 00:00:00 tcsh 6678 ttyp9 00:00:00 ps forks.c • Child process still active even though parent has terminated • Must kill child explicitly, or else will keep running indefinitely
Idea • When process terminates, it still consumes system resources • Examples: Exit status, various OS tables • Called a “zombie” • Living corpse, half alive and half dead • Reaping • Performed by parent on terminated child (using wait or waitpid) • Parent is given exit status information • Kernel then deletes zombie child process • What if parent doesn’t reap? • If any parent terminates without reaping a child, then the orphaned child will be reaped by init process (pid == 1) • So, only need explicit reaping in long-running processes • e.g., shells and servers “Reaping” Children
voidfork7() { if (fork() == 0) { /* Child */ printf("Terminating Child, PID = %d\n", getpid()); exit(0); } else { printf("RunningParent, PID = %d\n", getpid()); while (1) ; /* Infinite loop */ } } linux> ./forks 7 & [1] 6639 Running Parent, PID = 6639 Terminating Child, PID = 6640 linux> ps PID TTY TIME CMD 6585 ttyp9 00:00:00 tcsh 6639 ttyp9 00:00:03 forks 6640 ttyp9 00:00:00 forks <defunct> 6641 ttyp9 00:00:00 ps linux> kill 6639 [1] Terminated linux> ps PID TTY TIME CMD 6585 ttyp9 00:00:00 tcsh 6642 ttyp9 00:00:00 ps forks.c • ps shows child process as “defunct” (i.e., a zombie) • Killing parent allows child to be reaped by init
Parent reaps a child by calling the wait function • int wait(int *child_status) • Suspends current process until one of its children terminates • Return value is the pid of the child process that terminated • If child_status!= NULL, then the integer it points to will be set to a value that indicates reason the child terminated and the exit status: • Checked using macros defined in wait.h • WIFEXITED, WEXITSTATIS, WIFSIGNALED, WTERMSIG, WIFSTOPPED, WSTOPSIG, WIFCONTINUED • See textbook for details wait: Synchronizing with Children
wait Example voidfork9() { intchild_status; if (fork() == 0) { printf("HC: hello from child\n"); exit(0); } else { printf("HP: hello from parent\n"); wait(&child_status); printf("CT: child has terminated\n"); } printf("Bye\n"); } forks.c HC exit printf Feasible output: HC HP CT Bye Infeasible output: HP CT Bye HC CT Bye HP printf wait printf fork
intexecve(char *filename, char *argv[], char *envp[]) • Loads and runs in the current process: • Executable file filename • Can be object file or script file beginning with #!interpreter (e.g., #!/bin/bash) • …with argument list argv • By convention argv[0]==filename • …and environment variable listenvp • “name=value” strings (e.g., USER=droh) • getenv, putenv, printenv • Overwrites code, data, and stack • Retains PID, open files and signal context • Called once and never returns • …except if there is an error execve: Loading and Running Programs
Linux Process Heirarchy [0] init [1] … … … Daemon e.g. httpd Login shell Login shell Child Child Child Grandchild Grandchild Note: you can view the hierarchy using the Linux pstree command
pstree on big big:~ 2$ pstree systemd─┬─accounts-daemon─┬─{gdbus} │ └─{gmain} ├─acpid ├─agetty ├─atd ├─cron … ├─rpcbind ├─rsyslogd─┬─{in:imklog} │ ├─{in:imuxsock} │ └─{rs:mainQ:Reg} ├─4*[sh───csim] ├─snapd───16*[{snapd}] ├─sshd─┬─sshd───sshd───bash───pstree │ └─sshd───sshd───bash … partial output