Understand process management, mutual exclusion, deadlock, and implementing processes and threads. Explore threads vs. events and context switching in dual-mode operation.
CS 5204 Operating Systems • Review: Basic Concepts • Godmar Back
Announcements • Send email with your presentation preference CS 5204 Fall 2012
Concurrency • Review of basic concepts • Process Management as OS responsibility • process vs thread abstraction • Synchronization Issues: • mutual exclusion & race conditions • deadlock & starvation • Implementing processes & threads • Programming models for communication • threads vs events CS 5204 Fall 2012
Definition: Process/Thread • Process • “program in execution” • resources: CPU context, memory maps, open files, privileges, ….; isolated • Threads • CPU context (state + stack); not isolated • “thread” is a historically recent term • Threads used to be called “processes” • Q: what primitives does an OS need to provide? CS 5204 Fall 2012
Processes vs Threads P1 P2 P3 • Processes execute concurrently and share resources: • Files on disk; Open files (via inherited file descriptors); Terminal, etc. • Kernel-level concurrency • but do not (usually) share any of their memory • cannot share data easily by exchanging pointers to it • Threads are separate logical flows of control (with separate stacks!) that share memory and can refer to same data • Different models and variations exist • Application-level concurrency Kernel CS 5204 Fall 2012
Context Switching • Historical motivation for processes was introduction of multi-programming: • Load multiple processes into memory, and switch to another process if current process is (momentarily) blocked • This required protection and isolation between these processes, implemented by a privileged kernel: dual-mode operation. • Time-sharing: switch to another process periodically to make sure all processes make equal progress • Switch between processes is called a context switch CS 5204 Fall 2012
Dual-Mode Operation • Two fundamental modes: • “kernel mode” – privileged • aka system, supervisor or monitor mode • Intel calls it PL0, Privilege Level 0 on x86 • “user mode” – non-privileged • PL3 on x86 • Bit in CPU – controls operation of CPU • Privileged operations can only be performed in kernel mode. Example: hlt • Must carefully control transition between user & kernel mode
    int main() { asm("hlt"); }
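One way to observe the effect of the `hlt` example without crashing the calling program is to execute the instruction in a forked child and look at how the child died. The following is only a sketch, assuming Linux on x86 with GCC or Clang inline assembly; the helper name `hlt_is_trapped` is made up for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if executing hlt in user mode killed the
 * child with a signal, 0 if the child survived, and -1 if the experiment
 * could not be run (fork failed, or this is not an x86 build). */
int hlt_is_trapped(void)
{
#if defined(__i386__) || defined(__x86_64__)
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        __asm__ volatile("hlt");   /* privileged instruction: faults at PL3 */
        _exit(0);                  /* reached only if hlt did not fault */
    }
    int status;
    if (waitpid(child, &status, 0) != child)
        return -1;
    return WIFSIGNALED(status) ? 1 : 0;
#else
    return -1;                     /* assumption: only x86 is covered here */
#endif
}
```

On an x86 Linux system the CPU raises a fault, the kernel turns it into a signal for the child, and the parent continues unharmed.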
Mode Switching • User → Kernel mode • For reasons external or internal to CPU • External (aka hardware) interrupt: • timer/clock chip, I/O device, network card, keyboard, mouse • asynchronous (with respect to the executing program) • Internal interrupt (aka software interrupt, trap, or exception) • are synchronous • can be intended (“trap”): for system call (process wants to enter kernel to obtain services) • or (usually) unintended (“fault/exception”): division by zero, attempt to execute privileged instruction in user mode, memory access violation, invalid instruction, alignment error, etc. • Kernel → User mode switch on iret instruction
A Context Switch Scenario [Figure: timeline of Process 1, Process 2, and the kernel, alternating between user mode and kernel mode] • Timer interrupt: P1 is preempted, context switch to P2 • System call (trap): P2 starts I/O operation, blocks; context switch to Process 1 • I/O device interrupt: P2’s I/O complete, switch back to P2 • Timer interrupt: P2 still has time left, no context switch
System Calls [Figure: Process 1 crosses from user mode into kernel mode, runs the kernel’s system call implementation, and returns] • User processes access kernel services by trapping into the kernel, executing kernel code to perform the service, then returning – very much like a library call. • Unless the system call cannot complete immediately, this does not involve a context switch.
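The "very much like a library call" point can be made concrete with a small sketch that enters the kernel through the generic `syscall()` wrapper instead of the usual `getpid()` library function. This assumes Linux with glibc; `raw_getpid` is a hypothetical name used only here.

```c
#define _DEFAULT_SOURCE            /* assumption: glibc; exposes syscall() */
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Traps into the kernel for the getpid service and returns its result.
 * The call does not block, so no context switch is needed to serve it. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

Both paths end up in the same kernel service, so `raw_getpid()` and `getpid()` report the same process id.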
Kernel Threads [Figure: Process 1, Process 2, and a kernel thread that stays in kernel mode] • Most OSes support kernel threads that never run in user mode – these threads typically perform bookkeeping or other supporting tasks. They do not service system calls or faults. • Careful: “kernel thread” is not the same as kernel-level thread (KLT) – more on KLT later
Reasoning about Processes: Process States • [State diagram] READY → RUNNING: scheduler picks process • RUNNING → READY: process preempted • RUNNING → BLOCKED: process must wait for event • BLOCKED → READY: event arrived • Only 1 process (per CPU) can be in RUNNING state
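The three-state model can be written down as a small transition function. This is an illustrative sketch only; the enum and function names are invented here and do not come from any particular kernel.

```c
/* States and events of the three-state process model. */
enum proc_state { READY, RUNNING, BLOCKED };
enum proc_event { EV_DISPATCH, EV_PREEMPT, EV_WAIT, EV_ARRIVED };

/* Returns the state that follows `s` when `ev` occurs. Events that do not
 * apply to the current state leave it unchanged. */
enum proc_state next_state(enum proc_state s, enum proc_event ev)
{
    switch (s) {
    case READY:
        return ev == EV_DISPATCH ? RUNNING : s;  /* scheduler picks process */
    case RUNNING:
        if (ev == EV_PREEMPT)
            return READY;                        /* process preempted */
        if (ev == EV_WAIT)
            return BLOCKED;                      /* must wait for event */
        return s;
    case BLOCKED:
        return ev == EV_ARRIVED ? READY : s;     /* event arrived */
    }
    return s;
}
```

Note that a blocked process never moves directly to RUNNING: it becomes READY first and then waits for the scheduler.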
Process Creation • Two common paradigms: • Cloning vs. spawning • Cloning: (Unix) • “fork()” clones current process • child process then loads new program (via the exec() family) • Spawning: (Windows) • a single call (CreateProcess() on Windows) creates a new process running a new program • Difference is whether creation of new process also involves a change in program
fork()
    #include <unistd.h>
    #include <stdio.h>

    int main()
    {
        int x = 1;
        if (fork() == 0) {
            // only child executes this
            printf("Child, x = %d\n", ++x);
        } else {
            // only parent executes this
            printf("Parent, x = %d\n", --x);
        }
        // parent and child execute this
        printf("Exiting with x = %d\n", x);
        return 0;
    }
One possible output (the relative order of parent and child lines may vary):
    Child, x = 2
    Exiting with x = 2
    Parent, x = 0
    Exiting with x = 0
The fork()/join() paradigm • After fork(), parent & child execute in parallel • Unlike a fork in the road, here we take both roads • Used in many contexts • In Unix, ‘join()’ is called wait() • Purpose: • Launch activity that can be done in parallel & wait for its completion • Or simply: launch another program and wait for its completion (shell does that) • [Figure: parent calls fork(); parent process and child process execute; child process exits and the OS notifies the parent; parent calls join()]
fork()
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int ac, char *av[])
    {
        pid_t child = fork();
        if (child < 0) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
        if (child != 0) {
            printf("I'm the parent %d, my child is %d\n", getpid(), child);
            wait(NULL);   /* wait for child ("join") */
        } else {
            printf("I'm the child %d, my parent is %d\n", getpid(), getppid());
            execl("/bin/echo", "echo", "Hello, World", (char *)NULL);
            perror("execl");   /* reached only if execl failed */
            _exit(EXIT_FAILURE);
        }
        return 0;
    }
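The example above passes NULL to wait() and so discards the child's exit status. A minimal sketch of a join that retrieves it could look like this; `fork_and_join` is a hypothetical helper written for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits with `code`, joins it, and
 * returns the exit status the parent observed, or -1 on any error. */
int fork_and_join(int code)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(code);                        /* child: terminate right away */

    int status;
    if (waitpid(child, &status, 0) != child)
        return -1;                          /* the "join" failed */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell does essentially this after launching a command, which is how the command's exit code becomes available to the user.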
User View of Threads • Unix/C: • fork()/wait() vs pthread_create()/pthread_join() • Java: • new Thread() • Thread.start()/join() • See also [Boehm PLDI 2005]
    Runnable r = new Runnable() {
        public void run() { /* body */ }
    };
    Thread t = new Thread(r);
    t.start();   // concurrent execution starts
    // main
    t.join();    // concurrent execution ends
Threading APIs • How are threads embedded in a language/environment? • POSIX Threads Standard (in C) • pthread_create(), pthread_join() • Uses function pointer to denote start of new control flow • Largely retrofitted in Unix world • Needed to define interaction of signals and threads • Java/C# • Thread.start(), Thread.join() • Java: Using “Runnable” instance • C#: Uses “ThreadStart” delegate CS 5204 Fall 2012
Example pthread_create/join
    static void *test_single(void *arg)
    {
        // this function is executed by each thread, in parallel
        return NULL;
    }

    pthread_t threads[NTHREADS];
    int i;
    for (i = 0; i < NTHREADS; i++)
        // 2nd argument NULL: use default attributes – could set stack addr/size here
        // pthread_create returns 0 on success and an error number on failure
        if (pthread_create(threads + i, NULL, test_single,
                           (void *)(intptr_t)i) != 0) {   // intptr_t: <stdint.h>
            printf("error creating pthread\n");
            exit(-1);
        }

    /* Wait for threads to finish. */
    for (i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);   // 2nd arg could receive exit status of thread
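The remark that the second argument of pthread_join() could receive the thread's exit status can be sketched as follows. The function names `square` and `sum_of_squares` are illustrative only, and small integers are smuggled through the `void *` via `intptr_t`.

```c
#include <pthread.h>
#include <stdint.h>

#define NTHREADS 4

/* Each thread returns the square of the index it was given. */
static void *square(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);
}

/* Starts NTHREADS threads, joins them, and adds up their return values.
 * Returns the sum of squares of 0..NTHREADS-1, or -1 on error. */
long sum_of_squares(void)
{
    pthread_t threads[NTHREADS];
    int created = 0, failed = 0;
    long sum = 0;

    for (intptr_t i = 0; i < NTHREADS; i++) {
        if (pthread_create(&threads[i], NULL, square, (void *)i) != 0) {
            failed = 1;
            break;
        }
        created++;
    }
    for (int i = 0; i < created; i++) {
        void *result;                       /* receives the thread's return value */
        if (pthread_join(threads[i], &result) != 0)
            failed = 1;
        else
            sum += (long)(intptr_t)result;
    }
    return failed ? -1 : sum;
}
```

With four threads the squares are 0, 1, 4, and 9, so the function returns 14 when every create and join succeeds.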
Java Threads Example
    public class JavaThreads {
        public static void main(String[] av) throws Exception {
            Thread[] t = new Thread[5];
            for (int i = 0; i < t.length; i++) {
                final int tnum = i;
                Runnable runnable = new Runnable() {
                    public void run() {
                        System.out.println("Thread #" + tnum);
                    }
                };
                t[i] = new Thread(runnable);
                t[i].start();
            }
            for (int i = 0; i < t.length; i++)
                t[i].join();
            System.out.println("all done");
        }
    }
• Use this form of explicit threading only when the number & roles of threads are known beforehand!
• Thread implements Runnable – could also have subclassed Thread & overridden run()
• Thread.join() can throw InterruptedException – can be used to interrupt a thread waiting to join via Thread.interrupt()
Java Threadpools
    import java.util.concurrent.*;

    public class FixedThreadPool {
        public static void main(String[] av) throws Exception {
            ExecutorService ex = Executors.newFixedThreadPool(4);
            final int N = 4;
            Future<?> f[] = new Future<?>[N];
            for (int i = 0; i < N; i++) {
                final int j = i;
                f[i] = ex.submit(new Callable<String>() {
                    public String call() {
                        return "Future #" + j + " brought to you by "
                            + Thread.currentThread();
                    }
                });
            }
            System.out.println("Main thread: " + Thread.currentThread());
            for (int i = 0; i < N; i++)
                System.out.println(f[i].get());
            ex.shutdown();
        }
    }
• Tasks must implement “Callable” – like Runnable except that it returns a result.
• get() waits for the execution of call() to be completed (if it hasn’t already) and returns the result
Explicit Threads vs. Pools • Overhead: • Startup overhead per thread relatively high (between 1e4 & 1e5 cycles); pools amortize • There is no point in having more threads than there are physical cores • Compete for available CPUs • Unless some subset is blocked on I/O or other conditions • Still, sizing of pools that maximizes throughput can be challenging • “cachedThreadPool” creates thread whenever needed, but reuses existing ones that are idle • “fixedThreadPool” - # of threads fixed • Can implement custom policies CS 5204 Fall 2012
Aside: Hybrid Models • The “threads share everything” + “processes share nothing” mantra does not always hold • Hybrids: • WEAVES allows groups of threads to define their own namespace, so they only share data they want • Java multitasking systems (KaffeOS, MVM): multiple “processes” may share same address space CS 5204 Fall 2012
Implementation
Implementing Threads • Issues: • Who maintains thread state/stack space? • How are threads mapped onto CPUs? • How is coordination/synchronization implemented? • How do threads interact with I/O? • How do threads interact with existing APIs such as signals? • How do threads interact with language runtimes (e.g., GCs)? • How do we terminate threads safely?
Cooperative Multi-threading • Special case of user-level threading • Is easy to implement using ‘swapcontext’ – see next slide • Support multiple logical execution flows • Each needs own stack so has its own (procedure-) local (automatic) variables • But share address space so shares heap, global vars (all kinds: global, global static, local static) • In cooperative multi-threading, a context switch can occur only if a thread voluntarily offers the CPU to (any) other thread (“yield”); later resumes and returns • Can build resource abstractions on top where threads yield if they cannot obtain the abstracted resource • This is called a “non-preemptive” model • If yield is directed (“yield to x”) this model is called “co-routines” CS 5204 Fall 2012
Cooperative Multithreading via ‘swapcontext’
    static char stack[2][65536];              // a stack for each coroutine
    static ucontext_t coroutine_state[2];     // container to remember context

    // switch current coroutine (0 -> 1 -> 0 -> 1 ...)
    static inline void yield_to_next(void)
    {
        static int current = 0;
        int prev = current;
        int next = 1 - current;
        current = next;
        swapcontext(&coroutine_state[prev], &coroutine_state[next]);
    }

    static void coroutine(int coroutine_number)
    {
        int i;
        for (i = 0; i < 5; i++) {
            printf("Coroutine %d counts i=%d (&i=%p)\n", coroutine_number, i, &i);
            yield_to_next();
        }
    }
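The fragment above omits the setup that creates the two contexts. A self-contained sketch of that setup, assuming Linux with glibc's `<ucontext.h>`, might look as follows. The names `run_coroutines`, `main_ctx`, and `trace` are introduced here for illustration, and the coroutines append a letter to a trace buffer instead of printing.

```c
#include <string.h>
#include <ucontext.h>

#define STEPS 3

static char stacks[2][65536];             /* a stack for each coroutine */
static ucontext_t main_ctx;               /* context resumed when a coroutine returns */
static ucontext_t coroutine_state[2];
static int current;                       /* index of the running coroutine */
static char trace[2 * STEPS + 1];         /* records who ran, in order */
static int trace_len;

/* Voluntarily hand the CPU to the other coroutine (0 -> 1 -> 0 ...). */
static void yield_to_next(void)
{
    int prev = current;
    current = 1 - current;
    swapcontext(&coroutine_state[prev], &coroutine_state[current]);
}

static void coroutine(int id)
{
    for (int i = 0; i < STEPS; i++) {
        trace[trace_len++] = (char)('A' + id);
        yield_to_next();
    }
    /* Returning from here resumes uc_link, i.e. main_ctx. */
}

/* Runs both coroutines and returns the trace, or NULL on error. */
const char *run_coroutines(void)
{
    memset(trace, 0, sizeof trace);
    trace_len = 0;
    current = 0;
    for (int id = 0; id < 2; id++) {
        if (getcontext(&coroutine_state[id]) != 0)
            return NULL;
        coroutine_state[id].uc_stack.ss_sp = stacks[id];
        coroutine_state[id].uc_stack.ss_size = sizeof stacks[id];
        coroutine_state[id].uc_link = &main_ctx;
        makecontext(&coroutine_state[id], (void (*)(void))coroutine, 1, id);
    }
    if (swapcontext(&main_ctx, &coroutine_state[0]) != 0)
        return NULL;
    return trace;
}
```

Control returns to the caller as soon as the first coroutine finishes its loop; the second one is simply left suspended in its last yield, which is acceptable for a demonstration but would need cleanup in real code.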
Cooperative Multi-threading (cont’d) • Advantages: • Requires no OS support • Context switch very fast (usually involves only saving callee-saved regs + stack pointer) • Reduced potential for certain types of race conditions • E.g., i++ will never be interrupted • Used in very high-performance server designs & discrete event simulation • Disadvantages • OS sees only one thread in process, system calls block entire process (if they block) • Cannot make use of multiple CPUs/cores • Cannot preempt infinitely-looping or uncooperative threads • (though can fake preemption in just-in-time compiled languages by letting compiler insert periodic checks) CS 5204 Fall 2012
Use Cases for App-level Concurrency • Overlap I/O with computation • E.g. file sharing program downloads, checksums, and saves/repairs files simultaneously • Parallel computation • Use multiple CPUs • Retain interactivity while background activity is performed • E.g., still serve UI events while printing • Handling multiple clients in network server apps • By and large, these are best handled with a preemptive, and (typically) kernel-level multi-threading model
Example Use: Threads in Servers
Preemptive Multi-Threading • Don’t require the explicit, programmer-inserted “yield” call to switch between threads • “Switch” mechanism can be implemented at user-level or kernel-level • User-level threads: can be built using signal handlers (e.g. SIGVTALRM) • Requires advanced file descriptor manipulation techniques to avoid blocking entire process in system call • Kernel-level threads: natural extension for what OS already does when switching between processes • Integrated with system calls – only current thread blocks • Hybrids • Kernel-level threads is the dominant model today CS 5204 Fall 2012
Threading Models • [Figures: kernel-level threads (1:1 model); user-level threads (1:N model); hybrid, so-called M:N model] • Source: Solaris documentation (left), Stallings (right)
Threading Implementations Overview • [Table not recoverable from the source] • * Most common model today
Threading Models • Linux, Windows, Solaris 10 or later, OSX: use 1:1 model with kernel-level threads. OS manages threads in each process • threads in Java/C#, etc. are typically mapped to kernel-level threads • Solaris (pre-10), Windows “fibers”: provide M:N model • Attempted to obtain “best of both worlds” – turned out to be difficult in practice • User-level Threads • used mainly in special/niche applications today CS 5204 Fall 2012
Managing Stack Space • [Figure: stack1, guard, stack2, guard laid out in the address space] • Stacks require contiguous virtual address space • virtual address space fragmentation (example 32-bit Linux) • What size should stack have? • How to detect stack overflow? • Ignore vs. software vs. hardware • Related: how to implement • Get local thread id “pthread_self()” • Thread-local Storage (TLS)
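Two of the knobs mentioned here are visible through the Pthreads API and C11. The sketch below requests an explicit stack size through a thread attribute and uses a `_Thread_local` variable as thread-local storage. The 256 KiB figure and the names `tls_counter`, `count_privately`, and `tls_demo` are arbitrary choices for this illustration.

```c
#include <pthread.h>
#include <stdint.h>

#define NTHREADS 3
#define INCREMENTS 1000

static _Thread_local int tls_counter;     /* each thread gets its own copy */

static void *count_privately(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        tls_counter++;                    /* no lock: the copy is private */
    return (void *)(intptr_t)tls_counter;
}

/* Returns 1 if every thread counted exactly INCREMENTS in its own copy and
 * the caller's copy stayed untouched, 0 otherwise, -1 if setup failed. */
int tls_demo(void)
{
    pthread_attr_t attr;
    pthread_t threads[NTHREADS];
    int created = 0, ok = 1;
    int before = tls_counter;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    /* Ask for 256 KiB stacks instead of the platform default. */
    if (pthread_attr_setstacksize(&attr, 256 * 1024) != 0)
        ok = 0;
    for (int i = 0; ok && i < NTHREADS; i++) {
        if (pthread_create(&threads[i], &attr, count_privately, NULL) != 0)
            ok = 0;
        else
            created++;
    }
    pthread_attr_destroy(&attr);

    for (int i = 0; i < created; i++) {
        void *result;
        if (pthread_join(threads[i], &result) != 0 ||
            (intptr_t)result != INCREMENTS)
            ok = 0;
    }
    return ok && tls_counter == before;
}
```

Choosing a smaller stack reduces the virtual address space each thread reserves, which matters most on 32-bit systems with many threads.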
On Termination • If you have to terminate a thread, how will you clean up? • Strategies: • Avoid shared state where possible • Disable termination • Use cleanup handlers: try/finally, pthread_cleanup
    Queue q1, q2;   // shared
    thread_body() {
        while (true) {
            Packet p = q1.dequeue();
            q2.enqueue(p);
        }
    }

    Queue q1, q2;   // shared
    thread_body() {
        while (!done) {
            Packet p = q1.dequeue();
            q2.enqueue(p);
        }
    }
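The cleanup-handler strategy can be sketched with pthread_cleanup_push() and pthread_cancel(). In this hedged example a worker holds a mutex when it is cancelled, and the handler releases it so that other threads are not left stuck. All function names are invented for the illustration.

```c
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup handler: runs in the cancelled thread before it terminates. */
static void unlock_on_exit(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    pthread_cleanup_push(unlock_on_exit, &lock);
    for (;;)
        sleep(1);                 /* sleep() is a cancellation point */
    pthread_cleanup_pop(1);       /* never reached, but must pair with push */
    return NULL;
}

/* Returns 1 if the worker was cancelled and its handler released the lock,
 * 0 if not, and -1 if the threads could not be set up. */
int cancel_releases_lock(void)
{
    pthread_t t;
    void *result;

    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_cancel(t) != 0)
        return -1;
    if (pthread_join(t, &result) != 0)
        return -1;
    if (result != PTHREAD_CANCELED)
        return 0;
    if (pthread_mutex_trylock(&lock) != 0)
        return 0;                 /* lock still held: cleanup did not run */
    pthread_mutex_unlock(&lock);
    return 1;
}
```

Because cancellation is deferred by default, the worker is only terminated at the cancellation point, after it has taken the lock and registered the handler.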
Synchronization
Synchronization • Access to resources must be protected • Race Condition problem • Definition • Approaches for detecting them • Static vs dynamic CS 5204 Fall 2012
Critical Section Problem • Many algorithms known • purely software-based (Dekker’s, Peterson’s algorithm) vs. hardware-assisted (disable irqs, test-and-set instructions) • Criteria for good algorithm: • mutual exclusion • progress • bounded waiting
    while (application hasn’t exited) {
        enter critical section
        inside critical section
        exit critical section
        in remainder section
    }
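As an illustration of a purely software-based solution, here is a sketch of Peterson's algorithm for two threads. It relies on C11 sequentially consistent atomics, since the algorithm is not correct with plain variables on modern hardware. The names `enter_cs`, `leave_cs`, and `peterson_demo` are chosen here for the example.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define ITERS 50000

static atomic_bool want[2];   /* want[i]: thread i wants to enter */
static atomic_int turn;       /* which thread is favored when both want in */
static long counter;          /* shared data protected by the algorithm */

static void enter_cs(int me)
{
    int other = 1 - me;
    atomic_store(&want[me], true);
    atomic_store(&turn, other);           /* let the other thread go first */
    while (atomic_load(&want[other]) && atomic_load(&turn) == other)
        sched_yield();                    /* busy-wait, but give up the CPU */
}

static void leave_cs(int me)
{
    atomic_store(&want[me], false);
}

static void *worker(void *arg)
{
    int me = (int)(intptr_t)arg;
    for (int i = 0; i < ITERS; i++) {
        enter_cs(me);
        counter++;                        /* inside critical section */
        leave_cs(me);
    }
    return NULL;
}

/* Returns the final counter value, or -1 if a thread could not be created. */
long peterson_demo(void)
{
    pthread_t t[2];
    int created = 0;

    counter = 0;
    for (intptr_t i = 0; i < 2; i++) {
        if (pthread_create(&t[i], NULL, worker, (void *)i) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(t[i], NULL);
    return created == 2 ? counter : -1;
}
```

If mutual exclusion holds, no increment is lost and the result is exactly twice ITERS. The algorithm also provides progress and bounded waiting for two threads, but it does not generalize directly to more.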
Synchronization Abstractions • Atomic primitives • e.g. Linux kernel “atomic_inc()” • Dijkstra’s semaphores • P(s) := atomic { while (s<=0) /* no op */; s--; } • V(s) := atomic { s++; } • Q: what’s wrong with this implementation? • Binary semaphores, locks, mutexes • Difference between mutex & semaphore CS 5204 Fall 2012
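One common reading of the question above is that P(s) spins inside an atomic block, so no other thread could ever execute V(s) while a waiter is spinning. A sketch of a blocking alternative, built from a Pthreads mutex and condition variable, is shown below. The `sema_*` names are hypothetical and this is not the POSIX `sem_t` API.

```c
#include <pthread.h>

struct semaphore {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;   /* signaled when value becomes positive */
    int             value;
};

void sema_init(struct semaphore *s, int value)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->value = value;
}

void sema_P(struct semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->value <= 0)
        pthread_cond_wait(&s->nonzero, &s->lock);  /* sleeps and releases lock */
    s->value--;
    pthread_mutex_unlock(&s->lock);
}

void sema_V(struct semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    s->value++;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->lock);
}

static void *poster(void *arg)
{
    sema_V(arg);
    return NULL;
}

/* The caller blocks in P until a helper thread performs V.
 * Returns the final semaphore value, or -1 if the thread cannot be created. */
int sema_demo(void)
{
    struct semaphore s;
    pthread_t t;

    sema_init(&s, 0);
    if (pthread_create(&t, NULL, poster, &s) != 0)
        return -1;
    sema_P(&s);
    pthread_join(t, NULL);
    return s.value;
}
```

The waiter sleeps instead of spinning, and pthread_cond_wait() releases the mutex while it sleeps, so the thread calling V can always make progress.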
Expressing Critical Sections • Pthreads/C vs Java • Note benefits of language support
Pthreads/C:
    pthread_mutex_t m;
    …
    pthread_mutex_lock(&m);
    /* in critical section */
    if (*) {
        pthread_mutex_unlock(&m);
        return;
    }
    pthread_mutex_unlock(&m);
Java:
    synchronized (object) {
        /* in critical section */
        if (*) {
            return;
        }
    }
Monitors (Hoare) Enter • Data Type: • internal, private data • public methods wrapped by Enter/Exit • wait/signal methods • “Monitor Invariant” Region of mutual exclusion Wait Signal Wait Signal Exit CS 5204 Fall 2012
Expressing Monitors
Pthreads/C, waiting side:
    pthread_mutex_t m;
    pthread_cond_t c;
    …
    pthread_mutex_lock(&m);
    /* in critical section */
    while (somecond != true)
        pthread_cond_wait(&c, &m);
    pthread_mutex_unlock(&m);
Pthreads/C, signaling side:
    pthread_mutex_lock(&m);
    /* in critical section */
    pthread_cond_signal(&c);   /* takes only the condition variable */
    pthread_mutex_unlock(&m);
Java, waiting side:
    synchronized (object) {
        /* in critical section */
        while (somecond != true) {
            object.wait();
        }
    }
Java, signaling side:
    synchronized (object) {
        /* in critical section */
        object.notify();
    }
• See also Java’s insecure parallelism [Per Brinch Hansen 1999]
Deadlock • [Figure: Thread 1 and Thread 2 each hold one of the locks A and B while waiting for the other]
    pthread_mutex_t A;
    pthread_mutex_t B;
    …
Thread 1:
    pthread_mutex_lock(&A);
    pthread_mutex_lock(&B);
    …
    pthread_mutex_unlock(&B);
    pthread_mutex_unlock(&A);
Thread 2:
    pthread_mutex_lock(&B);
    pthread_mutex_lock(&A);
    …
    pthread_mutex_unlock(&A);
    pthread_mutex_unlock(&B);
Reusable vs. Consumable Resources • Distinguish two types of resources when discussing deadlock • A resource: • “anything a process needs to make progress” • (Serially) Reusable resources (static, concrete, finite) • CPU, memory, locks • Can be a single unit (CPU on uniprocessor, lock), or multiple units (e.g. memory, semaphore initialized with N) • Consumable resources (dynamic, abstract, infinite) • Can be created & consumed: messages, signals • Deadlock may involve reusable resources or consumable resources CS 5204 Fall 2012
Deadlocks, more formally • 4 necessary conditions • Mutual Exclusion • Hold and Wait • No Preemption • Circular Wait • Q.: what are strategies to detect/break/avoid deadlocks? • [Figure: Resource Allocation Graph with processes P1–P4 and resources R1–R4; an edge R → P denotes an assignment, an edge P → R denotes a request]
Strategies for dealing with Deadlock • Deadlock Prevention • Remove a necessary condition • Deadlock Avoidance • Can’t remove necessary condition, so avoid occurrence of deadlock – maybe by clever resource scheduling strategy • Deadlock Recovery • Deadlock vs. Starvation CS 5204 Fall 2012
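Deadlock prevention by removing the circular-wait condition is often done with a global lock order. The sketch below orders two mutexes by address, so both threads acquire them in the same order even though they name them in opposite order. The account balances and all function names are invented for this illustration.

```c
#include <pthread.h>
#include <stdint.h>

#define ROUNDS 20000

static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;
static long balance_a, balance_b;         /* protected by A and B together */

/* Acquire both locks in one global order (lower address first), no matter
 * in which order the caller passes them. */
static void lock_both(pthread_mutex_t *x, pthread_mutex_t *y)
{
    if ((uintptr_t)x > (uintptr_t)y) {
        pthread_mutex_t *tmp = x;
        x = y;
        y = tmp;
    }
    pthread_mutex_lock(x);
    pthread_mutex_lock(y);
}

static void unlock_both(pthread_mutex_t *x, pthread_mutex_t *y)
{
    pthread_mutex_unlock(x);
    pthread_mutex_unlock(y);
}

static void *a_to_b(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        lock_both(&A, &B);
        balance_a--;
        balance_b++;
        unlock_both(&A, &B);
    }
    return NULL;
}

static void *b_to_a(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        lock_both(&B, &A);                /* opposite naming, same lock order */
        balance_b--;
        balance_a++;
        unlock_both(&B, &A);
    }
    return NULL;
}

/* Returns the sum of both balances after the transfers, or -1 on error. */
long ordered_locking_demo(void)
{
    pthread_t t1, t2;

    balance_a = balance_b = 1000;
    if (pthread_create(&t1, NULL, a_to_b, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, b_to_a, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return balance_a + balance_b;
}
```

With naive locking, where each thread takes its first-named lock first, the same two threads could block each other forever; with the ordering in place the demo finishes and the total stays at 2000.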
Example: x86 • Nonpreemptive = C calling conventions: • Caller-saved: eax, ecx, edx + floating point • Callee-saved: ebx, esi, edi, esp, ebp, plus eip, for a jmpbuf size of 6*4 = 24 bytes • Preemptive = save entire state • All registers + 108 bytes for floating point context • Note: context switch cost = save/restore state cost + scheduling overhead + lost locality cost