Concurrency: Background and Implementation Fred Kuhns (fredk@arl.wustl.edu, http://www.arl.wustl.edu/~fredk) Department of Computer Science and Engineering Washington University in St. Louis
Concurrency: Origins and problems
• Context
  • Processes need to communicate
  • Kernel communication/controlling hardware resources (for example I/O processing)
• Issues:
  • How is information exchanged between processes (shared memory or messages)?
  • How to prevent interference between cooperating processes (mutual exclusion)?
  • How to control the sequence of process execution (conditional synchronization)?
  • How to manage concurrency within the OS kernel?
• Problems:
  • Execution of the kernel may lead to concurrent access to state
  • Deferred processing pending some event
  • Processes explicitly sharing a resource (memory)
• Problems with shared memory
  • Concurrent programs may exhibit a dependency on process/thread execution sequence or processor speed (neither is desirable!)
  • Race condition: who is first, and who goes next, affects program results.
• There are two basic issues resulting from the need to support concurrency:
  • Mutual exclusion: ensure that processes/threads do not interfere with one another, i.e. there are no race conditions. In other words, program constraints (assumptions) are not violated.
  • Conditional synchronization: processes/threads must be able to "wait" for the data to arrive or for constraints (assertions) to be satisfied.
CSE522 - Advanced Operating Systems
Definitions
• Process: sequence of statements which are implemented as one or more primitive atomic operations (hardware instructions)
• Concurrent program: results in the interleaving of statements from different processes
• Program state: value of a program's variables at a given point in time.
  • Execution viewed as a sequence of states si.
  • Atomic actions transform states.
• Program history: specific sequence of program states: s0 -> s1 -> ... -> sN
• Synchronization constrains the set of possible histories to only those that are desirable
• Mutual exclusion combines a sequence of actions into a critical section which appears to execute atomically.
Interference or Race Conditions
• A race condition occurs when two or more processes access and modify shared data concurrently, and the final value of the shared data depends on the order in which it was accessed.
• Race conditions manifest themselves as intermittent problems and are generally difficult to debug.
• To prevent race conditions, concurrent processes must synchronize their access to the shared memory locations; this is known as the critical section problem.
Race Conditions - Example
• Example 1
    int y = 0, z = 0;
    Task A {
      x = y + z;
    }
    Task B {
      y = 1; z = 2;
    }
• Task A's statement executes as four machine steps:
    load y into R0
    load z into R1
    set R0 = R0 + R1
    set x = R0
• There are 4 cases for x:
  • case 1: Task A runs to completion first, loading y=0 then z=0. x = 0 + 0 = 0
  • case 2: Task B runs, setting y=1; then Task A runs, loading y=1 and z=0. x = 1 + 0 = 1
  • case 3: Task A runs, loading y=0; then Task B runs to completion; then Task A runs, loading z=2. x = 0 + 2 = 2
  • case 4: Task B runs to completion; then Task A runs, loading y=1 and z=2. x = 1 + 2 = 3
• Results: x = {0, 1, 2, 3}
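The four cases can be checked mechanically. The sketch below is an illustration, not part of the original slides: it enumerates every interleaving of Task A's two loads with Task B's two stores and records which values of x can result. The function names `count_bits` and `possible_x_values` are hypothetical helpers introduced here.

```c
/* Count the 1 bits in v (small helper for choosing schedules). */
static int count_bits(unsigned v)
{
    int n = 0;
    while (v) {
        n += (int)(v & 1u);
        v >>= 1;
    }
    return n;
}

/* Returns a bitmask: bit k is set if some interleaving ends with x == k.
 * A schedule is a 4-bit number; a 1 bit means "Task A takes this slot",
 * a 0 bit means "Task B takes this slot". Each task has two steps. */
unsigned possible_x_values(void)
{
    unsigned seen = 0;
    for (unsigned sched = 0; sched < 16; sched++) {
        if (count_bits(sched) != 2)
            continue;                     /* need exactly two slots per task */
        int y = 0, z = 0;                 /* shared variables */
        int r0 = 0, r1 = 0;               /* Task A's registers */
        int a_step = 0, b_step = 0;
        for (int slot = 0; slot < 4; slot++) {
            if (sched & (1u << slot)) {   /* Task A: load y, then load z */
                if (a_step++ == 0) r0 = y; else r1 = z;
            } else {                      /* Task B: y = 1, then z = 2 */
                if (b_step++ == 0) y = 1; else z = 2;
            }
        }
        int x = r0 + r1;                  /* the add and store are private to A */
        seen |= 1u << x;
    }
    return seen;
}
```

Only the loads of y and z interact with Task B, so the add and the final store to x do not need to be interleaved separately.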
Race Condition: OS Example
• Kernel File Table: entry 0 holds "myfile.c: attributes", entry 1 holds "myfile.o: attributes", ..., entry 7 is NULL (free). Initially KFT_NextFree = 7.
• Task A:
    ...
    get KFT_NextFree (= 7)
    -- preempted --
    KFT_NextFree += 1;
    update entry 7
• Task B (runs while Task A is preempted):
    ...
    get KFT_NextFree (= 7)
    KFT_NextFree += 1;
    update entry 7
    -- preempted --
• Final value of kernel table entry 7 is indeterminate.
• Final value of KFT_NextFree is 9
• Kernel File Table entry 8 is not allocated
Properties
• Property := something that is always true of a program
• Safety: never enter a bad state
  • Examples: absence of deadlock, providing mutual exclusion
• Liveness: eventually enter a good state, in particular that a process or thread makes progress toward a "goal" state.
  • Examples: process eventually enters a critical section, a service request is eventually honored, a message eventually reaches its destination
  • May depend on the fairness of the scheduler
• Partial Correctness: if the program terminates then the final answer is correct.
  • Safety property.
  • It says nothing about whether the program will terminate, only that if it does then the answer will be correct.
• Total Correctness: combines partial correctness with termination.
  • Liveness property
  • Says that a program will indeed always terminate (i.e. complete) and produce a valid (correct) answer.
• Mutual exclusion is a safety property (no bad states).
Fairness
• Unconditional fairness: a scheduling policy is unconditionally fair if every unconditional atomic action that is eligible is executed eventually.
  • Round-robin is unconditionally fair in the absence of conditional atomic actions (a conditional atomic action waits for some condition to be true before being executed).
  • If conditional atomic actions are present then stronger guarantees are needed.
• Weak fairness: a scheduling policy is weakly fair if 1) it is unconditionally fair and 2) every conditional atomic action that is eligible is executed eventually, assuming that its condition becomes true and then remains true until it is seen by the process executing the conditional atomic action.
  • ensures that every process keeps getting chances
  • progress then depends on the possibility that the process will get a chance at the right time
• Strong fairness: a scheduling policy is strongly fair if 1) it is unconditionally fair and 2) every conditional atomic action that is eligible is executed eventually, assuming that its condition is infinitely often true.
Some Definitions
• Independent processes: two processes are independent if the write set of each is disjoint from both the read and write sets of the other.
• Critical reference: reference to a variable changed by another process. Assume the variable is written or read atomically.
• Critical assertion: a pre/post condition that is not in a critical section.
• Noninterference: an assignment a and its precondition pre(a) in process A do not interfere with a critical assertion C in another process if the following is always true:
    {C ∧ pre(a)} a {C}
  That is, C is not affected by the execution of a when pre(a) is true.
Avoiding Race Conditions
• At-Most-Once property: the assignment statement x = e satisfies the property if either
  1) e contains at most one critical reference and x is not read by another process, or
  2) e contains no critical references, in which case x may be read by another process.
  In other words, there can be at most one critical reference within the statement S: x = e. Such a statement appears to execute atomically.
• Four techniques for avoiding interference:
  • Disjoint variables: write set of one process is disjoint from the reference set of another. Reference set is the set of variables appearing in assertions.
  • Weakened assertions: if variables are not disjoint then loosen constraints (i.e. the set of assertions).
  • Global invariants: express relationships between shared variables
  • Synchronization: define atomic actions
Consumer Producer Example
• Shared data
    #define BUFSZ 10
    typedef struct { . . . } item;
    item buffer[BUFSZ];
    int in = 0;
    int out = 0;
    int counter = 0;
• Producer
    item nextProduced;
    while (TRUE) {
      while (counter == BUFSZ)
        ;                         // buffer full: spin
      buffer[in] = nextProduced;
      in = (in + 1) % BUFSZ;
      counter++;                  // critical section
    }
• Consumer
    item nextConsumed;
    while (TRUE) {
      while (counter == 0)
        ;                         // buffer empty: spin
      nextConsumed = buffer[out];
      out = (out + 1) % BUFSZ;
      counter--;                  // critical section
    }
• Both updates of counter are critical sections: the "counter write" must be protected.
Bounded Buffer
• If both the producer and consumer attempt to update counter concurrently, the assembly language statements may get interleaved.
• Interleaving depends upon how the producer and consumer processes are scheduled.
• Assume counter is initially 5. One interleaving of statements is:
    producer: register1 = counter        (register1 = 5)
    producer: register1 = register1 + 1  (register1 = 6)
    consumer: register2 = counter        (register2 = 5)
    consumer: register2 = register2 - 1  (register2 = 4)
    producer: counter = register1        (counter = 6)
    consumer: counter = register2        (counter = 4)
• This interleaving leaves counter at 4; if the last two stores happen in the opposite order it is left at 6. The correct result is 5.
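The interleaving above can be replayed deterministically. The following is a minimal sketch, assuming a hypothetical helper `replay_counter` that takes a schedule string in which 'P' means "the producer executes its next register-level step" and 'C' means the same for the consumer. It is a single-threaded simulation, not real concurrent code.

```c
/* Replays the three producer steps and three consumer steps in the
 * order given by `order` and returns the final value of counter.
 * counter starts at 5, as in the slide. */
int replay_counter(const char *order)
{
    int counter = 5;
    int register1 = 0, register2 = 0;
    int p = 0, c = 0;                 /* next step index for each task */

    for (const char *s = order; *s != '\0'; s++) {
        if (*s == 'P') {
            switch (p++) {
            case 0: register1 = counter;       break;
            case 1: register1 = register1 + 1; break;
            case 2: counter = register1;       break;
            }
        } else {
            switch (c++) {
            case 0: register2 = counter;       break;
            case 1: register2 = register2 - 1; break;
            case 2: counter = register2;       break;
            }
        }
    }
    return counter;
}
```

With this helper, "PPCCPC" is the schedule shown on the slide, "PPCCCP" swaps the two final stores, and "PPPCCC" runs the producer's update to completion before the consumer starts.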
Critical Section Problem
• Entry/exit protocol satisfies:
  • Mutual Exclusion: at most one process may be active within the critical section (CS).
  • Absence of deadlock (livelock): if two or more processes attempt to enter the CS, one will eventually succeed.
  • Absence of unnecessary delay: if one process attempts to enter the CS and all other processes are either in their non-critical sections or have terminated, then it is not prevented from entering.
  • Eventual entry (no starvation, more a function of scheduling): if a process is attempting to enter the CS it will eventually be granted access.
• Task structure:
    Task A {
      while (True) {
        entry protocol;
        critical section;
        exit protocol;
        non-critical section;
      }
    }
Mutual Exclusion
• First we look at mutual exclusion alone, implemented with busy waiting
  • Simplest approach (on single-processor systems) is to disable interrupts
    • why does this work?
    • traditional kernels use this approach
    • less attractive for user processes. Why?
  • Algorithms implemented in software without support from the hardware
  • Hardware support for atomic read/write operations
• Then we look at a solution which puts a process to sleep when a resource is not available, permitting more efficient use of the system (conditional synchronization: sleep until the condition is true)
Mutual Exclusion - Interrupt Disabling
• A process runs until it requests an OS service or is interrupted
• Process disables interrupts for mutual exclusion
• Processor has limited ability to interleave programs
• Efficiency of execution may be degraded
• Multiprocessing
  • disabling interrupts on one processor will not guarantee mutual exclusion. Why?
• Entry/exit protocol satisfies (*single CPU):
  • Mutual Exclusion: Yes
  • Absence of deadlock: Yes
  • Absence of unnecessary delay: Yes
  • *Eventual entry: Yes
Lock Variables
• Shared data
    int lock = 0;
• Process A
    Process A {
      while (True) {
        // entry protocol
        while (lock == 1)
          ;
        lock = 1;
        critical section;
        // exit protocol
        lock = 0;
        non-critical;
      }
    }
• Process B
    Process B {
      while (True) {
        // entry protocol
        while (lock == 1)
          ;
        lock = 1;
        critical section;
        // exit protocol
        lock = 0;
        non-critical;
      }
    }
Lock
• Entry/exit protocol satisfies:
  • Mutual Exclusion: No (testing lock and setting lock are separate steps, so both processes can see lock == 0 before either sets it)
  • Absence of deadlock: Yes
  • Absence of unnecessary delay: Yes
  • Eventual entry: No
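To see why the lock variable fails to give mutual exclusion, it helps to replay one bad schedule by hand. The sketch below is a single-threaded illustration with a hypothetical function name, `naive_lock_overlap`; it hard-codes the schedule in which Process B tests the lock after Process A has tested it but before Process A has set it.

```c
/* Replays: A tests lock, B tests lock, A sets lock, B sets lock.
 * Returns how many processes end up inside the critical section. */
int naive_lock_overlap(void)
{
    int lock = 0;
    int in_critical_section = 0;

    int a_saw_free = (lock == 0);   /* A leaves its while loop */
    int b_saw_free = (lock == 0);   /* B runs before A executes lock = 1 */

    if (a_saw_free) {               /* A: lock = 1; enter critical section */
        lock = 1;
        in_critical_section++;
    }
    if (b_saw_free) {               /* B: lock = 1; enter critical section */
        lock = 1;
        in_critical_section++;
    }
    return in_critical_section;
}
```

The function returns 2 for this schedule: both processes are in the critical section at once, which is exactly the "Mutual Exclusion: No" entry.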
Taking Turns
• Shared data
    int turn = 0;
• Task A
    Task A {
      int myid = 0;
      while (True) {
        // entry protocol
        while (turn != 0)
          ;
        critical section;
        // exit protocol
        turn = 1;
        non-critical;
      }
    }
• Task B
    Task B {
      int myid = 1;
      while (True) {
        // entry protocol
        while (turn != 1)
          ;
        critical section;
        // exit protocol
        turn = 0;
        non-critical;
      }
    }
Taking Turns
• Entry/exit protocol satisfies:
  • Mutual Exclusion: Yes
  • Absence of deadlock: Yes
  • Absence of unnecessary delay: No
  • Eventual entry: Yes
Combined Approach aka Peterson's Solution
• Shared data and entry/exit protocol
    #define N 2
    int last = 0;
    int want[N] = {0,0};

    void enter(int proc) {
      int other = proc ? 0 : 1;    // index of the other process
      want[proc] = 1;              // announce interest
      last = proc;                 // I was the last to ask
      while (last == proc && want[other])
        ;                          // wait only if the other wants in and I asked last
    }

    void leave(int proc) {
      want[proc] = 0;
    }
• Process A
    Process A {
      int myid = 0;
      while (1) {
        enter(myid);
        critical section;
        leave(myid);
      }
    }
• Process B
    Process B {
      int myid = 1;
      while (1) {
        enter(myid);
        critical section;
        leave(myid);
      }
    }
Peterson's Solution
• Entry/exit protocol satisfies:
  • Mutual Exclusion: Yes
  • Absence of deadlock: Yes
  • Absence of unnecessary delay: Yes
  • Eventual entry: Yes
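On current compilers and multiprocessors, plain int variables do not guarantee the ordering that Peterson's algorithm relies on. The following is a minimal sketch, assuming C11 atomics (sequentially consistent by default) and POSIX threads; `worker`, `peterson_demo` and `shared_counter` are names introduced here for illustration, and the `sched_yield()` in the spin loop is an addition to keep the demo fast on a single CPU.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_int want[2];       /* want[i] != 0: process i wants to enter */
static atomic_int last;          /* which process asked most recently */
static long shared_counter;      /* plain variable protected by the protocol */

static void enter(int proc)
{
    int other = 1 - proc;
    atomic_store(&want[proc], 1);
    atomic_store(&last, proc);
    while (atomic_load(&last) == proc && atomic_load(&want[other]))
        sched_yield();           /* busy wait, but give up the CPU politely */
}

static void leave(int proc)
{
    atomic_store(&want[proc], 0);
}

struct worker_arg { int id; long iterations; };

static void *worker(void *p)
{
    const struct worker_arg *arg = p;
    for (long i = 0; i < arg->iterations; i++) {
        enter(arg->id);
        shared_counter++;        /* critical section */
        leave(arg->id);
    }
    return NULL;
}

/* Runs two workers; returns the final counter, or -1 if a thread
 * could not be created. */
long peterson_demo(long iterations)
{
    pthread_t t[2];
    struct worker_arg args[2] = { {0, iterations}, {1, iterations} };

    atomic_store(&want[0], 0);
    atomic_store(&want[1], 0);
    atomic_store(&last, 0);
    shared_counter = 0;

    if (pthread_create(&t[0], NULL, worker, &args[0]) != 0)
        return -1;
    if (pthread_create(&t[1], NULL, worker, &args[1]) != 0) {
        pthread_join(t[0], NULL);
        return -1;
    }
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    return shared_counter;
}
```

If mutual exclusion holds, `peterson_demo(n)` returns exactly 2n. On most systems this needs to be compiled with thread support enabled (for example `-pthread`).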
Help from Hardware
• Special machine instructions
  • Performed in a single instruction cycle
  • Not subject to interference from other instructions
  • Reading and writing
  • Reading and testing
• For example the test and set instruction:
    boolean TSL (boolean &lock) {
      boolean tmp = lock;
      lock = True;
      return tmp;
    }
• You have the lock iff False is returned.
  • if lock == False before calling TSL(lock), it is set to True and False is returned
  • if lock == True before calling TSL(lock), it is set to True and True is returned
Mutual Exclusion with TSL
• Shared data:
    boolean lock = False;   // initialize to false
• Task Pi
    do {
      // Entry protocol
      while (TSL(lock) == True)
        ;                   // spin: wait for lock
      // execute critical section code
      -- critical section --
      // Exit protocol
      lock = False;
      // Non-critical section code
      -- remainder section --
    } while (1);
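In portable C, the closest standard counterpart to a test-and-set instruction is `atomic_flag_test_and_set` from `<stdatomic.h>`. The sketch below assumes C11; the type `tsl_lock_t` and the `tsl_*` function names are hypothetical and only mirror the entry and exit protocol of the slide.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag held;            /* set while some task holds the lock */
} tsl_lock_t;

void tsl_init(tsl_lock_t *l)
{
    atomic_flag_clear(&l->held); /* lock starts out free (False) */
}

/* One test-and-set attempt: true means the caller now holds the lock. */
bool tsl_try_acquire(tsl_lock_t *l)
{
    /* Returns the previous value; false means the lock was free. */
    return !atomic_flag_test_and_set(&l->held);
}

/* Entry protocol: spin until test-and-set reports the lock was free. */
void tsl_acquire(tsl_lock_t *l)
{
    while (atomic_flag_test_and_set(&l->held))
        ;                        /* spin: wait for lock */
}

/* Exit protocol: lock = False. */
void tsl_release(tsl_lock_t *l)
{
    atomic_flag_clear(&l->held);
}
```

A critical section is then bracketed by `tsl_acquire(&l)` and `tsl_release(&l)`; `tsl_try_acquire` is the non-blocking variant.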
Using a Swap Instruction
• Atomic swap:
    void Swap(boolean &a, boolean &b) {
      boolean temp = a;
      a = b;
      b = temp;
    }
  The values are swapped atomically: after calling Swap, a == original value of b and b == original value of a.
• Implementing a mutex. Shared data (initialized to False):
    boolean lock = False;
    boolean waiting[n];
• Process Pi
    do {
      // Entry protocol
      key = True;
      // when lock is False, key becomes False
      while (key == True)
        Swap(lock, key);
      // Execute critical section code
      -- critical section --
      // Exit protocol
      lock = False;
      // Non-critical section code
      -- remainder section --
    } while (1);
Machine Instructions
• Advantages
  • Can be used on single or multi-processor systems (shared memory)
  • It is simple and therefore easy to verify
  • It can be used to support multiple critical sections
• Disadvantages
  • Busy-waiting consumes processor time
  • Starvation is possible when a process leaves a critical section and more than one process is waiting.
    • Who is next? The lowest priority process may never acquire the lock.
  • Deadlock: if a low priority process holds the critical region (i.e. the lock) but is preempted by a higher priority process spinning on the lock, then neither can advance.
Must we always busy wait?
• While busy waiting is useful in some situations it may also lead to other problems: inefficient use of the CPU and deadlock resulting from a priority inversion
• What we really want is a way to combine mutual exclusion schemes with conditional synchronization.
• In other words, we want the option of blocking a process until it is able to acquire the mutual exclusion lock.
• A simple solution is to add two new functions:
  • sleep() and wakeup()
Adding Conditional Synchronization: Almost a Complete Solution
• Shared data
    #define N 10
    int buf[N];
    int in = 0, out = 0, cnt = 0;
• Producer
    Task producer {
      int item;
      while (TRUE) {
        item = mkitem();
        if (cnt == N) sleep();
        buf[in] = item;
        in = (in + 1) % N;
        lock(lock); cnt++; unlock(lock);
        if (cnt == 1) wakeup(consumer);
      }
    }
• Consumer
    Task consumer {
      int item;
      while (TRUE) {
        if (cnt == 0) sleep();
        item = buf[out];
        out = (out + 1) % N;
        lock(lock); cnt--; unlock(lock);
        if (cnt == N-1) wakeup(producer);
        consume(item);
      }
    }
Lost wakeup problem
• Assume that the lock() and unlock() functions implement a simple spin lock
  • I've left out the details to simplify the previous example
• There is a race condition that results in a lost wakeup, do you see it?
• We will solve this problem when we talk about semaphores and monitors.
Intro to Semaphores
• Synchronization mechanism:
  • No busy waiting
  • No lost wakeup problem.
• Integer variable accessible via two indivisible (atomic) operations:
  • P(s) or wait(s): if s > 0 then decrement s, else block the thread on the semaphore queue
  • V(s) or signal(s): if (s == 0 and there are waiting threads) then wake one sleeping thread, else increment s
• Kernel guarantees the operations are atomic
• Can define a non-blocking version: trylock.
• Each semaphore has an associated wait queue.
• Ways to use a semaphore:
  • (1) Mutual exclusion: binary semaphore initialized to 1
  • (2) Event waiting: binary semaphore initialized to 0
  • (3) Resource counting: initialized to the number of available resources
• If used for mutual exclusion, the semantics ensure lock ownership transfers to the waking thread
• Threads are woken up in FIFO order
• May result in unnecessary context switches (semaphore convoys). A thread currently running on another CPU may attempt to lock the resource immediately after it has been "transferred" to a sleeping thread. It may be better to allow the already running thread to acquire the lock and let the waking thread wait on the ready queue. This reduces the number of context switches.
Critical Section of n Tasks
• Shared data:
    semaphore mutex;   // initialize mutex = 1
• Process Ti:
    do {
      wait(mutex);
      -- critical section --
      signal(mutex);
      -- remainder section --
    } while (1);
Semaphore Implementation
• Define a semaphore as a record
    typedef struct {
      int value;      // value of semaphore
      queue_t sq;     // task/thread queue
      // also need lock(s) to protect value and thread queue
    } sem_t;
• Assume two simple operations: wait() and signal()
• Is the below a complete solution? How do we protect the semaphore object from concurrent access by other threads?
    void signal(sem_t *S) {
      thread_t t;
      S->value++;
      if (S->value <= 0) {       // a negative value counts waiting threads
        t = getthread(S->sq);
        wakeup(t);
      }
      return;
    }

    void wait(sem_t *S) {
      S->value--;
      if (S->value < 0) {
        addthread(S->sq);
        sleep();
      }
      return;
    }
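One way to answer the question on this slide in user space is to guard the counter with a mutex and let a condition variable play the role of the sleep queue. The sketch below assumes POSIX threads; the `ksem_*` names are hypothetical. It is a different design from the slide's record: the value never goes negative, and the waiting threads are tracked by the condition variable rather than by an explicit queue.

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    int value;                  /* semaphore value, never negative here */
    pthread_mutex_t lock;       /* protects value */
    pthread_cond_t nonzero;     /* waiters sleep here while value == 0 */
} ksem_t;

void ksem_init(ksem_t *s, int initial)
{
    s->value = initial;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
}

/* P(s): block until value > 0, then decrement. */
void ksem_wait(ksem_t *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->value == 0)       /* re-check after every wakeup */
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->value--;
    pthread_mutex_unlock(&s->lock);
}

/* Non-blocking P(s): true if the semaphore was acquired. */
bool ksem_trywait(ksem_t *s)
{
    bool acquired = false;
    pthread_mutex_lock(&s->lock);
    if (s->value > 0) {
        s->value--;
        acquired = true;
    }
    pthread_mutex_unlock(&s->lock);
    return acquired;
}

/* V(s): increment and wake one waiter, if any. */
void ksem_signal(ksem_t *s)
{
    pthread_mutex_lock(&s->lock);
    s->value++;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->lock);
}
```

Because the test of `value` and the decision to sleep both happen while `lock` is held, a `ksem_signal` cannot slip in between them, which is what rules out a lost wakeup in this sketch.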
Some Reference Background
• Semaphores were used in initial MP implementations
• Threads are woken up in FIFO order (convoys)
  • forcing a strictly FIFO order may result in unnecessary blocking
• Used to provide
  • Mutual exclusion (initialized to 1)
  • Event-waiting (initialized to 0)
  • Resource counting (initialized to number available)
• High-level abstraction built using lower-level primitives.
  • Always carries the overhead of sleep queue management (event notification mechanism)
  • Always carries the overhead of locking (mutual exclusion mechanism)
  • Hides whether the thread has blocked; a caller may want to actively "wait" for the resource (i.e. spin lock)
Potential Problems
• Incorrect use of semaphores can lead to problems
• Critical sections using semaphores must keep to a strict protocol:
  • wait(S); {critical section}; signal(S)
• Problems:
  • No mutual exclusion:
    • Reverse: signal(S); {critical section}; wait(S);
    • Omit wait(S)
  • Deadlock:
    • wait(S); {critical section}; wait(S);
    • Omit signal(S)
Other Synchronization Mechanisms
• Mutex: Mutual Exclusion Lock, applies to any primitive that enforces mutual exclusion semantics
• Condition variables: event notification, usually has an associated predicate and a lock to protect the predicate.
• Read/Write Locks: there are other paradigms, for example only requiring exclusive access to a resource if it is being modified. Otherwise any number of readers may access it.
• Reference Counting: the kernel often has to maintain resources while they are actively used. A thread may indicate its interest in a resource by incrementing a reference count, then decrementing it when done. When the reference count goes to 0 the resource may be freed (garbage collection).
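The reference counting pattern can be sketched with an atomic counter. This is an illustration only, assuming C11 atomics; `resource_t` and the `resource_*` functions are hypothetical names, and the `payload` field stands in for whatever the kernel object would actually hold.

```c
#include <stdatomic.h>
#include <stdlib.h>

typedef struct {
    atomic_int refs;   /* number of threads currently interested */
    int payload;       /* placeholder for the real resource state */
} resource_t;

/* Creates a resource with one reference held by the caller.
 * Returns NULL if allocation fails. */
resource_t *resource_create(int payload)
{
    resource_t *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    atomic_init(&r->refs, 1);
    r->payload = payload;
    return r;
}

/* A thread indicates its interest by taking a reference. */
void resource_hold(resource_t *r)
{
    atomic_fetch_add(&r->refs, 1);
}

/* Drops one reference. Returns 1 if this was the last reference and
 * the resource was freed, 0 otherwise. */
int resource_release(resource_t *r)
{
    /* atomic_fetch_sub returns the value before the decrement. */
    if (atomic_fetch_sub(&r->refs, 1) == 1) {
        free(r);
        return 1;
    }
    return 0;
}
```

The atomic decrement matters: exactly one caller observes the count going from 1 to 0, so exactly one caller frees the resource.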
Multiprocessor Support
• The traditional approaches do not work.
  • Lost wakeup problem: between checking a locked flag and placing a thread on a sleep queue (and setting the wanted flag) the event may occur.
  • Thundering herd problem: waking all threads sleeping on a resource may cause them all to run in parallel (on different CPUs), but only one can lock the resource, so the remainder go back to sleep.
• The kernel relies on interrupt disabling to protect one processor's context, but this is not enough for multiple processors:
  • need both interrupt disabling and spin locks
• Leverage hardware support for synchronization
  • Atomic test-and-set: test bit, set it to 1, return old value.
  • Load-linked and store-conditional (read-modify-write): load variable, modify, then store. A flag is set indicating whether the write was successful. Can be used to implement semaphores (atomic increment: keep trying until you succeed in incrementing the variable).
Spin Locks (MP Support)
• The idea is to provide a basic, HW-supported primitive with low overhead.
  • Lock held for short periods of time
  • If locked, then busy-wait on the resource
  • Must not give up the processor if holding the lock!
• The code below does not show interrupts being blocked
  • For a mutual exclusion lock you must also block interrupts!
    void lock (spinlock_t *s) {
      while (testNset(s) != 0)     // atomic attempt to take the lock
        while (*s != 0)            // spin on a plain read until it looks free
          ;
    }

    void unlock (spinlock_t *s) {
      *s = 0;
    }
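The nested loop on this slide is often called test-and-test-and-set: waiters spin on an ordinary read and only retry the atomic operation when the lock looks free. A user-space sketch, assuming C11 atomics, is shown below; `atomic_exchange` stands in for testNset, the `spin_*` names are introduced here, and interrupt blocking is not modelled because it is not available to user code.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef atomic_int spinlock_t;   /* 0 = free, 1 = held */

/* One atomic attempt: true means the caller now holds the lock. */
bool spin_trylock(spinlock_t *s)
{
    return atomic_exchange(s, 1) == 0;
}

void spin_lock(spinlock_t *s)
{
    /* Outer loop: atomic exchange (writes the location every time). */
    while (atomic_exchange(s, 1) != 0) {
        /* Inner loop: read-only spin until the lock looks free. */
        while (atomic_load(s) != 0)
            ;
    }
}

void spin_unlock(spinlock_t *s)
{
    atomic_store(s, 0);
}
```

The inner read-only loop is the design point: repeated atomic exchanges from several CPUs all write the same location, while plain reads do not.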
Blocking Locks/Mutex
• Allows threads to block
• Interface: lock(), unlock() and trylock()
• Consider the traditional kernel locked flag
• Mutex allows for exclusive access to the flag, solving the race condition
• The flag can be protected by a spin lock.
Condition Variables
• Associated with a predicate which is protected by a mutex (usually a spin lock).
• Useful for event notification
• Can wake up (signal) one or all (broadcast) sleeping threads.
• Three or more mutex locks are typically required:
  • one for the predicate
  • one for the sleep queue (or CV list)
  • one or more for the scheduler queue (swtch())
• Deadlock is avoided by requiring a strict locking order
Condition Variables
[Figure: a condition variable with its predicate and mutex, and a list of blocked threads (kthread_1, kthread_2, kthread_3) protected by its own mutex. A thread that sets the event updates the predicate and wakes up one thread.]
CV Implementation
    void signal (cv *c) {
      lock (&c->listlock);
      remove a thread from list
      unlock (&c->listlock);
      if thread, make runnable;
      return;
    }

    void broadcast (cv *c) {
      lock (&c->listlock);
      while (list is nonempty) {
        remove a thread
        make it runnable
      }
      unlock (&c->listlock);
      return;
    }

    void wait (cv *c, mutex_t *s) {
      lock (&c->listlock);
      add thread to queue
      unlock (&c->listlock);
      unlock (s);        /* release the predicate's mutex */
      swtch ();          /* return after wakeup */
      lock (s);          /* reacquire before returning to the caller */
      return;
    }
Monitors
• High-level synchronization construct that allows the safe sharing of an abstract data type among concurrent processes.
    monitor monitor-name {
      shared variable declarations
      procedure body P1 (…) { . . . }
      procedure body P2 (…) { . . . }
      procedure body Pn (…) { . . . }
      { initialization code }
    }
Monitors: Condition Variables
• To allow a process to wait within the monitor, a condition variable must be declared, as: condition x, y;
• A condition variable can only be used with the operations wait and signal
  • x.wait(): the process invoking this operation is suspended until another process invokes x.signal()
  • x.signal(): resumes exactly one suspended process. If no process is suspended, then the signal operation has no effect.
• If a task A within a monitor signals a suspended task B, there is an issue of which task is permitted to execute within the monitor
  • signal-and-wait: task A must wait while task B is permitted to continue executing within the monitor
  • signal-and-continue: task A continues to execute within the monitor, task B must wait
  • A case can be made for either approach
Monitor With Condition Variables
[Figure]
Monitor Implementation
• Since only one task may be active in a monitor there must be a mutex protecting the monitor.
  • The mutex effectively serializes access to the monitor.
• We also need to account for sending signals within a monitor, that is, guard against having more than one active task within the monitor.
• Our implementation uses semaphores both for mutual exclusion (initialized to 1 for a mutex) and for counting the number of tasks waiting to be resumed within the monitor (initialized to 0, counting semaphore)
Example using Hoare Semantics; signal-and-wait
• Monitor-wide state
    semaphore mutex = 1;
    semaphore urgent = 0;
    int urgCnt = 0;
• Every monitor method
    methodX(…) {
      // entry protocol
      wait(mutex);
      … body of M; …
      // exit protocol
      if (urgCnt > 0)
        signal(urgent);
      else
        signal(mutex);
    }
• For CV x define:
    semaphore xSem = 0;
    int xCnt = 0;

    x.wait(…) {
      xCnt++;
      if (urgCnt > 0)
        signal(urgent);
      else
        signal(mutex);
      wait(xSem);
      xCnt--;
    }

    x.signal(…) {
      if (xCnt > 0) {
        urgCnt++;
        signal(xSem);
        wait(urgent);
        urgCnt--;
      }
    }
CV: Hoare; signal-and-wait
    semaphore mutex = 1;
    semaphore urgSem = 0, cvSem = 0;
    int urgCnt = 0, cvCnt = 0;

    wait() {
      cvCnt++;
      if (urgCnt > 0)
        signal(urgSem);
      else
        signal(mutex);
      wait(cvSem);
      cvCnt--;
    }

    signal() {
      urgCnt++;
      if (cvCnt > 0) {
        signal(cvSem);
        wait(urgSem);
      }
      urgCnt--;
    }
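CV: Mesa; signal-and-continue
• Implement so that the signaling task continues
    semaphore monLock = 1;   // monitor mutex
    semaphore cvSem = 0;

    cv.wait() {
      signal(monLock);       // leave the monitor so the signaler can get in
      wait(cvSem);
      wait(monLock);         // compete to re-enter the monitor
    }

    cv.signal() {
      signal(cvSem);
    }
• Because the woken task must reacquire monLock, the condition it waited for may no longer hold when it runs; it should re-check the condition after wait() returns.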
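POSIX condition variables follow the signal-and-continue discipline, so a monitor in this style can be sketched with one mutex and two condition variables. The bounded buffer below is an illustration under that assumption; `bounded_buffer_t`, `bb_init`, `bb_put` and `bb_get` are hypothetical names, and BUFSZ is set to 4 only to keep the example small.

```c
#include <pthread.h>

#define BUFSZ 4

typedef struct {
    int items[BUFSZ];
    int in, out, count;
    pthread_mutex_t lock;        /* the monitor mutex */
    pthread_cond_t not_full;     /* producers wait here */
    pthread_cond_t not_empty;    /* consumers wait here */
} bounded_buffer_t;

void bb_init(bounded_buffer_t *b)
{
    b->in = b->out = b->count = 0;
    pthread_mutex_init(&b->lock, NULL);
    pthread_cond_init(&b->not_full, NULL);
    pthread_cond_init(&b->not_empty, NULL);
}

void bb_put(bounded_buffer_t *b, int item)
{
    pthread_mutex_lock(&b->lock);            /* enter monitor */
    while (b->count == BUFSZ)                /* re-check after wakeup */
        pthread_cond_wait(&b->not_full, &b->lock);
    b->items[b->in] = item;
    b->in = (b->in + 1) % BUFSZ;
    b->count++;
    pthread_cond_signal(&b->not_empty);      /* signaler keeps running */
    pthread_mutex_unlock(&b->lock);          /* leave monitor */
}

int bb_get(bounded_buffer_t *b)
{
    pthread_mutex_lock(&b->lock);
    while (b->count == 0)
        pthread_cond_wait(&b->not_empty, &b->lock);
    int item = b->items[b->out];
    b->out = (b->out + 1) % BUFSZ;
    b->count--;
    pthread_cond_signal(&b->not_full);
    pthread_mutex_unlock(&b->lock);
    return item;
}
```

The predicate is tested in a `while` loop rather than an `if` because, with signal-and-continue, another task may enter the monitor and change the state between the signal and the moment the woken task reacquires the mutex. Testing and waiting under the same mutex also avoids a lost wakeup.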
More on Monitor Implementations
• How can we control the task resumption order?
  • Conditional-wait construct: x.wait(c); c is an integer expression evaluated when the wait is executed.
  • The value of c (a priority number) is stored with the name of the process that is suspended.
  • When x.signal is executed, the process with the smallest associated priority number is resumed next.
• Verifying correctness: check two conditions:
  • User processes must always make their calls on the monitor in a correct sequence.
  • Must ensure that an uncooperative process does not ignore the mutual-exclusion gateway provided by the monitor and try to access the shared resource directly, without using the access protocols.
Critical Regions
• High-level synchronization construct
• A shared variable v of type T is declared as:
    v: shared T
• Variable v is accessed only inside the statement
    region v when B do S
  where B is a Boolean expression.
• While statement S is being executed, no other process can access variable v.
Critical Regions
• Regions referring to the same shared variable exclude each other in time.
• When a process tries to execute the region statement, the Boolean expression B is evaluated. If B is true, statement S is executed. If B is false, the process is delayed until B becomes true and no other process is in the region associated with v.