Chapter 7 Synchronization and Multiprocessors
Contents • Introduction • Synchronization in traditional UNIX kernel • Multiprocessor system • Multiprocessor Synchronization issues • Spin locks • Condition variables • Read-write locks • Reference counts
Introduction • Multiprocessors serve compute-intensive workloads, and they can also provide reliability: a fault-tolerant design can recover from the failure of one CPU, improving the effective MTBF (mean time between failures). • Does performance scale linearly with the number of CPUs? Not in practice: other system components (memory, I/O), synchronization overhead, and extra functionality all limit the speedup. • Moving UNIX to a multiprocessor requires three major changes: synchronization, parallelization of the kernel, and the scheduling policy.
7.2 Synchronization in traditional UNIX • The UNIX kernel is reentrant: several processes may be executing kernel code at once, interleaved as each blocks. • The UNIX kernel is nonpreemptive: a process running in kernel mode keeps the CPU until it blocks or returns to user mode, so most kernel data structures need no explicit locking.
Interrupt Masking • The kernel can be interrupted, and an interrupt handler may touch the very data the interrupted code was using. • The kernel therefore maintains a current interrupt priority level (ipl); while it manipulates data shared with a handler, it raises the ipl, masking all interrupts with ipls lower than the current level.
Sleep and wakeup • Each shared resource carries two flags: locked and wanted. • Before accessing the resource, a thread checks locked; if it is set, the thread sets wanted and blocks (sleeps). • After finishing with the resource, the holder clears locked and, if wanted is set, wakes up all the waiting threads.
Limitations • Performance problem: mapping resources to sleep queues. • A sleeping thread's sleep channel (the resource address) is hashed to a sleep queue; more than one resource may map to the same channel, and the number of hash queues is much smaller than the number of distinct sleep channels. • Awakening therefore takes too much time: wakeup must scan the whole queue and wake every thread sleeping on the channel. • A separate sleep queue for each resource would fix this, but costs too much space.
Alternatives • A separate sleep queue for each resource or event: latency is optimized, but at a large memory overhead. • Solaris turnstiles: more predictable real-time behavior with minimal overhead. • It is also desirable to have a more sophisticated sharing protocol that allows multiple readers but exclusive write access.
7.3 Multiprocessor Systems • Three important characteristics of a multiprocessor: memory model, hardware support, and software architecture. • Memory Model • UMA (Uniform Memory Access) • NUMA (Non-Uniform Memory Access) • NORMA (No-Remote Memory Access)
Synchronization Support • Locking a resource for exclusive use: read the flag; if the flag is 0, set it to 1; return true if the lock was obtained, else false. • This read-modify-write sequence must be atomic; hardware provides one of two primitives: • Atomic test-and-set: test&set → access → reset • Load-linked & store-conditional: loadl(M,R) → access(R) → storec(M,R)
Implementing test-and-set with load-linked/store-conditional

    /* desired behavior (as an atomic primitive) */
    void test_and_set(int *s)
    {
        while (*s != 0)
            ;               /* wait for the flag to clear */
        *s = 1;             /* must happen atomically with the test */
    }

    /* built from load-linked and store-conditional */
    void test_and_set(int *s)
    {
        register int r;
        do {
            do {
                loadl(s, r);        /* load-linked: r = *s */
            } while (r != 0);
            r = 1;
        } while (!storec(s, r));    /* store-conditional fails, forcing a
                                       retry, if *s was written since the
                                       loadl */
    }
Software Architecture • Three types of multiprocessor systems: • Master-slave: one CPU runs the kernel while the others run user code, or one CPU handles I/O while the others compute. • Functionally asymmetric multiprocessors: different CPUs are dedicated to different types of applications. • Symmetric multiprocessing (SMP): all CPUs are peers, share a single copy of the kernel code, and any of them may run user programs or the kernel.
7.4 Multiprocessor Synchronization Issues • All shared data must be protected: two kernel threads may now access the same data truly simultaneously, so the nonpreemptive kernel is no longer sufficient. • A plain locked flag is not enough; testing and setting it must itself be atomic (test&set). • Masking interrupts no longer protects shared data either: a handler running on another CPU can still corrupt the data being accessed.
The lost wakeup problem • A thread tests the condition, finds the resource busy, and decides to sleep; before it actually blocks in wait(), another CPU releases the resource and issues the wakeup. • The wakeup finds no one asleep and is lost; the first thread then blocks, possibly forever.
The Thundering Herd Problem • Uniprocessor: when releasing the resource, the holder wakes up all the threads waiting for it; only one wins, and the rest go back to sleep. • Multiprocessor: all the awakened threads may run at once on different CPUs and stampede for the same resource, wasting CPU cycles and bus traffic. • This is the thundering herd problem.
7.5 Semaphores • Semaphore operations:

    void initsem(semaphore *sem, int val)
    {
        *sem = val;
    }

    void P(semaphore *sem)          /* acquire */
    {
        *sem -= 1;
        if (*sem < 0)
            sleep;                  /* block until a V() releases us */
    }

    void V(semaphore *sem)          /* release */
    {
        *sem += 1;
        if (*sem <= 0)              /* a negative or zero result means
                                       someone was waiting */
            wakeup one thread blocked on sem;
    }

    boolean_t CP(semaphore *sem)    /* conditional P: never blocks */
    {
        if (*sem > 0) {
            *sem -= 1;
            return TRUE;
        }
        return FALSE;
    }

Each operation must execute atomically; a negative value counts the threads currently waiting.
Mutual Exclusion

    semaphore sem;              /* initialization */
    initsem(&sem, 1);

    P(&sem);                    /* on each use */
    use resource;
    V(&sem);
Event-Wait

    semaphore event;
    initsem(&event, 0);

    /* waiting thread */
    P(&event);
    do something;

    /* signaling thread */
    if (event occurs)
        V(&event);
Countable resources

    semaphore counter;
    initsem(&counter, resourceCount);

    P(&counter);                /* take one instance */
    use resource;
    V(&counter);                /* return it */
Drawbacks of semaphores • They need hardware support to make P and V atomic. • Context switching is expensive, so blocking is wasteful when the resource is held only briefly. • They hide the reason for blocking, preventing resource-specific optimizations, e.g., in the buffer cache.
Convoys • A convoy is a situation arising from frequent contention on a semaphore: every acquirer blocks and context-switches even though the resource is held only briefly, so the queue of waiters keeps growing and throughput collapses.
Spin Locks • Also called a simple lock or a simple mutex:

    void spin_lock(spinlock_t *s)
    {
        while (test_and_set(s) != 0)
            ;                   /* busy-wait */
    }

    void spin_unlock(spinlock_t *s)
    {
        *s = 0;
    }
Another version • Spins on an ordinary read to avoid locking the bus on every iteration (test-and-test-and-set):

    void spin_lock(spinlock_t *s)
    {
        while (test_and_set(s) != 0)
            while (*s != 0)
                ;               /* spin on the cached copy */
    }

    void spin_unlock(spinlock_t *s)
    {
        *s = 0;
    }
Using spin-locks • Unlike on a uniprocessor, busy-waiting makes sense on a multiprocessor: a spin lock is not expensive if the holder releases it quickly. • Ideal for locking data structures that are accessed only briefly, such as removing an item from a doubly linked list:

    spinlock_t listlock;

    spin_lock(&listlock);
    item->forw->back = item->back;
    item->back->forw = item->forw;
    spin_unlock(&listlock);
Condition Variables • v is a condition variable, m a mutex protecting the predicate x == y. • Thread A:

    a:  lock_mutex(&m);
    b:  while (x != y)
    c:      cond_wait(&v, &m);  /* atomically releases the mutex and
                                   blocks; when unblocked by cond_signal,
                                   it reacquires the mutex */
    d:  /* do something with x and y */
    e:  unlock_mutex(&m);
Condition Variables • Thread B:

    f:  lock_mutex(&m);
    g:  x++;
    h:  cond_signal(&v);
    i:  unlock_mutex(&m);
Condition Variables • Each condition variable is associated with a predicate about shared data. • Example: clients place requests on a shared message queue served by a pool of servers; the message queue is the shared data, and the predicate is that the queue be nonempty.
Implementation • The predicate must be tested atomically with the decision to block; the condition variable therefore keeps its own spin lock and a list of waiting threads:

    struct condition {
        proc *next;             /* linked list of waiting threads */
        proc *prev;
        spinlock_t listlock;    /* protects the list */
    };
wait()

    void wait(condition *c, spinlock_t *s)
    {
        spin_lock(&c->listlock);
        add self to the linked list;
        spin_unlock(&c->listlock);
        spin_unlock(s);         /* release the mutex before blocking */
        swtch();                /* block; resume here when awakened */
        spin_lock(s);           /* reacquire the mutex */
        return;
    }
do_signal()

    void do_signal(condition *c)
    {
        spin_lock(&c->listlock);
        remove one thread from the linked list, if it is nonempty;
        spin_unlock(&c->listlock);
        if a thread was removed from the list, make it runnable;
        return;
    }
do_broadcast()

    void do_broadcast(condition *c)
    {
        spin_lock(&c->listlock);
        while (the linked list is nonempty) {
            remove one thread from the linked list;
            make it runnable;
        }
        spin_unlock(&c->listlock);
    }
Accessing condition variables • Servers wait for the queue to be nonempty, clients for it to be nonfull; each side signals the condition the other is waiting on (and retests its predicate in a loop after waking):

    condition nonempty, nonfull;
    spinlock_t s;
    messageQueue msq;

    server()
    {
        spin_lock(&s);
        while (msq.empty())
            wait(&nonempty, &s);
        get message;
        spin_unlock(&s);
        do_signal(&nonfull);    /* a slot is now free */
    }

    client()
    {
        spin_lock(&s);
        while (msq.full())
            wait(&nonfull, &s);
        put message;
        spin_unlock(&s);
        do_signal(&nonempty);   /* a message is now available */
    }
Events • An event combines a done flag, the spin lock protecting it, and a condition variable; operations: awaitDone(), setDone(), testDone(). • A blocking lock similarly combines the locked flag on the resource with its sleep queue; operations: lock() and unlock().
7.8 Read-Write locks • Support one-writer & many readers • lockShared() // Reader • unlockShared() // Reader • lockExclusive() // Writer • unlockExclusive() // Writer • upgrade() // from shared to exclusive • downgrade() //from exclusive to shared
Design considerations • Avoid needless wakeups: • When a reader releases the lock, only the last active reader should wake up a single waiting writer. • When a writer releases the lock, prefer waiting writers. • To limit reader starvation, wake up all the waiting readers at once when a writer finishes. • To avoid writer starvation, lockShared() blocks new readers whenever any writer is waiting.
Implementation

    struct rwlock {
        int nActive;            /* number of active readers;
                                   -1 means a writer is active */
        int nPendingReads;
        int nPendingWrites;
        spinlock_t sl;
        condition canRead;
        condition canWrite;
    };
lockShared()/unlockShared()

    void lockShared(struct rwlock *r)
    {
        spin_lock(&r->sl);
        r->nPendingReads++;
        if (r->nPendingWrites > 0)      /* writers waiting: queue up */
            wait(&r->canRead, &r->sl);
        while (r->nActive < 0)          /* a writer is active */
            wait(&r->canRead, &r->sl);
        r->nActive++;
        r->nPendingReads--;
        spin_unlock(&r->sl);
    }

    void unlockShared(struct rwlock *r)
    {
        spin_lock(&r->sl);
        r->nActive--;
        if (r->nActive == 0) {          /* last reader wakes one writer */
            spin_unlock(&r->sl);
            do_signal(&r->canWrite);
        } else
            spin_unlock(&r->sl);
    }
lockExclusive()/unlockExclusive()

    void lockExclusive(struct rwlock *r)
    {
        spin_lock(&r->sl);
        r->nPendingWrites++;
        while (r->nActive)              /* readers or a writer active */
            wait(&r->canWrite, &r->sl);
        r->nPendingWrites--;
        r->nActive = -1;
        spin_unlock(&r->sl);
    }

    void unlockExclusive(struct rwlock *r)
    {
        boolean_t wakeReaders;

        spin_lock(&r->sl);
        r->nActive = 0;
        wakeReaders = (r->nPendingReads != 0);
        spin_unlock(&r->sl);
        if (wakeReaders)
            do_broadcast(&r->canRead);
        else
            do_signal(&r->canWrite);
    }
upgrade()/downgrade()

    void upgrade(struct rwlock *r)
    {
        spin_lock(&r->sl);
        if (r->nActive == 1)            /* sole reader: convert directly */
            r->nActive = -1;
        else {
            r->nPendingWrites++;
            r->nActive--;               /* give up the shared lock */
            while (r->nActive)
                wait(&r->canWrite, &r->sl);
            r->nPendingWrites--;
            r->nActive = -1;
        }
        spin_unlock(&r->sl);
    }

    void downgrade(struct rwlock *r)
    {
        boolean_t wakeReaders;

        spin_lock(&r->sl);
        r->nActive = 1;
        wakeReaders = (r->nPendingReads != 0);
        spin_unlock(&r->sl);
        if (wakeReaders)
            do_broadcast(&r->canRead);
    }
Using R/W locks

    rwlock l;

    T1()
    {
        lockShared(&l);
        reading;
        upgrade(&l);
        writing;
        downgrade(&l);
        unlockShared(&l);
    }

    T2()
    {
        lockExclusive(&l);
        writing;
        downgrade(&l);
        reading;
        upgrade(&l);
        unlockExclusive(&l);
    }
Reference Counts • A count kept with an object recording how many threads hold pointers to it, guaranteeing that the object remains allocated while any of them may still access it. • When the object is created, the count is set to one. • When a thread obtains a pointer to the object, the count is incremented. • When a thread releases its pointer, the kernel decrements the count. • When the count drops to zero, the kernel deallocates the object.
Deadlock avoidance • Hierarchical locking imposes an order on related locks and requires that all threads take locks in the same order; e.g., a thread must lock the condition variable before locking the linked list. • Stochastic locking is used where the order may be violated: try_lock() is called instead of lock(); if the lock is already held, it returns failure instead of blocking, and the caller can back off and retry.
Recursive locks • A deadlock can arise when a thread holding a lock calls a lower-level routine that tries to acquire the same lock to operate on the locked resource. • A recursive lock allows the owning thread to reacquire it, so the nested call succeeds where a plain lock would deadlock.