CSL718: Multiprocessors
Synchronization, Memory Consistency
17th April, 2006
Anshul Kumar, CSE IITD
Synchronization Problem
• Processes run on different processors independently
• At some point they need to know the status of each other for
  - communication
  - mutual exclusion etc.
• A hardware primitive for atomic read+write is required (e.g. test&set, exchange, fetch&increment etc.)
Spin Lock with Exchange Instr.
Lock: 0 indicates free and 1 indicates locked
Code to lock X:
          r2 ← 1
  lockit: r2 ↔ X            ;atomic exchange
          if(r2≠0)lockit    ;already locked
Locks are cached for efficiency; coherence is used
Better code to lock X:
  lockit: r2 ← X            ;read lock
          if(r2≠0)lockit    ;not available
          r2 ← 1
          r2 ↔ X            ;atomic exchange
          if(r2≠0)lockit    ;already locked
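The "better code" above (spin on a plain read, then attempt the exchange) can be sketched in C++ as follows. This is only an illustration, not the slide's own code: `SpinLock` and `count_with_spinlock` are hypothetical names, and the acquire/release memory orders are an assumption about what a lock needs.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Lock word: 0 means free, 1 means locked (same convention as the slide).
struct SpinLock {
    std::atomic<int> x{0};

    void lock() {
        for (;;) {
            // Spin on an ordinary read first, so waiting hits the local cached copy.
            while (x.load(std::memory_order_relaxed) != 0) { }
            // The lock looked free: try to take it with an atomic exchange.
            if (x.exchange(1, std::memory_order_acquire) == 0) return;
            // Another processor got there first; go back to reading.
        }
    }

    void unlock() { x.store(0, std::memory_order_release); }
};

// Hypothetical demo helper: every thread increments a plain int under the lock.
int count_with_spinlock(int threads, int iters) {
    SpinLock lk;
    int counter = 0;  // deliberately non-atomic; protected only by lk
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lk.lock();
                ++counter;
                lk.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

If mutual exclusion holds, `count_with_spinlock(4, 10000)` returns 40000, since no increment is lost.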
LD Locked & ST conditional
Simpler to implement
• atomic exchange using LL and SC
  try: r3 ← r2         ;move exchange value
       LL r1, X        ;load locked
       SC r3, X        ;store conditional
       if(r3=0)try     ;branch if store fails
       r2 ← r1         ;put loaded value in r2
• fetch&increment using LL and SC
  try: LL r1, X        ;load locked
       r3 ← r1 + 1     ;increment
       SC r3, X        ;store conditional
       if(r3=0)try     ;branch if store fails
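C++ does not expose LL and SC directly. A rough analogue of the fetch&increment retry loop uses `compare_exchange_weak`, which is allowed to fail spuriously much as SC can. The sketch below is written under that assumption; `fetch_and_increment` and `count_with_fetch_increment` are hypothetical names.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Stores old + 1 into x and returns the old value, retrying until the store succeeds.
int fetch_and_increment(std::atomic<int>& x) {
    int old = x.load(std::memory_order_relaxed);  // plays the role of LL
    // On failure compare_exchange_weak reloads `old` from x, so the loop
    // simply retries with the fresh value (like branching back to try:).
    while (!x.compare_exchange_weak(old, old + 1)) { }
    return old;
}

// Hypothetical demo helper: all threads bump one shared counter.
int count_with_fetch_increment(int threads, int iters) {
    std::atomic<int> x{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) fetch_and_increment(x);
        });
    for (auto& th : pool) th.join();
    return x.load();
}
```

In practice `std::atomic<int>::fetch_add` already provides this operation; the explicit loop is shown only to mirror the retry structure of the slide.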
Spin Lock with LL & SC
  lockit: LL r2, X          ;load locked
          if(r2≠0)lockit    ;not available
          r2 ← 1
          SC r2, X          ;store cond
          if(r2=0)lockit    ;branch if store fails
Spin lock with exponential back-off reduces contention
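The back-off idea can be sketched as below: after each failed attempt the thread waits, and the wait doubles up to a cap. The starting delay of 1 microsecond, the cap of 64 microseconds and the names `BackoffLock` and `count_with_backoff` are all assumptions made for illustration; an atomic exchange stands in for the LL/SC pair.

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>

struct BackoffLock {
    std::atomic<int> x{0};  // 0 = free, 1 = locked

    void lock() {
        int delay_us = 1;             // assumed initial delay
        const int max_delay_us = 64;  // assumed upper bound on the delay
        while (x.exchange(1, std::memory_order_acquire) != 0) {
            // Lock was taken: wait before retrying, doubling the wait each time.
            std::this_thread::sleep_for(std::chrono::microseconds(delay_us));
            delay_us = std::min(2 * delay_us, max_delay_us);
        }
    }

    void unlock() { x.store(0, std::memory_order_release); }
};

// Hypothetical demo helper: every thread increments a plain int under the lock.
int count_with_backoff(int threads, int iters) {
    BackoffLock lk;
    int counter = 0;  // protected only by lk
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lk.lock();
                ++counter;
                lk.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

Waiting threads retry less and less often, so they generate fewer exchanges on the lock word while it is held.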
Barrier Synchronization
  lock (X)
  if(count=0) release ← 0
  count++
  unlock(X)
  if(count=total) {count ← 0; release ← 1}
  else spin(release=1)
Improved Barrier Synch.
  local_sense ← !local_sense
  lock (X)
  count++
  unlock(X)
  if(count = total) {count ← 0; release ← local_sense}
  else {spin(release = local_sense)}
Tree based barrier reduces contention
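A minimal C++ sketch of the sense-reversing barrier follows. It departs from the slide in one labelled way: instead of lock(X)/unlock(X) around count++, it uses an atomic fetch-and-add, so the thread that brings the count to total is identified by the value the increment returns. `SenseBarrier` and `barrier_violations` are hypothetical names.

```cpp
#include <atomic>
#include <thread>
#include <vector>

struct SenseBarrier {
    const int total;
    std::atomic<int> count{0};
    std::atomic<bool> release{false};

    explicit SenseBarrier(int n) : total(n) {}

    // local_sense is private to each thread and must start as false.
    void wait(bool& local_sense) {
        local_sense = !local_sense;
        if (count.fetch_add(1) + 1 == total) {
            count.store(0);              // reset for the next use of the barrier
            release.store(local_sense);  // let the waiting threads go
        } else {
            while (release.load() != local_sense) std::this_thread::yield();
        }
    }
};

// Hypothetical check: after round r's barrier, all `threads` arrivals of
// rounds 1..r must be visible. Returns how often that was not the case.
int barrier_violations(int threads, int rounds) {
    SenseBarrier bar(threads);
    std::atomic<int> arrived{0};
    std::atomic<int> violations{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            bool local_sense = false;
            for (int r = 1; r <= rounds; ++r) {
                arrived.fetch_add(1);
                bar.wait(local_sense);
                if (arrived.load() < threads * r) violations.fetch_add(1);
            }
        });
    for (auto& th : pool) th.join();
    return violations.load();
}
```

Because each thread flips its own local_sense on every use, the same barrier object can be reused round after round without a fast thread being trapped by a stale release value.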
Memory Consistency Problem
• When must a processor see the value that has been written by another processor? Atomicity of operations: system wide?
• Can memory operations be re-ordered?
Various models: http://rsim.cs.uiuc.edu/~sadve/Publications/models_tutorial.ps
Example
  P1:                     P2:
      A = 0                   B = 0
      ...                     ...
      A = 1                   B = 1
  L1: if(B=0)S1           L2: if(A=0)S2
Which statements among S1 and S2 are done?
Both S1 and S2 may be done if writes are delayed
Sequential Consistency
• The result of any execution is the same as if the operations of all processors were executed in some sequential order
• The operations of each processor occur in the order specified by its program
  - it requires all memory operations to be atomic
  - too restrictive, high overheads
Relaxing W→R order
Loads are allowed to overtake stores
Write buffering is permitted
• Total Store Ordering: Writes are atomic
• Processor Consistency: Writes need not be atomic
  - Invalidations may gradually propagate
Relaxing W→R & W→W order
Partial Store Ordering
• Loads are allowed to overtake stores
• Writes can be re-ordered
• Memory barriers or fences are used to explicitly order any operations
Further improves the performance
Examples
  P1              P2
  A = 1;          while(flag=0);
  flag = 1;       print A;
SC ensures that "1" is printed. TSO, PC also do so. PSO does not.

  P1              P2
  A = 1;          print B;
  B = 1;          print A;
SC ensures that if B is printed as "1" then A is also printed as "1". TSO, PC also do so. PSO does not.
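The first example (A and flag) can be written in C++ as a sketch, assuming the C++11 memory model: a release store to flag and an acquire load of it supply the write-to-write ordering that a PSO machine would otherwise need a fence for. `message_passing_once` is a hypothetical name, and the function returns the value P2 would print.

```cpp
#include <atomic>
#include <thread>

// Runs P1 and P2 once and returns the value of A that P2 observes.
int message_passing_once() {
    int A = 0;                 // ordinary (non-atomic) data
    std::atomic<int> flag{0};
    int seen = -1;

    std::thread p1([&] {
        A = 1;
        // Release: the write to A may not be reordered after this store.
        flag.store(1, std::memory_order_release);
    });
    std::thread p2([&] {
        // Acquire: once flag is seen as 1, the write to A is visible too.
        while (flag.load(std::memory_order_acquire) == 0) { }
        seen = A;              // "print A"
    });

    p1.join();
    p2.join();
    return seen;
}
```

With the release/acquire pair in place the function returns 1 on every run. Using relaxed ordering on both accesses to flag would remove that guarantee.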
Examples - continued
  P1              P2              P3
  A = 1;          while(A=0);     while(B=0);
                  B = 1;          print A;
SC ensures that "1" is printed. TSO and PSO also do that but PC does not.

  P1              P2
  A = 1;          B = 1;
  print B;        print A;
SC ensures that both can't be printed as "0". TSO, PC and PSO do not.
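The second example (each processor writes one variable, then reads the other) can be tried in C++ as a sketch. It assumes that the default `memory_order_seq_cst` of `std::atomic` gives the sequentially consistent behaviour described on the slide; `both_zero_once` is a hypothetical name.

```cpp
#include <atomic>
#include <thread>

// Runs P1 and P2 once; returns true only if both reads returned 0.
bool both_zero_once() {
    std::atomic<int> A{0}, B{0};
    int r1 = -1, r2 = -1;

    std::thread p1([&] {
        A.store(1);     // seq_cst by default
        r1 = B.load();  // "print B"
    });
    std::thread p2([&] {
        B.store(1);
        r2 = A.load();  // "print A"
    });

    p1.join();
    p2.join();
    return r1 == 0 && r2 == 0;
}
```

Under sequential consistency one of the two stores comes first in the global order, so at least one load sees 1 and the function returns false. If the stores and loads were weakened to `memory_order_relaxed`, or to release and acquire, the both-zero outcome would be permitted, which matches the slide's remark about TSO, PC and PSO.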
Relaxing all R/W order
Weak Ordering or Weak Consistency
• Loads and stores are not restricted to follow an order
• Explicit synchronization primitives are used
• Synchronization primitives follow a strict order
• Easy to achieve
• Low overhead
Release Consistency
• Further relaxation of weak ordering
• Synch primitives are divided into acquire and release operations
• R/W operations after an acquire cannot move before it, but those before it can be moved after it
• R/W operations before a release cannot move after it, but those after it can be moved before it
WC and RC Comparison
  WC                      RC
  R/W … R/W   (1)         R/W … R/W   (1)
  synch                   acquire
  R/W … R/W   (2)         R/W … R/W   (2)
  synch                   release
  R/W … R/W   (3)         R/W … R/W   (3)