Microprocessors Synchronization Instructions April 16th, 2002
Consider Simple Monitor Case • Logically we want • Test if lock is set, if so wait, else set lock • Do processing needing to lock out others • Release the lock • Example: two threads need to access the screen, but there is only one cursor register, so they must not interfere with one another.
The Critical Lock Step • Test if lock is set, if so wait, else set lock • The point is that this must be atomic • The following code is wrong. Why? • Move mem to register • Test if register is non-zero • If so, wait • Else set mem non-zero
The Critical Lock Step • Test if lock is set, if so wait, else set lock • The point is that this must be atomic • The following code is wrong. Why? • Move mem to register • Test if register is non-zero • -> BECAUSE YOU CAN GET AN INTERRUPT HERE (another thread can then test the same lock and also find it free) • If so, wait • Else set mem non-zero
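The failure can be replayed deterministically. The sketch below is a single-threaded walk-through of one bad interleaving, not real concurrent code; all names (`lock_mem`, `load_lock`, `test_then_set`) are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>

/* Hypothetical lock word: 0 = free, non-zero = held. */
static int lock_mem = 0;

/* Step 1 of the broken sequence: move mem to a register. */
static int load_lock(void) {
    return lock_mem;
}

/* Remaining steps: test the register, then set mem non-zero if it
   looked free. Returns true if the caller believes it holds the lock. */
static bool test_then_set(int reg) {
    if (reg != 0) {
        return false;          /* lock looked taken: caller would wait */
    }
    lock_mem = 1;              /* lock looked free: set it */
    return true;
}

/* Replays: A loads, an interrupt switches to B, B loads and sets,
   then A resumes with its stale register value. */
static int owners_after_bad_interleaving(void) {
    lock_mem = 0;
    int reg_a = load_lock();             /* thread A sees 0 */
    /* -> interrupt here, thread B runs */
    int reg_b = load_lock();             /* thread B also sees 0 */
    bool b_owns = test_then_set(reg_b);
    /* -> thread A resumes */
    bool a_owns = test_then_set(reg_a);
    return (int)a_owns + (int)b_owns;    /* both believe they own the lock */
}
```

In this replay both threads end up believing they hold the lock, which is exactly what the atomic test-and-set step is meant to rule out.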
Let the OS Do Things • Set address of lock in a register • Trap to operating system • Let the operating system take care of it • Works fine … • But … • Operating system traps are expensive • Most of the time the lock was free • So this is very inefficient
Hardware Instructions • Test and Set • Sets memory non-zero • And sets a flag to indicate if it was non-zero before the operation was done. • This instruction must be indivisible • Requires indivisible memory read/write • This is the classical instruction • E.g. the TS instruction on the IBM mainframe
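In portable C, the same idea is available through C11's `atomic_flag`, whose `atomic_flag_test_and_set` sets the flag and returns its previous value in one indivisible step. The following is a minimal sketch of a spin lock built on it, assuming a C11 compiler with `<stdatomic.h>`; the names `ts_lock`, `ts_try_acquire`, `ts_acquire` and `ts_release` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock guarding a shared resource such as the cursor register. */
static atomic_flag ts_lock = ATOMIC_FLAG_INIT;

/* One test-and-set: sets the flag and reports whether it was clear before.
   Returns true if the caller has just taken the lock. */
static bool ts_try_acquire(atomic_flag *lock) {
    return !atomic_flag_test_and_set(lock);
}

/* Wait (by spinning) until the test-and-set finds the lock free. */
static void ts_acquire(atomic_flag *lock) {
    while (atomic_flag_test_and_set(lock)) {
        /* lock was already set: keep trying */
    }
}

/* Release the lock by clearing the flag. */
static void ts_release(atomic_flag *lock) {
    atomic_flag_clear(lock);
}
```

A thread would call `ts_acquire(&ts_lock)`, do the processing that must lock out others, then call `ts_release(&ts_lock)`. Spinning is only one way to "wait"; a real system might yield or trap to the operating system after a few failed attempts.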
Other Hardware Instructions • Exchange memory register • We can simulate TS with this • Register has non-zero value • Memory is lock • Do exchange • Test what was in memory before
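The exchange recipe above maps directly onto C11's `atomic_exchange`, which stores a new value and returns the old one indivisibly. This is a sketch under the assumption that 0 means "free" and 1 means "held"; `xchg_lock`, `xchg_try_acquire` and `xchg_release` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Memory is the lock: 0 = free, non-zero = held (assumed convention). */
static atomic_int xchg_lock = 0;

/* Simulates test-and-set with an exchange.
   Returns true if the lock was free and is now held by the caller. */
static bool xchg_try_acquire(atomic_int *lock) {
    int reg = 1;                              /* register has non-zero value */
    int before = atomic_exchange(lock, reg);  /* do exchange */
    return before == 0;                       /* test what was in memory before */
}

/* Release by storing zero back into the lock word. */
static void xchg_release(atomic_int *lock) {
    atomic_store(lock, 0);
}
```

If the lock was already held, the exchange simply writes a non-zero value over a non-zero value, so a failed attempt does no harm.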
Other Hardware Instructions • Replace-Add • register ← mem and mem ← mem + register, done as one indivisible operation (the register receives the old value of mem) • Much more powerful. Consider the following problem. • A queue of tasks is to be executed • Each thread wants to grab one operation • Without having to do fancy synchronization
Solution to that Problem • Represent the tasks in an array, one task per array entry. Tasks are indexed by subscripts in the range N .. M. • A mem variable holds the index of the next task to take. • Each thread does Replace-Add of 1 on this mem variable. • Each thread gets a unique index
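Replace-Add corresponds to what C11 calls `atomic_fetch_add`: it adds to the memory location and returns the value that was there before. The sketch below assumes the shared variable holds a task index rather than an address, and uses the hypothetical constants `FIRST_TASK` and `LAST_TASK` as stand-ins for N and M; the end-of-queue check is an addition for illustration and is not part of the slides.

```c
#include <stdatomic.h>

/* Stand-ins for the subscript range N .. M (values are arbitrary). */
enum { FIRST_TASK = 3, LAST_TASK = 6 };

/* Shared variable holding the index of the next task to take. */
static atomic_int next_task = FIRST_TASK;

/* Claims one task. Returns a unique index in FIRST_TASK..LAST_TASK,
   or -1 once all tasks have been handed out. */
static int claim_task(void) {
    /* Replace-Add of 1: the old value comes back, memory is incremented. */
    int index = atomic_fetch_add(&next_task, 1);
    return index <= LAST_TASK ? index : -1;
}
```

Because each call receives the value from before its own increment, no two callers can see the same index, and no separate lock is needed around the queue.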
Caching Considerations • The memory locations cannot be cached • Because two different threads on two different processors could get confused • If each has a cached copy of the memory • So either the caches must be coherent • Or the memory location must not be cached at all.