A Case for an Interleaving Constrained Shared-Memory Multi-Processor. CS6260. Biao Xiong, Srikanth Bala.
Why is Parallel Programming Hard? • Single-threaded programming is relatively easy; parallel programming is harder. • Threads can interleave in so many ways that we cannot decide the order in which all the threads execute. • It is impossible to test all the interleavings, so some interleavings remain untested.
Why is Parallel Programming Hard? [Figure: the space of legal thread interleavings] • Legal thread interleavings • Untested interleavings: the cause of concurrency bugs • Incorrect interleavings found during testing
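To give a sense of how quickly the number of interleavings grows, here is a minimal sketch that counts them. It assumes straight-line threads (no branches or synchronization) whose instructions execute one at a time in a single global order; the function name `count_interleavings` is made up for this illustration.

```python
from math import factorial

def count_interleavings(lengths):
    """Number of ways to merge threads with the given instruction counts
    into one global order while keeping each thread's program order.
    This is the multinomial coefficient (n1 + n2 + ...)! / (n1! * n2! * ...).
    """
    total = factorial(sum(lengths))
    for n in lengths:
        total //= factorial(n)  # exact: each partial quotient is an integer
    return total

print(count_interleavings([3, 3]))    # 20
print(count_interleavings([10, 10]))  # 184756
```

Even two threads of ten instructions each already have more than 180,000 possible orders, which is why exhaustive testing is not practical.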
Solution • Synchronization operations: semaphores, locks, condition variables, transactions. • Memory consistency model: reduces the number of legal thread interleavings. • Use PSets to capture the interleavings found to be correct while testing the program's execution. • Avoid untested interleavings, which occur infrequently.
Challenges • How to encode tested interleavings in a program’s binary? • Predecessor Set (PSet) interleaving constraints • How to efficiently enforce interleaving constraints at runtime? • Detect violations of PSet constraints • Avoid violations by stalling or using rollback-and-re-execution support
Constraining Interleavings • A majority of the concurrency bugs are avoidable • Data races, atomicity violations, and also order violations • Performance overhead is low • Untested interleavings in well-tested programs are likely to manifest rarely
Data Race • A pair of memory accesses from different threads to the same memory location, where at least one is a write and neither happens before the other.
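The definition above can be checked mechanically if each access carries a vector clock. The sketch below is one way to do that; the record layout (`thread`, `addr`, `kind`, `vc`) and the function names are assumptions made for this example, not something taken from the slides.

```python
def happens_before(vc_a, vc_b):
    """True if the event stamped vc_a happens before the one stamped vc_b.
    Clocks are dicts mapping thread name -> counter; missing entries are 0."""
    threads = set(vc_a) | set(vc_b)
    no_later = all(vc_a.get(t, 0) <= vc_b.get(t, 0) for t in threads)
    strictly = any(vc_a.get(t, 0) < vc_b.get(t, 0) for t in threads)
    return no_later and strictly

def is_data_race(a, b):
    """Apply the definition: same location, different threads,
    at least one write, and no happens-before order either way."""
    unordered = (not happens_before(a["vc"], b["vc"])
                 and not happens_before(b["vc"], a["vc"]))
    return (a["addr"] == b["addr"]
            and a["thread"] != b["thread"]
            and "w" in (a["kind"], b["kind"])
            and unordered)

write = {"thread": "T1", "addr": "x", "kind": "w", "vc": {"T1": 1}}
# T2 reads without synchronizing with T1: the clocks are incomparable.
racy_read = {"thread": "T2", "addr": "x", "kind": "r", "vc": {"T2": 1}}
# T2 reads after synchronizing with T1 (its clock has absorbed T1's).
safe_read = {"thread": "T2", "addr": "x", "kind": "r", "vc": {"T1": 1, "T2": 2}}

print(is_data_race(write, racy_read))  # True
print(is_data_race(write, safe_read))  # False
```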
Data race detectors • Happens-before: detects only the data races that occur in a given program execution. • Lockset-based: predicts data races, including ones that did not occur in the observed execution.
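As an illustration of the lockset idea, here is a simplified sketch in the style of the Eraser algorithm: for every shared location it keeps the set of locks held on all accesses so far, and flags the location once that set becomes empty. The trace format and the name `lockset_races` are assumptions for this example, and the state machine that real lockset detectors use to suppress false positives is left out.

```python
from collections import defaultdict

def lockset_races(trace):
    """trace: iterable of (thread, addr, kind, held_locks), kind is "r" or "w".
    Returns the set of addresses with no common lock across their accesses."""
    candidates = {}              # addr -> locks held on every access so far
    threads = defaultdict(set)   # addr -> threads that touched it
    written = set()              # addrs that were written at least once
    flagged = set()
    for thread, addr, kind, held in trace:
        held = set(held)
        candidates[addr] = candidates.get(addr, held) & held
        threads[addr].add(thread)
        if kind == "w":
            written.add(addr)
        shared = len(threads[addr]) > 1
        if shared and addr in written and not candidates[addr]:
            flagged.add(addr)
    return flagged

trace = [
    ("T1", "x", "w", {"L"}),
    ("T2", "x", "w", set()),   # x is written without the lock
    ("T1", "y", "w", {"L"}),
    ("T2", "y", "r", {"L"}),   # y is always protected by L
]
print(lockset_races(trace))  # {'x'}
```

Because the check looks only at which locks were held, not at the order in which the accesses happened, it can report a race that did not manifest in the observed run.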
Benign data race • Not all data races are harmful; programmers sometimes allow data races deliberately to optimize performance.
Atomicity Violation • Atomicity is a guarantee of isolation from concurrent processes. • Is x++ atomic? No: it compiles to a load, an add, and a store:
    mov eax, dword ptr [x]
    add eax, 1
    mov dword ptr [x], eax
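The effect of interleaving the three instructions of x++ can be shown with a small deterministic simulation. This is only a sketch: the `run` function and the schedule format are invented for this illustration, with each thread given its own copy of the `eax` register.

```python
def run(schedule):
    """Execute x++ once per thread, one instruction at a time, in the order
    given by schedule (a list of thread names). Returns the final x."""
    x = 0
    eax = {}    # per-thread register
    step = {}   # per-thread index of the next instruction
    for t in schedule:
        s = step.get(t, 0)
        if s == 0:
            eax[t] = x        # mov eax, dword ptr [x]
        elif s == 1:
            eax[t] += 1       # add eax, 1
        elif s == 2:
            x = eax[t]        # mov dword ptr [x], eax
        step[t] = s + 1
    return x

print(run(["A", "A", "A", "B", "B", "B"]))  # 2: the increments do not overlap
print(run(["A", "B", "A", "B", "A", "B"]))  # 1: one update is lost
```

In the second schedule both threads load 0 before either stores, so both store 1 and one increment disappears.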
Atomicity Violation Detectors • AVIO: analyzes atomic regions and detects atomicity violations.
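One way to picture this kind of detection is to look at two consecutive accesses by the same thread to a variable and ask whether an access from another thread in between can be serialized. The sketch below hard-codes the four unserializable patterns as they are usually presented for AVIO; the trace format and function name are assumptions for this example, and the training step in which a detector learns which interleavings are acceptable is not modeled.

```python
# (previous local access, interleaved remote access, current local access)
UNSERIALIZABLE = {
    ("r", "w", "r"),  # the two local reads see different values
    ("w", "w", "r"),  # the local read does not see the local write
    ("w", "r", "w"),  # the remote read sees an intermediate value
    ("r", "w", "w"),  # the remote write is overwritten using a stale read
}

def find_unserializable(trace):
    """trace: ordered list of (thread, kind) accesses to ONE shared variable.
    Returns (prev_index, remote_index, curr_index) triples that match an
    unserializable pattern."""
    hits = []
    last = {}  # thread -> index of its most recent access
    for i, (thread, kind) in enumerate(trace):
        if thread in last:
            p = last[thread]
            # Everything strictly between p and i is from other threads.
            for j in range(p + 1, i):
                if (trace[p][1], trace[j][1], kind) in UNSERIALIZABLE:
                    hits.append((p, j, i))
        last[thread] = i
    return hits

# The x++ example: T1 loads x, T2 stores x, T1 stores x.
print(find_unserializable([("T1", "r"), ("T2", "w"), ("T1", "w")]))  # [(0, 1, 2)]
print(find_unserializable([("T1", "r"), ("T2", "r"), ("T1", "w")]))  # []
```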
Order violation • The intended order between operations in two threads is not enforced. • Example: thread 1 should be invoked only after thread 2 executes wait().
Deterministic Multi-threading • Any execution of a multi-threaded program would yield the same output as long as the input remains the same.
PSet • Defined for each static memory operation. • Considers true dependences (read after write) as well as false dependences (write after write and write after read).
PSet • P is in the PSet of M iff: 1. Either P or M is a write. 2. P and M are executed in two different threads (T1, T2). 3. M is immediately dependent on P. 4. Neither T1 nor T2 executed a read or a write to the same location between P and M.
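The four conditions above can be sketched as a backward scan over a recorded trace. This is an illustrative reading rather than the authors' implementation: the trace format, the function names, and the interpretation of "immediately dependent" (no write to the same location between P and M) are all assumptions made for this example.

```python
from collections import defaultdict

def build_psets(trace):
    """trace: ordered list of (thread, op_id, kind, addr), kind "r" or "w".
    op_id names the static instruction. Returns {op_id: set of op_ids}."""
    psets = defaultdict(set)
    history = defaultdict(list)  # addr -> earlier accesses, oldest first
    for thread, op, kind, addr in trace:
        psets[op]                # ensure an entry exists even if it stays empty
        seen = set()             # threads with an access between P and M
        for p_thread, p_op, p_kind in reversed(history[addr]):
            if p_thread == thread:
                break            # M's own thread accessed addr (condition 4)
            if p_thread not in seen and "w" in (kind, p_kind):
                psets[op].add(p_op)   # conditions 1, 2 and 4 hold
            seen.add(p_thread)
            if p_kind == "w":
                break            # older accesses are not immediate (condition 3)
        history[addr].append((thread, op, kind))
    return dict(psets)

def find_violations(tested, trace):
    """Dependences (M, P) seen in a new run but absent from the tested PSets."""
    observed = build_psets(trace)
    return sorted((m, p) for m, preds in observed.items()
                  for p in preds if p not in tested.get(m, set()))

tested_run = [
    ("T1", "W1", "w", "x"), ("T2", "R2", "r", "x"), ("T2", "R3", "r", "x"),
    ("T3", "R4", "r", "x"), ("T1", "W5", "w", "x"),
]
tested = build_psets(tested_run)
print(tested["R2"], tested["R3"], tested["W5"])

# In a new run R2 reads the value written by W5, which was never tested.
new_run = [("T1", "W1", "w", "x"), ("T1", "W5", "w", "x"), ("T2", "R2", "r", "x")]
print(find_violations(tested, new_run))  # [('R2', 'W5')]
```

A runtime that enforces the constraints would respond to such a violation by stalling the access or rolling back and re-executing, as the Challenges slide notes; the sketch only reports it.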
PSet • Example
PSet • A PSet can capture the fact that an interleaving caused by a benign data race is a correct interleaving, so benign races do not hurt performance.
PSet • Instruction format with PSet information for a 32-bit ISA