Model Checking of a lock-free stack Wael Yehia York University March 31, 2010
Main Components of the Stack
• The stack itself is simply a top pointer.
• Each thread has a ThreadInfo object that uniquely identifies the thread.
• Two arrays are used for collision and (lock-free) synchronization purposes:
  • AtomicIntegerArray collision
  • AtomicReferenceArray<ThreadInfo<T>> location
• ThreadInfo<T> fields: final int id, OP op, cell<T> cell
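The components listed above could be laid out roughly as follows. This is a minimal sketch, not the authors' code: the `Cell` fields, the `OP` constants, the `EliminationStackState` class name, and the constructor parameters are assumptions made for illustration; only the two array types and the three `ThreadInfo` fields come from the slide.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Assumed operation kinds; the slide only names the type OP.
enum OP { PUSH, POP }

// Assumed shape of a stack node (the slide does not list its fields).
class Cell<T> {
    final T data;
    Cell<T> next;
    Cell(T data) { this.data = data; }
}

// Per-thread record with the three fields named on the slide.
class ThreadInfo<T> {
    final int id;   // uniquely identifies the thread
    OP op;          // operation the thread is currently attempting
    Cell<T> cell;   // cell being pushed, or slot for the popped cell
    ThreadInfo(int id, OP op, Cell<T> cell) {
        this.id = id;
        this.op = op;
        this.cell = cell;
    }
}

// Hypothetical container for the shared state of the stack.
class EliminationStackState<T> {
    final AtomicReference<Cell<T>> top = new AtomicReference<>();
    final AtomicIntegerArray collision;                 // collision slots
    final AtomicReferenceArray<ThreadInfo<T>> location; // one slot per thread id

    EliminationStackState(int maxThreads, int collisionSlots) {
        this.collision = new AtomicIntegerArray(collisionSlots);
        this.location = new AtomicReferenceArray<>(maxThreads);
    }
}
```

Indexing `location` by thread id is what lets one thread find and inspect another thread's `ThreadInfo` during a collision.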
JPF Testing
• We ran JPF (Java PathFinder) on our test cases from assignment 2, with the number of threads and operations lowered.
• We found:
  • no deadlocks
  • 1 data race (not fixed, so there may be more)
  • 3 uncaught exceptions (all fixed)
Data Races Found
• We tested with different numbers of threads and different numbers of operations per thread.
• For 2 threads, no data races were found.
  • Numbers of operations tested: 2, 3, 4, 5
• For 3 threads, a data race was found.
  • It could not be fixed.
  • The race is related to the same problem that causes the NullPointerException discussed later.
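A bounded test driver of the kind described here, with small and adjustable thread and operation counts, might look like the sketch below. It is an illustration only: `BoundedStackDriver` and `run` are hypothetical names, and `ConcurrentLinkedDeque` is a stand-in for the stack under test, not the lock-free stack from the assignment.

```java
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedStackDriver {
    // Runs `threads` workers, each doing `opsPerThread` operations, and
    // returns (elements left) - (pushes - successful pops); 0 means consistent.
    static int run(int threads, int opsPerThread) {
        // Stand-in stack; swap in the implementation under test.
        ConcurrentLinkedDeque<Integer> stack = new ConcurrentLinkedDeque<>();
        AtomicInteger pushes = new AtomicInteger();
        AtomicInteger pops = new AtomicInteger();

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < opsPerThread; i++) {
                    if ((id + i) % 2 == 0) {
                        stack.push(i);
                        pushes.incrementAndGet();
                    } else if (stack.poll() != null) {
                        pops.incrementAndGet(); // count only non-empty pops
                    }
                }
            });
        }
        for (Thread w : workers) w.start();
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return stack.size() - (pushes.get() - pops.get());
    }
}
```

Keeping both counts as parameters makes it easy to try configurations such as `run(2, 5)` or `run(3, 2)`, since the number of interleavings a model checker must explore grows quickly with either value.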
Uncaught Exceptions Found
• One NullPointerException
• Two AssertionError exceptions
• The NullPointerException and one of the assertion errors seemed to be related.
  • Both occur in the same scenario that causes the data race.
• Fixing that problem also fixed the second assertion error.
The Untested Scenario
• The null pointer and data race problems arise in the following situation, which is not accounted for in the paper. Let p stand for thread p, q for thread q, and qInfo for q's ThreadInfo. Thread p is executing pop() and thread q is executing push():
  - p collides with q
  - q sees that someone has collided with it
  - q exits normally
  - q starts another operation (either kind of operation alters qInfo.cell)
  - p reads qInfo.cell
The Untested Scenario
• In general, the scenario is as follows:
  • Two threads collide (p.pop() and q.push()).
  • The pushing thread finishes first and exits.
  • The pushing thread then starts another stack operation before the popping thread has read any data from it.
  • The popping thread wakes up and starts reading the data from q's ThreadInfo.
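The interleaving can be replayed deterministically in a single thread to show what p ends up reading. This is a simplified sketch under stated assumptions: `StaleReadReplay`, `Info`, and `Cell` are hypothetical names, q is assumed to reuse the same ThreadInfo object across operations, and the collision is modelled as a `compareAndSet` on q's slot in the location array.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class StaleReadReplay {
    enum Op { PUSH, POP }

    static final class Cell {
        final int value;
        Cell(int value) { this.value = value; }
    }

    static final class Info {
        final int id;
        Op op;
        Cell cell;
        Info(int id, Op op, Cell cell) { this.id = id; this.op = op; this.cell = cell; }
    }

    // Replays the bad interleaving; returns the value p ends up with,
    // or null when p finds qInfo.cell == null.
    static Integer replay(boolean nextOpIsPush) {
        AtomicReferenceArray<Info> location = new AtomicReferenceArray<>(2);
        Info pInfo = new Info(0, Op.POP, null);
        Info qInfo = new Info(1, Op.PUSH, new Cell(42)); // q wants to push 42
        location.set(0, pInfo);
        location.set(1, qInfo);

        // 1. p collides with q (modelled as clearing q's slot) and keeps a
        //    reference to qInfo, but does not read qInfo.cell yet.
        if (!location.compareAndSet(1, qInfo, null)) return null;

        // 2-3. q notices the collision and returns from push().

        // 4. q starts its next operation and overwrites the shared record.
        if (nextOpIsPush) {
            qInfo.op = Op.PUSH;
            qInfo.cell = new Cell(7);
        } else {
            qInfo.op = Op.POP;
            qInfo.cell = null;
        }

        // 5. p finally reads qInfo.cell and sees q's new state.
        Cell seen = qInfo.cell;
        return seen == null ? null : seen.value;
    }
}
```

In this replay, `replay(false)` leaves p with a null cell, which is where a real pop would dereference null, and `replay(true)` hands p the value 7 from q's second push instead of the 42 it collided with.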
Solution
• It is clear that the problem occurs when one thread (the popping thread p) is slow to read the data from the second thread (the pushing thread q).
• The fast thread cannot wait for the slow thread, so it has to store its data somewhere.
• Quick reminder of the collision process:
  • Two processes cannot be colliding with the same process, so their collision relation looks like this: …. q p r ….
  • State of the location array (which holds the ThreadInfo objects) during a collision, for 3 threads: q pushing, p popping, r popping, with p.id = 0, q.id = 1, r.id = 2.

Slot (thread id)          0 (p)             1 (q)             2 (r)
Array before collision:   ThreadInfo of p   ThreadInfo of q   ThreadInfo of r
Array after collision:    ThreadInfo of q   null              Don't care

[Diagram: arrows labelled "collide with" link q, p, and r.]
Solution (Cont'd)
• Solution part (a) (when q starts the collision):
  • Instead of saving its own ThreadInfo in the popping thread's slot, q creates and stores a new dummy ThreadInfo holding the data.
• Solution part (b) (when p starts the collision):
  • Before attempting the collision, p saves q's data locally.
  • If the collision succeeds, p uses the saved data; otherwise it discards it.
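Both parts of the fix can be illustrated with a deterministic single-threaded replay. This is a hedged sketch rather than the authors' code: `CollisionFixSketch`, `Info`, and `Cell` are hypothetical names, the collision is modelled as a `compareAndSet` on the location array, and q is assumed to reuse its own ThreadInfo object for its next operation.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class CollisionFixSketch {
    enum Op { PUSH, POP }

    static final class Cell {
        final int value;
        Cell(int value) { this.value = value; }
    }

    static final class Info {
        final int id;
        Op op;
        Cell cell;
        Info(int id, Op op, Cell cell) { this.id = id; this.op = op; this.cell = cell; }
    }

    // Part (a): q (pushing) starts the collision and publishes a dummy copy.
    static Integer qStartsCollision() {
        AtomicReferenceArray<Info> location = new AtomicReferenceArray<>(2);
        Info pInfo = new Info(0, Op.POP, null);
        Info qInfo = new Info(1, Op.PUSH, new Cell(42));
        location.set(0, pInfo);
        location.set(1, qInfo);

        // q stores a fresh record in p's slot, not qInfo itself.
        Info dummy = new Info(qInfo.id, qInfo.op, qInfo.cell);
        if (!location.compareAndSet(0, pInfo, dummy)) return null;

        // q moves on and reuses qInfo for its next operation.
        qInfo.op = Op.POP;
        qInfo.cell = null;

        // p wakes up late and reads the dummy, which q never touches again.
        Info seen = location.get(0);
        return seen.cell.value;
    }

    // Part (b): p (popping) starts the collision and snapshots q's data first.
    static Integer pStartsCollision() {
        AtomicReferenceArray<Info> location = new AtomicReferenceArray<>(2);
        Info pInfo = new Info(0, Op.POP, null);
        Info qInfo = new Info(1, Op.PUSH, new Cell(42));
        location.set(0, pInfo);
        location.set(1, qInfo);

        Cell saved = qInfo.cell;  // save q's data locally before colliding
        if (!location.compareAndSet(1, qInfo, null)) return null; // failed: discard

        // q notices the collision, returns, and starts another operation.
        qInfo.op = Op.POP;
        qInfo.cell = null;

        // p uses the local snapshot instead of re-reading qInfo.cell.
        return saved.value;
    }
}
```

In both replays p ends up with the value 42 that q originally pushed, even though q has already overwritten its own ThreadInfo.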
Conclusion
• JPF helped us find the problem and understand it more clearly.
• JPF caught the exceptions within seconds, or minutes at most, whereas in our own testing they appeared only once in millions of operations executed by many threads concurrently.
• The algorithm presented in the paper fails in the described scenario.
• The data race still has to be fixed.