
Model Checking of a lock-free stack

Model Checking of a lock-free stack. Wael Yehia, York University, March 31, 2010. Main Components of the Stack: The stack was simply a top pointer. Each thread has a ThreadInfo object that uniquely identifies the thread. Two arrays, for collision and (lock-free) synchronization purposes.


Presentation Transcript


  1. Model Checking of a lock-free stack • Wael Yehia, York University • March 31, 2010

  2. Main Components of the Stack • The stack itself is simply a top pointer • Each thread has a ThreadInfo object that uniquely identifies the thread • Two arrays, for collision and (lock-free) synchronization purposes: • AtomicIntegerArray collision • AtomicReferenceArray<ThreadInfo<T>> location • ThreadInfo<T> fields: final int id, OP op, Cell<T> cell
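The components listed on this slide could be declared roughly as follows. This is a minimal sketch, not the presenters' code: the outer class `StackComponentsSketch`, the `Components` holder, the `OP` constants, the shape of `Cell`, the use of `AtomicReference` for the top pointer, and the constructor parameters are all assumptions made for illustration. Only the field names `id`, `op`, `cell`, `collision`, and `location` come from the slide.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical container class; everything is nested to keep the sketch self-contained.
final class StackComponentsSketch {

    // Assumed operation kinds; the slide only names the type OP.
    enum OP { PUSH, POP }

    // Assumed node shape: a value plus a link to the node below it.
    static final class Cell<T> {
        final T data;
        final Cell<T> next;
        Cell(T data, Cell<T> next) { this.data = data; this.next = next; }
    }

    // Per-thread descriptor with the three fields named on the slide.
    static final class ThreadInfo<T> {
        final int id;   // uniquely identifies the owning thread
        OP op;          // operation the thread is currently attempting
        Cell<T> cell;   // cell being pushed, or slot for the popped cell
        ThreadInfo(int id, OP op, Cell<T> cell) {
            this.id = id;
            this.op = op;
            this.cell = cell;
        }
    }

    // The stack's shared state: a top pointer and the two arrays.
    static final class Components<T> {
        final AtomicReference<Cell<T>> top = new AtomicReference<>(null);
        final AtomicIntegerArray collision;
        final AtomicReferenceArray<ThreadInfo<T>> location;

        // maxThreads and collisionSize are assumed sizing parameters.
        Components(int maxThreads, int collisionSize) {
            this.collision = new AtomicIntegerArray(collisionSize);
            this.location = new AtomicReferenceArray<>(maxThreads);
        }
    }

    public static void main(String[] args) {
        Components<String> c = new Components<>(3, 2);
        c.location.set(1, new ThreadInfo<>(1, OP.PUSH, new Cell<>("x", null)));
        System.out.println(c.location.get(1).op);
    }
}
```

Here `location` is indexed by thread id, which matches the later slides where p.id, q.id, and r.id select array slots.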

  3. JPF Testing • We ran JPF on our test cases from assignment 2, lowering the number of threads and operations. • We found: • no deadlocks • 1 data race (not fixed, so there may be more) • 3 uncaught exceptions (all fixed)

  4. Data Races Found • Tested with different numbers of threads and operations per thread • For 2 threads, no data races were found • Tested with 2, 3, 4, and 5 operations per thread • For 3 threads, a data race was found. • It could not be fixed. • The race is related to the same problem that causes the NullPointerException discussed later.
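A driver parameterized by thread count and operations per thread, of the kind described on this slide, might be shaped like the sketch below. It is not the presenters' test code: the class `SmallConfigDriver` and the method `run` are hypothetical, and `java.util.concurrent.ConcurrentLinkedDeque` stands in for the lock-free stack under test, which is not shown in the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical driver; ConcurrentLinkedDeque is a stand-in for the stack under test.
final class SmallConfigDriver {

    // Starts `threads` workers that each push `opsPerThread` values,
    // waits for all of them, then counts how many values can be popped.
    static int run(int threads, int opsPerThread) {
        final ConcurrentLinkedDeque<Integer> stack = new ConcurrentLinkedDeque<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread w = new Thread(() -> {
                for (int i = 0; i < opsPerThread; i++) {
                    stack.push(id * opsPerThread + i);
                }
            });
            workers.add(w);
            w.start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("driver interrupted", e);
        }
        int popped = 0;
        while (stack.poll() != null) {
            popped++;
        }
        return popped;
    }

    public static void main(String[] args) {
        // Small configurations like those on the slide: 2 or 3 threads, 2 to 5 operations.
        System.out.println(run(3, 2));
    }
}
```

Keeping both parameters small matters because a model checker explores thread interleavings exhaustively, and the number of interleavings grows quickly with each additional thread or operation.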

  5. Uncaught Exceptions Found • One NullPointerException • Two AssertionError exceptions • The NullPointerException and one of the assertion errors seemed to be related. • Both occur in the same scenario that causes the data race • When the problem was fixed, so was the second assertion error

  6. The Untested Scenario • The null pointer and data race problems arise in the following situation (which is not accounted for in the paper). Let p stand for Thread p, q for Thread q, and qInfo for q’s ThreadInfo. • p calls pop() and q calls push() • p collides with q • q sees that someone has collided with it • q exits normally • q starts another operation (either operation alters qInfo.cell) • p reads qInfo.cell

  7. The Untested Scenario • In general, the scenario is as follows: • Two threads collide ( p.pop() and q.push() ) • The pushing thread q finishes first and exits. • q then starts another stack operation before the popping thread p has read any data from it. • The popping thread wakes up and starts reading the data from q’s ThreadInfo, which by now belongs to q’s new operation
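The interleaving above can be replayed step by step on a single thread, which makes the stale read deterministic. This is only an illustration: the class `StaleReadSketch`, the values 42 and 7, and the assumption that a new pop clears `cell` to null while a new push installs a fresh cell are all hypothetical. The slides state only that either operation alters qInfo.cell.

```java
// Hypothetical single-threaded replay of the interleaving; not the presenters' code.
final class StaleReadSketch {

    static final class Cell {
        final int data;
        Cell(int data) { this.data = data; }
    }

    static final class ThreadInfo {
        final int id;
        String op;   // "PUSH" or "POP"
        Cell cell;
        ThreadInfo(int id, String op, Cell cell) {
            this.id = id;
            this.op = op;
            this.cell = cell;
        }
    }

    // Returns the value p ends up reading, or null if p finds no cell at all.
    static Integer replay(boolean qNextOpIsPush) {
        // q is pushing 42 (assumed value) and publishes its ThreadInfo.
        ThreadInfo qInfo = new ThreadInfo(1, "PUSH", new Cell(42));

        // Step 1: p.pop() collides with q.push(). p now holds a reference
        // to qInfo but has not yet read qInfo.cell.
        ThreadInfo seenByP = qInfo;

        // Step 2: q sees the collision and exits normally.
        // Step 3: q starts another operation and reuses the same ThreadInfo.
        if (qNextOpIsPush) {
            qInfo.op = "PUSH";
            qInfo.cell = new Cell(7);   // assumed: a new push installs a new cell
        } else {
            qInfo.op = "POP";
            qInfo.cell = null;          // assumed: a new pop clears the cell
        }

        // Step 4: p wakes up and reads the cell through its old reference.
        Cell c = seenByP.cell;
        return (c == null) ? null : Integer.valueOf(c.data);
    }

    public static void main(String[] args) {
        System.out.println(replay(false));
        System.out.println(replay(true));
    }
}
```

Under these assumptions p either finds no cell, which in real code would surface as a NullPointerException on dereference, or reads the value of q's later push instead of the 42 it collided with.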

  8. Solution • The problem clearly occurs when one thread (the popping thread p) is slow to read the data from the second thread (the pushing thread q). • The fast thread cannot wait for the slow thread, so it has to store its data somewhere. • Quick reminder of the collision process: • Two processes cannot be colliding with the same process, so their collision relation looks like this: … q p r … (arrows labeled "collide with" in the original diagram) • State of the location array (which holds the ThreadInfos) during a collision, for 3 threads: q pushing, p popping, r popping; p.id = 0, q.id = 1, r.id = 2 • Array before collision: [ThreadInfo of p, ThreadInfo of q, ThreadInfo of r] • Array after collision: [ThreadInfo of q, null, don’t care]

  9. Solution (Cont’d) • Solution part (a) (when q starts the collision): • instead of saving its own ThreadInfo in the popping thread’s slot, • create and store a new dummy ThreadInfo holding the data • Solution part (b) (when p starts the collision): • Before attempting the collision, save q’s data locally • When the collision succeeds, use it; otherwise discard it
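Both parts of the fix can be sketched against an AtomicReferenceArray of ThreadInfo objects. The code below is a hedged illustration of the two steps as the slide words them, not the presenters' implementation: the class `CollisionFixSketch`, the methods `pusherCollides` and `popperCollides`, and the simplified `ThreadInfo` are hypothetical, and the sketch makes no claim that these steps alone make the full algorithm correct.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch of solution parts (a) and (b); names are illustrative only.
final class CollisionFixSketch {

    static final class Cell {
        final int data;
        Cell(int data) { this.data = data; }
    }

    static final class ThreadInfo {
        final int id;
        final boolean push;
        volatile Cell cell;
        ThreadInfo(int id, boolean push, Cell cell) {
            this.id = id;
            this.push = push;
            this.cell = cell;
        }
    }

    // Part (a): pusher q starts the collision. Instead of publishing its own
    // ThreadInfo in p's slot, it publishes a fresh dummy that holds the data,
    // so later changes to q's ThreadInfo cannot affect what p reads.
    static boolean pusherCollides(AtomicReferenceArray<ThreadInfo> location,
                                  ThreadInfo q, ThreadInfo p) {
        ThreadInfo dummy = new ThreadInfo(q.id, true, q.cell);
        return location.compareAndSet(p.id, p, dummy);
    }

    // Part (b): popper p starts the collision. It saves q's data locally
    // before the CAS, uses the saved copy on success, and discards it on failure.
    static Cell popperCollides(AtomicReferenceArray<ThreadInfo> location,
                               ThreadInfo p, ThreadInfo q) {
        Cell saved = q.cell;
        if (location.compareAndSet(q.id, q, p)) {
            return saved;
        }
        return null;
    }

    public static void main(String[] args) {
        AtomicReferenceArray<ThreadInfo> location = new AtomicReferenceArray<>(3);
        ThreadInfo p = new ThreadInfo(0, false, null);
        ThreadInfo q = new ThreadInfo(1, true, new Cell(42));
        location.set(0, p);
        location.set(1, q);
        System.out.println(pusherCollides(location, q, p));
        q.cell = null; // q moves on to another operation
        System.out.println(location.get(0).cell.data);
    }
}
```

In both cases the popping thread ends up reading from storage that q no longer writes to, so q is free to start its next operation immediately.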

  10. Conclusion • JPF helped us find and understand the problem more clearly. • JPF caught the exceptions in seconds, or minutes at most, whereas in our own testing they appeared only once in millions of operations executed by many threads concurrently. • The described scenario causes the algorithm presented in the paper to fail. • The data race still has to be fixed
