1 / 23

Iterative Context Bounding for Systematic Testing of Multithreaded Programs

Iterative Context Bounding for Systematic Testing of Multithreaded Programs. Madan Musuvathi Shaz Qadeer Microsoft Research. Testing multithreaded programs is HARD. Specific thread interleavings expose subtle errors Testing often misses these errors

shakti
Download Presentation

Iterative Context Bounding for Systematic Testing of Multithreaded Programs

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Iterative Context Bounding for Systematic Testing of Multithreaded Programs Madan Musuvathi Shaz Qadeer Microsoft Research

  2. Testing multithreaded programs is HARD • Specific thread interleavings expose subtle errors • Testing often misses these errors • Even when found, errors are hard to debug • No repeatable trace • Source of the bug is far away from where it manifests

  3. Current practice • Concurrency testing == Stress testing • Example: testing a concurrent queue • Create 100 threads performing queue operations • Run for days/weeks • Pepper the code with sleep( random() ) • Stress increases the likelihood of rare interleavings • Makes any error found hard to debug

  4. CHESS: Unit testing for concurrency • Example: testing a concurrent queue • Create 1 reader thread and 1 writer thread • Exhaustively try all thread interleavings • Run the test repeatedly on a specialized scheduler • Explore a different thread interleaving each time • Use model checking techniques to avoid redundancy • Check for assertions and deadlocks in every run • The error-trace is repeatable

  5. State space explosion Thread 1 Thread 2 x = 1; y = 1; x = 2; y = 2; Init state: x = 0, y = 0 0,0 2,0 1,0 x = 1; 1,0 2,2 1,1 2,0 y = 1; x = 2; 1,2 1,2 2,1 2,1 1,1 2,2 y = 2; 1,2 1,1 1,1 2,1 2,2 2,2

  6. State space explosion • Number of executions = O( nnk ) • Exponential in both n and k • Typically: n < 10 k > 100 • Limits scalability to large programs (large k) Thread 1 Thread n x = 1; … … … … … y = 1; x = 2; … … … … … y = 2; … k steps each n threads

  7. Techniques • Iterative context bounding • Strategy for searching large state spaces • State space optimization • Reduces the size of the state space

  8. Iterative context bounding • Prioritize executions with small number of preemptions • Two kinds of context switches: • Preemptions – forced by the scheduler • e.g. Time-slice expiration • Non-preemptions – a thread voluntarily yields • e.g. Blocking on an unavailable lock, thread end Thread 1 Thread 2 x = 1; if (p != 0) { x = p->f; } x = 1; if (p != 0) { p = 0; preemption x = p->f; } non-preemption

  9. Iterative context-bounding algorithm • The scheduler has a budget of c preemptions • Nondeterministically choose the preemption points • Resort to non-preemptive scheduling after c preemptions • Run each thread to the next yield point • Once all executions explored with c preemptions • Try with c+1 preemptions • Iterative context-bounding has desirable properties • Property 0: Easy to implement

  10. Property 1: Polynomial state space • n threads, k steps each, c preemptions • Number of executions <= nkCc . (n+c)! = O( (n2k)c. n! ) Exponential in n and c, but not in k Thread 1 Thread 2 • Choose c preemption points x = 1; … … … … … y = 1; x = 1; … … … … x = 2; … … … … … y = 2; x = 2; … … … • Permute n+c atomic blocks … … … y = 1; y = 2;

  11. Property 2: Deep exploration possible with small bounds • A context-bounded execution has unbounded depth • A thread may execute unbounded number of steps within each context • Can reach a terminating state from an arbitrary state with zero preemptions • Perform non-preemptive scheduling • Leave the number of non-preemptions unbounded

  12. Property 3: Coverage metric • If search terminates with c preemptions, • any remaining error must require at least c+1 preemptions • Intuitive estimate for • the complexity of the bugs remaining in the program • the chance of their occurrence in practice

  13. Property 4: Finds the ‘simplest’ error trace • Finds the smallest number of preemptions to the error • Number of preemptions better metric of error complexity than execution length

  14. Property 5: Lots of bugs with small number of preemptions

  15. Most states are covered with small number of preemptions

  16. Coverage vs Time (Dryad)

  17. Techniques • Iterative context-bounding • Strategy for searching large state spaces • State space optimization

  18. Insert context-switches only at synchronization points Massive state-space reduction Num steps (k) = num synch. operations (not memory accesses) Run data-race detection to check race-free assumption Goldilocks algorithm [PLDI ’07] implemented for x86 Theorem: When search terminates for context-bound c Either find an erroneous execution Or find a data-race Or the program has no errors reachable with c preemptions Optimization for race-free programs

  19. Conclusion • Iterative context-bounding algorithm • Effective search strategy for multi-threaded bugs • Exposes many concurrency bugs • Implemented in the CHESS model checking tool • Applying CHESS to Windows drivers, SQL, Cosmos, Singularity • Visit http://research.microsoft.com/projects/CHESS/

  20. Extra Slides

  21. Partial-order reduction • Many thread interleavings are equivalent • Accesses to separate memory locations by different threads can be reordered • Avoid exploring equivalent thread interleavings T1: x := 1 T2: y := 2 T2: y := 2 T1: x := 1

  22. Optimistic dynamic partial-order reduction • Algorithm [Bruening ‘99] : • Assume the program is data-race free • Context switch only at synchronization points • Check for data-races in each execution • Theorem [Stoller ‘00] : • If the algorithm terminates without reporting races • Then the program has no assertion failures • Massive reduction: • k = number of synchronization accesses (not memory accesses)

  23. Combining with context-bounding • Algorithm: • Assume the program is data-race free • Context switch only at synchronization points • Explore executions with c preemptions • Check for data-races in each execution • Theorem: • If the algorithm terminates without reporting races, • Then the program has no assertion failures reachable with c preemptions • Requires that a thread can block only at synchronization points

More Related