1 / 47

Deadlock Detection

Deadlock Detection. Nov 26, 2012 CS 8803 FPL. Part I. Static Deadlock Detection Reference: Effective Static Deadlock Detection [ICSE’09]. What is a Deadlock?. An unintended condition in a shared-memory, multi-threaded program in which: a set of threads blocks forever

kale
Download Presentation

Deadlock Detection

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Deadlock Detection Nov 26, 2012CS 8803 FPL

  2. Part I • Static Deadlock Detection • Reference:Effective Static Deadlock Detection [ICSE’09]

  3. What is a Deadlock? • An unintended condition in a shared-memory, multi-threaded program in which: • a set of threads blocks forever • because each thread in the set waits to acquire alock being held by another thread in the set • This work: ignore other causes (e.g., wait/notify) • Example l1 // Thread t2 sync (l2) { sync (l1) { … } } // Thread t1 sync (l1) { sync (l2) { … } } t1 t2 l2

  4. Motivation • Today’s concurrent programs are rife with deadlocks • 6,500/198,000 (~ 3%) of bug reports in Sun’s bug database at http://bugs.sun.com are deadlocks • Deadlocks are difficult to detect • Usually triggered non-deterministically, on specific thread schedules • Fail-stop behavior not guaranteed (some threads may be deadlocked while others continue to run) • Fixing other concurrency bugs like races can introduce new deadlocks • Our past experience with reporting races: developers oftenask for deadlock checker

  5. Previous Work • Based on finding cycles in program’s dynamic or static lock order graph • Dynamic approaches • Inherently unsound • Inapplicable to open programs • Ineffective without sufficient test input data • Static approaches • Type systems (e.g., Boyapati-Lee-RinardOOPSLA’02) • Annotation burden often significant • Model checking (e.g., SPIN) • Does not currently scale beyond few KLOC • Dataflow analysis (e.g., Engler & Ashcraft SOSP’03;Williams-Thies-Ernst ECOOP’05) • Scalable but highly imprecise l1 t1 t2 l2

  6. Challenges to Static Deadlock Detection l = abstract lock acq. t = abstract thread l1 l1 l4 • Deadlock freedom is a complex property • can t1,t2 denote different threads? • can l1,l4 denote same lock? • can t1 acquire locks l1->l2? • some more … t1 t1 t2 t2 l2 l2 l3

  7. Our Rationale l1 l1 l4 • Deadlock freedom is a complex property • can t1,t2 denote different threads? • can l1,l4 denote same lock? • can t1 acquire locks l1->l2? • some more … t1 t1 t2 t2 l2 l2 l3

  8. Our Rationale l1 l1 l4 • Existing static deadlock checkers cannot check all conditions simultaneously and effectively • But each condition can be checked separately and effectively using existing static analyses t1 t1 t2 t2 l2 l2 l3

  9. Our Approach l1 l1 l4 • Consider all candidate deadlocks in closed program • Check each of six necessary conditions for each candidateto be a deadlock • Report candidates that satisfy all six conditions • Note: Finds only deadlocks involving 2 threads/locks • Deadlocks involving > 2 threads/locks rare in practice • may-reach(t1,l1,l2)? • may-alias(l1,l4)? t1 t1 t2 t2 • ... l2 l2 l3

  10. Example: jdk1.4 java.util.logging class LogManager { static LogManager manager =newLogManager(); 155: Hashtable loggers = new Hashtable(); 280: syncbooleanaddLogger(Logger l) { String name = l.getName(); if (!loggers.put(name, l)) return false; // ensure l’s parents are instantiated for (...) { String pname = ...; 314: Logger.getLogger(pname); } return true; } 420: sync Logger getLogger(String name) { return (Logger) loggers.get(name); } } class Logger { 226: static sync Logger getLogger(String name) { LogManager lm = LogManager.manager; 228: Logger l = lm.getLogger(name); if (l == null) { l = new Logger(...); 231: lm.addLogger(l); } return l; } } l4 l1 l3 class Harness { static void main(String[] args) { 11: new Thread() { void run() { 13: Logger.getLogger(...); }}.start(); 16: new Thread() { void run() { 18: LogManager.manager.addLogger(...); }}.start(); } } t1 l2 t2

  11. Example Deadlock Report *** Stack trace of thread <Harness.java:11>: LogManager.addLogger (LogManager.java:280) - this allocated at <LogManager.java:155> - waiting to lock {<LogManager.java:155>} Logger.getLogger (Logger.java:231) - holds lock {<Logger.java:0>} Harness$1.run (Harness.java:13) *** Stack trace of thread <Harness.java:16>: Logger.getLogger (Logger.java:226) - waiting to lock {<Logger.java:0>} LogManager.addLogger (LogManager.java:314) - this allocated at <LogManager.java:155> - holds lock {<LogManager.java:155>} Harness$2.run (Harness.java:18)

  12. Our Approach • Six necessary conditions identified experimentally • Checked using four incomplete but sound whole-program static analyses } • Reachable • Aliasing • Escaping • Parallel • Non-reentrant • Non-guarded • Relatively language independent • Incomplete but sound checks • Widely-used Java locking idioms • Incomplete and unsound checks • - sound needs must-alias analysis } • Call-graph analysis • May-alias analysis • Thread-escape analysis • May-happen-in-parallel analysis

  13. Condition 1: Reachable l1 l1 l4 l4 • Property: In some execution: • can a thread abstracted by t1 reach l1 • and after acquiring lock at l1, proceed to reach l2 while holding that lock? • and similarly for t2, l3, l4 • Solution: Use call-graph analysis • k-object-sensitive [Milanova-Rountev-Ryder ISSTA’03] t1 t1 t2 t2 l2 l2 l3 l3

  14. Example: jdk1.4 java.util.logging class LogManager { static LogManager manager =newLogManager(); 155: Hashtable loggers = new Hashtable(); 280: syncbooleanaddLogger(Logger l) { String name = l.getName(); if (!loggers.put(name, l)) return false; // ensure l’s parents are instantiated for (...) { String pname = ...; 314: Logger.getLogger(pname); } return true; } 420: sync Logger getLogger(String name) { return (Logger) loggers.get(name); } } class Logger { 226: static sync Logger getLogger(String name) { LogManager lm = LogManager.manager; 228: Logger l = lm.getLogger(name); if (l == null) { l = new Logger(...); 231: lm.addLogger(l); } return l; } } l4 l1 l3 class Harness { static void main(String[] args) { 11: new Thread() { void run() { 13: Logger.getLogger(...); }}.start(); 16: new Thread() { void run() { 18: LogManager.manager.addLogger(...); }}.start(); } } t1 l2 t2

  15. Condition 2: Aliasing l1 l1 l4 • Property: In some execution: • can a lock acquired at l1 be the same as a lock acquired at l4? • and similarly for l2, l3 • Solution: Use may-alias analysis • k-object-sensitive [Milanova-Rountev-Ryder ISSTA’03] t1 t2 l2 l2 l3

  16. Example: jdk1.4 java.util.logging class LogManager { static LogManager manager =newLogManager(); 155: Hashtable loggers = new Hashtable(); 280: syncbooleanaddLogger(Logger l) { String name = l.getName(); if (!loggers.put(name, l)) return false; // ensure l’s parents are instantiated for (...) { String pname = ...; 314: Logger.getLogger(pname); } return true; } 420: sync Logger getLogger(String name) { return (Logger) loggers.get(name); } } class Logger { 226: static sync Logger getLogger(String name) { LogManager lm = LogManager.manager; 228: Logger l = lm.getLogger(name); if (l == null) { l = new Logger(...); 231: lm.addLogger(l); } return l; } } l4 l1 l3 class Harness { static void main(String[] args) { 11: new Thread() { void run() { 13: Logger.getLogger(...); }}.start(); 16: new Thread() { void run() { 18: LogManager.manager.addLogger(...); }}.start(); } } t1 l2 t2

  17. Condition 3: Escaping l1 l1 l4 l4 • Property: In some execution: • can a lock acquired at l1 be thread-shared? • and similarly for each of l2, l3, l4 • Solution: Use thread-escape analysis t1 t2 l2 l2 l3 l3

  18. Example: jdk1.4 java.util.logging class LogManager { static LogManager manager =newLogManager(); 155: Hashtable loggers = new Hashtable(); 280: syncbooleanaddLogger(Logger l) { String name = l.getName(); if (!loggers.put(name, l)) return false; // ensure l’s parents are instantiated for (...) { String pname = ...; 314: Logger.getLogger(pname); } return true; } 420: sync Logger getLogger(String name) { return (Logger) loggers.get(name); } } class Logger { 226: static sync Logger getLogger(String name) { LogManager lm = LogManager.manager; 228: Logger l = lm.getLogger(name); if (l == null) { l = new Logger(...); 231: lm.addLogger(l); } return l; } } l4 l1 l3 class Harness { static void main(String[] args) { 11: new Thread() { void run() { 13: Logger.getLogger(...); }}.start(); 16: new Thread() { void run() { 18: LogManager.manager.addLogger(...); }}.start(); } } t1 l2 t2

  19. Condition 4: Parallel l1 l1 l4 l4 • Property: In some execution: • can different threads abstracted by t1 and t2 • simultaneously reach l2 and l4? • Solution: Use may-happen-in-parallel analysis • Does not model full happens-before relation • Models only thread fork construct • Other conditions model other constructs ≠ t1 t1 t2 t2 l2 l2 l3 l3

  20. Example: jdk1.4 java.util.logging class LogManager { static LogManager manager =newLogManager(); 155: Hashtable loggers = new Hashtable(); 280: syncbooleanaddLogger(Logger l) { String name = l.getName(); if (!loggers.put(name, l)) return false; // ensure l’s parents are instantiated for (...) { String pname = ...; 314: Logger.getLogger(pname); } return true; } 420: sync Logger getLogger(String name) { return (Logger) loggers.get(name); } } class Logger { 226: static sync Logger getLogger(String name) { LogManager lm = LogManager.manager; 228: Logger l = lm.getLogger(name); if (l == null) { l = new Logger(...); 231: lm.addLogger(l); } return l; } } l4 l1 l3 class Harness { static void main(String[] args) { 11: new Thread() { void run() { 13: Logger.getLogger(...); }}.start(); 16: new Thread() { void run() { 18: LogManager.manager.addLogger(...); }}.start(); } } t1 l2 t2

  21. Benchmarks

  22. Experimental Results

  23. Individual Analysis Contributions

  24. Conclusion • Novel approach to static deadlock detection for Java • Checks six necessary conditions for a deadlock • Uses four off-the-shelf static analyses • Neither sound nor complete, but effective in practice • Applied to suite of 14 multi-threaded Javaprograms comprising over 1.5 MLOC • Found all known deadlocks as well as previously unknown ones, with few false alarms

  25. Part II • Dynamic Deadlock Detection • Reference:An Effective Dynamic Analysis Technique for Detecting Generalized Deadlocks [FSE’10]

  26. Motivation • Most previous deadlock detection work has focused on resource deadlocks • Example // Thread T1 // Thread T2 sync(L1) { sync(L2) { sync(L2) { sync(L1) { …. …. } } } } L1 T1 T2 L2

  27. Motivation • Other kinds of deadlocks, e.g. communication deadlocks, are equally notorious • Example // Thread T1 // Thread T2 if (!b) { b = true; sync(L) { sync(L) { L.wait(); L.notify(); } } } T1 T2 if(!b) b = true notify L wait L b is initially false

  28. Goal • Build a dynamic analysis based tool that: • detects communication deadlocks • scales to large programs • has low false positive rate

  29. Our Initial Effort • Take cue from existing dynamic analyses for other concurrency errors • Existing dynamic analyses check for violation of a programming idiom • Races: • every shared variable is consistently protected by a lock • Resource deadlocks: • no cycle in lock ordering graph • Atomicity violations: • atomic blocks should have the pattern (R+B)*N(L+B)*

  30. Our Initial Effort • What programming idiom should we check for communication deadlocks?

  31. Our Initial Effort • Recommended usage of condition variables // Thread T1 // Thread T2 sync (L) { sync (L) { while (!cond) cond = true; L.wait(); L.notifyAll(); assert (cond == true); } }

  32. An Example • Recommended usage of condition variables // Thread T1 // Thread T2 sync (L) { sync (L) { while (list.isEmpty()) list.add(...); L.wait(); L.notifyAll(); … = list.remove(); } }

  33. Violation of Idiom as Deadlock Must use while, not if Accesses to b must be inside sync • Example // Thread T1 // Thread T2 if (!b) b = true; sync (L) L.notifyAll(); sync (L)L.wait();

  34. Satisfaction of Idiom as Deadlock No violation of idiom, but still deadlocks! • Example // Thread T1 // Thread T2 sync (L1) sync (L2) while (!b) L2.wait(); sync (L1) sync (L2) L2.notifyAll(); => Recommended usage pattern (or idiom) based checking does not work

  35. Revisiting Existing Analyses • Relax the dependencies between relevant events from different threads • verify all possible event orderings for errors • use data structures to check idioms (vector clocks, lock-graphs etc.) to implicitly verify all event orderings

  36. Revisiting Existing Analyses • Programming idiom-based checking does not workfor communication deadlocks • Nevertheless, we can explicitly verify all orderings of relevant events for deadlocks

  37. Trace Program T2 T1 // Thread T1 // Thread T2 if (!b) { b = true; sync (L) { sync (L) { L.wait (); L.notify (); } } } b is initially false lock L wait L lock L notify L unlock L unlock L

  38. Trace Program • Thread T1 { • lock L; • wait L; • unlock L; • } T2 T1 lock L wait L lock L • Thread T2 { • lock L; • notify L; • unlock L; • } notify L unlock L unlock L

  39. Trace Program • Thread T1 { Thread T2 {lock L; lock L; • wait L; || notify L; • unlock L; unlock L; • } } T2 T1 lock L wait L lock L notify L unlock L unlock L

  40. Trace Program • Built out of only a subset of events • usually much smaller than the original program • Throws away a lot of dependencies between threads • could give false positives • but increases coverage

  41. Trace Program: Add Dependencies T2 T1 // Thread T1 // Thread T2 if (!b) { b = true; sync (L) { sync (L) { L.wait (); L.notify (); } } } b is initially false if (!b) lock L wait L unlock L b = true lock L notify L unlock L

  42. Trace Program: Add Dependencies • Thread T1 { • if (!b) { • lock L; • wait L; • unlock L; • } • } T2 T1 if (!b) lock L wait L • Thread T2 { • b = true; • lock L; • notify L; • unlock L; • } unlock L b = true lock L notify L unlock L

  43. Trace Program: Add Predictivity • Use static analysis to add to the predictive power of the trace program • Thread T1 { • if (!b) { • lock L; • wait L; • unlock L; • } • } // Thread T1 // Thread T2 @ !b => L.wait() if (!b) { b = true; sync (L) { sync (L) { L.wait (); L.notify (); } } } b is initially false

  44. Trace Program: Other Errors • Effective for concurrency errors that cannot be detected using a programming idiom • communication deadlocks, deadlocks due of exceptions, … // Thread T1 // Thread T2 while (!b) { try { sync (L) { foo(); L.wait(); b = true; } sync (L) { L.notify(); } } } catch (Exception e) {…} b is initially false • can throw an • exception

  45. Implementation and Evaluation • Implemented for deadlock detection • both communication and resource deadlocks • Built a prototype tool for Java called CHECKMATE • Applied to several Java libraries and applications • log4j, pool, felix, lucene, jgroups, jruby.... • Found both previously known and unknown deadlocks (17 in total)

  46. Conclusion • CHECKMATE is a novel dynamic analysis for finding deadlocks • both resource and communication deadlocks • Effective on several real-world Java benchmarks • Trace program based approach is generic • can be applied to other errors, e.g. deadlocks because of exceptions

  47. Did Not Cover Today … • Deadlock Detection in Message-Passing Programs • must model many variants of message sends/receives • Dynamic Deadlock Avoidance • unique to deadlock errors (cannot, e.g., “avoid” buffer overruns) • see Dimmunix OSDI’08 paper (http://dimmunix.epfl.ch/) • Dynamic Deadlock Detection by Controlling Thread Schedules • CHESS (http://research.microsoft.com/en-us/projects/chess/) • CalFuzzer (http://srl.cs.berkeley.edu/~ksen/calfuzzer/) • Type-based Deadlock Detection • statically check lock-order graph (see OOPSLA’02 paper)

More Related