Learning From Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics • Shan Lu, Soyeon Park, Eunsoo Seo and Yuanyuan Zhou • Appeared in ASPLOS’08 • Presented by Michelle Goodstein, LBA Reading Group, 3/27/08
Introduction • Multi-core computers are common • More programmers have to write concurrent programs • Concurrent programs have different bugs than sequential programs • However, without a study, it is hard to know what those bugs are • First real-world study of concurrency bugs
Introduction • Knowing the types of concurrency bugs that actually occur in software will: • Help create better bug detection schemes • Inform the testing process software goes through • Provide information to programming language designers
Introduction • Current state of affairs • Reproducing concurrency bugs is difficult • Test cases are critical to being able to diagnose a bug • Most detection research focuses on: • data races • deadlock bugs • some new work on detecting atomicity violations • Few studies on real-world concurrency bugs • Most use programs that were buggy by design for the study • Most studies on bug characteristics focus on non-concurrent bugs
Methodology • 4 representative open-source applications: • MySQL • Apache • Mozilla • OpenOffice • Each application has • 9-13 years of development history • 1-4 million lines of code
Methodology • Randomly selected bugs from bug databases that contained at least one keyword related to concurrency (e.g., “race”, “concurrency”, “deadlock”, “synchronization”) • From these, randomly chose 500 bugs that have • Root causes explained well and in detail • Source code available • Bug fix info available
Methodology • Removed any bugs not truly caused by concurrency • Result: 105 concurrency bugs (74 non-deadlock, 31 deadlock) • Separate study of deadlock and non-deadlock bugs
Methodology • Evaluated bugs in 3 dimensions • Bug pattern: {atomicity-violation, order-violation, other} • Manifestation: required conditions for bug to occur, # threads involved, # variables, # accesses • Bug fix strategy: look at final patch, mistakes in intermediate patches, and whether transactional memory (TM) can help • Results organized as a collection of findings
Motivation • 34/105 concurrency bugs cause program crashes • 37/105 concurrency bugs cause programs to hang • Concurrency bugs are important
Findings: Bug Patterns • Atomicity Violation • Order Violation
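A minimal Python sketch of an atomicity violation, not taken from the study. The `Account` class and the function names are hypothetical. A check and the update that follows it are meant to execute as one atomic step, but another thread modifies the shared value in between. `threading.Event` objects force the bad interleaving so the outcome is deterministic.

```python
import threading

class Account:
    """Hypothetical shared object; all names here are illustrative."""
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

def withdraw_buggy(acct, amount, checked, resume):
    # The check and the update are intended to be atomic, but are not.
    if acct.balance >= amount:
        checked.set()      # tell the main thread the check has passed
        resume.wait()      # pause so another withdrawal can interleave
        acct.balance -= amount

def run_buggy():
    acct = Account(100)
    checked, resume = threading.Event(), threading.Event()
    t = threading.Thread(target=withdraw_buggy,
                         args=(acct, 100, checked, resume))
    t.start()
    checked.wait()         # thread t is now between its check and its update
    acct.balance -= 100    # interleaved withdrawal from the main thread
    resume.set()
    t.join()
    return acct.balance    # -100: the check no longer held when t updated

def withdraw_fixed(acct, amount):
    # Holding the lock across check and update restores atomicity.
    with acct.lock:
        if acct.balance >= amount:
            acct.balance -= amount
            return True
        return False
```

In the fixed version a second withdrawal of 100 from a balance of 100 is refused instead of driving the balance negative.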
Findings: Bug Patterns • Most (72/74) of the examined non-deadlock concurrency bugs are either atomicity violations or order violations • Focusing on atomicity and order violations should detect most non-deadlock concurrency bugs • In fact, 24/74 are order violations • Since current tools don’t address order violations, new tools must be developed
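A hedged Python sketch of an order violation, with hypothetical names throughout. One thread uses shared state that another thread is expected to initialize first, but nothing enforces that order. The `may_init` event only models an unlucky schedule; the `ready` event is the fix.

```python
import threading

def run(enforce_order):
    shared = {"table": None}          # hypothetical shared state
    ready = threading.Event()         # set once initialization is done
    may_init = threading.Event()      # models a delayed initializer
    result = []

    def initializer():
        may_init.wait()
        shared["table"] = {"key": 1}
        ready.set()

    def user():
        if enforce_order:
            ready.wait()              # fix: wait for the initializer
        table = shared["table"]
        result.append("ok" if table is not None else "used before init")

    a = threading.Thread(target=initializer)
    b = threading.Thread(target=user)
    a.start()
    b.start()
    if not enforce_order:
        b.join()                      # the user finishes before init starts
    may_init.set()
    b.join()
    a.join()
    return result[0]
```

`run(False)` returns `"used before init"` and `run(True)` returns `"ok"`. Wrapping the two accesses in a lock would not help here, because a lock does not say which thread goes first.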
Findings: Bug Manifestations • Most (101/105) bugs involved ≤ 2 threads • Most communication happens among a small number of threads • Enforcing certain partial orderings among a small number of threads can expose bugs • Heavy workloads can increase competition for resources, and make it more likely to observe a partial ordering that causes a bug • Pairwise testing can find many bugs
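One way to picture pairwise testing is to enumerate every interleaving of two threads' steps and check the result of each. The sketch below is an illustration under assumptions, not the paper's tooling. It models two threads that each perform an unsynchronized read-then-write increment, and replays each schedule in a single thread.

```python
from itertools import combinations

def interleavings(n_a, n_b):
    """Yield every merge of A's and B's steps that keeps program order."""
    total = n_a + n_b
    for a_slots in combinations(range(total), n_a):
        slots = set(a_slots)
        yield ["A" if i in slots else "B" for i in range(total)]

def run_schedule(schedule):
    """Each thread does two steps: tmp = counter; counter = tmp + 1."""
    counter = 0
    tmp = {"A": None, "B": None}
    step = {"A": 0, "B": 0}
    for t in schedule:
        if step[t] == 0:
            tmp[t] = counter          # step 1: read
        else:
            counter = tmp[t] + 1      # step 2: write
        step[t] += 1
    return counter

schedules = list(interleavings(2, 2))
buggy = [s for s in schedules if run_schedule(s) != 2]
print(len(schedules), len(buggy))     # 6 4
```

Only the two schedules in which one thread finishes before the other starts give the correct count of 2; the other four lose an update.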
Findings: Bug Manifestations • Some (7/31) deadlock bugs involve only 1 thread! • Easy to detect/avoid
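A single-thread deadlock typically means a thread waits for a resource it already holds. The Python sketch below (the helper name `reacquire` is hypothetical) shows this with a non-reentrant `threading.Lock`, using a non-blocking attempt so the example does not actually hang.

```python
import threading

def reacquire(lock):
    """Return True if the holding thread can take `lock` a second time."""
    with lock:
        # A blocking acquire here would wait forever on a plain Lock,
        # so the sketch probes with a non-blocking attempt instead.
        got_it = lock.acquire(blocking=False)
        if got_it:
            lock.release()
        return got_it

print(reacquire(threading.Lock()))    # False: would self-deadlock
print(reacquire(threading.RLock()))   # True: reentrant lock allows it
```

A reentrant lock is one possible remedy; whether it is appropriate depends on the code, which this sketch does not model.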
Findings: Bug Manifestations • Many (49/74) non-deadlock bugs involve 1 variable. However, 34% involve ≥ 2 variables • Focusing on 1 variable is a good simplification • However, new tools are also necessary to discover multi-variable concurrency bugs
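A hedged sketch of a multi-variable bug, using a hypothetical `Buffer` whose `count` is meant to always equal `len(items)`. Every individual access is protected by the lock, so a single-variable race detector would see nothing wrong, yet a reader can observe the two variables out of sync. Events force the interleaving.

```python
import threading

class Buffer:
    """Hypothetical structure: `count` should always equal len(items)."""
    def __init__(self):
        self.items = []
        self.count = 0
        self.lock = threading.Lock()

def append_buggy(buf, value, midway, resume):
    with buf.lock:
        buf.items.append(value)   # first variable updated
    midway.set()
    resume.wait()                 # window in which the variables disagree
    with buf.lock:
        buf.count += 1            # second variable updated too late

def append_fixed(buf, value):
    with buf.lock:                # both variables change in one critical section
        buf.items.append(value)
        buf.count += 1

def observe_buggy():
    buf = Buffer()
    midway, resume = threading.Event(), threading.Event()
    t = threading.Thread(target=append_buggy,
                         args=(buf, "x", midway, resume))
    t.start()
    midway.wait()
    with buf.lock:
        snapshot = (len(buf.items), buf.count)
    resume.set()
    t.join()
    return snapshot               # (1, 0): inconsistent view
```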
Findings: Bug Manifestations • Most (30/31) deadlock bugs involved ≤ 2 resources • Pairwise testing of order among obtained and released resources should help reveal deadlocks
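Pairwise checking of resource order can be sketched as a search for lock-order inversions. The function names and input format below are assumptions made for illustration: each thread is described by the order in which it acquires locks that it then holds at the same time.

```python
from itertools import combinations

def order_pairs(acquisition_order):
    """All (earlier, later) pairs in one thread's nested lock order."""
    return set(combinations(acquisition_order, 2))

def find_inversions(orders_by_thread):
    """Return (thread1, thread2, x, y) where thread1 takes x before y
    and thread2 takes y before x."""
    inversions = []
    threads = sorted(orders_by_thread.items())
    for (t1, o1), (t2, o2) in combinations(threads, 2):
        pairs2 = order_pairs(o2)
        for x, y in sorted(order_pairs(o1)):
            if (y, x) in pairs2:
                inversions.append((t1, t2, x, y))
    return inversions

print(find_inversions({"T1": ["A", "B"], "T2": ["B", "A"]}))
# [('T1', 'T2', 'A', 'B')]
```

An inversion is only a potential deadlock: whether it manifests depends on timing, which is why forcing the order in a test helps.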
Findings: Bug Manifestations • Most (92%) bugs manifest if certain partial orderings are enforced among ≤ 4 memory accesses • Testing small groups of accesses takes polynomial time and exposes most bugs
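To see why small groups are cheaper, the counting sketch below compares the number of full interleavings of two threads with the number of groups of at most 4 accesses. This is an illustration under simplifying assumptions (two threads, `m` accesses each, hypothetical function names), not the paper's analysis.

```python
from math import comb

def full_interleavings(m):
    """Interleavings of two threads with m accesses each."""
    return comb(2 * m, m)

def small_groups(m, k_max=4):
    """Groups of at most k_max accesses out of the 2*m total."""
    return sum(comb(2 * m, k) for k in range(1, k_max + 1))

print(full_interleavings(10), small_groups(10))   # 184756 6195
print(full_interleavings(20), small_groups(20))   # 137846528820 102090
```

The first count grows exponentially in `m`, while the second grows as a degree-4 polynomial; each group then needs only a bounded number of orderings to be tried.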
Findings: Bug Fixes • Adding/changing locks is the fix for only a minority (20/74) of non-deadlock concurrency bugs • Locks aren’t enough to fix all concurrency bugs • Locks don’t promise ordering, just atomicity • Adding locks can hurt performance or create new deadlock bugs
Findings: Bug Fixes • Most common fix (19/31) for deadlock bugs lets 1 thread give up acquiring a resource, like a lock • This may get rid of deadlock bugs, but create other non-deadlock bugs • Code may no longer be correct
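The "give up acquiring a resource" strategy can be sketched with a non-blocking lock attempt. The helper `acquire_both` and its `attempts` parameter are hypothetical and not drawn from any of the studied patches.

```python
import threading

def acquire_both(first, second, attempts=100):
    """Take `first`, then try `second` without blocking.

    On failure, release `first` and retry, so this thread never
    waits for `second` while holding `first`.
    """
    for _ in range(attempts):
        first.acquire()
        if second.acquire(blocking=False):
            return True            # caller now holds both locks
        first.release()            # give up instead of waiting
    return False                   # caller must handle the failure
```

Usage: if `acquire_both(a, b)` returns `True`, the caller does its work and then releases `b` and `a`. The `False` path is where the slide's warning applies: skipping the work may avoid the deadlock but leave the program in a state the original code never expected.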
Bug fixes: Buggy Patches • 17/57 Mozilla bugs have ≥ 1 buggy patch • On average, 0.4 buggy patches are released for every final correct patch • Of 23 distinct buggy patches for the 17 bugs: • 6 decrease probability of occurrence but do not eliminate original bug • 5 create new concurrency bugs • 12 create new non-concurrency bugs
Findings: Bug fixes • In many (41/105) cases, TM can help avoid concurrency bugs
Findings: Bug fixes • Also in many cases (44/105), TM might be able to help with concurrency bugs • TM would need to handle long regions, rollback of I/O, and the strange “nature” of the code
Findings: Bug fixes • In 20/105 cases, TM provides little help • TM cannot help with many order-violation bugs • While TM could be useful in preventing concurrency bugs, it will not fix all of them
Conclusion • First real-world concurrency bug study • Multiple findings on • Types of concurrency bugs • Conditions for manifestation • Techniques for fixing concurrency bugs • Several heuristics proposed for: • Bug detection • Testing • Language design (i.e., TM) • Future work can focus on detecting common types of errors • Multi-variable bugs • Order-violation bugs • Multiple-access bugs