RacerX: Effective, Static Detection of Race Conditions and Deadlocks
by Dawson Engler & Ken Ashcraft (published in SOSP '03)
Hong, Shin
Contents
• Introduction
• Overview
• Lockset Analysis
• Deadlock Checking
• Datarace Checking
• Conclusion
Introduction 1/2
• Finding data races and deadlocks is difficult.
• Many approaches have been proposed to detect these errors:
  - Dynamic detection tools (e.g. Eraser): these tools can only find errors on executed paths.
  - Model checking: it does not scale (state explosion problem).
  - Static tools: many static tools rely heavily on annotations to inject knowledge into the analysis.
Introduction 2/2
• Approach
  - Needs no annotations except an indication of which functions are used to acquire and release locks.
  - Minimizes the impact of false positives (false alarms).
  - Must scale to large industrial programs, both in speed and in its ability to report complex errors.
• RacerX is a static tool that uses flow-sensitive, interprocedural analysis to detect both race conditions and deadlocks.
• It aggressively infers checking information (e.g. which locks protect which operations, which code contexts are multithreaded, which shared accesses are dangerous).
• The tool sorts errors from most to least severe.
Overview 1/3
• At a high level, checking a system with RacerX involves five phases:
  (1) Retargeting RacerX to the system-specific locking functions
  (2) Extracting a control flow graph from the system
  (3) Analysis
  (4) Ranking errors
  (5) Inspection
Overview 2/3
(1) Retargeting RacerX to the system-specific locking functions
  • Users supply a table specifying the functions used to acquire/release locks and to disable/enable interrupts.
  • Users may optionally specify that a function is single-threaded, multithreaded, or an interrupt handler.
(2) Extracting a control flow graph from the system
  • The tool extracts a CFG from the system and stores it in a file.
  • The CFG contains all function calls, uses of global variables, uses of parameter pointer variables, optionally uses of all local variables, and concurrency operations.
  • The CFG includes the symbolic information for these objects, such as their names, types, whether an access is a read or a write, whether a variable is a parameter or not, whether a function or variable is static or not, the line number, etc.
Overview 3/3
(3) Analysis
  • The tool reads the emitted CFGs, constructs a linked whole-system CFG, and traverses it checking for deadlocks or data races.
  • The traversal is depth-first, flow-sensitive, and interprocedural, and it tracks the set of locks held at any point.
  • At each program statement, the race checker or deadlock checker is passed the current statement, the current lockset, etc.
(4) Ranking errors
  • Compute ranking information for error messages.
  • Ranking sorts error messages based on two features: the likelihood of being a false positive and the difficulty of inspection.
(5) Inspection
  • Present the ranked error messages to users.
Lockset Analysis 1/5
• The tool computes locksets at all program points using a top-down, flow-sensitive, context-sensitive, interprocedural analysis.
  - Top-down: it starts at the root of each call graph and does a DFS traversal down the CFG.
  - Flow-sensitive: it analyzes the effects of each path rather than conflating paths at join points.
  - Context-sensitive: it analyzes the lockset at each actual call site.
• During the DFS traversal over the CFG, the tool (1) adds and removes locks as needed, and (2) calls the race and deadlock checkers on each statement.
Lockset Analysis 2/5
• Caching
  - Statement cache: the tool caches the locksets that have reached each statement in the CFG.
  - Summary cache: the tool caches the effect of each function by recording, for each lockset l that entered function f, the set of locksets (l1, … , ln) that was produced.
• Caching works because the analysis is deterministic: two executions that both start from the same statement with the same lockset will always produce the same result.
• Since the analysis is flow-sensitive, a function could produce an exponential number of locksets. In practice, however, the number is far more modest.
Lockset Analysis 3/5
• Pseudo-code for interprocedural lockset algorithm (1/2)

    void traverse_cfg(set of nodes roots)
        foreach r in roots
            traverse_fn(r, {}) ;
    end

    set of locksets traverse_fn(fn, ls)
        // Check summary cache
        foreach edge x in fn->cache
            if (x->entry_lockset == ls)
                return x->exit_locksets ;
        // Break recursive call
        if (fn->on_stack_p)
            return {} ;
        fn->on_stack_p = 1 ;
        x = new edge ;
        x->entry_lockset = ls ;
        x->exit_locksets = traverse_stmts(fn->entry, ls, ls) ;
        fn->on_stack_p = 0 ;
        // Cache update
        fn->cache = fn->cache union x ;
        return x->exit_locksets ;
    end
Lockset Analysis 4/5
• Pseudo-code for interprocedural lockset algorithm (2/2)

    set of locksets traverse_stmts(s, entry_ls, ls)
        // Check statement cache
        if ((entry_ls, ls) in s->cache)
            return {} ;
        // Cache update
        s->cache = s->cache union (entry_ls, ls) ;
        if (s is end-of-path)
            return ls ;
        // Lockset update
        if (s is lock acquire operation)
            ls = add_lock(ls, s) ;
        if (s is lock release operation)
            ls = remove_lock(ls, s) ;
        if (s is not resolved call)
            worklist = {ls} ;
        else
            worklist = traverse_fn(s->fn, ls) ;
        // DFS traversal
        summ = {} ;
        foreach l in worklist
            foreach k in s->succ
                summ = summ union traverse_stmts(k, entry_ls, l) ;
        return summ ;
    end
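The two pseudo-code slides can be turned into a small runnable sketch. The Python below is an illustration only, not RacerX's implementation: the `Stmt` and `Fn` classes, the `kind` strings, and the toy CFG are hypothetical stand-ins for the tool's CFG format, and the end-of-path case returns a singleton set so that the result is always a set of locksets.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Stmt:
    kind: str = "plain"      # "plain", "acquire", "release", "call", or "exit"
    lock: str = ""           # lock name for acquire/release statements
    callee: object = None    # resolved Fn for "call" statements, else None
    succ: list = field(default_factory=list)
    cache: set = field(default_factory=set)    # (entry lockset, lockset) pairs seen

@dataclass(eq=False)
class Fn:
    entry: Stmt
    cache: dict = field(default_factory=dict)  # entry lockset -> set of exit locksets
    on_stack: bool = False

def traverse_fn(fn, ls):
    if ls in fn.cache:                 # check summary cache
        return fn.cache[ls]
    if fn.on_stack:                    # break recursive call
        return set()
    fn.on_stack = True
    exits = traverse_stmts(fn.entry, ls, ls)
    fn.on_stack = False
    fn.cache[ls] = exits               # cache update
    return exits

def traverse_stmts(s, entry_ls, ls):
    if (entry_ls, ls) in s.cache:      # check statement cache
        return set()
    s.cache.add((entry_ls, ls))
    if s.kind == "exit":               # end of path
        return {ls}
    if s.kind == "acquire":            # lockset update
        ls = ls | {s.lock}
    elif s.kind == "release":
        ls = ls - {s.lock}
    if s.kind == "call" and s.callee is not None:
        worklist = traverse_fn(s.callee, ls)
    else:
        worklist = {ls}
    summ = set()
    for l in worklist:                 # DFS traversal
        for k in s.succ:
            summ |= traverse_stmts(k, entry_ls, l)
    return summ

# Toy CFG for: if (x) lock(l); ...; if (x) unlock(l);
exit_ = Stmt("exit")
rel = Stmt("release", "l", succ=[exit_])
mid = Stmt(succ=[rel, exit_])
acq = Stmt("acquire", "l", succ=[mid])
entry = Stmt(succ=[acq, mid])
foo = Fn(entry)
exits = traverse_fn(foo, frozenset())
```

Locksets are `frozenset` values so they can serve as cache keys. On the toy CFG the summary for `foo` contains both the empty lockset and `{l}`, because the traversal does not correlate the two branches.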
Lockset Analysis 5/5
• Limitations
  - No alias analysis. The tool represents local and parameter pointer variables by their type name rather than their variable name (e.g. a parameter foo that is a pointer to a structure of type bar will be named “local:struct bar”).
  - Only simple function pointer resolution. The tool records all functions ever assigned to a function pointer of a given type and, at each call site, assumes that any of these functions could be invoked.
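Both approximations are easy to picture in code. The sketch below is hypothetical: `abstract_name`, `FnPtrTable`, and the type strings are names invented for illustration, and treating globals by their own name is an assumption the slide does not spell out.

```python
from collections import defaultdict

def abstract_name(var_name, type_name, is_global):
    """Name used by the analysis for a pointer variable.

    Assumption: globals keep their own name; locals and parameters
    collapse to their type, as in the "local:struct bar" example.
    """
    return var_name if is_global else f"local:{type_name}"

class FnPtrTable:
    """Records every function ever assigned to a pointer of a given type."""

    def __init__(self):
        self.targets = defaultdict(set)

    def record_assignment(self, ptr_type, fn_name):
        self.targets[ptr_type].add(fn_name)

    def possible_callees(self, ptr_type):
        # At a call through a pointer of this type, assume any recorded
        # function may be invoked.
        return set(self.targets.get(ptr_type, ()))

table = FnPtrTable()
table.record_assignment("int (*)(struct bar *)", "bar_open")
table.record_assignment("int (*)(struct bar *)", "bar_close")
```

Collapsing by type means two unrelated `struct bar` pointers share one name, which is cheap but can merge locks that are actually distinct.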
Deadlock Checking 1/9
(1) Computing locking cycles
(2) Ranking
(3) Increasing analysis accuracy
(4) Handling lockset mistakes
(5) Experience result
Deadlock Checking 2/9
Computing locking cycles
(1) Constraint extraction
  - At every lock acquisition, emit the lock-ordering constraints produced by that acquisition (e.g. if the current lockset is {l1, l2} and the currently acquired lock is l3, then emit l1 → l3 and l2 → l3).
(2) Constraint solving
  - Read in the emitted locking constraints and compute the transitive closure of all dependencies.
  - Record the shortest path between any cyclic lock dependencies.
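The two steps can be sketched as follows. This is a minimal illustration, assuming constraints are stored as `(held, acquired)` pairs; `emit_constraints` and `shortest_cycle` are hypothetical helper names, and a breadth-first search stands in for the transitive-closure computation.

```python
from collections import deque

def emit_constraints(lockset, acquired):
    """Ordering constraints produced when `acquired` is taken while `lockset` is held."""
    return {(held, acquired) for held in lockset}

def shortest_cycle(edges, start):
    """Shortest dependency path start -> ... -> start, or None if there is none."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    queue = deque([[start]])
    seen = set()
    while queue:
        path = queue.popleft()
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt == start:
                return path + [start]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# One thread takes b while holding a; another takes a while holding b.
edges = emit_constraints({"a"}, "b") | emit_constraints({"b"}, "a")
cycle = shortest_cycle(edges, "a")
```

Because the search is breadth-first, the first path that returns to `start` is a shortest one, which keeps the reported cycle small enough to inspect by hand.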
Deadlock Checking 3/9
Ranking
• Rank error messages based on three criteria:
  (1) The number of threads involved
    - Errors with fewer threads are preferred to ones with many threads.
  (2) Whether the locks involved are local or global
    - Global-lock errors are preferred over local ones.
  (3) The depth of the call chain
    - Short call chains are better than longer ones.
• Apply these criteria hierarchically to sort error messages: (1) > (2) > (3)
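A hierarchical ordering like this maps naturally onto a tuple sort key. The sketch below assumes a hypothetical `DeadlockReport` record; the field names are illustrative and not taken from RacerX.

```python
from dataclasses import dataclass

@dataclass
class DeadlockReport:
    name: str
    threads: int        # number of threads involved
    all_global: bool    # True if every lock involved is global
    call_depth: int     # depth of the call chain

def rank(reports):
    # Tuples compare element by element, so criterion (1) dominates (2),
    # which dominates (3). `not all_global` sorts global errors first.
    return sorted(reports, key=lambda r: (r.threads, not r.all_global, r.call_depth))

reports = [
    DeadlockReport("A", threads=2, all_global=True, call_depth=5),
    DeadlockReport("B", threads=2, all_global=False, call_depth=1),
    DeadlockReport("C", threads=3, all_global=True, call_depth=1),
    DeadlockReport("D", threads=2, all_global=True, call_depth=2),
]
ranked = [r.name for r in rank(reports)]
```

Report B has the shortest call chain but still ranks below A and D, since the local-versus-global criterion is consulted before call depth.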
Deadlock Checking 4/9
Example: Error message of simple deadlock between two global locks
Deadlock Checking 5/9
Increasing analysis accuracy (1/2)
• There are two significant sources of false lock dependencies:
(1) Semaphores used to enforce scheduling dependencies
  - A semaphore may be used to implement a scheduling dependency (signal-wait) rather than mutual exclusion.
  - Signal-wait semaphores have two behavior patterns: they are almost never paired, and they have more lock than unlock operations.
  - Statistical approach:
    (1) Calculate how often true locks satisfy these two behaviors by counting the number of lock acquisitions, lock releases, and unlock errors.
    (2) Discard semaphores below some probability threshold.
Deadlock Checking 6/9
Increasing analysis accuracy (2/2)
(2) “Release-on-block” locks
• Many operating systems such as FreeBSD and Linux use global, coarse-grained locks (e.g. the big kernel lock) that have “release-on-block” semantics.

    <Thread1>            <Thread2>
    lock_kernel() ;      down(sem) ;
    down(sem) ;          lock_kernel() ;
    /* No deadlock */

    down(sem) {
        …
        while ( down(sem) would block ) {
            unlock_kernel() ;
            schedule() ;
            lock_kernel() ;
        }
        …
    }
Deadlock Checking 7/9
Handling lockset mistakes
• Most deadlock false positives are caused by invalid locksets.
• Almost all invalid locksets arise from a data-dependent lock release or from correlated branches, e.g.

    void foo(int x) {
        if (x) lock(l) ;
        …
        if (x) unlock(l) ;
    }

• Without path-sensitive analysis, the tool will believe there are four paths through foo, although only two of them (both branches taken, or neither) can actually execute.
• RacerX uses simple, novel techniques to minimize the propagation of invalid locksets.
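The four paths can be enumerated directly. The snippet below is a small illustration of the `foo` example, not part of RacerX; `paths_through_foo` is a hypothetical helper that treats the two `if (x)` tests as independent, the way a path-insensitive traversal would.

```python
from itertools import product

def paths_through_foo():
    """Enumerate the four syntactic paths through foo and their exit locksets."""
    results = []
    for take_lock, take_unlock in product([True, False], repeat=2):
        lockset = set()
        if take_lock:
            lockset.add("l")
        if take_unlock:
            lockset.discard("l")
        # Both tests read the same x, so only matching outcomes can execute.
        feasible = take_lock == take_unlock
        results.append((take_lock, take_unlock, frozenset(lockset), feasible))
    return results

paths = paths_through_foo()
infeasible_exits = {ls for _, _, ls, ok in paths if not ok}
feasible_exits = {ls for _, _, ls, ok in paths if ok}
```

Every feasible path leaves `foo` with no lock held. One infeasible path leaves `l` held at exit, and that invalid lockset is what would otherwise propagate into callers and produce false deadlock reports.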
Deadlock Checking 8/9
• Cutting off lock-error paths
  - Cut off the lockset on paths that contain a locking error.
• Downward-only lockset propagation
  - A significant source of false positives occurs when the tool falsely believes that a lock is held on function exit when it actually is not.
  - Propagate locksets downward from caller to callee but never upward.
  - This causes false negatives for wrapper functions.
• Selecting the right summary
  - Majority summary selection: rather than following all locksets a function call generates, take the one produced by the largest number of exit points within the function.
  - Minimum-size summary selection
• Unlockset analysis
  - At program statement s, remove any lock l from the current lockset if there exists no successor statement s’ reachable from s that contains an unlock operation of l.
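Two of these techniques, majority summary selection and unlockset analysis, are compact enough to sketch. The code below is a hedged illustration: the function names, the `succ` adjacency map, and the `unlocks` map from statement to released lock are all hypothetical representations chosen for the example.

```python
from collections import Counter

def majority_summary(exit_locksets):
    """Pick the lockset produced by the largest number of exit points.

    `exit_locksets` holds one frozenset per exit point of the function.
    """
    best, _count = Counter(exit_locksets).most_common(1)[0]
    return best

def reachable(succ, start):
    """Statements reachable from `start` by following successor edges."""
    seen, stack = set(), list(succ.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(succ.get(node, ()))
    return seen

def prune_lockset(lockset, stmt, succ, unlocks):
    """Unlockset analysis: keep only locks that some later statement releases."""
    released_later = {unlocks[n] for n in reachable(succ, stmt) if n in unlocks}
    return frozenset(l for l in lockset if l in released_later)

summary = majority_summary([frozenset(), frozenset(), frozenset({"l"})])

succ = {"s0": ["s1"], "s1": ["s2"], "s2": []}
unlocks = {"s2": "a"}          # statement s2 releases lock a
pruned = prune_lockset({"a", "b"}, "s0", succ, unlocks)
```

In the example, lock `b` is dropped at `s0` because nothing reachable from `s0` ever releases it, which suggests the analysis only believes `b` is held because of an earlier mistake.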
Deadlock Checking 9/9
Experience result
Ex. Deadlock: acquired lock is released and then reacquired by the same thread.
[Figure labels: scsiLock, handleArrayLock]
Data Race Checking 1/6
• The data race checker is called by the lockset analysis on each statement.
• The checker can be run in three modes:
  (1) Simple checking
    - Only flags global accesses that occur without any lock held.
  (2) Simple statistical
    - Infers which non-global variables and functions must be protected by some lock.
  (3) Precise statistical
    - Infers which specific lock protects an access and flags an access when the lockset does not contain that lock.
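The simple and precise modes differ only in what they compare the lockset against. The sketch below assumes a hypothetical `Access` record and a precomputed `protector` map from variable to its inferred lock; how that map is inferred is a separate statistical step and is not shown here.

```python
from typing import NamedTuple

class Access(NamedTuple):
    var: str
    is_global: bool
    lockset: frozenset
    line: int

def simple_check(accesses):
    """Mode (1): flag global accesses made with no lock held at all."""
    return [a for a in accesses if a.is_global and not a.lockset]

def precise_check(accesses, protector):
    """Mode (3): flag accesses whose lockset lacks the variable's inferred lock."""
    return [a for a in accesses
            if a.var in protector and protector[a.var] not in a.lockset]

accesses = [
    Access("counter", True, frozenset({"stats_lock"}), 10),
    Access("counter", True, frozenset(), 20),
    Access("counter", True, frozenset({"other_lock"}), 30),
    Access("tmp", False, frozenset(), 40),
]
simple_hits = [a.line for a in simple_check(accesses)]
precise_hits = [a.line for a in precise_check(accesses, {"counter": "stats_lock"})]
```

The access on line 30 holds a lock, so simple checking accepts it, while precise checking flags it because the lock held is not the one inferred to protect `counter`.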
Data Race Checking 2/6
• The tool uses a set of heuristics, combined in a scoring function, to rank data race errors.
• The heuristics answer the following questions:
  - Is the lockset valid?
  - Is the code multithreaded?
  - Does x need to be protected?
  - Does x need to be protected by L?
Data Race Checking 3/6
• Is the code multithreaded?
• Two methods for determining whether code is multithreaded:
  (1) Multithreading inference
    - Any concurrency operation (e.g. lock acquire/release, atomic operations) implies that the programmer believes the surrounding code is multithreaded.
    - The tool marks a function as multithreaded if concurrency operations occur anywhere within its body, or anywhere above it in a call chain.
  (2) Programmer-written automatic annotator
    - Users can mark a function as single-threaded, multithreaded, an interrupt handler, or a function that should be ignored.
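Multithreading inference amounts to a downward propagation over the call graph. The sketch below is a simplified reading of the rule on the slide: `infer_multithreaded` and the call-graph dictionary are hypothetical, and the sketch ignores where in the caller's body the concurrency operation sits relative to the call.

```python
def infer_multithreaded(calls, has_concurrency_op):
    """Return the functions inferred to run in multithreaded context.

    calls: dict mapping a function name to the functions it calls.
    has_concurrency_op: functions whose body contains a lock, unlock,
    or atomic operation.
    """
    marked = set()
    stack = list(has_concurrency_op)
    while stack:
        fn = stack.pop()
        if fn not in marked:
            marked.add(fn)
            # Everything called from a marked function has a concurrency
            # operation above it in the call chain.
            stack.extend(calls.get(fn, ()))
    return marked

calls = {
    "main": ["init", "worker"],
    "init": ["parse"],
    "worker": ["update"],
}
multithreaded = infer_multithreaded(calls, {"worker"})
```

Here `update` is marked because its caller `worker` uses concurrency operations, while `main`, `init`, and `parse` are left unmarked.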
Data Race Checking 4/6
• Does x need to be protected?
• There are three approaches to answering this question:
  (1) Eliminating accesses unlikely to be dangerous
    - Avoid flagging data races on variables that are private to a thread.
    - Demote errors where data appears to be written only during initialization and only read afterwards.
  (2) Promoting accesses that have a good chance of being unsafe
    - Favor errors that write data over errors that read data.
    - Flag unprotected variables that cannot be read or written atomically (e.g. 64-bit variables on a 32-bit machine).
  (3) Inferring which variables programmers believe must not be accessed without a lock
    - Count how many times each variable is accessed with a lock held versus without.
    - Variables the programmer believes should be protected will have a relatively high number of locked accesses and few unlocked accesses.
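The counting in approach (3) can be sketched as below. The z-score used to turn the counts into a single number is one common choice for this kind of belief inference and is an assumption here, as is the expected ratio `p0 = 0.8`; the slides do not state which scoring function RacerX uses.

```python
import math
from collections import defaultdict

def count_accesses(accesses):
    """Map each variable to (locked accesses, unlocked accesses).

    accesses: iterable of (variable, lockset) pairs.
    """
    counts = defaultdict(lambda: [0, 0])
    for var, lockset in accesses:
        counts[var][0 if lockset else 1] += 1
    return {var: tuple(c) for var, c in counts.items()}

def z_score(locked, unlocked, p0=0.8):
    """Assumed scoring: how far the locked ratio sits above the ratio p0."""
    n = locked + unlocked
    return (locked / n - p0) / math.sqrt(p0 * (1 - p0) / n)

accesses = [("x", {"l"})] * 9 + [("x", set())] + [("y", {"l"})] + [("y", set())] * 9
counts = count_accesses(accesses)
scores = {var: z_score(*c) for var, c in counts.items()}
```

Variable `x` is locked in nine of ten accesses, so its single unlocked access looks like a mistake; `y` is almost never locked, so its unlocked accesses are probably intentional and rank low.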
Data Race Checking 5/6
• Does x need to be protected by L?
• The tool infers whether a given lock protects a variable (or a function) using a statistical approach.
• For each variable (or function), count:
  (1) the number of accesses to the variable (function)
  (2) the number of times these accesses held a specific lock
• Then pick the single best lock out of all the candidates and run an interprocedural check with this information.
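Picking the best lock is a matter of keeping the two counts per variable and taking the most frequently held candidate. The sketch below is illustrative only; `infer_protecting_lock` is a hypothetical name, and breaking ties by lock name is an arbitrary choice made to keep the result deterministic.

```python
from collections import Counter, defaultdict

def infer_protecting_lock(accesses):
    """Map each variable to (best lock, accesses holding it, total accesses).

    accesses: iterable of (variable, lockset) pairs.
    """
    total = Counter()
    held = defaultdict(Counter)
    for var, lockset in accesses:
        total[var] += 1
        for lock in lockset:
            held[var][lock] += 1
    best = {}
    for var, n in total.items():
        if held[var]:
            # Most frequently held lock wins; ties fall back to the lock name.
            lock, k = max(held[var].items(), key=lambda kv: (kv[1], kv[0]))
            best[var] = (lock, k, n)
    return best

accesses = [
    ("x", {"a", "b"}),
    ("x", {"a"}),
    ("x", {"a"}),
    ("x", set()),
    ("y", set()),
]
best = infer_protecting_lock(accesses)
```

Lock `a` is held on three of the four accesses to `x`, so it is chosen as the protecting lock and the fourth access becomes a candidate race. Variable `y` is never accessed under a lock, so no lock is inferred for it.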
Data Race Checking 6/6
Experience result
Ex. Datarace error
Conclusion
• RacerX is a static tool that uses flow-sensitive, interprocedural analysis to detect both data races and deadlocks.
• RacerX found errors in large systems such as FreeBSD and Linux.
Further Work
• Chord, by Mayur Naik and Alex Aiken, POPL07
  - Static race detection system for Java.
  - Flow-insensitive, context-sensitive static analysis tool.
Reference
[1] RacerX: Effective, Static Detection of Race Conditions and Deadlocks, Dawson Engler & Ken Ashcraft, SOSP03