Dataflow Analysis for Datarace-Free Programs (ESOP '11)
Arnab De
Joint work with Deepak D'Souza and Rupesh Nasre
Indian Institute of Science, Bangalore
Why Datarace-Free Programs?
• Java, C++, … programs are either racy or DRF.
  • Racy programs: very weak guarantees.
  • DRF programs: sequentially consistent semantics.
• Dataraces are often indicators of bugs.
SC for DRF
[Diagram: a verifier asks "DRF?". No: bug, or memory-model-specific reasoning required. Yes: analysis for DRF programs! A compiler assumes DRF, performs optimization, and emits optimized code.]
Datarace-Free Programs
• In an execution, a release action synchronizes-with (sw) all acquire actions on the same variable that come after it.
• In an execution, the happens-before (hb) relation is the reflexive, transitive closure of synchronizes-with and program order.
• A program is datarace-free if, in all SC executions, all conflicting accesses are ordered by happens-before.
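The definitions above can be illustrated with a minimal Python sketch for a single, explicitly listed execution. The event names, the `accesses` encoding, and the helper functions are assumptions made for illustration; the slide's definition quantifies over all SC executions, which this sketch does not enumerate.

```python
from itertools import combinations, product


def happens_before(events, po, sw):
    """Return hb as the reflexive, transitive closure of po ∪ sw."""
    hb = {(e, e) for e in events} | set(po) | set(sw)
    changed = True
    while changed:
        changed = False
        # Compose pairs (a, b) and (b, d) into (a, d) until nothing new appears.
        for (a, b), (c, d) in product(list(hb), repeat=2):
            if b == c and (a, d) not in hb:
                hb.add((a, d))
                changed = True
    return hb


def races(accesses, hb):
    """accesses maps an event to (thread, variable, is_write).

    Two accesses conflict if they come from different threads, touch the
    same variable, and at least one is a write. A conflicting pair that
    hb does not order in either direction is reported as a race.
    """
    found = []
    for a, b in combinations(sorted(accesses), 2):
        ta, va, wa = accesses[a]
        tb, vb, wb = accesses[b]
        conflict = ta != tb and va == vb and (wa or wb)
        if conflict and (a, b) not in hb and (b, a) not in hb:
            found.append((a, b))
    return found


# One execution of two threads that each do: lock l; x = ...; unlock l;
events = ["t1_lock", "t1_write", "t1_unlock", "t2_lock", "t2_write", "t2_unlock"]
po = [("t1_lock", "t1_write"), ("t1_write", "t1_unlock"),
      ("t2_lock", "t2_write"), ("t2_write", "t2_unlock")]
sw = [("t1_unlock", "t2_lock")]  # thread 1 releases before thread 2 acquires
accesses = {"t1_write": (1, "x", True), "t2_write": (2, "x", True)}

with_sync = races(accesses, happens_before(events, po, sw))
without_sync = races(accesses, happens_before(events, po, []))
```

With the sw edge the two writes to x are ordered and `with_sync` is empty; dropping the edge leaves them unordered, so `without_sync` reports the pair.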
Datarace-Free Programs

Thread 1:          Thread 2:
  t1++;              t2++;
  lock l;            lock l;
  x = 1;             x = 2;
  unlock l;          unlock l;

[Diagram: po edges connect consecutive instructions within each thread; an sw edge connects an unlock l in one thread to the lock l in the other.]
buf *p; lock l;
p = new (...);
p->data = new (...);
*p->data = VAL;
spawn ("prod");
spawn ("cons");

cons () {
  while (1) {
    lock (l);
    v = *p->data;
    unlock (l);
  }
}

prod () {
  while (1) {
    lock (l);
    oldv = *p->data;
    free (p->data);
    newv = nextv (oldv);
    p->data = new (...);
    *p->data = newv;
    unlock (l);
  }
}
Dataflow Analysis for Concurrent Programs
• Kill dataflow facts conservatively.
  • More precise.
• Track interleavings precisely.
  • More efficient.
• Handle simple program constructs.
  • Handle modern language constructs.
• Handle simple analyses.
  • Handle more complex analyses.
[Figure: the producer-consumer program (main, cons, prod) annotated at each program point with a set of dataflow facts, either p or p, p->data. Most points carry p, p->data; a few carry only p.]
buf *p; lock l;
p = new (...);
p->data = new (...);
*p->data = VAL;
spawn ("prod");
spawn ("cons");

cons () {
  while (1) {
    lock (l);
    v = *p->data;
    unlock (l);
  }
}

prod () {
  while (1) {
    lock (l);
    oldv = *p->data;
    free (p->data);
    unlock (l);
    newv = nextv (oldv);
    lock (l);
    p->data = new (...);
    *p->data = newv;
    unlock (l);
  }
}

[Figure annotations: each program point is labelled with either p or p, p->data. In this variant, where prod releases and re-acquires the lock around nextv, more points carry only p.]
Our Algorithm for Lifting Sequential Analyses for Concurrent Programs
• Build the sync-CFG: add may-synchronize-with edges from release instructions to the corresponding acquire instructions, if they can run in parallel.
  • From a fork to the first instruction of the child thread.
  • From unlock to lock instructions on the same lock variable.
  • From the last instruction of a child thread to the join instruction waiting for it.
  • …
• May need to over-approximate the edges.
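The edge rules listed above could be sketched in Python roughly as follows. The instruction encoding (`instrs`), the `thread_entry` and `thread_exit` maps, and the `mhp` predicate are all hypothetical; in particular, `mhp` here is the crudest over-approximation (any two instructions in different threads may run in parallel), not a real may-happen-in-parallel analysis.

```python
def may_sync_edges(instrs, thread_entry, thread_exit, mhp):
    """Collect may-synchronize-with edges for a sync-CFG.

    instrs maps a node id to (opcode, operand). thread_entry and thread_exit
    map a thread name to its first and last instruction. mhp(a, b) says
    whether nodes a and b may run in parallel.
    """
    edges = set()
    for node, (op, arg) in instrs.items():
        if op == "fork":
            # fork -> first instruction of the child thread
            edges.add((node, thread_entry[arg]))
        elif op == "join":
            # last instruction of the child thread -> join waiting for it
            edges.add((thread_exit[arg], node))
        elif op == "unlock":
            # unlock -> every lock on the same lock variable that may run in parallel
            for other, (op2, arg2) in instrs.items():
                if op2 == "lock" and arg2 == arg and mhp(node, other):
                    edges.add((node, other))
    return edges


# Hypothetical program: main forks worker "w", takes lock l, then joins "w";
# the worker also takes lock l.
instrs = {
    "m0": ("fork", "w"),
    "m1": ("lock", "l"),
    "m2": ("unlock", "l"),
    "m3": ("join", "w"),
    "w0": ("lock", "l"),
    "w1": ("unlock", "l"),
}
thread_of = {n: ("main" if n.startswith("m") else "w") for n in instrs}


def different_threads(a, b):
    """Over-approximate may-happen-in-parallel: any two distinct threads."""
    return thread_of[a] != thread_of[b]


sync_edges = sorted(
    may_sync_edges(instrs, {"w": "w0"}, {"w": "w1"}, different_threads)
)
```

For this program the sketch yields four edges: fork to worker entry, worker exit to join, and one unlock-to-lock edge in each direction between the two threads.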
Our Algorithm for Lifting Sequential Analyses for Concurrent Programs
• Run the sequential analysis on the sync-CFG:
  • Treat the flow function of every synchronization instruction as the identity (id).
  • Construct the flow equations on the sync-CFG.
  • Compute the least fixed point (lfp) of the flow equations.
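A minimal Python sketch of these steps, assuming a toy analysis whose state maps each variable to a set of constants. The statement encoding, the node names, and the seeding of thread entries with an initial state (standing in for fork edges) are assumptions for illustration, and every node is assumed reachable.

```python
def join(s1, s2):
    """Point-wise union of two states (variable -> frozenset of values)."""
    return {v: s1.get(v, frozenset()) | s2.get(v, frozenset())
            for v in s1.keys() | s2.keys()}


def transfer(stmt, state):
    """Flow function: constant assignments update the state, all else is id."""
    if stmt[0] == "assign":
        _, var, const = stmt
        out = dict(state)
        out[var] = frozenset({const})
        return out
    return dict(state)  # synchronization instructions and reads: identity


def solve(stmts, edges, entry_states):
    """Worklist iteration to the least fixed point; returns the IN states."""
    preds = {n: [] for n in stmts}
    succs = {n: [] for n in stmts}
    for a, b in edges:
        preds[b].append(a)
        succs[a].append(b)
    in_state = {n: dict(entry_states.get(n, {})) for n in stmts}
    out_state = {n: transfer(stmts[n], in_state[n]) for n in stmts}
    work = list(stmts)
    while work:
        n = work.pop()
        new_in = dict(entry_states.get(n, {}))
        for p in preds[n]:
            new_in = join(new_in, out_state[p])
        in_state[n] = new_in
        new_out = transfer(stmts[n], new_in)
        if new_out != out_state[n]:
            out_state[n] = new_out
            work.extend(succs[n])
    return in_state


# Thread a: lock l; x = 1; unlock l;   Thread b: lock l; read x; unlock l;
stmts = {
    "a0": ("sync", "lock l"), "a1": ("assign", "x", 1), "a2": ("sync", "unlock l"),
    "b0": ("sync", "lock l"), "b1": ("read", "x"), "b2": ("sync", "unlock l"),
}
po = [("a0", "a1"), ("a1", "a2"), ("b0", "b1"), ("b1", "b2")]
sw = [("a2", "b0"), ("b2", "a0")]  # may-synchronize-with: unlock -> lock
init = {"a0": {"x": frozenset({0})}, "b0": {"x": frozenset({0})}}

with_sw = solve(stmts, po + sw, init)
without_sw = solve(stmts, po, init)
```

On the sync-CFG the read of x in thread b sees both 0 and 1; analysing each thread's CFG in isolation (no sw edges) reports only 0, which misses the write made by thread a.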
Restrictions on Analysis
• Value Set analysis:
  • Collects the set of values for each lvalue at each program point; loses the correlation between lvalues.
  • l := e : evaluate e on the input value set and update the value set of l.
  • if (e) : propagate the values that can make e true to the true branch, and similarly for the false branch.
  • Join operation is point-wise union.
  • Treats aliases conservatively.
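The flow functions described above might look like the following Python sketch. Expressions are modelled as plain Python functions over a named list of variables, which is an assumption of this sketch; aliasing is not modelled at all.

```python
from itertools import product


def assign(state, target, expr, variables):
    """l := e. Evaluate e on every combination of input values, update l."""
    choices = [sorted(state[v]) for v in variables]
    out = dict(state)
    out[target] = {expr(*combo) for combo in product(*choices)}
    return out


def branch(state, cond, variables, taken):
    """if (e). Keep, per variable, the values that can make e equal `taken`."""
    choices = [sorted(state[v]) for v in variables]
    keep = {v: set() for v in variables}
    for combo in product(*choices):
        if bool(cond(*combo)) == taken:
            for v, val in zip(variables, combo):
                keep[v].add(val)
    out = dict(state)
    out.update(keep)
    return out


def join(s1, s2):
    """Point-wise union of two states."""
    return {v: set(s1.get(v, ())) | set(s2.get(v, ()))
            for v in s1.keys() | s2.keys()}


state = {"x": {1, 2}, "y": {10, 20}}

after_assign = assign(state, "z", lambda x, y: x + y, ["x", "y"])
true_branch = branch(state, lambda x: x > 1, ["x"], True)
false_branch = branch(state, lambda x: x > 1, ["x"], False)
merged = join(true_branch, false_branch)
```

Because each lvalue carries its own set, `z` receives all four sums even if only some pairs of x and y could occur together at run time; that is the loss of correlation the slide mentions.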
Restrictions on Analysis (2)
• Abstractions of value set analysis:
  • A is an abstraction of VS if there are α and γ such that α(lfp of VS) ≤ lfp of A and lfp of VS ≤ γ(lfp of A).
  • Null-pointer analysis, interval analysis, constant propagation, may-pointer analysis, …
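As a small illustration of the α/γ condition, the Python sketch below uses integer intervals as the abstraction. The concrete numbers stand in for the two least fixed points at one program point for one lvalue; they are made up for the example, not results from the talk.

```python
def alpha(values):
    """Abstract a set of integers to the smallest enclosing interval."""
    return (min(values), max(values)) if values else None


def gamma(interval):
    """Concretize an interval to the set of integers it contains."""
    if interval is None:
        return set()
    lo, hi = interval
    return set(range(lo, hi + 1))


def leq(i1, i2):
    """Interval ordering: i1 ≤ i2 when i1 is contained in i2 (None is bottom)."""
    if i1 is None:
        return True
    if i2 is None:
        return False
    return i2[0] <= i1[0] and i1[1] <= i2[1]


vs_result = {1, 4, 9}        # hypothetical lfp of VS for one lvalue
interval_result = (0, 10)    # hypothetical lfp of the interval analysis

alpha_ok = leq(alpha(vs_result), interval_result)   # α(lfp of VS) ≤ lfp of A
gamma_ok = vs_result <= gamma(interval_result)      # lfp of VS ≤ γ(lfp of A)
```

Both checks hold here: the interval analysis may be less precise than the value set, but it never excludes a value the value set analysis found.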
Interpreting the Result
• We assume that the value set of an lvalue (or its abstraction) is relevant only at those program points where that lvalue is read.
  • The result of null-pointer analysis (NPA) matters only where the pointer is dereferenced.
  • The result of constant propagation (CP) matters only where the variable is read.
• Our result is sound only for the lvalues relevant at a given program point.
Why does it work?
For Value Set analysis:
• The LFP of the sequential analysis over-approximates the join-over-all-paths in the sync-CFG.
• It is enough to show that if an execution produces a value v for an lvalue l relevant at a program point E, then there is a path in the sync-CFG that includes v in VS(l) at E.
Path in Sync-CFG
[Diagram: a write W: x = y and a read R: … = x.]
• Induction over execution length.
• W and R are related by hb.
• hb = (po ∪ sw)*
• Flow functions of po edges over-approximate execution behavior.
• Flow functions of sw edges are identity.
Context-Sensitive Analysis
• Analysis domain:
  • call string -> abstract state
• On a call site c,
  • [s -> a] -> [sc -> a]
• On return to call site c,
  • [sc -> a] -> [s -> a]
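The two mappings above can be sketched in Python with call strings as tuples of call-site names. The abstract states are opaque placeholder strings here, the call strings are unbounded, and the effect of the callee body between call and return is left out; all of these are simplifications assumed for illustration.

```python
def on_call(state, c):
    """[s -> a] becomes [sc -> a]: append call site c to every call string."""
    return {s + (c,): a for s, a in state.items()}


def on_return(state, c):
    """[sc -> a] becomes [s -> a]: keep only call strings ending in c, then pop it."""
    return {s[:-1]: a for s, a in state.items() if s and s[-1] == c}


at_entry = {(): "a0"}
in_callee = on_call(at_entry, "c1")
back_in_caller = on_return(in_callee, "c1")

# States that arrived through a different call site do not flow back to c1.
mixed = {("c1",): "a1", ("c2",): "a2"}
returned_to_c1 = on_return(mixed, "c1")
```

Filtering on the last call site is what keeps the analysis context-sensitive: a state that entered the callee from c2 is not returned to c1.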
Context-Sensitive Analysis for Concurrent Programs
• Use a summary component at each may-synchronize-with edge.
• Join all the states at acquire and put in summary.
• Join the summary with all (non-bottom) states at release.
Results
[Chart legend: all derefs, actually safe, seq analysis, our analysis.]
Sources of Imprecision
• Alias analysis, may-happen-in-parallel analysis, …
• Representation of multiple dynamic threads by a single static thread.
• Paths in the sync-CFG that do not correspond to any real execution.
foo() {
  lock l;
  x++;
  unlock l;
}

main() {
  fork(foo);
  …
  fork(foo);
}

baz() {
  lock l;
  x++;
  unlock l;
}

bar() {
  lock l;
  x++;
  unlock l;
}
Conclusion
• A dataflow analysis technique for DRF programs.
• Defined the conditions for soundness.
• Demonstrated scalability and precision.