
Memory-Model-Sensitive Data Race Analysis

Jason Yue Yang, Microsoft Corporation. Ganesh Gopalakrishnan and Gary Lindstrom, University of Utah.


Presentation Transcript


  1. Memory-Model-Sensitive Data Race Analysis • Jason Yue Yang, Microsoft Corporation • Ganesh Gopalakrishnan, Gary Lindstrom, University of Utah

  2. Shared-Memory Systems • A hardware perspective (chart labels: CPU performance, memory performance) • Multiprocessor architectures - increasingly parallel • Multithreaded software - popular, BUT hard to analyze • Thread libraries: e.g., Pthreads, Win32, Solaris • Language-level support of threads: e.g., Java

  3. Memory Consistency Models • A memory model defines the legal orderings of memory operations that can be perceived at the user level • Many have been developed: • Sequential Consistency (SC) • Coherence • Parallel Random Access Memory (PRAM) • Causal Consistency • Processor Consistency (PC) • Release Consistency • Location Consistency • The Intel Itanium Memory Model • Java Memory Model (JMM) • and more!

  4. Sequential Consistency • Requirements • A common total order exists • It respects program order • Each read sees the “latest” write • Example: Initially, x = y = 0. Finally, can r1 = r2 = 1? • Thread 1: r1 = x; y = 1; • Thread 2: r2 = y; x = 1; • Under Sequential Consistency: No • Under many weak models: Yes
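
The "No" for Sequential Consistency can be checked by brute force. The following is a minimal sketch, not the authors' tool: it enumerates every interleaving of the two threads that respects program order and lets each read see the latest write. The tuple encoding of operations and the helper names (`interleavings`, `run`) are assumptions made for illustration.

```python
# Each operation is (kind, target, source), listed in program order.
THREAD_1 = [("read", "r1", "x"), ("write", "y", 1)]
THREAD_2 = [("read", "r2", "y"), ("write", "x", 1)]


def interleavings(t1, t2):
    """Yield every merge of t1 and t2 that preserves each thread's program order."""
    if not t1:
        yield list(t2)
    elif not t2:
        yield list(t1)
    else:
        for rest in interleavings(t1[1:], t2):
            yield [t1[0]] + rest
        for rest in interleavings(t1, t2[1:]):
            yield [t2[0]] + rest


def run(schedule):
    """Execute one total order; every read sees the latest write to its variable."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for kind, target, source in schedule:
        if kind == "read":
            regs[target] = memory[source]
        else:
            memory[target] = source
    return regs["r1"], regs["r2"]


sc_outcomes = {run(s) for s in interleavings(THREAD_1, THREAD_2)}
print(sorted(sc_outcomes))        # [(0, 0), (0, 1), (1, 0)]
print((1, 1) in sc_outcomes)      # False
```

The outcome r1 = r2 = 1 never appears because each read precedes its own thread's write in program order, so at least one read must run before the other thread's write.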

  5. Concurrency-Related Properties • Trace validity • Whether a certain set of outcomes is legal • Atomicity • Whether a block of code is executed atomically • Race Freedom • Whether the program is free of data races

  6. What Is a Data-Race? • Informally: conflicting and concurrent accesses • Is this program race-free? Initially, x = y = 0. • Thread 1: r1 = x; if (r1 > 0) y = 1; • Thread 2: r2 = y; if (r2 > 0) x = 1; • Are these two instructions conflicting and concurrent? • Control flow is interwoven with memory consistency requirements • Hence, the answer depends on the memory model - Under SC, this program is race-free - Under a weaker model, this program might contain races

  7. A Subtle Variation • Is this program race-free? Initially, x = y = 0. • Thread 1: r1 = x; if (r1 > 0) y = 1; • Thread 2: r2 = y; if (r2 >= 0) x = 1; • This small change (r2 >= 0 instead of r2 > 0) introduces a race condition, even under SC executions
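
The difference between the two programs can be reproduced with a small exhaustive search over SC executions. This is a sketch under stated assumptions rather than the method from the talk: because neither program uses locks, it treats any two accesses from different threads to the same variable, at least one of them a write, that both occur in one SC execution as a data race. The encoding of a thread as `(read_var, guard, write_var)` and the names `sc_executions` and `has_race` are hypothetical.

```python
from itertools import permutations


def sc_executions(threads):
    """Yield, for each SC interleaving, the memory accesses actually performed."""
    # Each thread takes two steps: a read, then a guarded write.
    for schedule in sorted(set(permutations([0, 0, 1, 1]))):
        memory = {"x": 0, "y": 0}
        regs = [None, None]
        accesses = []
        for tid in schedule:
            read_var, guard, write_var = threads[tid]
            if regs[tid] is None:            # first step: the read
                regs[tid] = memory[read_var]
                accesses.append((tid, "read", read_var))
            elif guard(regs[tid]):           # second step: write only if the guard holds
                memory[write_var] = 1
                accesses.append((tid, "write", write_var))
        yield accesses


def has_race(threads):
    """True if some SC execution contains conflicting accesses from different threads."""
    for accesses in sc_executions(threads):
        for k, (t1, kind1, var1) in enumerate(accesses):
            for t2, kind2, var2 in accesses[k + 1:]:
                if t1 != t2 and var1 == var2 and "write" in (kind1, kind2):
                    return True
    return False


guarded_program = [("x", lambda r: r > 0, "y"), ("y", lambda r: r > 0, "x")]
variant_program = [("x", lambda r: r > 0, "y"), ("y", lambda r: r >= 0, "x")]

print(has_race(guarded_program))   # False
print(has_race(variant_program))   # True
```

In the first program no write is ever executed under SC, since both reads return 0. In the variant, Thread 2 always writes x, which conflicts with Thread 1's read of x.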

  8. Observations • Properties satisfied under one memory model may be broken under another • Thread semantics must be taken into account in addition to program semantics • Informal intuitions may lead to inaccurate results • Correctness properties need to be formalized • Thread interleaving can be counter-intuitive • Need an automatic tool with exhaustive coverage

  9. Methodology • (Diagram: a test program, program and thread semantics (e.g., SC, Itanium, JMM …) and correctness properties (e.g., race, atomicity …) are encoded as constraints and passed to a solver (CLP, SAT, QBF), which reports SAT or UNSAT) • (1) Define both intra-thread and inter-thread semantics as constraints • (2) Model correctness properties as additional constraints • (3) Reduce a verification problem to a constraint satisfaction problem and solve it automatically

  10. Formalizing SC Executions • SC is most commonly assumed in SW development • Critical in defining race conditions in the newly proposed Java Memory Model • Apply happens-before order for acquire/release semantics

  11. The Source Language • Java-like (simplified but non-trivial) • Local and global variables • Computations • Control branches • Monitor-like mutual exclusion • Instruction types • Read: e.g., r = x • Write: e.g., x = 1 or x = r • Control: e.g., if (r==0) • Computation: e.g., r1 = r2 + 1 • Lock: e.g., Lock m • Unlock: e.g., Unlock m

  12. Specification Techniques (1) Apply declarative specifications - Easy to understand, flexible (2) Make the rules higher order - pass down the order relation through all the rules - Compositional, reusable, scalable, easy to compare (3) Make all rules explicit - Executable using a constraint-programming system

  13. Constraints of Sequential Consistency (ops is the execution trace; order is the ordering relation)
      legalSC ops order =
          requireWeakTotalOrder ops order ∧     (common total order)
          requireTransitiveOrder ops order ∧    (common total order)
          requireAsymmetricOrder ops order ∧    (common total order)
          requireProgramOrder ops order ∧       (respects program order)
          requireReadValue ops order ∧          (read sees “latest” write)
          requireComputation ops order ∧        (program semantics)
          requireMutualExclusion ops order      (lock/unlock semantics)
      requireWeakTotalOrder ops order ≡ ∀ i, j ∈ ops. (fb i ∧ fb j ∧ id i ≠ id j) ⇒ (order i j ∨ order j i)
      • order is repeatedly refined • Hidden rules are explicit
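
The three "common total order" rules can be read as executable predicates over a candidate order relation. The sketch below is an illustration, not the authors' specification: only requireWeakTotalOrder is spelled out on the slide, so the transitive and asymmetric checks use the standard textbook definitions as an assumption. Operations are plain integer ids, `fb` maps an id to its feasibility flag, and `order` is a set of `(i, j)` pairs.

```python
from itertools import product


def require_weak_total_order(ops, fb, order):
    """Any two distinct feasible operations are ordered one way or the other."""
    return all((i, j) in order or (j, i) in order
               for i, j in product(ops, repeat=2)
               if fb[i] and fb[j] and i != j)


def require_transitive_order(ops, order):
    """If i is before j and j is before k, then i is before k (assumed definition)."""
    return all((i, k) in order
               for i, j, k in product(ops, repeat=3)
               if (i, j) in order and (j, k) in order)


def require_asymmetric_order(ops, order):
    """No pair is ordered in both directions (assumed definition)."""
    return all(not ((i, j) in order and (j, i) in order)
               for i, j in product(ops, repeat=2))


def is_common_total_order(ops, fb, order):
    return (require_weak_total_order(ops, fb, order)
            and require_transitive_order(ops, order)
            and require_asymmetric_order(ops, order))


ops = [0, 1, 2]
fb = {0: True, 1: True, 2: True}

print(is_common_total_order(ops, fb, {(0, 1), (1, 2), (0, 2)}))   # True
print(is_common_total_order(ops, fb, {(0, 1), (1, 2)}))           # False: 0 and 2 unordered
print(is_common_total_order(ops, fb, {(0, 1), (1, 0), (1, 2), (0, 2)}))  # False: cycle
```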

  14. Constraints for Control Flow • Treat control operations similarly to memory operations • Imagine “assigns” and “uses” of “control variables” • Add an auxiliary control variable ck for each branch statement k, and convert the if-statement to an auxiliary assign of ck • E.g., if (r1 > 0) becomes c1 = r1 > 0 • Every op k has a path predicate ctrExpr • k is a use of those control variables in ctrExpr • k is feasible if ctrExpr evaluates to true • Feasibility of ops is checked when setting the rules
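
To make the control-variable idea concrete, here is a minimal sketch that runs one thread in isolation and records which operations are feasible. It is an illustration under assumptions: the tuple layout `(kind, target, arg, path)`, the function name `run_thread`, and the choice to treat the path predicate as a plain conjunction of control variables are all hypothetical simplifications.

```python
def run_thread(ops, memory):
    """Return one feasibility flag per op for a sequential run of a single thread."""
    regs, ctrl, feasible = {}, {}, []
    for kind, target, arg, path in ops:
        # The path predicate: every control variable the op depends on must be true.
        ok = all(ctrl.get(c, False) for c in path)
        feasible.append(ok)
        if not ok:
            continue
        if kind == "rd":            # read a global into a register
            regs[target] = memory[arg]
        elif kind == "ctr":         # auxiliary assign of a control variable
            ctrl[target] = arg(regs)
        elif kind == "wr":          # write a constant to a global
            memory[target] = arg
    return feasible


# r1 = x; if (r1 > 0) y = 1;  with the branch converted to c1 = r1 > 0
THREAD_1 = [
    ("rd", "r1", "x", []),
    ("ctr", "c1", lambda regs: regs["r1"] > 0, []),
    ("wr", "y", 1, ["c1"]),
]

print(run_thread(THREAD_1, {"x": 0, "y": 0}))   # [True, True, False]
print(run_thread(THREAD_1, {"x": 1, "y": 0}))   # [True, True, True]
```

The guarded write is infeasible when the read returns 0, which is why rules that quantify over operations need the feasibility flag.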

  15. Read Value Rules • Consistent values across reads and writes • Intuitively, a read should see the “latest” write • Local data/control flow can be treated similarly to global reads • Global Reads: forall r = x, exists a latest write on x • Local Reads: forall x = r, exists a latest write on r • Control Reads: forall op that depends on c, exists a most recent write on c • requireReadValue ops order = globalReadValue ops order ∧ localReadValue ops order ∧ controlReadValue ops order
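
The global read rule can be sketched as a check over an order relation. This is a simplified illustration, not the authors' definition: it covers global reads only, ignores feasibility, and assumes every operation is a dictionary with `id`, `kind`, `var`, and `data` fields. The helper names are hypothetical. The example reuses the two-thread program from the Sequential Consistency slide, with two initializing writes added.

```python
def order_from_sequence(sequence):
    """Turn a total order, given as a list of op ids, into a set of (earlier, later) pairs."""
    return {(a, b) for k, a in enumerate(sequence) for b in sequence[k + 1:]}


def global_read_value(ops, order):
    """Every read must return the data of the latest preceding write to its variable."""
    for r in ops:
        if r["kind"] != "rd":
            continue
        writes = [w for w in ops if w["kind"] == "wr" and w["var"] == r["var"]]

        def is_latest(w):
            before_read = (w["id"], r["id"]) in order
            overwritten = any((w["id"], w2["id"]) in order and (w2["id"], r["id"]) in order
                              for w2 in writes)
            return before_read and not overwritten

        if not any(is_latest(w) and w["data"] == r["data"] for w in writes):
            return False
    return True


def make_ops(r1, r2):
    """Ops 1-2 initialize; ops 3-4 are Thread 1; ops 5-6 are Thread 2."""
    return [
        {"id": 1, "kind": "wr", "var": "x", "data": 0},
        {"id": 2, "kind": "wr", "var": "y", "data": 0},
        {"id": 3, "kind": "rd", "var": "x", "data": r1},
        {"id": 4, "kind": "wr", "var": "y", "data": 1},
        {"id": 5, "kind": "rd", "var": "y", "data": r2},
        {"id": 6, "kind": "wr", "var": "x", "data": 1},
    ]


order = order_from_sequence([1, 2, 3, 4, 5, 6])
print(global_read_value(make_ops(0, 1), order))   # True
print(global_read_value(make_ops(1, 1), order))   # False
```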

  16. Computation Rules • Program semantics • Not directly related to memory ordering, but needed for analyzing real code • Two types • Local variable computation: r1 = r2 + 1 • Control evaluation: c1 = r1 > 0 • Predicate evaluation follows standard program semantics • requireComputation ops order = ∀ i ∈ ops. (isControl i ⇒ data i = eval (ctrExpr i)) ∧ (isCmp i ⇒ data i = eval (cmpExpr i))

  17. Constraints of Race Conditions
      detectDataRace ops ≡ ∃ scOrder, hbOrder.
          legalSC ops scOrder ∧                    (under SC execution)
          requireHbOrder ops hbOrder scOrder ∧     (respects happens-before)
          mapConstraints ops hbOrder scOrder ∧     (mutually consistent)
          existDataRace ops hbOrder                (exists data-race)
      requireHbOrder ops hbOrder scOrder ≡
          requireProgramOrder ops hbOrder ∧
          requireSyncOrder ops hbOrder scOrder ∧
          requireTransitiveOrder ops hbOrder
      existDataRace ops hbOrder ≡ ∃ i, j ∈ ops.
          conflictingAccess i j ∧                  (conflicting)
          ¬(hbOrder i j) ∧ ¬(hbOrder j i)          (concurrent)
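
The happens-before part of these constraints can be sketched for a single, already chosen SC order. This is an assumption-laden illustration, not the authors' encoding: it does not search over scOrder, it takes synchronization order to mean "an unlock of a monitor before a later lock of the same monitor", and the `Op` record and function names are hypothetical.

```python
from collections import namedtuple

Op = namedtuple("Op", "id thread kind target")   # kind: rd, wr, lock, unlock


def happens_before(ops, sc_sequence):
    """Program order plus unlock-to-lock sync order, closed under transitivity."""
    position = {op_id: k for k, op_id in enumerate(sc_sequence)}
    hb = set()
    for a in ops:
        for b in ops:
            if position[a.id] >= position[b.id]:
                continue
            program_order = a.thread == b.thread
            sync_order = (a.kind, b.kind) == ("unlock", "lock") and a.target == b.target
            if program_order or sync_order:
                hb.add((a.id, b.id))
    changed = True
    while changed:                       # transitive closure
        changed = False
        for a, b in list(hb):
            for c, d in list(hb):
                if b == c and (a, d) not in hb:
                    hb.add((a, d))
                    changed = True
    return hb


def exist_data_race(ops, hb):
    """Return conflicting access pairs that are unordered by happens-before."""
    def conflicting(i, j):
        accesses = {"rd", "wr"}
        return (i.thread != j.thread and i.kind in accesses and j.kind in accesses
                and i.target == j.target and "wr" in (i.kind, j.kind))
    return [(i.id, j.id) for i in ops for j in ops
            if i.id < j.id and conflicting(i, j)
            and (i.id, j.id) not in hb and (j.id, i.id) not in hb]


locked = [Op(1, "T1", "lock", "m"), Op(2, "T1", "wr", "x"), Op(3, "T1", "unlock", "m"),
          Op(4, "T2", "lock", "m"), Op(5, "T2", "wr", "x"), Op(6, "T2", "unlock", "m")]
unlocked = [Op(1, "T1", "wr", "x"), Op(2, "T2", "wr", "x")]

print(exist_data_race(locked, happens_before(locked, [1, 2, 3, 4, 5, 6])))   # []
print(exist_data_race(unlocked, happens_before(unlocked, [1, 2])))           # [(1, 2)]
```

With the monitor in place, the unlock at op 3 orders the first write before the second; without it, the two writes are concurrent.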

  18. Making It Executable • Encode the test case as a Constraint Logic Program (CLP) that interprets executions and checks conformance with the rules • Implementation in FD-Prolog is straightforward • Universal quantification handled via enumeration • Existential quantification handled via backtracking • Built-in constraint solver from FD-Prolog: - Logical variables - Finite-domain (FD) variables

  19. How to Encode the Ordering Relation? • Precedence matrix M with rows i and columns j (pictured with every entry still x) • Values of entry Mij: 1: i is ordered before j; 0: i is not ordered before j; x: value not bound yet • The Method: • Given a test program with N operations, use a 2D precedence matrix with N² constraint variables • Interpret the symbolic execution and impose constraints on the 2D matrix • When interpretation finishes, the remaining x values reveal the latitude in the weak order • Once an x has changed to a 1, a later attempt to set it to 0 triggers backtracking
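
The three-valued matrix can be mimicked without a constraint solver. The sketch below is only an analogy: `None` stands for an unbound x, and an exception marks the point where a CLP system would backtrack. The class and method names are hypothetical, and binding Mji to 0 whenever Mij is set to 1 is an assumption that follows from asymmetry.

```python
class Conflict(Exception):
    """Raised where a constraint solver would backtrack."""


class PrecedenceMatrix:
    def __init__(self, n):
        self.n = n
        self.cells = [[None] * n for _ in range(n)]   # None plays the role of x

    def bind(self, i, j, value):
        """Bind M[i][j]; rebinding to a different value is a conflict."""
        current = self.cells[i][j]
        if current is not None and current != value:
            raise Conflict(f"M[{i}][{j}] is already {current}, cannot set it to {value}")
        self.cells[i][j] = value

    def order_before(self, i, j):
        """Record that i is ordered before j (and therefore j is not before i)."""
        self.bind(i, j, 1)
        self.bind(j, i, 0)

    def unbound(self):
        """Off-diagonal entries that are still x."""
        return [(i, j) for i in range(self.n) for j in range(self.n)
                if i != j and self.cells[i][j] is None]


m = PrecedenceMatrix(3)
m.order_before(0, 1)
m.order_before(1, 2)
print(m.unbound())            # [(0, 2), (2, 0)]

try:
    m.bind(0, 1, 0)           # contradicts the earlier 1
except Conflict as err:
    print("backtrack:", err)
```

The two entries left unbound show the remaining latitude: nothing recorded so far forces an order between operations 0 and 2 (a transitivity rule would have to add it).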

  20. Example of Prolog Implementation • Formal Specification (e.g., requireWeakTotalOrder):
      requireWeakTotalOrder ops order ≡ ∀ i, j ∈ ops. (fb i ∧ fb j ∧ id i ≠ id j) ⇒ (order i j ∨ order j i)
      • SICStus Prolog Code:
      % forEachElem (a helper not shown on the slide) is assumed to call elemProg for each pair I, J.
      requireWeakTotalOrder(Ops,Order,FbList):-
          forEachElem(Ops,Order,FbList,doWeakTotalOrder).

      % If I and J are distinct feasible operations, one must be ordered before the other.
      elemProg(doWeakTotalOrder,Ops,Order,FbList,I,J):-
          const(feasible,Feasible),
          length(Ops,N),
          matrix_elem(Order,N,I,J,Oij),
          matrix_elem(Order,N,J,I,Oji),
          nth(I,FbList,Fi),
          nth(J,FbList,Fj),
          (Fi #= Feasible #/\ Fj #= Feasible #/\ I #\= J) #=> (Oij #\/ Oji).

  21. Interactive and Incremental Analysis
      Test Program: Initially, x = y = 0.
          Thread 1: r1 = x; if (r1 > 0) y = 1;
          Thread 2: r2 = y; if (r2 >= 0) x = 1;
      Execution (ops):
          Tinit:    (1) wr(x,0);    (2) wr(y,0);
          Thread 1: (3) rd(x,r1);   (4) ctr(c1,[r1>0]);    (5) wr(y,1,[c1]);
          Thread 2: (6) rd(y,r2);   (7) ctr(c2,[r2>=0]);   (8) wr(x,1,[c2]);
      The adjacency matrix (row i, column j; 1 means i is ordered before j):
             1 2 3 4 5 6 7 8
          1  0 1 1 1 1 1 1 1
          2  0 0 1 1 1 1 1 1
          3  0 0 0 1 1 0 0 0
          4  0 0 0 0 1 0 0 0
          5  0 0 0 0 0 0 0 0
          6  0 0 1 1 1 0 1 1
          7  0 0 1 1 1 0 0 1
          8  0 0 1 1 1 0 0 0
      Possible interleaving: 1 2 6 7 8 3 4 5
      Data-race operations: 3 and 8
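
The matrix on this slide is exactly the "ordered before" relation of the interleaving 1 2 6 7 8 3 4 5, which the following sketch checks. The race check at the end is an illustration under assumptions: happens-before is taken to be program order within a thread plus "initialization precedes everything", since the program has no locks. Under those assumptions the pair (3, 8) named on the slide is reported, and the pair (5, 6) on y shows up as well. All names are hypothetical.

```python
SEQUENCE = [1, 2, 6, 7, 8, 3, 4, 5]        # interleaving from the slide

SLIDE_MATRIX = [
    [0, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 1, 1],
    [0, 0, 1, 1, 1, 0, 0, 1],
    [0, 0, 1, 1, 1, 0, 0, 0],
]

THREAD = {1: "init", 2: "init", 3: "T1", 4: "T1", 5: "T1", 6: "T2", 7: "T2", 8: "T2"}
ACCESS = {1: ("wr", "x"), 2: ("wr", "y"), 3: ("rd", "x"),
          5: ("wr", "y"), 6: ("rd", "y"), 8: ("wr", "x")}


def matrix_from_sequence(sequence):
    """Entry [i-1][j-1] is 1 exactly when op i comes before op j in the sequence."""
    n = len(sequence)
    position = {op: k for k, op in enumerate(sequence)}
    return [[1 if position[i] < position[j] else 0 for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def happens_before(i, j, matrix):
    """Assumed hb: i precedes j and is either in the same thread or an init write."""
    return matrix[i - 1][j - 1] == 1 and (THREAD[i] == THREAD[j] or THREAD[i] == "init")


matrix = matrix_from_sequence(SEQUENCE)
print(matrix == SLIDE_MATRIX)              # True

races = [(i, j) for i in ACCESS for j in ACCESS
         if i < j and ACCESS[i][1] == ACCESS[j][1]
         and "wr" in (ACCESS[i][0], ACCESS[j][0])
         and not happens_before(i, j, matrix) and not happens_before(j, i, matrix)]
print(races)                               # [(3, 8), (5, 6)]
```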

  22. Summary • Existing Approaches • Scalable, but not formal • Based on simplifying assumptions • Or formal, but not executable • Paper-and-pencil approach is error-prone • Our Approach • Formal and executable (although not quite as scalable) • Allows rigorous analysis of a program pattern under an exact memory model • Generic • Not limited to a specific synchronization mechanism • Can define concurrency properties under other memory models

  23. Rigorous Concurrency Analysis • (Diagram: a spectrum from more exhaustive to more scalable, with the dimensions intra-thread vs. inter-thread, intra-procedural vs. inter-procedural, and memory-model sensitive vs. memory-model insensitive)

  24. Thank You! Papers and prototype tools are available at http://www.cs.utah.edu/~yyang/research.html
