Formalisms and Verification for Transactional Memories • Vasu Singh, EPFL, Switzerland
Part 1: Verification for Pure Transactional Programs Part 2: Formalisms for Mixed Transactional Programs (Parametrized Opacity)
Pure Transactional Programs • All operations within transactions • No non-transactional operations
Interaction • Diagram: pure transactional program, transactional memory algorithm, hardware • Memory operations may be reordered by the hardware
Relaxed memory models • For performance reasons, hardware may reorder the instructions of a thread • Fences enforce ordering under relaxed memory models, but they carry a high performance overhead
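To make the effect of reordering concrete, here is a minimal sketch (not taken from the talk) that enumerates the outcomes of the classic two-thread "store buffering" test: once with every thread in program order (sequential consistency), and once allowing a write to be reordered with a later read of a different variable. The program, the names `THREAD_1`, `THREAD_2`, `outcomes`, and the way reordering is modelled (swapping the two operations of a thread) are illustrative assumptions.

```python
# Each operation is ("w", variable, value) or ("r", variable, register).
THREAD_1 = [("w", "x", 1), ("r", "y", "r1")]   # x := 1; r1 := y
THREAD_2 = [("w", "y", 1), ("r", "x", "r2")]   # y := 1; r2 := x


def interleavings(a, b):
    """Yield every merge of a and b that preserves the order inside each."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one schedule on a memory initialised to 0; return (r1, r2)."""
    memory, registers = {"x": 0, "y": 0}, {}
    for kind, var, arg in schedule:
        if kind == "w":
            memory[var] = arg
        else:
            registers[arg] = memory[var]
    return registers["r1"], registers["r2"]


def thread_orders(thread, allow_write_read_reorder):
    """Program order, plus the swapped order if a write may pass a later read."""
    orders = [list(thread)]
    first, second = thread
    if (allow_write_read_reorder and first[0] == "w" and second[0] == "r"
            and first[1] != second[1]):
        orders.append([second, first])
    return orders


def outcomes(allow_write_read_reorder):
    """Collect every (r1, r2) reachable under the chosen reordering rule."""
    results = set()
    for t1 in thread_orders(THREAD_1, allow_write_read_reorder):
        for t2 in thread_orders(THREAD_2, allow_write_read_reorder):
            for schedule in interleavings(t1, t2):
                results.add(run(schedule))
    return results


sc = outcomes(allow_write_read_reorder=False)
relaxed = outcomes(allow_write_read_reorder=True)
print(sorted(sc))            # outcomes with program order preserved
print(sorted(relaxed - sc))  # outcomes that appear only with reordering
```

In this sketch the only extra outcome under reordering is r1 = 0, r2 = 0; placing a fence between the write and the read of each thread would rule it out again.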
Verification Problem Does a given TM algorithm guarantee atomicity for every transactional program under a given memory model?
How do we verify? • Formalize common memory models • Capture the behavior of STM algorithms under relaxed memory models • Build a specification (of, say, opacity) at the level of hardware atomicity • Implement a tool that checks the correctness of an STM algorithm against the spec
Relaxed Memory Language • A new language to write concurrent programs under relaxed memory models • Syntax: Statements execute atomically in hardware • Semantics: Parametrized by the memory model M • We express TM algorithms in RML
Homework slide on RM 101 • How many possible valuations for r1, r2, r3, r4? • On SC: 7 • On TSO: 1 more • On PSO: 7 more • On RMO: 1 more • Program statements: A := 1 B := 1 r1 := D r3 := C A := 2 C := 1 D := 1 r2 := B r4 := A C := 2 • Manually: a few minutes, at least! • RML: less than a second on a dual-core 2.8 GHz machine
Our Tool: FOIL • Inputs: an RML description of an STM algorithm A, a memory model M, and a specification Spec • FOIL computes the language L(A,M) and checks: is L(A,M) a subset of Spec? • YES: A is correct under M • NO: check whether L(A,SC) is a subset of Spec • If that is also NO: A is not correct under SC • If it is YES: add a fence to A and repeat the check under M • Once the check under M succeeds: A is correct under M with fences at …
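The decision flow of the tool can be sketched as a short loop. This is only an illustration of the flow on the slide, not FOIL's implementation: the callback `language(fences, model)`, the list `candidate_fences`, the greedy one-at-a-time fence insertion, and the toy data are all hypothetical.

```python
def foil(language, spec, model, candidate_fences):
    """Sketch of the check-then-fence loop.

    `language(fences, model)` is a hypothetical callback returning the set of
    histories of algorithm A, with the given fences, under the given model.
    """
    no_fences = frozenset()
    if language(no_fences, model) <= spec:
        return "A is correct under %s" % model
    if not language(no_fences, "SC") <= spec:
        return "A is not correct under SC"
    fences = set()
    for fence in candidate_fences:      # greedy: add one fence at a time
        fences.add(fence)
        if language(frozenset(fences), model) <= spec:
            return "A is correct under %s with fences at %s" % (
                model, sorted(fences))
    return "no fence placement found among the candidates"


# Toy stand-in for L(A, M): under "PSO" a bad history appears unless the
# (made-up) fence "after_write" is present.
SPEC = {"h1", "h2"}


def toy_language(fences, model):
    histories = {"h1", "h2"}
    if model == "PSO" and "after_write" not in fences:
        histories.add("bad")
    return histories


print(foil(toy_language, SPEC, "SC", ["after_write"]))
print(foil(toy_language, SPEC, "PSO", ["after_write"]))
```

The greedy loop makes no attempt to minimise the number of fences; it only mirrors the "add fence, check again" cycle.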
Our experiments • Wrote DSTM, TL2, and McRT STM in RML without fences • Found the STM algorithms correct under SC and TSO • FOIL places the fences required for correctness under the more relaxed PSO and RMO • For TL2, the inserted fences match those in the official implementation
Mixed Transactional Programs • No formal framework • We try to define one
Mixed Transactional Programs • Example: atomic { x := 1; x := 2 } running concurrently with the non-transactional read r1 := x • Diagram: mixed transactional program, transactional memory algorithm, hardware, with a non-transactional interaction
A strong correctness property Strong atomicity / Strong isolation: Transactions are isolated from other transactions and non-transactional operations
A Common Quote • “Strong atomicity is expensive to achieve” • Questions: • What is strong atomicity, precisely? • How expensive is it?
Strong Atomicity • Precise part: • Transactions are isolated from other transactions, and also from non-transactional operations • Ambiguous part: • How do non-transactional operations interact with each other? • Two definitions: • Every non-transactional operation executes as a transaction (Larus et al.) • Non-transactional operations execute according to a relaxed memory model (Martin et al.)
Precise part 1 • Program: atomic { x := 1; x := 2 } running concurrently with r1 := x • Two possibilities for r1: r1 = 0 or r1 = 2 • Both are allowed by Larus and by Martin
Precise part 2 • Program: atomic { x := 1; r1 := x } running concurrently with x := 2 • Only possibility under both Larus and Martin: r1 = 1
Ambiguous part 1 • Program: atomic { x := 1; y := 1 } running concurrently with r1 := y; r2 := x • Allowed by both Larus and Martin: (r1 = 0, r2 = 0), (r1 = 0, r2 = 1), (r1 = 1, r2 = 1) • Allowed only by Martin: (r1 = 1, r2 = 0)
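The difference between the two definitions on this example can be checked with a small enumeration. The sketch below is an illustration under simplifying assumptions: the transaction is treated as one indivisible block, the "every non-transactional operation is a transaction" reading is modelled by keeping the two reads in program order, and the relaxed reading is modelled by also allowing the two reads to be swapped. The names `TRANSACTION`, `NON_TX_READS`, and `outcomes` are made up for the example.

```python
from itertools import permutations

TRANSACTION = [("w", "x", 1), ("w", "y", 1)]          # atomic { x := 1; y := 1 }
NON_TX_READS = [("r", "y", "r1"), ("r", "x", "r2")]   # r1 := y; r2 := x


def run(schedule):
    """Execute one schedule on a memory initialised to 0; return (r1, r2)."""
    memory, registers = {"x": 0, "y": 0}, {}
    for kind, var, arg in schedule:
        if kind == "w":
            memory[var] = arg
        else:
            registers[arg] = memory[var]
    return registers["r1"], registers["r2"]


def outcomes(read_orders):
    """Collect (r1, r2) over every placement of the isolated transaction."""
    results = set()
    for reads in read_orders:
        reads = list(reads)
        # The transaction runs as one block: before, between, or after the reads.
        for position in range(len(reads) + 1):
            schedule = reads[:position] + TRANSACTION + reads[position:]
            results.add(run(schedule))
    return results


in_order = outcomes([NON_TX_READS])                 # reads kept in program order
reordered = outcomes(permutations(NON_TX_READS))    # reads may be swapped
print(sorted(in_order))
print(sorted(reordered - in_order))
```

Under these assumptions the in-order reading yields three outcomes, and swapping the reads adds exactly r1 = 1, r2 = 0.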
Ambiguous part 2 • Program: atomic { x := 1 } with the non-transactional operations r1 := x and x := 2 • Can r1 be 42? • It depends on the memory model! If the memory model allows out-of-thin-air values, r1 can be 42
Motivation • Separate the concerns: • Memory model “contract” for non-tx • Strong atomicity for tx
Intuition • Opacity for transactions • Isolation of transactions from non-transactional operations • Non-transactional operations respect the memory model
Basically … • We want to know what is so expensive about strong atomicity • Does it require non-transactional operations to perform a long sequence of operations? • Does it require non-transactional operations to wait indefinitely for transactions to finish? • Is it impossible to achieve?
Uninstrumented TM • Let us study these first • No overhead on non-tx operations • What can we achieve? • Under what conditions?
Classes of memory models • Four classes, based on which reorderings are restricted: • RR: does not allow reordering two reads of different variables • RW: does not allow reordering a read followed by a write to a different variable • WR: does not allow reordering a write followed by a read of a different variable • WW: …
Examples • SC: RR, RW, WR, WW • PSO: RR, RW • RMO: RR_d, RW_d • Java: RW_d, RR_d • Alpha: RW_d
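One way to read the classification is as a lookup table from memory model to the reorderings it forbids. The sketch below transcribes the examples into a dictionary and adds a hypothetical helper `may_reorder`. It assumes that a `_d` suffix means the restriction applies only when the second operation depends on the first; that reading is not spelled out on the slide.

```python
# Restriction classes per model, transcribed from the examples.
# ASSUMPTION: "_d" means "restricted only for dependent operations".
MODELS = {
    "SC": {"RR", "RW", "WR", "WW"},
    "PSO": {"RR", "RW"},
    "RMO": {"RR_d", "RW_d"},
    "Java": {"RW_d", "RR_d"},
    "Alpha": {"RW_d"},
}


def may_reorder(model, first, second, dependent=False):
    """Return True if `model` lets `first` ("R" or "W") be reordered with a
    later `second` operation on a different variable."""
    pair = first + second
    restrictions = MODELS[model]
    if pair in restrictions:
        return False                      # unconditional restriction
    if dependent and pair + "_d" in restrictions:
        return False                      # restriction for dependent pairs only
    return True


print(may_reorder("SC", "W", "R"))                    # SC restricts WR
print(may_reorder("PSO", "W", "R"))                   # PSO lists only RR, RW
print(may_reorder("Alpha", "R", "W", dependent=True)) # RW_d applies
```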
NULL memory model • Every pair of operations can be reordered • NULL belongs to none of the classes RR, RW, WR, WW • Even the most relaxed memory models enforce an order between a load and a dependent store • The NULL memory model is not practical
Results • Parametrized opacity can be obtained with uninstrumented TMs only under the NULL memory model • For every non-NULL memory model, it is impossible to achieve parametrized opacity without instrumentation
Proof idea • Assume the memory model restricts the order of two instructions • Construct a counterexample history that violates opacity parametrized by this memory model • Do this for every possible restriction (RR, RW, WR, WW)
Instrumented TM • Change the semantics of non-transactional operations • Reads are no longer just loads and writes are no longer just stores • Used to make non-tx operations tx-aware • Example: • A non-transactional write must acquire the lock used by transactional writes before performing the write
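The locking example can be sketched as follows. This is a deliberately simplified illustration, not any particular TM's design: a single lock per variable stands in for the TM's write lock, and the class `InstrumentedVariable` and the function `demo` are hypothetical names. The transaction is atomic { x := 1; r1 := x } and the non-transactional thread repeatedly performs x := 2.

```python
import threading


class InstrumentedVariable:
    """A shared variable whose non-transactional writes are instrumented."""

    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()   # lock also used by transactional writes

    def atomic_write_then_read(self, new_value):
        """Transaction: atomic { x := new_value; r := x }; returns r."""
        with self.lock:
            self.value = new_value
            return self.value

    def non_tx_write(self, new_value):
        """Instrumented non-transactional write: acquire the lock first."""
        with self.lock:
            self.value = new_value


def demo(rounds):
    """Run the transaction against a concurrent non-tx writer; return the
    set of values the transaction observed for its own read."""
    x = InstrumentedVariable()
    writer = threading.Thread(
        target=lambda: [x.non_tx_write(2) for _ in range(rounds)])
    writer.start()
    observed = {x.atomic_write_then_read(1) for _ in range(rounds)}
    writer.join()
    return observed


print(demo(1000))
```

Because the non-transactional write waits for the lock, it can never land between the transaction's write and its read, so the transaction always reads back its own value. The cost is that the non-transactional write may have to wait for a transaction to release the lock.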
Example of instrumented TM [Shpeisman et al., PLDI 07] • Every object accessed in a transaction has a tx record • A tx record is in one of the states: shared, exclusive, private, exclusive anonymous • Tx operations as in common TMs • Non-tx read: check that no tx write interferes • Non-tx write: get exclusive access to the tx record • Expensive!
Instrumented TMs • The instrumented TM guarantees parametrized opacity with respect to SC! • You saw that it was “expensive”: a non-tx read or write may wait indefinitely for a tx to complete