This tutorial explores the importance of high-level language semantics in transaction success, covering topics such as isolation, weak and strong isolation techniques, restrictive type systems, and formal models for correctness proofs.
STM in Managed Runtimes: High-Level Language Semantics (MICRO 07 Tutorial)
Dan Grossman, University of Washington
2 December 2007
So…
Hopefully you’re convinced high-level language semantics is needed for transactions to succeed
First session (reviewed in the next 3 slides): focus on various notions of isolation
• A taxonomy of ways weak isolation can surprise you
• Ways to avoid surprises
  • Strong isolation
  • Restrictive type systems
Second session:
• Formal model for high-level definitions & correctness proofs
• Memory-model problems
• Integrating exceptions, I/O, and multithreaded transactions
Dan Grossman, MICRO Tutorial (STM Semantics)
Notions of isolation
• Strong-isolation: A transaction executes as though no other computation is interleaved
• Weak-isolation?
  • Single-lock (“weak-sla”): A transaction executes as though no other transaction is interleaved
  • Single-lock + abort (“weak undo”): Like weak-sla, but a transaction can abort/retry, undoing changes
  • Single-lock + lazy update (“weak on-commit”): Like weak-sla, but buffer updates until commit
  • Real contention: Like “weak undo” or “weak on-commit”, but multiple transactions can run at once
  • Catch-fire: Anything can happen if there’s a race
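To make the “weak undo” entry concrete, here is a minimal single-threaded sketch, assuming a heap modeled as a Python dict and a hypothetical `UndoTransaction` class (neither comes from the tutorial). It updates the heap in place and keeps an undo log, so a read made outside the transaction between a write and the abort can observe a value that, after the abort, was never committed.

```python
class UndoTransaction:
    """Eager-update transaction over a dict heap, with an undo log (illustrative only)."""

    def __init__(self, heap):
        self.heap = heap
        self.undo_log = []  # (key, old_value) pairs, oldest first

    def write(self, key, value):
        self.undo_log.append((key, self.heap[key]))
        self.heap[key] = value  # updated in place, so other code can see it

    def abort(self):
        # Undo in reverse order so the value from before the transaction is restored.
        for key, old in reversed(self.undo_log):
            self.heap[key] = old
        self.undo_log.clear()


heap = {"x": 0}
txn = UndoTransaction(heap)
txn.write("x", 1)
txn.write("x", 2)
dirty = heap["x"]   # a non-transactional read taken before the abort
txn.abort()
after = heap["x"]   # the heap as it looks once the abort has finished
```

Here `dirty` holds a speculative value while `after` is back to the original one, which is the kind of surprise strong isolation rules out.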
Partition
Surprises arose from the same mutable locations being used inside & outside transactions by different threads
Hopefully sufficient to forbid that
• But unnecessary and probably too restrictive
  • Bans publication and privatization
• cf. STM Haskell [PPoPP05]
For each allocated object (or word), require one of:
• Never mutated
• Only accessed by one thread
• Only accessed inside transactions
• Only accessed outside transactions
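The four-way requirement can be sketched as a check over a recorded access trace. This is a dynamic illustration, not the static type system the tutorial goes on to describe; the trace format and the function name `violates_partition` are assumptions made for the example.

```python
from collections import defaultdict


def violates_partition(trace):
    """Return the locations that fit none of the four allowed categories.

    trace: iterable of (location, thread_id, in_transaction, is_write) tuples.
    Initializing writes are assumed not to be recorded in the trace.
    """
    threads = defaultdict(set)
    inside = defaultdict(bool)
    outside = defaultdict(bool)
    mutated = defaultdict(bool)
    for loc, tid, in_txn, is_write in trace:
        threads[loc].add(tid)
        if in_txn:
            inside[loc] = True
        else:
            outside[loc] = True
        if is_write:
            mutated[loc] = True
    bad = set()
    for loc in threads:
        allowed = (
            not mutated[loc]             # never mutated
            or len(threads[loc]) == 1    # only accessed by one thread
            or not outside[loc]          # only accessed inside transactions
            or not inside[loc]           # only accessed outside transactions
        )
        if not allowed:
            bad.add(loc)
    return bad


# A privatization-style trace: thread 2 updates "node" inside a transaction,
# thread 1 unlinks it via "head" inside a transaction, then writes it outside.
trace = [
    ("node", 2, True, True),
    ("head", 1, True, True),
    ("node", 1, False, True),
    ("head", 2, True, False),
]
flagged = violates_partition(trace)
```

With this trace only `node` is flagged, which matches the slide's point that the partition bans privatization even when the programmer considers the idiom safe.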
Static partition
Recall our “what is a race” problem:
  initially x=0, y=0, z=0
  atomic { if(x<y) ++z; }
  atomic { ++x; ++y; }
  r = z; //race?
  assert(z==0);
So “accessed on valid control paths” is not enough
• Use a type system that conservatively assumes all paths are possible
Why formal models
Some really smart people didn’t anticipate the surprises
So maybe there are other surprises even with the partitioning type system
Increase our confidence by modeling various forms of isolation (as mini-languages) and proving them equivalent given the type system
• So far: weak-sla, weak undo
• Future work: weak on-commit, real contention, thread-local, immutable
A formal program state
  $a; H; e_1 \parallel \dots \parallel e_n$
$e$: a thread (an expression that runs and terminates)
$H$: a heap (maps mutable labels to values)
$a$: either $\circ$ or $\bullet$
• $\bullet$ means one thread is in a transaction
• $\circ$ means no thread is in a transaction
A high-level model for programmers & compiler-writers
• No TM implementation details!
Operational semantics
Execution is a series of steps from one state to another
• At each step, one thread runs some “instruction”
  $a; H; e_1 \parallel \dots \parallel e_n \;\to\; a'; H'; e_1' \parallel \dots \parallel e_n'$
Isolation amounts to using the $a$ to restrict interleavings:
• strong: If $a = \bullet$, only the transaction can touch $H$
• weak-sla: If $a = \bullet$, no other thread can start a transaction
• weak undo: Like weak-sla, but transactions log updates and can abort/retry by undoing them
  • $a$ returns to $\circ$ after the abort is complete
A family of languages
So “strong”, “weak-sla”, and “weak undo” are similar languages with different semantic rules
• The AtomsFamily
• Lots of Greek letters in the paper
Theorem: If $e_1, \dots, e_n$ type-check with our partition rules, then the set of states reachable from $a; H; e_1 \parallel \dots \parallel e_n$ is the same for strong, weak-sla, and weak undo.
• Not quite: weak undo has more transient states and can produce more garbage
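The flavor of the theorem can be checked on toy programs with a small state-space explorer. The sketch below is an illustration under stated assumptions, not the paper's formal model: a thread is a list of `("atomic", [ops])` or `("plain", op)` segments, each op is a `(target, compute)` pair executed as one indivisible step, and the `owner` field stands in for the transaction flag of the program state. The names `flatten`, `final_heaps`, and `y_values` are hypothetical.

```python
def flatten(thread):
    """Expand [("atomic", [ops]) | ("plain", op)] into (op, in_txn, ends_txn) steps."""
    steps = []
    for kind, body in thread:
        ops = body if kind == "atomic" else [body]
        for i, op in enumerate(ops):
            steps.append((op, kind == "atomic", i == len(ops) - 1))
    return steps


def final_heaps(threads, heap, mode):
    """Collect every final heap reachable under mode 'strong' or 'weak-sla'."""
    progs = [flatten(t) for t in threads]
    start = (None, tuple(sorted(heap.items())), (0,) * len(progs))
    stack, seen, finals = [start], set(), set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        owner, frozen, pcs = state  # owner: None, or the index of the thread in a transaction
        if all(pc == len(p) for pc, p in zip(pcs, progs)):
            finals.add(frozen)
            continue
        for i, prog in enumerate(progs):
            if pcs[i] == len(prog):
                continue
            (target, compute), in_txn, ends_txn = prog[pcs[i]]
            if owner is not None and owner != i:
                # strong: no other thread may run; weak-sla: no other transaction may run
                if mode == "strong" or in_txn:
                    continue
            h = dict(frozen)
            h[target] = compute(h)
            if in_txn:
                next_owner = None if ends_txn else i
            else:
                next_owner = owner
            next_pcs = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
            stack.append((next_owner, tuple(sorted(h.items())), next_pcs))
    return finals


def y_values(heaps):
    return sorted({dict(h)["y"] for h in heaps})


heap = {"x": 0, "y": 0}
writer = [("atomic", [("x", lambda h: 1), ("x", lambda h: h["x"] + 1)])]
plain_reader = [("plain", ("y", lambda h: h["x"]))]   # reads x outside any transaction
txn_reader = [("atomic", [("y", lambda h: h["x"])])]  # reads x inside a transaction

strong_plain = y_values(final_heaps([writer, plain_reader], heap, "strong"))
weak_plain = y_values(final_heaps([writer, plain_reader], heap, "weak-sla"))
weak_txn = y_values(final_heaps([writer, txn_reader], heap, "weak-sla"))
```

When `x` is read outside a transaction, weak-sla admits the intermediate value 1 that strong never exposes; once the read is moved inside a transaction, so that `x` is only accessed transactionally, the two modes agree on this example.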
Type-checking
Code can be used inside transactions, outside transactions, or both
Each memory location can be accessed only inside transactions or only outside transactions
Form of type-checking:
  $\varepsilon ::= \mathsf{ot} \mid \mathsf{wt} \mid \mathsf{both}$
  $\tau ::= \mathsf{int} \mid \tau *^{\varepsilon} \mid \dots$
  $\Gamma ::= \bullet \mid \Gamma, x{:}\tau$
  $\Gamma; \varepsilon \vdash e : \tau$
“Assuming variables in $\Gamma$ have those types, $e$ has type $\tau$ and stays on the side of the partition required by $\varepsilon$”
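Type-checking
  $\Gamma; \varepsilon \vdash e : \tau$
“Assuming variables in $\Gamma$ have those types, $e$ has type $\tau$ and stays on the side of the partition required by $\varepsilon$”
Three example rules (C-style syntax), specialized slightly to emphasize the partition:
$$\frac{\Gamma(x) = \tau *^{\varepsilon'}}{\Gamma; \varepsilon \vdash x : \tau *^{\varepsilon'}} \qquad \frac{\Gamma; \varepsilon \vdash e : \tau *^{\varepsilon}}{\Gamma; \varepsilon \vdash {*e} : \tau} \qquad \frac{\Gamma; \mathsf{wt} \vdash e : \tau}{\Gamma; \varepsilon \vdash \mathsf{atomic}\{e\} : \tau}$$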
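One way to read the three example rules is as a small recursive checker. The sketch below is an interpretation, not the paper's system: it assumes `ot` and `wt` mean “outside transactions” and “within transactions”, that every pointer type carries one of those two annotations, and it covers only variables, dereference, and `atomic`. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Int:
    pass


@dataclass(frozen=True)
class Ptr:
    target: object  # type of the pointed-to value
    side: str       # "ot" or "wt": the side of the partition the cell lives on


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Deref:
    expr: object


@dataclass(frozen=True)
class Atomic:
    body: object


class PartitionError(Exception):
    pass


def check(gamma, eps, e):
    """Return the type of e under context gamma and effect eps, or raise."""
    if isinstance(e, Var):
        return gamma[e.name]               # variable rule: look the type up in the context
    if isinstance(e, Deref):
        t = check(gamma, eps, e.expr)
        if not isinstance(t, Ptr):
            raise PartitionError("dereference of a non-pointer")
        if t.side != eps:                  # the cell's side must match the code's side
            raise PartitionError(f"{t.side} cell accessed by {eps} code")
        return t.target
    if isinstance(e, Atomic):
        return check(gamma, "wt", e.body)  # the body is checked as within-transaction code
    raise TypeError(f"unknown expression: {e!r}")


gamma = {"p": Ptr(Int(), "wt"), "q": Ptr(Int(), "ot")}
inside_ok = check(gamma, "ot", Atomic(Deref(Var("p"))))  # p is read inside atomic
outside_ok = check(gamma, "ot", Deref(Var("q")))         # q is read outside atomic
```

Under this reading, `check(gamma, "ot", Deref(Var("p")))` raises `PartitionError`, because a within-transaction cell is touched by outside-transaction code, and code checked with effect `both` cannot dereference either pointer.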
The proof
The proofs are dozens of pages and a few person-months (lest a skipped step hold a surprise)
But the high-level picture is illuminating…
(Diagram nodes: strong undo, strong, weak-sla, weak undo)
If possible in strong, then possible in weak-sla
• trivial: don’t ever violate isolation
If possible in weak-sla, then possible in weak undo
• trivial: don’t ever abort
If possible in weak-sla, then possible in strong
• Current transaction is serializable thanks to the type system (can permute with other threads)
• Earlier transactions serializable by induction
If possible in weak undo, then possible in weak-sla?
• Really need that abort is correct
• And that’s hard to show, especially with interleavings from weak isolation…
• Define strong undo for sake of the proof
• Can show abort is correct without interleavings
Why we formalize, redux
Thanks to the formal semantics, we:
• Had to make precise definitions
• Know we did not skip cases (at least in the model)
• Learned the essence of why the languages are equivalent under partition
  • Weak interleavings are serializable
  • Abort is correct
  • And these two arguments compose
Relaxed memory models
Modern languages don’t provide sequential consistency
• Lack of hardware support
• Prevents otherwise sensible & ubiquitous compiler transformations (e.g., copy propagation)
So safe languages need two complicated definitions
1. What is “properly synchronized”?
2. What can compiler and hardware do with “bad code”?
(Unsafe languages need only (1))
A flavor of simplistic ideas and the consequences…
Ordering
Can get “strange results” for bad code
• Need rules for what is “good code”
  initially x==0 and y==0
  Thread 1:
    x = 1;
    y = 1;
  Thread 2:
    r = y;
    s = x;
    assert(s>=r); //invalid
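Why the assertion is labeled invalid can be seen by enumerating interleavings. The sketch below assumes each statement is a single indivisible step and models a compiler or hardware reordering simply by swapping the first thread's two independent writes; the function name `outcomes` and the tuple encoding of statements are illustrative choices.

```python
from itertools import combinations


def outcomes(t1, t2):
    """All (r, s) results over every interleaving that keeps each thread's own order.

    Each statement is (destination, source); a string source names a variable,
    anything else is a constant.
    """
    n, m = len(t1), len(t2)
    results = set()
    for slots in combinations(range(n + m), n):  # positions where thread 1 runs
        state = {"x": 0, "y": 0, "r": 0, "s": 0}
        i1, i2 = iter(t1), iter(t2)
        for k in range(n + m):
            dst, src = next(i1) if k in slots else next(i2)
            state[dst] = state[src] if isinstance(src, str) else src
        results.add((state["r"], state["s"]))
    return results


writer = [("x", 1), ("y", 1)]
reader = [("r", "y"), ("s", "x")]

in_order = outcomes(writer, reader)         # sequentially consistent executions
reordered = outcomes(writer[::-1], reader)  # the two writes swapped
```

Every pair in `in_order` satisfies `s >= r`, while `reordered` also contains `(1, 0)`, the outcome that breaks the assertion once the writes are allowed to be reordered.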
Ordering
Can get “strange results” for bad code
• Need rules for what is “good code”
  initially x==0 and y==0
  Thread 1:
    x = 1;
    sync(lk){}
    y = 1;
  Thread 2:
    r = y;
    sync(lk){} //same lock
    s = x;
    assert(s>=r); //valid
Ordering
Can get “strange results” for bad code
• Need rules for what is “good code”
  initially x==0 and y==0
  Thread 1:
    x = 1;
    atomic{}
    y = 1;
  Thread 2:
    r = y;
    atomic{}
    s = x;
    assert(s>=r); //???
If this is good code, existing STMs are wrong
Ordering
Can get “strange results” for bad code
• Need rules for what is “good code”
  initially x==0 and y==0
  Thread 1:
    x = 1;
    atomic{z=1;}
    y = 1;
  Thread 2:
    r = y;
    atomic{tmp=0*z;}
    s = x;
    assert(s>=r); //???
“Conflicting memory” is a slippery, ill-defined slope
Lesson
It is not clear when transactions are ordered, but languages need memory models
Corollary: This could/should delay adoption of transactions in well-specified languages
I wish I had more answers.
Other operations
So far, every atomic block we have considered has only:
• read/written/allocated memory
• called functions
What about:
• I/O
• Exceptions (or first-class continuations)
• Spawning a thread
I/O
Can’t have irreversible actions in transactions that might abort
Need pragmatic, partial solutions such as:
• Forbid irreversible actions in transactions
  • Trivial extension of our partition type system
• Have unabortable transactions
• Make actions reversible
  • Buffer output
  • Buffer (idempotent) input
I After O
The real problem is input after output in a transaction
  atomic{
    write_to_file();
    read_from_file();
  }
Contents read cannot depend on how external world sees write if the write is buffered
Native mechanism
Can generalize: Require native code to have 2 versions
• Runtime calls 1 in transactions, 1 not in transactions
• Native code is responsible for keeping the 2 versions “the same”
Transactional versions also need callbacks for pre-commit, post-commit, and pre-abort
• Sufficient for buffering input and output
• Sufficient for external transaction systems
If “in transaction” version causes abort, that just encodes “safe dynamic failure/retry”
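A minimal sketch of the two-version idea with commit and abort callbacks, assuming a hypothetical `Transaction` object that only stores callback lists and a hypothetical output operation `write_line` with a plain and a transactional version. None of these names belong to a real runtime; the point is only that the transactional version buffers its output and releases it after commit.

```python
class Transaction:
    """Holds the callbacks a transactional native operation may register (illustrative)."""

    def __init__(self):
        self.pre_commit = []
        self.post_commit = []
        self.pre_abort = []

    def commit(self):
        for callback in self.pre_commit:
            callback()
        for callback in self.post_commit:
            callback()

    def abort(self):
        for callback in self.pre_abort:
            callback()


def write_line_plain(sink, text):
    """Version called outside transactions: act immediately."""
    sink.append(text)


def write_line_txn(txn, sink, text):
    """Version called inside transactions: buffer, then flush only after commit."""
    buffered = [text]
    txn.post_commit.append(lambda: sink.extend(buffered))
    txn.pre_abort.append(buffered.clear)  # discard the buffered output on abort


sink = []
aborted = Transaction()
write_line_txn(aborted, sink, "lost")
aborted.abort()
after_abort = list(sink)

committed = Transaction()
write_line_txn(committed, sink, "kept")
before_commit = list(sink)
committed.commit()
after_commit = list(sink)
```

The output of the aborted transaction never reaches `sink`, and the committed transaction's output appears only once `commit` has run, which is the buffering behavior the callbacks are meant to support.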
Exceptions
If code in atomic throws exception to outside atomic:
A. Does the transaction commit or abort?
B. Where does control transfer to?
Three “obvious” answers:
1. Commit transaction, transfer to exception handler
• My preference; exceptions in most HLLs are semantically just “non-local jumps”
• Preserves design goal that atomic has no effect on single-threaded programs
Exceptions
If code in atomic throws exception to outside atomic:
A. Does the transaction commit or abort?
B. Where does control transfer to?
Three “obvious” answers:
2. Abort transaction, transfer control to retry the transaction
• Turns exceptions into aborts
• Useful if exceptions due to shared-memory state
• But programmer can encode this
  atomic {
    try { s }
    catch (Throwable e) { abort; }
  }
Exceptions
If code in atomic throws exception to outside atomic:
A. Does the transaction commit or abort?
B. Where does control transfer to?
Three “obvious” answers:
3. Abort transaction, transfer to exception handler
• But the transaction never happened?!
• What if the exception value uses memory allocated/written by the aborted transaction?!
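The difference between answers 1 and 3 can be seen in a single-threaded model. The sketch below assumes a dict heap and a hypothetical `atomic` context manager that snapshots the heap on entry; it models only what the handler observes, not concurrency or retry.

```python
from contextlib import contextmanager


@contextmanager
def atomic(heap, on_exception):
    """Single-threaded model of an atomic block over a dict heap.

    on_exception: "commit" (answer 1) or "abort" (answer 3).
    """
    snapshot = dict(heap)
    try:
        yield heap
    except Exception:
        if on_exception == "abort":
            heap.clear()
            heap.update(snapshot)  # undo every write the block made
        raise                      # either way, control goes to the handler


def run(policy):
    heap = {"balance": 10}
    try:
        with atomic(heap, policy) as h:
            h["balance"] -= 3
            raise ValueError("thrown from inside the block")
    except ValueError:
        pass
    return heap["balance"]


committed_balance = run("commit")
aborted_balance = run("abort")
```

With `"commit"` the handler sees the block's write, so the block behaves exactly as it would without `atomic` in a single-threaded program; with `"abort"` the handler runs in a heap where the block's write never happened.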
Beyond exceptions
Other non-local jumps even harder to deal with
Example: Perhaps a coroutine jumps out of an atomic and then jumps back in
• Then probably the jump out should continue the transaction (commit or abort) later
It depends on “what you’re trying to do”, which is a problem if the same language feature (exceptions, continuations, etc.) is used for multiple idioms.
• Tough policy questions; mechanism pretty easy
Multithreaded transactions
What if code in atomic creates a new thread?
Easy answers:
• Dynamic failure
• Thread not runnable unless/until transaction commits
More interesting:
• Parallelism within transaction
  • Isolation and concurrency are orthogonal
  • Controversial(?) claim: Necessary due to Amdahl’s Law as core-count increases
Multithreaded transactions
(Semantics done; implementation is work in progress)
When does multithreaded transaction commit?
• After all spawned threads terminate
What is hard for programmers?
• Nested transactions now crucial for isolating parallel computations inside a larger transaction
What is hard for implementors?
• Transactional bookkeeping must be parallel
• Unclear how hardware could best help
If I had another 2 hours
Plenty more semantics to consider:
• Open-nesting semantics
• Message-passing within transactions
  • See recent work from Oregon, Purdue, UW
• atomic {s1} orelse {s2}
  • Try s2 if s1 aborts
• Fairness guarantees
• Obstruction-freedom
• …
Conclusions
• Weak isolation without type restrictions is surprising
• Interaction with other language features non-trivial
• PL-style semantics has a huge role to play in bringing transactions to high-level languages
  • An essential complement to the core algorithms, compiler, hardware work
  • Need “cross-cultural understanding” of the issues
wasp.cs.washington.edu