Software Transactions: A Programming-Languages Perspective Dan Grossman University of Washington 26 March 2008
Atomic An easier-to-use and harder-to-implement primitive

void deposit(int x){
  synchronized(this){   // lock acquire/release
    int tmp = balance;
    tmp += x;
    balance = tmp;
}}

void deposit(int x){
  atomic {              // (behave as if) no interleaved computation
    int tmp = balance;
    tmp += x;
    balance = tmp;
}}

Dan Grossman, Software Transactions
Viewpoints Software transactions good for: • Software engineering (avoid races & deadlocks) • Performance (optimistic “no conflict” without locks) Research should be guiding: • New hardware with transactional support • Software support • Semantic mismatch between language & hardware • Prediction: hardware for the common/simple case • May be fast enough without hardware • Lots of nontransactional hardware exists Dan Grossman, Software Transactions
PL Perspective Complementary to lower-level implementation work Motivation: • The essence of the advantage over locks Language design: • Rigorous high-level semantics • Interaction with rest of the language Language implementation: • Interaction with modern compilers • New optimization needs Answers urgently needed for the multicore era Dan Grossman, Software Transactions
Today, part 1 Language design, semantics: • Motivation: Example + the GC analogy [OOPSLA07] • Semantics: strong vs. weak isolation [PLDI07]* [POPL08] • Interaction w/ other features [ICFP05][SCHEME07][POPL08] * Joint work with Intel PSL Dan Grossman, Software Transactions
Today, part 2 Implementation: • On one core [ICFP05][SCHEME07] • Static optimizations for strong isolation [PLDI07]* • Multithreaded transactions * Joint work with Intel PSL Dan Grossman, Software Transactions
Code evolution

void deposit(…)  { synchronized(this) { … }}
void withdraw(…) { synchronized(this) { … }}
int  balance(…)  { synchronized(this) { … }}

Dan Grossman, Software Transactions
Code evolution

void deposit(…)  { synchronized(this) { … }}
void withdraw(…) { synchronized(this) { … }}
int  balance(…)  { synchronized(this) { … }}

void transfer(Acct from, int amt) {
  if(from.balance()>=amt && amt<maxXfer) {
    from.withdraw(amt);
    this.deposit(amt);
  }
}

Dan Grossman, Software Transactions
Code evolution

void deposit(…)  { synchronized(this) { … }}
void withdraw(…) { synchronized(this) { … }}
int  balance(…)  { synchronized(this) { … }}

void transfer(Acct from, int amt) {
  synchronized(this) { //race
    if(from.balance()>=amt && amt<maxXfer) {
      from.withdraw(amt);
      this.deposit(amt);
    }
  }
}

Dan Grossman, Software Transactions
Code evolution

void deposit(…)  { synchronized(this) { … }}
void withdraw(…) { synchronized(this) { … }}
int  balance(…)  { synchronized(this) { … }}

void transfer(Acct from, int amt) {
  synchronized(this) {
    synchronized(from){ //deadlock (still)
      if(from.balance()>=amt && amt<maxXfer) {
        from.withdraw(amt);
        this.deposit(amt);
      }
  }}
}

Dan Grossman, Software Transactions
Code evolution

void deposit(…)  { atomic { … }}
void withdraw(…) { atomic { … }}
int  balance(…)  { atomic { … }}

Dan Grossman, Software Transactions
Code evolution

void deposit(…)  { atomic { … }}
void withdraw(…) { atomic { … }}
int  balance(…)  { atomic { … }}

void transfer(Acct from, int amt) { //race
  if(from.balance()>=amt && amt<maxXfer) {
    from.withdraw(amt);
    this.deposit(amt);
  }
}

Dan Grossman, Software Transactions
Code evolution

void deposit(…)  { atomic { … }}
void withdraw(…) { atomic { … }}
int  balance(…)  { atomic { … }}

void transfer(Acct from, int amt) {
  atomic { //correct and parallelism-preserving!
    if(from.balance()>=amt && amt<maxXfer){
      from.withdraw(amt);
      this.deposit(amt);
    }
  }
}

Dan Grossman, Software Transactions
But can we generalize? So transactions sure look appealing… But what is the essence of the benefit? Transactional Memory (TM) is to shared-memory concurrency as Garbage Collection (GC) is to memory management Dan Grossman, Software Transactions
GC in 60 seconds [diagram: roots pointing into heap objects] • Allocate objects in the heap • Deallocate objects to reuse heap space • If too soon, dangling-pointer dereferences • If too late, poor performance / space exhaustion Automate deallocation via reachability approximation Dan Grossman, Software Transactions
GC Bottom-line Established technology with widely accepted benefits • Even though it can perform arbitrarily badly in theory • Even though you can’t always ignore how GC works (at a high-level) • Even though an active research area after 40+ years Now about that analogy… Dan Grossman, Software Transactions
The problem, part 1 Why memory management is hard: Balance correctness (avoid dangling pointers) And performance (space waste or exhaustion) Manual approaches require whole-program protocols Example: Manual reference count for each object • Must avoid garbage cycles [slide overlay swaps in the concurrency analogues: concurrent programming, race conditions, loss of parallelism, deadlock, lock, lock acquisition] Dan Grossman, Software Transactions
The problem, part 2 Manual memory-management is non-modular: • Caller and callee must know what each other accesses or deallocates to ensure the right memory is live • A small change can require wide-scale code changes • Correctness requires knowing what data subsequent computation will access [slide overlay swaps in the concurrency analogues: synchronization, release, locks are held, concurrent] Dan Grossman, Software Transactions
The solution Move whole-program protocol to language implementation • One-size-fits-most implemented by experts • Usually inside the compiler and run-time • GC system uses subtle invariants, e.g.: • Object header-word bits • No unknown mature pointers to nursery objects [slide overlay swaps in the TM analogues: TM, thread-shared, thread-local] Dan Grossman, Software Transactions
So far… Dan Grossman, Software Transactions
Incomplete solution GC a bad idea when “reachable” is a bad approximation of “cannot-be-deallocated” Weak pointers overcome this fundamental limitation • Best used by experts for well-recognized idioms (e.g., software caches) In extreme, programmers can encode manual memory management on top of GC • Destroys most of GC’s advantages [slide overlay swaps in the TM analogues: TM, memory conflict, run-in-parallel, open nested txns, unique-id generation, locking] Dan Grossman, Software Transactions
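Java supports the weak-pointer idiom directly through java.lang.ref.WeakReference; here is a minimal sketch of the software-cache use mentioned above (the WeakCache class and its stale-entry handling are invented for illustration, not part of the talk):

import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a weak-valued cache: being cached never prevents
// collection, so “reachable from the cache” no longer means
// “cannot be deallocated”.
class WeakCache<K, V> {
  private final Map<K, WeakReference<V>> map = new HashMap<>();

  void put(K key, V value) {
    map.put(key, new WeakReference<>(value));
  }

  V get(K key) {
    WeakReference<V> ref = map.get(key);
    if (ref == null) return null;
    V value = ref.get();                   // null if the GC already reclaimed it
    if (value == null) map.remove(key);    // drop the stale entry
    return value;
  }
}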
Circumventing TM

class SpinLock {
  private boolean b = false;

  void acquire() {
    while(true)
      atomic {
        if(b) continue;
        b = true;
        return;
      }
  }

  void release() {
    atomic { b = false; }
  }
}

Dan Grossman, Software Transactions
It really keeps going (see the essay) Dan Grossman, Software Transactions
Lesson Transactional memory is to shared-memory concurrency as garbage collection is to memory management Huge but incomplete help for correct, efficient software Analogy should help guide transactions research Dan Grossman, Software Transactions
Today, part 1 Language design, semantics: • Motivation: Example + the GC analogy [OOPSLA07] • Semantics: strong vs. weak isolation [PLDI07]* [POPL08] [Katherine Moore] • Interaction w/ other features [ICFP05][SCHEME07][POPL08] * Joint work with Intel PSL Dan Grossman, Software Transactions
“Weak” isolation Widespread misconception: “Weak” isolation violates the “all-at-once” property only if corresponding lock code has a race (May still be a bad thing, but smart people disagree.)

initially y==0

Thread 1:
atomic {
  y = 1;
  x = 3;
  y = x;
}

Thread 2:
x = 2;
print(y); //1? 2? 666?

Dan Grossman, Software Transactions
It’s worse Privatization: One of several examples where lock code works and weak-isolation transactions do not

initially ptr.f == ptr.g   [diagram: ptr points to an object with fields f and g]

Thread 1:
sync(lk) {
  r = ptr;
  ptr = new C();
}
assert(r.f==r.g);

Thread 2:
sync(lk) {
  ++ptr.f;
  ++ptr.g;
}

(Example adapted from [Rajwar/Larus] and [Hudson et al]) Dan Grossman, Software Transactions
It’s worse (Almost?) every published weak-isolation system lets the assertion fail! • Eager-update or lazy-update (e.g., with lazy update, thread 2’s buffered writes to f and g can be copied back after thread 1 has privatized the object and begun the assert, so the assert may observe one increment but not the other)

initially ptr.f == ptr.g   [diagram: ptr points to an object with fields f and g]

Thread 1:
atomic {
  r = ptr;
  ptr = new C();
}
assert(r.f==r.g);

Thread 2:
atomic {
  ++ptr.f;
  ++ptr.g;
}

Dan Grossman, Software Transactions
The need for semantics • Which is wrong: the privatization code or the transactions implementation? • What other “gotchas” exist? • What language/coding restrictions suffice to avoid them? • Can programmers correctly use transactions without understanding their implementation? • What makes an implementation correct? Only rigorous source-level semantics can answer Dan Grossman, Software Transactions
What we did Formal operational semantics for a collection of similar languages that have different isolation properties Program state allows at most one live transaction: a;H;e1 || … || en  →  a’;H’;e1’ || … || en’ Multiple languages, including: Dan Grossman, Software Transactions
What we did Formal operational semantics for a collection of similar languages that have different isolation properties Program state allows at most one live transaction: a;H;e1 || … || en  →  a’;H’;e1’ || … || en’ Multiple languages, including: 1. “Strong”: If one thread is in a transaction, no other thread may use shared memory or enter a transaction Dan Grossman, Software Transactions
What we did Formal operational semantics for a collection of similar languages that have different isolation properties Program state allows at most one live transaction: a;H;e1 || … || en  →  a’;H’;e1’ || … || en’ Multiple languages, including: 2. “Weak-1-lock”: If one thread is in a transaction, no other thread may enter a transaction Dan Grossman, Software Transactions
What we did Formal operational semantics for a collection of similar languages that have different isolation properties Program state allows at most one live transaction: a;H;e1 || … || en  →  a’;H’;e1’ || … || en’ Multiple languages, including: 3. “Weak-undo”: Like weak, plus a transaction may abort at any point, undoing its changes and restarting Dan Grossman, Software Transactions
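The step relation on these slides is only sketched in prose; the LaTeX below is a minimal rendering of its shape, under the assumption (mine, not stated in the slides) that the component a records which thread, if any, is inside the single live transaction:

% Sketch of the program-state shape and step relation suggested above.
% a says which thread (if any) holds the one live transaction; H is the
% shared heap; e_1 ... e_n are the n thread bodies.
\[
  a ::= \bullet \mid i
  \qquad\qquad
  a;\, H;\, e_1 \parallel \cdots \parallel e_n
  \;\longrightarrow\;
  a';\, H';\, e_1' \parallel \cdots \parallel e_n'
\]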
A family Now we have a family of languages: “Strong”: … other threads can’t use memory or start transactions “Weak-1-lock”: … other threads can’t start transactions “Weak-undo”: like weak, plus undo/restart So we can study how family members differ and conditions under which they are the same Oh, and we have a kooky, ooky name: The AtomsFamily Dan Grossman, Software Transactions
Easy Theorems Theorem: Every program behavior in strong is possible in weak-1-lock Theorem: weak-1-lock allows behaviors strong does not Theorem: Every program behavior in weak-1-lock is possible in weak-undo Theorem (slightly more surprising): weak-undo allows behavior weak-1-lock does not Dan Grossman, Software Transactions
Hard theorems Consider a (formally defined) type system that ensures any mutable memory is either: • Only accessed in transactions • Only accessed outside transactions Theorem: If a program type-checks, it has the same possible behaviors under strong and weak-1-lock Theorem: If a program type-checks, it has the same possible behaviors under weak-1-lock and weak-undo Dan Grossman, Software Transactions
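A rough source-level illustration of that discipline (the Counters class and field names are invented, and atomic is the slides’ construct rather than standard Java, so this is pseudocode):

// Sketch: every mutable field is used on only one side of the
// transaction boundary, which is exactly what the type system checks.
class Counters {
  int insideOnly  = 0;  // only ever read/written inside atomic blocks
  int outsideOnly = 0;  // only ever read/written outside atomic blocks

  void bumpInside()  { atomic { insideOnly++; } }  // ok: stays transactional
  void bumpOutside() { outsideOnly++; }            // ok: never transactional

  // Rejected by the type system: mixes the two worlds.
  // void bad() { atomic { outsideOnly++; } }
}

A program written this way falls under both theorems, so its possible behaviors coincide across strong, weak-1-lock, and weak-undo.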
A few months in 1 picture [diagram relating the family members: strong, strong-undo, weak-1-lock, weak-undo] Dan Grossman, Software Transactions
Lesson Weak isolation has surprising behavior; formal semantics lets us model the behavior and prove sufficient conditions for avoiding it In other words: With a (too) restrictive type system, we get the semantics of strong and the performance of weak Dan Grossman, Software Transactions
Today, part 1 Language design, semantics: • Motivation: Example + the GC analogy [OOPSLA07] • Semantics: strong vs. weak isolation [PLDI07]* [POPL08] • Interaction w/ other features [ICFP05][SCHEME07][POPL08] * Joint work with Intel PSL Dan Grossman, Software Transactions
What if… Real languages need precise semantics for all feature interactions. For example: • Native Calls [Ringenburg] • Exceptions [Ringenburg, Kimball] • First-class continuations [Kimball] • Thread-creation [Moore] • Java-style class-loading [Hindman] • Open: Bad interactions with memory-consistency model See joint work with Manson and Pugh [MSPC06] Dan Grossman, Software Transactions
Today, part 2 Implementation: • On one core [ICFP05] [SCHEME07] [Michael Ringenburg, Aaron Kimball] • Static optimizations for strong isolation [PLDI07]* • Multithreaded transactions * Joint work with Intel PSL Dan Grossman, Software Transactions
Interleaved execution The “uniprocessor (and then some)” assumption: Threads communicating via shared memory don't execute in “true parallel” Important special case: • Uniprocessors still exist • Many language implementations assume it (e.g., OCaml, Scheme48) • Multicore may assign one core to an application Dan Grossman, Software Transactions
Implementing atomic Key pieces: • Execution of an atomic block logs writes • If scheduler pre-empts during atomic, rollback the thread • Duplicate code or bytecode-interpreter dispatch so non-atomic code is not slowed by logging Dan Grossman, Software Transactions
Logging example Executing atomic block: • build LIFO log of old values:

int x=0, y=0;
void f() { int z = y+1; x = z; }
void g() { y = x+1; }
void h() {
  atomic {
    y = 2;
    f();
    g();
  }
}

[log picture, LIFO: y:0, z:?, x:0, y:2]

Rollback on pre-emption: • Pop log, doing assignments • Set program counter and stack to beginning of atomic On exit from atomic: • Drop log Dan Grossman, Software Transactions
Logging efficiency Keep the log small: • Don’t log reads (key uniprocessor advantage) • Need not log memory allocated after atomic entered • Particularly initialization writes • Need not log an address more than once • To keep logging fast, switch from array to hashtable after “many” (50) log entries Dan Grossman, Software Transactions
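Putting the preceding slides together, here is a minimal sketch of such a write log in Java (the WriteLog class and its int-array model of memory locations are invented for illustration; the cited implementations instrument OCaml and Scheme runtimes, not Java):

import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: LIFO log of (location, old value) pairs built inside an atomic block.
// A “location” is modeled here as an int array plus an index.
class WriteLog {
  private static final class Entry {
    final int[] cell; final int index; final int oldValue;
    Entry(int[] cell, int index, int oldValue) {
      this.cell = cell; this.index = index; this.oldValue = oldValue;
    }
  }

  private final Deque<Entry> entries = new ArrayDeque<>();

  // Called before each write inside the atomic block.
  void logWrite(int[] cell, int index) {
    entries.push(new Entry(cell, index, cell[index]));
  }

  // Pre-empted mid-transaction: pop the log, restoring old values in LIFO order.
  void rollback() {
    while (!entries.isEmpty()) {
      Entry e = entries.pop();
      e.cell[e.index] = e.oldValue;
    }
  }

  // Reached the end of the atomic block: just drop the log.
  void commit() {
    entries.clear();
  }
}

The refinements above (skipping initialization writes, not logging an address twice, switching from an array to a hashtable after roughly 50 entries) would all layer on top of logWrite.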
Evaluation Strong isolation on uniprocessors at little cost • See papers for “in the noise” performance • Memory-access overhead: recall initialization writes need not be logged • Rare rollback Dan Grossman, Software Transactions
Lesson Implementing transactions in software for a uniprocessor is so efficient it deserves special-casing Note: Don’t run other multicore services on a uniprocessor either Dan Grossman, Software Transactions
Today, part 2 Implementation: • On one core [ICFP05] [SCHEME07] • Static optimizations for strong isolation [PLDI07]* [Steven Balensiefer, Benjamin Hindman] • Multithreaded transactions * Joint work with Intel PSL Dan Grossman, Software Transactions
Strong performance problem Recall uniprocessor overhead: [performance figure omitted] With parallelism: [performance figure omitted] Dan Grossman, Software Transactions
Optimizing away strong’s cost New: static analysis for not-accessed-in-transaction data … [diagram of data categories: thread-local, not accessed in transaction, immutable] Dan Grossman, Software Transactions
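A hedged sketch of what the analysis buys (the Barriers helper and the field names are hypothetical; the real system instruments compiled code, not source like this): under strong isolation, even non-transactional accesses normally go through read/write barriers, and accesses the analysis proves touch only not-accessed-in-transaction (or thread-local, or immutable) data can skip them.

// Hypothetical barrier API: under strong isolation, ordinary code is
// instrumented so its memory accesses coordinate with running transactions.
class Barriers {
  static int readBarrier(int[] cell, int i) { /* coordinate with transactions */ return cell[i]; }
  static void writeBarrier(int[] cell, int i, int v) { /* coordinate with transactions */ cell[i] = v; }
}

class Example {
  int[] shared  = new int[1];  // may also be accessed inside transactions
  int[] logOnly = new int[1];  // analysis proves: never accessed in a transaction

  void outsideTransaction() {
    Barriers.writeBarrier(shared, 0, 42);     // barrier still required
    int v = Barriers.readBarrier(shared, 0);  // barrier still required
    logOnly[0] = v;                           // barrier removed by the analysis
  }
}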