Thread-Level Speculation as a Memory Consistency Protocol for Software DSM?
Marcelo Cintra
University of Edinburgh
http://www.dcs.ed.ac.uk/home/mc

Thread-Level Speculation (TLS)
• Speculatively run whole “threads” and backtrack if necessary
• Track data accesses to detect cross-thread “conflicting” memory accesses
• Buffer state of speculative threads and commit when appropriate
• Enforce some expected correct execution behavior
Dagstuhl Seminar - October 2003

Example 1: Speculative Parallelization
• Original code: sequential with non-decidable dependences
• Squash on data flow dependences

  for(i=0; i<100; i++) {
    … = A[L[i]]+…
    A[K[i]] = …
  }

  Iteration J:     … = A[4]+…   A[5] = ...
  Iteration J+1:   … = A[2]+…   A[2] = ...
  Iteration J+2:   … = A[5]+…   A[5] = ...
  RAW: iteration J+2 reads A[5], which iteration J writes
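
The squash condition on this slide can be illustrated with a small offline check. The sketch below is not the run-time mechanism of a TLS system (which detects conflicts as accesses happen); it only scans the index arrays L and K from the loop and reports the first iteration that reads an element written by an earlier iteration. The function name first_raw_violation and the parameter a_len (the length of A) are hypothetical.

```c
#include <stdlib.h>

/* Returns the first iteration that reads an element of A written by an
 * earlier iteration (a cross-iteration RAW dependence), -1 if there is
 * none, or -2 if memory cannot be allocated.
 * L[i] is the index read and K[i] the index written in iteration i. */
int first_raw_violation(const int *L, const int *K, int n, int a_len)
{
    int *last_writer = malloc((size_t)a_len * sizeof *last_writer);
    if (last_writer == NULL)
        return -2;

    for (int e = 0; e < a_len; e++)
        last_writer[e] = -1;            /* no iteration has written e yet */

    int violator = -1;
    for (int i = 0; i < n; i++) {
        if (last_writer[L[i]] != -1) {  /* read of a value produced earlier */
            violator = i;
            break;
        }
        last_writer[K[i]] = i;          /* the read precedes the write */
    }

    free(last_writer);
    return violator;
}
```

With the three iterations shown on the slide (reads of A[4], A[2], A[5] and writes of A[5], A[2], A[5]), the third iteration is the one reported, since it reads A[5] after the first iteration wrote it.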

Example 2: Speculative Synchronization [Martinez and Torrellas, ASPLOS02]
• Original code: parallel with locks and barriers
• Squash on conflicting accesses

  Thread A:   acquire   … = A[4]+…   A[5] = …   release
  Thread B:   acquire   … = A[2]+…   A[2] = …   release
  Thread C:   acquire   … = A[5]+…   A[5] = …   release
  RAW and WAW: Thread C reads and writes A[5], which Thread A writes

Example 2: Speculative Synchronization
• Non-conflicting memory operations can be performed out of order
• Conflicting memory operations eventually complete in order after rollback
• Relaxes the order of non-conflicting memory operations while still providing the release consistency (RC) abstraction
• At release/commit all pending stores must complete
TLS is used to enforce RC in a more “relaxed” way by means of speculation and rollback

Outline
• Background and motivation
• A TLS-based protocol for software DSM
• Summary
• Related work
• Conclusions

LRC Consistency Protocol
• Block on acquires and wait for lock
• Obtain lock along with invalidations
• On load page fault, allocate local page and get diff update
• On store page fault, generate twin copy
• On release, compare twin and private copy to generate diff; send invalidations and lock to next thread in line
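
The twin and diff steps above can be sketched as follows. This is a minimal illustration, assuming a page is just a small array of words; PAGE_WORDS, diff_entry, make_twin, make_diff and apply_diff are hypothetical names, and a real software DSM would work on full virtual-memory pages and trap the faults through the operating system.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_WORDS 8   /* hypothetical tiny "page", for illustration only */

typedef struct {
    size_t index;      /* which word of the page changed */
    int    value;      /* its new value */
} diff_entry;

/* On the first store fault: keep a pristine copy (the twin) of the page. */
void make_twin(const int *page, int *twin)
{
    memcpy(twin, page, PAGE_WORDS * sizeof *page);
}

/* On release: compare the page with its twin and record the modified words.
 * `out` must have room for PAGE_WORDS entries. Returns the entry count. */
size_t make_diff(const int *page, const int *twin, diff_entry *out)
{
    size_t n = 0;
    for (size_t i = 0; i < PAGE_WORDS; i++) {
        if (page[i] != twin[i]) {
            out[n].index = i;
            out[n].value = page[i];
            n++;
        }
    }
    return n;
}

/* At the thread that faults on the page later: patch its copy with the diff. */
void apply_diff(int *page, const diff_entry *d, size_t n)
{
    for (size_t k = 0; k < n; k++)
        page[d[k].index] = d[k].value;
}
```

Shipping only the changed words, rather than the whole page, is what lets several writers to the same page be reconciled at the next acquire.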

Example LRC Operation
  Thread A:   acquire   … = A[4]+…   …   A[5] = …   release
  Thread B:   acquire   … = A[2]+…   …   A[2] = …   release
  Thread C:   acquire   … = A[5]+…   …   A[5] = …   release
  Figure annotations: Generate diff; Obtain diff from Thread A

TLS-based Consistency Protocol
• On load or write miss, allocate local page and twin copy
• Expand loads and stores to keep a record of the accesses to individual fields of shared objects
• On commit
  • Wait for “diff” from non-speculative thread
  • Check for violations
  • Merge “diff”s and pass to next speculative thread in line
• If violation detected
  • Incorporate received “diff” into twin copy and discard local copy
  • Discard own “diff”
  • Discard some private data (may require extra buffering)
  • Re-execute
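
The recovery steps under "If violation detected" might look like the sketch below. It assumes the received “diff” arrives as a per-field state array plus the committed values, which the slides do not specify; spec_page, squash, recv_state and recv_values are hypothetical names, and discarding private data outside the shared page is not modeled.

```c
#include <string.h>

#define NFIELDS 8   /* hypothetical number of fields in the shared object */

typedef enum { NotAccessed, Loaded, Modified } access_state;

typedef struct {
    int          local[NFIELDS];  /* speculative working copy */
    int          twin[NFIELDS];   /* last known committed values */
    access_state sa[NFIELDS];     /* this thread's own "diff" */
} spec_page;

/* Roll a speculative thread back after a violation. */
void squash(spec_page *p, const access_state *recv_state, const int *recv_values)
{
    /* Incorporate the received "diff" into the twin copy. */
    for (int i = 0; i < NFIELDS; i++)
        if (recv_state[i] == Modified)
            p->twin[i] = recv_values[i];

    /* Discard the local copy: restart from the updated twin. */
    memcpy(p->local, p->twin, sizeof p->local);

    /* Discard own "diff": no field has been accessed yet on re-execution. */
    for (int i = 0; i < NFIELDS; i++)
        p->sa[i] = NotAccessed;
}
```

After squash returns, the thread would re-execute its speculative section against the refreshed local copy.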

TLS “diff” and Violations
• 3 possible states for each field of shared object:
  • NotAccessed: thread did not touch this field
  • Loaded: thread loaded this field but did not store to it
  • Modified: thread stored to this field and possibly loaded it
• Violation and merging of “diff”s (rows: state in non-speculative thread; columns: state in speculative thread)

  Non-spec \ Speculative   NotAccessed   Loaded        Modified
  NotAccessed              NotAccessed   NotAccessed   Modified
  Loaded                   NotAccessed   NotAccessed   Modified
  Modified                 Modified      Violation     Violation
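
The violation and merge rules can be written as a small function. The sketch reads the slide's table as follows: a field that the non-speculative thread Modified is a violation if the speculative thread Loaded or Modified it; otherwise the merged state is Modified if either thread Modified the field, and NotAccessed in all remaining cases. The names merge_field and check_and_merge are hypothetical.

```c
typedef enum { NotAccessed, Loaded, Modified } access_state;
typedef enum { MergedNotAccessed, MergedModified, Violation } merge_result;

/* One cell of the table: non-speculative state (row), speculative state (column). */
merge_result merge_field(access_state nonspec, access_state spec)
{
    if (nonspec == Modified)
        return spec == NotAccessed ? MergedModified : Violation;
    return spec == Modified ? MergedModified : MergedNotAccessed;
}

/* Check a speculative "diff" against the non-speculative one, field by field.
 * Returns 1 and fills `merged` if there is no violation; returns 0 on the
 * first violation, in which case `merged` must not be used. */
int check_and_merge(const access_state *nonspec, const access_state *spec,
                    access_state *merged, int n)
{
    for (int i = 0; i < n; i++) {
        merge_result r = merge_field(nonspec[i], spec[i]);
        if (r == Violation)
            return 0;
        merged[i] = (r == MergedModified) ? Modified : NotAccessed;
    }
    return 1;
}
```

Loaded never appears in a merged “diff” under this reading: the next thread in line only needs to know which fields its predecessors modified.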

Example TLS DSM Operation
  Thread A:   TLS_start   … = A[4]+… [TLS_load]   …   A[5] = … [TLS_store]   TLS_end
  Thread B:   TLS_start   … = A[2]+… [TLS_load]   …   A[2] = … [TLS_store]   TLS_end
  Thread C:   TLS_start   … = A[5]+… [TLS_load]   …   A[5] = … [TLS_store]   TLS_end
  Figure annotations, in slide order:
  • No need to update “diff”
  • Update “diff” to have A[2] as Loaded
  • Get page with stale data
  • Update “diff” to have A[5] as Modified
  • Thread B at TLS_end: Wait for non-spec (A) to finish. Obtain “diff” from A. Compare “diff” with own “diff”. No violations, so become non-spec. Merge “diff”s
  • Thread C at TLS_end: Wait for non-spec (B) to finish. Obtain “diff” from B. Compare “diff” with own “diff”. Violation detected.

Example Implementation
• TLS_load:
    if (SA[i]==NotAccessed) SA[i]=Loaded
• TLS_store:
    SA[i]=Modified
• TLS_start:
  • Try to acquire lock with a non-blocking operation
  • If successful then become non-speculative
  • Otherwise get a place in line for the lock, and execute speculatively
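
The TLS_load and TLS_store fragments can be wrapped into complete access routines. The sketch below keeps the slide's SA state array and the names TLS_load and TLS_store, but the shared_obj struct, the NFIELDS size and the function signatures are assumptions made for illustration; TLS_start is not modeled.

```c
#include <stddef.h>

#define NFIELDS 8   /* hypothetical number of fields in the shared object */

typedef enum { NotAccessed, Loaded, Modified } access_state;

typedef struct {
    int          A[NFIELDS];   /* the shared data (local speculative copy) */
    access_state SA[NFIELDS];  /* per-field access record, the thread's "diff" */
} shared_obj;

/* Instrumented load: record the first touch as Loaded, then read. */
int TLS_load(shared_obj *o, size_t i)
{
    if (o->SA[i] == NotAccessed)
        o->SA[i] = Loaded;     /* a Modified field stays Modified */
    return o->A[i];
}

/* Instrumented store: always mark the field Modified, then write. */
void TLS_store(shared_obj *o, size_t i, int value)
{
    o->SA[i] = Modified;
    o->A[i] = value;
}
```

A statement such as A[5] = A[5] + 1 would then be expanded to TLS_store(&o, 5, TLS_load(&o, 5) + 1), leaving SA[5] as Modified.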

Example Implementation
• TLS_end:
  • If non-speculative then “pass” lock to next thread in line; next thread becomes non-speculative
  • Else, if next thread waiting for lock then
    • Wait for non-speculative to finish
    • Get “diff” from non-speculative thread
    • Check for violations
    • Merge “diff”s
    • “Pass” lock to next thread in line
  • Else, wait for lock

Will It Work?
• Overheads
  • Augmented loads and stores
  • Both speculative parallelization and optimistic concurrency control in software have been done successfully
  • Compiler instrumentation for write trapping in DSM is not so bad [Adve et al., HPCA96]
  • Serialization of commits
• Implementation
  • Hopefully not much more complex than a software DSM
  • Use source code augmentation and user help
• Applications
  • Irregular applications with little overlap of modifications in critical sections
  • Easy to switch back to normal DSM operation

Related Work
Speculative Synchronization:
• Martinez and Torrellas (ASPLOS 2002); Rajwar and Goodman (MICRO 2001)
  • Hardware-based
Optimistic Concurrency Control and Software Transactional Memory:
• Herlihy (ACM TODS 1990); Kung and Robinson (ACM TODS 1981)
  • Source-code level speculation for transaction processing
• Shavit and Touitou (PODC 1995); Herlihy et al. (PODC 2003)
  • Run-time system speculation on top of hardware coherent systems

Related Work
Speculation and consistency models:
• Gniady, Falsafi, and Vijaykumar (ISCA 1999)
  • SC plus speculation in hardware
  • Speculation only within instruction window and ld/st queue

Related Work
Software Speculative Parallelization:
• Dang, Yu, and Rauchwerger (IPDPS 2002); Rundberg and Stenström (WSSMM 2000); Cintra and Llanos (PPoPP 2003)
  • Speculative parallelization at source-code level
• Papadimitriou and Mowry (CMU-CS-01-145)
  • Speculative parallelization on software DSM protocol

Related Work
Software DSM systems:
• TreadMarks: Amza et al. (IEEE Computer 1996)
  • Lazy RC (LRC)
• Midway: Bershad, Zekauskas, and Sawdon (CompCon 1993)
  • Entry Consistency (EC)
• Adve et al. (HPCA 1996)
  • Compared LRC versus EC
  • Compared twinning versus compiler instrumentation for write trapping

Conclusions and Future Work
• TLS can provide RC with more relaxed synchronization
• Hardware speculative synchronization and software speculative parallelization have been successful
• Must find applications
• Must perform detailed performance evaluation
• ?