Foundations of the C++ Concurrency Memory Model
Hans-J. Boehm (HP Laboratories), Sarita V. Adve (UIUC)
Multithreaded applications • Written in single-threaded languages (C, C++) • Compilers are not thread-aware • OS thread libraries (e.g., pthreads) are used to prevent data races
A need for semantics of multithreaded C++ programs (1) • Informal specifications of current thread libraries are ambiguous • What exactly is a data race?
A need for semantics of multithreaded C++ programs (2) • Common compiler transformations can violate the expected semantics of multithreaded code • Compilers need to be thread-aware
A need for semantics of multithreaded C++ programs (3) • Current facilities for (lock-free) atomic data access are expensive and not portable • __sync intrinsics (gcc) • Interlocked functions (Microsoft)
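As a rough illustration of the portable alternative this work led to, the sketch below uses `std::atomic` from C++11 in place of compiler-specific intrinsics. The helper name `count_with_atomics` and its parameters are hypothetical and chosen only for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: several threads increment one shared counter.
// std::atomic makes each increment indivisible without a lock.
int count_with_atomics(int num_threads, int increments_per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1);  // sequentially consistent by default
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load();
}
```

Calling `count_with_atomics(4, 10000)` should return 40000 on any conforming implementation, since no increment can be lost.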
Memory Consistency Models • What value should a read of a shared variable return in a multithreaded program? • Affects performance, portability, and programmability • Sequential consistency • Simplest model • Single total order over all memory operations, consistent with program order within each thread • Restricts compiler and hardware optimizations • Relaxed models • Difficult for programmers to understand • Still limit some compiler optimizations
Data-Race-Free Models • Correct programs are data-race-free in every sequentially consistent execution • Sequential consistency is guaranteed for correct programs only • Unlike Java, which also gives semantics to programs with races • Advantages: • Simple programming model (sequential consistency) • High performance (for correct programs) • Data-race-free-0 model: a data race is two concurrent conflicting accesses • More flexible refinements are possible with more information from the programmer
C++ model • Based on the data-race-free-0 model • Undefined semantics for programs with data races • No benign data races • Sequentially consistent synchronization operations (atomics) • Write atomicity
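A minimal sketch of what a data-race-free program looks like under this model, assuming C++11 `std::atomic` with its default sequentially consistent ordering. The function name `pass_message` and the variables `payload` and `ready` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example: 'payload' is ordinary data, 'ready' is a
// synchronization variable. Every access to 'payload' is ordered by the
// atomic store/load pair, so the program has no data race.
int pass_message() {
    int payload = 0;
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;       // ordinary write
        ready.store(true);  // sequentially consistent store
    });
    std::thread consumer([&] {
        while (!ready.load()) {         // sequentially consistent load
            std::this_thread::yield();
        }
        seen = payload;     // ordered after the producer's write
    });

    producer.join();
    consumer.join();
    return seen;
}
```

If `ready` were a plain `bool`, the two threads would race on it and the program would have undefined semantics under the model above.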
Abuse of trylocks • The assertion may fail if T1's code is reordered • An expensive memory fence could prevent the reordering • Alternative: allow trylock to fail even if the lock is available
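The slide's code is not reproduced in this text, so the following is only a sketch of the kind of idiom it refers to: T2 treats a failed trylock as proof that T1 has already written `x`. To keep the sketch free of undefined behavior, `x` is a relaxed atomic here (an assumption; the idiom is usually shown with a plain variable), and `t2_done` is a hypothetical flag added so that T1 releases the lock in the thread that took it.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Returns the value of x that T2 observed after its try_lock failed.
int trylock_idiom() {
    std::mutex l;
    std::atomic<int> x{0};
    std::atomic<bool> t2_done{false};  // hypothetical helper flag
    int observed = -1;

    std::thread t2([&] {
        // Spin until try_lock fails, assuming that means T1 holds the lock.
        while (l.try_lock()) {
            l.unlock();
        }
        // The idiom expects 42 here, but a failed try_lock does not
        // synchronize with T1, and try_lock may also fail spuriously.
        observed = x.load(std::memory_order_relaxed);
        t2_done.store(true);
    });

    // T1: write x, then take the lock to "signal" T2.
    x.store(42, std::memory_order_relaxed);
    l.lock();
    while (!t2_done.load()) {
        std::this_thread::yield();
    }
    l.unlock();

    t2.join();
    return observed;
}
```

The result is either 0 or 42; most runs will likely show 42, but nothing in the language guarantees it, which is why the idiom counts as abuse.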
Relaxing Write Atomicity • Independent-Reads-Independent-Writes (IRIW) example • Single-core/single-threaded processors • Systems with ownership-based invalidation protocols • Invalid for SMT and multicore processors
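The IRIW pattern can be sketched as a litmus test with sequentially consistent atomics. With write atomicity, the two readers cannot see the two independent writes in opposite orders. The names `IriwResult`, `run_iriw`, and `iriw_forbidden` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct IriwResult {
    int r1, r2, r3, r4;
};

// Two independent writers and two independent readers.
IriwResult run_iriw() {
    std::atomic<int> x{0}, y{0};
    IriwResult r{0, 0, 0, 0};

    std::thread w1([&] { x.store(1); });
    std::thread w2([&] { y.store(1); });
    std::thread rd1([&] { r.r1 = x.load(); r.r2 = y.load(); });
    std::thread rd2([&] { r.r3 = y.load(); r.r4 = x.load(); });

    w1.join();
    w2.join();
    rd1.join();
    rd2.join();
    return r;
}

// True if reader 1 saw x before y while reader 2 saw y before x.
// Sequentially consistent atomics rule this outcome out.
bool iriw_forbidden(const IriwResult& r) {
    return r.r1 == 1 && r.r2 == 0 && r.r3 == 1 && r.r4 == 0;
}
```

On hardware without write atomicity, ruling out the forbidden outcome is what makes these atomics costly to implement.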
Write-to-Read Causality • Relaxing write atomicity violates sequential consistency • [Figure: threads T1, T2, T3 and two L1 caches]
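A sketch of the write-to-read causality pattern, assuming sequentially consistent `std::atomic` operations. Spin loops are used instead of single conditional reads so that the outcome is deterministic; that choice, and the name `run_wrc`, are assumptions of this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// T1 writes x. T2 sees x and then writes y. T3 sees y and then reads x.
// Causality requires T3 to see T1's write as well.
int run_wrc() {
    std::atomic<int> x{0}, y{0};
    int r = -1;

    std::thread t1([&] { x.store(1); });
    std::thread t2([&] {
        while (x.load() != 1) {
            std::this_thread::yield();
        }
        y.store(1);
    });
    std::thread t3([&] {
        while (y.load() != 1) {
            std::this_thread::yield();
        }
        r = x.load();  // must observe 1
    });

    t1.join();
    t2.join();
    t3.join();
    return r;
}
```

If T1's write could become visible to T2 before it is visible to T3, T3 could read 0 here, which no sequentially consistent execution allows.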
Conclusion • Simple programming model • Sequential consistency for data-race-free programs • Breaks abusive programming idioms (e.g., trylocks) • Hardware implications • Write atomicity cannot be easily relaxed • Sequentially consistent atomics • AMD64 and Intel 64 map atomic stores to the atomic xchg instruction • Compilers • Special handling of atomic types • No register promotion • No rewriting of adjacent structure fields
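The last compiler point can be illustrated with a small sketch. In C++11, distinct non-bit-field members are separate memory locations, so two threads may update adjacent fields without a data race, and the compiler must not implement one update as a read-modify-write of the neighbouring field. The names `Flags` and `update_adjacent_fields` are hypothetical.

```cpp
#include <cassert>
#include <thread>

struct Flags {
    char a;
    char b;  // adjacent to 'a', but a separate memory location
};

// Each thread writes only its own field, so there is no data race.
Flags update_adjacent_fields() {
    Flags f{0, 0};
    std::thread t1([&f] { f.a = 1; });
    std::thread t2([&f] { f.b = 2; });
    t1.join();
    t2.join();
    return f;
}
```

A compiler that stored to `f.a` by loading and rewriting the whole struct could overwrite the other thread's update to `f.b`, which is the transformation the model rules out.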