1.32k likes | 1.47k Views
Transactional Memory. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit. TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: A A A A A. Our Vision for the Future. In this course, we covered …. Best practices ….
E N D
Transactional Memory Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAAAA
Our Vision for the Future In this course, we covered …. Best practices … New and clever ideas … And common-sense observations. Art of Multiprocessor Programming 2
Our Vision for the Future In this course, we covered …. Nevertheless … Best practices … Concurrent programming is still too hard … New and clever ideas … Here we explore why this is …. And common-sense observations. And what we can do about it. Art of Multiprocessor Programming 3
Locking Art of Multiprocessor Programming 4
Coarse-Grained Locking Easily made correct … But not scalable. Art of Multiprocessor Programming 5
Fine-Grained Locking Can be tricky … Art of Multiprocessor Programming 6
Locks are not Robust If a thread holding a lock is delayed … No one else can make progress 7 Art of Multiprocessor Programming
Locking Relies on Conventions • Relation between • Lock bit and object bits • Exists only in programmer’s mind Actual comment from Linux Kernel (hat tip: Bradley Kuszmaul) • /* • * When a locked buffer is visible to the I/O layer • * BH_Launder is set. This means before unlocking • * we must clear BH_Launder,mb() on alpha and then • * clear BH_Lock, so no reader can see BH_Launder set • * on an unlocked buffer and then risk to deadlock. • */ Art of Multiprocessor Programming
Simple Problems are hard enq(y) enq(x) double-ended queue No interference if ends “far apart” Interference OK if queue is small Clean solution is publishable result: [Michael & Scott PODC 97] Art of Multiprocessor Programming 9
Locks Not Composable Transfer item from one queue to another Must be atomic : No duplicate or missing items Art of Multiprocessor Programming 10
Locks Not Composable Lock source Unlock source & target Lock target Art of Multiprocessor Programming 11
Locks Not Composable Lock source Methods cannot provide internal synchronization Unlock source & target Objects must expose locking protocols to clients Lock target Clients must devise and follow protocols Abstraction broken! Art of Multiprocessor Programming 12
Monitor Wait and Signal Empty zzz buffer Yes! If buffer is empty, wait for item to show up Art of Multiprocessor Programming 13
Wait and Signal do not Compose empty empty zzz… Wait for either? Art of Multiprocessor Programming 14
Art of Multiprocessor Programming The Transactional Manifesto • Current practice inadequate • to meet the multicore challenge • Research Agenda • Replace locking with a transactional API • Design languages or libraries • Implement efficient run-times 15
Art of Multiprocessor Programming Transactions Block of code …. Atomic: appears to happen instantaneously Serializable: all appear to happen in one-at-a-time order Commit: takes effect (atomically) Abort: has no effect (typically restarted) 16
Art of Multiprocessor Programming Atomic Blocks atomic{x.remove(3); y.add(3);}atomic{ y =null; } 17
Art of Multiprocessor Programming Atomic Blocks atomic {x.remove(3); y.add(3);}atomic {y =null; } No data race 18
Art of Multiprocessor Programming A Double-Ended Queue public voidLeftEnq(item x) { Qnode q = newQnode(x); q.left = left; left.right= q; left = q; } Write sequential Code 19
Art of Multiprocessor Programming A Double-Ended Queue public voidLeftEnq(item x) atomic { Qnode q = newQnode(x); q.left = left; left.right= q; left = q; } } 20
Art of Multiprocessor Programming A Double-Ended Queue public void LeftEnq(item x) { atomic { Qnode q = new Qnode(x); q.left = left; left.right= q; left = q; } } Enclose in atomic block 21
Art of Multiprocessor Programming Not always this simple Conditional waits Enhanced concurrency Complex patterns But often it is… Warning 22
Composition? Art of Multiprocessor Programming 23
Composition? public void Transfer(Queue<T> q1, q2) { atomic { T x = q1.deq(); q2.enq(x); } } Trivial or what? Art of Multiprocessor Programming 24
Art of Multiprocessor Programming Conditional Waiting public T LeftDeq() { atomic { if (left == null) retry; … } } Roll back transaction and restart when something changes 25
Art of Multiprocessor Programming Composable Conditional Waiting atomic { x = q1.deq(); } orElse { x = q2.deq(); } Run 1st method. If it retries … Run 2nd method. If it retries … Entire statement retries 26
Hardware Transactional Memory Exploit native cache coherence invalidation consistency checking rollback Art of Multiprocessor Programming 27 27
Intel Haswell xbegin starts new transaction speculativeexecution checks for conflict at the end commit changes become visible abort changes discarded
Intel Haswell xbegin starts new transaction speculativeexecution checks for conflict at the end No progress guarantees a transaction may never commit must have software fallback
Intel Haswell xbegin starts new transaction register pointer to abort hander code jump there if transaction aborts nesting up to 8 levels
Intel Haswell xend end transaction commit control flows normally abort jump to abort handler code
Intel Haswell xabort abort transaction discard any changes jump directly to abort handler can set abort condition code
Intel Haswell xtest am I in a transaction? useful for dual-purpose code example: avoid clean-up
Intel Haswell xbegin start new transaction xend end transaction xabort abort transaction xtest am I in a transaction?
Lock Elision what execute critical section speculatively on fail, go back and acquire lock why no conflict, no cry don’t rewrite legacy code don’t retrain legacy coders
Lock Elision xacquire start critical section xrelease end critical section
Lock Elision Pseudo-Code Retry: xbegin Abort cmpmutex, 0 jz Success xabort $0xff Abort: … Success:critical section Unlock: cmpmutex, 0 jnzrelease_lock xend
Lock Elision Pseudo-Code Retry: xbegin Abort cmpmutex, 0 jz Success xabort $0xff Abort: … Success:critical section Unlock: cmpmutex, 0 jnzrelease_lock xend enter transaction, Abort is fallback path
Lock Elision Pseudo-Code Retry: xbegin Abort cmpmutex, 0 jz Success xabort $0xff Abort: … Success:critical section Unlock: cmpmutex, 0 jnzrelease_lock xend is lock free?
Lock Elision Pseudo-Code Retry: xbegin Abort cmpmutex, 0 jz Success xabort $0xff Abort: … Success:critical section Unlock: cmpmutex, 0 jnzrelease_lock xend abort transaction if mutex busy
Lock Elision Pseudo-Code fallback software path retry transaction or explicitly acquire mutex Retry: xbegin Abort cmpmutex, 0 jz Success xabort $0xff Abort: … Success:critical section Unlock: cmpmutex, 0 jnzrelease_lock xend
Lock Elision Pseudo-Code critical section is either speculative (transactional) or holding lock (non-transactional) Retry: xbegin Abort cmpmutex, 0 jz Success xabort $0xff Abort: … Success:critical section Unlock: cmpmutex, 0 jnzrelease_lock xend
Lock Elision Pseudo-Code Retry: xbegin Abort cmpmutex, 0 jz Success xabort $0xff Abort: … Success:critical section Unlock: cmpmutex, 0 jnzrelease_lock xend mutex not free means we are not in a transaction
Lock Elision Pseudo-Code Retry: xbegin Abort cmpmutex, 0 jz Success xabort $0xff Abort: … Success:critical section Unlock: cmpmutex, 0 jnzrelease_lock xend otherwise, commit transaction
When Transactions Fail changes discarded exceptions suppressed sets condition code XABORT overflow retry debug conflict nested
Some Haswell Numbers 16K transactions stable transaction setup: 40 cycles larger transactions more efficient transaction abort: 150 cycles register restoration?
new instructions check transaction status take checkpoint restore checkpoint debugging interrupt handler
transaction suspension pause transaction run non-transactional code resume transaction debugging interrupt handlers