Noninvasive Java Concurrency with Deuce STM 1.0
Guy Korland
“Multi Core Tools” CMP09

Outline
• Motivation
• Solutions
• Deuce
• Implementation
  • TL2
  • LSA
• Benchmarks
• Summary
• References

Problem I
    Process 1          Process 2
    a = acc.get()
    a = a + 100
                       b = acc.get()
                       b = b + 50
                       acc.set(b)
    acc.set(a)
    ...
Lost Update!

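The interleaving above can be replayed step by step on a single thread to see the lost update. This is a minimal sketch: `ToyAccount` and `LostUpdateDemo` are hypothetical names that only mirror the `acc.get()` / `acc.set()` calls on the slide, and the starting balance of 0 is an assumption.

```java
/** Hypothetical account type; mirrors the acc.get()/acc.set() calls on the slide. */
final class ToyAccount {
    private int balance;

    ToyAccount(int initial) { balance = initial; }

    int get() { return balance; }

    void set(int value) { balance = value; }
}

final class LostUpdateDemo {
    /** Replays the slide's interleaving in order and returns the final balance. */
    static int replay() {
        ToyAccount acc = new ToyAccount(0); // assumed starting balance

        int a = acc.get();   // Process 1 reads 0
        a = a + 100;         // Process 1 computes 100
        int b = acc.get();   // Process 2 reads 0 (Process 1 has not written yet)
        b = b + 50;          // Process 2 computes 50
        acc.set(b);          // Process 2 writes 50
        acc.set(a);          // Process 1 writes 100 and overwrites Process 2's update

        return acc.get();
    }

    public static void main(String[] args) {
        // Both deposits were meant to apply (150), but one is lost.
        System.out.println("expected 150, got " + replay());
    }
}
```

With real threads the same outcome appears only under an unlucky schedule, which is what makes the bug hard to reproduce.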
Problem II
    Process 1          Process 2
    lock(A)            lock(B)
    lock(B)            lock(A)
    ...
Deadlock!

The Problem
• Cannot exploit cheap threads
• Today’s Software: non-scalable methodologies
• Today’s Hardware: poor support for scalable synchronization
• Low-level support: CAS, TAS, MemBar…

Why Doesn’t Locking Scale?
• Not robust
• Relies on conventions
• Hard to use
  • Conservative
  • Deadlocks
  • Lost wake-ups
• Not composable

Solutions I – Domain specific
• MATLAB – concurrency behind the scenes.
• SQL/XQuery/XPath – the DB will handle it…
• HTML, ASP, PHP, JSP … – (almost) stateless.
• Fortress [Sun], X10 [IBM], Chapel [Cray] … – implicit concurrency.
Domain too specific. Remember Cobol!

Solutions II – Actor Model (share-nothing model)
Carl Hewitt, Peter Bishop and Richard Steiger, A Universal Modular Actor Formalism for Artificial Intelligence [IJCAI 1973].
An actor, on message:
• no shared data
• send messages to other actors
• create new actors
Where can we find it?
• Simula, Smalltalk, Scala, Haskell, F#, Erlang...
• Functional languages

Solutions II – Actor Model (share-nothing model)
Actors in Erlang
    -module(counter).
    -export([run/0, counter/1]).

    run() ->
        S = spawn(counter, counter, [0]),
        send_msgs(S, 100000),
        S.

    counter(Sum) ->
        receive
            {inc, Amount} -> counter(Sum+Amount)
        end.

    send_msgs(_, 0) -> true;
    send_msgs(S, Count) ->
        S ! {inc, 1},
        send_msgs(S, Count-1).
• Is it really easier?
• What about performance?
• Will functional languages ever be functional?
• Java/.NET/C++ rules!!! (maybe Ruby)

Solutions III – STM
Nir Shavit, Dan Touitou, Software Transactional Memory [PODC95]
    l.lock();
    <instructions>
    l.unlock();

    synchronized {
        <instructions>
    }

    atomic {
        <instructions>
    }

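The first two forms above exist in plain Java today; the `atomic` block is what an STM adds. As a rough illustration of those two lock-based forms only, here is a sketch with a hypothetical `CounterStyles` class, where the increment stands in for `<instructions>`:

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical class: the same critical section written with the two lock-based forms. */
final class CounterStyles {
    private final ReentrantLock l = new ReentrantLock();
    private int viaLock = 0;
    private int viaSynchronized = 0;

    /** Explicit lock: the caller must pair lock() with unlock(), hence try/finally. */
    void incrementWithLock() {
        l.lock();
        try {
            viaLock++; // <instructions>
        } finally {
            l.unlock();
        }
    }

    /** synchronized block: the monitor is released automatically on exit. */
    void incrementWithSynchronized() {
        synchronized (this) {
            viaSynchronized++; // <instructions>
        }
    }

    int getViaLock() {
        l.lock();
        try {
            return viaLock;
        } finally {
            l.unlock();
        }
    }

    int getViaSynchronized() {
        synchronized (this) {
            return viaSynchronized;
        }
    }
}
```

In both forms the programmer still chooses which lock protects which data; the `atomic` form leaves that choice to the runtime.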
What is a transaction?
• Atomicity – all or nothing.
• Consistency – consistent state (before & after).
• Isolation – others can’t see intermediate state.
• Durability – persistent. (Or maybe we do want it?)

The Brief History of STM
[Timeline figure spanning 1993 to 2009. Systems shown:]
• DSTM2 (Herlihy, Luchangco)
• Soft Trans (Ananian, Rinard)
• Meta Trans (Herlihy, Shavit)
• T-Monitor (Jagannathan…)
• TL2 (Dice, Shavit, Shalev)
• Trans Support TM (Moir)
• AtomJava (Hindman…)
• WSTM (Fraser, Harris)
• OSTM (Fraser, Harris)
• ASTM (Marathe et al)
• Deuce (Korland et al)
• STM (Shavit, Touitou)
• Lock-OSTM (Ennals)
• DSTM (Herlihy et al)
• McTM (Saha et al)
• LSA (Riegel et al)
• TL (Dice, Shavit)
• HybridTM (Moir)
• Rock (Sun)
• Tanger

DSTM2
Maurice Herlihy et al, A flexible framework … [OOPSLA06]
• Limited to Objects.
• Very intrusive.
• Doesn’t support libraries.
• Bad performance (fork).
    @atomic
    public interface INode {
        int getValue();
        void setValue(int value);
        INode getNext();
        void setNext(INode value);
    }

    Factory<INode> factory = Thread.makeFactory(INode.class);

    result = Thread.doIt(new Callable<Boolean>() {
        public Boolean call() {
            return intSet.insert(value);
        }
    });

JVSTM
João Cachopo and António Rito-Silva, Versioned boxes as the basis for memory transactions [SCOOL05]
• Doesn’t support libraries.
• Less intrusive.
• Need to “announce” shared fields.
    public class Account {
        private VBox<Long> balance = new VBox<Long>();

        public @Atomic void withdraw(long amount) {
            balance.put(balance.get() - amount);
        }
    }

Atom-Java
B. Hindman and D. Grossman, Atomicity via source-to-source translation [MSPC06]
• Add a reserved word.
• Need precompilation.
• Doesn’t support libraries.
• Even less intrusive.
    public void update(double value) {
        Atomic {
            commission += value;
        }
    }

Multiverse
Peter Veentjer, 2009
• Doesn’t support libraries.
• Limited to Objects.
    @TmEntity
    public static class Node<E> {
        final E value;
        final Node parent;

        Node(E value, Node prev) {
            this.value = value;
            this.parent = prev;
        }
    }

    @TmEntity
    public class Stack<E> {
        private Node<E> head;

        public void push(E item) {
            head = new Node(item, head);
        }
    }

DATM-J
Hany E. Ramadan et al., Dependence-aware transactional memory [MICRO08]
• Explicit transaction.
• Explicit retry.
    Transaction tx = new Transaction(id);
    boolean done = false;
    while (!done) {
        try {
            tx.BeginTransaction();
            // txnl code
            done = tx.CommitTransaction();
        } catch (AbortException e) {
            tx.AbortTransaction();
            done = false;
        }
    }

Deuce STM
• Java STM framework
  • @Atomic methods
  • Field-based access
    • More scalable than object-based.
    • More efficient than word-based.
  • Supports external libraries
    • Can be part of a transaction
  • No reserved words
    • No need for new compilers (existing IDEs can be used)
• Research tool
  • API for developing and testing new algorithms.

Deuce - API
    public class Bank {
        final private static double MAXIMUM_TRANSACTION = 1000;
        private double commission = 0;

        @Atomic(retries=64)
        public void transaction(Account ac1, Account ac2, double amount) {
            ac1.balance -= (amount + commission);
            ac2.balance += amount;
        }

        @Atomic
        public void update(double value) {
            commission += value;
        }
    }

Deuce - Running
• -javaagent:deuceAgent.jar
  Dynamic bytecode manipulation.
• -Xbootclasspath/p:rt.jar
  Offline instrumentation to support the boot classloader.
    java -javaagent:deuceAgent.jar -cp "myjar.jar" MyMain

Implementation
• ASM – bytecode manipulation
• Online & offline
• Fields
    private double commission;
    final static public long commission__ADDRESS...
  Relative address (-1 if final).
    final static public Object __CLASS_BASE__ ...
  Marks the class base for static field access.

Implementation
• Methods
  • @Atomic methods
    • Replace the original method with a transaction retry loop.
    • Add another, instrumented method.
  • Non-atomic methods
    • Duplicate each with an instrumented version.

Implementation
    @Atomic
    public void update(double value) {
        commission += value;
    }
In bytecode:
    @Atomic
    public void update(double value) {
        double tmp = commission;
        commission = tmp + value;
    }

Implementation
    public void update(double value, Context c) {
        double tmp;
        if (commission__ADDRESS < 0) { // final field
            tmp = commission;
        } else {
            c.beforeRead(this, commission__ADDRESS);
            tmp = c.onRead(this, commission, commission__ADDRESS);
        }
        c.onWrite(this, tmp + value, commission__ADDRESS);
    }
JIT removes it: commission__ADDRESS is a static final constant, so the final-field check can be eliminated.

Implementation
    public void update(double value, Context c) {
        c.beforeRead(this, commission__ADDRESS);
        double tmp = c.onRead(this, commission, commission__ADDRESS);
        c.onWrite(this, tmp + value, commission__ADDRESS);
    }

Implementation
    public void update(double value) {
        Context context = ContextDelegator.getContext();
        for (int i = retries; i > 0; --i) {
            context.init();
            try {
                update(value, context);
                if (context.commit())
                    return;
            } catch (TransactionException e) {
                context.rollback();
                continue;
            } catch (Throwable t) {
                if (context.commit())
                    throw t;
            }
        }
        throw new TransactionException();
    }

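The retry loop above can be tried out in isolation. The sketch below is not Deuce code: `ToyContext`, `ToyAbortException`, `RetryLoop` and `FlakyContext` are hypothetical stand-ins, and the transactional body is passed as a `Supplier` rather than generated by bytecode instrumentation.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a transactional context (not Deuce's Context). */
interface ToyContext {
    void init();
    boolean commit();
    void rollback();
}

/** Hypothetical exception signalling that a transaction must abort. */
final class ToyAbortException extends RuntimeException {}

final class RetryLoop {
    /** Runs body up to `retries` times until a commit succeeds. */
    static <T> T runAtomically(ToyContext ctx, int retries, Supplier<T> body) {
        for (int i = retries; i > 0; --i) {
            ctx.init();
            try {
                T result = body.get();
                if (ctx.commit()) {
                    return result;        // committed: the result is visible
                }
                // commit failed: fall through and retry
            } catch (ToyAbortException e) {
                ctx.rollback();           // conflict detected mid-transaction: retry
            } catch (RuntimeException e) {
                if (ctx.commit()) {
                    throw e;              // user exception from a consistent state
                }
                // otherwise the exception came from an inconsistent view: retry
            }
        }
        throw new ToyAbortException();    // retries exhausted
    }
}

/** Test double: commit fails a fixed number of times, then succeeds. */
final class FlakyContext implements ToyContext {
    private int failuresLeft;
    int attempts = 0;

    FlakyContext(int failures) { this.failuresLeft = failures; }

    public void init() { attempts++; }

    public boolean commit() {
        if (failuresLeft > 0) {
            failuresLeft--;
            return false;
        }
        return true;
    }

    public void rollback() { }
}
```

For example, `RetryLoop.runAtomically(new FlakyContext(2), 64, () -> 42)` runs the body three times before the commit succeeds.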
Implementation
    public interface Context {
        void init(int atomicBlockId);
        boolean commit();
        void rollback();

        void beforeReadAccess(Object obj, long field);

        Object onReadAccess(Object obj, Object value, long field);
        int onReadAccess(Object obj, int value, long field);
        long onReadAccess(Object obj, long value, long field);
        …

        void onWriteAccess(Object obj, Object value, long field);
        void onWriteAccess(Object obj, int value, long field);
        void onWriteAccess(Object obj, long value, long field);
        …
    }

TL2 (Transactional Locking II)
Dave Dice, Ori Shalev and Nir Shavit [DISC06]
CTL - Commit-time locking
• Start
  • Sample global version-clock
• Run through a speculative execution
  • Collect write-set & read-set
• End
  • Lock the write-set
  • Increment global version-clock
  • Validate the read-set
  • Commit and release the locks

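The steps above can be sketched as a toy, int-only transaction. This is a simplified illustration under stated assumptions, not Deuce's or the paper's implementation: `TVar`, `Tl2Tx` and `Tl2AbortException` are hypothetical names, the lock word keeps the lock flag in its low bit, and a failed lock attempt aborts at once instead of spinning.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

/** Toy transactional variable guarded by a versioned lock word. */
final class TVar {
    // Low bit = locked flag, remaining bits = version.
    final AtomicLong versionedLock = new AtomicLong(0);
    volatile int value;

    TVar(int initial) { value = initial; }
}

final class Tl2AbortException extends RuntimeException {}

final class Tl2Tx {
    static final AtomicLong GLOBAL_CLOCK = new AtomicLong(0);

    // Start: sample the global version-clock.
    private final long readVersion = GLOBAL_CLOCK.get();
    private final Set<TVar> readSet = new HashSet<>();
    private final Map<TVar, Integer> writeSet = new LinkedHashMap<>();

    int read(TVar v) {
        Integer buffered = writeSet.get(v);
        if (buffered != null) return buffered;          // read our own write
        long before = v.versionedLock.get();
        int val = v.value;
        long after = v.versionedLock.get();
        boolean locked = (before & 1L) != 0;
        if (locked || before != after || (before >>> 1) > readVersion) {
            throw new Tl2AbortException();              // value is newer than our snapshot
        }
        readSet.add(v);
        return val;
    }

    void write(TVar v, int newValue) {
        writeSet.put(v, newValue);                      // buffered until commit
    }

    boolean commit() {
        List<TVar> locked = new ArrayList<>();
        try {
            // Lock the write-set (try-lock; give up on contention).
            for (TVar v : writeSet.keySet()) {
                long w = v.versionedLock.get();
                if ((w & 1L) != 0 || !v.versionedLock.compareAndSet(w, w | 1L)) return false;
                locked.add(v);
            }
            // Increment the global version-clock.
            long writeVersion = GLOBAL_CLOCK.incrementAndGet();
            // Validate the read-set.
            for (TVar v : readSet) {
                long w = v.versionedLock.get();
                boolean lockedByOther = (w & 1L) != 0 && !writeSet.containsKey(v);
                if (lockedByOther || (w >>> 1) > readVersion) return false;
            }
            // Commit: write back, publish the new version, release the locks.
            for (Map.Entry<TVar, Integer> e : writeSet.entrySet()) {
                TVar v = e.getKey();
                v.value = e.getValue();
                v.versionedLock.set(writeVersion << 1);
            }
            locked.clear();
            return true;
        } finally {
            // On failure, release any locks still held, keeping the old version.
            for (TVar v : locked) v.versionedLock.set(v.versionedLock.get() & ~1L);
        }
    }
}
```

A caller would wrap `read`, `write` and `commit` in a retry loop and start a fresh `Tl2Tx` whenever `commit()` returns false or `Tl2AbortException` is thrown.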
LSA (Lazy Snapshot Algorithm)
Torvald Riegel, Pascal Felber and Christof Fetzer [DISC06]
ETL - Encounter-time locking
• Start
  • Sample global version-clock
• Run through a speculative execution
  • Lock on write access
  • Collect read-set & write-set
  • On validation error try to extend snapshot
• End
  • Increment global version-clock
  • Validate the read-set
  • Commit and release the locks

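A rough sketch of the encounter-time locking and snapshot-extension steps follows. It is a deliberately simplified, single-version, int-only variant and not the published algorithm or Deuce's code: `LsaVar`, `LsaTx` and `LsaAbortException` are hypothetical names, writes are buffered until commit, and any conflict aborts immediately instead of consulting a contention manager.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

final class LsaVar {
    final AtomicReference<LsaTx> owner = new AtomicReference<>(); // write-lock holder or null
    volatile long version = 0;  // clock value of the commit that wrote `value`
    volatile int value;

    LsaVar(int initial) { value = initial; }
}

final class LsaAbortException extends RuntimeException {}

final class LsaTx {
    static final AtomicLong CLOCK = new AtomicLong(0);

    private long upper = CLOCK.get();                              // snapshot valid up to here
    private final Map<LsaVar, Long> readSet = new HashMap<>();     // var -> version observed
    private final Map<LsaVar, Integer> writeSet = new HashMap<>(); // var -> buffered value
    int extensions = 0;

    int read(LsaVar v) {
        if (v.owner.get() == this) return writeSet.get(v);   // read our own write
        if (v.version > upper) extend();                      // too new: try to extend snapshot
        long before = v.version;
        int val = v.value;
        if (v.owner.get() != null || v.version != before || before > upper) throw abort();
        readSet.put(v, before);
        return val;
    }

    void write(LsaVar v, int newValue) {
        // Encounter-time locking: take the lock on first write access.
        if (v.owner.get() != this && !v.owner.compareAndSet(null, this)) throw abort();
        writeSet.put(v, newValue);
    }

    boolean commit() {
        if (writeSet.isEmpty()) return true;                  // read-only: snapshot is consistent
        long writeVersion = CLOCK.incrementAndGet();
        if (!readSetValid()) { releaseLocks(); return false; }
        for (Map.Entry<LsaVar, Integer> e : writeSet.entrySet()) {
            e.getKey().value = e.getValue();
            e.getKey().version = writeVersion;
        }
        releaseLocks();
        return true;
    }

    /** Moves the snapshot's upper bound to "now" if everything read so far is unchanged. */
    private void extend() {
        long now = CLOCK.get();
        if (!readSetValid()) throw abort();
        upper = now;
        extensions++;
    }

    private boolean readSetValid() {
        for (Map.Entry<LsaVar, Long> e : readSet.entrySet()) {
            LsaTx o = e.getKey().owner.get();
            if ((o != null && o != this) || e.getKey().version != e.getValue()) return false;
        }
        return true;
    }

    private LsaAbortException abort() {
        releaseLocks();
        return new LsaAbortException();
    }

    private void releaseLocks() {
        for (LsaVar v : writeSet.keySet()) v.owner.compareAndSet(this, null);
        writeSet.clear();
    }
}
```

The extension step is what lets a transaction keep running after it meets a value newer than its starting snapshot, provided nothing it has already read was changed in the meantime.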
Summary
• Simple API
  • @Atomic
• No changes to Java
  • No reserved words
• Open source
  • On Google Code
• Shows nice scalability
  • Field based

References
• Homepage – http://www.deucestm.org
• Project – http://code.google.com/p/deuce/
• Wikipedia – http://en.wikipedia.org/wiki/Software_transactional_memory
• TL2 – http://research.sun.com/scalable
• LSA-STM – http://tmware.org/lsastm