1 / 36

Concurrent Revisions: A deterministic concurrency model.

Concurrent Revisions: A deterministic concurrency model. Daan Leijen & Sebastian Burckhardt Microsoft Research (invited talk at FOOL 2010). Algorithmic/ Scientific. Special-Purpose. Interactive/ Reactive. Operating System. Data Mining Biology Chemistry Physics. Desktop Applications.

ranger
Download Presentation

Concurrent Revisions: A deterministic concurrency model.

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Concurrent Revisions:A deterministic concurrency model. Daan Leijen & Sebastian Burckhardt Microsoft Research (invited talk at FOOL 2010)

  2. Algorithmic/ Scientific Special-Purpose Interactive/ Reactive Operating System Data Mining Biology Chemistry Physics Desktop Applications Database Games Multimedia Signal Processing Parallel Programming TPL, Fortran, MPI, Cilk, StreamIt, X10, Cuda, ... Concurrent Programming Threads & Locks, Futures, Promises, Transactions, ... ? Our Focus.

  3. Application = Shared Data and Tasks Reader Reader Mutator R R R R Shared Data Mutator Mutator Reader Mutator Example: Office application • Save the document • React to keyboard input by the user • Perform a spellcheck in the background • Exchange updates with collaborating remote users

  4. Spacewars!

  5. Examples from SpaceWars Game Example 1:read-write conflict • Render taskreads position of all game objects • Physics task updates position of all game objects => Render task needs to see consistent snapshot Example 2: write-write conflict • Physics task updates position of all game objects • Network task updates position of some objects => Network has priority over physics updates

  6. Conventional Concurrency Control Conflicting tasks can not efficiently execute in parallel. • pessimistic concurrency control • use locks to avoid parallelism where there are (real or potential) conflicts • optimistic concurrency control • speculate on absence of conflictsrollback if there are real conflicts either way: true conflicts kill parallel performance.

  7. Our Proposed Programming Model: Revisions and Isolation Types Revision A logical unit of work that is forked and joined Isolation Type A type which implements automatic copying/merging of versions on write-write conflict • Deterministic Conflict Resolution, never roll-back • Full concurrent reading and writing of shared data • No restrictions on tasks (can be long-running, do I/O) • Clean semantics • Fast and space-efficient runtime implementation

  8. What’s new Isolation types: declares shared data fork revision: forks off a private copy of the shared state • Isolation: side effects are only visible when the revision is joined. • Deterministic execution! Traditional Task Concurrent Revisions int x = 0; task t = fork { x = 1; } assert(x==0 || x==1); join t; assert(x==1); versioned<int> x = 0; revision r = rfork { x = 1; } assert(x==0); join r; assert(x==1); isolation: Concurrent modifications are not seen by others join revision: waits for the revision to terminate and writes back changes into the main revision

  9. Sequential Consistency int x = 0; int y = 0; task t = fork { if (x==0) y++; } if (y==0) x++; join t; assert( (x==0 && y==1) || (x==1 && y==0) || (x==1 && y==1)); Concurrent Revisions Transactional Memory int x = 0; int y = 0; task t = fork { atomic { if (x==0) y++; } } atomic { if (y==0) x++; } join t; versioned<int> x = 0; versioned<int> y = 0; revision r = rfork { if (x==0) y++; } if (y==0) x++; join r; assert( (x==0 && y==1) || (x==1 && y==0)); assert(x==1 && y==1);

  10. Conflict resolution Versioned<int> x; By default, on a write-write conflict (only), the modification in the child revision wins. x = 0 x = 0 x = 0 x = 1 x = 2 x = 1 x = 0 x = 1 assert( x==2 ) assert( x==0 ) assert( x==1 )

  11. Custom conflict resolution Cumulative<int,(main,join,orig).main + join – orig> x; x = 0 0 x += 1 x += 2 merge(1,2,0)  3 1 2 assert(x==3)

  12. x = 0 0 x += 1 x += 2 2 x += 3 merge(1,2,0) 3 1 2 3 5 merge(3,5,2)  6 assert( x==6 )

  13. A Software engineering perspective • Transactional memory: • Code centric: put “atomic” in the code • Granularity: • too broad: too many conflicts and no parallel speedup • too small: potential races and incorrect code • Concurrent revisions: • Data centric: put annotations on the data • Granularity: group data that have mutual constraints together, i.e. if (x + y > 0) should hold, then x and y should be versioned together.

  14. Current Implementation: C# library • For each versioned object, maintain multiple copies • Map revision ids to versions • `mostly’ lock-free array • New copies are allocated lazily • Don’t copy on fork… copy on first write after fork • Old copies are released on join • No space leak

  15. Demo classSample { [Versioned] inti = 0; publicvoid Run() { vart1 = CurrentRevision.Fork(() => { i += 1; }); vart2 = CurrentRevision.Fork(() => { i += 2; }); i+=3; CurrentRevision.Join(t1); CurrentRevision.Join(t2); Console.WriteLine("i = " + i); } }

  16. Demo: Sandbox Fork a revision without forking an associated task/thread classSandbox { [Versioned] inti = 0; public void Run() { var r = CurrentRevision.Branch("FlakyCode"); try { r.Run(() => { i = 1; throw newException("Oops"); }); CurrentRevision.Merge(r); } catch { CurrentRevision.Abandon(r); } Console.WriteLine("\n i = "+ i); } } Run code in a certain revision Merge changes in a revision into the main one Abandon a revision and don’t merge its changes.

  17. By construction, there is no ‘global’ state: just local state for each revision State is simply a (partial) function from a location to a value

  18. Operational Semantics the state is a composition of the root snapshot  and local modifications  For some revision r, with snapshot  and local modifications and an expression context with hole(x.e) v On a join, the writes of the joineer’ take priority over the writes of the current revision:  ::’ On a fork, the snapshot of the new revision r’ is the current state: ::

  19. Custom merge: per location (type) No conflict if a location was not written in the joinee On a join, using a merge function. No conflict if a location was unmodified in the current revision, use the value of the joinee Conflict otherwise, use a location/type specific merge function Standard merges:

  20. What is a conflict? Cumulative<int> x = 0 • Merge is only called if: (1) write in child, and (2) modification in main revision: 0 x += 2 2 No conflict (merge function is not called) x += 3 0 2 2 5 No conflict (merge function is not called) assert( x = 5 )

  21. Merging with failure On fail, we just ignore any writes in the joinee

  22. Snapshot isolation • Widely used in databases, for example Oracle and Microsoft SQL • In essence, in snapshot isolation a concurrent transaction can only complete in the absence of write-write conflicts. • Our calculus generalizes snapshot isolation: • We support arbitrary nesting • We allow custom merge functions to resolve write-write conflicts deterministically

  23. Snapshot isolation We can succinctly model snapshot isolation as: • Disallow nesting • Use the default merge: Some versions of snapshot isolation do not treat silent writes in a transaction as a conflict:

  24. Sequential merges • We can view each location as an abstract data types (i.e. object) with certain operations (i.e. methods). • If a merge function always behaves as if concurrent operations for those objects are sequential, we call it a sequential merge. • Such objects always behave as if the operations in the joinee are all done sequentially at the join point.

  25. Sequential merges x = o • A merge is sequential if: merge(uw1(o), uw2(o), u(o))= uw1w2(o) • Anduw1w2(o)   u w1 w2 merge(uw1(o), uw2(o), u(o))

  26. Abelian merges • For any abstract data type that forms an abelian group (associative, commutative, with inverses) with neutral element 0 and an operation , the following merge is sequential: merge(v,v’,v0) = v v’  v0 • This holds for example for additive integers and additive sets.

  27. SpaceWars Game Sequential Game Loop: Parallel Collision Detection Render Screen Parallel Collision Detection Parallel Collision Detection Graphics Card Simulate Physics Shared State Play Sounds Send Process Inputs Receive Autosave Key- board Network Connection Disk

  28. Revision Diagram for Parallelized Game Loop Coll. Det. 4 Coll. Det. 1 Coll. Det. 2 Coll. Det. 3 Render network Physics autosave (long running)

  29. “Problem Example 1” is solved Coll. Det. 4 Coll. Det. 1 Coll. Det. 2 Coll. Det. 3 Render network Physics autosave (long running) • Render task reads position of all game objects • Physics task updates position of all game objects • No interference!

  30. “Problem Example 2” is solved. Coll. Det. 4 Coll. Det. 1 Coll. Det. 2 Coll. Det. 3 Render network Physics autosave (long running) • Physics taskupdates position of all game objects • Network task updates position of some objects • Network updates have priority over physics updates • Order of joins establishes precedence!

  31. Results • Autosave now perfectly unnoticeable in background • Overall Speed-Up:3.03x on four-core (almost completely limited by graphics card)

  32. Overhead:How much does all the copying and the indirection cost?

  33. Conclusion Revisions and Isolation Types simplify the parallelization of applications with tasks that • Exhibit conflicting accesses to shared data • Have unpredictable latency • Have unpredictable data access pattern • May perform I/O that can not be rolled back Revisions and Isolation Types are • easy to reason about (determinism, isolation) • have low-enough overhead for many applications

  34. Questions? • daan@microsoft.com • sburckha@microsoft.com • Download available soon on CodePlex

More Related