Pay-to-use strong atomicity on conventional hardware

Pay-to-use strong atomicity on conventional hardware. Martín Abadi, Tim Harris, Mojtaba Mehrara, Microsoft Research. Strong semantics: atomic, retry, ... What, ideally, should these constructs do? Programming discipline(s): What does it mean for a program to use the constructs correctly?


Presentation Transcript


  1. Pay-to-use strong atomicity on conventional hardware Martín Abadi, Tim Harris, Mojtaba Mehrara Microsoft Research

  2. Our approach • Strong semantics: atomic, retry, ... What, ideally, should these constructs do? • Programming discipline(s): What does it mean for a program to use the constructs correctly? • Low-level semantics & actual implementations: transactions, optimistic concurrency, program transformations, weak memory models, ...

  3. Programming disciplines • Which programs are correctly synchronized? • All programs • Violation-free programs • Programs obeying dynamic separation • Programs obeying static separation [Diagram axes: more programs correctly synchronized; more implementation flexibility]

  4. Strong atomicity • Direct accesses work like single-access transactions • We would like: • Implementation flexibility; ongoing innovation in STM/hybrid techniques, optimizations, ... • Invisible / visible readers • In-place / deferred updates • Eager / lazy conflict detection • No overhead on direct accesses • Robust performance, not dependent on success of static analyses

  5. Strong atomicity: implementation [Diagram: physical address space; virtual address space containing a direct-heap, used by direct memory accesses, and a Tx-heap, used by memory accesses from atomic blocks]

  6. Writes from atomic blocks 1. Atomic block attempts to write to a field of an object [Diagram labels as on slide 5]

  7. Writes from atomic blocks 2. Revoke direct access to the page holding the direct view of the object [Diagram labels as on slide 5]

  8. Writes from atomic blocks 3. Use underlying STM write primitives [Diagram labels as on slide 5]

  9. Writes from atomic blocks 4. Restore direct access once the underlying transaction has finished and an access violation (AV) occurs [Diagram labels as on slide 5]
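
The four steps on slides 6-9 can be pictured with a small simulation. This is only a sketch: it models page protection with a Python set instead of real memory-protection hardware, and the names `TwoViewHeap`, `AccessViolation`, `tx_write`, `tx_commit`, `direct_read` and the 4 KB page size are illustrative assumptions, not the authors' implementation.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class AccessViolation(Exception):
    """Raised when a direct access touches a page whose direct view is revoked."""


class TwoViewHeap:
    """Toy model: one backing store reached through a direct view and a Tx view."""

    def __init__(self):
        self.memory = {}              # address -> value (shared backing storage)
        self.revoked = set()          # pages whose direct view is inaccessible
        self.active_tx_pages = set()  # pages written by the running transaction

    def tx_write(self, addr, value):
        # Step 1: an atomic block attempts to write to a field of an object.
        page = addr // PAGE_SIZE
        # Step 2: revoke direct access to the page holding the object.
        self.revoked.add(page)
        self.active_tx_pages.add(page)
        # Step 3: perform the write (stand-in for the underlying STM primitive).
        self.memory[addr] = value

    def tx_commit(self):
        # The transaction finishes; the protection stays until an AV occurs.
        self.active_tx_pages.clear()

    def _raw_direct_read(self, addr):
        if addr // PAGE_SIZE in self.revoked:
            raise AccessViolation(addr)
        return self.memory.get(addr, 0)

    def direct_read(self, addr):
        try:
            return self._raw_direct_read(addr)
        except AccessViolation:
            page = addr // PAGE_SIZE
            if page in self.active_tx_pages:
                # Transaction still running: this sketch simply re-raises.
                raise
            # Step 4: the transaction has finished and an AV occurred,
            # so restore direct access and retry the read.
            self.revoked.discard(page)
            return self._raw_direct_read(addr)
```

In this model a direct read of an address written by a still-running transaction raises `AccessViolation`; after `tx_commit()` the same read succeeds and removes the page from `revoked`.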

  10. Avoiding Access Violations • Safe accesses in runtime system code • Virtual method tables and array length • Memory allocation structures (e.g. free list) • STM implementation structures • GC implementation Forward all these to TX-heap at compile time

  11. Avoiding Access Violations • Safe accesses in normal code (forward to TX-heap) • Normal writes to locations that haven’t been read or written in a TX • Normal reads from locations that haven’t been written in a TX • Safe accesses in TX code (avoid page-level tracking) • TX writes to locations that haven’t been read or written outside TXs • TX reads from locations that haven’t been written outside TXs
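
The four safety conditions on slide 11 are symmetric, which a short predicate makes explicit. The sketch below assumes the sets of locations touched inside and outside transactions are already known (for example from a static analysis); `AccessSummary` and `is_safe` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AccessSummary:
    """Locations touched inside transactions (tx_*) and outside them (normal_*)."""
    tx_reads: set = field(default_factory=set)
    tx_writes: set = field(default_factory=set)
    normal_reads: set = field(default_factory=set)
    normal_writes: set = field(default_factory=set)


def is_safe(summary, location, in_tx, is_write):
    """Return True if the access meets the slide's condition for a safe access."""
    if in_tx:
        # TX access: compare against what normal (non-TX) code does.
        other_reads, other_writes = summary.normal_reads, summary.normal_writes
    else:
        # Normal access: compare against what transactions do.
        other_reads, other_writes = summary.tx_reads, summary.tx_writes
    if is_write:
        # A write is safe only if the other side neither reads nor writes it.
        return location not in other_reads and location not in other_writes
    # A read is safe as long as the other side never writes it.
    return location not in other_writes
```

For example, a normal read of a location that transactions only read is safe, while a normal write to that same location is not.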

  12. Sample Code

      private int ComputeUniqueSegments(int nthreads) {
          int numUniqueSegment = 0;
          for (int i = 0; i < nthreads; i++)
              numUniqueSegment += this.uniqueSegments[i].Count;
          return numUniqueSegment;
      }

      Genome_Sequencer_ComputeUniqueSegments::
      loop:
          mov  eax, dword ptr [edi+0x20]        // load uniqueSegments array reference
          cmp  ebx, dword ptr [eax+0x4]         // check index against array bounds
          jae  outOfRange
          mov  ecx, dword ptr [eax+ebx*4+0x08]  // load array element
          mov  eax, dword ptr [ecx]             // load Count function pointer
          call dword ptr [eax+0x88]             // call Count (get) function
          add  ebp, eax                         // add it to numUniqueSegment
          add  ebx, 1
          cmp  ebx, esi
          jl   loop

      The same memory accesses with displacements offset by 0x40000000 (slide callouts: "Access immutable runtime-system data", "Safe normal access"):

          mov  eax, dword ptr [edi+0x40000020]        // load uniqueSegments array reference
          cmp  ebx, dword ptr [eax+0x40000004]        // check index against array bounds
          mov  ecx, dword ptr [eax+ebx*4+0x40000008]  // load array element
          mov  eax, dword ptr [ecx+0x40000000]        // load Count function pointer
          call dword ptr [eax+0x40000088]             // call Count (get) function
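
In the second listing on slide 12, every displacement is the original one plus 0x40000000. The arithmetic can be checked with a minimal sketch, assuming the TX-heap view sits at that fixed offset from the direct view in a 32-bit address space (an inference from the listing, not something the slide states); `TX_HEAP_OFFSET` and `forward_to_tx_heap` are hypothetical names.

```python
TX_HEAP_OFFSET = 0x40000000  # offset visible in the slide's second listing
ADDRESS_MASK = 0xFFFFFFFF    # assumption: 32-bit x86 addresses, as in the listing


def forward_to_tx_heap(displacement):
    """Rewrite a direct-heap displacement so the access targets the TX-heap view."""
    return (displacement + TX_HEAP_OFFSET) & ADDRESS_MASK


# Displacements used by the loop in the slide's first listing.
for disp in (0x20, 0x4, 0x08, 0x0, 0x88):
    print(f"{disp:#x} -> {forward_to_tx_heap(disp):#x}")
```

Each forwarded value matches the displacement of the corresponding instruction in the second listing, which is consistent with the rewriting being a constant-offset change applied at compile time.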

  13. Exploiting Safe Accesses • Implemented by extending Steensgaard’s points-to analysis • Only safe accesses from normal code were beneficial • Little benefit from identifying safe accesses from inside atomic blocks [Chart: number of page-table changes]

  14. Patching access violations • Patch sites of AVs • Our heuristic: • Patch on first AV • Also change page protection as normal • Future work: • Remove patches if they become unnecessary • Make multiple patches to bound worst-case perf
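
The "patch on first AV" heuristic from slide 14 can be sketched as a toy runtime. The slide does not say what a patched site does, so this sketch assumes a patched site performs its access through the TX-heap view and therefore no longer faults; `PatchingRuntime`, `access`, and `handle_av` are hypothetical names, and page protection is modelled with a plain set.

```python
class PatchingRuntime:
    """Toy model of patching code sites that raise access violations (AVs)."""

    def __init__(self):
        self.protected_pages = set()  # pages whose direct view is revoked
        self.patched_sites = set()    # code sites already patched
        self.av_count = 0             # number of AVs handled so far

    def handle_av(self, site, page):
        self.av_count += 1
        # Heuristic from the slide: patch the site on its first AV ...
        self.patched_sites.add(site)
        # ... and also change the page protection as normal.
        self.protected_pages.discard(page)

    def access(self, site, page):
        """Return which path served the access: 'direct' or 'tx-heap'."""
        if site in self.patched_sites:
            # Assumption: a patched site goes through the TX-heap view.
            return "tx-heap"
        if page in self.protected_pages:
            self.handle_av(site, page)
            return self.access(site, page)  # retry after handling the AV
        return "direct"
```

Under this model a site pays for at most one AV: later accesses from the same site take the patched path even if the page is protected again, while sites that never fault keep using direct accesses.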

  15. Results - Vacation

  16. Results - Delaunay

  17. Results - Genome

  18. Results - Labyrinth

  19. Scaling [Chart series: SA – patch AV + analysis; WA]

  20. Conclusion • Weak atomicity is an obstacle to providing clear semantics for TM models • We use conventional memory protection hardware to provide strong atomicity • This comes at a low performance cost, but a high runtime complexity cost • The performance hit can be lowered by compile-time analysis
