1 / 34

Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

Michael Bond Milind Kulkarni Man Cao Minjia Zhang Meisam Fathi Salmi Swarnendu Biswas Aritra Sengupta Jipeng Huang. Purdue. Octet: Capturing and Controlling Cross-Thread Dependences Efficiently. Ohio State. Parallel programming is mainstream. Shared memory with locks Challenge:

dylan
Download Presentation

Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Michael Bond MilindKulkarni Man Cao Minjia Zhang MeisamFathiSalmi SwarnenduBiswas AritraSengupta Jipeng Huang • Purdue Octet: Capturing and Controlling Cross-Thread Dependences Efficiently • Ohio State

  2. Parallel programming is mainstream Sharedmemory with locks Challenge: performance & correctness

  3. Need practical runtime support Help express parallelism better Eliminate concurrency errors Diagnose production bugs Deal with nondeterminism

  4. Need practical runtime support • Transactional memory • DRF/SC enforcement • Deterministic execution • Atomicity checking • Data race detection • Record & replay

  5. Need practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences o.f = … … = o.f

  6. Need practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences o.f = … … = o.f

  7. Need practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences o.f = … … = o.f • Commodity (software-only) approaches • slow programs by several times

  8. Need practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences check o.f = … check … = o.f • Commodity (software-only) approaches • slow programs by several times

  9. Need practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences check o.f = … check … = o.f Any access could race add synchronization at every access

  10. Octet • Framework for runtimesupport • HB edges  all dependences • Atomicity of analysis & access • Concurrency control mechanism • Synchronization  cross-thread dependence •  Qualitative performance improvement

  11. Octet • Framework for runtimesupport • HB edges  all dependences • Atomicity of analysis & access • Concurrency control mechanism • Synchronization  cross-thread dependence •  Qualitative performance improvement Proofs!

  12. Octet tracks ownership • Each object’s stateЄ { WrExT, RdExT, RdShc }

  13. o’s state = WrExT1 T1 T2 write check wro.f Time

  14. o’s state = WrExT1 T1 T2 write check read check wro.f Time

  15. o’s state = WrExT1 T1 T2 write check read check wro.f Time

  16. o’s state = WrExT1 T1 T2 write check read check wro.f Time safe point

  17. o’s state = WrExT1 T1 T2 write check read check wro.f Implicit safe point Time safe point

  18. o’s state = WrExT1 T1 T2 write check read check wro.f Time safe point

  19. o’s state = RdExT2 T1 T2 write check read check wro.f Time safe point

  20. o’s state = RdExT2 T1 T2 write check read check wro.f Time safe point rdo.f

  21. o’s state = RdExT2 T1 T2 T3 T4 write check wro.f read check safe point rdo.f

  22. o’s state = RdExT2 T1 T2 T3 T4 write check wro.f read check safe point rdo.f read check

  23. o’s state = RdShc T1 T2 T3 T4 write check wro.f read check safe point rdo.f read check rdo.f

  24. o’s state = RdShc T1 T2 T3 T4 write check wro.f read check safe point rdo.f read check read check rdo.f

  25. o’s state = RdShc T1 T2 T3 T4 write check wro.f read check safe point rdo.f read check read check rdo.f rdo.f

  26. o’s state = RdShc T1 T2 T3 T4 Sharing detection [von Praun & Gross ’01] Comparison in our paper Distributed shared memory Shasta [Scales et al. ’96] Biased locking [Kawachiya et al. ’02] [Russell & Detlefs ’06] [Hindman & Grossman ’06] write check wro.f read check safe point rdo.f read check read check rdo.f rdo.f

  27. Practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences Framework for runtime support Concurrency control mechanism Octet

  28. Dependence recorder records happens-before edges T1 T2 T3 T4 write check wro.f read check safe point rdo.f read check read check rdo.f rdo.f

  29. Implementation in Jikes RVM • Publicly available • http://jikesrvm.org/Research+Archive Parallel programs DaCapo Benchmarks 2006& 2009 SPEC JBB 2000& 2005 • Parallel platform • 32cores (AMD Opteron 6272)

  30. Octet helps enable practical runtime support for reliable, scalable concurrency • Framework for runtime support • HB edges  all dependences • Atomicity of analysis & access • Concurrency control mechanism • Synchronization  cross-thread dependence • Qualitative performance improvement

More Related