Epoch parallelism: One execution is not enough

Epoch parallelism: One execution is not enough. Jessica Ouyang, Kaushik Veeraraghavan, Dongyoon Lee, Peter Chen, Jason Flinn, Satish Narayanasamy, University of Michigan. Motivation: write a single program that is both fast and correct, and make it easier for programmers.


Presentation Transcript


  1. Epoch parallelism: One execution is not enough
     Jessica Ouyang, Kaushik Veeraraghavan, Dongyoon Lee, Peter Chen, Jason Flinn, Satish Narayanasamy
     University of Michigan
     OSDI ’10 Research Visions, 3 October 2010

  2. Motivation
     • Write a single program that is both fast & correct
     • Make it easier for programmers
     • Change approach to programming
     • Write program that is fast or correct – not both
     • Combine multiple, specialized executions
     • Fast/buggy accelerates slow/correct
     • Slow/correct checks fast/buggy
     [Figure labels: Fast & Correct; Fast & Buggy; Slow & Correct]

  3. Epoch parallelism
     [Figure: timelines of the fast & buggy and slow & correct executions over epochs E0–E3, with state comparisons (==?, !=) at epoch boundaries; axis: time]
     1. Checkpoint state
     2. Start epoch
     3. Check state
     4. Roll back & re-execute
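
The four steps on this slide can be sketched as a small simulation. This is only an illustration under stated assumptions, not the authors' system: the program state is a single integer, each epoch consumes one input value, and the names `run_with_epoch_parallelism`, `fast_sum`, and `slow_sum` are hypothetical. The fast execution runs ahead and predicts the state at each epoch boundary, the slow epochs start from those predictions in parallel, and the first mismatch triggers a rollback to the slow execution's result.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

State = int  # hypothetical program state; a real system would checkpoint far more
Step = Callable[[State, int], State]  # one epoch: (start state, input) -> end state


def run_with_epoch_parallelism(initial: State, inputs: List[int],
                               fast_step: Step, slow_step: Step) -> Tuple[State, int]:
    """Return (final committed state, number of rollbacks)."""
    committed = initial
    next_epoch = 0
    rollbacks = 0
    while next_epoch < len(inputs):
        pending = inputs[next_epoch:]
        # 1. Checkpoint state: the fast execution runs ahead and records a
        #    (possibly wrong) checkpoint at every epoch boundary.
        checkpoints = [committed]
        for epoch_input in pending:
            checkpoints.append(fast_step(checkpoints[-1], epoch_input))
        # 2. Start epochs: every slow epoch starts from a predicted checkpoint,
        #    so the slow epochs can all run at the same time.
        with ThreadPoolExecutor() as pool:
            slow_ends = list(pool.map(slow_step, checkpoints[:-1], pending))
        # 3. Check state, in epoch order.
        mismatch = False
        for k, slow_end in enumerate(slow_ends):
            committed = slow_end  # its start state was verified, so it is trusted
            if slow_end != checkpoints[k + 1]:
                # 4. Roll back & re-execute: discard later speculative epochs
                #    and restart both executions from the slow result.
                rollbacks += 1
                next_epoch += k + 1
                mismatch = True
                break
        if not mismatch:
            next_epoch = len(inputs)
    return committed, rollbacks


def slow_sum(state: State, x: int) -> State:
    return state + x  # slow & correct (assumed reference behaviour)


def fast_sum(state: State, x: int) -> State:
    return state if x == 13 else state + x  # fast & buggy: drops the value 13
```

With these toy steps, `run_with_epoch_parallelism(0, [1, 2, 13, 4, 5], fast_sum, slow_sum)` returns `(25, 1)`: the epoch that processes 13 fails its check, and the two epochs after it are re-executed from the corrected state.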

  4. Uniprocessor execution
     [Figure: threads A and B scheduled on a multiprocessor (CPU 0–CPU 3) versus a uniprocessor, split into epochs E0 and E1; label: Performance]
     • Nice properties of uniprocessor
     • Fewer races
     • Stronger memory consistency model
     • Easier to replay

  5. Using epoch parallelism
     [Figure: a multi-threaded execution (threads A and B, epochs E0 and E1, on CPU 0–CPU 3) linked by a transform function to single-threaded epochs with states S0 and S1]
     • Challenges
     • Importing state to start epochs
     • Checking state
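
The two challenges above, importing state and checking state, both go through the transform function. The slides do not say what that function looks like, so the following is a hedged sketch with invented types: `MultiThreadedState`, `SingleThreadedState`, `transform`, and `check_epoch` are all hypothetical. It assumes the multi-threaded program keeps one counter per thread while the single-threaded program keeps a single total.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class MultiThreadedState:
    per_thread_counts: Tuple[int, ...]  # hypothetical: one counter per worker thread


@dataclass(frozen=True)
class SingleThreadedState:
    total: int  # hypothetical: the single-threaded program keeps one counter


def transform(mt: MultiThreadedState) -> SingleThreadedState:
    """Map a multi-threaded checkpoint to the equivalent single-threaded state."""
    return SingleThreadedState(total=sum(mt.per_thread_counts))


def run_single_threaded_epoch(start: SingleThreadedState,
                              items: Sequence[str]) -> SingleThreadedState:
    """Slow & correct version of the epoch: count every item exactly once."""
    return SingleThreadedState(total=start.total + len(items))


def check_epoch(mt_start: MultiThreadedState, mt_end: MultiThreadedState,
                items: Sequence[str]) -> bool:
    # Importing state: transform the checkpoint to start the single-threaded epoch.
    st_end = run_single_threaded_epoch(transform(mt_start), items)
    # Checking state: compare in the single-threaded representation.
    return st_end == transform(mt_end)
```

Comparing after the transform means the two programs only have to agree on the abstract result, not on how the work was split across threads. A multi-threaded end state that lost an update, for example `(4, 3)` where `(4, 4)` was expected, fails the check.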

  6. Conclusion
     • Rethink having a single program/execution be both fast & correct
     • Use separate, specialized executions to achieve different goals

  7. Epoch parallelism: One execution is not enough
     Jessica Ouyang, Kaushik Veeraraghavan, Dongyoon Lee, Peter Chen, Jason Flinn, Satish Narayanasamy
     University of Michigan
     OSDI ’10 Research Visions, 3 October 2010

  8. Related Work
     • Master/Slave Speculative Parallelization (Zilles, Sohi, IEEE ’02)
     • Thread-Level Data Speculation (Steffan, Mowry, HPCA ’98)
     • Enhancing Software Reliability with Speculative Threads (Oplinger, Lam, ASPLOS ’02)
     • BASE (Castro, Rodrigues, Liskov, TOCS ’03)
     • GRACE (Berger, Yang, Liu, Novark, OOPSLA ’09)

  9. More uses of epoch parallelism
     • Uniprocessor execution
     • Deterministic replay
     • Data race detection/avoidance
     • Optimistic concurrency
     • Lock elision
     • Transactional memory
     • Additional runtime checks
     • Assertions, bounds checking
     • Security checks

  10. Programming effort
     • Write one program
     • Compiler/runtime/hardware optimizes aggressively
     • Original program checks correctness
     • Write 2 versions of same program
     • One with checks (assertions, security) and one without
     • Write 2 versions + transform function
     • Arbitrary implementations
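
The middle option, two versions of the same program with and without checks, might look like the pair below. This is an illustrative sketch, not code from the talk: the function names and the windowed-average task are hypothetical. The unchecked version is the fast execution, and the checked version is the slow execution that validates it.

```python
from typing import Sequence


def average_unchecked(values: Sequence[float], start: int, end: int) -> float:
    """Fast version: no validation, so a bad window silently gives an answer."""
    window = values[start:end]
    return sum(window) / max(len(window), 1)


def average_checked(values: Sequence[float], start: int, end: int) -> float:
    """Slow version: same computation, with the bounds check left in."""
    if not (0 <= start < end <= len(values)):
        raise IndexError(f"window [{start}, {end}) is outside 0..{len(values)}")
    window = values[start:end]
    return sum(window) / len(window)
```

On valid input the two versions agree, so the fast result can be kept. On an out-of-range window such as `(1, 10)` over four values, the unchecked version returns a number while the checked version raises, which is the kind of divergence the slow execution is there to catch.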

  11. Programming effort
     • Single-threaded & multi-threaded use case
     • Need additional transform function
     • Generate input to start epochs
     • Is this really less work than 1 correct & fast multi-threaded program?

  12. Redundancy & efficiency
     • Baseline overhead is 2x throughput
     • Acceptable for some applications
     • Core counts increasing
     • Using cores is hard
     • Can make it more efficient
     • Remove redundant instructions
     • Use fast & buggy as software predictor for slow & correct (branches, load values)

  13. Epoch parallelism
     [Figure: timelines of the fast, buggy execution and the correct, slow execution over epochs E0–E3; axis: time]

  14. Misspeculation in epoch parallelism
     [Figure: timelines of the fast, buggy execution and the correct, slow execution over epochs E0–E3, with annotations; axis: time]
     • Slow and correct E0 has completed
     • Check thread-parallel checkpoint
     • Checkpoint doesn’t match!
     • Use result from epoch-parallel
     • Restart execution of epoch 1
     • Continue executing