
A Framework for Reasoning About Inherent Parallelism in Modern Object-Oriented Languages

Presented by A. Craik (5-Jan-12). Research supported by funding from Microsoft Research and the Queensland State Government.


Presentation Transcript


  1. A Framework for Reasoning About Inherent Parallelism in Modern Object-Oriented Languages Presented by A. Craik (5-Jan-12) Research supported by funding from Microsoft Research and the Queensland State Government

  2. Introduction [Diagram: Semantic Analysis and Dependency Analysis relating a Procedural Algorithm, a Parallel Algorithm, a Sequential Implementation, an Explicitly Parallel Implementation, and a Sequential Implementation w/ Injected Parallelism]

  3. Introduction
     • Inherent Parallelism:
         a = 1; b = 2; c = a + b;
         for (int i = 0; i < max; ++i) a[i] = a[i] + 1;
     • Three steps for finding & exploiting:
       • Find the inherent parallelism in the program
       • Decide which inherent parallelism is worth exploiting
       • Choose an implementation technology to expose the selected parallelism

  4. Introduction • Dependencies impose ordering constraints • Sequential consistency required • Two forms • Control – which statements will run • Data – reads & writes of shared state • Control well studied and easier to handle inter-procedurally • Example, Java checked exceptions

  5. Data Dependencies
     • Flow Dependence (Read-After-Write)
         int a = 1;
         a = 2;
         int b = a + 1;
     • Output Dependence (Write-After-Write)
         int a = 1;
         a = 4;
         a = 5;
     • Anti-Dependence (Write-After-Read)
         int a = 1;
         int b = a + 1;
         a = 2;
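
The three kinds of data dependence can be told apart mechanically from the read and write sets of two statements. The following is a minimal sketch using the standard definitions (flow = read-after-write, anti = write-after-read, output = write-after-write); the `Stmt` type and the `dependences` helper are hypothetical names introduced only for illustration.

```python
from typing import FrozenSet, NamedTuple, Set


class Stmt(NamedTuple):
    """A statement summarized by the variables it reads and writes (illustrative)."""
    text: str
    reads: FrozenSet[str]
    writes: FrozenSet[str]


def dependences(first: Stmt, second: Stmt) -> Set[str]:
    """Kinds of data dependence from `first` to a later statement `second`."""
    kinds = set()
    if first.writes & second.reads:
        kinds.add("flow")    # second reads what first wrote (read-after-write)
    if first.reads & second.writes:
        kinds.add("anti")    # second overwrites what first read (write-after-read)
    if first.writes & second.writes:
        kinds.add("output")  # both write the same location (write-after-write)
    return kinds


s1 = Stmt("a = 1", frozenset(), frozenset({"a"}))
s2 = Stmt("b = a + 1", frozenset({"a"}), frozenset({"b"}))
s3 = Stmt("a = 2", frozenset(), frozenset({"a"}))

print(dependences(s1, s2))  # s2 reads the value s1 wrote
print(dependences(s2, s3))  # s3 overwrites the value s2 read
print(dependences(s1, s3))  # s1 and s3 both write a
```

Any non-empty result means the two statements must keep their original order to preserve sequential consistency.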

  6. Traditional Approach
         for (int i = 0; i < 3; ++i) {
           for (int j = 0; j < i + 1; ++j) {
             a[i,j] = b[i,j] + c[i,j];
             b[i,j] = a[i,j+1];
           }
         }
     • Pair-wise analysis of statements and expressions
     • Can a, b or c refer to the array?

  7. Traditional Approach
         for (int i = 0; i < 3; ++i) {
           for (int j = 0; j < i + 1; ++j) {
             a[i,j] = b[i,j] + c[i,j];
             b[i,j] = a.readIandJInc(i,j);
           }
         }
     • What does a.readIandJInc(i,j) do?
     • Examine ALL possible implementations!

  8. Side-Effects
         class Holder {
           public static int value;
         }
         class Array {
           public int readIandJInc(int i, int j) {
             return this[i,j+1];
           }
         }

  9. Side-Effects
         class Holder {
           public static int value;
         }
         class Array {
           public int readIandJInc(int i, int j) {
             this[0,0] = i + j;
             return this[i,j];
           }
         }

  10. Side-Effects
          class Holder {
            public static int value;
          }
          class Array {
            public int readIandJInc(int i, int j) {
              Holder.value++;
              return this[i,j];
            }
          }

  11. Limitations of Current Techniques • Traditional: • Focused on analyzing complex tight loops • Poor abstraction and composition • Too complex for programmers to use without tool support

  12. The Idea • Goal: • Simplify inter-procedural dependency analysis • Idea: • Ensure safety • Make reasoning modular and composable

  13. The Idea
      • Specify effects on method signature:
          public int getReads() reads<> writes<>
      • What goes in the angle brackets?
        • Abstract effect description
        • Composable descriptions
        • Verifiable

  14. The Idea

  15. Object-Orientation • Encapsulation → representation hierarchy [Diagram: a Person object with fields name (String), dateOfBirth (Date) and employer (Company)]

  16. The Idea

  17. Safe Parallelism
          Block 1 { ... } reads <a,b> writes <c,d>
          Block 2 { ... } reads <w,x> writes <y,z>
      • Can 2 arbitrary pieces of code execute in parallel safely?
      • Type rules specify computation of effect sets
      • Look for overlaps in the read & write effect sets to find possible data dependencies
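
The overlap test on effect sets can be sketched in a few lines. This is an illustrative model only, treating each effect as a flat name; the `Effects` type and `may_conflict` function are hypothetical and not part of the system described in the talk. Two blocks may conflict when one writes something the other reads or writes; shared reads alone are harmless.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Effects:
    """Declared read and write effect sets of a block (illustrative model)."""
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def may_conflict(e1: Effects, e2: Effects) -> bool:
    """True if the effect sets overlap in a way that could carry a data dependency."""
    write_vs_any = e1.writes & (e2.reads | e2.writes)
    read_vs_write = e1.reads & e2.writes
    return bool(write_vs_any or read_vs_write)


block1 = Effects(reads=frozenset("ab"), writes=frozenset("cd"))
block2 = Effects(reads=frozenset("wx"), writes=frozenset("yz"))
block3 = Effects(reads=frozenset("c"))  # reads something block1 writes

print(may_conflict(block1, block2))  # no overlap: safe to run in parallel
print(may_conflict(block1, block3))  # overlap on c: possible dependency
```

The check is conservative: an overlap signals only a possible dependency, while the absence of any overlap is sufficient to rule one out.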

  18. Dependencies using Effect Sets
      • Dependency exists where two triangles of representation overlap
      • Triangles can only be nested:
        • Becomes a check for a parent-child relationship; disjointness → no dependency
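
Because representation triangles can only nest, the overlap test reduces to an ancestor check. A minimal sketch, assuming each context is written as its path from the root of the ownership tree (the tuple encoding and the example names are assumptions made for illustration):

```python
def overlaps(c1: tuple, c2: tuple) -> bool:
    """Two nested contexts overlap exactly when one path is a prefix of the other."""
    shorter, longer = sorted((c1, c2), key=len)
    return longer[:len(shorter)] == shorter


# Hypothetical contexts, each written as its path from the root context.
world = ("world",)
alice = ("world", "alice")
alice_dob = ("world", "alice", "dateOfBirth")
bob = ("world", "bob")

print(overlaps(alice, alice_dob))  # parent and child: triangles overlap
print(overlaps(alice, bob))        # siblings: disjoint, so no dependency
```

With this encoding an effect on a context is read as covering everything nested beneath it, so an effect on `world` overlaps every other effect.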

  19. Types of Parallelism • Task Parallelism • Run 2+ separate ops. at same time • Loop Parallelism • Execute loop iterations in parallel • Pipeline Parallelism • Stage loop body execution so that iteration execution overlaps safely

  20. Task Parallelism
          class Demo {
            void op1() reads<a,b> writes<c,d> {…}
            void op2() reads<w,x> writes<y,z> {…}
          }
      • Can we execute calls to op1 and op2 in parallel?
      • Determine the overlap in the effect sets; no overlap → no data deps.
      • Realization using one-way calls or futures
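
One way to picture the realization with futures is the sketch below, which uses Python's standard `concurrent.futures` module. The `effects` decorator, the `state` dictionary and `run_pair` are hypothetical; in particular, the declared effects here are simply trusted, whereas the system in the talk verifies them with type rules.

```python
from concurrent.futures import ThreadPoolExecutor


def effects(reads, writes):
    """Attach declared (unverified) effect sets to a function."""
    def tag(fn):
        fn.reads, fn.writes = frozenset(reads), frozenset(writes)
        return fn
    return tag


state = {name: 0 for name in "abcdwxyz"}  # stand-in for shared heap state


@effects(reads="ab", writes="cd")
def op1():
    state["c"] = state["a"] + 1
    state["d"] = state["b"] + 2
    return "op1"


@effects(reads="wx", writes="yz")
def op2():
    state["y"] = state["w"] + 3
    state["z"] = state["x"] + 4
    return "op2"


def independent(f, g) -> bool:
    """No write of one operation overlaps a read or write of the other."""
    return not (f.writes & (g.reads | g.writes) or g.writes & f.reads)


def run_pair(f, g):
    """Run f and g as futures when their effects are disjoint, else in order."""
    if independent(f, g):
        with ThreadPoolExecutor(max_workers=2) as pool:
            fut_f, fut_g = pool.submit(f), pool.submit(g)
            return fut_f.result(), fut_g.result(), "parallel"
    return f(), g(), "sequential"


print(run_pair(op1, op2))
```

Calling `run_pair(op1, op1)` takes the sequential branch, because an operation's writes always overlap its own writes.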

  21. Loop Parallelism Conditions
      • Data parallel loops are a major source of parallelism in imperative programs
      • Start with a simple data parallel loop in the form of a foreach loop:
          foreach (T element in collection)
            element.operation();

  22. Foreach Loop Conditions • Condition 1: Areas holding the representations of the objects returned by the enumerator are all disjoint from one another

  23. Foreach Loop Conditions • Condition 2: The operation only mutates the representation of its “own” element and does not read the state owned by any of the other elements

  24. Foreach Loop Conditions • Condition 3: There are no control dependencies which would prevent loop parallelization

  25. Arbitrary Loop Bodies
      • So far we have looked at
          foreach (T element in collection)
            element.operation();
      • Question: How do we generalize this to an arbitrary loop body?
          foreach (T element in collection) {
            // sequence of statements
            // including local var defs
            // and a read of a context r
          }

  26. Loop Body Rewriting
      • Loop becomes:
          foreach (T elem in collection)
            elem.loopBody(this);
      • Where loopBody is:
          class T {
            void loopBody(Foo me) {
              // same sequence of statements
              // replace all elem by this
              // and all this by me
            }
          }
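
The rewriting can be illustrated on a small, made-up example. In the sketch below the `Bank` and `Account` classes are hypothetical; `charge_fees_original` has an arbitrary loop body, and `charge_fees_rewritten` is the same loop after the body has been moved into a method on the element type, with `elem` replaced by `self` and the enclosing `self` passed in as `me`.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def loop_body(self, me):
        """Extracted loop body: `elem` became `self`, the enclosing object became `me`."""
        fee = me.fee_rate * self.balance  # local variable definition
        self.balance -= fee


class Bank:
    def __init__(self, fee_rate, balances):
        self.fee_rate = fee_rate
        self.accounts = [Account(b) for b in balances]

    def charge_fees_original(self):
        # Arbitrary loop body that reads state of the enclosing object.
        for elem in self.accounts:
            fee = self.fee_rate * elem.balance
            elem.balance -= fee

    def charge_fees_rewritten(self):
        # Same loop in the simple element.operation() form.
        for elem in self.accounts:
            elem.loop_body(self)


before = Bank(0.5, [100, 40])
after = Bank(0.5, [100, 40])
before.charge_fees_original()
after.charge_fees_rewritten()
print([a.balance for a in before.accounts], [a.balance for a in after.accounts])
```

Once the loop is in this form, the foreach conditions can be checked against the effects of the single `loop_body` method rather than against an arbitrary statement sequence.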

  27. Object-Orientation • Encapsulation → representation hierarchy [Diagram: a Person object with fields name (String), dateOfBirth (Date) and employer (Company)]

  28. Ownership Types
      • Designed to enforce encapsulation
      • Adapted to validate encapsulation
      • Type parameters to capture memory referencing permissions
          class Person[o,c] {
            private String|this| Name;
            private Date|this| DateOfBirth;
            private Company|c| Employer;
            …
          }

  29. Ownerships & Effects
          class Company[o] {
            public string name;
            …
          }
          class Person[o,c] {
            private Company|c| Employer;
            public string employerName() reads<this,c> writes<> {
              return Employer.name;
            }
            …
          }

  30. Contexts and Dependencies • Analyze & apply sufficient conditions • All pairs of context relations need to be known • Need some basis for believing that the relationships between contexts hold

  31. Reasons for a Runtime System • Statically know some relationships • The owner of an object is a parent of the object’s this context • The world context is a parent of all contexts • Relationship may only be known dynamically • Optionally track at runtime to allow runtime conditions

  32. Conditional Parallelism
      • disjoint(r,c) Always True:
          parallel for (T<c> e in collection) { e.operation(arguments); }
      • disjoint(r,c) unknown:
          if (disjoint(r,c)) { parallel version } else { sequential version }
      • disjoint(r,c) Always False:
          serial for (T<c> e in collection) { e.operation(arguments); }
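
The three cases can be sketched as a single dispatch function. This is an assumption-laden illustration, not the code the compiler emits: `run_loop`, its tri-state `statically_disjoint` argument (`True`, `False`, or `None` for unknown) and the `runtime_disjoint` callback are all hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Optional


def run_loop(collection: Iterable, operation: Callable,
             statically_disjoint: Optional[bool],
             runtime_disjoint: Callable[[], bool]) -> str:
    """Pick the parallel or sequential loop from static knowledge, else test at runtime."""
    items = list(collection)
    if statically_disjoint is None:
        parallel = runtime_disjoint()   # relationship only known dynamically
    else:
        parallel = statically_disjoint  # relationship proven at compile time
    if parallel:
        with ThreadPoolExecutor() as pool:
            list(pool.map(operation, items))  # consume results so errors surface
        return "parallel"
    for element in items:
        operation(element)
    return "sequential"


def bump(cell):
    cell[0] += 1  # mutates only its own element


cells = [[1], [2], [3]]
print(run_loop(cells, bump, True, lambda: False))   # statically known disjoint
print(run_loop(cells, bump, None, lambda: False))   # unknown, runtime test fails
print(cells)
```

The runtime test is only consulted in the unknown case, so loops whose disjointness is settled statically pay no checking cost.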

  33. Reasons for a Runtime System • We do not know the relationships between all contexts at compile time. • May vary from one object or method invocation to another • Reasons: • Separate Compilation • Dynamic Linking • Complex Data Flows

  34. Reasons for a Runtime System
      • Type system provides support for specifying context relationships that the programmer asserts must be true
          void oper1[r]() reads<r,c…> writes<…> where r # c {
            …
            foreach (T|c| elem in collection) { … }
            …
          }

  35. Runtime System Implementation • Naïve implementation – each object keeps a pointer to its owner
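
A minimal sketch of the naive scheme, assuming every tracked object stores a reference to its owner and the root "world" context has none. The `Owned` class and the `disjoint` function are hypothetical and are not taken from the Zal runtime.

```python
class Owned:
    """An object that records its owner; the root context has owner None."""
    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner


def ancestors(obj):
    """Yield the object's own context and every enclosing context up to the root."""
    while obj is not None:
        yield obj
        obj = obj.owner


def disjoint(x, y) -> bool:
    """True when neither context encloses (or is) the other."""
    x_encloses_y = any(a is x for a in ancestors(y))
    y_encloses_x = any(a is y for a in ancestors(x))
    return not (x_encloses_y or y_encloses_x)


world = Owned("world")
person = Owned("person", owner=world)
dob = Owned("dateOfBirth", owner=person)
company = Owned("company", owner=world)

print(disjoint(person, company))  # siblings under world
print(disjoint(person, dob))      # person owns dob
```

Each test walks the owner chain, so its cost grows with the depth of the ownership tree; that is one reason to call this implementation naive.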

  36. [Diagram; extracted node labels:] Well Formed Heap Subject Reduction Progress Owner Invariance AFJO Soundness Effect Soundness Contexts form a Tree Cast Safety Effect Completeness Static Context Relations Context Parameters do not survive Disjointness Test Correct Context Disjointness Implies Effect Disjointness Disjoint effects imply no data dependencies Update Dependency Preservation Sufficient for Parallelization Sequential Consistency Task Parallelism Sufficient Conditions Data Parallelism Sufficient Conditions Pipeline Parallelism Sufficient Conditions

  37. Implementation – Zal • Added my system to C# 3.5 • Extended GPC# compiler • Added infrastructure to support arbitrary type parameters • Implemented runtime ownership tracking system (~1,000 lines)

  38. Implementation – Zal [Diagram: Zal source → Zal Compiler → C# source → Microsoft C# Compiler → CIL Program w/ Ownership Tracking; together with the Runtime Ownership Libraries this gives the Executing Program with Automatic Parallelization]

  39. Implementation – Zal
      [Diagram: compiler pipeline; stages exchange Tokens and ASTs. Legend: C# compilation step, Zal compilation step, I/O]
      • Scanner (generated by GPLex): Scanner.scan() reads a stream of characters and processes them into tokens
      • Parser (generated by Coco/R): Parser.parse() converts the stream of tokens into an Abstract Syntax Tree
      • Type Checker: TypeCheck() resolves all TypeRefs to TypeDefs & checks type correctness
      • Effect Computation: computeEffects() / LocalEffects() computes heap & stack effects for AST nodes
      • Parallelization: Parallelize() checks sufficient conditions for parallelism and implements them
      • Ownership Implementation: BuildOwnershipImplementation() implements Zal features in C# by modifying the AST
      • Code Generation: Output() / Emit generates C# or CIL implementation of the AST

  40. Validation • Have applied my system to a number of realistic applications • Overall annotation requires modification to 20% of the source • Ownership tracking overhead: • Execution time: 10% to 20% • Memory usage: 15% to 30% • Implementation not fully optimized

  41. Validation – Speedup

  42. Validation – Speedup

  43. Related Work – Prog. Langs. • Focus on providing tools to express parallelism • No support for validating correctness of parallelization • Assumed programmer knowledge of parallel programming constructs • Examples: Fortress, Chapel, X10

  44. Related Work – Ownership • Have proposed effect systems, but only suggested application to parallelism • Data race and deadlock detection for locking – very different reasoning • Deterministic Parallel Java (late 2009) • modified ownerships • Focused on kernels • Lost composition & abstraction to do so

  45. Contributions • Abstract and composable system for reasoning about effects based on Ownership Types. • Effect and reasoning systems applied to a real language and real program examples • Real parallelism detected and exploited automatically

  46. Contributions • Developed and proved sufficient conditions for a number of different forms of parallelism • Runtime system to support static reasoning.

  47. Publications
      • A. Craik and W. Kelly. Using Ownership to Reason About Inherent Parallelism in Imperative Object-Oriented Programs. International Conference on Compiler Construction, ed. R. Gupta, LNCS 6011, pp. 145-164, Springer-Verlag Berlin Heidelberg, 2010.
      • W. Reid, W. Kelly, and A. Craik. Reasoning about Parallelism in Modern Object-Oriented Languages. Australasian Computer Science Conference, 2008.
      • Plus 3 technical reports on various versions of the reasoning system in e-prints

  48. Conclusion • System for reasoning about data dependencies and parallelism • Abstract & composable • Usable by both programmers & automated tools • Question of when & how to exploit still open • Demonstrated with a prototype that this automated reasoning is possible

  49. Q & A

  50. Ownership & The Stack • Ownerships traditionally for encapsulation • Stack not considered by these works • Stack & stack referencing models vary from language to language • I consider a restricted stack model: • Stack and heap are disjoint • Stack locations can be differentiated by name
