Analyses and Optimizations for Multithreaded Programs
John Whaley, IBM Tokyo Research Laboratory
Martin Rinard, Alex Salcianu, Brian Demsky, MIT Laboratory for Computer Science
Motivation • Threads are Ubiquitous • Parallel Programming for Performance • Manage Multiple Connections • System Structuring Mechanism • Overhead • Thread Management • Synchronization • Opportunities • Improved Memory Management
What This Talk is About • New Abstraction: Parallel Interaction Graph • Points-To Information • Reachability and Escape Information • Interaction Information • Caller-Callee Interactions • Starter-Startee Interactions • Action Ordering Information • Analysis Algorithm • Analysis Uses (synchronization elimination, stack allocation, per-thread heap allocation)
Outline • Example • Analysis Representation and Algorithm • Lightweight Threads • Results • Conclusion
Sum Sequence of Numbers • Input: 9 8 1 5 3 7 2 6
Group in Subsequences • (9 8) (1 5) (3 7) (2 6)
Sum Subsequences (in Parallel) • 9 + 8 = 17 • 1 + 5 = 6 • 3 + 7 = 10 • 2 + 6 = 8
Add Sums Into Accumulator • Accumulator after each add: 0, 17, 23, 33, 41
Common Schema • Set of tasks • Chunk tasks to increase granularity • Tasks have both • Independent computation • Updates to shared data
Realization in Java
class Accumulator {
    int value = 0;
    synchronized void add(int v) { value += v; }
}
Realization in Java
import java.util.Enumeration;
import java.util.Vector;

class Task extends Thread {
    Vector work;        // numbers this task sums
    Accumulator dest;   // shared accumulator that receives the sum
    Task(Vector w, Accumulator d) { work = w; dest = d; }
    public void run() {
        int sum = 0;
        Enumeration e = work.elements();
        while (e.hasMoreElements())
            sum += ((Integer) e.nextElement()).intValue();
        dest.add(sum);
    }
}
[Figure: a Task object whose work field references a Vector (elements 2, 6) and whose dest field references an Accumulator (value 0); run creates an Enumeration over the Vector]
Realization in Java
void generateTask(int l, int u, Accumulator a) {
    Vector v = new Vector();
    for (int j = l; j < u; j++)
        v.addElement(new Integer(j));
    Task t = new Task(v, a);
    t.start();
}
void generate(int n, int m, Accumulator a) {
    // Task i handles the m values i*m .. (i+1)*m - 1
    for (int i = 0; i < n; i++)
        generateTask(i * m, (i + 1) * m, a);
}
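The schema above (independent per-task computation, then one synchronized update to shared data) can be exercised end to end with a small driver. The following is a minimal sketch, not the authors' code: the names SketchAccumulator, SketchTask, and SumSketch are hypothetical, the chunk bounds [i*m, (i+1)*m) are an assumption, and the driver keeps the threads in a list so it can join them, which the slides do not show.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the slides' Accumulator, with a getter added.
class SketchAccumulator {
    private int value = 0;
    synchronized void add(int v) { value += v; }
    synchronized int get() { return value; }
}

// Hypothetical stand-in for the slides' Task.
class SketchTask extends Thread {
    private final List<Integer> work;
    private final SketchAccumulator dest;

    SketchTask(List<Integer> work, SketchAccumulator dest) {
        this.work = work;
        this.dest = dest;
    }

    @Override
    public void run() {
        int sum = 0;                 // independent computation on private data
        for (int x : work) sum += x;
        dest.add(sum);               // one synchronized update to shared data
    }
}

class SumSketch {
    // Sums 0 .. n*m-1 using n tasks of m elements each (assumed chunking).
    static int sumInParallel(int n, int m) {
        SketchAccumulator acc = new SketchAccumulator();
        List<Thread> tasks = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<Integer> chunk = new ArrayList<>();
            for (int j = i * m; j < (i + 1) * m; j++) chunk.add(j);
            Thread t = new SketchTask(chunk, acc);
            tasks.add(t);
            t.start();
        }
        try {
            for (Thread t : tasks) t.join();   // wait for every task to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining tasks", e);
        }
        return acc.get();
    }
}
```

Calling SumSketch.sumInParallel(4, 2) starts four tasks of two elements each and returns the sum of 0 through 7.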
Task Generation • [Figure, built up over eight slides: one shared Accumulator (value 0); each generateTask call allocates a Vector of Integer elements and a Task whose work field references that Vector and whose dest field references the shared Accumulator; the final slide shows three Task/Vector pairs]
Analysis Overview • Interprocedural • Interthread • Flow-sensitive • Statement ordering within thread • Action ordering between threads • Compositional, Bottom Up • Explicitly Represent Potential Interactions Between Analyzed and Unanalyzed Parts • Partial Program Analysis
Analysis Result for run Method
public void run() {
    int sum = 0;
    Enumeration e = work.elements();
    while (e.hasMoreElements())
        sum += ((Integer) e.nextElement()).intValue();
    dest.add(sum);
}
[Figure: points-to graph for run with nodes this, Task, Vector, Accumulator, and Enumeration, and edges labeled work and dest]
• Abstraction: Points-to Graph • Nodes Represent Objects • Edges Represent References
• Inside Nodes • Objects Created Within Current Analysis Scope • One Inside Node Per Allocation Site • Represents All Objects Created At That Site
• Outside Nodes • Objects Created Outside Current Analysis Scope • Objects Accessed Via References Created Outside Current Analysis Scope • One per Static Class Field • One per Parameter • One per Load Statement • Represents Objects Loaded at That Statement
• Inside Edges • References Created Inside Current Analysis Scope
• Outside Edges • References Created Outside Current Analysis Scope • Potential Interactions in Which Analyzed Part Reads Reference Created in Unanalyzed Part
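The node and edge categories above suggest a simple data structure. The following is a minimal sketch under stated assumptions: the class PointsToGraphSketch and all of its members are hypothetical, and forRun() is one plausible reading of the figure for run (this as a parameter node, the work and dest loads as load nodes reached by outside edges, the Enumeration as an inside node, assuming elements() is within the analysis scope).

```java
import java.util.ArrayList;
import java.util.List;

class PointsToGraphSketch {
    // INSIDE: created within the analysis scope. The other kinds are outside nodes.
    enum NodeKind { INSIDE, PARAMETER, LOAD, STATIC_FIELD }

    static final class Node {
        final String label;
        final NodeKind kind;
        Node(String label, NodeKind kind) { this.label = label; this.kind = kind; }
        boolean isOutside() { return kind != NodeKind.INSIDE; }
    }

    static final class Edge {
        final Node from;
        final String field;
        final Node to;
        final boolean inside;   // true if the reference was created inside the scope
        Edge(Node from, String field, Node to, boolean inside) {
            this.from = from; this.field = field; this.to = to; this.inside = inside;
        }
    }

    final List<Node> nodes = new ArrayList<>();
    final List<Edge> edges = new ArrayList<>();

    Node node(String label, NodeKind kind) {
        Node n = new Node(label, kind);
        nodes.add(n);
        return n;
    }

    void edge(Node from, String field, Node to, boolean inside) {
        edges.add(new Edge(from, field, to, inside));
    }

    // Counts inside edges (inside == true) or outside edges (inside == false).
    int countEdges(boolean inside) {
        int c = 0;
        for (Edge e : edges) if (e.inside == inside) c++;
        return c;
    }

    // Assumed reconstruction of the graph for Task.run(); not taken verbatim from the slides.
    static PointsToGraphSketch forRun() {
        PointsToGraphSketch g = new PointsToGraphSketch();
        Node task = g.node("this (Task)", NodeKind.PARAMETER);
        Node vector = g.node("this.work (Vector)", NodeKind.LOAD);
        Node acc = g.node("this.dest (Accumulator)", NodeKind.LOAD);
        g.node("Enumeration", NodeKind.INSIDE);   // object allocated during run
        g.edge(task, "work", vector, false);      // run reads a reference created elsewhere
        g.edge(task, "dest", acc, false);
        return g;
    }
}
```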
Concept of Escaped Node • Escaped Nodes Represent Objects Accessible Outside Current Analysis Scope • parameter nodes, load nodes • static class field nodes • nodes passed to unanalyzed methods • nodes reachable from unanalyzed but started threads • nodes reachable from escaped nodes • Node is Captured if it is Not Escaped
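The escape rules above amount to a reachability computation: start from the directly escaping nodes and follow edges. The following is a minimal sketch, assuming nodes are identified by strings and that the caller supplies the root set (parameter, load, and static field nodes, nodes passed to unanalyzed methods, and started threads). EscapeSketch and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class EscapeSketch {
    // edges maps each node to the nodes it references (field labels omitted for brevity).
    // roots are the nodes that escape directly.
    static Set<String> escaped(Map<String, Set<String>> edges, Set<String> roots) {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            String n = worklist.pop();
            for (String succ : edges.getOrDefault(n, Collections.<String>emptySet())) {
                if (seen.add(succ)) worklist.push(succ);   // reachable from an escaped node
            }
        }
        return seen;
    }

    // A node is captured if it is not escaped.
    static boolean captured(String node, Set<String> escaped) {
        return !escaped.contains(node);
    }
}
```

In a graph where a parameter node references a, a references b, and an unrelated inside node c references d, the nodes a and b escape while c and d are captured.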
Why Escaped Concept is Important • Completeness of Analysis Information • Complete information for captured nodes • Potentially incomplete for escaped nodes • Lifetime Implications • Captured nodes are inaccessible when analyzed part of the program terminates • Memory Management Optimizations • Stack allocation • Per-Thread Heap Allocation
Intrathread Dataflow Analysis • Computes a points-to escape graph for each program point • Points-to escape graph is a triple <I, O, e> • I - set of inside edges • O - set of outside edges • e - escape information for each node
Dataflow Analysis • Initial state: I: formals point to parameter nodes, classes point to class nodes; $O = \emptyset$ • Transfer functions: $I' = (I - \mathit{Kill}_I) \cup \mathit{Gen}_I$ and $O' = O \cup \mathit{Gen}_O$ • Confluence operator is $\cup$
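The transfer functions and the confluence operator are plain set operations. The following is a minimal sketch in which edges are opaque elements of a set; DataflowSketch and its method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class DataflowSketch {
    // I' = (I - Kill_I) U Gen_I
    static <E> Set<E> transferInside(Set<E> inside, Set<E> kill, Set<E> gen) {
        Set<E> out = new HashSet<>(inside);
        out.removeAll(kill);
        out.addAll(gen);
        return out;
    }

    // O' = O U Gen_O (outside edges are never killed)
    static <E> Set<E> transferOutside(Set<E> outside, Set<E> gen) {
        Set<E> out = new HashSet<>(outside);
        out.addAll(gen);
        return out;
    }

    // Confluence at control-flow merge points: set union.
    static <E> Set<E> join(Set<E> a, Set<E> b) {
        Set<E> out = new HashSet<>(a);
        out.addAll(b);
        return out;
    }
}
```

Because the confluence operator is union, an edge that holds on either incoming path is kept at the merge point.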
Intraprocedural Analysis • Must define transfer functions for: • copy statement l = v • load statement l1 = l2.f • store statement l1.f = l2 • return statement return l • object creation site l = new cl • method invocation l = l0.op(l1…lk)
copy statement l = v • $\mathit{Kill}_I = \mathit{edges}(I, l)$ • $\mathit{Gen}_I = \{l\} \times \mathit{succ}(I, v)$ • $I' = (I - \mathit{Kill}_I) \cup \mathit{Gen}_I$ • [Figure: existing edges from l and v, then generated edges from l to every node v points to]
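The copy rule can be sketched as follows, assuming the variable part of I is stored as a map from each local variable to the set of nodes it points to. CopySketch and the string node names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CopySketch {
    // Returns the variable edges after "l = v"; the input map is left unchanged.
    static Map<String, Set<String>> copy(Map<String, Set<String>> vars, String l, String v) {
        Map<String, Set<String>> out = new HashMap<>(vars);
        // Gen_I = {l} x succ(I, v)
        Set<String> targets = new HashSet<>(vars.getOrDefault(v, Collections.<String>emptySet()));
        // Replacing the entry for l both kills edges(I, l) and adds Gen_I (a strong update).
        out.put(l, targets);
        return out;
    }
}
```

The update is strong because l is a local variable: after the statement, l points only to what v pointed to.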
load statement l1 = l2.f • $S_E = \{\, n_2 \in \mathit{succ}(I, l_2) \mid \mathit{escaped}(n_2) \,\}$ • $S_I = \bigcup \{\, \mathit{succ}(I, n_2, f) \mid n_2 \in \mathit{succ}(I, l_2) \,\}$ • case 1: l2 does not point to an escaped node ($S_E = \emptyset$) • $\mathit{Kill}_I = \mathit{edges}(I, l_1)$ • $\mathit{Gen}_I = \{l_1\} \times S_I$ • [Figure: existing edges, then generated edges from l1 to the f-successors of the nodes l2 points to]
load statement l1 = l2.f • case 2: l2 does point to an escaped node ($S_E \neq \emptyset$) • $\mathit{Kill}_I = \mathit{edges}(I, l_1)$ • $\mathit{Gen}_I = \{l_1\} \times (S_I \cup \{n\})$ • $\mathit{Gen}_O = (S_E \times \{f\}) \times \{n\}$ where n is the load node for l1 = l2.f • [Figure: existing edges, then a generated inside edge from l1 to n and an outside edge labeled f from the escaped node to n]
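Both load cases can be handled by one routine that computes S_E and S_I and adds the load node only when S_E is non-empty. The following is a minimal sketch with hypothetical names (LoadSketch, Graph); it assumes string node identifiers, encodes a field edge source as the key "node.field", and treats the escape information as a precomputed set.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LoadSketch {
    static final class Graph {
        final Map<String, Set<String>> vars = new HashMap<>();     // variable -> nodes (part of I)
        final Map<String, Set<String>> inside = new HashMap<>();   // "node.f" -> nodes (part of I)
        final Map<String, Set<String>> outside = new HashMap<>();  // "node.f" -> nodes (O)
        final Set<String> escaped = new HashSet<>();               // escape information e
    }

    // Applies "l1 = l2.f" in place; loadNode is the node n for this load statement.
    static void load(Graph g, String l1, String l2, String f, String loadNode) {
        Set<String> sE = new HashSet<>();   // escaped nodes that l2 points to
        Set<String> sI = new HashSet<>();   // f-successors along inside edges
        for (String n2 : g.vars.getOrDefault(l2, Collections.<String>emptySet())) {
            if (g.escaped.contains(n2)) sE.add(n2);
            sI.addAll(g.inside.getOrDefault(n2 + "." + f, Collections.<String>emptySet()));
        }
        if (!sE.isEmpty()) {
            // Case 2: unanalyzed code may have written the field, so add the load node.
            sI.add(loadNode);
            g.escaped.add(loadNode);        // load nodes count as escaped
            for (String n2 : sE) {
                g.outside.computeIfAbsent(n2 + "." + f, k -> new HashSet<>()).add(loadNode);
            }
        }
        g.vars.put(l1, sI);                 // Kill edges(I, l1), then add {l1} x S_I
    }
}
```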
store statement l1.f = l2 • $\mathit{Gen}_I = (\mathit{succ}(I, l_1) \times \{f\}) \times \mathit{succ}(I, l_2)$ • $I' = I \cup \mathit{Gen}_I$ • [Figure: existing edges from l1 and l2, then generated edges labeled f from every node l1 points to, to every node l2 points to]
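The store rule has no kill set, so it only adds edges. The following is a minimal sketch, assuming string node identifiers and field edges keyed by "node.field"; StoreSketch is a hypothetical name.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class StoreSketch {
    // Applies "l1.f = l2" in place: Gen_I = (succ(I, l1) x {f}) x succ(I, l2), I' = I U Gen_I.
    static void store(Map<String, Set<String>> vars,
                      Map<String, Set<String>> fieldEdges,
                      String l1, String f, String l2) {
        Set<String> targets = vars.getOrDefault(l2, Collections.<String>emptySet());
        for (String n1 : vars.getOrDefault(l1, Collections.<String>emptySet())) {
            // Existing f-edges from n1 are kept (a weak update).
            fieldEdges.computeIfAbsent(n1 + "." + f, k -> new HashSet<>()).addAll(targets);
        }
    }
}
```

Existing edges survive because one node may stand for many objects, so the analysis cannot tell which object's field was overwritten.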
object creation site l = new cl • $\mathit{Kill}_I = \mathit{edges}(I, l)$ • $\mathit{Gen}_I = \{\langle l, n \rangle\}$ where n is the inside node for l = new cl • [Figure: existing edges from l, then a single generated edge from l to n]
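The object creation rule can be sketched in the same style. The sketch assumes each allocation site has a caller-supplied identifier and names the inside node after it; NewSketch and the "inside@" prefix are hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class NewSketch {
    // One inside node per allocation site: the node name depends only on the site id.
    static String insideNode(String site) {
        return "inside@" + site;
    }

    // Applies "l = new cl" at the given site, in place.
    static void allocate(Map<String, Set<String>> vars, String l, String site) {
        Set<String> target = new HashSet<>();
        target.add(insideNode(site));   // Gen_I = {<l, n>}
        vars.put(l, target);            // Kill_I = edges(I, l)
    }
}
```

Two executions of the same site, for example successive iterations of a loop, map to the same inside node, which is how one node comes to represent all objects created at that site.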
Method Call • Analysis of a method call: • Start with points-to escape graph before the call site • Retrieve the points-to escape graph from analysis of callee • Map outside nodes of callee graph to nodes of caller graph • Combine callee graph into caller graph • Result is the points-to escape graph after the call site
Start With Graph Before Call • Points-to Escape Graph before call to t = new Task(v,a) • [Figure: graph with variables a, t, and v]