Elixir: A System for Synthesizing Concurrent Graph Programs
Dimitrios Prountzos (1), Roman Manevich (2), Keshav Pingali (1)
(1) The University of Texas at Austin
(2) Ben-Gurion University of the Negev

Goal
Allow programmer to easily implement correct and efficient parallel graph algorithms
• Graph algorithms are ubiquitous: social network analysis, computer graphics, machine learning, …
• Difficult to parallelize due to their irregular nature
• Best algorithm and implementation usually
  • Platform dependent
  • Input dependent
• Need to easily experiment with different solutions
• Focus: fixed graph structure
  • Only change labels on nodes and edges
  • Each activity touches a fixed number of nodes

Example: Single-Source Shortest-Path
• Problem formulation
  • Compute shortest distance from source node S to every other node
• Many algorithms
  • Bellman-Ford (1957)
  • Dijkstra (1959)
  • Chaotic relaxation (Miranker 1969)
  • Delta-stepping (Meyer et al. 1998)
• Common structure
  • Each node has label dist with known shortest distance from S
• Key operation
  • relax-edge(u,v), shown here for edge (A,C):
      if dist(A) + W_AC < dist(C)
          dist(C) = dist(A) + W_AC
[Figure: weighted example graph with source S and nodes A to G]

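The relax-edge operation can be sketched in a few lines of Python. This is only an illustration: the function name `relax_edge`, the dictionary-based `dist` labels, and the sample values are assumptions, not part of the slides.

```python
import math

def relax_edge(dist, u, v, w):
    """Relax edge (u, v) of weight w; return True if dist[v] improved."""
    if dist[u] + w < dist[v]:
        dist[v] = dist[u] + w
        return True
    return False

# Hypothetical labels: A is known to be at distance 3, C is not reached yet.
dist = {"A": 3, "C": math.inf}
changed = relax_edge(dist, "A", "C", 4)
print(changed, dist["C"])  # True 7
```

Returning whether the label changed is a convenience for the schedulers on the following slides, which need to know when new work may have been created.
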
Dijkstra’s Algorithm
Scheduling of relaxations:
• Use priority queue of nodes, ordered by label dist
• Iterate over nodes u in priority order
• On each step: relax all neighbors v of u
  • Apply relax-edge to all (u,v)
[Figure: example graph annotated with dist labels, with priority-queue entries such as <B,5>, <C,3>, <E,6>, <D,7>]

Chaotic Relaxation
• Scheduling of relaxations:
  • Use unordered set of edges
  • Iterate over edges (u,v) in any order
• On each step:
  • Apply relax-edge to edge (u,v)
[Figure: example graph with an unordered edge set containing (C,D), (B,C), (S,A), (C,E)]

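A minimal Python sketch of chaotic relaxation follows. The slide does not say how edges re-enter the set; here, as an assumption, the outgoing edges of v are put back whenever dist(v) drops, which is what lets the loop stop at a fixpoint. The function name, the edge-dictionary format, and the sample graph are made up for illustration, and weights are assumed non-negative.

```python
import math

def chaotic_relaxation(edges, source):
    """edges maps (u, v) -> weight. Returns the dist label of every node."""
    out_edges = {}          # node -> list of its outgoing edges
    dist = {source: 0}
    for (u, v) in edges:
        out_edges.setdefault(u, []).append((u, v))
        dist.setdefault(u, math.inf)
        dist.setdefault(v, math.inf)

    worklist = set(edges)   # unordered set of edges
    while worklist:
        u, v = worklist.pop()                     # any order is allowed
        if dist[u] + edges[(u, v)] < dist[v]:     # relax-edge guard
            dist[v] = dist[u] + edges[(u, v)]
            # Assumption: edges leaving v may now be relaxable again.
            worklist.update(out_edges.get(v, []))
    return dist

# Hypothetical graph, not the one drawn on the slide.
edges = {("S", "A"): 5, ("S", "B"): 2, ("B", "A"): 1, ("A", "C"): 3}
print(chaotic_relaxation(edges, "S"))
```

The final labels do not depend on the order in which `set.pop` returns edges; only the amount of wasted work does.
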
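Insights Behind Elixir
Parallel graph algorithm = Operators + Schedule
• Operators: what should be done
• Schedule: how it should be done
  • Order activity processing
  • Identify new activities (operator delta)
• Unordered/ordered algorithms: “TAO of parallelism”, PLDI 2011
[Diagram: a parallel graph algorithm decomposed into operators and a schedule with static-schedule and dynamic-schedule parts; a legend marks activities]
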
Insights Behind Elixir
Dijkstra-style algorithm:
    q = new PrQueue
    q.enqueue(SRC)
    while (!q.empty) {
      a = q.dequeue
      for each e = (a,b,w) {
        if dist(a) + w < dist(b) {
          dist(b) = dist(a) + w
          q.enqueue(b)
        }
      }
    }
[Diagram: the loop annotated with the Operators / Schedule decomposition (order activity processing, identify new activities, static schedule, dynamic schedule)]

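The Dijkstra-style loop can be written as runnable Python with the standard `heapq` module. This is a sketch rather than the slide's exact algorithm: `heapq` has no decrease-key, so the version below pushes duplicate entries and skips stale ones, and the adjacency-list format and sample graph are assumptions.

```python
import heapq
import math

def dijkstra_style(adj, source):
    """adj maps every node to a list of (neighbor, weight) pairs."""
    dist = {node: math.inf for node in adj}
    dist[source] = 0
    queue = [(0, source)]                 # priority queue ordered by dist
    while queue:
        d, a = heapq.heappop(queue)
        if d > dist[a]:
            continue                      # stale entry, a was already improved
        for b, w in adj[a]:
            if dist[a] + w < dist[b]:     # operator: guard
                dist[b] = dist[a] + w     # operator: update
                heapq.heappush(queue, (dist[b], b))  # schedule: new activity
    return dist

# Hypothetical graph, not the one drawn on the slides.
adj = {"S": [("A", 5), ("B", 2)], "A": [("C", 3)], "B": [("A", 1)], "C": []}
print(dijkstra_style(adj, "S"))
```

The comments mark which lines belong to the operator and which to the schedule, mirroring the decomposition the slide draws over the pseudocode.
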
Contributions
• Language
  • Operators/schedule separation
  • Allows exploration of implementation space
• Operator delta inference
  • Precise delta required for efficient fixpoint computations
• Automatic parallelization
  • Inserts synchronization to atomically execute operators
  • Avoids data races / deadlocks
  • Specializes parallelization based on scheduling constraints
[Diagram: Operators / Schedule decomposition, extended with Synchronization]

SSSP in Elixir
Graph type:
    Graph [ nodes(node : Node, dist : int)
            edges(src : Node, dst : Node, wt : int) ]
Operator:
    relax = [ nodes(node a, dist ad)
              nodes(node b, dist bd)
              edges(src a, dst b, wt w)
              bd > ad + w ] ➔
            [ bd = ad + w ]
Fixpoint statement:
    sssp = iterate relax ≫ schedule

Operators
The relax operator has three parts:
• Redex pattern: nodes(node a, dist ad), nodes(node b, dist bd), edges(src a, dst b, wt w)
• Guard: bd > ad + w
• Update: bd = ad + w
[Figure: edge a → b with weight w; if bd > ad + w, the label of b changes from bd to ad + w]

Fixpoint Statement
    sssp = iterate relax ≫ schedule
• iterate relax: apply operator until fixpoint
• schedule: scheduling expression

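One naive reading of `iterate relax`, ignoring the scheduling expression, is to rescan every edge until no guard holds. The Python sketch below illustrates that fixpoint semantics only; it is not how Elixir generates code, and the function name and sample data are assumptions.

```python
import math

def iterate_to_fixpoint(dist, edges):
    """Apply the relax operator to every matching edge until nothing changes.

    dist maps node -> label; edges maps (a, b) -> w. Mutates dist in place
    and returns the number of full passes over the edge set.
    """
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for (a, b), w in edges.items():   # every redex: an edge with its endpoints
            if dist[b] > dist[a] + w:     # guard
                dist[b] = dist[a] + w     # update
                changed = True
    return passes

# Hypothetical graph and initial labels.
edges = {("A", "C"): 3, ("B", "A"): 1, ("S", "A"): 5, ("S", "B"): 2}
dist = {"S": 0, "A": math.inf, "B": math.inf, "C": math.inf}
iterate_to_fixpoint(dist, edges)
print(dist)
```

Each pass rechecks edges whose endpoints did not change, which is the waste that a precise operator delta is meant to avoid.
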
Scheduling Examples
Two choices of schedule for sssp = iterate relax ≫ schedule:
• Locality-enhanced label-correcting: group b ≫ unroll 2 ≫ approx metric ad
• Dijkstra-style: metric ad ≫ group b
[The slide shows the Dijkstra-style priority-queue loop alongside the Elixir Graph and relax definitions]

Operator Delta Inference

Identifying the Delta of an Operator
[Figure: relax1 applied to edge (a,b); question marks on the neighboring edges]

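Delta Inference Example
Setting: edge (a,b) with weight w1 and edge (c,b) with weight w2; relax1 is applied to (a,b), relax2 is the candidate on (c,b)
Query program (sent to the SMT solver):
    assume (da + w1 < db)
    assume ¬(dc + w2 < db)
    db_post = da + w1
    assert ¬(dc + w2 < db_post)
Result: (c,b) does not become active
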
Delta Inference Example – Active
Setting: edge (a,b) with weight w1 and edge (b,c) with weight w2; relax1 is applied to (a,b), relax2 is the candidate on (b,c)
Query program (sent to the SMT solver):
    assume (da + w1 < db)
    assume ¬(db + w2 < dc)
    db_post = da + w1
    assert ¬(db_post + w2 < dc)
Result: apply relax on all outgoing edges (b,c) such that dc > db + w2 and c ≠ a

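The two query programs can be explored without an SMT solver by brute force over a small range of values. The Python sketch below swaps in bounded enumeration for the solver, so it can find counterexamples but proves nothing beyond the bound; all function names and the bound are assumptions made for illustration.

```python
from itertools import product

def find_counterexample(query, bound=4):
    """Try all (da, db, dc, w1, w2) in range(bound); return a failing tuple or None."""
    for values in product(range(bound), repeat=5):
        if not query(*values):
            return values
    return None

def incoming_edge_query(da, db, dc, w1, w2):
    """relax1 on (a,b); does the inactive edge (c,b) stay inactive?"""
    if not (da + w1 < db):      # assume: (a,b) is active
        return True
    if dc + w2 < db:            # assume: (c,b) is not active
        return True
    db_post = da + w1
    return not (dc + w2 < db_post)      # assert

def outgoing_edge_query(da, db, dc, w1, w2):
    """relax1 on (a,b); does the inactive edge (b,c) stay inactive?"""
    if not (da + w1 < db):      # assume: (a,b) is active
        return True
    if db + w2 < dc:            # assume: (b,c) is not active
        return True
    db_post = da + w1
    return not (db_post + w2 < dc)      # assert

print(find_counterexample(incoming_edge_query))  # None within the bound
print(find_counterexample(outgoing_edge_query))  # some failing assignment
```

A failed assertion for the outgoing edge is what places (b,c) in the delta of relax; the incoming edge (c,b) never fails, so it need not be rescheduled.
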
System Architecture
• Algorithm spec → Elixir → C++ program
• Elixir
  • Synthesize code
  • Insert synchronization
• Galois/OpenMP parallel runtime
  • Parallel thread pool
  • Graph implementations
  • Worklist implementations

Experiments
• Compare against hand-written parallel implementations

SSSP Results
• 24-core Intel Xeon @ 2 GHz
• USA Florida road network (1 M nodes, 2.7 M edges)
• Group + Unroll improve locality
[Plot: results per implementation variant]

Breadth-First Search Results
• Scale-free graph: 1 M nodes, 8 M edges
• USA road network: 24 M nodes, 58 M edges

Conclusion
• Graph algorithm = Operators + Schedule
  • Elixir language: imperative operators + declarative schedule
• Allows exploring implementation space
• Automated reasoning for efficiently computing fixpoints
• Correct-by-construction parallelization
• Performance competitive with hand-parallelized code

Related Work
• DSL synthesis
  • SPIRAL [Püschel et al. IEEE’05], Pochoir [Tang et al. SPAA’11], Green-Marl [Hong et al. ASPLOS’12]
• Synthesis from logical specifications
  • [Itzhaky et al. OOPSLA’10], [Srivastava et al. POPL’10], Sketching [Solar-Lezama et al. PLDI’08], Paraglide [Vechev et al. PLDI’08]
• Term and graph rewriting
  • PROGRES [Schürr’99], GrGen [Geiß’06], GP [Plump’09]
• Finite differencing [Paige’82]

Read paper for…
• Full scheduling language
• Parallelizing ordered iterations
  • Automatic reasoning to enable level-parallel execution
• Specialization of dynamic scheduler
• Synchronization details
• Synthesis procedures

Influence Patterns
[Figure: six ways in which edges (a,b) and (c,d) can share nodes: b=c; a=d; a=c; a=c and b=d; b=d; a=d and b=c]