270 likes | 688 Views
Streaming Graph Partitioning for Large Distributed Graphs. Isabelle Stanton, UC Berkeley Gabriel Kliot , Microsoft Research XCG. Motivation. Modern graph datasets are huge The web graph had over a trillion links in 2011. Now?
E N D
Streaming Graph Partitioning for Large Distributed Graphs Isabelle Stanton, UC Berkeley Gabriel Kliot, Microsoft Research XCG
Motivation • Modern graph datasets are huge • The web graph had over a trillion links in 2011. Now? • facebook has “more than 901 million users with average degree 130” • Protein networks
Motivation • We still need to perform computations, so we have to deal with large data • PageRank (and other matrix-multiply problems) • Broadcasting status updates • Database queries • And on and on and on… Graph has to be distributed across a cluster of machines! P QL
Motivation • Edges cut correspond (approximately) to communication volume required • Too expensive to move data on the network • Interprocessor communication: nanoseconds • Network communication: microseconds • The data has to be loaded onto the cluster at some point… • Can we partition while we load the data?
High Level Background • Graph partitioning is NP-hard on a good day • But then we made it harder: • Graphs like social networks are notoriously difficult to partition (expander-like) • Large data sets drastically reduce the amount of computation that is feasible – O(n) or less • The partitioning algorithms need to be parallel and distributed
The Streaming Model Possible Buffer of size Each machine holds nodes Graph Stream → • Graph is ordered: • Random • Breadth-First Search • Depth-First Search Partitioner Goal: Generate an approximately balanced k-partitioning
Lower Bounds On Orderings Best balanced -partition cuts edges • Adversarial Ordering • Give every other vertex • See no edges till ! • Can’t compete • DFS Ordering • Stream is connected • Greedy will do optimally Theory says these types of algorithms can’t do well • Random Ordering • Birthday paradox: won’t see edges until • Still can’t compete with edges cut
Current Approach in Real Systems • Totally ignore edges and hash vertex ID • Pro • Fast to locate data • Doesn’t require a complex DHT or synchronization • Con • Hashing the vertex ID cuts a fraction of the edges for any order • Great simple approximation for MAX-CUT
Our Approach • Evaluate 16 natural heuristics on 21 datasets with each of the three orderings with varying numbers of partitions • Find out which heuristics work on each graph • Compare these with the results of • Random Hashing to get worst case • METIS to get ‘best’ offline performance
Caveats • METIS is a heuristic, not true lower bound • Does fine in practice • Available online for reproducing results • Used publicly available datasets • Public graph datasets tend to be much smaller than what companies have • Using meta-data for partitioning can be good • partitioning the web graph by URL • Using geographic location for social network users
Heuristics • Balanced • Chunking • Hashing • (weighted) Deterministic Greedy • (weighted) Randomized Greedy • Triangles • Balance Big Uses a Buffer of size • Prefer Big • Avoid Big • Greedy EvoCut Weight functions Unweighted Linear weighted Exponentially weighted
Datasets • Includes finite element meshes, citation networks, social networks, web graphs, protein networks and synthetically generated graphs • Sizes: 297 vertices to 41.7 million vertices • Synthetic graph models • Barabasi-Albert (Preferential Attachment) • RMAT (Kronecker) • Watts-Strogatz • Power law-Clustered • Biggest graphs: LiveJournal and Twitter
Experimental Method • For each graph, heuristic, and ordering, partition into 2, 4, 8, 16 pieces • Compare with a random cut – upper bound • Compare with METIS – lower bound • Performance was measured by:
BFS DFS Random Heuristic Results Synthetic Hash Best heuristic, LDG, gets an average improvement of 76% over all datasets! METIS Finite element mesh Social network
Scaling in the Size of Graphs: Exploiting Synthetic Graphs Hash LDG METIS
More Observations • BFS is a superior ordering for all algorithms • Avoid Big does 46% WORSE on average than Random Cut • Further experiments showed Linear Det. Greedy has identical performance to Det. Greedy with load-based tie breaking.
Results on a Real System • Compared the streamed partitioning with random hashing on SPARK, a distributed cluster computation system (http://www.spark-project.org/) • Used 2 datasets • 4.6 million users, 77 million edges • 41.7 million users, 1.468 billion edges • Computed the PageRank of each graph
Results on SPARK • LiveJournal – 4.6 million users, 77 million edges • Twitter – 41.7 million users, 1.468 billion edges Twitter Improvement: Naïve – 19.1% Combiner – 18.8 % LJ Improvement: Naïve – 38.7% Combiner – 28.8 %
Streaming graph partitioning is a really nice, simple, effective preprocessing step.
isabelle@eecs.berkeley.edu Where to now? • Can we explain theoretically why the greedy algorithm performs so well?* • What heuristics work better? • What heuristics are optimal for different classes of graphs? • Use multiple parallel streams! • Implement in real systems! *Work under submission: I. Stanton, Streaming Balanced Graph Partitioning Algorithms for Random Graphs
isabelle@eecs.berkeley.edu Acknowledgements • David B. Wecker • Burton Smith • Reid Andersen • Nikhil Devanur • SamehElkinety • SreenivasGollapudi • YuxiongHe • RinaPanigrahy • Yuval Peres All at MSR • SatishRao • Virginia Vassilevska Williams • Alexandre Stauffer • Ngoc Mai Tran • MiklosRacz • MateiZaharia All at Berkeley - CS and Statistics Supported by NSF and NDSEG fellowships, NSF grant CCF-0830797, and an internship at Microsoft Research’s eXtreme Computing Group.