
A Study of Partitioning Policies for Graph Analytics on Large-scale Distributed Platforms


Presentation Transcript


  1. A Study of Partitioning Policies for Graph Analytics on Large-scale Distributed Platforms Gurbinder Gill Roshan Dathathri Loc Hoang Keshav Pingali

  2. Graph Analytics • Applications: machine learning and network analysis • Datasets: unstructured graphs • Need TBs of memory Credits: Wikipedia, SFL Scientific, MakeUseOf, Sentinel Visualizer

  3. Distributed Graph Analytics • Distributed-memory clusters are used for in-memory processing of very large graphs • D-Galois [PLDI’18], Gemini [OSDI’16], PowerGraph [OSDI’12], ...  • Graph is partitioned among machines in the cluster • Many heuristics for graph partitioning (partitioning policies) • Application performance is sensitive to policy • There is no clear way to choose policies

  4. Existing Partitioning Studies • Performed on small graphs and clusters • Only considered metrics such as edges cut, avg. number of replicas, etc. • Did not consider work-efficient data-driven algorithms • Only topology-driven algorithms evaluated • Used frameworks that use the same communication pattern for all partitioning policies • Putting some partitioning policies at a disadvantage

  5. Contributions • Experimental study of partitioning strategies for work-efficient graph analytics applications: • Largest publicly available web crawls, such as wdc12 (~1TB) • Large KNL and Skylake clusters with up to 256 machines (~69K threads) • Uses the state-of-the-art graph analytics system, D-Galois [PLDI’18] • Evaluate various kinds of partitioning policies • Analyze partitioning policies using an analytical model and micro-benchmarking • Present a decision tree for selecting the best partitioning policy

  6. Partitioning [figure: original graph with nodes A–J]

  7. Partitioning • Each edge is assigned to a unique host [figure: original graph]

  8. Partitioning • Each edge is assigned to a unique host • All edges connect proxy nodes on the same host [figure: original graph and its per-host partitions]

  9. Partitioning • Each edge is assigned to a unique host • All edges connect proxy nodes on the same host • A node can have multiple proxies: one is the master proxy; the rest are mirror proxies [figure: original graph and its per-host partitions with master and mirror proxies]
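The proxy construction described on this slide can be sketched in a few lines of Python. The ownership rule, data layout, and function names here are illustrative assumptions, not CuSP's actual implementation:

```python
# Sketch: after edges are assigned to hosts, every endpoint that appears
# on a host gets a proxy there. The host that owns a node holds its
# master proxy; every other host holding that node gets a mirror proxy.

def build_proxies(edges_by_host, owner):
    """edges_by_host: host -> list of (src, dst) edges placed on it.
    owner: function mapping a node to the host owning its master."""
    masters, mirrors = {}, {}
    for host, edges in edges_by_host.items():
        for src, dst in edges:
            for node in (src, dst):
                if owner(node) == host:
                    masters.setdefault(host, set()).add(node)
                else:
                    mirrors.setdefault(host, set()).add(node)
    return masters, mirrors

# Hypothetical 2-host example with a modulo ownership rule.
edges_by_host = {0: [(0, 1), (1, 2)], 1: [(2, 3), (3, 0)]}
masters, mirrors = build_proxies(edges_by_host, lambda n: n % 2)
print(masters)
print(mirrors)
```

Note that a node such as 2 ends up with a master proxy on its owning host and a mirror proxy on the other host, which is exactly the replication the synchronization step must reconcile.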

  10. Synchronization • Reduce: Mirror proxies send updates to the master • Broadcast: Master sends the canonical value to mirrors [figure: reduce and broadcast between partitions]
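One bulk-synchronous round of this reduce/broadcast scheme can be sketched as follows, assuming the reduction is a sum; the function name and data layout are hypothetical, not the Gluon API:

```python
# Sketch of one BSP synchronization round (assumed reduction: sum).

def sync_round(master_host, values, proxies):
    """values[host][node] holds each proxy's local contribution.
    proxies maps node -> list of hosts holding a proxy for it.
    master_host maps node -> host holding its master proxy."""
    for node, hosts in proxies.items():
        m = master_host[node]
        # Reduce: mirrors send their updates to the master, which combines them.
        total = sum(values[h][node] for h in hosts)
        values[m][node] = total
        # Broadcast: master sends the canonical value back to all mirrors.
        for h in hosts:
            values[h][node] = total

# Node 7 has proxies on hosts 0 and 1; host 0 holds the master.
values = {0: {7: 2}, 1: {7: 3}}
sync_round({7: 0}, values, {7: [0, 1]})
print(values)  # both proxies now hold 5
```

After the round, every proxy of a node agrees with its master, so the next computation phase can read local values only.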

  11. Matrix View of a Graph [figure: original graph and its matrix representation, with sources as rows and destinations as columns]
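As a minimal illustration of the matrix view, an edge list can be laid out with sources on rows and destinations on columns, so that partitioning edges becomes partitioning matrix entries:

```python
# Sketch: adjacency-matrix view of an edge list.
# Row index = source node, column index = destination node.

def to_matrix(edges, n):
    A = [[0] * n for _ in range(n)]
    for src, dst in edges:
        A[src][dst] = 1
    return A

A = to_matrix([(0, 1), (1, 2), (2, 0)], 3)
print(A)  # [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
```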

  12. Partitioning Policies & Comm. Pattern • 1D Partitioning: Incoming Edge Cut (IEC), Outgoing Edge Cut (OEC) [figure: matrix view and communication pattern of each policy]
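The two 1D edge-cut policies can be sketched as simple edge-to-host rules, assuming nodes are owned under a modulo rule; this is for illustration only, not the partitioner's actual node assignment:

```python
# Minimal sketch of 1D edge-cut partitioning. Node ownership by modulo
# is an illustrative assumption.

def oec_host(edge, num_hosts):
    """Outgoing Edge Cut: an edge lives with its source's owner."""
    src, dst = edge
    return src % num_hosts

def iec_host(edge, num_hosts):
    """Incoming Edge Cut: an edge lives with its destination's owner."""
    src, dst = edge
    return dst % num_hosts

edges = [(0, 5), (5, 2), (7, 0)]
print([oec_host(e, 4) for e in edges])  # [0, 1, 3]
print([iec_host(e, 4) for e in edges])  # [1, 2, 0]
```

Under OEC only mirrored destinations need synchronization; under IEC only mirrored sources do, which is why the two policies have different communication patterns.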

  13. Partitioning Policies & Comm. Pattern • 2D Partitioning (General Vertex Cut): Cartesian Vertex Cut (CVC), Hybrid Vertex Cut (HVC) [figure: matrix view and communication pattern of each policy]
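Cartesian Vertex Cut can be sketched as a 2D block mapping over a host grid; the grid shape and block formula below are assumptions for illustration, not CuSP's exact mapping:

```python
# Sketch of Cartesian Vertex Cut (CVC): hosts form a pr x pc grid, and an
# edge goes to the host at (block row of its source, block column of its
# destination) in the matrix view.

def cvc_host(src, dst, pr, pc, num_nodes):
    row = src * pr // num_nodes  # block row from the source node
    col = dst * pc // num_nodes  # block column from the destination node
    return row * pc + col

# 4 hosts as a 2x2 grid over 8 nodes.
print(cvc_host(1, 6, 2, 2, 8))  # host 1 (row 0, col 1)
print(cvc_host(5, 2, 2, 2, 8))  # host 2 (row 1, col 0)
```

With this mapping, a node's proxies are confined to one grid row and one grid column, which bounds each host's communication partners to O(pr + pc) instead of all hosts.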

  14. Implementation: • Partitioning: State-of-the-art graph partitioner, CuSP [IPDPS’19] • Processing: State-of-the-art distributed graph analytics system, D-Galois [PLDI’18] • Bulk synchronous parallel (BSP) execution • Uses Gluon [PLDI’18] as communication substrate: • Uses partition-specific communication pattern

  15. Experimental Setup • Stampede2 at TACC: Up to 256 hosts • KNL hosts • Skylake hosts • Application: • bc • bfs • cc • pagerank • sssp • Partitioning Policies: • XtraPulp (XEC) • Edge Cut (EC) • Cartesian Vertex Cut (CVC) • Hybrid Vertex Cut (HVC) Inputs and their properties

  16. Execution Time

  17. Compute Time Scales

  18. Communication Volume and Time [figure: communication volume and time at 8 and 256 hosts]

  19. Best Partitioning Policy (clueweb12) Execution time (sec):

  20. Decision Tree % difference in execution time between policy chosen by decision tree vs. optimal

  21. Key Lessons • Best performing policy depends on: • Application • Input • Number of hosts (scale) • EC performs well at small scale but CVC wins at large scale • Graph analytics systems must support: • Various partitioning policies • Partition-specific communication pattern
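The lesson about scale can be caricatured as a one-line decision rule; the 32-host threshold is purely illustrative and is not the paper's actual decision tree, which also considers the application and input:

```python
# Toy decision rule reflecting the lesson above: edge cuts (EC) win at
# small host counts, Cartesian Vertex Cut (CVC) wins at large scale.
# The threshold of 32 hosts is an assumption for illustration only.

def choose_policy(num_hosts):
    return "EC" if num_hosts < 32 else "CVC"

print(choose_policy(8))    # EC
print(choose_policy(256))  # CVC
```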

  22. Source Code • Frameworks used for the study: • Graph Partitioner: CuSP [IPDPS’19] • Communication Substrate: Gluon [PLDI’18] • Distributed Graph Analytics Framework: D-Galois [PLDI’18] https://github.com/IntelligentSoftwareSystems/Galois ~ Thank you ~

  23. Partitioning Time CuSP [IPDPS’19]

  24. Partitioning Time (clueweb12) Partitioning time (sec) on 256 hosts:

  25. Micro-benchmarking
