Skew-Resistant Parallel Processing of Feature-Extracting Scientific User-Defined Functions (SkewReduce)
YongChul Kwon, Magdalena Balazinska, Bill Howe, Jerome Rolia*
University of Washington, *HP Labs
Published in SoCC 2010
Motivation • Science is becoming a data management problem • MapReduce is an attractive solution • Simple API, declarative layer, seamless scalability, … • But it is hard to • Express complex algorithms and • Get good performance (14 hours vs. 70 minutes) • SkewReduce: • Goal: Scalable analysis with minimal effort • Toward scalable feature extraction analysis
Application 1: Extracting Celestial Objects • Input • { (x,y,ir,r,g,b,uv,…) } • Coordinates • Light intensities • … • Output • List of celestial objects • Star • Galaxy • Planet • Asteroid • … M34 from Sloan Digital Sky Survey
Application 2: Friends of Friends • Simple clustering algorithm used in astronomy • Input: • Points in multi-dimensional space • Output: • List of clusters (e.g., galaxy, …) • Original data annotated with cluster ID • Friends • Two points within ε distance • Friends of Friends • Transitive closure of the Friends relation
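To make the definition concrete, here is a minimal serial Friends-of-Friends sketch in Python, assuming 2-D points and a brute-force O(N²) pair scan (real implementations use spatial indexes). Clusters are the transitive closure of the friend relation, tracked with union-find:

```python
def friends_of_friends(points, eps):
    parent = list(range(len(points)))   # union-find forest, one node per point

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    eps2 = eps * eps
    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps2:
                parent[find(i)] = find(j)  # union: i and j are friends

    return [find(i) for i in range(len(points))]  # one cluster id per point

print(friends_of_friends([(0, 0), (0.5, 0), (5, 5)], eps=1.0))
# -> [1, 1, 2]: the first two points share a cluster, the third is alone
```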
Parallel Friends of Friends • Partition • Local clustering • Merge • P1-P2 • P3-P4 • Merge • P1-P2-P3-P4 • Finalize • Annotate original data [Figure: four partitions P1–P4 with local clusters C1–C6; the first merge level relabels C4→C3 and C6→C5, the second relabels C5→C3 and C6→C3]
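A toy illustration of the merge step: local cluster IDs that turn out to be friends across a partition boundary are relabeled to a single ID, as in the C4→C3 and C6→C5→C3 chains above. The function and its inputs are illustrative, not SkewReduce's actual interface:

```python
def merge_labels(cross_edges, labels):
    """cross_edges: pairs of local cluster ids found to be friends across a
    boundary; labels: point -> local cluster id. Returns relabeled ids."""
    remap = {}

    def resolve(c):
        while c in remap:          # follow relabeling chains, e.g. C6 -> C5 -> C3
            c = remap[c]
        return c

    for a, b in cross_edges:
        ra, rb = resolve(a), resolve(b)
        if ra != rb:
            remap[max(ra, rb)] = min(ra, rb)   # keep the smaller id

    return {p: resolve(c) for p, c in labels.items()}

print(merge_labels([("C4", "C3"), ("C6", "C5"), ("C5", "C3")],
                   {1: "C3", 2: "C4", 3: "C5", 4: "C6"}))
# -> every point ends up labeled C3
```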
Parallel Feature Extraction • Partition multi-dimensional input data • Extract features from each partition • Merge (or reconcile) features • Finalize output [Figure: input data flows through per-partition feature extraction (Map) into a "hierarchical reduce" that merges features]
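A hypothetical skeleton of this pattern (function names are illustrative; the actual SkewReduce API is described in the paper, and its merge step is hierarchical rather than a sequential fold):

```python
def run_feature_extraction(data, partition, extract, merge, finalize):
    parts = partition(data)                  # split the multi-dimensional input
    features = [extract(p) for p in parts]   # map: serial algorithm per partition
    merged = features[0]
    for f in features[1:]:                   # SkewReduce merges hierarchically;
        merged = merge(merged, f)            # a sequential fold keeps this short
    return finalize(data, merged)            # annotate records with feature ids
```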
Problem: Skew [Figure: per-task timeline (Task ID vs. Time) of local clustering (MAP) and merge (REDUCE) tasks on an 8-node cluster with 32 map/32 reduce slots; markers at 5 minutes and 35 minutes, while the top red line, the longest task, runs for 1.5 hours]
Unbalanced Computation: Skew • Computation skew • Arises from the characteristics of the algorithm • The same amount of input data ≠ the same runtime: O(N log N) with 0 friends per particle vs. O(N²) with O(N) friends per particle Can we scale out an off-the-shelf implementation with no (or minimal) modifications?
Solution 1? Micro-partitions • Assign a tiny amount of work to each task to reduce skew
How about having micro partitions? [Figure: running times at different partition granularities, 8-node cluster, 32 map/32 reduce slots] • It works! • But framework overhead grows as partitions shrink • To find the sweet spot, we need to try different granularities Can we find a good partitioning plan without trial and error?
Outline • Motivation • SkewReduce • API (in the paper) • Partition Optimization • Evaluation • Summary
Partition Optimization • Varying granularities of partitions • Can we automatically find a good partition plan and schedule? [Figure: a serial feature extraction algorithm run over partitions of varying granularity (labeled 1–15), whose outputs feed a merge algorithm]
Approach • Goal: minimize expected total runtime • SkewReduce runtime plan • Bounding boxes for data partitions • Schedule [Figure: the SkewReduce optimizer takes a data sample, the cost functions, and the cluster configuration as input and emits the runtime plan]
Partition Plan Guided By Cost Functions “Given a sample, how long will it take to process?” • Two cost functions: • Feature cost: (Bounding box, sample, sample rate) → cost • Merge cost: (Bounding boxes, sample, sample rate) → cost • Basis of two optimization decisions • How (axis, point) to split a partition • When to stop partitioning
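As a sketch, the two cost functions could look like the following, with toy bodies matched to the O(N²) Friends-of-Friends example. Only the signature shapes come from the slide; `count_in`, the quadratic model, and the 0.01 boundary factor are illustrative assumptions:

```python
def count_in(bbox, sample):
    (x0, y0), (x1, y1) = bbox
    return sum(x0 <= x <= x1 and y0 <= y <= y1 for x, y in sample)

def feature_cost(bbox, sample, sample_rate):
    n = count_in(bbox, sample) / sample_rate    # estimated true input size
    return n * n                                # toy model for an O(N^2) extractor

def merge_cost(bboxes, sample, sample_rate):
    n = sum(count_in(b, sample) for b in bboxes) / sample_rate
    return 0.01 * n                             # toy model: merging touches only
                                                # the features near the boundary
```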
Search Partition Plan • Greedy top-down search • Split if total expected runtime improves • Evaluate costs for subpartitions and merge • Estimate new runtime [Figure: original partition vs. a possible split; the candidate schedules are estimated over time: Schedule 1 = 60 (ACCEPT) vs. Schedule 2 = 110 (REJECT)]
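A minimal sketch of the greedy search, reusing the toy `feature_cost` and `merge_cost` above. Here `best_split` simply bisects the longer axis, whereas the real optimizer evaluates candidate schedules, not just the slowest subpartition:

```python
def plan(bbox, sample, rate):
    keep = feature_cost(bbox, sample, rate)          # cost if we stop splitting here
    left, right = best_split(bbox)
    split = max(feature_cost(left, sample, rate),    # subpartitions run in parallel
                feature_cost(right, sample, rate))
    split += merge_cost([left, right], sample, rate) # plus reconciling their features
    if split >= keep:
        return [bbox]                                # REJECT: keep this partition
    return plan(left, sample, rate) + plan(right, sample, rate)  # ACCEPT: recurse

def best_split(bbox):
    (x0, y0), (x1, y1) = bbox                        # bisect the longer axis
    if x1 - x0 >= y1 - y0:
        mid = (x0 + x1) / 2
        return ((x0, y0), (mid, y1)), ((mid, y0), (x1, y1))
    mid = (y0 + y1) / 2
    return ((x0, y0), (x1, mid)), ((x0, mid), (x1, y1))
```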
Summary of Contributions • Given a feature extraction application • Possibly with computation skew • SkewReduce • Automatically partitions input data • Improves runtime in spite of computation skew • Key technique: user-defined cost functions
Evaluation • 8-node cluster • Dual quad-core CPUs, 16 GB RAM • Hadoop 0.20.1 + a custom patch to the MapReduce API • Distributed Friends of Friends • Astro: gravitational simulation snapshot • 900 M particles • Seaflow: flow cytometry survey • 59 M observations
Does SkewReduce work? [Figure: completion times of MapReduce vs. the SkewReduce plan for Astro (18 GB, 3D; hours) and Seaflow (1.9 GB, 3D; minutes), with one configuration annotated "1 hour preparation"] • The SkewReduce plan yields 2 to 8 times faster running times
Impact of Cost Function [Figure: Astro dataset; completion times under cost functions of increasing fidelity] • Higher fidelity = better performance
Highlights of Evaluation • Sample size • Representativeness of the sample is important • Runtime of SkewReduce optimization • Less than 15% of the runtime of the resulting SkewReduce plan • Data volume in the Merge phase • Total volume during Merge = 1% of input data • Details in the paper
Conclusion • Scientific analysis should be easy to write, scalable, and predictable in performance • SkewReduce • API for feature-extracting functions • Scalable execution • Good performance in spite of skew • Cost-based partition optimization using a data sample • Published in SoCC 2010 • A more general version is coming out soon!