Abhishek Verma, Lucy Cherkasova, Vijay S. Kumar, Roy H. Campbell Deadline-Based Workload Management for MapReduce Environments: Pieces of the Performance Puzzle IEEE/IFIP Network Operations and Management Symposium 2012
Motivation • MapReduce paradigm • MapReduce applications are often part of critical business pipelines • They require job completion time guarantees (SLOs) • Existing MapReduce schedulers do not support Service Level Objectives • E.g.: FIFO, Fair-share, Quincy
Current State of the Art • Configure separate queues/pools • Cons: manual, rule of thumb, error-prone • Goal: Design an automated workload management framework for MapReduce jobs with completion time goals in shared environments
Three Pieces of the Puzzle How to order Jobs?
Job Ordering • Multiple MapReduce jobs in shared environment • User specifies deadline • Schedule jobs according to earliest deadline first • Give all resources?
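A minimal sketch of the EDF ordering step, assuming each job arrives with an absolute deadline; the Job/EDFQueue names below are illustrative, not from the paper:

```python
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class Job:
    deadline: float                     # absolute deadline (seconds)
    name: str = field(compare=False)    # identifier, not used for ordering

class EDFQueue:
    """Keeps pending MapReduce jobs ordered by Earliest Deadline First."""
    def __init__(self):
        self._heap = []

    def submit(self, job: Job):
        heapq.heappush(self._heap, job)   # O(log n) insert keyed by deadline

    def next_job(self) -> Job:
        return heapq.heappop(self._heap)  # job whose deadline expires soonest

# Jobs are dispatched in deadline order, regardless of arrival order.
q = EDFQueue()
q.submit(Job(deadline=300.0, name="sort"))
q.submit(Job(deadline=120.0, name="wordcount"))
assert q.next_job().name == "wordcount"
```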
Three Pieces of the Puzzle How many resources? How to order Jobs?
How many resources? • N machines in the MapReduce cluster • Sharing machines enables more SLOs to be satisfied • Allocate the minimum resources that complete the job within its deadline • ARIA performance model [ICAC'11] • Profile past job executions and build a job profile • Automatically calculate minimum resources using analytical modeling and enforce the minimum quota • Dynamically adjust allocation
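A rough sketch of what the job profile and the bound-based completion-time estimate could look like, assuming per-stage average and maximum task durations collected from past runs. The two-bound shape follows the makespan bounds used by ARIA, but the names below and the omission of the shuffle phase are simplifications of this sketch, not the paper's exact model:

```python
from dataclasses import dataclass

@dataclass
class JobProfile:
    """Per-stage statistics extracted from past executions of the same job."""
    n_map: int        # number of map tasks
    map_avg: float    # average map task duration (s)
    map_max: float    # longest map task duration (s)
    n_red: int        # number of reduce tasks
    red_avg: float    # average reduce task duration (s)
    red_max: float    # longest reduce task duration (s)

def stage_bounds(n, avg, mx, slots):
    """Makespan bounds for n tasks greedily assigned to `slots` slots."""
    low = n * avg / slots                  # perfectly balanced case
    up = (n - 1) * avg / slots + mx        # worst case: slowest task last
    return low, up

def completion_time_bounds(p: JobProfile, map_slots: int, red_slots: int):
    """Bounds on job completion time for a given (map, reduce) slot allocation."""
    m_low, m_up = stage_bounds(p.n_map, p.map_avg, p.map_max, map_slots)
    r_low, r_up = stage_bounds(p.n_red, p.red_avg, p.red_max, red_slots)
    return m_low + r_low, m_up + r_up      # map stage followed by reduce stage
```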
ARIA Resource Estimation (MinEDF) • Find the minimum resources using Lagrange multipliers
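One way to read the Lagrange-multiplier step: with a lower-bound model of the form T ≈ A/S_M + B/S_R, where A and B are the total map and reduce work, minimizing S_M + S_R subject to T ≤ D gives the closed form S_M = (A + √(AB))/D and S_R = (B + √(AB))/D. A minimal numeric sketch of that closed form (the function name and the dropped shuffle terms are assumptions of this sketch):

```python
import math

def min_resources(n_map, map_avg, n_red, red_avg, deadline):
    """Smallest (map_slots, reduce_slots) meeting `deadline` under the
    lower-bound model T = A/S_M + B/S_R, via the Lagrange closed form."""
    A = n_map * map_avg              # total map work (task-seconds)
    B = n_red * red_avg              # total reduce work (task-seconds)
    root = math.sqrt(A * B)
    s_map = (A + root) / deadline
    s_red = (B + root) / deadline
    # Slots are discrete: round up so the deadline estimate still holds.
    return math.ceil(s_map), math.ceil(s_red)

# Example: 200 maps averaging 30 s, 50 reduces averaging 60 s, 10-minute SLO.
print(min_resources(200, 30.0, 50, 60.0, 600.0))   # -> (18, 13)
```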
Allocating Spare Resources • After allocating the minimum quota, how should spare resources be allocated? • Allocate spare resources among currently running jobs (work conserving) • Jobs complete faster and make room for future jobs • Need the ability to pre-empt jobs if more "urgent" jobs arrive • MapReduce jobs can be pre-empted at the task level
MinEDF-Work Conserving Algorithm When a new job arrives (sketched below): • Do we have enough resources to meet the job's deadline? • If yes, allocate them; return. • Estimate task durations of currently running jobs • Will enough resources be released in the future? • If yes, wait for them; return. • Kill extra tasks of currently running jobs to release enough slots to meet the job's deadline
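A pseudocode-level sketch of this arrival-time decision; the cluster state is reduced to a few scalar inputs, and all names are placeholders of this sketch, not APIs from the paper or Hadoop:

```python
def on_job_arrival(spare_slots, slots_freed_soon, running_extra_tasks, needed_slots):
    """MinEDF-WC decision for a newly arrived job.

    spare_slots         -- slots free right now
    slots_freed_soon    -- slots predicted (from task-duration estimates of
                           running jobs) to be released before the new job
                           must start to still meet its deadline
    running_extra_tasks -- tasks of running jobs above their minimum quota,
                           i.e. candidates for task-level preemption
    needed_slots        -- the ARIA-style minimum allocation for the new job
    """
    # 1. Enough resources right now: allocate immediately.
    if spare_slots >= needed_slots:
        return "allocate-now", 0

    # 2. Enough resources will be released in time: wait, stay work-conserving.
    if spare_slots + slots_freed_soon >= needed_slots:
        return "wait-for-release", 0

    # 3. Otherwise pre-empt: kill extra tasks of running jobs to close the gap.
    missing = needed_slots - spare_slots - slots_freed_soon
    to_kill = min(missing, running_extra_tasks)
    return "preempt", to_kill

# Example: 4 free slots, 6 freeing up soon, job needs 15 -> kill 5 extra tasks.
print(on_job_arrival(4, 6, 8, 15))
```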
Experimental Evaluation • 66 HP DL145 machines • 4 x 2.39 GHz cores, 8 GB RAM, 160 GB hard disks • Two racks connected by Gigabit Ethernet • Workload: 100 jobs • WordCount, Sort, Bayesian Classification, TF-IDF, Twitter, WikiTrends • Different dataset sizes • Poisson job arrivals
Methodology • Simulation: SimMR [Cluster'11] • Metrics • % of jobs with missed deadlines • Average job completion time • Number of extra map and reduce tasks • Relative deadline exceeded
Missed-deadline Jobs MinEDF-WC misses half as many job deadlines as MinEDF
Average Job Completion Time MinEDF-WC leads to shorter average job completion times than MinEDF
Summary • Job ordering: Earliest Deadline First • Minimum resource allocation: ARIA performance model • Spare resource allocation: work-conserving MinEDF-WC with task-level pre-emption Questions? verma7@illinois.edu
References • [ICAC'11] "ARIA: Automatic Resource Inference and Allocation for MapReduce Environments", Abhishek Verma, Ludmila Cherkasova and Roy H. Campbell. International Conference on Autonomic Computing (ICAC), Karlsruhe, Germany, June 2011. • [Cluster'11] "Play it again, SimMR!", Abhishek Verma, Ludmila Cherkasova and Roy H. Campbell. IEEE International Conference on Cluster Computing (CLUSTER), Austin, Texas, September 2011.