PBSE: A Robust Path-Based Speculative Execution for Data-Parallel Frameworks. Riza O. Suminto, Cesar A. Stuardo, Alexandra Clark, Huan Ke, Tanakorn Leesatapornwongsa, Bo Fu, Daniar H. Kurniawan, Vincentius Martin, Uma Maheswara Rao G., Haryadi S. Gunawi.
PBSE @ SoCC ’17
Speculative Execution (SE) 101
[Figure labels: map task, input data, fast, straggling, slow NIC, fast backup task]
Is Hadoop always tail tolerant?
• Facebook Hadoop jobs on 30 nodes
• Normal (no slow NIC): 172 jobs/hour
• With one slow 1 Mbps NIC: 1 job/hour!
• Why does SE not work?
• Many jobs cannot “escape” the limping NIC
Problem 1: No Straggler Detected
• All slow
[Figure labels: “No” straggler detected!, slow NIC]
Problem 2: Straggling Backup
• Backup task is also straggling!
• Many more cases
[Figure labels: fast, slow, slow NIC, blame!]
Findings
Basic Hadoop SE (BaseSE)
• Good
  • Resource contention
  • Heterogeneous resources
• Not robust
  • Node-level network degradation
  • Slow NICs and switches
What’s the Flaw?
• Network limpware is not considered a fault model
• Task != Path
• Slow paths are sometimes not exposed
• Path progresses are lumped into per-task scores
[Figure labels: 4 paths, 2 scores, all slow!]
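The "4 paths, 2 scores" point can be sketched in a few lines. This is a toy illustration, not Hadoop's actual progress estimator. The progress values, the 0.2 threshold, and both function names are assumptions made up for the example.

```python
from statistics import mean

# Hypothetical progress (0.0 to 1.0) of the four shuffle paths between
# two map tasks (M1, M2) and two reduce tasks (R1, R2). M2's node is
# assumed to sit behind the slow NIC, so every path leaving M2 crawls.
path_progress = {
    ("M1", "R1"): 0.90,
    ("M1", "R2"): 0.90,
    ("M2", "R1"): 0.10,
    ("M2", "R2"): 0.10,
}

def task_scores(paths):
    """Lump path progress into one score per reduce task."""
    by_task = {}
    for (_, reducer), progress in paths.items():
        by_task.setdefault(reducer, []).append(progress)
    return {task: mean(values) for task, values in by_task.items()}

def relative_stragglers(scores, threshold=0.2):
    """Flag entries that trail the average by more than `threshold`."""
    average = mean(scores.values())
    return sorted(k for k, v in scores.items() if average - v > threshold)

scores = task_scores(path_progress)
task_level = relative_stragglers(scores)         # compares 2 task scores
path_level = relative_stragglers(path_progress)  # compares 4 path scores

print(task_level)  # []
print(path_level)  # [('M2', 'R1'), ('M2', 'R2')]
```

Both reducers end up with the same lumped score, so a peer comparison has nothing to flag. The same comparison over paths singles out the two paths that leave M2.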
Path-Based SE (PBSE)
• Intuition: A task is a collection of paths
1. Path Diversity
2. Path Progress
3. Path Speculation
• [Before] Progresses of 2 tasks: R1, R2
• [Now] Progresses of 4 paths: M1R1, M1R2, M2R1, M2R2
Contribution
• Introduce node-level network degradation
  • A real and important fault model
• Expose path progress
  • Not just task progress
• Develop path straggler detection & speculation
• Full PBSE integration into Hadoop
• Initial integration into QFS, Spark, & Flume
Outline
• Intro
• PBSE Techniques
  • Path Diversity
  • Path Progress
  • Path-based Speculation
• Implementation & Evaluation
• Conclusion
Path Diversity
• Enforce no potential SPOF node
• SPOF = single point of [tail-latency] failure
• (to compare progresses of diverse paths)
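One way to picture the "no potential SPOF node" rule is a check over the nodes each task's path touches, plus a placement step that spreads input reads across replicas. The sketch below is an assumption-laden illustration, not PBSE's actual placement logic. The node names, the replica map, and both helper functions are hypothetical.

```python
def spof_nodes(task_paths):
    """Return the nodes shared by every task's path (potential tail-SPOFs)."""
    node_sets = [set(path) for path in task_paths.values()]
    if len(node_sets) < 2:
        return set()
    return set.intersection(*node_sets)

def pick_diverse_sources(replicas):
    """Greedily pick, for each task, the replica node used least so far."""
    usage = {}
    choice = {}
    for task in sorted(replicas):
        # Ties are broken by node name to keep the result deterministic.
        node = min(replicas[task], key=lambda n: (usage.get(n, 0), n))
        choice[task] = node
        usage[node] = usage.get(node, 0) + 1
    return choice

# Hypothetical replica locations for the input blocks of three map tasks.
replicas = {
    "m1": ["n4", "n5"],
    "m2": ["n4", "n6"],
    "m3": ["n4", "n5"],
}

# Naive placement: every task reads from its first replica.
naive = {task: (nodes[0],) for task, nodes in replicas.items()}
# Diverse placement: spread the reads across replica nodes.
diverse = {task: (node,) for task, node in pick_diverse_sources(replicas).items()}

print(spof_nodes(naive))    # {'n4'}
print(spof_nodes(diverse))  # set()
```

With the naive choice, every path goes through n4. If n4 limps, all tasks slow down together and there is no fast peer to compare against. Spreading the sources keeps at least some paths independent.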
Path Progress
• Each task reports path progresses/bandwidths
• (not just task scores)
• Path progress can reveal the culprit
• [Before] 2 Tasks: R1, R2
• [Now] 4 Paths: M1R1, M1R2, M2R1, M2R2
• [Before] 2 Tasks: R1, R2
• [Now] 4 Paths: R1O1, O1O1’, R2O2, O2O2’
[Figure labels: No Straggler, Actual straggler!]
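As a rough sketch of how per-path reports can point at a culprit, the snippet below groups reported bandwidths by endpoint and blames any node whose paths are all slow. The bandwidth numbers, the absolute 10 Mbps cutoff, and the `find_culprits` helper are assumptions for illustration. The slides do not specify the detection rule.

```python
from collections import defaultdict

def find_culprits(path_bandwidth_mbps, slow_below_mbps):
    """Return nodes for which every reported path is slow."""
    per_node = defaultdict(list)
    for (src, dst), bandwidth in path_bandwidth_mbps.items():
        # A path's bandwidth counts as evidence for both of its endpoints.
        per_node[src].append(bandwidth)
        per_node[dst].append(bandwidth)
    return sorted(
        node for node, rates in per_node.items()
        if all(rate < slow_below_mbps for rate in rates)
    )

# Hypothetical per-path bandwidth reports (Mbps) from two reduce tasks.
reports = {
    ("M1", "R1"): 95.0,
    ("M1", "R2"): 90.0,
    ("M2", "R1"): 1.0,
    ("M2", "R2"): 1.0,
}

print(find_culprits(reports, slow_below_mbps=10.0))  # ['M2']
```

R1 and R2 each have one fast path and one slow path, so neither is blamed. Only M2 is slow on every path it takes part in.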
Path-based Speculation
• Use knowledge of previous failing paths
• (don’t always blame the task/stage, blame the bottleneck)
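A minimal sketch of "blame the bottleneck", assuming the scheduler keeps a set of nodes blamed for earlier slow paths and checks candidate backup paths against it. The `(source node, worker node)` path shape, the node names, and `choose_backup_path` are hypothetical and are not taken from the PBSE code.

```python
def choose_backup_path(candidates, blamed_nodes):
    """Pick the first candidate path that touches no blamed node."""
    for path in candidates:
        if blamed_nodes.isdisjoint(path):
            return path
    return None  # no safe path is known; the caller decides what to do

# Nodes blamed for earlier slow paths (hypothetical).
blamed = {"n4"}

# Candidate (source node, worker node) pairs for a backup task.
candidates = [("n4", "n2"), ("n5", "n2"), ("n5", "n3")]

print(choose_backup_path(candidates, blamed))  # ('n5', 'n2')
```

A speculator that ignores the blame set would take the first candidate and read from n4 again, which is the straggling-backup case from Problem 2.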
Outline
• Intro
• PBSE Techniques
• Implementation & Evaluation
• Conclusion
Hadoop + PBSE
• 6000+ LOC over Hadoop/HDFS 2.7.1
  • 3200 LOC in Application Manager
  • 1400 LOC in Task Management
  • 1400 LOC in HDFS
PBSE vs BaseSE
• 150 jobs from the Facebook trace, 15 nodes, one 1-Mbps NIC
[Figure labels: no tail, speedup, escape tail-SPOF]
Varied Experiment
• Varying:
  • NIC degradations (60/30/10/1/0.1 Mbps)
  • Workload (Facebook, Cloudera)
  • Cluster size (15 to 60 nodes)
• PBSE still gains speedups
[Figure: PBSE speedup at specific percentiles of jobs]
PBSE vs Other Strategies
• Others:
  • Vs. other schedulers (Capacity / FIFO / Fair)
  • Vs. other SE solutions (Cloning / Aggressive / HRead)
• PBSE wins in all scenarios
• We fix the fundamental flaws
  • i.e., path-based, not task-based
Beyond MapReduce
• All data-parallel systems need robust tail tolerance
Outline
• Intro
• PBSE Techniques
• Implementation & Evaluation
• Conclusion
Conclusion
• (Task abstraction) BaseSE + Resource Contention
• (Task abstraction) BaseSE + Limping NIC: long tail latency! Cannot escape tail
• (Path abstraction) PBSE + Limping NIC: more robust SE
Thank you! Questions? http://ucare.cs.uchicago.edu