Three Perspectives & Two Problems. Shivnath Babu Duke University. Outline. I want to highlight two problems / thoughts First some context. Three Perspectives. System Designers / Developers. Users of the System. System Administrators. The Cloud era is ringing in interesting changes
Three Perspectives & Two Problems Shivnath Babu Duke University
Outline • I want to highlight two problems / thoughts • First some context
Three Perspectives System Designers / Developers Users of the System System Administrators • The Cloud era is ringing in interesting changes • Increasingly overlapping roles • Joe Schmoe can now provision a 100-node Hadoop cluster in minutes • Administrators in traditional roles are getting laid off
Three Perspectives System Designers / Developers Users of the System System Administrators • The Cloud era is ringing in interesting changes • Software abstractions / packaging / release cycles have changed • More visibility into how users use the software
Taking the (Next) Bite Out of System Administration • Cloud has automated some system administration tasks • Can we automate others: • System tuning (configuration parameters, SQL queries, MapReduce jobs) • Detecting and repairing data corruption (disaster recovery) • Software / service testing
Database Performance Tuning 2-dim Projection of an 11-dim Surface
MapReduce Job Tuning in Hadoop 2-dim Projection of a 13-dim Surface
Data Corruption Applications Database File-System Storage Stored Data • Stored data becomes different from what it is supposed to be • Bugs in software / firmware • Alpha particles, bit rot • Human mistakes • Bad things have happened • Data loss • System unavailability • Incorrect results
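One common way to notice that stored data "becomes different from what it is supposed to be" is to record a checksum when the data is written and compare it later. The sketch below is only an illustration of that idea, not the method used in the talk; the block ids, the in-memory `blocks` dictionary, and the helper names `checksum` and `find_corrupted` are all hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of a block of bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_corrupted(blocks: dict, expected: dict) -> list:
    """Return ids of blocks whose current checksum differs from the recorded one."""
    return [
        block_id
        for block_id, data in blocks.items()
        if checksum(data) != expected.get(block_id)
    ]


# Hypothetical stored data: block id -> contents.
blocks = {"b0": b"alpha", "b1": b"beta"}

# Record checksums while the data is known to be good.
recorded = {block_id: checksum(data) for block_id, data in blocks.items()}

# Simulate corruption (for example, a flipped byte) in one block.
blocks["b1"] = b"bets"

print(find_corrupted(blocks, recorded))
```

A checksum comparison only detects that a block changed; repairing it still needs a good copy, such as a replica or snapshot.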
Key Insight: Need to Run “Experiments” Applications Database File-System Storage Stored Data Challenge: Where / How / When to run experiments? • System tuning: • Running workload under various system settings • Detecting data corruption: • Running integrity checks to verify data correctness • Software / service testing: • Running the tests
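For the system-tuning case, "running workload under various system settings" can be sketched as an exhaustive search over a small grid of settings. This is a minimal illustration under stated assumptions: `run_experiment` is a hypothetical stand-in that returns a synthetic runtime instead of executing a real workload, and the parameter names `io_sort_mb` and `reduce_tasks` are placeholders chosen for the example, not settings taken from the talk.

```python
import itertools


def run_experiment(settings: dict) -> float:
    """Hypothetical experiment: return a synthetic runtime in seconds.

    A real experiment would run the workload on a test instance of the
    system configured with these settings and measure its runtime.
    """
    return (
        100
        + 0.01 * (settings["io_sort_mb"] - 200) ** 2
        + 2 * abs(settings["reduce_tasks"] - 16)
    )


def grid_search(param_grid: dict, experiment) -> tuple:
    """Run one experiment per combination of settings and keep the fastest."""
    names = sorted(param_grid)
    best_settings, best_runtime = None, float("inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        settings = dict(zip(names, values))
        runtime = experiment(settings)
        if runtime < best_runtime:
            best_settings, best_runtime = settings, runtime
    return best_settings, best_runtime


grid = {"io_sort_mb": [50, 100, 200, 400], "reduce_tasks": [4, 8, 16, 32]}
best, runtime = grid_search(grid, run_experiment)
print(best, runtime)
```

The number of experiments grows multiplicatively with each added parameter, which is one reason the question of where, how, and when to run them matters.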
Cloud is Part of the Answer Applications Database File-System Storage Production Data Applications Database File-System Storage Data on system for doing experiments • Take snapshots of production data at low overhead • Fire up production-like instances of the system • Pay-as-you-go, elasticity • Run the experiments
Power of Experiments to the People Declarative benchmarking & tuning Declarative Language Plan optimized sequence of experiments Protecting against data corruption Conduct experiments automatically Resources
Challenges • Joe Schmoe can now provision a 100-node Hadoop cluster in minutes. Is that enough? • Joe may need answers to: • How many reduce tasks to use in MapReduce job J for getting the best performance on my 8-node production cluster? • My current cluster needs more than 6 hours to process 1 day’s worth of data. Want to reduce that to under 3 hours. How many and what type of Amazon EC2 nodes to use?
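To see why the cluster-sizing question is hard, it helps to write down the naive answer first. The sketch below assumes perfect linear scaling (doubling the nodes halves the runtime), which real MapReduce workloads rarely achieve, and it ignores node type entirely. The function name `nodes_for_target` and the 8-node starting size are assumptions for illustration; the slide does not state the size of the current cluster in this question.

```python
import math


def nodes_for_target(current_nodes: int, current_hours: float, target_hours: float) -> int:
    """Estimate the node count needed to hit a target runtime.

    Assumes perfect linear scaling: total work in node-hours stays
    constant as nodes are added. This is a rough baseline only.
    """
    if target_hours <= 0:
        raise ValueError("target_hours must be positive")
    node_hours = current_nodes * current_hours
    return math.ceil(node_hours / target_hours)


# Hypothetical: an 8-node cluster taking 6 hours, with a 3-hour target.
print(nodes_for_target(8, 6, 3))
```

Any gap between this baseline and measured behavior is what experiments or a what-if model would need to capture.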
Spectrum Grid Computing Database Systems Newer Data-Parallel Systems Python / R / Java SQL Black-box functions Fixed set of operators Unknown data-access patterns Known data-access patterns Cost-based optimizers, What-if engines
Starfish: Self-Tuning Analytics on Big Data Workload-level tuning Workload Optimizer Elastisizer What-if Engine Workflow-level tuning Workflow-aware Optimizer/Scheduler Job-level tuning Just-in-Time Optimizer Sampler Profiler Data Manager Data Layout & Storage Mgr. Metadata Mgr. Intermediate Data Mgr.
MapReduce Job Tuning in Hadoop True Surface Estimated Surface
Summary • Three perspectives: Developer, User, & Administrator • Two problems: • Automated Experiment-driven System Management • Data-Parallel Computing for the Masses