
Three Perspectives & Two Problems

Shivnath Babu, Duke University


Presentation Transcript


  1. Three Perspectives & Two Problems. Shivnath Babu, Duke University

  2. Outline • I want to highlight two problems / thoughts • First some context

  3. Three Perspectives: System Designers / Developers, Users of the System, System Administrators
     • The Cloud era is ringing in interesting changes
     • Increasingly overlapping roles
     • Joe Schmoe can now provision a 100-node Hadoop cluster in minutes
     • Administrators in traditional roles are getting laid off

  4. Three Perspectives: System Designers / Developers, Users of the System, System Administrators
     • The Cloud era is ringing in interesting changes
     • Software abstractions / packaging / release cycles have changed
     • More visibility into how users use the software

  5. Problem 1: Automated Experiment-driven System Management

  6. Taking the (Next) Bite Out of System Administration
     • Cloud has automated some system administration tasks
     • Can we automate others:
        • System tuning (configuration parameters, SQL queries, MapReduce jobs)
        • Detecting and repairing data corruption (disaster recovery)
        • Software / service testing

  7. Database Performance Tuning: 2-dim Projection of an 11-dim Surface

  8. MapReduce Job Tuning in Hadoop: 2-dim Projection of a 13-dim Surface

  9. Taking the (Next) Bite Out of System Administration
     • Cloud has automated some system administration tasks
     • Can we automate others:
        • System tuning (configuration parameters, SQL queries, MapReduce jobs)
        • Detecting and repairing data corruption (disaster recovery)
        • Software / service testing

  10. Data Corruption
     (Diagram labels: Applications, Database, File-System, Storage, Stored Data)
     • Stored data becomes different from what it is supposed to be
        • Bugs in software / firmware
        • Alpha particles, bit rot
        • Human mistakes
     • Bad things have happened
        • Data loss
        • System unavailability
        • Incorrect results
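
One common way to notice that stored data has become different from what it is supposed to be is to record a checksum at write time and recompute it later. The sketch below is illustrative only and does not come from the talk: `record_checksum`, `is_corrupted`, and `flip_bit` are hypothetical helper names, and SHA-256 from Python's standard `hashlib` stands in for whatever integrity check a real system would use.

```python
import hashlib


def record_checksum(data: bytes) -> str:
    """Return a SHA-256 digest to store alongside the data at write time."""
    return hashlib.sha256(data).hexdigest()


def is_corrupted(data: bytes, expected_digest: str) -> bool:
    """Recompute the digest and report whether it differs from the stored one."""
    return hashlib.sha256(data).hexdigest() != expected_digest


def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Simulate bit rot by flipping a single bit (for illustration only)."""
    damaged = bytearray(data)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


original = b"stored data block"
digest = record_checksum(original)          # saved when the data is written
damaged = flip_bit(original, byte_index=3, bit=0)

print(is_corrupted(original, digest))  # intact copy
print(is_corrupted(damaged, digest))   # copy with one flipped bit
```

A check like this only detects corruption; repairing it still needs a good copy to restore from.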

  11. Taking the (Next) Bite Out of System Administration
     • Cloud has automated some system administration tasks
     • Can we automate others:
        • System tuning (configuration parameters, SQL queries, MapReduce jobs)
        • Detecting and repairing data corruption (disaster recovery)
        • Software / service testing

  12. Key Insight: Need to Run “Experiments”
     (Diagram labels: Applications, Database, File-System, Storage, Stored Data)
     Challenge: Where / How / When to run experiments?
     • System tuning: running workload under various system settings
     • Detecting data corruption: running integrity checks to verify data correctness
     • Software / service testing: running the tests
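
For the system-tuning case, "running workload under various system settings" can be pictured as a loop over candidate settings that measures each one. The sketch below is a minimal illustration under stated assumptions, not the method from the talk: `run_experiments` is a hypothetical helper that does an exhaustive grid search, and `synthetic_workload` is a made-up cost function standing in for an actual workload run. The parameter names `buffer_mb` and `reduce_tasks` are also invented for the example.

```python
import itertools
from typing import Callable, Dict, Iterable, Tuple

Setting = Dict[str, int]


def run_experiments(
    measure: Callable[[Setting], float],
    grid: Dict[str, Iterable[int]],
) -> Tuple[Setting, float]:
    """Try every combination in the grid and return the fastest setting."""
    names = list(grid)
    best_setting, best_time = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        setting = dict(zip(names, values))
        elapsed = measure(setting)  # one "experiment"
        if elapsed < best_time:
            best_setting, best_time = setting, elapsed
    return best_setting, best_time


def synthetic_workload(setting: Setting) -> float:
    """Hypothetical stand-in for a real workload run; returns a made-up time."""
    return (setting["buffer_mb"] - 256) ** 2 / 1000 + abs(setting["reduce_tasks"] - 16)


grid = {"buffer_mb": [64, 128, 256, 512], "reduce_tasks": [4, 8, 16, 32]}
best, best_time = run_experiments(synthetic_workload, grid)
print(best, best_time)
```

Exhaustive search needs one experiment per combination, so it becomes impractical as the number of parameters grows, which is one reason where, how, and when to run experiments is a challenge.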

  13. Cloud is Part of the Answer
     (Diagram labels: Applications, Database, File-System, Storage, Production Data; Applications, Database, File-System, Storage, Data on system for doing experiments)
     • Take snapshots of production data at low overhead
     • Fire up production-like instances of the system
     • Pay-as-you-go, elasticity
     • Run the experiments

  14. Power of Experiments to the People
     (Diagram labels: Declarative benchmarking & tuning; Declarative Language; Plan optimized sequence of expts; Protecting against data corruption; Conduct expts automatically; Resources)

  15. Problem 2: Data-Parallel Computing for the Masses

  16. Challenges
     • Joe Schmoe can now provision a 100-node Hadoop cluster in minutes. Is that enough?
     • Joe may need answers to:
        • How many reduce tasks to use in MapReduce job J for getting the best perf. on my 8-node production cluster?
        • My current cluster needs more than 6 hours to process 1 day’s worth of data. Want to reduce that to under 3 hours. How many and what type of Amazon EC2 nodes to use?
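
To make the second question concrete, here is a back-of-the-envelope sizing sketch. It is not the approach from the talk and rests on loud assumptions: running time scales perfectly linearly with node count, the 3-hour target is treated as "at most 3 hours", and the node types, speeds, and prices are invented placeholders rather than real Amazon EC2 offerings. `NodeType`, `size_cluster`, and `cheapest_option` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NodeType:
    name: str              # hypothetical label, not a real EC2 instance type
    relative_speed: float  # made-up throughput relative to one current node
    price_per_hour: float  # made-up hourly price in dollars


def size_cluster(
    current_nodes: int,
    current_hours: float,
    target_hours: float,
    node: NodeType,
) -> Tuple[int, float, float]:
    """Return (node count, running time in hours, total cost) for one node type.

    Assumes perfect linear scaling, which real MapReduce jobs rarely achieve.
    """
    work = current_nodes * current_hours  # total work in current-node-hours
    count = math.ceil(work / (target_hours * node.relative_speed))
    hours = work / (count * node.relative_speed)
    cost = count * hours * node.price_per_hour
    return count, hours, cost


def cheapest_option(
    current_nodes: int,
    current_hours: float,
    target_hours: float,
    node_types: List[NodeType],
) -> Tuple[str, int, float, float]:
    """Pick the node type with the lowest total cost that meets the target."""
    options = [
        (node.name, *size_cluster(current_nodes, current_hours, target_hours, node))
        for node in node_types
    ]
    return min(options, key=lambda option: option[3])


node_types = [NodeType("small", 1.0, 0.10), NodeType("large", 2.0, 0.25)]
name, count, hours, cost = cheapest_option(8, 6.0, 3.0, node_types)
print(name, count, hours, cost)
```

Real jobs do not scale this cleanly (skew, shuffle costs, and fixed startup overheads all intervene), so a calculation like this only bounds the answer.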

  17. Performance Vs. Price Tradeoff

  18. Spectrum: Grid Computing, Database Systems, Newer Data-Parallel Systems
     • Python / R / Java vs. SQL
     • Black-box functions vs. fixed set of operators
     • Unknown data-access patterns vs. known data-access patterns
     • Cost-based optimizers, What-if engines

  19. Starfish: Self-Tuning Analytics on Big Data
     • Workload-level tuning: Workload Optimizer, Elastisizer
     • What-if Engine
     • Workflow-level tuning: Workflow-aware Optimizer/Scheduler
     • Job-level tuning: Just-in-Time Optimizer, Sampler, Profiler
     • Data Manager: Data Layout & Storage Mgr., Metadata Mgr., Intermediate Data Mgr.

  20. MapReduce Job Tuning in Hadoop
     (Figure panels: True Surface, Estimated Surface)

  21. Summary
     • Three perspectives: Developer, User, & Administrator
     • Two problems:
        • Automated Experiment-driven System Management
        • Data-Parallel Computing for the Masses
