This project aims to improve Hadoop monitoring by using tracing techniques to capture causality across nodes and machine learning to detect problems, including unsupervised learning to identify unusual runs. Key challenges include analyzing X-Trace reports in real time and scaling trace data collection.
Monitoring Hadoop through Tracing
Andy Konwinski and Matei Zaharia
Objectives
• Debug and profile data center applications
• Hadoop distributed file system and MapReduce
• Apache Nutch web indexing engine
• Automatically detect problems from traces
State of the Art
• Unpublished proprietary log management systems at Google, Yahoo, etc.
• Per-machine logs
• Sawzall for mining log data
• Node monitoring daemon (System Health Infrastructure)
Our Idea
• Capture causality directly by tracing computations across nodes using X-Trace
• Use machine learning to detect problems
• Detect unusual runs using unsupervised learning
• Classify problems using supervised learning
• Also want to study Hadoop performance
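The unsupervised-detection idea above can be sketched as a simple outlier test over per-run summary features. This is an illustrative assumption, not the project's actual method: the feature (total job duration) and the deviation threshold `k` are hypothetical choices for demonstration.

```python
# Minimal sketch: flag unusual Hadoop runs by an unsupervised outlier
# test on a per-run feature. The feature (total job duration) and the
# threshold k are illustrative assumptions, not from the slides.
from statistics import mean, stdev

def flag_unusual_runs(runs, k=1.5):
    """runs: dict mapping run id -> total job duration in seconds.
    Returns ids whose duration lies more than k std devs from the mean."""
    durations = list(runs.values())
    mu, sigma = mean(durations), stdev(durations)
    if sigma == 0:
        return []
    return [rid for rid, d in runs.items() if abs(d - mu) > k * sigma]

# Hypothetical run timings: run-4 is an anomalously slow job.
runs = {"run-1": 310, "run-2": 295, "run-3": 305, "run-4": 900, "run-5": 300}
print(flag_unusual_runs(runs))  # → ['run-4']
```

In practice the feature vector would be richer (per-task durations, shuffle bytes, retry counts), and a real system might use clustering rather than a single z-score, but the shape of the computation is the same.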
Risks
• Scaling X-Trace data collection
• Analyzing X-Trace reports in real time
• Identifying features of X-Trace graphs to run machine learning on
• Our manually induced errors may not capture all failures that happen in a production cluster
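The feature-identification risk can be made concrete with a sketch of turning an X-Trace causal graph into a fixed-size feature vector. The graph encoding (event id mapped to the events it causes) and the chosen features (event count, depth, fan-out) are illustrative assumptions, not a specification of X-Trace's report format.

```python
# Minimal sketch: extract candidate ML features from an X-Trace-style
# causal graph. The encoding (event id -> list of caused events) and
# the feature set are illustrative assumptions.
def graph_features(children, root):
    """children: dict mapping event id -> list of caused event ids.
    Returns a small feature dict for one traced run."""
    n_events, max_depth, max_fanout = 0, 0, 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        n_events += 1
        max_depth = max(max_depth, depth)
        kids = children.get(node, [])
        max_fanout = max(max_fanout, len(kids))
        stack.extend((k, depth + 1) for k in kids)
    return {"events": n_events, "depth": max_depth, "fanout": max_fanout}

# Hypothetical trace of a tiny MapReduce job.
trace = {"job": ["map-1", "map-2"], "map-1": ["reduce-1"], "map-2": ["reduce-2"]}
print(graph_features(trace, "job"))  # → {'events': 5, 'depth': 2, 'fanout': 2}
```

Vectors like this give the unsupervised and supervised learners something uniform to consume, regardless of how large or irregular individual trace graphs are.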