Jumbune Community 1.0.0 release: riding an elephant was never so easy…
Agenda
• Differentiator
• Ultra-light Deployment
• Features: Description
• Strategic Users & Positioning
What does Jumbune solve?
• Detects job code and data inaccuracies: for when Hadoop MapReduce analytics output is not as expected
• Profiles jobs: for when MapReduce jobs have performance bottlenecks
• On-demand cluster monitoring: for when the whole cluster cannot be monitored or unmonitored at will
• Non-intrusive operation: for when intrusive deployments of monitoring or analysis daemons are not wanted
Key Differentiating Features
• Cluster monitoring can be switched on on demand
• MapReduce flow drill-down (world's only)
• Installation decoupled from Hadoop
• MapReduce phase-wise statistics (time vs. data flow rate vs. resource)
Features & Users
• Hadoop Cluster Monitoring
• MR Job Profiling
• HDFS Data Validator
• MR Flow Debugger
Analytic Solution Costs & Solutions
• MapReduce solution development costs:
 - Development and data staging are fault prone
 - Faults take days to resolve on real data (because of volume)
 - MapReduce jobs may contain performance bottlenecks
• Hadoop cluster monitoring costs:
 - The administrator analyzes each node separately
 - Monitoring daemons must be installed and run on each node
 - Running daemons consume cluster resources
Hadoop Cluster Monitoring
• Data centre and rack-aware view of nodes
• Dynamic, interval-based monitoring
• Hadoop JMX and node resource statistics
• Network latency across Hadoop nodes
• Per-file, node-wise replica placement (which nodes have replicas of a given file?)
• HDFS data placement view (is HDFS balanced?)
• HDFS health statistics (is HDFS corrupted?)
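To make the "is HDFS balanced?" question concrete, here is a minimal sketch, not Jumbune's implementation. It assumes the usual balancer-style criterion: a node counts as out of balance when its disk utilization differs from the cluster-wide utilization by more than a threshold (10 percentage points here). The node names, the numbers, and the `unbalanced_nodes` helper are all hypothetical.

```python
from typing import Dict, Tuple


def unbalanced_nodes(
    nodes: Dict[str, Tuple[float, float]], threshold: float = 10.0
) -> Dict[str, float]:
    """Return nodes whose utilization deviates from the cluster average.

    `nodes` maps a node name to (used, capacity) in the same unit.
    The result maps each offending node to its deviation in percentage
    points (positive = over-utilized, negative = under-utilized).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    cluster_pct = 100.0 * total_used / total_capacity

    deviations = {}
    for name, (used, capacity) in nodes.items():
        node_pct = 100.0 * used / capacity
        if abs(node_pct - cluster_pct) > threshold:
            deviations[name] = round(node_pct - cluster_pct, 1)
    return deviations


# Made-up sample data: (used GB, capacity GB) per node.
sample_nodes = {
    "node1": (800.0, 1000.0),
    "node2": (500.0, 1000.0),
    "node3": (200.0, 1000.0),
}
report = unbalanced_nodes(sample_nodes)
```

With this sample, `node1` and `node3` are reported while `node2` sits exactly at the cluster average.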
MR Job Profiling
• Per-job, phase-wise statistics:
 - Performance for each JVM
 - Data flow rate
 - Resource usage
• Per-job heap sites for Mapper & Reducer
• Per-job CPU cycles for Mapper & Reducer
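As an illustration of phase-wise statistics, the sketch below derives a data flow rate per phase from elapsed time and bytes processed, then picks the slowest phase as a bottleneck candidate. It is an assumption about how such figures could be combined, not a description of Jumbune's profiler; `PhaseSample`, `flow_rates`, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

MIB = 1024 * 1024


@dataclass
class PhaseSample:
    phase: str            # e.g. "map", "shuffle", "reduce"
    seconds: float        # wall-clock time spent in the phase
    bytes_processed: int  # bytes read or moved during the phase


def flow_rates(samples: List[PhaseSample]) -> Dict[str, float]:
    """Return the data flow rate of each phase in MiB per second."""
    rates = {}
    for sample in samples:
        if sample.seconds <= 0:
            continue  # skip phases without a measurable duration
        rates[sample.phase] = sample.bytes_processed / sample.seconds / MIB
    return rates


# Made-up measurements for a single job.
samples = [
    PhaseSample("map", 40.0, 400 * MIB),
    PhaseSample("shuffle", 50.0, 100 * MIB),
    PhaseSample("reduce", 10.0, 100 * MIB),
]
rates = flow_rates(samples)
slowest_phase = min(rates, key=rates.get)
```

Here the shuffle phase moves the least data per second, so it would be the first place to look.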
HDFS Data Validator
• Detects inconsistencies in HDFS data through:
 - Null checks
 - Data type checks
 - Regular expression checks
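The three kinds of check listed above can be sketched over delimited text records as follows. This is a minimal illustration in plain Python, assuming comma-separated fields and a per-field rule table; the rule format, the helper names, and the sample records are hypothetical and do not reflect Jumbune's actual configuration.

```python
import re
from typing import Callable, Dict, Iterable, List, Tuple

Check = Tuple[str, Callable[[str], bool]]


def is_not_null(value: str) -> bool:
    """Null check: reject empty fields and the literal string 'null'."""
    stripped = value.strip()
    return stripped != "" and stripped.lower() != "null"


def is_int(value: str) -> bool:
    """Data type check: the field must parse as an integer."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def matches(pattern: str) -> Callable[[str], bool]:
    """Regular expression check: the whole field must match the pattern."""
    compiled = re.compile(pattern)
    return lambda value: compiled.fullmatch(value) is not None


def validate_lines(
    lines: Iterable[str], rules: Dict[int, List[Check]], delimiter: str = ","
) -> List[Tuple[int, int, str]]:
    """Return (line number, field index, failed check) for every violation."""
    violations = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        for index, checks in rules.items():
            value = fields[index] if index < len(fields) else ""
            for name, check in checks:
                if not check(value):
                    violations.append((line_no, index, name))
    return violations


# Hypothetical rules: field 0 is a non-null integer id, field 1 a non-null
# name, field 2 something shaped like an e-mail address.
RULES = {
    0: [("null", is_not_null), ("type:int", is_int)],
    1: [("null", is_not_null)],
    2: [("regex", matches(r"[^@\s]+@[^@\s]+\.[a-z]+"))],
}

sample = ["1,alice,alice@example.com", "x,bob,bob@example.com", "3,,not-an-email"]
violations = validate_lines(sample, RULES)
```

Reporting the line number and field index alongside the failed check is what lets a user trace an inconsistency back to a specific record.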
MapReduce Flow Debugger
• Verifies the flow of input records through the user's MapReduce implementation
• Drill-down visualization helps developers quickly identify the problem
• The only tool that helps developers find MapReduce implementation faults without any extra coding
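The idea of verifying record flow can be illustrated by counting records entering and leaving each stage of a job. The sketch below simulates a word-count job in plain Python and wraps the map and reduce functions with counters. It only illustrates the concept; it is not how Jumbune instruments jobs, and `traced`, `run_job`, and the sample input are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List


def traced(stage: str, counters: Counter, func: Callable) -> Callable:
    """Wrap a stage function so records in and out are counted."""
    def wrapper(record):
        counters[f"{stage}.in"] += 1
        outputs = list(func(record))
        counters[f"{stage}.out"] += len(outputs)
        return outputs
    return wrapper


def mapper(line: str):
    # Emits (word, 1) but silently drops tokens that are not alphabetic.
    for word in line.split():
        if word.isalpha():
            yield (word.lower(), 1)


def reducer(item):
    key, values = item
    yield (key, sum(values))


def run_job(lines: Iterable[str], counters: Counter) -> Dict[str, int]:
    """Run map, group by key, then reduce, recording counts per stage."""
    traced_map = traced("map", counters, mapper)
    traced_reduce = traced("reduce", counters, reducer)

    grouped: Dict[str, List[int]] = {}
    for line in lines:
        for key, value in traced_map(line):
            grouped.setdefault(key, []).append(value)

    result = {}
    for item in sorted(grouped.items()):
        for key, total in traced_reduce(item):
            result[key] = total
    return result


counters = Counter()
result = run_job(["Hadoop rides an elephant", "elephant 2013 rides"], counters)
```

The input holds seven tokens but the map stage emits only six pairs, which points at the `isalpha` filter as the place where a record is dropped.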