Nagarjuna K YARN nagarjuna@outlook.com
Why Next Generation MR
• Reliability
• Availability
• Scalability: clusters of 10,000 machines and 200,000 cores, and beyond.
• Backward (and forward) compatibility: ensure customers’ MapReduce applications run unchanged in the next version of the framework.
• Evolution: ability for customers to control upgrades to the Hadoop software stack.
• Predictable latency: a major customer concern.
• Cluster utilization
Why Next Generation MR: Secondary Requirements
• Support for alternate programming paradigms to MapReduce.
• Support for short-lived services.
Re-Architecture
• Need: separate the two tasks of the JobTracker
  • Resource management
  • Job scheduling / management
So, what did we come up with?
• Resource Manager
• Node Manager
• Application Master
• Container
Resource Manager (RM)
• Manages the global assignment of compute resources to applications.
Resource Manager (RM)
• A pure scheduler
• Does not monitor or track the status of applications.
• Gives no guarantee of restarting failed tasks.
Resource Manager (RM)
• Each client/application may request multiple resources:
  • Memory
  • Network
  • CPU
  • Disk ..
• This is a significant change from the static Mapper / Reducer model.
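To make the multi-resource request idea concrete, here is a minimal Python sketch. It is a toy model, not the YARN API: `Resource`, `Node`, and `allocate` are hypothetical names, only memory and CPU are modeled, and the first-fit placement is an assumption chosen for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Resource:
    """A multi-dimensional resource amount (hypothetical, not the YARN class)."""
    memory_mb: int
    vcores: int

    def fits_in(self, other: "Resource") -> bool:
        return self.memory_mb <= other.memory_mb and self.vcores <= other.vcores

    def minus(self, other: "Resource") -> "Resource":
        return Resource(self.memory_mb - other.memory_mb, self.vcores - other.vcores)


@dataclass
class Node:
    name: str
    free: Resource


def allocate(nodes: List[Node], request: Resource) -> Optional[str]:
    """Grant the request on the first node with enough free capacity."""
    for node in nodes:
        if request.fits_in(node.free):
            node.free = node.free.minus(request)
            return node.name
    return None  # nothing fits right now; the application has to wait


nodes = [Node("node-1", Resource(4096, 4)), Node("node-2", Resource(8192, 8))]
small = allocate(nodes, Resource(1024, 1))     # fits on node-1
large = allocate(nodes, Resource(6144, 2))     # only node-2 has 6 GB free
too_big = allocate(nodes, Resource(16384, 1))  # no node can host this
```

Each request names the amounts it needs, so a node can host any mix of containers that fits its free capacity, rather than a fixed number of map and reduce tasks.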
Application Master
• A per-application ApplicationMaster (AM) that manages the application’s life cycle (scheduling and coordination).
• An application is either a single job in the classic MapReduce sense or a DAG of such jobs.
Application Master
• The Application Master has the responsibility of:
  • negotiating appropriate resource containers from the Scheduler
  • launching tasks
  • tracking their status
  • monitoring for progress
  • handling task failures.
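The responsibilities listed above can be sketched as a small control loop. This is an illustrative Python model only: `run_application`, `fake_request_container`, and `fake_launch` are hypothetical names, the retry limit of 3 is an assumption, and progress monitoring is left out to keep the sketch short.

```python
from typing import Callable, Dict, List


def run_application(
    tasks: List[str],
    request_container: Callable[[str], str],
    launch: Callable[[str, str, int], bool],
    max_attempts: int = 3,
) -> Dict[str, str]:
    """Drive every task to completion, retrying failed tasks in fresh containers."""
    status: Dict[str, str] = {}
    for task in tasks:
        status[task] = "FAILED"
        for attempt in range(1, max_attempts + 1):
            container = request_container(task)   # negotiate a container from the Scheduler
            if launch(task, container, attempt):  # launch the task and wait for its outcome
                status[task] = "SUCCEEDED"        # track its status
                break
            # task failure: fall through and retry with a new container
    return status


# Stand-ins for the Scheduler and the cluster, for illustration only.
counter = {"n": 0}


def fake_request_container(task: str) -> str:
    counter["n"] += 1
    return f"container-{counter['n']}"


def fake_launch(task: str, container: str, attempt: int) -> bool:
    if task == "bad":
        return False  # this task fails on every attempt
    return not (task == "map-1" and attempt == 1)  # map-1 fails once, then succeeds


result = run_application(["map-0", "map-1", "bad"], fake_request_container, fake_launch)
```

The retry logic lives in the per-application master, not in the central scheduler, which is the point of the split.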
Node Manager
• The NodeManager is the per-machine framework agent responsible for launching the applications’ containers, monitoring their resource usage (CPU, memory, disk, network), and reporting the same to the Scheduler.
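A rough Python sketch of the launch / monitor / report cycle described above. `ToyNodeManager`, `Container`, and the shape of the heartbeat report are assumptions made for illustration, and only memory is tracked; the real NodeManager protocol is richer.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Container:
    container_id: str
    memory_limit_mb: int
    memory_used_mb: int = 0


@dataclass
class ToyNodeManager:
    node_id: str
    containers: Dict[str, Container] = field(default_factory=dict)

    def launch(self, container_id: str, memory_limit_mb: int) -> None:
        """Start a container with the memory limit it was granted."""
        self.containers[container_id] = Container(container_id, memory_limit_mb)

    def record_usage(self, container_id: str, memory_used_mb: int) -> None:
        """Store the latest observed memory usage for a container."""
        self.containers[container_id].memory_used_mb = memory_used_mb

    def heartbeat(self) -> dict:
        """Build the report sent to the Scheduler; flag containers over their limit."""
        over = [
            c.container_id
            for c in self.containers.values()
            if c.memory_used_mb > c.memory_limit_mb
        ]
        return {
            "node": self.node_id,
            "used_mb": sum(c.memory_used_mb for c in self.containers.values()),
            "over_limit": sorted(over),
        }


nm = ToyNodeManager("node-1")
nm.launch("c1", 1024)
nm.launch("c2", 2048)
nm.record_usage("c1", 1500)  # c1 exceeds its 1024 MB limit
nm.record_usage("c2", 512)
report = nm.heartbeat()
```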
Gain with New Architecture
• Scalability
• Availability
• Wire-compatibility
• Innovation & Agility
• Cluster Utilization
• Support for programming paradigms other than MapReduce
Gain with New Architecture: Scalability
• Resource management and job management are segregated.
• The Hadoop MapReduce JobTracker spends a very significant portion of its time and effort managing the life cycle of applications; in the new design that work moves to the per-application ApplicationMaster.
Gain with New Architecture: Availability
• ResourceManager
  • Uses ZooKeeper for fail-over.
  • When the primary fails, the secondary can quickly start using the state stored in ZooKeeper (ZK).
• Application Master
  • MapReduce NextGen supports application-specific checkpoint capabilities for the ApplicationMaster.
  • The MapReduce ApplicationMaster can recover from failures by restoring itself from state saved in HDFS.
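The ResourceManager fail-over idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: `StateStore` is a plain dictionary standing in for ZooKeeper, `ToyResourceManager` is a hypothetical class, and leader election and failure detection are not modeled.

```python
from typing import Dict


class StateStore:
    """Stand-in for the shared store (ZooKeeper on the slide); just a dict here."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, app_id: str, state: str) -> None:
        self._data[app_id] = state

    def snapshot(self) -> Dict[str, str]:
        return dict(self._data)


class ToyResourceManager:
    def __init__(self, name: str, store: StateStore) -> None:
        self.name = name
        self.store = store
        self.apps: Dict[str, str] = {}

    def submit(self, app_id: str) -> None:
        self.apps[app_id] = "RUNNING"
        self.store.put(app_id, "RUNNING")  # persist so a standby can pick it up

    def recover(self) -> None:
        """Called on the secondary when it takes over from a failed primary."""
        self.apps = self.store.snapshot()


store = StateStore()
primary = ToyResourceManager("rm1", store)
secondary = ToyResourceManager("rm2", store)

primary.submit("app-1")
primary.submit("app-2")
# Assume the primary fails here; the secondary rebuilds its view from the store.
secondary.recover()
```

The same pattern applies to the ApplicationMaster bullet, with HDFS as the place where the checkpointed state is saved.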