Hadoop High Availability through Metadata Replication
Presented by: Lin Chen, 04/06/2011
F. Wang et al., CloudDB '09: Proceedings of the First International Workshop on Cloud Data Management, 2009
Why consider high availability support?
In Hadoop's HDFS, large files are split into pieces (blocks) that are replicated across several DataNodes, so the data itself survives DataNode failures.
The metadata, however, is kept in only one copy, on the NameNode.
http://developer.yahoo.com/hadoop/tutorial/module2.html
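The asymmetry above can be sketched as follows. This is an illustrative model, not Hadoop's actual API: the NameNode holds the only copy of the metadata mapping files to blocks and blocks to DataNode replicas, so losing it makes the replicated data unreachable.

```python
class NameNode:
    """Illustrative single-copy metadata holder (not Hadoop internals)."""

    def __init__(self):
        # file name -> ordered list of block IDs
        self.file_to_blocks = {}
        # block ID -> set of DataNode addresses holding a replica
        self.block_locations = {}

    def add_file(self, name, blocks):
        self.file_to_blocks[name] = list(blocks)

    def add_replica(self, block_id, datanode):
        self.block_locations.setdefault(block_id, set()).add(datanode)

    def locate(self, name):
        # Clients must ask the NameNode first; if it is down, no file
        # can be located even though every block replica is intact.
        return [sorted(self.block_locations.get(b, set()))
                for b in self.file_to_blocks[name]]


nn = NameNode()
nn.add_file("large.log", ["blk1", "blk2"])
nn.add_replica("blk1", "dn1")
nn.add_replica("blk1", "dn2")   # blk1 is replicated on two DataNodes
nn.add_replica("blk2", "dn3")
```

Note that the blocks have multiple replicas, but `nn` itself is the single point through which they are found.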
Why consider high availability support?
In MapReduce, the JobTracker likewise keeps only one copy of its metadata, while the TaskTrackers execute the work.
The NameNode and the JobTracker are critical nodes in their architectures; each keeps a single copy of its state, so the whole system crashes if that node goes down.
This is a SPOF: Single Point of Failure.
Example: Amazon S3 stopped working for nearly eight hours on July 21, 2008, and thousands of online stores using the S3 service were down for hours as well.
To avoid the SPOF, a solution using metadata replication was proposed and evaluated to guarantee high availability: the primary node's metadata is replicated to several slave nodes.
Both HDFS and MapReduce follow a master-slave structure, so their critical node can be replicated to maintain high availability.
An environment with several backup nodes is called primary-slaves; one with a single backup is called active-standby.
Initialization: Node Registration
Each slave sends a registration request with its IP address to the primary.
The primary records the address in its slave IP address table, replies with the table's content, and the slave acknowledges (ACK).
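The registration step can be sketched as below. All names and addresses are illustrative, not from the paper's implementation: the primary accumulates slave IPs in a table and returns the current table as its reply.

```python
class Primary:
    """Hypothetical sketch of the primary's registration handler."""

    def __init__(self, ip):
        self.ip = ip
        self.slave_table = []          # the slave IP address table

    def register(self, slave_ip):
        # Record the slave's address (idempotent on re-registration).
        if slave_ip not in self.slave_table:
            self.slave_table.append(slave_ip)
        # Reply with the table content; the slave then sends an ACK.
        return list(self.slave_table)


primary = Primary("10.0.0.1")
table = None
for ip in ["10.0.0.2", "10.0.0.3", "10.0.0.4"]:
    table = primary.register(ip)       # each slave registers in turn
```

After all three registrations, every slave that asks receives the complete table, so the peers can later contact each other during leader election.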
Initialization: Initial Metadata Synchronization
The initial metadata consist of the version file and the file system image (fsimage).
The primary asks each slave for its version file information and marks any slave whose copy is inconsistent.
The primary then asks each slave for its fsimage information and again marks inconsistent slaves.
Finally, the marked slaves are updated. This step keeps the initial metadata of the primary node and the slave nodes consistent.
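A minimal sketch of this consistency check, assuming digests are compared rather than full file contents (the comparison mechanism and all names here are assumptions, not taken from the paper):

```python
import hashlib


def digest(data: bytes) -> str:
    """Digest used to compare a metadata file without shipping it."""
    return hashlib.sha256(data).hexdigest()


def sync_slaves(primary_meta, slaves_meta):
    """Mark and update every slave whose copy differs from the primary's.

    primary_meta: {"VERSION": bytes, "fsimage": bytes}
    slaves_meta:  {slave_id: {file_name: bytes}}
    Returns the list of (slave, file) pairs that were inconsistent.
    """
    updated = []
    for slave, meta in slaves_meta.items():
        for name, data in primary_meta.items():
            if digest(meta.get(name, b"")) != digest(data):
                meta[name] = data              # push the primary's copy
                updated.append((slave, name))
    return updated


primary_meta = {"VERSION": b"v1", "fsimage": b"imageA"}
slaves = {"s1": {"VERSION": b"v1", "fsimage": b"imageA"},
          "s2": {"VERSION": b"v0", "fsimage": b"imageA"}}
fixed = sync_slaves(primary_meta, slaves)
```

Only the stale file on the stale slave is transferred, which matches the two-pass check-then-update flow described above.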
Replication Architecture
During normal operation, the primary sends metadata updates, heartbeats, and control messages to each slave, and the slave acknowledges every message (ACK).
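The runtime replication loop can be sketched like this. It is a simplified model under the assumption that metadata updates ride on heartbeat-style messages and each slave replies with an ACK; the class and method names are illustrative, not Hadoop internals:

```python
class Slave:
    """Illustrative slave that applies replicated metadata edits."""

    def __init__(self):
        self.log = []                       # replicated metadata edits

    def heartbeat(self, metadata_update):
        if metadata_update is not None:
            self.log.append(metadata_update)  # apply the replicated edit
        return "ACK"                          # acknowledge the message


def replicate(slaves, update):
    """Primary-side step: send one update to all slaves, collect ACKs."""
    acks = [s.heartbeat(update) for s in slaves]
    return all(a == "ACK" for a in acks)


slaves = [Slave(), Slave(), Slave()]
ok = replicate(slaves, "mkdir /user/lin")
```

A missing ACK is what the failover logic in the next section watches for.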
Failover: Leader Election
If a slave has not received an ACK from the primary for a long time, the primary node may be down.
Each slave broadcasts its sequence number (e.g., 0, 1, 2) to the other nodes in its slave IP address table.
A peer disagrees if the primary node is still alive; otherwise it agrees if the candidate's sequence number is the highest, and disagrees if it is not.
If a majority agrees, the candidate slave is qualified to become the new primary; otherwise a new round of election starts.
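The election rule can be sketched as below. This is a simplified sketch under the stated rule only (highest sequence number wins with majority agreement); tie-breaking and the liveness check for the old primary are omitted, and all identifiers are illustrative:

```python
def elect(candidates, primary_alive=False):
    """Return the winning slave ID, or None if no majority agrees.

    candidates: {slave_id: sequence_number}
    """
    if primary_alive:
        return None                    # peers disagree while primary lives
    n = len(candidates)
    best = max(candidates, key=lambda s: candidates[s])
    # Each peer agrees iff the candidate's sequence number is the highest
    # among the numbers that peer has seen.
    agreements = sum(1 for s in candidates
                     if candidates[best] >= candidates[s])
    return best if agreements > n // 2 else None


winner = elect({"slave0": 0, "slave1": 1, "slave2": 2})
```

The slave with sequence number 2 collects all three agreements, which is a majority of the three voters, so it becomes the new primary.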
Failover: IP Address Transition
The NameNode of HDFS is accessed through its IP address, so the new primary node simply changes its IP address to that of the old primary.
The new primary can then take over all communications with the other nodes.
Finally, it re-initializes the slave nodes by requesting that they re-register.
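The takeover step can be sketched as follows; the addresses and the dictionary representation are illustrative assumptions, not the paper's implementation. The key idea is that clients keep using the old address while the slave table is rebuilt through re-registration:

```python
def take_over(new_primary, old_primary_ip, remaining_slaves):
    """Elected slave adopts the old primary's IP and re-registers peers."""
    new_primary["ip"] = old_primary_ip     # clients see the same address
    new_primary["slave_table"] = []        # start a fresh IP address table
    for ip in remaining_slaves:            # each remaining slave re-registers
        new_primary["slave_table"].append(ip)
    return new_primary


node = take_over({"ip": "10.0.0.3"}, "10.0.0.1", ["10.0.0.2", "10.0.0.4"])
```

Because the service address never changes, no client-side reconfiguration is needed after failover.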
Conclusion
Hadoop does not provide high availability for its single points of failure.
Metadata replication is used to guarantee high availability.
Additional time cost is needed to achieve that availability.