
Hadoop High Availability through Metadata Replication



Presentation Transcript


  1. Hadoop High Availability through Metadata Replication
  Presented by: Lin Chen, 04/06/2011
  F. Wang et al., CloudDB '09: Proceedings of the First International Workshop on Cloud Data Management, 2009

  2. Why consider high availability support?
  In HDFS, a large file is split into pieces (Piece1, Piece2, Piece3, ...) that are replicated across several DataNodes, but the NameNode keeps only one copy of the filesystem metadata.
  [Figure: HDFS architecture with one NameNode and three DataNodes holding replicated file pieces. Source: http://developer.yahoo.com/hadoop/tutorial/module2.html]

  3. Why consider high availability support?
  In MapReduce, the JobTracker likewise keeps only one copy of its metadata, with work distributed across TaskTrackers.
  The NameNode and the JobTracker are critical nodes in these architectures: each keeps a single copy of its metadata, and the system will crash if that critical node goes down.
  SPOF: Single Point of Failure.
  Example: Amazon S3 stopped working for nearly eight hours on July 21, 2008, and thousands of online stores using the S3 service were down for hours as well.

  4. To avoid the SPOF, a solution using metadata replication was proposed and evaluated to guarantee high availability: the primary's metadata is replicated onto several slave nodes.
  Both Hadoop and MapReduce follow a master-slave structure, so their critical nodes can be replicated to preserve high availability.
  An environment with several backup nodes is called primary-slaves; one with a single backup is called active-standby.

  5. Procedure

  6. Initialization: Node Registration
  Each slave sends a registration request with its IP address to the primary.
  The primary records the address in its slave IP address table, replies with the table's contents, and the slave acknowledges (ACK).
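To make the registration step concrete, here is a minimal, self-contained Java sketch of a primary that keeps a slave IP address table and answers each registration request with the table's contents. The class and method names (RegistrationSketch, register) and the addresses are illustrative assumptions, not code from the paper or from Hadoop.

    import java.util.ArrayList;
    import java.util.List;

    // Minimal sketch of the node-registration step, using an in-memory
    // table. All names here are hypothetical, chosen for illustration.
    public class RegistrationSketch {

        // Primary's table of registered slave IP addresses.
        static final List<String> slaveIpTable = new ArrayList<>();

        // Primary side: record the slave's address and reply with the
        // current table so every node learns its peers.
        static List<String> register(String slaveIp) {
            if (!slaveIpTable.contains(slaveIp)) {
                slaveIpTable.add(slaveIp);
            }
            return new ArrayList<>(slaveIpTable); // reply: table content
        }

        public static void main(String[] args) {
            // Three slaves request registration in turn.
            for (String ip : new String[] {"10.0.0.2", "10.0.0.3", "10.0.0.4"}) {
                List<String> reply = register(ip);
                // Slave side: store the table and send an ACK (printed here).
                System.out.println(ip + " ACK, table = " + reply);
            }
        }
    }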

  7. Initialization: Initial Metadata Synchronization
  The initial metadata comprise the version file and the file system image (fsimage).
  The primary asks each slave for its version file information and marks any slave whose copy is inconsistent.
  The primary then asks each slave for its fsimage information and again marks any inconsistent slave.
  The marked slaves are then updated from the primary's copies.
  This step keeps the initial metadata of the primary node and the slave nodes consistent.
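The consistency check can be pictured as comparing metadata fingerprints. The following Java sketch uses plain strings to stand in for the VERSION file and fsimage contents; all names and values are hypothetical assumptions, not the paper's actual implementation.

    import java.util.HashMap;
    import java.util.Map;

    // Minimal sketch of initial metadata synchronization: compare each
    // slave's (version, fsimage) pair against the primary's, mark
    // mismatches, and push the primary's copies to the marked slaves.
    public class MetadataSyncSketch {

        static String primaryVersion = "v1";
        static String primaryFsimage = "img-42";

        public static void main(String[] args) {
            // Each slave reports its (version, fsimage) pair.
            Map<String, String[]> slaves = new HashMap<>();
            slaves.put("10.0.0.2", new String[] {"v1", "img-42"});
            slaves.put("10.0.0.3", new String[] {"v0", "img-42"}); // stale version
            slaves.put("10.0.0.4", new String[] {"v1", "img-41"}); // stale fsimage

            for (Map.Entry<String, String[]> e : slaves.entrySet()) {
                boolean consistent = e.getValue()[0].equals(primaryVersion)
                                  && e.getValue()[1].equals(primaryFsimage);
                if (!consistent) {
                    // Marked inconsistent: primary pushes its copies.
                    System.out.println(e.getKey() + ": inconsistent, updating");
                    e.setValue(new String[] {primaryVersion, primaryFsimage});
                } else {
                    System.out.println(e.getKey() + ": consistent");
                }
            }
        }
    }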

  8. Replication Architecture
  At runtime the primary sends metadata updates, heartbeat messages, and control messages to each slave, and the slave answers every message with an ACK.
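A minimal Java sketch of this message/ACK exchange follows. The message types and method names are assumptions made for illustration; a real deployment would exchange these messages over RPC rather than direct method calls.

    // Minimal sketch of the runtime replication loop: the primary ships
    // each metadata update to a slave and waits for an ACK, and also
    // exchanges periodic heartbeats so failures become visible.
    public class ReplicationSketch {

        enum MsgType { METADATA, HEARTBEAT, CONTROL }

        // Slave side: apply or note the message, then acknowledge.
        static boolean deliverToSlave(MsgType type, String payload) {
            System.out.println("slave got " + type + ": " + payload);
            return true; // ACK
        }

        public static void main(String[] args) {
            // Primary ships a metadata update and blocks until ACKed,
            // so slaves never fall silently behind.
            boolean acked = deliverToSlave(MsgType.METADATA, "edit-log entry #17");
            System.out.println("metadata ACK = " + acked);

            // Periodic heartbeat keeps liveness visible in both directions.
            acked = deliverToSlave(MsgType.HEARTBEAT, "seq 101");
            System.out.println("heartbeat ACK = " + acked);
        }
    }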

  9. Failover: Leader Election
  The primary node is presumed to have failed if a slave has not received its ACKs for a long time.
  A candidate slave broadcasts its sequence number (0, 1, 2, ...) to the nodes in the slave IP address table.
  A peer disagrees if the old primary is still alive; otherwise it agrees only if the candidate's sequence number is the highest, and disagrees if it is not.
  If the candidate gathers majority agreement, it is qualified as the new primary; otherwise a new round of election begins.
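The election rule on the slide (highest broadcast sequence number wins, subject to majority agreement) can be simulated in a few lines of Java. The names, addresses, and majority arithmetic below are illustrative assumptions, not the paper's code.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Minimal sketch of the sequence-number leader election: a candidate
    // broadcasts its sequence number, each peer agrees only if the old
    // primary looks dead and the candidate's number is the highest it
    // knows, and a majority of agreements qualifies the new primary.
    public class ElectionSketch {

        static boolean primaryAlive = false; // ACKs have timed out

        // A peer's vote on a candidate's broadcast sequence number.
        static boolean vote(int candidateSeq, int ownSeq) {
            if (primaryAlive) return false;  // disagree: primary still up
            return candidateSeq >= ownSeq;   // agree only to the highest
        }

        public static void main(String[] args) {
            int candidateSeq = 2; // the candidate broadcasts its number
            Map<String, Integer> peers = new LinkedHashMap<>();
            peers.put("10.0.0.2", 0);
            peers.put("10.0.0.3", 1);

            int agrees = 0;
            for (Map.Entry<String, Integer> p : peers.entrySet()) {
                if (vote(candidateSeq, p.getValue())) agrees++;
            }
            // Majority over the full group (candidate votes for itself).
            boolean qualified = (agrees + 1) > (peers.size() + 1) / 2;
            System.out.println(qualified ? "candidate becomes primary"
                                         : "new round of election");
        }
    }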

  10. Failover: IP Address Transition
  The NameNode of HDFS is accessed through its IP address, so the new primary node simply changes its IP address to that of the old primary.
  The new primary can then take over all communications with the other nodes.
  Finally, it re-initializes the slave nodes by requesting that they re-register.
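On Linux, the takeover can be done by binding the old primary's address on the new primary's interface. The Java sketch below shells out to the standard `ip` tool; the interface name and addresses are placeholders, root privileges are required, and the paper describes only the idea, not this exact command sequence. In practice a gratuitous ARP (e.g. with arping) is usually also sent so peers refresh their ARP caches.

    import java.io.IOException;

    // Minimal sketch of the IP takeover step, assuming a Linux host
    // where the new primary can bind the old primary's address with
    // the `ip` tool. Addresses and interface name are placeholders.
    public class IpTakeoverSketch {

        public static void main(String[] args)
                throws IOException, InterruptedException {
            String oldPrimaryIp = "10.0.0.1/24"; // address clients already use
            String iface = "eth0";

            // Assume the old primary's address on this node; afterwards all
            // NameNode traffic reaches the new primary without client changes.
            Process p = new ProcessBuilder(
                    "ip", "addr", "add", oldPrimaryIp, "dev", iface)
                    .inheritIO().start();
            System.exit(p.waitFor());
        }
    }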

  11. Conclusion
  Hadoop does not provide high availability protection for its single points of failure.
  Metadata replication can be used to guarantee high availability.
  Achieving this high availability comes at the cost of additional time overhead.

  12. Questions?
