Distributed System Report 1
Siddharth Sarasvati, Karthikeyen Balu, Sudipan Mishra
Overview
Distributed Tries for Load Balancing in P2P Systems
• Distributed Hash Table
• Load Balancing
Job Scheduling in Hadoop
• Fair Scheduler
• Capacity Scheduler
Distributed Tries for Load Balancing in P2P Systems
What are DHTs?
• Decentralized Distributed Hash Tables
• Properties
  • Decentralized
  • Fault tolerant
  • Scalable
Load Balancing
• Must be efficient to avoid performance degradation
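To make the DHT idea concrete, here is a minimal sketch of key lookup on a hash ring. It is an illustration only, not the structure used in the cited papers: the 16-bit ID space, the `Ring` class, and the node names are all assumptions chosen to keep the example small.

```python
import bisect
import hashlib

ID_BITS = 16  # assumed small ID space, chosen only for readability


def to_id(name: str) -> int:
    """Hash a string onto the ring [0, 2**ID_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


class Ring:
    """Hypothetical hash ring: a key belongs to the first node at or after its ID."""

    def __init__(self, node_names):
        self.nodes = sorted((to_id(n), n) for n in node_names)

    def owner(self, key: str) -> str:
        ids = [node_id for node_id, _ in self.nodes]
        i = bisect.bisect_left(ids, to_id(key))
        # Wrap around to the first node when the key hashes past the last ID.
        return self.nodes[i % len(self.nodes)][1]

    def leave(self, name: str) -> None:
        """Remove a node; only the keys it owned move to its successor."""
        self.nodes = [(i, n) for i, n in self.nodes if n != name]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
keys = [f"key-{i}" for i in range(200)]
before = {k: ring.owner(k) for k in keys}
ring.leave("node-c")
after = {k: ring.owner(k) for k in keys}
```

The point of the design is that a departure only disturbs the keys the departing node owned. Nothing in the hashing itself evens out the load, which is why a separate load-balancing mechanism is needed.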
Distributed Tries for Load Balancing in P2P Systems
• DHT structure
[Figure: nodes labeled H and L]
Naïve approaches to Load balancing
• Virtual Servers
• ID reassignment
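The virtual-server approach can be sketched as follows. This is a simplified greedy illustration, not the scheme from any cited paper: each physical node hosts several virtual servers, and an overloaded node hands virtual servers to the currently lightest node. The `rebalance` function, the `capacity` threshold, and the load figures are assumptions made for the example.

```python
def rebalance(nodes, capacity):
    """Move virtual servers off nodes whose total load exceeds `capacity`.

    nodes: dict mapping physical node name -> list of virtual-server loads.
    Returns the list of (source, target, load) transfers; `nodes` is updated in place.
    """
    moves = []
    # Visit the most heavily loaded physical nodes first.
    for heavy in sorted(nodes, key=lambda n: -sum(nodes[n])):
        # Try the smallest virtual servers first (iterating over a sorted copy).
        for load in sorted(nodes[heavy]):
            if sum(nodes[heavy]) <= capacity:
                break
            others = [n for n in nodes if n != heavy]
            if not others:
                break
            target = min(others, key=lambda n: sum(nodes[n]))
            # Only transfer if the receiver stays within capacity.
            if sum(nodes[target]) + load <= capacity:
                nodes[heavy].remove(load)
                nodes[target].append(load)
                moves.append((heavy, target, load))
    return moves


cluster = {"A": [5, 4, 3], "B": [1], "C": [2]}
transfers = rebalance(cluster, capacity=6)
```

Here node A starts with a total load of 12 and sheds two virtual servers so that every node ends at or below the capacity of 6. Note that the function scans every node's load to pick a target, which mirrors the global-knowledge cost that the naïve approaches are criticized for.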
Limitations of Naïve approaches
• Communication cost for node join and leave is high
• A join or leave operation requires prior knowledge of the entire system
Hypothesis
A Distributed Trie reduces communication costs compared with the naïve approaches.
Trie Structure for Load balancing
• Construct a Distributed Trie over the DHT ID space
• To minimize load-balancing cost
• To lower communication cost
Approach
Trie is balanced => DHT ID space is balanced
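One way to picture a trie over the ID space is the sketch below. It is a single-process illustration under stated assumptions, not the distributed algorithm from the Kwon and Park paper: IDs are 8-bit strings, every leaf covers the region of IDs sharing its prefix, and a leaf is split when it holds more than `max_load` keys. The names `TrieNode`, `insert`, `split`, and `max_load` are hypothetical.

```python
class TrieNode:
    def __init__(self, prefix=""):
        self.prefix = prefix    # bit prefix shared by every ID in this region
        self.children = None    # None for a leaf, else (zero_child, one_child)
        self.keys = []          # key IDs (bit strings) stored at a leaf


def split(node):
    """Turn a leaf into an inner node with two leaves, one per next bit."""
    depth = len(node.prefix)
    zero, one = TrieNode(node.prefix + "0"), TrieNode(node.prefix + "1")
    for key in node.keys:
        (one if key[depth] == "1" else zero).keys.append(key)
    node.children = (zero, one)
    node.keys = []


def insert(root, key_bits, max_load):
    """Store a key in its leaf, splitting while the leaf is overloaded."""
    node = root
    while node.children is not None:
        node = node.children[int(key_bits[len(node.prefix)])]
    node.keys.append(key_bits)
    while len(node.keys) > max_load and len(node.prefix) < len(key_bits):
        split(node)
        node = max(node.children, key=lambda child: len(child.keys))


def leaves(node):
    """Yield every leaf region of the trie."""
    if node.children is None:
        yield node
    else:
        for child in node.children:
            yield from leaves(child)


root = TrieNode()
skewed_ids = list(range(64)) + [128, 200]  # most keys crowd one quarter of the space
for value in skewed_ids:
    insert(root, format(value, "08b"), max_load=8)
regions = list(leaves(root))
```

With the skewed keys above, the crowded part of the ID space ends up under deeper, narrower leaves while the sparse part stays shallow, so the load per region is bounded even though the regions differ in size. Each split touches only the overloaded leaf and its two children.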
Hadoop
• Open source implementation of MapReduce
• Default scheduling: FIFO
• Critical jobs? Ad-hoc analysis?
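The concern about critical and ad-hoc jobs under FIFO can be shown with a small calculation. This is a toy model, not Hadoop code: it assumes a single execution slot, all jobs submitted at time 0, and made-up job names and durations.

```python
def fifo_completion_times(jobs):
    """Return the finish time of each job when run strictly in submission order.

    jobs: list of (name, duration) tuples, in the order they were submitted.
    """
    clock = 0
    finished = {}
    for name, duration in jobs:
        clock += duration
        finished[name] = clock
    return finished


times = fifo_completion_times([("nightly-batch", 100), ("ad-hoc-query", 2)])
# times == {"nightly-batch": 100, "ad-hoc-query": 102}
```

The two-unit ad-hoc query finishes at time 102 because it waits behind the long batch job, which is the motivation for schedulers that share the cluster among jobs.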
Job Scheduling
Fair Scheduler
• Groups jobs into “pools”
• Assigns each pool a guaranteed minimum share
Capacity Scheduler
• Jobs are submitted to a queue
• Queues get their capacity when they contain jobs
• Unused capacity is shared among queues
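The two ideas above, a guaranteed minimum share and redistribution of unused capacity, can be sketched in one small allocator. This is a simplification for illustration and does not reproduce the actual Hadoop Fair Scheduler or Capacity Scheduler logic; the `allocate` function, the pool names, and the slot counts are assumptions.

```python
def allocate(total_slots, pools):
    """Split `total_slots` among pools.

    pools: dict mapping pool name -> (min_share, demand), both in slots.
    Step 1 gives each pool its minimum share, capped by what it actually needs.
    Step 2 hands out leftover slots one at a time to the pool that still has
    unmet demand and currently holds the fewest slots.
    """
    alloc = {name: min(min_share, demand)
             for name, (min_share, demand) in pools.items()}
    spare = total_slots - sum(alloc.values())
    if spare < 0:
        raise ValueError("minimum shares exceed the cluster size")
    while spare > 0:
        hungry = [n for n, (_, demand) in pools.items() if alloc[n] < demand]
        if not hungry:
            break  # every pool is satisfied; remaining slots stay idle
        neediest = min(hungry, key=lambda n: alloc[n])
        alloc[neediest] += 1
        spare -= 1
    return alloc


shares = allocate(10, {"prod": (6, 3), "research": (2, 9), "adhoc": (2, 4)})
# shares == {"prod": 3, "research": 4, "adhoc": 3}
```

In this run the `prod` pool needs only 3 of its 6 guaranteed slots, so the 3 it leaves unused flow to the `research` and `adhoc` pools instead of sitting idle.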
Investigation
• Simulate the systems
• Confirm or refute that the discussed job scheduling algorithms are more efficient than the default (FIFO) implementation
• Analyze the efficiency of load balancing using a Distributed Trie relative to the naïve approaches
References
• Authors - Minseok Kwon, Gahyun Park
• Title - Distributed Tries for Load Balancing in Peer-to-Peer Systems
• Conference - Proceedings of IEEE IWQoS, June 2010
• Year - 2010
• URL - http://www.cs.rit.edu/~jmk/papers/trieload.pdf

• Authors - Yingwu Zhu, Yiming Hu
• Title - Towards Efficient Load Balancing in Structured P2P Systems
• Conference - Proceedings of the 18th International Parallel and Distributed Processing Symposium
• Year - 2004
• URL - http://fac-staff.seattleu.edu/zhuy/web/papers/load_bala.pdf

• Authors - Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar and Andrew Goldberg
• Title - Quincy: Fair Scheduling for Distributed Computing Clusters
• Conference - Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles
• Year - 2009
• URL - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.5498