COEN317: Distributed Systems April 7, 2006 Review of: “Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications” Paper by: Ion Stoica*, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan (MIT, Berkeley*) Technical Report: MIT-LCS-TR-819 On-line as: http://www.lcs.mit.edu/publications/pubs/pdf/MIT-LCS-TR-819.pdf Summarized by: Chris Neely
Style of Writing • A thorough technical report, which includes: • Description of the Chord protocol • Comparison to related work • Evaluation of the protocol and system • Proofs supporting theoretical claims • Simulation results for systems with up to 10,000 nodes • Measurements of an efficient implementation that confirm the simulation results • Later published at ACM SIGCOMM '01
Overview (p.1) • Goal: to develop an efficient method for determining the location of a data item in a large peer-to-peer network, using key/value pairs. • Previous work (on consistent hashing algorithms) assumed nodes were aware of most other nodes • not scalable to large networks. • Instead, Chord's routing table is distributed; • Nodes resolve a hash by communicating with a few neighbors; • O(log N) messages per lookup. • Some earlier schemes do not handle frequent joins and leaves well. • Chord requires only O(log² N) messages per join or leave update.
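As a rough illustration of these bounds (my own arithmetic, not from the paper): plugging the largest simulated network size into the asymptotic expressions, and ignoring constant factors, gives on the order of a dozen messages per lookup and a couple hundred per join or leave.

```python
import math

# Illustrative only: evaluate the asymptotic bounds at the paper's
# largest simulated network size. Constants are ignored, so these are
# orders of magnitude, not exact message counts.
N = 10_000                       # nodes in the simulated network
lookup_msgs = math.log2(N)       # O(log N) messages per lookup
update_msgs = math.log2(N) ** 2  # O(log^2 N) messages per join/leave

print(f"log2(N)   ~ {lookup_msgs:.1f} messages per lookup")
print(f"log2(N)^2 ~ {update_msgs:.0f} messages per join/leave update")
```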
Much Related Work (p.2) • *DNS [Mockapetris88] • Similarities: both systems map names to values. • Differences: DNS has special root servers; Chord has no special servers. • *Freenet [Clarke,I.99] • Similarities: decentralized, symmetric, and automatically adapts as hosts join and leave. • Differences: Chord queries always result in success or definitive failure, and Chord is more scalable; the authors note that Freenet aggressively replicates documents but cannot give bounded guarantees on retrieval or on the number of steps for updates. • Ohaha • Similarities: uses a consistent-hashing-like algorithm. • Differences: uses Freenet-style queries, so it shares some of Freenet's weaknesses. • The Globe system [Baker,A.00] • Similarities: a wide-area location service (like DNS). • Differences: uses a tree-based search that does not scale well.
Related Work Cont'd (p.2) • Distributed data location protocol [Plaxton,C.97], OceanStore [Kubiatowicz00] • Similarities: the most similar to Chord. • Differences: makes assumptions about network structure and about how many hops queries can travel; the authors claim Chord is "substantially less complicated." • The Grid location system [Li,J.00] • Similarities: Chord is analogous to a one-dimensional version of Grid. • Differences: Grid requires geographic location information. • *Napster • Differences: Chord avoids single points of failure or control, such as Napster's centralized directory. • *Gnutella • Differences: Chord is much more scalable than systems like Gnutella, which rely on broadcasts of increasing scope. • The authors mention that Chord could work as a suitable replacement for the lookup services used within the *starred applications above.
System Model: Desired Properties (p.3-4) • Scalability • up to billions of keys on hundreds or millions of nodes. • Availability • continues to function despite network partitions and node failures. • Load-balanced operation • resource usage is evenly distributed across the system. • Dynamism • nodes joining and leaving is the common case, handled with no downtime. • Updatability • applications can update key/value bindings dynamically. • Locating according to 'proximity' • uses heuristics to favor local communication when possible.
The base Chord protocol – Overview (p.5) • Chord servers in a system • implement the Chord protocol: • to return the locations of keys. • to help new nodes bootstrap. • to reorganize the overlay network of server nodes after a 'leave'. • Use a consistent hash function [Karger97], [Lewin98] • that is likely to produce a balanced load. • so a join (or leave) is likely to move only O(K/N) keys. • Only a small amount of routing information is maintained about other nodes • unlike other consistent-hashing schemes; done to improve scalability. • assumes each machine can "get by" communicating with only a few other machines.
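To illustrate the O(K/N) claim, here is a minimal consistent-hashing sketch of my own (not the paper's code, and with toy parameters: a 16-bit identifier space and made-up node/key names). Keys and node IDs are hashed onto the same circle, each key is stored at its successor, and adding one node only takes over the keys that fall between the new node and its predecessor.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits for this toy example (the paper uses SHA-1's 160)

def chord_id(name: str) -> int:
    """Hash a node name or key onto the m-bit identifier circle."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)

def successor(node_ids, key_id):
    """First node clockwise from key_id, wrapping around the circle."""
    idx = bisect_left(node_ids, key_id)
    return node_ids[idx % len(node_ids)]

nodes = sorted(chord_id(f"node-{i}") for i in range(50))
keys = [chord_id(f"key-{i}") for i in range(10_000)]

before = {k: successor(nodes, k) for k in keys}
nodes_after = sorted(nodes + [chord_id("node-new")])   # one node joins
after = {k: successor(nodes_after, k) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys moved (about K/N = {len(keys)//len(nodes)})")
```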
The Hash Function (p.5-6) • Each node and key is assigned an m-bit identifier (using SHA-1 or a similar hash of the node's ID or of the key). Figure 1. Nodes with IDs 0, 1, and 3; m is 3. Each key in {1, 2, 6} is stored at successor(k). Consistent hashing transfers keys in response to a join or leave with minimal disruption.
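A small sketch of my own that reproduces the successor rule of Figure 1, using the figure's identifiers directly rather than real SHA-1 output:

```python
M = 3                      # identifier bits in Figure 1
RING = 2 ** M              # identifier circle of size 8
NODES = sorted([0, 1, 3])  # node identifiers from the figure

def successor(k: int) -> int:
    """First node whose ID is >= k, wrapping around the circle."""
    k %= RING
    for n in NODES:
        if n >= k:
            return n
    return NODES[0]  # wrapped past the largest node ID

for key in (1, 2, 6):
    print(f"key {key} is stored at node {successor(key)}")
# key 1 -> node 1, key 2 -> node 3, key 6 -> node 0 (wraps around)
```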
Scalable key location (p.7) • What routing information gets maintained? • A "finger table": an m-entry routing table. Figure 2. (a) Finger table of node n=1; (b) keys and fingers for nodes 0, 1, and 3. What happens when a node doesn't know the successor of k? It searches its finger table for the closest finger preceding k and forwards the query there.
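A sketch of my own, assuming the paper's finger definition (the i-th finger of node n is successor((n + 2^(i-1)) mod 2^m)), showing the finger table of node 1 from Figure 2 and the "closest preceding finger" step used to forward a lookup:

```python
M = 3
RING = 2 ** M
NODES = sorted([0, 1, 3])

def successor(k: int) -> int:
    """First node whose ID is >= k mod 2^m, wrapping around the circle."""
    k %= RING
    return next((n for n in NODES if n >= k), NODES[0])

def finger_table(n: int):
    """finger[i] = successor((n + 2^(i-1)) mod 2^m) for i = 1..m."""
    return [successor((n + 2 ** (i - 1)) % RING) for i in range(1, M + 1)]

def in_open_interval(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b on the identifier circle."""
    return (a < x < b) if a < b else (x > a or x < b)

def closest_preceding_finger(n: int, k: int) -> int:
    """Highest finger of n that precedes k; the next hop for a lookup of k."""
    for f in reversed(finger_table(n)):
        if in_open_interval(f, n, k):
            return f
    return n

print(finger_table(1))                 # [3, 3, 0], matching Figure 2(a)
print(closest_preceding_finger(3, 1))  # node 3 forwards a lookup for key 1 to node 0
```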
Node joins and departures (p.11) • Main challenge: preserving the ability to locate every key • Each node's finger table must be correctly filled • Each key k must be stored at node successor(k) Figure 4. (a) After node 6 joins the network, it acquires keys from its successor; (b) when node 1 leaves the network, it transfers its keys to its successor.
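A minimal sketch of the key hand-off in Figure 4 (my own illustration; finger-table updates and failure handling are omitted): a joining node takes over the keys in (predecessor, new node] from its successor, and a departing node hands all of its keys to its successor.

```python
# Toy illustration of the key hand-off in Figure 4 (not the paper's code):
# each node keeps a dict of the key/value pairs it is responsible for.

def between(x, a, b):
    """True if x is in the half-open circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.store = {}

def join(new, succ, pred_id):
    """New node takes over keys in (pred_id, new.id] from its successor."""
    for k in list(succ.store):
        if between(k, pred_id, new.id):
            new.store[k] = succ.store.pop(k)

def leave(departing, succ):
    """Departing node hands all of its keys to its successor."""
    succ.store.update(departing.store)
    departing.store.clear()

n0, n1, n3 = Node(0), Node(1), Node(3)
n0.store, n1.store, n3.store = {6: "v6"}, {1: "v1"}, {2: "v2"}

n6 = Node(6)
join(n6, succ=n0, pred_id=3)   # Figure 4(a): key 6 moves from node 0 to node 6
leave(n1, succ=n3)             # Figure 4(b): key 1 moves from node 1 to node 3
print(n6.store, n3.store, n0.store)  # {6: 'v6'} {2: 'v2', 1: 'v1'} {}
```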
Handling concurrent operations & failures (p.12-14) • Concurrent joins: • Multiple joining nodes might notify the same predecessor that they are its new successor • notify ensures the newcomer with the lowest ID succeeds • Nodes periodically check their neighbors with stabilize • Additional periodic functions ensure finger tables stay correct • Failures: • When node n fails: • Nodes whose tables include n must find n's successor • n's successor must ensure it has a copy of n's key/value pairs • stabilize will observe and report a node's failure
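The stabilize/notify repair loop can be sketched roughly as follows (a simplified, single-process version of my own using in-memory node objects; the real protocol runs these as periodic remote calls between hosts, and the TR adds further machinery for fully concurrent joins and failure detection):

```python
# Simplified sketch of stabilize/notify pointer repair (not the paper's code).

def between(x, a, b):
    """True if x is in the open circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.successor = self
        self.predecessor = None

    def stabilize(self):
        """Ask the successor for its predecessor; adopt it if it is closer."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """A node claims to be our predecessor; accept it if it is closer."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

# Node 6 joins an existing two-node ring {0, 3} by pointing at its successor;
# a few stabilize rounds repair the successor/predecessor pointers.
n0, n3, n6 = Node(0), Node(3), Node(6)
n0.successor, n3.successor = n3, n0
n0.predecessor, n3.predecessor = n3, n0
n6.successor = n0

for node in (n6, n3, n0):
    node.stabilize()
print(n3.successor.id, n0.predecessor.id)  # 6 6: node 6 is now in the ring
```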
Paper contains additional details: • Theoretical Analysis (p.14-15) • Proofs: • the number of nodes that must be contacted to find a successor is O(log N) • it is likely that every node is a finger of O(log² N) nodes • * it is likely that a successor query returns the closest living successor • * O(log N) expected time to resolve a query • * denotes results that hold when each node fails with probability ½ • Simulation Results (p.16-19) • Protocol implementation and simulator • Load balancing • Path length • Simultaneous node failures • Chord System Implementation (p.20-21) • Location table • Experimental results
Summary • Many distributed applications need to determine the node that stores a given data item • Chord does this in an efficient, decentralized manner • Each node maintains routing information for O(log N) other nodes, and lookups require O(log N) messages • Updates to routing information on a join or leave require O(log² N) messages • Theoretical analysis, simulation studies with up to 10,000 nodes, and experimental results show that the protocol is relatively efficient.