Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan
Presented by Alexei Semenov
Introduction
• Main problem with peer-to-peer applications: efficiently locating the node that stores a particular data item.
• Chord supports only one operation: given a key, it maps the key onto a node. It uses a variant of consistent hashing to assign keys to Chord nodes.
• Advantages of consistent hashing:
  • balances the load
  • little movement of keys when nodes join or leave the system
• What distinguishes Chord from many other peer-to-peer lookup protocols?
  • Simplicity
  • Provable correctness
  • Provable performance
Related Work
• Chord vs. traditional name and location services
• Freenet provides anonymity, while Chord does not
• Globe exploits network locality better than Chord
• Plaxton provides stronger guarantees than Chord
• Although Chord falls short of other services in some respects, it still performs well, does better than them in other respects, and is considerably less complicated.
System Model
• Features of Chord:
  • Load balance
  • Decentralization
  • Scalability
  • Availability
  • Flexible naming
• The Chord software takes the form of a library linked with the client and server applications that use it. The application interacts with Chord in two ways:
  • Chord provides a lookup(key) algorithm that yields the IP address of the node responsible for the key.
  • The Chord software on each node notifies the application of changes in the set of keys that the node is responsible for.
The Base Chord Protocol – Consistent Hashing (1)
• Chord uses consistent hashing, but improves its scalability by avoiding the requirement that every node know about every other node.
• The consistent hash function assigns each node and key an m-bit identifier using a base hash function such as SHA-1. A node's identifier is chosen by hashing the node's IP address, while a key identifier is produced by hashing the key.
• Consistent hashing assigns keys to nodes as follows: identifiers are ordered on an identifier circle modulo 2^m. Key k is assigned to the first node whose identifier is equal to or follows k in the identifier space. This node is called the successor node of k.
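The identifier assignment above can be sketched in a few lines of Python. This is only an illustration: the function name chord_id, the sample IP address and the sample key are hypothetical, and reducing the SHA-1 digest modulo 2^m to obtain an m-bit identifier is an assumption, not something the slides specify.

```python
import hashlib

M = 160  # identifier bits; a SHA-1 digest is 160 bits long


def chord_id(value: str, m: int = M) -> int:
    """Hash a string (node IP address or key) to an m-bit identifier.

    Assumption: for m < 160 the digest is reduced modulo 2**m.
    """
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


node_id = chord_id("192.0.2.10")  # hypothetical node IP address
key_id = chord_id("song.mp3")     # hypothetical application key
print(hex(node_id))
print(hex(key_id))
```

Nodes and keys share one identifier space, which is what lets a key be placed by comparing its identifier with node identifiers on the circle.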
The Base Chord Protocol – Consistent Hashing (2)
• Example: m = 3, with nodes 0, 1 and 3 on the identifier circle. The successor of identifier 1 is node 1, so key 1 would be located at node 1. Similarly, key 2 would be located at node 3, and key 6 at node 0.
• Consistent hashing enables nodes to enter and leave the network with minimal disruption.
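The m = 3 example can be checked with a short Python sketch. It assumes the ring holds nodes 0, 1 and 3, which is the node set consistent with the key placements stated on the slide; the function name successor is chosen here for illustration.

```python
from bisect import bisect_left


def successor(key_id, node_ids):
    """Return the first node identifier equal to or following key_id on the circle."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]  # past the largest identifier, wrap to the smallest


nodes = [0, 1, 3]  # assumed node set; m = 3, so identifiers range over 0..7
placement = {k: successor(k, nodes) for k in (1, 2, 6)}
print(placement)  # → {1: 1, 2: 3, 6: 0}
```

Key 6 wraps around the circle: no node has an identifier of 6 or 7, so the next node clockwise is node 0.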
The Base Chord Protocol – Scalable Key Location
• With consistent hashing alone, a lookup may have to traverse all nodes to find the appropriate mapping. That is why Chord maintains additional routing information.
• Each node n maintains a routing table with at most m entries, where m is the number of bits in the key/node identifiers. This table is called the finger table.
• A finger table entry includes both the Chord identifier and the IP address (and port number) of the relevant node.
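A minimal Python sketch of finger tables and finger-based lookup follows. It assumes the finger rule from the Chord paper (the i-th finger of node n is the successor of n + 2^(i-1) modulo 2^m), reuses the assumed three-node ring 0, 1, 3 with m = 3, and computes every table from a global view of the ring instead of exchanging messages. All function names are illustrative.

```python
from bisect import bisect_left

M = 3                      # identifier bits
NODES = sorted([0, 1, 3])  # assumed example ring


def successor(ident):
    """First node whose identifier is equal to or follows ident."""
    return NODES[bisect_left(NODES, ident % 2**M) % len(NODES)]


def finger_table(n):
    """At most M entries; entry i (0-based) is successor(n + 2**i)."""
    return [successor(n + 2**i) for i in range(M)]


def in_open(x, a, b):
    """True if x lies strictly inside the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero (or a == b)


def closest_preceding_finger(n, ident):
    """Farthest finger of n that still precedes ident on the circle."""
    for f in reversed(finger_table(n)):
        if in_open(f, n, ident):
            return f
    return n


def find_successor(start, ident):
    """Route from node `start` to the successor of ident; returns (node, hops)."""
    n, hops = start, 0
    while True:
        succ = finger_table(n)[0]
        if ident == succ or in_open(ident, n, succ):  # ident in (n, succ]
            return succ, hops
        nxt = closest_preceding_finger(n, ident)
        if nxt == n:  # defensive guard; not expected with correct tables
            return succ, hops
        n, hops = nxt, hops + 1


print(finger_table(0))       # → [1, 3, 0]
print(find_successor(3, 1))  # → (1, 1)
```

Each hop jumps to the farthest known node that does not overshoot the target, so a lookup never has to walk the ring one successor at a time.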
The Base Chord Protocol – Node Joins
• Nodes can leave or join at any time. Preserving the ability to locate every key in the network may present a challenge. Chord deals with this problem by making sure that:
  • Each node's successor is correctly maintained
  • For every key k, node successor(k) is responsible for k
  • Each node's predecessor is correctly maintained
• When a node n joins the network, Chord performs 3 operations:
  • Initializes the predecessor and fingers of node n
  • Updates the fingers and predecessors of existing nodes to reflect the addition of n
  • Notifies the higher-layer software so that it can transfer state associated with keys that node n is now responsible for
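The third join step, handing over keys, can be illustrated with a small Python sketch. It assumes the ring 0, 1, 3 holding keys 1, 2 and 6, and a hypothetical new node with identifier 6; the helper names assign and keys_to_transfer are made up for this example. Finger and predecessor updates are not modelled.

```python
from bisect import bisect_left


def successor(key_id, node_ids):
    """First node identifier equal to or following key_id on the circle."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, key_id) % len(ring)]


def assign(keys, node_ids):
    """Map each node to the sorted list of keys it is responsible for."""
    table = {n: [] for n in node_ids}
    for k in sorted(keys):
        table[successor(k, node_ids)].append(k)
    return table


def keys_to_transfer(keys, node_ids, new_node):
    """Return (donor, moved): the keys that move to new_node when it joins.

    The donor is new_node's successor in the old ring, the only node
    that gives up keys.
    """
    donor = successor(new_node, node_ids)
    moved = assign(keys, node_ids + [new_node])[new_node]
    return donor, moved


print(keys_to_transfer([1, 2, 6], [0, 1, 3], 6))  # → (0, [6])
```

Only one existing node is affected by the join, which is the "little movement of keys" property of consistent hashing.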
Concurrent Operations and Failures
• Stabilization
  • Needed in case of concurrent joins. A basic "stabilization" protocol keeps nodes' successor pointers up to date, which is sufficient to guarantee correctness of lookups. Successor pointers are then used to verify and correct finger table entries, which allows these lookups to be fast as well as correct.
• Failures and Replication
  • When a node n fails, nodes whose finger tables include n must find n's successor. In addition, the failure of n must not disrupt queries that are in progress.
  • Recovering from a failure requires correct successor pointers. For this purpose each Chord node maintains a "successor-list" of its r nearest successors on the Chord ring.
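The stabilization idea can be sketched in Python as follows. The stabilize and notify steps follow the pseudocode in the Chord paper as best understood here; everything else is an assumption made to keep the sketch short: nodes are in-process objects rather than networked peers, find_successor walks successor pointers linearly instead of using fingers, and the successor-list and failure handling are left out.

```python
def between(x, a, b, inclusive_right=False):
    """True if x is in the circular interval (a, b), or (a, b] if inclusive_right."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero (or a == b)


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self     # a lone node is its own successor
        self.predecessor = None

    def find_successor(self, ident):
        # Linear walk along successor pointers: slow, but correct.
        n = self
        while not between(ident, n.id, n.successor.id, inclusive_right=True):
            n = n.successor
        return n.successor

    def join(self, known):
        """Join the ring that `known` belongs to; only the successor is set."""
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        """Periodic check: adopt a closer successor if one has appeared."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, other):
        """`other` believes it might be our predecessor."""
        if self.predecessor is None or between(other.id, self.predecessor.id, self.id):
            self.predecessor = other


a, b, c = Node(0), Node(3), Node(1)
b.join(a)
c.join(a)
for _ in range(3):            # a few stabilization rounds
    for node in (a, b, c):
        node.stabilize()
print([(n.id, n.successor.id, n.predecessor.id) for n in (a, c, b)])
```

After a few rounds the pointers form the ring 0 → 1 → 3 → 0, even though neither joining node updated any other node's state directly.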
Simulation Results
• Protocol Simulator
  • Implemented in iterative style, which means that the node resolving a lookup initiates all communication. It asks a series of nodes for information from their finger tables, each time moving closer on the Chord ring to the desired successor.
• Load Balance
  • The number of keys per node exhibits large variations that increase linearly with the number of keys.
• Path Length
  • The mean path length increases logarithmically with the number of nodes.
• Simultaneous Node Failures
  • No significant lookup failures
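The path-length observation can be explored with a toy Python experiment. This is not the simulator described on the slide: it assumes 16-bit identifiers, uniformly random node identifiers, finger tables computed from a global view, and a fixed random seed, and it counts only the hops between nodes before the final answer. The printed numbers are therefore illustrative only.

```python
import random
from bisect import bisect_left

M = 16  # identifier bits (assumed small so the experiment runs quickly)


def successor(ring, ident):
    """First node whose identifier is equal to or follows ident."""
    return ring[bisect_left(ring, ident % 2**M) % len(ring)]


def in_open(x, a, b):
    """True if x lies strictly inside the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(fingers, start, ident):
    """Iterative lookup: the caller queries one node after another."""
    n, hops = start, 0
    while True:
        succ = fingers[n][0]
        if ident == succ or in_open(ident, n, succ):  # ident in (n, succ]
            return succ, hops
        nxt = next((f for f in reversed(fingers[n]) if in_open(f, n, ident)), n)
        if nxt == n:  # defensive guard; not expected with correct tables
            return succ, hops
        n, hops = nxt, hops + 1


def path_lengths(n_nodes, n_lookups=2000, seed=0):
    """Hop counts for random lookups on a random ring of n_nodes nodes."""
    rng = random.Random(seed)
    ring = sorted(rng.sample(range(2**M), n_nodes))
    fingers = {n: [successor(ring, n + 2**i) for i in range(M)] for n in ring}
    result = []
    for _ in range(n_lookups):
        start, ident = rng.choice(ring), rng.randrange(2**M)
        owner, hops = lookup(fingers, start, ident)
        assert owner == successor(ring, ident)  # sanity check on routing
        result.append(hops)
    return result


for n in (16, 64, 256):
    hops = path_lengths(n)
    print(n, round(sum(hops) / len(hops), 2))
```

Multiplying the number of nodes by four should add roughly a constant to the mean hop count, which is the logarithmic growth the slide reports.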
Experimental Results
• A prototype implementation of Chord was deployed on the Internet. Chord nodes ran at ten sites on a subset of the RON test-bed in the USA: in California, Colorado, Massachusetts, New York, North Carolina and Pennsylvania. The Chord software runs on UNIX, uses 160-bit keys obtained from the SHA-1 cryptographic hash function, and uses TCP to communicate between nodes. Chord runs in iterative style.
• The figure shows the measured latency of Chord lookups over a range of numbers of nodes.
• Lookup latency grows slowly with the total number of nodes, which is consistent with the simulation results and demonstrates Chord's scalability.
Conclusion
• Chord features simplicity, provable correctness and provable performance even when there are concurrent node arrivals and departures.
• It continues to function properly even when a node's information is only partially correct.
• It scales well with the number of nodes, recovers from large numbers of simultaneous node failures and joins, and answers most lookups correctly even during recovery.
• Chord might be valuable to peer-to-peer, large-scale distributed applications such as cooperative file sharing, time-shared available storage systems, etc.