Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan
Presented by: Manan Rawal, Bhaskar Gupta
Introduction
• Efficiently locate the node that stores the data item for a particular search key.
• Chord provides only one operation: given a key, it maps the key onto a node.
• Example applications:
  • Cooperative mirroring
  • Time-shared storage
  • Distributed indexes
  • Large-scale combinatorial search
Problems addressed
• Load balance: the distributed hash function spreads keys evenly over the nodes.
• Decentralization: fully distributed; no node is more important than any other.
• Scalability: lookup cost grows as the log of the number of nodes.
• Availability: internal tables are adjusted automatically to reflect nodes joining and leaving.
• Flexible naming: no constraints are placed on the structure of the keys.
Chord Protocol
• Assumes communication in the underlying network is both symmetric and transitive.
• Assigns keys to nodes with consistent hashing.
• The hash function balances the load across nodes.
• When the N-th node joins or leaves, only an O(1/N) fraction of the keys is moved.
Chord Protocol
• The consistent hash function assigns each node and key an m-bit identifier, using SHA-1 as the base hash function.
• A node's identifier is obtained by hashing its IP address; a key's identifier is obtained by hashing the key itself.
• Identifiers are ordered on an identifier circle modulo 2^m, called the Chord ring.
• successor(k) = the first node whose identifier is equal to or follows the identifier of k in the identifier space.
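To make the identifier assignment concrete, here is a small Python sketch (my own illustration, not from the slides; the IP addresses, key string, and the global view of the ring are assumptions) that hashes values to m-bit identifiers with SHA-1 and maps a key onto its successor node.

    import hashlib

    M = 6                      # identifier length in bits (the example ring uses m = 6)
    RING_SIZE = 2 ** M         # identifiers live on a circle modulo 2^m

    def chord_id(value: str) -> int:
        """Hash an IP address or a key string to an m-bit Chord identifier."""
        digest = hashlib.sha1(value.encode()).digest()
        return int.from_bytes(digest, "big") % RING_SIZE

    def successor(k: int, node_ids) -> int:
        """First node identifier equal to or following k on the ring (global view, for illustration only)."""
        for n in sorted(node_ids):
            if n >= k:
                return n
        return min(node_ids)   # wrap around the circle

    # Example: place a handful of hypothetical nodes on the ring and map a key onto one of them.
    nodes = sorted({chord_id(f"10.0.0.{i}") for i in range(10)})
    key = chord_id("some-file-name")
    print(f"key {key} is stored at node {successor(key, nodes)}")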
Chord Protocol
[Figure: an example identifier circle (Chord ring) with m = 6 and 10 nodes.]
Theorem
• For any set of N nodes and K keys, with high probability:
  • Each node is responsible for at most (1 + ε)K/N keys, where ε = O(log N).
  • When an (N+1)-st node joins or leaves the network, responsibility for O(K/N) keys changes hands.
Simple Key Location Scheme
[Figure: key K45 is located by following successor pointers node by node around the ring N1, N8, N14, N21, N32, N38, N42, N48 until it reaches N48, the successor of K45.]
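As an illustration of the simple scheme (not the authors' code), the Python sketch below gives each node only a successor pointer, so a query walks the ring one node at a time; the node identifiers are taken from the example figure, and the in_interval helper is an assumed name.

    def in_interval(x: int, a: int, b: int) -> bool:
        """True if x lies in the half-open circular interval (a, b]."""
        if a < b:
            return a < x <= b
        return x > a or x <= b           # the interval wraps past zero

    class SimpleNode:
        def __init__(self, node_id: int):
            self.id = node_id
            self.successor = self        # set properly once the ring is built

        def find_successor(self, key: int) -> "SimpleNode":
            # Forward the query along successor pointers until the key falls
            # between the current node and its successor.
            node = self
            while not in_interval(key, node.id, node.successor.id):
                node = node.successor
            return node.successor

    # Build the example ring (m = 6) and look up key K45 starting at N1.
    ids = [1, 8, 14, 21, 32, 38, 42, 48]
    nodes = [SimpleNode(i) for i in ids]
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        a.successor = b
    print(nodes[0].find_successor(45).id)    # -> 48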
Scalable Lookup Scheme
[Figure: finger table for node N8 on a ring with nodes N1, N8, N14, N21, N32, N38, N42, N48, N51, N56; fingers 1, 2 and 3 point to N14, finger 4 to N21, finger 5 to N32, and finger 6 to N42.]
finger[k] = first node that succeeds (n + 2^(k-1)) mod 2^m, for 1 <= k <= m
Scalable Lookup Scheme

    // ask node n to find the successor of id
    n.find_successor(id)
      if (id ∈ (n, successor])
        return successor;
      else
        n0 = closest_preceding_node(id);
        return n0.find_successor(id);

    // search the local table for the highest predecessor of id
    n.closest_preceding_node(id)
      for i = m downto 1
        if (finger[i] ∈ (n, id))
          return finger[i];
      return n;
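A Python sketch of the scalable lookup follows (an illustration only, assuming the finger tables are pre-built from a global view of a static ring; names such as between, FingerNode and build_ring are my own).

    M = 6
    RING = 2 ** M

    def between(x, a, b, include_right=False):
        """Circular interval test: is x in (a, b), or (a, b] if include_right is set?"""
        if a == b:
            return x != a or include_right
        if a < b:
            return a < x < b or (include_right and x == b)
        return x > a or x < b or (include_right and x == b)

    class FingerNode:
        def __init__(self, node_id):
            self.id = node_id
            self.successor = None
            self.finger = [None] * (M + 1)          # finger[1..M]; finger[0] unused

        def find_successor(self, key):
            if between(key, self.id, self.successor.id, include_right=True):
                return self.successor
            n0 = self.closest_preceding_node(key)
            return n0.find_successor(key)

        def closest_preceding_node(self, key):
            # Scan the finger table from the farthest entry down.
            for i in range(M, 0, -1):
                if self.finger[i] is not None and between(self.finger[i].id, self.id, key):
                    return self.finger[i]
            return self

    def build_ring(ids):
        nodes = {i: FingerNode(i) for i in ids}
        ordered = sorted(ids)
        def succ(k):
            for i in ordered:
                if i >= k:
                    return nodes[i]
            return nodes[ordered[0]]                # wrap around
        for n in nodes.values():
            n.successor = succ((n.id + 1) % RING)
            for k in range(1, M + 1):
                n.finger[k] = succ((n.id + 2 ** (k - 1)) % RING)
        return nodes

    ring = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
    print(ring[8].find_successor(54).id)            # -> 56, as in the lookup(54) example on the next slide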
Lookup Using Finger Table
[Figure: lookup(54) issued at node N8 is resolved via finger-table hops N8 → N42 → N51 → N56 on the example ring.]
Scalable Lookup Scheme
• Each finger-table hop forwards the query at least halfway along the identifier distance remaining to the target.
• Theorem: With high probability, the number of nodes that must be contacted to find a successor in an N-node network is O(log N).
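A compact sketch of the intuition behind the theorem (a paraphrase of the standard argument, not text from the slides), written as a chain of bounds:

    \text{distance remaining after } t \text{ finger hops} \;\le\; \frac{2^m}{2^t}
    \;\Longrightarrow\;
    \text{after } t = \log_2 N \text{ hops the remaining interval has length at most } \frac{2^m}{N},
    \text{which with high probability contains only } O(\log N) \text{ node identifiers, each crossed in one further hop.}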
Dynamic Operations and Failures
Need to deal with:
• Node joins and stabilization
• Impact of node joins on lookups
• Failure and replication
• Voluntary node departures
Node Joins and Stabilization
• Each node's successor pointer must be kept up to date, since it is what lookups rely on for correctness.
• Each node therefore periodically runs a "stabilization" protocol that updates finger tables and successor pointers.
Node Joins and Stabilization
• The stabilization scheme consists of six functions:
  • create()
  • join()
  • stabilize()
  • notify()
  • fix_fingers()
  • check_predecessor()
Create()
• Creates a new Chord ring.

    n.create()
      predecessor = nil;
      successor = n;
Join()
• Asks node m to find the immediate successor of n.
• Does not yet make the rest of the network aware of n; that happens through stabilization.

    n.join(m)
      predecessor = nil;
      successor = m.find_successor(n);
Stabilize()
• Called periodically to learn about newly joined nodes.
• Asks n's immediate successor for the successor's predecessor p, and checks whether p should be n's successor instead.
• Also notifies n's successor of n's existence, so that the successor may change its predecessor to n, if necessary.

    n.stabilize()
      x = successor.predecessor;
      if (x ∈ (n, successor))
        successor = x;
      successor.notify(n);
Notify()
• m thinks it might be n's predecessor.

    n.notify(m)
      if (predecessor is nil or m ∈ (predecessor, n))
        predecessor = m;
Fix_fingers()
• Called periodically to make sure that finger-table entries are correct.
• This is how new nodes initialize their finger tables, and how existing nodes incorporate new nodes into their finger tables.

    n.fix_fingers()
      next = next + 1;
      if (next > m)
        next = 1;
      finger[next] = find_successor(n + 2^(next-1));
Check_predecessor()
• Called periodically to check whether the predecessor has failed.
• If it has, the predecessor pointer is cleared, so that it can later be set again by notify().

    n.check_predecessor()
      if (predecessor has failed)
        predecessor = nil;
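To tie the six functions together, here is a compact Python sketch (my own illustration, not the authors' implementation; the alive flag stands in for a real failure detector, and between is an assumed helper for circular interval tests). Running a few rounds of stabilize/fix_fingers/check_predecessor on every node knits freshly joined nodes into a correct ring.

    M = 6

    def between(x, a, b, include_right=False):
        """Circular interval test on a ring of size 2**M: x in (a, b) or (a, b]."""
        if a == b:
            return x != a or include_right
        if a < b:
            return a < x < b or (include_right and x == b)
        return x > a or x < b or (include_right and x == b)

    class Node:
        def __init__(self, node_id):
            self.id = node_id
            self.predecessor = None
            self.successor = None
            self.finger = [None] * (M + 1)
            self.next = 0                        # finger index refreshed by fix_fingers
            self.alive = True                    # stand-in for a real failure detector

        def create(self):                        # start a brand-new ring
            self.predecessor = None
            self.successor = self

        def join(self, m):                       # join the ring that node m belongs to
            self.predecessor = None
            self.successor = m.find_successor(self.id)

        def find_successor(self, key):
            if between(key, self.id, self.successor.id, include_right=True):
                return self.successor
            n0 = self.closest_preceding_node(key)
            # Fall back to the plain successor if the finger table cannot help yet.
            return n0.find_successor(key) if n0 is not self else self.successor

        def closest_preceding_node(self, key):
            for i in range(M, 0, -1):
                f = self.finger[i]
                if f is not None and between(f.id, self.id, key):
                    return f
            return self

        def stabilize(self):                     # run periodically
            x = self.successor.predecessor
            if x is not None and between(x.id, self.id, self.successor.id):
                self.successor = x
            self.successor.notify(self)

        def notify(self, m):                     # m thinks it might be our predecessor
            if self.predecessor is None or between(m.id, self.predecessor.id, self.id):
                self.predecessor = m

        def fix_fingers(self):                   # refresh one finger entry per call
            self.next = self.next % M + 1
            self.finger[self.next] = self.find_successor((self.id + 2 ** (self.next - 1)) % (2 ** M))

        def check_predecessor(self):
            if self.predecessor is not None and not self.predecessor.alive:
                self.predecessor = None

    # Example: three nodes join and a few stabilization rounds repair all pointers.
    a, b, c = Node(1), Node(20), Node(45)
    a.create()
    b.join(a)
    c.join(a)
    for _ in range(8):
        for n in (a, b, c):
            n.stabilize(); n.fix_fingers(); n.check_predecessor()
    print([n.successor.id for n in (a, b, c)])   # -> [20, 45, 1]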
Theorem 3
• If any sequence of join operations is executed, interleaved with stabilizations, then at some time after the last join the successor pointers will form a cycle on all nodes in the network.
Stabilization Protocol
• Guarantees to add nodes to the ring in a way that preserves reachability.
• By itself, it will not correct a Chord system that has split into multiple disjoint cycles, or a single cycle that loops multiple times around the identifier space.
Impact of Node Joins on Lookups
• Correctness
  • If finger-table entries are reasonably current, a lookup finds the correct successor in O(log N) steps.
  • If successor pointers are correct but finger tables are inaccurate, lookups are still correct, just slower.
  • If successor pointers are incorrect, a lookup may fail.
Impact of Node Joins on Lookups
• Performance
  • If stabilization is complete, a lookup can be done in O(log N) time.
  • If stabilization is not complete, existing nodes' finger tables may not yet reflect the new nodes; this does not significantly affect lookup speed.
  • Newly joined nodes can affect lookup speed if their identifiers fall between the target and the target's predecessor; the lookup then has to be forwarded through the intervening nodes, one at a time.
Theorem 4
• If we take a stable network with N nodes with correct finger pointers, and another set of up to N nodes joins the network, and all successor pointers (but perhaps not all finger pointers) are correct, then lookups will still take O(log N) time with high probability.
Failure and Replication
• Correctness of the protocol relies on each node knowing its correct successor.
• To improve robustness, each node maintains a successor list of its first r successors, kept up to date by a modified version of the stabilize procedure.
• The successor list also helps higher-layer software replicate data.
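The sketch below shows one possible way to keep a short successor list and fall back to the next live entry when the immediate successor fails (an assumption about a realization, not the paper's code; R, RobustNode and refresh_successor_list are illustrative names, and Theorem 5 below uses r = O(log N)).

    R = 3   # successor-list length (illustrative)

    class RobustNode:
        def __init__(self, node_id):
            self.id = node_id
            self.successor_list = []         # nearest successors, closest first
            self.alive = True

        @property
        def successor(self):
            # The first live entry of the successor list acts as the working successor.
            for s in self.successor_list:
                if s.alive:
                    return s
            return self                      # degenerate case: alone on the ring

        def refresh_successor_list(self):
            # Adopt the successor's own list, prepend the successor, truncate to R.
            s = self.successor
            self.successor_list = ([s] + [x for x in s.successor_list if x is not s])[:R]

    # Example on part of the ring: N8's list is [N14, N21, N32]; if N14 fails,
    # lookups transparently fall back to N21.
    n8, n14, n21, n32 = (RobustNode(i) for i in (8, 14, 21, 32))
    n8.successor_list = [n14, n21, n32]
    n14.alive = False
    print(n8.successor.id)                   # -> 21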
Theorem 5
• If we use a successor list of length r = O(log N) in a network that is initially stable, and then every node fails with probability 1/2, then with high probability find_successor returns the closest living successor to the query key.
Theorem 6
• In a network that is initially stable, if every node then fails with probability 1/2, the expected time to execute find_successor is O(log N).
Voluntary Node Departures
• Departures can be treated as node failures.
• Two possible enhancements:
  • A leaving node may transfer all of its keys to its successor.
  • A leaving node may notify its predecessor and successor about each other, so that they can update their links directly.
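A minimal sketch of a graceful departure using both enhancements (illustrative only; LeavingNode and its fields are assumed names, and real nodes would do this via messages rather than direct pointer updates).

    class LeavingNode:
        def __init__(self, node_id):
            self.id = node_id
            self.keys = {}                   # key identifier -> stored value
            self.predecessor = None
            self.successor = None

        def leave(self):
            # Enhancement 1: hand all keys to the successor so the data stays reachable.
            self.successor.keys.update(self.keys)
            self.keys.clear()
            # Enhancement 2: introduce predecessor and successor to each other,
            # so they can relink without waiting for stabilization.
            self.predecessor.successor = self.successor
            self.successor.predecessor = self.predecessor

    # Example: N14 leaves a three-node ring, handing its keys to N21.
    n8, n14, n21 = LeavingNode(8), LeavingNode(14), LeavingNode(21)
    n8.successor, n14.successor, n21.successor = n14, n21, n8
    n8.predecessor, n14.predecessor, n21.predecessor = n21, n8, n14
    n14.keys = {10: "item-10"}
    n14.leave()
    print(n8.successor.id, n21.keys)         # -> 21 {10: 'item-10'}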
Conclusion
• Efficiently locating the node that stores a desired data item is a fundamental problem in P2P networks.
• The Chord protocol solves it in an efficient, decentralized manner:
  • Routing information: O(log N) nodes
  • Lookup: O(log N) nodes
  • Update: O(log^2 N) messages
• Chord also adapts dynamically to topology changes introduced while the system runs.
Future Work
• Using Chord to detect and heal partitions whose nodes know of each other:
  • Every node could know of the same small set of initial nodes.
  • Nodes could maintain long-term memory of a random set of nodes they have encountered earlier.
• A malicious set of Chord participants could present an incorrect view of the Chord ring:
  • As a defence, node n periodically asks other nodes to do a lookup for n.