Chord

Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google, Inc. OSDI 2006

Introduction • Dynamo stores objects associated with a key through a simple interface: • get(), put() • It should be possible to scale Dynamo incrementally • This requires the ability to partition data over the set of nodes (storage hosts) • Dynamo relies on a concept called consistent hashing • The approach they used is similar to that found in Chord.

Distributed Hash Tables (DHT) • Operationally like standard hash tables • Stores (key, value) pairs • The key is like a filename • The value can be file contents or pointer to location • Goal: Efficiently insert/lookup/delete (key,value) pairs • Each peer stores a subset of (key, value) pairs in the system

DHT • Core operation: Find node responsible for a key • Map key to node • Efficiently route insert/lookup/delete request to this node • Allow for frequent node arrivals and departures

DHT • Introduce a hash function to map the object being searched for to a unique global identifier: • e.g., h(“NGC’02 Tutorial Notes”) → 8045 • Distribute the range of the hash function among all nodes in the network • Each node must “know about” at least one copy of each object that hashes within its range (when one exists) [Figure: network nodes labelled with the hash ranges they cover (0-999, 1000-1999, 1500-4999, 4500-6999, 7000-8500, 8000-8999, 9000-9500, 9500-9999); the object with ID 8045 is marked.]

DHT: Desirable Properties • Key ID space (search space) is uniformly populated • Mapping of keys to IDs using (consistent) hashing • A node is responsible for indexing all the keys in a certain subspace of the ID space • Nodes have only partial knowledge of other nodes’ responsibilities • Messages should be routed to a node efficiently (small number of hops) • Node arrival/departure should only affect a few nodes.

Consistent Hashing • The main idea: map both keys and nodes (node IPs) to the same (metric) ID space

Consistent Hashing • The ring is just one possibility for the ID space; any metric space will do.

Consistent Hashing • With high probability, the hash function balances load (all nodes receive roughly the same number of keys). • With high probability, when the Nth node joins (or leaves) the network, only an O(1/N) fraction of the keys are moved to a different location. • This is clearly the minimum necessary to maintain a balanced load.

Consistent Hashing • The consistent hash function assigns each node and key an m-bit identifier using SHA-1 as a base hash function. • A node’s identifier is chosen by hashing the node’s IP address. • A key identifier is produced by hashing the key. • For more info see: • D. R. Karger, E. Lehman, F. Leighton, M. Levine, D. Lewin, and R. Panigrahy, “Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web,” in Proc. 29th ACM Symp. Theory of Computing, El Paso, TX, May 1997, pp. 654–663.
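The consistent hashing slides above can be illustrated with a small experiment. This is only a sketch under stated assumptions: the node IPs and key names are made up, the identifier length M = 32 is shortened for readability, and reducing the SHA-1 digest modulo 2^M is one possible way of obtaining an m-bit identifier. It hashes nodes and keys into the same ID space, adds one node, and lists the keys whose owner changed.

```python
import hashlib
from bisect import bisect_left

M = 32  # identifier bits (assumed value for this sketch)

def chord_id(name: str, m: int = M) -> int:
    """Hash a key or a node IP to an m-bit identifier using SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def owner(ring, ident):
    """ring is a sorted list of node IDs; the owner is the first ID >= ident (with wrap-around)."""
    return ring[bisect_left(ring, ident) % len(ring)]

nodes = [f"10.0.0.{i}" for i in range(1, 51)]   # hypothetical node IPs
keys = [f"key-{i}" for i in range(10_000)]      # hypothetical keys

before = sorted(chord_id(ip) for ip in nodes)
new_id = chord_id("10.0.1.1")                   # hypothetical joining node
after = sorted(before + [new_id])

# Keys whose responsible node differs before and after the join.
moved = [k for k in keys if owner(before, chord_id(k)) != owner(after, chord_id(k))]
print(f"{len(moved)} of {len(keys)} keys moved")
```

Every key that moves ends up on the joining node; no key is shuffled between two pre-existing nodes, which is what keeps the cost of a join small.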

P2P Middleware: Differences • Different P2P middlewares differ in: • The choice of the ID space • The structure of their network of nodes (i.e. how each node chooses its neighbors) • For each object, node(s) whose range(s) cover that object must be reachable via a “short” path • This is a major research topic

Chord • m-bit identifier space for both keys and nodes • Key identifier = SHA-1(key), e.g., key “LetItBe” → SHA-1 → ID = 50 • Node identifier = SHA-1(IP address), e.g., IP “129.100.16.93” → SHA-1 → ID = 70 • How do we assign keys to nodes?

Chord • Nodes are organized in an identifier circle based on their node identifiers • Each key is assigned to its successor node on the identifier circle, i.e., the first node whose ID is equal to or follows the key’s ID.
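The successor rule can be sketched in a few lines. The ring below (m = 6, ten nodes) is an assumed example, not data from these slides; it is chosen so that it contains the nodes N8, N14, N21 and N32 used in the later join example.

```python
from bisect import bisect_left

# Assumed example ring with 6-bit identifiers (0..63).
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

def successor(node_ids, ident):
    """Return the first node whose ID is equal to or follows ident on the circle."""
    ring = sorted(node_ids)
    idx = bisect_left(ring, ident)   # first position with ring[idx] >= ident
    return ring[idx % len(ring)]     # wrap around to the smallest ID if needed

for key in (10, 24, 38, 60):
    print(f"key {key} is stored at N{successor(RING, key)}")
```

Key 60 has no node with a higher ID, so it wraps around the circle to N1.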

Chord • Hash function ensures even distribution of nodes and keys on the circle • Range covered by node is from previous ID up to its own ID • Assume an N node network

Chord: Search Possibilities • Routing table size vs. search cost • Every peer knows every other peer: O(N) routing table size, O(1) search time • Every peer knows only its successor: O(1) routing table size, O(N) search time • The “compromise” is to have each peer know about m other peers (its fingers), where m is the number of identifier bits.

Finger Table • Let m be the number of bits in the key/node identifiers • Each node, n, maintains a routing table with at most m entries, called the finger table. • The ith entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i-1) on the identifier circle. • s = successor(n + 2^(i-1)), with arithmetic modulo 2^m • s is called the ith finger of node n

Chord: Finger Table • Finger table: finger[i] = successor(n + 2^(i-1)), where 1 ≤ i ≤ m • O(log N) table size

Chord: Finger Table • Finger table: finger[i] = successor(n + 2^(i-1))
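The finger table definition can be turned into a short sketch. The ring and m = 6 are assumed example values, and `finger_table` is a hypothetical helper name; each entry lists the index i, the start n + 2^(i-1) mod 2^m, and the node that start maps to.

```python
from bisect import bisect_left

M = 6                                             # identifier bits (assumed)
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]     # assumed example ring, sorted

def successor(ring, ident):
    """First node ID >= ident on the sorted ring, wrapping around."""
    return ring[bisect_left(ring, ident) % len(ring)]

def finger_table(n, ring=RING, m=M):
    """Return [(i, start, node)] with start = n + 2^(i-1) mod 2^m for 1 <= i <= m."""
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)
        table.append((i, start, successor(ring, start)))
    return table

for i, start, node in finger_table(8):
    print(f"finger[{i}]: start={start:2d} -> N{node}")
```

For N8 the first three fingers all point to N14, which is why a table of m entries usually holds far fewer distinct nodes.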

The Chord algorithm: Scalable node localization

Chord: Search • Assume node n is searching for key k. • Node n does the following: • Find the ith finger table entry of node n such that k ∈ [finger[i].start, finger[i+1].start), where finger[i].start = n + 2^(i-1), and take the node in that entry as the next hop • If no such entry exists, take the node in the last entry of the finger table as the next hop • The two steps are repeated at each hop until the condition in the first step is satisfied.
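A lookup can be sketched as follows. Note the assumptions: this sketch uses the closest-preceding-finger formulation from the Chord paper rather than the exact interval test on the slide, it recomputes finger tables from a global view of an assumed example ring instead of exchanging messages, and the helper names are hypothetical.

```python
from bisect import bisect_left

M = 6                                             # identifier bits (assumed)
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]     # assumed example ring, sorted

def successor(ident):
    """First node ID >= ident (mod 2^M) on the ring, wrapping around."""
    return RING[bisect_left(RING, ident % 2 ** M) % len(RING)]

def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b          # the interval wraps past zero

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return x == b or in_open(x, a, b)

def fingers(n):
    """Finger nodes of n, for i = 1..M."""
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]

def find_successor(n, key, path=None):
    """Return (node responsible for key, list of nodes visited), starting at node n."""
    path = (path or []) + [n]
    succ = fingers(n)[0]
    if in_half_open(key, n, succ):
        return succ, path
    for f in reversed(fingers(n)):          # farthest finger that still precedes key
        if in_open(f, n, key):
            return find_successor(f, key, path)
    return succ, path

owner, path = find_successor(8, 54)
print(owner, path)   # 56 [8, 42, 51]
```

Each hop at least halves the remaining distance to the key, which is where the O(log N) hop count comes from.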

Chord: Join • Nodes can join (and leave) at any time. • Challenge: Preserving the ability to locate every key in the network • Chord must preserve the following: • Each node’s successor is correctly maintained • For every key k, node successor(k) is responsible for k. • For lookups to be fast, it is desirable for the finger tables to be correct.

Chord: Join Implementation • Each node in Chord maintains a predecessor pointer. • This consists of the Chord ID and IP address of the immediate predecessor of that node. • It can be used to walk counterclockwise around the identifier circle. • The new node to be added learns the identity of an existing Chord node by some external mechanism

Chord: Join Initialization Steps • Assume n is the node to join. • Find any existing node, n’. • Find the successor of n from n’. Label this successor(n). • Ask successor(n) for its predecessor. This is labelled as predecessor(successor(n)).

Chord: Join Example • Assume N26 wants to join; it finds N8. • N8’s finger table suggests that N26 will be “between” N21 and N32.

Chord: Join (Initialize finger table) • Node n needs to have its finger table initialized • Node n can ask its predecessor for its finger table and use it as a starting point

Chord: Join (Changing Existing Finger Tables) • Node n needs to be entered into the finger tables of some existing nodes. • Node n becomes the ith finger of node p, iff • p precedes n by at least 2^(i-1); and • The ith finger of node p succeeds n. • The first node, p, that satisfies these conditions is the immediate predecessor of n - 2^(i-1) • For a given n, the algorithm starts with the ith finger of node n and then continues to walk in the counter-clockwise direction on the identifier circle until it encounters a node whose ith finger precedes n.

Chord: Join Example (add N26) • [Figure: N21’s old and new finger tables] • i=1: Does N21 precede N26 by at least 1 (2^(i-1))? Yes: the N21+1 entry becomes N26 • i=2: Does N21 precede N26 by at least 2? Yes: the N21+2 entry becomes N26 • i=3: Does N21 precede N26 by at least 4? Yes: the N21+4 entry becomes N26 • i=4: Does N21 precede N26 by at least 8? No: evaluate N14

Chord: Join Example (add N26) • [Figure: N14’s old and new finger tables] • i=4: Does N14 precede N26 by at least 8? Yes: the N14+8 entry becomes N26 • i=5: Does N14 precede N26 by at least 16? No: evaluate N8 • Etc.
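The two conditions for updating existing finger tables can be checked mechanically. This is a sketch with assumptions: the full ring below is an example that merely contains N8, N14, N21 and N32, the check scans every node from a global view instead of walking counter-clockwise node by node, and `fingers_to_update` is a hypothetical helper name.

```python
from bisect import bisect_left

M = 6                                                # identifier bits (assumed)
OLD_RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]    # assumed ring before N26 joins

def successor(ring, ident):
    """First node ID >= ident on the sorted ring, wrapping around."""
    return ring[bisect_left(ring, ident) % len(ring)]

def fingers_to_update(new, ring=OLD_RING, m=M):
    """Map each existing node p to the finger indices i that must now point to `new`."""
    size = 2 ** m
    updates = {}
    for p in ring:
        gap = (new - p) % size                       # clockwise distance from p to the new node
        for i in range(1, m + 1):
            step = 2 ** (i - 1)
            old_finger = successor(ring, (p + step) % size)
            old_gap = (old_finger - p) % size or size
            # Condition 1: p precedes the new node by at least 2^(i-1).
            # Condition 2: p's current ith finger succeeds the new node.
            if gap >= step and old_gap > gap:
                updates.setdefault(p, []).append(i)
    return updates

print(fingers_to_update(26))
```

On this assumed ring the result agrees with the worked example for N21 (entries 1 to 3) and N14 (entry 4), and also shows the entries that change at N8 and N56.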

Chord: Join (Transferring Keys) • Move responsibility for all the keys for which node n is now the successor. • Typically this involves moving data associated with each key to the new node. • Node n can become the successor only for keys that were previously the responsibility of the node immediately following n. • Node n only needs to contact one node to transfer responsibility for all relevant keys.
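Key transfer touches only the joining node's successor, as the following sketch shows. The key IDs, values and the `transfer_keys` helper are hypothetical; the new node takes over exactly the keys in the circular interval (predecessor, new node].

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b         # the interval wraps past zero

def transfer_keys(new_id, pred_id, successor_store):
    """Move the keys in (pred_id, new_id] out of the successor's store and return them."""
    moved = {k: v for k, v in successor_store.items()
             if in_half_open(k, pred_id, new_id)}
    for k in moved:
        del successor_store[k]
    return moved

# N26 joins between N21 and N32; N32 currently holds (made-up) keys 24 and 30.
n32_store = {24: "value-a", 30: "value-b"}
n26_store = transfer_keys(new_id=26, pred_id=21, successor_store=n32_store)
print(n26_store, n32_store)
```

Key 24 now belongs to N26 while key 30 stays at N32, so no other node has to be contacted.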

Chord: Join • The previous discussion on join focuses on a single node join. • What if there are multiple node joins? • Join requires that each node’s successor is correctly maintained

Chord: Stabilization Protocol • The successor/predecessor links are rebuilt by periodic stabilize notification messages • Sent by each node to its successor to inform it of the (possibly new) identity of the predecessor • The successor pointers are used to verify and correct finger table entries.

Chord: Join/Stabilize Example

Chord: Join/Stabilize Example • N26 joins the system • N26 acquires N32 as its successor • N26 notifies N32 • N32 acquires N26 as its predecessor

Chord: Join/Stabilize Example • N26 copies keys • N21 runs stabilize() and asks its successor N32 for its predecessor which is N26.

Chord: Join/Stabilize Example • N21 acquires N26 as its successor
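The join/stabilize example can be replayed in a small in-memory simulation. This sketch follows the stabilize and notify rules as described in the Chord paper; the `Node` class, the two-node starting ring, and direct method calls standing in for network messages are assumptions made for illustration.

```python
def between(x, a, b):
    """True if x lies strictly inside the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b          # the interval wraps past zero

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def join(self, successor):
        """Join with a known successor; the predecessor is learned later via notify."""
        self.successor = successor
        self.predecessor = None

    def stabilize(self):
        """Ask the successor for its predecessor and adopt it if it sits in between."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """candidate believes it might be this node's predecessor."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

# Starting ring with only N21 and N32 (assumed for brevity).
n21, n32 = Node(21), Node(32)
n21.successor, n21.predecessor = n32, n32
n32.successor, n32.predecessor = n21, n21

n26 = Node(26)
n26.join(n32)       # N26 acquires N32 as its successor
n26.stabilize()     # N26 notifies N32; N32 acquires N26 as its predecessor
n21.stabilize()     # N21 learns about N26 from N32 and adopts it as successor
print(n21.successor.id, n26.predecessor.id, n32.predecessor.id)
```

After the two stabilize rounds the successor and predecessor pointers of N21, N26 and N32 are mutually consistent, without any node having been told about the join explicitly.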

Chord Stabilization • Pointers and finger tables may be in a state of flux • Is it possible that data will not be found? • Yes • Recovery: try again

Chord: Node Failure • [Figure: ring with nodes N10, N80, N85, N102, N113, N120 and a Lookup(90) request] • N80 doesn’t know its correct successor, so the lookup is incorrect

Chord: Node Failure • Solution: Use successor lists • Each node knows r immediate successors • After failure, will know first live successor • Stabilize messages correct finger tables • Replicas of the data associated with a key at the r successor nodes might be used • Application dependent
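The successor-list fallback can be sketched as below. The list contents, the value r = 3, the set of failed nodes, and the `is_alive` callback (which stands in for a real liveness probe) are all assumptions for illustration.

```python
def first_live_successor(successor_list, is_alive):
    """Return the first entry of the successor list that is still alive, or None."""
    for node in successor_list:
        if is_alive(node):
            return node
    return None

# Hypothetical state at N80 with r = 3, assuming N85 and N102 have failed.
successor_list = [85, 102, 113]
failed = {85, 102}
print(first_live_successor(successor_list, lambda n: n not in failed))
```

With r entries, a lookup is only stranded if all r successors fail at the same time.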

Chord Properties • In a system with N nodes and K keys, with high probability… • each node receives at most (1+ε)K/N keys • each node maintains info. about O(log N) other nodes • lookups are resolved with O(log N) hops • Insertions (node joins) require O(log^2 N) messages • The developers of Chord validated this through simulation studies. • No consistency among replicas • Hops have poor network locality

Chord: Network Locality • Nodes close on the ring can be far apart in the network. • [Figure: ring nodes N20, N40, N41, N80 and their positions in the physical network] * Figure from http://project-iris.net/talks/dht-toronto-03.ppt
