Peer-to-Peer (P2P) systems DHT, Chord, Pastry, …
Unstructured vs Structured • Unstructured P2P networks allow resources to be placed at any node. The network topology is arbitrary, and growth is spontaneous. • Structured P2P networks simplify resource location and load balancing by defining a topology and rules for resource placement. • They can guarantee efficient search even for rare objects What are the rules??? Distributed Hash Table (DHT)
Hash Tables • Store arbitrary keys and satellite data (values) • put(key, value) • value = get(key) • Lookup must be fast • Compute a hash function h(key) that returns a storage cell • Chained hash table: store the key (and optional value) there
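As a sketch, the chained hash table described above might look like this in Python (class and helper names are my own; the cell count is an illustrative choice):

```python
class ChainedHashTable:
    """Minimal chained hash table: h(key) picks a cell, collisions chain."""

    def __init__(self, num_cells=8):
        self.cells = [[] for _ in range(num_cells)]

    def _cell(self, key):
        # h(key): map the key to one of the storage cells
        return hash(key) % len(self.cells)

    def put(self, key, value):
        chain = self.cells[self._cell(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))       # otherwise append to the chain

    def get(self, key):
        for k, v in self.cells[self._cell(key)]:
            if k == key:
                return v
        return None

t = ChainedHashTable()
t.put("song.mp3", "mp3-bytes")
print(t.get("song.mp3"))  # mp3-bytes
```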
What Is a DHT? • Single-node hash table: key = Hash(name) put(key, value) get(key) -> value • How do I do this across millions of hosts on the Internet? • Distributed Hash Table
Layered architecture (figure): a distributed application running on many nodes calls put(key, data) and get(key) on the distributed hash table (DHash); the DHT in turn calls lookup(key) on the lookup service (Chord), which returns the IP address of the node responsible for the key. • Application may be distributed over many nodes • DHT distributes data storage over many nodes
Why the put()/get() interface? • API supports a wide range of applications • DHT imposes no structure/meaning on keys
Distributed Hash Table • Hash table functionality in a P2P network: lookup of data indexed by keys • Key → node mapping • Assign a unique live node to each key • Find this node in the overlay network quickly and cheaply • Maintenance, optimization • Load balancing: perhaps even change the key → node mapping on the fly • Replicate entries on more nodes to increase robustness
The lookup problem (figure): a publisher at some node inserts Key=“title”, Value=MP3 data…; a client elsewhere on the Internet issues Lookup(“title”) — how does the query find the node holding the data?
Centralized lookup (Napster) (figure): the publisher registers its location with a central database via SetLoc(“title”, N4); clients send Lookup(“title”) to the DB. Simple, but O(N) state at the server and a single point of failure.
Flooded queries (Gnutella) (figure): the client floods Lookup(“title”) to its neighbors, which forward it until the publisher’s node is reached. Robust, but a large number of messages per lookup.
Routed queries (Chord, Pastry, etc.) (figure): the client’s Lookup(“title”) is routed hop by hop through the overlay directly toward the node storing the key.
Routing challenges • Define a useful key nearness metric • Keep the hop count small • Keep the tables small • Stay robust despite rapid change • Freenet: emphasizes anonymity • Chord: emphasizes efficiency and simplicity
What is Chord? What does it do? • In short: a peer-to-peer lookup service • Solves problem of locating a data item in a collection of distributed nodes, considering frequent node arrivals and departures • Core operation in most p2p systems is efficient location of data items • Supports just one operation: given a key, it maps the key onto a node
Chord Characteristics • Simplicity, provable correctness, and provable performance • Each Chord node needs routing information about only a few other nodes • Resolves lookups via messages to other nodes (iteratively or recursively) • Maintains routing information as nodes join and leave the system
Mapping onto Nodes vs. Values • Traditional name and location services provide a direct mapping between keys and values • What are examples of values? A value can be an address, a document, or an arbitrary data item • Chord can easily implement a mapping onto values by storing each key/value pair at the node to which that key maps
Napster, Gnutella etc. vs. Chord • Compared to Napster and its centralized servers, Chord avoids single points of control or failure through a decentralized design • Compared to Gnutella and its widespread use of broadcasts, Chord achieves scalability by keeping only a small amount of routing information per node
Addressed Difficult Problems (1) • Load balance: distributed hash function, spreading keys evenly over nodes • Decentralization: Chord is fully distributed, no node is more important than any other, which improves robustness • Scalability: logarithmic growth of lookup costs with the number of nodes in the network, so even very large systems are feasible
Addressed Difficult Problems (2) • Availability: chord automatically adjusts its internal tables to ensure that the node responsible for a key can always be found • Flexible naming: no constraints on the structure of the keys – key-space is flat, flexibility in how to map names to Chord keys
Chord properties • Efficient: O(log(N)) messages per lookup • N is the total number of servers • Scalable: O(log(N)) state per node • Robust: survives massive failures
The Base Chord Protocol (1) • Specifies how to find the locations of keys • How new nodes join the system • How to recover from the failure or planned departure of existing nodes
Consistent Hashing • Hash function assigns each node and key an m-bit identifier using a base hash function such as SHA-1 • ID(node) = hash(IP, Port) • ID(key) = hash(key) • Properties of consistent hashing: • Function balances load: all nodes receive roughly the same number of keys • When the Nth node joins (or leaves) the network, only an O(1/N) fraction of the keys move to a different location
Chord IDs • Key identifier = SHA-1(key) • Node identifier = SHA-1(IP, Port) • Both are uniformly distributed • Both exist in the same ID space • How to map key IDs to node IDs?
Chord IDs • Consistent hashing (SHA-1) assigns each node and object an m-bit ID • IDs are ordered on an ID circle ranging from 0 to 2^m − 1 • New nodes assume slots on the ID circle according to their ID • Key k is assigned to the first node whose ID ≥ k, i.e. successor(k)
Consistent hashing (figure; circular 7-bit ID space; notation: K5 = key 5, N105 = node 105): a key is stored at its successor, the node with the next-higher ID. With nodes N32, N90, N105: K5 and K20 are stored at N32, and K80 at N90.
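The successor rule from the figure can be checked with a small Python sketch (the helper names are mine; truncating SHA-1 to m bits is an illustrative assumption — the slides only say IDs are m-bit SHA-1 values):

```python
import hashlib
from bisect import bisect_left

def chord_id(name, m=7):
    # SHA-1 hash of the name, truncated to an m-bit identifier
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** m)

def successor(key_id, node_ids):
    # first node whose ID >= key_id, wrapping around the ID circle
    nodes = sorted(node_ids)
    i = bisect_left(nodes, key_id)
    return nodes[i] if i < len(nodes) else nodes[0]

nodes = [32, 90, 105]           # N32, N90, N105 from the figure
print(successor(5, nodes))      # 32: K5 is stored at N32
print(successor(20, nodes))     # 32: K20 is stored at N32
print(successor(80, nodes))     # 90: K80 is stored at N90
print(successor(120, nodes))    # 32: wraps around the circle
```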
Successor Nodes (figure; identifier circle with m = 3, nodes 0, 1, 3 and keys 1, 2, 6): successor(1) = 1, successor(2) = 3, successor(6) = 0.
Node Joins and Departures (figure): when node 7 joins, successor(6) = 7; when node 1 leaves, successor(1) = 3.
Consistent Hashing – Join and Departure • When a node n joins the network, certain keys previously assigned to n’s successor now become assigned to n. • When node n leaves the network, all of its assigned keys are reassigned to n’s successor.
Consistent Hashing – Node Join (figure): the joining node takes over the keys in the interval between its predecessor and itself; all other key assignments are unchanged.
Consistent Hashing – Node Departure (figure): the departing node’s keys are reassigned to its successor; all other key assignments are unchanged.
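A quick way to see which keys move on a join is to recompute the key → node placement before and after, here with the m = 3 example ring (the `placement` helper is my own):

```python
from bisect import bisect_left

def successor(key_id, node_ids):
    # first node whose ID >= key_id, wrapping around the circle
    nodes = sorted(node_ids)
    i = bisect_left(nodes, key_id)
    return nodes[i] if i < len(nodes) else nodes[0]

def placement(keys, node_ids):
    # map every key to the node responsible for it
    return {k: successor(k, node_ids) for k in keys}

keys = [1, 2, 6]                            # keys from the m = 3 examples
before = placement(keys, [0, 1, 3])
after_join = placement(keys, [0, 1, 3, 6])  # node 6 joins the ring
moved = [k for k in keys if before[k] != after_join[k]]
print(before)      # {1: 1, 2: 3, 6: 0}
print(after_join)  # {1: 1, 2: 3, 6: 6}
print(moved)       # [6] -- only keys in (3, 6] change hands
```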
Scalable Key Location • A very small amount of routing information suffices to implement consistent hashing in a distributed environment • Each node need only be aware of its successor node on the circle • Queries for a given identifier can be passed around the circle via these successor pointers
A Simple Key Lookup • Pseudocode for finding the successor of id: // ask node n to find the successor of id n.find_successor(id) if (id ∈ (n, successor]) return successor; else // forward the query around the circle return successor.find_successor(id);
Scalable Key Location • This resolution scheme is correct, BUT inefficient: • it may require traversing all N nodes!
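The successor-pointer walk above can be simulated to see the linear cost (a sketch; `find_successor_linear` and the hop counter are my own additions):

```python
def find_successor_linear(start, key_id, succ):
    """Walk successor pointers until reaching key_id's successor.
    succ maps each node ID to its successor's ID on the circle."""
    def in_half_open(x, a, b):
        # x in (a, b] on the identifier circle (handles wrap-around)
        return (a < x <= b) if a < b else (x > a or x <= b)

    n, hops = start, 0
    while not in_half_open(key_id, n, succ[n]):
        n = succ[n]            # forward the query around the circle
        hops += 1
    return succ[n], hops

# 3-bit ring from the figures: nodes 0, 1, 3
succ = {0: 1, 1: 3, 3: 0}
print(find_successor_linear(0, 6, succ))  # (0, 2): forwarded 0 -> 1 -> 3
```

In an N-node ring the query may be forwarded up to N − 1 times, which motivates the finger tables introduced next in the slides.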
Acceleration of Lookups • Lookups are accelerated by maintaining additional routing information • Each node maintains a routing table with (at most) m entries (where m is the number of bits in the identifiers), called the finger table • The i-th entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i−1) on the identifier circle • s = successor(n + 2^(i−1)) (all arithmetic mod 2^m) • s is called the i-th finger of node n, denoted by n.finger(i).node
Scalable Key Location – Finger Tables (figure; m = 3, nodes 0, 1, 3; start_i = (n + 2^(i−1)) mod 2^3): Node 0 (holds key 6): start 1, interval [1,2), succ. 1 | start 2, [2,4), succ. 3 | start 4, [4,0), succ. 0 Node 1 (holds key 1): start 2, [2,3), succ. 3 | start 3, [3,5), succ. 3 | start 5, [5,1), succ. 0 Node 3 (holds key 2): start 4, [4,5), succ. 0 | start 5, [5,7), succ. 0 | start 7, [7,3), succ. 0
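The example tables can be reproduced with a few lines of Python (a sketch; helper names are mine):

```python
M = 3  # 3-bit identifier space; nodes 0, 1, 3 as in the example

def successor(x, nodes):
    # first node whose ID >= x, wrapping around the circle
    for n in sorted(nodes):
        if n >= x:
            return n
    return min(nodes)

def finger_table(n, nodes, m=M):
    # i-th finger: successor of start_i = (n + 2^(i-1)) mod 2^m
    return [successor((n + 2 ** (i - 1)) % 2 ** m, nodes)
            for i in range(1, m + 1)]

nodes = [0, 1, 3]
for n in nodes:
    print(n, finger_table(n, nodes))
# 0 [1, 3, 0]   (starts 1, 2, 4)
# 1 [3, 3, 0]   (starts 2, 3, 5)
# 3 [0, 0, 0]   (starts 4, 5, 7)
```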
Finger Tables - characteristics • Each node stores information about only a small number of other nodes, and knows more about nodes closely following it than about nodes farther away • A node’s finger table generally does not contain enough information to determine the successor of an arbitrary key k • Repeated queries to nodes that immediately precede the given key eventually lead to the key’s successor
Node Joins – with Finger Tables (figure; node 6 joins the ring {0, 1, 3} and takes over key 6): Node 6’s new table: start 7, [7,0), succ. 0 | start 0, [0,2), succ. 0 | start 2, [2,6), succ. 3 Updated entries at existing nodes: node 0’s start-4 finger → 6; node 1’s start-5 finger → 6; node 3’s start-4 and start-5 fingers → 6
Node Departures – with Finger Tables (figure; node 1 leaves the ring {0, 1, 3, 6}; its key 1 is reassigned to node 3): Node 0: start 1 → succ. 3 | start 2 → succ. 3 | start 4 → succ. 6 Node 3 (now holds keys 1, 2): start 4 → succ. 6 | start 5 → succ. 6 | start 7 → succ. 0 Node 6 (holds key 6): start 7 → succ. 0 | start 0 → succ. 0 | start 2 → succ. 3
Simple Key Location Scheme (figure): a query for K45 starting at N8 is forwarded around the circle via successor pointers (N8 → N14 → N21 → N32 → N38 → N42) until it reaches N48, the successor of K45.
Scalable Lookup Scheme (figure; finger table for N8, m = 6, ring N1, N8, N14, N21, N32, N38, N42, N48, N51, N56): finger[k] = first node that succeeds (n + 2^(k−1)) mod 2^m. Fingers 1, 2, 3 (starts 9, 10, 12) → N14; finger 4 (start 16) → N21; finger 5 (start 24) → N32; finger 6 (start 40) → N42.
Chord key location • Look up in the finger table the furthest node that precedes the key • → O(log N) hops
Scalable Lookup Scheme // ask node n to find the successor of id n.find_successor(id) n0 = find_predecessor(id); return n0.successor; // ask node n to find the predecessor of id n.find_predecessor(id) n0 = n; while (id ∉ (n0, n0.successor]) n0 = n0.closest_preceding_finger(id); return n0; // return closest finger preceding id n.closest_preceding_finger(id) for i = m downto 1 if (finger[i].node ∈ (n, id)) return finger[i].node; return n;
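Here is a runnable sketch of finger-based routing on the N8/N56 example ring (an iterative, single-process simulation with helper names of my own; the real protocol does this via messages between nodes):

```python
M = 6  # 6-bit IDs; ring from the lookup figure
nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

def in_open(x, a, b):       # x in (a, b) on the circle
    return (a < x < b) if a < b else (x > a or x < b)

def in_half_open(x, a, b):  # x in (a, b] on the circle
    return (a < x <= b) if a < b else (x > a or x <= b)

def successor_of(x):        # first node with ID >= x, wrapping around
    for n in nodes:
        if n >= x:
            return n
    return nodes[0]

def succ(n):                # immediate successor on the ring
    i = nodes.index(n)
    return nodes[(i + 1) % len(nodes)]

def fingers(n):             # finger[k] = successor((n + 2^(k-1)) mod 2^M)
    return [successor_of((n + 2 ** (k - 1)) % 2 ** M)
            for k in range(1, M + 1)]

def closest_preceding_finger(n, key_id):
    # scan fingers from farthest to nearest, as in the pseudocode
    for f in reversed(fingers(n)):
        if in_open(f, n, key_id):
            return f
    return n

def find_successor(start, key_id, path):
    n = start
    while not in_half_open(key_id, n, succ(n)):
        n = closest_preceding_finger(n, key_id)
        path.append(n)
    return succ(n)

path = [8]
print(find_successor(8, 54, path), path)  # 56 [8, 42, 51]
```

Each hop jumps via the farthest finger that still precedes the key, which is why the route N8 → N42 → N51 reaches key 54’s successor N56 in O(log N) steps.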
Lookup Using Finger Table (figure): lookup(54) issued at N8 is routed N8 → N42 → N51; N51’s successor N56 is responsible for key 54.
Scalable Lookup Scheme • Each node forwards the query at least halfway along the distance remaining to the target • Theorem: With high probability, the number of nodes that must be contacted to find a successor in an N-node network is O(log N)
“Finger table” allows log(N)-time lookups (figure): from N80’s viewpoint, successive fingers point ½, ¼, 1/8, 1/16, 1/32, 1/64, 1/128 of the way around the circle, so each hop at least halves the remaining distance to the target.
Dynamic Operations and Failures Need to deal with: • Node Joins and Stabilization • Impact of Node Joins on Lookups • Failure and Replication • Voluntary Node Departures
Node Joins and Stabilization • Each node’s successor pointer must be kept up to date • Required for correctly executing lookups • Each node periodically runs a “stabilization” protocol • Updates finger tables and successor pointers
Node Joins and Stabilization • The protocol comprises six functions: • create() • join() • stabilize() • notify() • fix_fingers() • check_predecessor()
Create() • Creates a new Chord ring // create a new ring containing only n n.create() predecessor = nil; successor = n;
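A single-process sketch of create/join/stabilize/notify (no RPC, no fix_fingers or failure handling; the class and helper names are my own, and node IDs are chosen for illustration):

```python
def between(x, a, b):  # x in (a, b) on the identifier circle
    return (a < x < b) if a < b else (x > a or x < b)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.predecessor = None
        self.successor = self

    def create(self):           # start a new ring containing only this node
        self.predecessor = None
        self.successor = self

    def join(self, ring_node):  # join via any node already in the ring
        self.predecessor = None
        self.successor = ring_node.find_successor(self.id)

    def find_successor(self, ident):
        n = self
        while not (between(ident, n.id, n.successor.id)
                   or ident == n.successor.id):
            n = n.successor
        return n.successor

    def stabilize(self):        # periodically verify our successor
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, n):        # n thinks it might be our predecessor
        if self.predecessor is None or between(n.id, self.predecessor.id, self.id):
            self.predecessor = n

a, b, c = Node(8), Node(32), Node(48)
a.create()
b.join(a)
c.join(a)
for _ in range(3):              # a few stabilization rounds settle pointers
    for n in (a, b, c):
        n.stabilize()
print([n.successor.id for n in (a, b, c)])  # [32, 48, 8]
```

After the stabilization rounds the successor pointers form the ring 8 → 32 → 48 → 8, even though both joins went through the same bootstrap node.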