Degree-Optimal Deterministic Routing for P2P Systems • Dipartimento di Informatica e Applicazioni “R.M. Capocelli”, Università di Salerno, 84081 Baronissi (SA), Italy • WEBMINDS 2005 Meeting, Salerno, June 20-22, Università di Salerno - GL7
Outline • P2P and DHT • DHT performance metrics • Greedy Routing vs Non-Greedy Routing • Neighbor of Neighbor Routing algorithm • The Small World Phenomenon • Our Proposal: H-Networks • Conclusions
Distributed Hash Table (DHT) • Distributed version of a hash table data structure • Stores (key, value) pairs • The key is like a filename • The value can be file contents • Goal: Efficiently insert/lookup/delete (key, value) pairs • Each peer stores a subset of (key, value) pairs in the system • Core operation: Find node responsible for a key • Map key to node • Efficiently route insert/lookup/delete requests to this node
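The core operation on this slide, mapping a key to the node responsible for it, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `key_id`, `responsible_node` and `ToyDHT` are hypothetical, SHA-1 is an assumed hash function, and the "first node at or after the key" rule is borrowed from Chord.

```python
import hashlib
from bisect import bisect_left

def key_id(key, m=16):
    """Hash a key (e.g. a filename) onto the identifier space [0, 2^m)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << m)

def responsible_node(kid, node_ids):
    """Return the first node identifier at or after kid, wrapping around the ring."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, kid) % len(ring)]

class ToyDHT:
    """Each peer stores only the (key, value) pairs it is responsible for."""

    def __init__(self, node_ids, m=16):
        self.m = m
        self.stores = {n: {} for n in node_ids}

    def _owner(self, key):
        return responsible_node(key_id(key, self.m), list(self.stores))

    def insert(self, key, value):
        self.stores[self._owner(key)][key] = value

    def lookup(self, key):
        return self.stores[self._owner(key)].get(key)

    def delete(self, key):
        self.stores[self._owner(key)].pop(key, None)
```

In a real DHT the `_owner` step is the routed part: no peer knows the whole ring, so the request is forwarded hop by hop, which is what the remaining slides analyze.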
DHT performance metrics • Three performance metrics: • Routing table size (degree): the storage cost; it also measures the cost of self-stabilization when adapting to node joins/leaves • Diameter and average path length: the time cost • Fault tolerance
Chord • Chord uses a one-dimensional circular key space (ring) of N = 2^m identifiers • The node responsible for a key is the node whose identifier most closely follows the key • Chord maintains two sets of neighbors: • A successor list of k nodes that immediately follow it in the key space (routing correctness) • A finger list of m = log N nodes spaced exponentially around the key space (routing efficiency) • Routing consists of forwarding to the node closest to, but not past, the key • Performance: • Diameter: log N (O(log n) whp), where n denotes the number of nodes present in the network • Routing table size: log N (O(log n) whp) • Average path length: ½ log N
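Chord finger table of node 8 (m = 6)
Index | ID        | Responsible node
1     | 8+1 = 9   | 14
2     | 8+2 = 10  | 14
3     | 8+4 = 12  | 14
4     | 8+8 = 16  | 21
5     | 8+16 = 24 | 24
6     | 8+32 = 40 | 42
(Figure: ring with nodes 14, 21, 24, 32, 38, 42; labels: Successors, Predecessor, Node 1.)
(FIRB Meeting, Genoa, July 5-6, 2004)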
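The finger list can be recomputed with a short sketch. The node set below is an assumption read off the labels of the ring figure (with node 8 added as the table's owner), and the helper names `successor` and `finger_table` are hypothetical.

```python
def successor(ident, nodes, m):
    """Node responsible for ident: the first node at or after it on the ring."""
    ident %= 1 << m
    ring = sorted(nodes)
    for node in ring:
        if node >= ident:
            return node
    return ring[0]  # wrap around the ring

def finger_table(x, nodes, m):
    """Rows (index, x + 2^(index-1), responsible node) for index = 1..m."""
    rows = []
    for i in range(m):
        start = (x + (1 << i)) % (1 << m)
        rows.append((i + 1, start, successor(start, nodes, m)))
    return rows

# Assumed node identifiers, taken from the labels of the ring figure.
NODES = [1, 8, 14, 21, 24, 32, 38, 42]

for index, start, node in finger_table(8, NODES, 6):
    print(index, start, node)
```

The second row starts at 8 + 2 = 10, and its responsible node is still 14.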
Greedy routing: move to the neighbor that minimizes the distance to the target. (Figure: source s and target t.)
Greedy Routing in the Considered Networks • Simple: easy to understand and to implement • Local: routing occurs inside the portion of the ring delimited by source and destination • In some cases (Hypercube, Chord) it is the best we can do • Not optimal with respect to the degree
Greedy Routing in the Considered Networks • Degree is Θ(log n) • Greedy routing needs Θ(log n) hops • Lower bound is Ω(log n / log log n)
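Greedy routing and its hop count can be illustrated on a fully populated Chord ring. This is a sketch under stated assumptions: every one of the 2^m identifiers hosts a node, each node x links to x + 2^i for 0 ≤ i < m, and the function name `greedy_route` is hypothetical.

```python
def greedy_route(s, t, m):
    """Greedy path from s to t on a full Chord ring with 2^m nodes."""
    n = 1 << m
    path = [s]
    cur = s
    while cur != t:
        d = (t - cur) % n                 # clockwise distance still to cover
        jump = 1 << (d.bit_length() - 1)  # largest power of two not exceeding d
        cur = (cur + jump) % n            # closest neighbor to t, never past it
        path.append(cur)
    return path

m = 6
hops = [len(greedy_route(0, t, m)) - 1 for t in range(1 << m)]
print(max(hops), sum(hops) / len(hops))
```

Under these assumptions each hop clears one 1-bit of the remaining distance, so the worst case is m = log N hops and the average is m/2, in line with the Chord figures quoted earlier.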
Non Greedy Routing • Routing is not local • Viceroy Network • Degree: O(1) • Average Path Length: O(log N) • De Bruijn graphs • Degree: O(log N) • Average Path Length:
Neighbor of Neighbor (NoN) Greedy Routing • Let d(x,y) be a metric for the nodes in the network. • Assume the message is currently at node u ≠ target. • Let N = {v_1, v_2, …, v_k} be the neighbors of u. • For each 1 ≤ i ≤ k, let w_i1, w_i2, …, w_ik be the neighbors of v_i and let N' = {w_ij : 1 ≤ i, j ≤ k}. • Among these k^2 + k nodes, assume that z is the one closest to the target (with respect to metric d). • If z ∈ N, route the message from u to z; otherwise z = w_ij for some i and j, and we route the message from u via v_i to z. • Manku, Naor, Wieder [2004]: NoN routing takes O(log n / log log n) hops in Small World Networks. • Manku, Bawa, Raghavan [2003]: a heuristic routing algorithm in Symphony, a Small World P2P network. • Coppersmith, Gamarnik and Sviridenko [2002]: proved an upper bound on the diameter of a Small World graph.
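One step of the NoN greedy rule can be sketched as follows. The function names and the tie-breaking rule (prefer a direct neighbor when distances are equal) are assumptions of this sketch; the slide does not specify how ties are resolved. The usage example assumes a full Chord-like ring with jumps 2^i and the clockwise distance as the metric d.

```python
def non_greedy_step(u, target, neighbors, dist):
    """Return the hops ([z] or [v_i, z]) chosen by one NoN greedy step from u."""
    best_key, best_path = None, None
    for v in neighbors(u):
        # v itself, plus every neighbor w of v reached through v
        candidates = [(v, [v])] + [(w, [v, w]) for w in neighbors(v)]
        for z, path in candidates:
            key = (dist(z, target), len(path))  # closest first, then fewer hops
            if best_key is None or key < best_key:
                best_key, best_path = key, path
    return best_path

def non_route(s, t, neighbors, dist):
    """Repeat NoN greedy steps until the target is reached."""
    path = [s]
    while path[-1] != t:
        path.extend(non_greedy_step(path[-1], t, neighbors, dist))
    return path

# Usage on an assumed full ring of 64 identifiers with jumps 1, 2, 4, ..., 32.
ring_neighbors = lambda x: [(x + (1 << i)) % 64 for i in range(6)]
clockwise = lambda a, b: (b - a) % 64
print(non_route(0, 37, ring_neighbors, clockwise))
```

On a deterministic ring like this one the lookahead saves little; the benefit reported in the cited results comes from networks whose links carry randomness.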
The Small World Phenomenon • The “six degrees of separation” experiment, S. Milgram [M67]. • The sociological experiment relied on social networks to transmit a letter from a person to unfamiliar targets by passing the letter only via acquaintances. • Only a small number (around 6) of steps was needed. • Problem: Locate a resource in a ‘natural’ network based on partial information • Question: How do people find short paths? • Recent work [DRW03] shows that, in the first steps, the message was forwarded to a person P based on a guess about whom P knew or, in other words, about his/her neighbors.
Small World • Nodes are points in a two-dimensional grid • Grid edges are short range • Each edge (x, y) appears independently with probability 1/d(x,y)^2 • Degree of each node: Θ(log N)
R-Schemes • R-Chord, N = 2^m [MNW04] • For each 0 ≤ i < m, let r(i) denote an integer chosen uniformly at random from the interval [0, 2^i); node x is connected by edges to the nodes x + 2^i + r(i) • R-Hypercube [MNW04] • For each 1 ≤ i ≤ m, node x is connected with y, where y is defined as follows: the top i-1 bits of y are identical to those of x, the ith bit is flipped, and the remaining m - i bits are chosen uniformly at random • (Figure: node x and the positions 2^i and 2^(i+1).)
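The R-Chord construction can be sketched in a few lines. The function name `r_chord_neighbors` is hypothetical, and passing in a seeded `random.Random` is a choice made here only to keep the example reproducible.

```python
import random

def r_chord_neighbors(x, m, rng):
    """R-Chord links of node x: one link into each window [x + 2^i, x + 2^(i+1))."""
    n = 1 << m
    links = []
    for i in range(m):
        r_i = rng.randrange(1 << i)  # uniform integer in [0, 2^i)
        links.append((x + (1 << i) + r_i) % n)
    return links

rng = random.Random(0)
print(r_chord_neighbors(5, 8, rng))
```

Because each r(i) is private to x, another node can learn x's links only by asking x, which is why NoN routing on R-Chord needs the explicit neighbor-of-neighbor lists.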
Neighbor of Neighbor (NoN): Degree • Cost of Neighbor of Neighbor lists: • Memory: O(log^2 n) • Maintenance: O(log n) must be updated • Neighbor lists should be maintained (open connection, pinging, etc.) • “In practice, a Chord ring will never be in a stable state; instead, joins and departures will occur continuously, interleaved with the stabilization algorithm. The ring will not have time to stabilize before new changes happen.” [SMLKKDB03]
H-Networks • H-Chord • Let N = 2^m and let H() denote a good hash function. For each 0 ≤ i < m, node x is connected by edges to the nodes x + 2^i + (H(x) mod 2^i) • H-Hypercube • Let H() denote a good hash function. For each 1 ≤ i ≤ m, node x is connected with y, where y is defined as follows: the top i-1 bits of y are identical to those of x, the ith bit is flipped, and the remaining m - i bits are identical to those of H(x) • (Figure: node x and the positions 2^i and 2^(i+1).)
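The H-Chord links can be sketched as follows, which also shows why no neighbor-of-neighbor lists have to be stored: any node can recompute the links of any other node from its identifier alone. SHA-1 stands in for the unspecified "good hash function", the indices are restricted to 0 ≤ i < m so that every jump stays below N, and the function names are hypothetical; all three are assumptions of this sketch.

```python
import hashlib

def H(x, m):
    """Assumed hash function: SHA-1 of the identifier, reduced to m bits."""
    digest = hashlib.sha1(str(x).encode("ascii")).digest()
    return int.from_bytes(digest, "big") % (1 << m)

def h_chord_neighbors(x, m):
    """H-Chord links of node x: x + 2^i + (H(x) mod 2^i) for 0 <= i < m."""
    n = 1 << m
    hx = H(x, m)
    return [(x + (1 << i) + hx % (1 << i)) % n for i in range(m)]

def neighbors_of_neighbors(x, m):
    """Computed locally at x, without exchanging or storing any lists."""
    return {v: h_chord_neighbors(v, m) for v in h_chord_neighbors(x, m)}

print(h_chord_neighbors(5, 8))
print(neighbors_of_neighbors(5, 8))
```

Plugging `h_chord_neighbors` into a NoN greedy step gives the lookahead of the randomized schemes with the memory and maintenance cost of plain greedy routing.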
H-Chord • Lemma: The average path length is O(log n / log log n) hops for the NoN Greedy algorithm on H-Chord with n = 2^m nodes. • Proof, Phase I: d < n^(1/log log n) • O(log n / log log n) steps suffice to reach the destination: the distance decreases at least by a factor of ¾ at each step • Case n < 2^m: Chernoff bound • (Figure: source s and target t, with d(s,t) = d.)
H-Chord • Phase II: d > n^(1/log log n) • I is an interval of size |I| = d' = d / log d • Goal: the probability that s can reach the interval I in two hops is equal to a constant c • (Figure: source s, target t and interval I, with d(s,t) = d.)
H-Chord • Setting: d > n^(1/log log n), p = log d, |I| = d' = d / log d • s has at least p - 1 neighbors s_1, …, s_(p-1) • Claim: the probability that s_i can reach the interval I is at least d'/2^p • The probability that s can reach the interval I in two hops is equal to a constant, 1 - e^(-1) • APL = O(log n / log log n) • (Figure: source s with its p - 1 neighbors, target t and interval I, with d(s,t) = d.)
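A quick numeric check of the constant is possible under two assumptions that belong to this sketch rather than to the slide: the per-neighbor bound is read as d'/2^p with 2^p = d, which gives 1/log d = 1/p, and the p - 1 neighbors are treated as independent.

```python
import math

def two_hop_success(p):
    """P(at least one of p - 1 neighbors reaches I), each succeeding with probability 1/p."""
    return 1.0 - (1.0 - 1.0 / p) ** (p - 1)

for p in (4, 16, 64, 1024):
    print(p, round(two_hop_success(p), 4))
print("1 - 1/e =", round(1.0 - math.exp(-1.0), 4))
```

Under these assumptions the value approaches 1 - 1/e from below as p grows, so every two-hop round succeeds with roughly constant probability.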
Skip-Graphs • Each node (resource) has a name. • Nodes are arranged on a line sorted by name. • Each node x chooses a random string m(x) of bits. • An edge is established if two nodes share a prefix which is not shared by the nodes between them. • Allows prefix search. • Load balancing? • (Figure: nodes a, b, c, d, e, f, each labeled with its bit string.)
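The edge rule above can be sketched directly. The membership vectors in the example are hypothetical (they are not the ones in the slide's figure), and the brute-force pairwise check is chosen for clarity, not efficiency.

```python
def skip_graph_edges(membership):
    """membership: dict name -> bit string m(x). Returns the set of edges (a, b), a < b."""
    order = sorted(membership)  # nodes on a line, sorted by name
    edges = set()
    for i in range(len(order)):
        for j in range(i + 1, len(order)):
            a, b = order[i], order[j]
            between = order[i + 1:j]
            longest = min(len(membership[a]), len(membership[b]))
            for k in range(longest + 1):
                prefix = membership[a][:k]
                if membership[b][:k] != prefix:
                    break  # a and b share no longer prefix
                if not any(membership[c][:k] == prefix for c in between):
                    edges.add((a, b))  # shared prefix, absent from all nodes in between
                    break
    return edges

# Hypothetical two-bit membership vectors.
vectors = {"a": "00", "b": "10", "c": "01", "d": "11"}
print(sorted(skip_graph_edges(vectors)))
```

With the empty prefix (k = 0) the rule links adjacent names; longer prefixes add the longer edges that greedy routing exploits.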
Routing in Skip-Graphs • Greedy routing: use the longest edge possible. • Path length and degree are Θ(log N) w.h.p.
H-Skip-graphs • Let H() denote a good hash function. H-Skip-graphs are identical to Skip-graphs but with m(x) = H(x). • (Figure: nodes a, b, c, d, e, f, each labeled with its bit string.)
H-Skip-graphs • A node in H-Skip-graphs, in spite of using a deterministic hash function, has no way to estimate its neighbors' neighbors. • Nevertheless, with a deterministic hash function the membership vector m(t) of the target becomes available to the source, and a more efficient search is now possible. • d(x,y) = (y + 2^m - x) mod 2^m • (Figure: nodes a, b, c, d, e, f, each labeled with its bit string.)
Experimental results (figures; lower is better) • Chord: N IDs, N nodes • Chord: 2^32 IDs, N nodes • Skip-graphs: N nodes (two plots)
Conclusions • H-Networks: • Deterministic P2P networks • No additional information is transmitted nor stored: • Each node x, knowing y, can compute H(y) and then can estimate y’s neighbors. • Asymptotically optimal with respect to average path length and degree (No hidden constant) • Allows a trade-off between efficiency and maintenance • No overhead with respect to greedy routing system
Bibliography • G. Cordasco, L. Gargano, M. Hammar, and V. Scarano. "Degree-Optimal Deterministic Routing for P2P Systems". In Proc. of the 10th IEEE Symposium on Computers and Communications (ISCC 2005), La Manga del Mar Menor, Cartagena, Spain, June 27-30, 2005.
Questions?
Motivation • Peer to Peer Systems (P2P) • File sharing system; • File storage system; • Distributed file system; • Redundant storage; • Availability; • Performance; • Permanence; • Anonymity; Scalability
Uniform Routing Algorithm • We consider a ring of N identifiers labeled from 0 to N-1 • A routing algorithm is uniform if, for each identifier x, x is connected to y iff x+z is connected to y+z (i.e., all the connections are symmetric) • Advantages • Easy to implement • Greedy algorithm is optimal • Simple: easy to understand and implement • Local: routing occurs inside the portion of the ring delimited by source and destination • No node congestion • Fast bootstrap • No need to estimate n • Drawback • Less powerful (De Bruijn graphs and Neighbor of Neighbor greedy routing are more powerful; their routing is not greedy)
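Asymptotic tradeoff curve (figure): diameter versus routing table size, both axes ranging from 1 to n-1 • Ring: routing table size 1, diameter n-1 • Chord et al. (uniform routing algorithms): routing table size O(log n), diameter O(log n) • Non-uniform routing algorithms: routing table size O(log n), diameter O(log n / log(log n)) • Totally connected graph: routing table size n-1, diameter 1 • LB: lower bound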
Classification • Pure P2P Systems • Uniform Systems: Chord, CAN, Pastry, Tapestry, …, F-Chord • Non-Uniform Systems • Greedy Routing • Randomized Networks and Neighbor of Neighbor Routing • Non-Greedy Routing: Viceroy, De Bruijn graphs
F-Chord(α) • α ∈ [1/2, 1] • m = log_φ n ≈ 1.44 log n • Jumps of F-Chord(α): Fib(2i), for i = 1, 2, …, (1-α)(m-2), and Fib(i), for i = 2(1-α)(m-2)+2, …, m-1 • Degree: F-Chord(α) uses α(m-2) jumps • Diameter: for any value of α, the diameter of F-Chord(α) is m/2 ≈ 0.72021 log N • Average Path Length: the average path length of the F-Chord(α) scheme is bounded by 0.39812 log N + (1-α) 0.24805 log N + 1 • (Figure: jump rows 1, 3, 8, 21, 55 and 2, 5, 13, 34, 89, with the labels "even jumps", "all jumps" and F-Chord(1).)
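The jump set of the scheme can be generated with a short sketch. Several points are assumptions made here: the scheme parameter is written `alpha`, the bound (1 - alpha)(m - 2) is rounded down, and the Fibonacci numbers are indexed with Fib(1) = Fib(2) = 1, which reproduces the jump values shown in the figure. The function names are hypothetical.

```python
import math

def fib(i):
    """Fibonacci numbers with Fib(1) = Fib(2) = 1."""
    a, b = 1, 1
    for _ in range(i - 1):
        a, b = b, a + b
    return a

def fchord_jumps(alpha, m):
    """Jumps of the scheme: even-indexed Fibonacci numbers first, then all of them."""
    k = math.floor((1 - alpha) * (m - 2))
    even_part = [fib(2 * i) for i in range(1, k + 1)]     # Fib(2), Fib(4), ..., Fib(2k)
    full_part = [fib(i) for i in range(2 * k + 2, m)]     # Fib(2k+2), ..., Fib(m-1)
    return even_part + full_part

print(fchord_jumps(0.5, 12))
print(fchord_jumps(1.0, 12))
```

With m = 12, the parameter value 1/2 keeps only the even-indexed jumps and the value 1 keeps all of them, so the degree moves between (m-2)/2 and m-2.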
Graphical results • (Figure: hops x log n; lower is better.)
H-Networks • We denote by j_1, j_2, …, j_d all the jumps of our schemes (ordered by their size) • Let H() be a good hash function that maps an id to a sequence of m bits. For each 1 ≤ i ≤ d, node x is connected by edges to node x + j_i + (H(x)/2^m)·(j_(i+1) - j_i), where H(x)/2^m ∈ [0,1) • (Figure: the link lands between j_i and j_(i+1).)
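The general rule can be sketched for an arbitrary jump sequence. Three points are assumptions of this sketch: the offset is rounded down to an integer, the window after the last jump is taken to end at the ring size n, and the hash value H(x) is passed in as the integer `hx` so that any hash function can be plugged in. The function name is hypothetical.

```python
def h_network_neighbors(x, jumps, n, hx, m):
    """Links x + j_i + floor((H(x) / 2^m) * (j_(i+1) - j_i)) on a ring of n identifiers.

    hx is H(x) read as an m-bit integer, so hx / 2^m lies in [0, 1).
    """
    js = sorted(jumps) + [n]  # assumed: the last window ends at n
    links = []
    for i in range(len(jumps)):
        width = js[i + 1] - js[i]
        offset = (hx * width) >> m  # integer form of floor((hx / 2^m) * width)
        links.append((x + js[i] + offset) % n)
    return links

print(h_network_neighbors(0, [1, 2, 4, 8], 16, 0, 4))
print(h_network_neighbors(0, [1, 2, 4, 8], 16, 15, 4))
```

A hash value of 0 reproduces the underlying jumps exactly, while larger hash values push each link further into its window without ever reaching the next jump.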