Compact Routing and Locality in Peer-to-Peer Systems

Ittai Abraham, School of Computer Science and Engineering, Hebrew University of Jerusalem

Internet Activity • After 1992 – Dominated by the web browser • Client-server paradigm • Clients are lightweight and transient • Low bandwidth, computational power and storage requirements • Most servers are relatively simple and static (HTTP) • The distributed computing community mostly focused on making servers robust for load balancing and high availability, mostly targeting clusters of at most dozens of servers • What will Internet activity look like in the future?

Tomorrow’s Internet? • Internet peers will be stateful and will have a persistent connection • They will have high bandwidth, computational power and storage capabilities • Peers will be capable of acting both as a server and as a client • They will have an active network presence • Symmetric, decentralized, self-organizing paradigm

Peer-to-peer Systems • Group of peers wish to maintain a shared information data structure • Complications • Large group • Enormous amounts of information • Group is spatially distributed • Dynamically changing • Heterogeneous • Selfish/faulty/malicious participants • Challenge: provide efficient access to the shared information data structure

Distributed Hash Tables • A universe U of object ids. A hash function h maps U into a smaller set S, spreading ids without many collisions • A set V of nodes that forms a distributed system • The set S is partitioned into |V| parts, and each node in V maintains its relevant part of the hash table • Basic operations are lookup and store: Given an object id A, find the hash value h(A), route to the node that maintains the key h(A), and read/write the object A

DHT Lookup Example • Hash function h(x) = x mod 1009 • For this example, each node with id i maintains all the objects whose hash value is in [1009*i/6, 1009*(i+1)/6) • Node 2 wants to find the value of the object with id 1009001 • The hash value is 1, so the request needs to be routed to the node with id 0 • [Figure: overlay nodes labeled with their ids]
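The lookup on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical, and the code assumes exactly six nodes with ids 0 to 5, which is what the interval formula suggests but the slide does not state outright.

```python
P = 1009        # modulus of the slide's hash function
NUM_NODES = 6   # assumption: six nodes with ids 0..5

def h(object_id: int) -> int:
    """Hash function from the slide: h(x) = x mod 1009."""
    return object_id % P

def responsible_node(hash_value: int) -> int:
    """Node i owns [P*i/6, P*(i+1)/6), i.e. i = floor(6 * hash_value / P)."""
    return hash_value * NUM_NODES // P

def lookup(object_id: int) -> tuple[int, int]:
    """Return (hash value, id of the node that maintains that key)."""
    hash_value = h(object_id)
    return hash_value, responsible_node(hash_value)

# The slide's example: object 1009001 hashes to 1, which node 0 maintains.
example = lookup(1009001)
```

The sketch covers only the key-to-node mapping; how node 2 actually reaches node 0 over the overlay is the routing problem the rest of the talk addresses.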

Traditional Complexity Measures of Distributed Hash Tables • Degree (local memory) of the overlay network: O(1), O(logk n), O(n^(1/k)) • Number of hops from source node to target node: O(log n), O(log n/loglog n), O(1) • But counting hops does not take into account a weighted communication network

Locality Awareness • [Figure: nodes s, t, and x]

Low Stretch Routing • Two models: • Peers communicate through a weighted network G=<V,E,ω> • The cost of communication d(s,t) between peers in V induces a metric space M=<V,d> • General metric • Growth bounded metric, Euclidean metric, Doubling metric • In either case the stretch of a routing scheme RS is the maximal ratio d_RS(s,t)/d(s,t) over all pairs s,t

Routing on a weighted graph • Devise a distributed routing scheme such that: a node that knows the label of a target node can send a message that will be routed to the target node • Main complexity measures: • Stretch: the ratio between the cost of the path taken by the routing protocol and the cost of a minimum cost path from source to destination. • Memory: the number of bits stored in each node. • A solution is compact if memory is o(n)
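To make the stretch measure concrete, here is a minimal Python sketch that computes the stretch of a routing scheme on a small weighted graph. The graph, the `hub_route` scheme and all function names are hypothetical and chosen only for illustration; the scheme is not one from the talk.

```python
import heapq

def dijkstra(graph, source):
    """Minimum path cost from source to every node of a weighted graph."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def path_cost(graph, path):
    """Total edge weight along a path given as a list of nodes."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))

def stretch(graph, route):
    """Maximum over all pairs of (routed path cost) / (minimum path cost)."""
    worst = 1.0
    for s in graph:
        dist = dijkstra(graph, s)
        for t in graph:
            if s != t:
                worst = max(worst, path_cost(graph, route(s, t)) / dist[t])
    return worst

# Hypothetical example: a unit-weight triangle where every message that does
# not start or end at the hub "b" is forced through it.
triangle = {"a": {"b": 1, "c": 1}, "b": {"a": 1, "c": 1}, "c": {"a": 1, "b": 1}}

def hub_route(s, t):
    return [s, t] if "b" in (s, t) else [s, "b", t]

triangle_stretch = stretch(triangle, hub_route)
```

In this example the pair (a, c) is routed over two edges instead of one, so the scheme has stretch 2 even though it needs almost no routing state.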

Routing on a weighted graph • Lower bounds: • Stretch < 3 requires Ω(n) bits per node [Gavoille & Gengler 01] • Stretch < 5 requires Ω(√n) bits per node [Thorup & Zwick 01] • Stretch < 2k-1 requires Ω(n^(1/k)) bits per node [Thorup & Zwick 01], under the Erdős conjecture • Two main variants: • Labeled routing: designer can choose the labels of nodes • Name Independent routing: node labels are given by an adversary • Labeled routing: • Stretch 3 with Õ(n^(2/3)) bits [Cowen 99] • Stretch 3 with Õ(√n) bits [Thorup & Zwick 01] • Stretch 4k-1 with Õ(n^(1/k)) bits [Thorup & Zwick 01]

Name Independent Routing • Awerbuch, Bar-Noy, Linial & Peleg 89 • With Õ(n^(1/k)) bits – stretch O(k^2 9^k) • With Õ(n^(2/3)) bits – stretch 468 • With Õ(√n) bits – stretch 2593 • Awerbuch & Peleg 90 (Sparse Partitions) • For diameters that are polynomial in n • With Õ(n^(1/k)) bits – stretch O(k^2) • With Õ(n^(2/3)) bits – stretch 624 • With Õ(√n) bits – stretch 1088 • Arias, Cowen, Laing, Rajaraman & Taka 03 • With Õ(√n) bits – stretch 5 • [A, Gavoille, Malkhi], DISC 04 • Stretch O(k) • [A, Gavoille, Malkhi, Nisan, Thorup], SPAA 04 • Stretch 3

Compact Name-Independent Routing with Minimum Stretch [A, Gavoille, Malkhi, Nisan, and Thorup SPAA 2004] • Optimal stretch 3 with Õ(√n) bits • Construction in polynomial time • Routing decisions performed in constant time • Surprisingly, with Õ(√n) bits, allowing the designer to label the nodes does not improve the stretch factor compared to the case in which node labels are predetermined by an adversary.

The Recipe • Ingredients • Vicinity routing • Random coloring to √n colors • Hash labels to colors • Labeled routing on trees • Landmarks • Partial shortest path trees

Vicinity Routing • Let B(u) denote the √n log n closest nodes to u (ties broken consistently) • For all v ∈ B(u), node u stores the next hop of a minimum cost path from u to v • Simple property [ABLP 89]: If v ∈ B(u) and w is on a minimum cost path from u to v, then v ∈ B(w) • [Figure: nodes u, w, v and the vicinity B(u)]
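The vicinity tables can be sketched as follows. This is a minimal illustration, not the construction from the paper: the vicinity size `k` is a plain parameter rather than √n log n, ties are broken by node id as one possible consistent rule, and the example graph and function names are made up.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra returning (dist, first_hop); first_hop[v] is the neighbor of
    source on a minimum cost path from source to v."""
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                first_hop[v] = v if u == source else first_hop[u]
                heapq.heappush(heap, (nd, v))
    return dist, first_hop

def vicinity_table(graph, u, k):
    """Next hops from u to its k closest nodes (u included in the count).
    Ties are broken by node id, one consistent rule assumed here."""
    dist, first_hop = shortest_paths(graph, u)
    ball = sorted(dist, key=lambda v: (dist[v], v))[:k]
    return {v: first_hop[v] for v in ball if v != u}

def vicinity_route(tables, u, v):
    """Follow next hops; by the simple property, v stays inside the vicinity
    of every node along the way, so each lookup succeeds."""
    path = [u]
    while path[-1] != v:
        path.append(tables[path[-1]][v])
    return path

# Hypothetical 5-node weighted graph with integer costs.
graph = {
    0: {1: 1, 2: 4},
    1: {0: 1, 2: 2, 3: 5},
    2: {0: 4, 1: 2, 3: 1},
    3: {1: 5, 2: 1, 4: 3},
    4: {3: 3},
}
tables = {u: vicinity_table(graph, u, k=3) for u in graph}
route_0_to_2 = vicinity_route(tables, 0, 2)
```

Here node 0 reaches node 2 through node 1 at cost 3 rather than over the direct edge of cost 4, and node 1 can continue the route because node 2 is also in its vicinity.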

Random Coloring to √n Colors • Every node u chooses a random color c(u) • With high probability • Every color set has O(√n) nodes • Every node has in its vicinity at least one node from every color set • Polynomial number of tests • Each test can be done in logspace • Derandomization using the pseudo random generator of Nisan
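The two high-probability properties can be checked directly after drawing a random coloring. The sketch below simply retries until both hold, which is a swapped-in simplification and not the derandomization via Nisan's generator mentioned on the slide. The ring-shaped vicinities, the bound of 3√n nodes per color and all names are assumptions made for the example.

```python
import math
import random

def try_coloring(nodes, vicinity, num_colors, max_class_size, rng, max_tries=1000):
    """Draw random colorings until (1) no color class exceeds max_class_size
    and (2) every vicinity contains at least one node of every color."""
    all_colors = set(range(num_colors))
    for _ in range(max_tries):
        color = {u: rng.randrange(num_colors) for u in nodes}
        class_size = [0] * num_colors
        for c in color.values():
            class_size[c] += 1
        balanced = max(class_size) <= max_class_size
        covered = all({color[v] for v in vicinity[u]} == all_colors for u in nodes)
        if balanced and covered:
            return color
    raise RuntimeError("no valid coloring found within max_tries")

# Assumed toy setting: n nodes on a ring, where the vicinity of u is the
# ceil(sqrt(n) * ln n) nodes nearest to u along the ring (u included).
n = 64
num_colors = math.isqrt(n)
k = math.ceil(math.sqrt(n) * math.log(n))
nodes = list(range(n))
vicinity = {u: [(u + i) % n for i in range(-(k // 2), k - k // 2)] for u in nodes}
coloring = try_coloring(nodes, vicinity, num_colors, 3 * num_colors, random.Random(0))
```

Both tests are simple counting checks over a polynomial number of (node, color) pairs, which is what makes the derandomization on the slide possible.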

Hash Labels to Colors • Label u is hashed to a color h(u) ∈ {1, …, √n} • At most O(√n log n) labels are hashed to the same color • Trivial if node labels are a permutation of 1…n • Otherwise, one can hash collision-free to n^2.5 and then use the techniques of Tarjan and Yao to hash to √n in constant time. Can be derandomized similarly to the deterministic dictionaries of Hagerup, Miltersen & Pagh

Labeled Routing on Trees • Based on DFS Interval Routing, improved by Thorup & Zwick 01 and Fraigniaud & Gavoille 01 • A node is heavy if its sub-tree contains more than half of the nodes of its parent’s sub-tree • Each node stores its DFS interval and the DFS interval of its heavy child (if it has one) • Storage is O(log n) bits • A node’s label consists of the names of the non-heavy nodes on the path from the root • Labels require O(log^2 n) bits • Routing to v at node u: • If v is not in u’s interval then send to the parent • If v is in the interval of u’s heavy child then send to the heavy child • Otherwise v’s label contains the appropriate child of u
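The three routing rules can be sketched in Python as below. This is a simplified illustration under stated assumptions: a label is stored as a DFS number plus a dictionary from parent to non-heavy child, rather than the compact bit encoding of the cited papers, and the example tree and all names are hypothetical.

```python
def build(children, root):
    """Return (store, labels): per-node routing state and per-node labels."""
    size = {}

    def measure(u):
        size[u] = 1 + sum(measure(c) for c in children.get(u, []))
        return size[u]

    measure(root)
    store, labels, interval = {}, {}, {}
    counter = [0]

    def dfs(u, parent, light_edges):
        start = counter[0]
        counter[0] += 1
        kids = children.get(u, [])
        # Heavy child: its sub-tree holds more than half of u's sub-tree.
        heavy = next((c for c in kids if 2 * size[c] > size[u]), None)
        for c in kids:
            dfs(c, u, light_edges if c == heavy else {**light_edges, u: c})
        interval[u] = (start, counter[0] - 1)
        labels[u] = (start, light_edges)
        store[u] = {"interval": interval[u], "parent": parent, "heavy": heavy}

    dfs(root, None, {})
    for state in store.values():
        heavy = state["heavy"]
        state["heavy_interval"] = interval[heavy] if heavy is not None else None
    return store, labels

def next_hop(store, u, label_v):
    """Apply the slide's three rules at node u, for a target v != u."""
    dfs_v, light_edges = label_v
    lo, hi = store[u]["interval"]
    if not lo <= dfs_v <= hi:
        return store[u]["parent"]
    heavy_interval = store[u]["heavy_interval"]
    if heavy_interval is not None and heavy_interval[0] <= dfs_v <= heavy_interval[1]:
        return store[u]["heavy"]
    return light_edges[u]  # the label names the non-heavy child to use

def tree_route(store, labels, s, t):
    path = [s]
    while path[-1] != t:
        path.append(next_hop(store, path[-1], labels[t]))
    return path

# Hypothetical tree: r has children a and b; a has c, d, e; c has f.
children = {"r": ["a", "b"], "a": ["c", "d", "e"], "c": ["f"]}
store, labels = build(children, "r")
```

Every non-heavy edge at least halves the sub-tree size, so a label lists at most about log n non-heavy nodes, which is where the O(log^2 n) label size on the slide comes from.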

Landmarks • Let R(T,v) be the routing information stored at node v for routing on tree T • Let T(u) be the minimum cost tree rooted at u • One color is designated as special • Let L be the set of all nodes l such that c(l)=special color • Every node u maintains R(T(l),u) for all l ∈ L • This requires Õ(√n) bits • For node u, let l(u) be a landmark in B(u)

Partial Shortest Path Trees • Every node v stores R(T(u),v) for all u ∈ B(v) • Requires Õ(√n) bits • Let L(T,u) be the label of u on tree T • Simple property: If x ∈ B(y) then given L(T(x),y), node x can route to node y along a minimum cost path • [Figure: nodes x, w, y and the vicinity B(y)]

Case 1: Inside B(u) • Use vicinity routing • [Figure: nodes u and v]

Case 2: B(u) and B(v) are close • Any node on any minimal path from u to v is either in B(u) or in B(v) • Vicinity route to w ∈ B(u) s.t. c(w)=h(v) • Node w stores L(T(w),u), x, (x → y), L(T(y),v) • Partial tree route to u on T(w) • Vicinity routing to y • Partial tree route to v on T(y) • [Figure: nodes u, w, x, y, and v]

Case 3: B(u) and B(v) are far • Any minimal path from u to v contains a node that is not in B(u) or in B(v) • Vicinity route to w ∈ B(u) s.t. c(w)=h(v) • Node w stores L(T(l(v)),l(v)) and L(T(l(v)),v) • Tree route on T(l(v)) to l(v) and then to v • [Figure: nodes u, w, l(v), and v]

Storage on node u • Routing information and colors in B(u) [Vicinity] • R(T(l),u) for all l ∈ L [Landmarks] • R(T(v),u) for all v ∈ B(u) [Partial Trees] • For all v such that c(u)=h(v), the minimum of: • Path to l(v) and then to v. Store <L(T(l(v)), l(v)), L(T(l(v)), v)> • Let P(u,w,v) be a path from u to v composed of an MCP (minimum cost path) from u to w and an MCP from w to v, such that: • u ∈ B(w), • there exists an edge (x → y) along the minimum cost path from w to v such that x ∈ B(w) and y ∈ B(v) • Among all these paths choose the lowest cost path P(u,w,v) and store <L(T(u), w), x, (x → y), L(T(y),v)>

Routing from u to v • If v ∈ B(u) use vicinity routing • If v ∈ L use tree routing on T(v) • Otherwise vicinity route to w ∈ B(u) such that c(w)=h(v) • Node w stores either • <R(T(l(v)), l(v)), R(T(l(v)), v)>, and routing proceeds to l(v) and then to v • <R(T(u), w), x, (x → y), R(T(y),v)>, and routing proceeds to w, then x, to y, and finally to v
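The top-level decision at the source can be written as a three-way dispatch. The sketch below covers only that first decision, not the tree routing that follows it. The function name, the returned tags and the choice of the smallest-id matching node w are assumptions for illustration, and it relies on the coloring guarantee that every vicinity contains a node of every color.

```python
def first_step(u, v, ball, color, h, landmarks):
    """Decide how u starts routing to v, following the slide's three rules.

    ball[u] is B(u), color[x] is c(x), h maps a label to a color, and
    landmarks is the set L. Returns (strategy tag, node to head for).
    """
    if v in ball[u]:
        return ("vicinity", v)
    if v in landmarks:
        return ("tree", v)  # tree routing on T(v), stored at every node
    # Any w in B(u) with c(w) = h(v) works; the smallest id is picked here
    # only to make the sketch deterministic.
    w = min(x for x in ball[u] if color[x] == h(v))
    return ("via_color_node", w)

# Hypothetical toy data: two colors, labels hashed by parity.
ball = {0: {0, 1, 2}}
color = {0: 0, 1: 1, 2: 0}
landmarks = {5}

def parity(label):
    return label % 2
```

For instance, `first_step(0, 7, ball, color, parity, landmarks)` heads for node 1, the node in B(0) whose color matches h(7), and that node then holds one of the two stored routes listed on the slide.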

Better than stretch 3? • There are worst case metrics in which stretch 3 is the best possible (many edges and high girth) • But do these metrics depict real world distances? • Studies show that many networks have a bounded expansion ratio. Density changes are somewhat gradual • [Plaxton, Rajaraman & Richa 1997]: • Required • Expected (large) constant stretch • Deployments: Tapestry (Berkeley), Pastry (MS UK)
