
File Sharing : Hash/Lookup

Yossi Shasho (HW in last slide). Based on Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. Partially based on The Impact of DHT Routing Geometry on Resilience and Proximity.


Presentation Transcript


  1. File Sharing: Hash/Lookup Yossi Shasho (HW in last slide) Based on Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Partially based on The Impact of DHT Routing Geometry on Resilience and Proximity Partially based on Building a Low-latency, Proximity-aware DHT-Based P2P Network http://www.computer.org/portal/web/csdl/doi/10.1109/KSE.2009.49 Some slides liberally borrowed from: Carnegie Mellon Peer-2-Peer 15-411, Petar Maymounkov and David Mazières’ Kademlia Talk, New York University

  2. Peer-2-Peer • Distributed systems without any centralized control or hierarchical organization. • Long list of applications: • Redundant storage • Permanence • Selection of nearby servers • Anonymity, search, authentication, hierarchical naming and more • Core operation in most p2p systems is efficient location of data items

  3. Outline

  4. Think Big • /home/google/ • One namespace, thousands of servers • Map each key (=filename) to a value (=server) • Hash table? Think again • What if a new server joins? server fails? • How to keep track of all servers? • What about redundancy? And proximity? • Not scalable, Centralized, Fault intolerant • Lots of new problems to come up…

  5. DHT: Overview • Abstraction: a distributed “hash-table” (DHT) data structure: • put(id, item); • item = get(id); • Scalable, Decentralized, Fault Tolerant • Implementation: nodes in system form a distributed data structure • Can be Ring, Tree, Hypercube, Skip List, Butterfly Network, ...
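
To make the put/get abstraction concrete, here is a minimal single-node sketch in Python; the names DHTNode and hash_id are illustrative, not part of any particular DHT's API (a real DHT routes each request to whichever node is responsible for the key's hash).

    import hashlib

    def hash_id(key, bits=160):
        # Map an arbitrary key (e.g. a filename) onto the identifier space.
        return int(hashlib.sha1(key.encode()).hexdigest(), 16) % (2 ** bits)

    class DHTNode:
        # Illustrative only: a real DHT would forward put/get to the node
        # responsible for hash_id(key) instead of storing locally.
        def __init__(self):
            self.store = {}

        def put(self, key, item):
            self.store[hash_id(key)] = item

        def get(self, key):
            return self.store.get(hash_id(key))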

  6. DHT: Overview (2) • Many DHTs:

  7. DHT: Overview (3) • Good properties: • Distributed construction/maintenance • Load-balanced with uniform identifiers • O(log n) hops / neighbors per node • Provides underlying network proximity

  8. Consistent Hashing • When adding rows (servers) to hash-table, we don’t want all keys to change their mappings • When adding the Nth row, we want ~1/N of the keys to change their mappings. • Is this achievable? Yes.
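
One way to see why: hash both servers and keys onto the same circle and give each key to the first server clockwise from it. A rough Python sketch (the names and sizes are illustrative) that checks how many keys move when a fourth server joins:

    import hashlib
    from bisect import bisect_left

    def h(x, bits=32):
        return int(hashlib.sha1(x.encode()).hexdigest(), 16) % (2 ** bits)

    def owner(key, servers):
        # A key belongs to the first server at or after it on the circle
        # (wrapping around) -- the "successor" rule Chord uses later.
        ring = sorted((h(s), s) for s in servers)
        i = bisect_left(ring, (h(key), "")) % len(ring)
        return ring[i][1]

    keys = ["file-%d" % i for i in range(10000)]
    before = {k: owner(k, ["s1", "s2", "s3"]) for k in keys}
    after = {k: owner(k, ["s1", "s2", "s3", "s4"]) for k in keys}
    moved = sum(before[k] != after[k] for k in keys)
    # On average ~1/4 of the keys move; with this few servers the variance
    # is high (virtual nodes would smooth it out).
    print(moved / len(keys))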

  9. Chord: Overview • Just one operation: item = get(id) • Each node needs routing info about only a few other nodes • O(log N) for lookup, O(log² N) for join/leave • Simple, provable correctness, provable performance • Apps built on top of it do the rest

  10. Chord: Geometry • Identifier space [1,N], example: binary strings • Keys (filenames) and values (server IPs) on the same identifier space • Keys & values evenly distributed • Now, put this identifier space on a circle • Consistent Hashing: A key is stored at its successor.

  11. Chord: Geometry (2) • A key is stored at its successor: the node with the next-higher ID • Example ring (circular ID space) with nodes N32, N90, N105 and keys K5, K20, K80: • Get(5) = 32 • Get(20) = 32 • Get(80) = 90 • Who maps to 105? Nobody.
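
The successor rule is easy to write down; a small Python sketch (not from the paper) that reproduces the answers on this slide:

    def successor(key, node_ids):
        # A key is stored at its successor: the first node whose ID is
        # equal to or follows the key on the circle, wrapping around.
        for n in sorted(node_ids):
            if n >= key:
                return n
        return min(node_ids)          # wrapped past the highest ID

    nodes = [32, 90, 105]
    print(successor(5, nodes))        # 32
    print(successor(20, nodes))       # 32
    print(successor(80, nodes))       # 90
    print(successor(110, nodes))      # wraps around to 32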

  12. Chord: Back to Consistent Hashing • “When adding the Nth row, we want ~1/N of the keys to change their mappings.” (The problem, a few slides back) • Before: ring with N32, N90, N105; Get(5) = 32, Get(20) = 32, Get(80) = 90; nobody maps to 105 • Now nodes N15 and N50 join the ring • After: Get(5) = 15, while Get(20) = 32 and Get(80) = 90 are unchanged • Only the keys falling between a new node and its predecessor change hands

  13. Chord: Basic Lookup • get(k): • If (I have k) • Return “ME” • Else • P ← next node • Return P.get(k) • Example: N10 asks “Where is key 80?”; the query walks the ring (N32, N60, ...) until N90 answers “N90 has K80” • Each node remembers only the next node • O(N) lookup time – no good!
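
A literal rendering of this basic lookup in Python; the Node class is a sketch, and it assumes the key is stored somewhere on the ring (otherwise the recursion never terminates):

    class Node:
        def __init__(self, node_id):
            self.id = node_id
            self.successor = None   # next node clockwise, set when the ring is built
            self.keys = set()

        def get(self, k):
            # Basic Chord lookup: if I don't hold k, ask the next node.
            if k in self.keys:
                return self         # "ME"
            return self.successor.get(k)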

  14. Chord: “Finger Table” • Previous lookup was O(N). We want O(log N) • The fingers of a node (here N80) point 1/2, 1/4, 1/8, ..., 1/64, 1/128 of the way around the ring • Finger table of N80 (first entries):
    i    80 + 2^i        succ
    0    80 + 2^0 = 81   __
    1    80 + 2^1 = 82   __
    2    80 + 2^2 = 84   __
• Entry i in the finger table of node n is the first node n’ such that n’ ≥ n + 2^i • In other words, the ith finger of n points 1/2^(m−i) of the way around the ring (m = number of ID bits)

  15. Chord: “Finger Table” Lookups • get(k): • If (I have k) • Return “ME” • Else • P ← closest finger i ≤ k • Return P.get(k) • (Same N80 finger table and finger-entry definition as on the previous slide)

  16. Chord: “Finger Table” Lookups • get(k): • If (I have k) • Return “ME” • Else • P ← closest finger i ≤ k • Return P.get(k) • Example ring: N2, N9, N19, N31, N49, N65, N74, N81, N90 • “Where is key 40?” asked at N65: its last finger points to N19; N19's finger for 19+2^4 = 35 points to N49; N49 stores K40 and answers “40!” • Finger table of N19 (partial): 19+2^0 = 20 → N31, 19+2^1 = 21 → N31, ..., 19+2^4 = 35 → N49 • Finger table of N65 (partial): 65+2^0 = 66 → N74, 65+2^1 = 67 → N74, ..., finger 6 (65+2^6) → N19
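
A hedged sketch of the finger-table lookup rule above (forward to the closest finger that precedes the key), which is what gives the O(log N) hop count; the between() helper and the class layout are my own scaffolding, not the paper's pseudocode:

    class FingerNode:
        def __init__(self, node_id):
            self.id = node_id
            self.fingers = []          # fingers[i] = successor(id + 2**i), i = 0..m-1
            self.successor = None
            self.keys = set()

        def closest_preceding_finger(self, k):
            # Scan fingers from farthest to nearest and pick the first one
            # that falls between us and the key.
            for f in reversed(self.fingers):
                if between(f.id, self.id, k):
                    return f
            return self.successor

        def get(self, k):
            if k in self.keys:
                return self
            return self.closest_preceding_finger(k).get(k)

    def between(x, a, b):
        # True if x lies strictly inside the clockwise arc (a, b).
        return (a < x < b) if a < b else (x > a or x < b)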

  17. Chord: Example • Assume an identifier space [0..8), i.e. IDs 0..7 arranged on a circular ring • Node n1 joins • Responsible for all keys • (Succ = successor) • Succ. table of n1: 1+2^0 = 2 → 1, 1+2^1 = 3 → 1, 1+2^2 = 5 → 1

  18. Chord: Example • Node n2 joins • Succ. table of n1: 2 → 2, 3 → 1, 5 → 1 • Succ. table of n2: 2+2^0 = 3 → 1, 2+2^1 = 4 → 1, 2+2^2 = 6 → 1

  19. Chord: Example • Nodes n0 and n6 join • Succ. table of n0: 1 → 1, 2 → 2, 4 → 6 • Succ. table of n1: 2 → 2, 3 → 6, 5 → 6 • Succ. table of n2: 3 → 6, 4 → 6, 6 → 6 • Succ. table of n6: 7 → 0, 0 → 0, 2 → 2

  20. Chord: Example • Nodes: n1, n2, n0, n6 • Items: 1, 7 • Item 1 is stored at its successor n1; item 7 is stored at its successor n0 • Succ. tables as on the previous slide

  21. Chord: Routing • Upon receiving a query for item id, a node: • Checks if it stores the item locally • If not, forwards the query to the largest node i in its finger table such that i ≤ id • Example: query(7) arriving at n1 is forwarded via n1's finger to n6, and n6's successor n0 stores item 7

  22. Chord: Node Join • Node n joins: needs one existing node n' in hand • Initialize fingers of n • Ask n' to look them up (log N fingers to init) • Update fingers of the rest • Few nodes need to be updated • Look them up and tell them n is new in town • Transfer keys
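
A rough sketch of these join steps, assuming a find_successor() lookup like the get() sketches above and the FingerNode fields from the earlier sketch; concurrency, failures and the wrap-around case of the key transfer are all glossed over:

    def join(n, n_prime, m=7):
        # 1. Initialize n's fingers by asking the known node n' to look
        #    each of them up: fingers[i] = successor(n.id + 2**i).
        n.fingers = [n_prime.find_successor((n.id + 2 ** i) % 2 ** m)
                     for i in range(m)]
        n.successor = n.fingers[0]
        # 2. Update the (few, O(log N)) nodes whose fingers should now
        #    point at n -- omitted in this sketch.
        # 3. Transfer to n the keys it is now responsible for, i.e. the
        #    keys up to and including n.id currently held by its
        #    successor (simplified: ignores wrap-around).
        moved = {k for k in n.successor.keys if k <= n.id}
        n.keys |= moved
        n.successor.keys -= moved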

  23. Chord: Improvements • Every 30s, ask your successor for its predecessor • Fix your own successor based on this • Also, pick and verify a random finger • Rebuild finger table entries this way • Keep a successor list of r successors • Deals with unexpected node failures • Can use these to replicate data
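
Roughly, this periodic maintenance could look like the sketch below; it reuses the between() helper from the finger-table sketch, assumes predecessor, notify() and find_successor() exist on the node (they do in the Chord paper's pseudocode), and takes the 30-second period from the slide:

    import random, threading

    def stabilize(n):
        # Ask our successor who its predecessor is; if that node sits
        # between us and our successor, it becomes our new successor.
        x = n.successor.predecessor
        if x is not None and between(x.id, n.id, n.successor.id):
            n.successor = x
        n.successor.notify(n)                  # "I might be your predecessor"

    def fix_random_finger(n, m=7):
        # Pick a random finger and rebuild it with a fresh lookup.
        i = random.randrange(m)
        n.fingers[i] = n.find_successor((n.id + 2 ** i) % 2 ** m)

    def run_periodically(n, interval=30.0):
        stabilize(n)
        fix_random_finger(n)
        threading.Timer(interval, run_periodically, args=(n, interval)).start()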

  24. Chord: Performance • Routing table size? • log N fingers • Routing time? • Each hop is expected to at least halve the distance to the desired id => expect O(log N) hops • Node joins • Query for the fingers => O(log N) • Update other nodes’ fingers => O(log² N)

  25. Chord: Performance (2) • Real measurements: lookup time as a function of the number of nodes (figure in the original slides)

  26. Chord: Performance (3) • Comparing to other DHTs

  27. Chord: Performance (4) • Promises a few O(log N) hops on the overlay • But on the physical network, each overlay hop can be quite far • (Figure: a Chord network with N=8 nodes and an m=8-bit key space)

  28. Applications employing DHTs • eMule (KAD implements Kademlia, a DHT) • An anonymous network (≥ 2 million downloads to date) • BitTorrent (≥ 4.1.2 beta) • Trackerless BitTorrent, allows anonymity (thank god) • Clients A & B handshake • A: “I have DHT, it's on port X” • B: pings port X of A • B gets a reply => starts adjusting its routing state (nodes, rows, …)

  29. Kademlia (KAD) • Distance between A and B is A XOR B • Nodes are treated as leaves in a binary tree • A node's position in A's tree is determined by the longest postfix it shares with A • A's ID: 010010101 • B's ID: 101000101 • (Longest shared postfix: 0101)
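
The XOR metric is easy to compute; a tiny illustration with the two IDs from this slide (the bucket-index arithmetic is my own framing, not the slide's):

    a = 0b010010101                    # A's ID
    b = 0b101000101                    # B's ID
    d = a ^ b                          # Kademlia distance between A and B
    print(bin(d))                      # 0b111010000

    # The position of the highest set bit of the distance tells A which
    # of its subtrees (and hence which k-bucket) B falls into.
    bucket = d.bit_length() - 1
    print(bucket)                      # 8: A and B already differ in the top bit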

  30. Kademlia: Postfix Tree • A node's position in A's tree is determined by the longest postfix it shares with A (=> log N subtrees) • (Figure: the space of 160-bit numbers from 00…00 to 11…11 drawn as a binary tree; from our node's point of view the other peers fall into subtrees labelled “no common prefix”, “common prefix: 0”, “common prefix: 00”, “common prefix: 001”, ...)

  31. Kademlia: Lookup • Consider a query for ID 111010… initiated by node 0011100… • (Figure: the binary tree of the ID space, with the lookup path between subtrees shown)

  32. Kademlia: K-Buckets • Consider the routing table for a node with prefix 0011 • Its binary tree is divided into a series of subtrees • The routing table is composed of one k-bucket corresponding to each of these subtrees • A contact consists of <IP:Port, NodeID> • Consider a 2-bucket example: each bucket will have at least 2 contacts for its key range • (Figure: the ID-space tree with one k-bucket per subtree)
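
A hedged sketch of such a routing table in Python, one k-bucket per subtree, with the usual Kademlia rule of at most k contacts per bucket (a real implementation pings the least-recently-seen contact before evicting it); the class and field names are illustrative, not KAD's code:

    from collections import namedtuple, deque

    Contact = namedtuple("Contact", "ip port node_id")    # <IP:Port, NodeID>

    class RoutingTable:
        def __init__(self, my_id, id_bits=160, k=2):
            self.my_id = my_id
            self.k = k                                    # bucket capacity
            self.buckets = [deque(maxlen=k) for _ in range(id_bits)]

        def bucket_index(self, node_id):
            # The bucket is chosen by XOR distance: contacts at distance
            # in [2**i, 2**(i+1)) land in bucket i. Assumes node_id != my_id.
            return (self.my_id ^ node_id).bit_length() - 1

        def add_contact(self, contact):
            bucket = self.buckets[self.bucket_index(contact.node_id)]
            if contact in bucket:
                bucket.remove(contact)     # move a known contact to the tail
            bucket.append(contact)         # deque(maxlen=k) drops the oldest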

  33. Summary • 1. The Problem • 2. Distributed hash tables (DHT) • 3. Chord: a DHT scheme • Geometry • Lookup • Node Joins • Performance • 4. Extras

  34. Homework • Load balance is achieved when all servers in the Chord network are responsible for (roughly) the same number of keys • Still, with some probability, one server can be responsible for significantly more keys • How can we lower the upper bound on the number of keys assigned to a server? • Hint: Simulation
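
As a starting point for the simulation hint, here is a small hedged sketch that measures how uneven the key distribution gets on a plain hash ring; all sizes and names are illustrative:

    import random
    from bisect import bisect_left

    def max_to_avg_load(num_servers, num_keys, space=2 ** 32):
        server_ids = sorted(random.randrange(space) for _ in range(num_servers))
        counts = [0] * num_servers
        for _ in range(num_keys):
            key = random.randrange(space)
            i = bisect_left(server_ids, key) % num_servers    # successor rule
            counts[i] += 1
        return max(counts) / (num_keys / num_servers)

    random.seed(0)
    print(max_to_avg_load(100, 100_000))   # typically well above 1: far from balanced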
