OpenDHT: A Public DHT Service
Sean C. Rhea, UC Berkeley
June 2, 2005
Joint work with: Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu

Peer-to-Peer File Sharing
• Very simple insight
  • Most computers unused most of the time
• Idea: harness this spare capacity to
  • Quickly download music files [Napster, Gnutella]
  • Search for aliens [SETI@Home]
  • Make free long-distance phone calls [Skype]
• Question: how to find desired resource(s)?
  • Early approaches: scoped flooding
  • Downsides: scalability, accuracy

A Better Search Facility: The Distributed Hash Table (DHT)
• Same interface as a programmatic hash table:
  • put(key, value) stores value under key
  • get(key) returns the value(s) stored under key
• But shared across many machines
• Implemented via an overlay network

A Better Search Facility: The Distributed Hash Table (DHT)
[Figure: a set of nodes, each holding part of the key/value (K V) table. A client issues put(k1,v1); the pair is routed through the overlay to the node that stores k1,v1. A later get(k1) is routed to that node, which returns v1.]
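
To make the put/get interface concrete, here is a minimal single-process sketch of a hash table whose keys are spread over several nodes. It is an illustration only: the class name `ToyDHT`, the use of SHA-1, and the "first node id at or after the key's hash" placement rule are assumptions, and there is no real overlay routing, replication, or networking.

```python
import hashlib
from bisect import bisect_left


class ToyDHT:
    """Hash-table interface whose keys are partitioned across several nodes."""

    def __init__(self, node_names):
        # Place every node on a ring at the hash of its name (assumed scheme).
        self.ring = sorted((self._hash(name), name) for name in node_names)
        # One local key/value table per node.
        self.tables = {name: {} for name in node_names}

    @staticmethod
    def _hash(text):
        return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")

    def _owner(self, key):
        # The owner is the first node whose id is >= hash(key), wrapping around.
        ids = [node_id for node_id, _ in self.ring]
        index = bisect_left(ids, self._hash(key)) % len(self.ring)
        return self.ring[index][1]

    def put(self, key, value):
        # Store value under key on the responsible node; a key may hold many values.
        self.tables[self._owner(key)].setdefault(key, []).append(value)

    def get(self, key):
        # Return the value(s) stored under key, or an empty list.
        return list(self.tables[self._owner(key)].get(key, []))
```

With `dht = ToyDHT(["n1", "n2", "n3"])`, a call to `dht.put("k1", "v1")` stores the pair on exactly one of the three nodes, and `dht.get("k1")` returns `["v1"]` regardless of which node that was.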

DHTs and File Sharing: DHT Stores Pointers to Files
[Figure: a node holding a file calls put(file, IP), storing a pointer to the file in the DHT. Another node calls get(file), receives the IP, and transfers the file over HTTP.]
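
The file-sharing pattern above can be sketched in a few lines. This is a hedged illustration: `LocalDHT` is a dict-backed stand-in for a real DHT, and the helper names `publish`, `locate`, and `download_url`, as well as the URL layout, are hypothetical.

```python
from collections import defaultdict


class LocalDHT:
    """Single-process stand-in for a DHT: put appends, get returns all values."""

    def __init__(self):
        self._data = defaultdict(list)

    def put(self, key, value):
        self._data[key].append(value)

    def get(self, key):
        return list(self._data[key])


def publish(dht, file_name, holder_ip):
    # The DHT stores only a pointer (the holder's IP), never the file itself.
    dht.put(file_name, holder_ip)


def locate(dht, file_name):
    # Returns the IPs of every node that announced the file.
    return dht.get(file_name)


def download_url(holder_ip, file_name):
    # Hypothetical URL layout; the actual transfer happens over HTTP, outside the DHT.
    return f"http://{holder_ip}/{file_name}"
```

A downloader would call `locate(dht, "song.mp3")`, pick one of the returned IPs, and fetch the file from that host directly.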

DHTs and Spam Detection: Detecting Similar Messages
[Figure: each node that receives the message “I love you!” calls put(hash(msg), IP). Once the same message has been put more than once, the diagram flags it: “Something’s fishy!”]
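
A minimal sketch of the idea in the diagram, under stated assumptions: `LocalDHT` is a dict-backed stand-in for the DHT, SHA-1 is an arbitrary choice of hash, and `SUSPICIOUS_COUNT` is a hypothetical threshold. Because it uses an exact hash, this sketch only catches identical messages, not merely similar ones.

```python
import hashlib
from collections import defaultdict

SUSPICIOUS_COUNT = 2  # hypothetical: flag once two distinct receivers report a message


class LocalDHT:
    """Single-process stand-in for a DHT: put appends, get returns all values."""

    def __init__(self):
        self._data = defaultdict(list)

    def put(self, key, value):
        self._data[key].append(value)

    def get(self, key):
        return list(self._data[key])


def report_message(dht, msg, receiver_ip):
    """put(hash(msg), IP), then check how many distinct receivers saw this message."""
    key = hashlib.sha1(msg.encode("utf-8")).hexdigest()
    dht.put(key, receiver_ip)
    distinct_receivers = set(dht.get(key))
    return len(distinct_receivers) >= SUSPICIOUS_COUNT
```

The first report of “I love you!” returns `False`; a report of the same text from a second IP returns `True`, which corresponds to the “Something’s fishy!” step.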

More DHT Applications
• Distributed Storage Systems
  • CFS, HiveCache, PAST, Pastiche
  • OceanStore / Pond
• Content Distribution Networks / Web Caches
  • Bslash, Coral, Squirrel
• Indexing / Naming Systems
  • Chord-DNS, CoDoNS, DOA, SFR
• Internet Query Processors
  • Catalogs, PIER
• Communication Systems
  • Bayeux, i3, MCAN, SplitStream

Some Areas of DHT Research
• Better routing protocols
  • One-hop, degree-optimal
• Load balancing
  • Non-uniform key distributions
• Security
  • Byzantine fault-tolerant routing
• Data redundancy and fault tolerance
  • Replication, erasure-coding
• Stronger semantics
  • Supporting read-modify-write

How Many DHTs Will There Be?
[Figure: two applications, Spam Detection and File Sharing, each running its own DHT. Annotations across the builds: “Company Machine: Can’t Share Files”, “Owns Stock in Spam Company”, “Redundant Link”, “Unshared Links”.]

Benefits of Sharing a DHT
• Amortizes costs across applications
  • Maintenance bandwidth, connection state, etc.
• Facilitates “bootstrapping” of new applications
  • Working infrastructure already in place
• Allows for statistical multiplexing of resources
  • Takes advantage of spare storage and bandwidth
• Facilitates upgrading existing applications
  • “Share” DHT between application versions

Challenges in Sharing a DHT
• Robustness
  • Must be available 24/7
• Shared Interface Design
  • Should be general, yet easy to use
• Resource Allocation
  • Must protect against malicious/over-eager users
• Economics
  • What incentives are there to provide resources?

The DHT as a Service
[Figure: the DHT nodes are grouped into a single service labeled OpenDHT, which clients access from outside. The final build asks of the link between clients and OpenDHT: “What is this interface?”]

The Traditional Interface: lookup
[Figure: nodes arranged on a ring numbered 1 through 12; lookup(k) is routed around the ring toward key k.]
• On reaching the successor of k, the message is passed to an “upcall”
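
The lookup-plus-upcall interface can be sketched as follows. This is a simplified, hypothetical model: the class name `Ring` and the upcall signature are assumptions, and the sketch jumps straight to the successor of k rather than routing hop by hop through an overlay.

```python
from bisect import bisect_left


class Ring:
    """Toy identifier ring: lookup(k) delivers a message to the successor of k."""

    def __init__(self, node_ids, upcall):
        self.node_ids = sorted(node_ids)
        # Application-supplied function that runs on the node reached by lookup.
        self.upcall = upcall

    def successor(self, k):
        # First node id >= k, wrapping around to the smallest id.
        index = bisect_left(self.node_ids, k) % len(self.node_ids)
        return self.node_ids[index]

    def lookup(self, k, message):
        node = self.successor(k)
        # On reaching the successor of k, hand the message to the upcall.
        return self.upcall(node, k, message)
```

For example, with nodes at positions 1, 4, 7, and 10 and an upcall that simply returns `(node, message)`, `lookup(5, "hi")` is handled at node 7, and `lookup(11, "hi")` wraps around to node 1.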

DHTs and Spam Detection: Detecting Similar Messages
[Figure: the spam example with upcalls. A second copy of “I love you!” triggers put(hash(msg), IP); the upcall on the responsible node reports “I’ve seen this message before!”, and the message is flagged: “Something’s fishy!”]

Upcall Challenges
• Distribution
  • How do we get new upcall code to all nodes?
  • Active networking experience is a warning…
• Security
  • How do we safely run untrusted clients’ upcalls?
[Figure: lookup(k) arrives at the node responsible for k, annotated “How did the upcall code get here?”]

What about Put/Get?
• Works great for some applications
  • File sharing, for example
• What about applications with upcalls?
  • Our spam detection application, for example
• Idea: let application nodes run the upcalls
  • Each node only runs upcalls for the applications that it’s participating in
[Figure: the file-sharing diagram again: put(file, IP) stores a pointer to the file; get(file) returns the IP; the file is transferred over HTTP.]

Upcall Example
[Figure: File Sharing and Spam Detection nodes sit outside OpenDHT and talk to it via put/get. Each spam-detection node announces “I can handle spam detection messages”. When “I love you!” arrives, the receiving node asks “Who’s handling hash(message)?”; after a second copy of “I love you!” arrives, the message is flagged: “Something’s fishy!”]
• DHT keeps track of which nodes support which upcalls via Recursive Distributed Rendezvous (ReDiR)

ReDiR
• Goal: Implement two functions using put/get:
  • join(namespace, node)
  • node = lookup(namespace, identifier)
[Figure: a three-level tree (L0, L1, L2) for H(namespace), built up as nodes A through E join at positions H(A), H(B), H(C), H(D), H(E). Final state: L0 holds A, D; L1 holds A, C and D, E; L2 holds A, B, then C, then D, then E.]

ReDiR
• Join cost:
  • Worst case: O(log n) puts and gets
  • Average case: O(1) puts and gets

ReDiR
• Goal: Implement two functions using put/get:
  • join(namespace, node)
  • node = lookup(namespace, identifier)
[Figure: three lookups on the same tree. H(k1): successor found at L2. H(k2): no successor at L2, successor found at L1. H(k3): no successor at L2 or L1, successor found at L0.]

ReDiR
• Lookup cost:
  • Worst case: O(log n) gets
  • Average case: O(1) gets
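
The join and lookup walks shown in the ReDiR diagrams can be approximated with the sketch below. It is a simplification built on several assumptions that do not come from the slides: a three-level binary tree over a toy 8-bit identifier space, explicit integer identifiers instead of hashes, a dict-backed `ToyDHT`, and the rule "keep climbing while you are the lowest identifier in your interval", which is inferred from the diagrams rather than stated. It also ignores failures and entry expiry.

```python
from collections import defaultdict

KEY_BITS = 8   # toy identifier space [0, 2**KEY_BITS); assumption
LEVELS = 3     # L0 (root) through L2, as drawn in the diagrams
BRANCH = 2     # each interval splits in two at the next level; assumption


class ToyDHT:
    """Dict-backed stand-in for the put/get service."""

    def __init__(self):
        self._store = defaultdict(set)

    def put(self, key, value):
        self._store[key].add(value)

    def get(self, key):
        return sorted(self._store[key])


def tree_key(namespace, level, ident):
    # DHT key of the tree node whose interval at this level contains ident.
    width = 2 ** KEY_BITS // BRANCH ** level
    return (namespace, level, ident // width)


def join(dht, namespace, ident, node):
    # Start at the lowest level and climb toward the root.
    for level in range(LEVELS - 1, -1, -1):
        key = tree_key(namespace, level, ident)
        dht.put(key, (ident, node))
        lowest_ident = dht.get(key)[0][0]
        if lowest_ident != ident:
            break  # someone lower already represents this interval higher up


def lookup(dht, namespace, ident):
    # Walk up until some level's interval contains a successor of ident.
    for level in range(LEVELS - 1, -1, -1):
        members = dht.get(tree_key(namespace, level, ident))
        successors = [m for m in members if m[0] >= ident]
        if successors:
            return successors[0][1]
    # No successor even at the root: wrap around to the lowest member.
    root = dht.get(tree_key(namespace, 0, ident))
    return root[0][1] if root else None
```

Joining A, B, C, D, E with identifiers 10, 40, 100, 150, 220 (in that order) reproduces the tree in the diagrams: L0 holds A and D, L1 holds A, C and D, E. A lookup for 20 then finds B at L2, a lookup for 50 finds C at L1, and a lookup for 110 finds D at L0, mirroring the three cases for k1, k2, and k3.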