
“Tuple Space” Scalability: Use a DHT!


Presentation Transcript


  1. “Tuple Space” Scalability: Use a DHT! Antony Rowstron Microsoft Research Cambridge

  2. Linda-like languages: Looking back to the early days • Originally proposed for parallel processing • Shared memory versus message passing • Simple: in, out, rd, (inp, rdp) • Complex compile-time analysis • Closed systems • Translate “shared memory” to “message passing” • Challenge: performance better than message passing • Limited success

  3. Linda: a paradigm for open systems “The second wave” • Exploit temporal and spatial separation • Many different extensions proposed • New primitives • Multiple tuple spaces • Access-control • Open systems • New run-time systems required • Scale: • Networks of Workstations through to the Internet

  4. Linda runtimes: An overview [Figure: one process calls out(<10, “hello”>), another calls in(<?int, “hello”>); the tuple <10, “hello”> passes through the Linda runtime]
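The out/in exchange in the overview can be sketched as a tiny in-memory tuple space. This is only an illustration under stated assumptions: it runs in a single process, the class name `TupleSpace` is made up here, and a Python type such as `int` stands in for a formal field such as `?int`. Only the non-blocking `inp` and `rdp` are shown; the blocking `in` and `rd` would wait instead of returning `None`.

```python
class TupleSpace:
    """Hypothetical single-process tuple space (illustration only)."""

    def __init__(self):
        self._tuples = []

    def out(self, tup):
        """Insert a tuple into the space."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(template, tup):
        """A template matches when arity, types and actual values agree."""
        if len(template) != len(tup):
            return False
        for want, have in zip(template, tup):
            if isinstance(want, type):
                # Formal field (e.g. ?int): only the type has to agree.
                if type(have) is not want:
                    return False
            elif type(want) is not type(have) or want != have:
                # Actual field: type and value have to agree.
                return False
        return True

    def rdp(self, template):
        """Non-blocking read: return a matching tuple or None."""
        for tup in self._tuples:
            if self._matches(template, tup):
                return tup
        return None

    def inp(self, template):
        """Non-blocking take: like rdp, but removes the tuple."""
        tup = self.rdp(template)
        if tup is not None:
            self._tuples.remove(tup)
        return tup


space = TupleSpace()
space.out((10, "hello"))
print(space.rdp((int, "hello")))   # reads the tuple, leaves it in place
print(space.inp((int, "hello")))   # removes the tuple
print(space.inp((int, "hello")))   # nothing left to match
```

Matching here is a linear scan, which is fine for a sketch but is exactly the part a distributed runtime has to replace with some placement scheme.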

  5. Linda runtimes I [Figure: in(<?int, “hello”>) and out(<10, “hello”>); the tuple <10, “hello”> appears three times in the diagram]

  6. Linda runtimes II [Figure: the template <?int, “hello”> and the tuple <10, “hello”> are each passed through H(<int,string>), i.e. hashed on the type signature <int, string>]

  7. The main challenge: Hashing [Figure: the tuple <10, “hello”> passed through H(<int,string>)]

  8. The challenge: The hashing issue • Distributing the load needs a good function • Uniform distribution • But, Linda: • Tuples and templates • Open systems: resorts to types only • Small set of input symbols for hash function • <?int>,<?bool>,<?float>,<?string>… etc • 1-element templates map to ~ 10 unique keys • 2-element templates map to ~ 100 unique keys • Outcome: Difficult to implement scalable runtimes
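The key-count argument on this slide can be checked with a short sketch. It assumes an alphabet of about ten field types; the slide names only int, bool, float and string, so the other type names below are placeholders. It also assumes SHA-1 as the hash over the signature string, which is a choice made for this example only.

```python
import hashlib
from itertools import product

# Assumed type alphabet. The slide lists int, bool, float, string "etc";
# the remaining names are placeholders to reach roughly ten types.
TYPES = ["int", "bool", "float", "string", "char",
         "long", "double", "byte", "short", "tuple"]


def signatures(arity):
    """All type-only signatures for templates of the given arity."""
    return list(product(TYPES, repeat=arity))


def nodes_used(arity, num_nodes):
    """Number of distinct nodes that can ever be chosen when the hash
    input is only the type signature."""
    chosen = set()
    for sig in signatures(arity):
        digest = hashlib.sha1(",".join(sig).encode("utf-8")).hexdigest()
        chosen.add(int(digest, 16) % num_nodes)
    return len(chosen)


print(len(signatures(1)), len(signatures(2)))
print(nodes_used(2, 10000))
```

However many nodes the system has, a 2-element template can never be spread over more nodes than there are signatures, which is the scalability limit the slide points at.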

  9. Get rid of the hash function • Move the hash function into the application • E.g. Distributed Hash Table • Simple API: • Put(key, value) • Get(key) • Looks very familiar (in,out) • Outcome: Possible to implement scalable runtimes

  10. DHTs: Peeking under the covers • Large id space • NodeIds picked randomly from space • Keys picked randomly from space • Key is managed by its root node: • Live node with id closest to the key [Figure: the id space, showing nodeIds, a key, and the root node for that key]
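The root-node rule ("live node with id closest to the key") can be sketched as follows. The sketch assumes the id space wraps around like a ring and uses a deliberately small 16-bit space for readability; both are assumptions of this example, as are the names `ID_BITS`, `circular_distance` and `root_node`.

```python
ID_BITS = 16                # assumed, deliberately small id space
ID_SPACE = 1 << ID_BITS


def circular_distance(a, b):
    """Distance between two ids, assuming the id space wraps around."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def root_node(key, live_node_ids):
    """Live node whose id is closest to the key (ties go to the lower id)."""
    return min(live_node_ids, key=lambda n: (circular_distance(n, key), n))


nodes = [100, 30000, 65000]
print(root_node(200, nodes))
print(root_node(40000, nodes))
print(root_node(65500, nodes))   # wraps around the end of the id space
```

If a node leaves, calling `root_node` with the remaining live ids gives the new owner of the key, which is why the rule is stated in terms of live nodes.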

  11. Node routing state • Topology aware routing table • NodeIds and keys in some base 2^b (e.g., 4) • Prefix constraints on nodeIds for each slot • Pick closest node satisfying slot constraints [Figure: routing table and leaf set for nodeId 203231]

  12. Routing • Prefix matching: each hop resolves an extra key digit [Figure: route(m, 323310) in the id space, with nodeIds 203231, 313221, 322021 and 323211 and the key 323310 marked]
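Prefix matching can be illustrated with the ids that appear on this slide. The sketch below is a simplification: it assumes every node knows all other nodes, whereas a real node would consult its own routing table and leaf set. The hop order it prints follows from the prefix rule, not from the figure itself, and the function names are made up for this example.

```python
def shared_prefix_len(a, b):
    """Number of leading digits two ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current, key, known_nodes):
    """Pick a node that matches the key in at least one more digit than
    the current node does; None if no such node is known."""
    need = shared_prefix_len(current, key) + 1
    candidates = [n for n in known_nodes if shared_prefix_len(n, key) >= need]
    if not candidates:
        return None
    # Take the smallest step forward, so each hop resolves one extra digit.
    return min(candidates, key=lambda n: shared_prefix_len(n, key))


def route(start, key, known_nodes):
    """Follow next_hop until no node improves on the prefix match."""
    path = [start]
    while True:
        hop = next_hop(path[-1], key, known_nodes)
        if hop is None:
            return path
        path.append(hop)


nodes = ["203231", "313221", "322021", "323211"]
print(route("203231", "323310", nodes))
```

Because the shared prefix grows by at least one digit per hop, the number of hops is bounded by the number of digits in an id.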

  13. Example: DNS service • Linda: • Add DNS entry: • Out(“msrc401.europe.microsoft.com”, 157.58.16.56) • Lookup DNS entry: • Rd(“msrc401.europe.microsoft.com”, ?IP address) • DHTs • Add DNS entry: • Put(SHA1(“msrc401.europe.microsoft.com”), 157.58.16.56) • Lookup DNS entry: • IP Address = Get(SHA1(“msrc401.europe.microsoft.com”))

  15. Example: DNS service • Linda: • Add DNS entry: • Out(“msrc401.europe.microsoft.com”, 157.58.16.56) • Lookup DNS entry: • In(“msrc401.europe.microsoft.com”, ?IP address) • DHTs • Add DNS entry: • Put(SHA1(“msrc401.europe.microsoft.com”), 157.58.16.56) • Lookup DNS entry: • IP Address = Get(SHA1(“msrc401.europe.microsoft.com”))
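The DHT half of the DNS example can be sketched with a dictionary standing in for the distributed store. `ToyDHT` is a made-up, single-process placeholder for a real DHT and only mirrors the Put/Get shape from the slides; the SHA-1 call is the standard `hashlib` one.

```python
import hashlib


class ToyDHT:
    """Hypothetical stand-in for a DHT: a local dict with Put/Get."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)


def sha1_key(name):
    """The application, not the runtime, turns the host name into a key."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()


dht = ToyDHT()

# Add DNS entry.
dht.put(sha1_key("msrc401.europe.microsoft.com"), "157.58.16.56")

# Lookup DNS entry.
ip_address = dht.get(sha1_key("msrc401.europe.microsoft.com"))
print(ip_address)
```

The lookup only works because the caller knows the exact host name; a template such as (?string, 157.58.16.56) has no single key to hash.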

  16. The Drawback: Nothing comes free! • Range/complex queries • But in, out, rd (and inp, rdp) do not really do enumeration • E.g. find the host names associated with IP addresses 192.10.10.1 to 192.10.10.254 • Vanilla Linda: for (int i = 1; i < 255; i++) { IPAddress addr = new IPAddress(192.10.10.i); Tuple t = rdp(?string, addr); } • Extensions: Tuple[] tuples = fetch(?string, 192.10.10.1 -> 192.10.10.254);
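The cost of the range query can be made concrete with a sketch of the enumeration approach over a key/value store. It assumes a reverse mapping keyed by SHA1 of the IP address, which is an assumption of this example rather than something the slides specify; the host names and the `hosts_in_range` helper are likewise made up.

```python
import hashlib
import ipaddress


def key_for(ip):
    """Assumed keying scheme: SHA-1 of the dotted-quad address."""
    return hashlib.sha1(str(ip).encode("utf-8")).hexdigest()


def hosts_in_range(store, first, last):
    """Enumerate every address in [first, last], one lookup per address."""
    lo = int(ipaddress.IPv4Address(first))
    hi = int(ipaddress.IPv4Address(last))
    found = {}
    lookups = 0
    for n in range(lo, hi + 1):
        ip = ipaddress.IPv4Address(n)
        lookups += 1
        host = store.get(key_for(ip))
        if host is not None:
            found[str(ip)] = host
    return found, lookups


# A plain dict stands in for the DHT; the entries are invented.
store = {
    key_for("192.10.10.7"): "host-a.example",
    key_for("192.10.11.1"): "host-b.example",   # outside the queried range
}

found, lookups = hosts_in_range(store, "192.10.10.1", "192.10.10.254")
print(found, lookups)
```

Each address costs one lookup whether or not an entry exists, so the work grows with the size of the range rather than with the number of matches.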

  17. Questions? • Question: “Should you be using a DHT?” • Sub-questions: • “Do we need an implicit hash function?” • “Do we need complex querying/matching?” • “Do we need great scalability?”
