
Distributed Hash Tables

Abdo Achkar, Villanova University, 11-22-05


  1. Distributed Hash Tables. Abdo Achkar, Villanova University, 11-22-05

  2. Overview
     • Intro to hash tables
     • Distributed hash tables
     • IDA encoding
     • Chord protocol
     • DHash API

  3. Hash tables
     • Definition:
       • An array of pointers to linked lists
       • Has a hash function

  4. Hash tables: the data structure
     • An array of pointers to linked lists of a type T, where T is the type of the data structure that contains both the key and the data: T = typeof(<Key, Data>).
     • [Figure: an array of pointers, each pointing to a linked list of <Key, Data> nodes]

  5. Hash tables: the hash function
     • Takes some data as input and returns an integer based on the data.
     • Ex:
         int hash(const char* data) {
             int sum = 0;
             // cast to unsigned char so the sum cannot go negative
             for (int i = 0; data[i] != '\0'; i++)
                 sum = (sum + (unsigned char)data[i]) % _tableSize;
             return sum;
         }
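The structure from slide 4 and the hash function from slide 5 can be combined into a small chained hash table. The following is a minimal sketch, not the presenter's code: the class name `ChainedHashTable` and its `put`/`find` methods are hypothetical, and keys and data are both assumed to be strings.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical chained hash table: a vector of buckets, each a linked list
// of <Key, Data> pairs (the type T from slide 4).
class ChainedHashTable {
public:
    explicit ChainedHashTable(std::size_t tableSize) : buckets_(tableSize) {
        assert(tableSize > 0);
    }

    // Additive hash as on slide 5: sum of the key's bytes modulo table size.
    std::size_t hash(const std::string& key) const {
        std::size_t sum = 0;
        for (unsigned char c : key) sum = (sum + c) % buckets_.size();
        return sum;
    }

    // Insert a new pair, or overwrite the data if the key already exists.
    void put(const std::string& key, const std::string& data) {
        auto& chain = buckets_[hash(key)];
        for (auto& entry : chain) {
            if (entry.first == key) { entry.second = data; return; }
        }
        chain.emplace_back(key, data);
    }

    // Walk the chain in the key's bucket; returns nullptr if the key is absent.
    const std::string* find(const std::string& key) const {
        const auto& chain = buckets_[hash(key)];
        for (const auto& entry : chain) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

private:
    std::vector<std::list<std::pair<std::string, std::string>>> buckets_;
};
```

Because the additive hash ignores byte order, keys such as "ab" and "ba" land in the same bucket. The linked list is what keeps both entries retrievable in that case.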

  6. Benefits of hash tables
     • Average lookup time of O(1)
     • Easy to implement (C++ source)
     • Drastically improves lookup performance when working with files.

  7. Distributed hash tables
     • Definition: a hash table whose keys and data are handled by many nodes in a network.
     • [Figure: Node 0 and Node 1, each holding keys and fragments of data]

  8. Why is DHash important?
     • Load balance
     • Decentralization
     • Scalability
     • Availability

  9. IDA (Information Dispersal Algorithm)
     • Splits a block of data of size s into f fragments, each of size s/k.
     • Any k distinct fragments are sufficient to reconstruct the original block.
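The "any k of f fragments" property can be seen in a toy erasure code with k = 2 and f = 3. This is not IDA, which relies on finite-field arithmetic; it is a plain XOR-parity scheme swapped in because it is the smallest code with the same property. The function names `toyEncode`, `toyDecode`, and `xorStrings` are hypothetical, and the sketch assumes blocks of even length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Byte-wise XOR of two equal-length strings.
static std::string xorStrings(const std::string& x, const std::string& y) {
    assert(x.size() == y.size());
    std::string out(x.size(), '\0');
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = static_cast<char>(x[i] ^ y[i]);
    return out;
}

// Toy code, NOT IDA: k = 2, f = 3, each fragment has size s/k = s/2.
// Fragment 0 = first half, fragment 1 = second half, fragment 2 = XOR parity.
std::vector<std::string> toyEncode(const std::string& block) {
    assert(block.size() % 2 == 0);  // assumption: even-length blocks only
    const std::size_t half = block.size() / 2;
    const std::string a = block.substr(0, half);
    const std::string b = block.substr(half);
    return {a, b, xorStrings(a, b)};
}

// Rebuild the block from any two distinct fragments.
// i and j are the fragment indices (0, 1, or 2) of fi and fj.
std::string toyDecode(int i, const std::string& fi,
                      int j, const std::string& fj) {
    assert(i != j && i >= 0 && i < 3 && j >= 0 && j < 3);
    if (i > j) return toyDecode(j, fj, i, fi);          // normalise to i < j
    if (i == 0 && j == 1) return fi + fj;               // both halves present
    if (i == 0 && j == 2) return fi + xorStrings(fi, fj);  // b = a ^ parity
    return xorStrings(fi, fj) + fi;                     // a = b ^ parity
}
```

Losing any single fragment is harmless here, at the cost of storing f/k = 1.5 times the block size.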

  10. Choosing values for k and f
     • k and f are selected to optimize for 8192-byte blocks.
     • k = 7 creates 1170-byte fragments, which fit inside a single IP packet when combined with RPC overhead.
     • With k = 7 and f = 14, any 7 of the 14 fragments are enough to reconstruct a block.
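The arithmetic behind these values can be checked directly. The sketch below is only a calculation aid; the struct `IdaParams` and the helper names are hypothetical. Note that 8192 / 7 is not a whole number: integer division gives the slide's 1170, while rounding up so that k fragments cover the whole block gives 1171.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical parameter bundle for the calculation.
struct IdaParams {
    std::size_t blockSize;  // s, in bytes
    std::size_t k;          // fragments needed to reconstruct
    std::size_t f;          // fragments produced
};

// Fragment size s/k, rounded up so that k fragments cover the whole block.
std::size_t fragmentSize(const IdaParams& p) {
    assert(p.k > 0);
    return (p.blockSize + p.k - 1) / p.k;
}

// Total bytes stored across all f fragments.
std::size_t storedBytes(const IdaParams& p) {
    return p.f * fragmentSize(p);
}

// Number of fragments that can be lost while the block stays recoverable.
std::size_t tolerableLosses(const IdaParams& p) {
    assert(p.f >= p.k);
    return p.f - p.k;
}
```

For s = 8192, k = 7, f = 14, this gives 1171-byte fragments, 16394 bytes stored in total (about twice the block size), and tolerance for 7 lost fragments.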

  11. Chord protocol
     • Implements a hash-like lookup operation that maps 160-bit data keys to hosts.
     • Assigns hosts identifiers from the same 160-bit space as the keys.
     • The space can be viewed as a circular linked list sorted by identifier.

  12. Chord (cont'd)
     • Each node knows the identity of its successor (IP address, Chord identifier, and synthetic coordinates).
     • Updates its successor list when a node
       • Joins
       • Exits
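The identifier circle and successor lists can be sketched with an ordered map. This is a simplified, single-process view under stated assumptions: the class `Ring` and its methods are hypothetical, identifiers are 64-bit rather than 160-bit, and the whole ring is visible at once, whereas a real Chord node knows only some of its peers and routes lookups through them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical global view of a Chord-style ring: node id -> host address.
class Ring {
public:
    // A node joins the ring under the given identifier.
    void join(std::uint64_t id, const std::string& addr) {
        assert(!addr.empty());
        nodes_[id] = addr;
    }

    // A node exits the ring.
    void leave(std::uint64_t id) { nodes_.erase(id); }

    // The first `count` nodes whose identifiers equal or follow `key`,
    // wrapping around the circle; fewer if the ring has fewer nodes.
    std::vector<std::uint64_t> lookup(std::uint64_t key, std::size_t count) const {
        std::vector<std::uint64_t> succs;
        if (nodes_.empty()) return succs;
        auto it = nodes_.lower_bound(key);
        while (succs.size() < count && succs.size() < nodes_.size()) {
            if (it == nodes_.end()) it = nodes_.begin();  // wrap around
            succs.push_back(it->first);
            ++it;
        }
        return succs;
    }

private:
    std::map<std::uint64_t, std::string> nodes_;  // kept sorted by identifier
};
```

Because `lookup` recomputes the list from the current map, a join or exit is reflected in the next call. A real node would instead repair its stored successor list as it learns of the change.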

  13. Chord API

  14. HTab API

  15. Block insert: put(Key k, Block b)
      void put(k, b)
      {
          // place one fragment on each successor
          frags = IDAencode(b);
          succs = lookup(k, 14);
          for i from 0 to 13
              send(succs[i].ipaddr, k, frags[i]);
      }

  16. Block fetch: get(Key k)
      Block get(k)
      {
          // collect fragments from the successors
          frags = [];
          succs = lookup(k, 7);        // look up at least 7 successors
          sort_by_latency(succs);
          for (i = 0; i < #succs && i < 14; i++) {
              // download fragment
              <ret, data> = download(k, succs[i]);
              if (ret == OK)
                  frags.push(data);
              // decode fragments to recover block
              <ret, block> = IDAdecode(frags);
              if (ret == OK)
                  return (SHA-1(block) != k) ? FAILURE : block;
              // out of successors: extend the list from the last one tried
              if (i == #succs - 1) {
                  newsuccs = get_successor_list(succs[i]);
                  sort_by_latency(newsuccs);
                  succs.append(newsuccs);
              }
          }
          return FAILURE;
      }

  17. Questions?

  18. References
     • C++ In Action (Bartosz Milewski)
     • Robust and Efficient Data Management for a Distributed Hash Table (Josh Cates, MS thesis, MIT)
     • Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan, MIT)
     • Building Peer-to-Peer Systems With Chord, a Distributed Lookup Service (Frank Dabek, Emma Brunskill, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan)
     • Distributed Hash Tables: Architecture and Implementation, http://www.usenix.org/events/osdi2000/full_papers/gribble/gribble_html/node4.html
