“Umbrella”: A novel fixed-size DHT protocol

A.D. Sotiriou

Overview • Novel distributed hash table architecture • Supports key publishing and retrieval on top of an overlay network for content distribution • Efficient algorithms based on a distributed routing table of constant size for each node • Minimize traffic load

Related Work • Plaxton, Rajaraman and Richa • Algorithm was not developed for P2P systems • Based on the ground rule of comparing one byte at a time • Required knowledge of the latencies between all nodes • Tapestry • Variation of the Plaxton algorithm • Adjusted for P2P systems • Routing table of β·log_β N neighbors, search in at most log_β N steps • Chord applied a different approach • Placed nodes in a circular space • Maintained information only for a number of successor and predecessor nodes through a finger table • Finger table of O(log N) size • CAN built further on Pastry’s variation • Implemented a DHT in a d-dimensional Cartesian space based on a d-torus • Repeatedly divided the space and distributed it amongst nodes • Each node maintained information about its neighbors • Constant O(d) table but required O(d·N^(1/d)) steps for lookups

Structure Overview • Creation of an overlay network • All joining nodes are identified by a unique code • SHA-1 on the combination of IP address and computer name • Main objective of the architecture • Insert and retain nodes in a simple and well-structured manner • Allow querying and fetching of content • Efficient • Fault-tolerant • Retain up-to-date information about a limited, constant number of neighboring nodes
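As a rough illustration of the identifier scheme above, the following sketch derives a 40-hex-digit node identifier with SHA-1. The function name `node_id`, the `|` separator and the UTF-8 encoding are assumptions made for this example; the slides only say that SHA-1 is applied to a combination of IP and computer name.

```python
import hashlib


def node_id(ip: str, hostname: str) -> str:
    """Return a 40-hex-digit identifier for a node.

    Assumption: IP and computer name are joined with '|' and encoded
    as UTF-8 before hashing; the slides do not specify the encoding.
    """
    data = f"{ip}|{hostname}".encode("utf-8")
    return hashlib.sha1(data).hexdigest()
```

Each hexadecimal digit takes one of 16 values, which fits naturally with a 16-ary tree in which one digit selects one child.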

Structure Overview • Form of a 16-ary tree • Each node is placed in a hierarchy • 1 parent node • 16 child nodes • Each node operates autonomously • Further links for fault tolerance • Each level n holds at most 16^(n+1) nodes • The relation between a parent node at level n and a child node (level n+1): • The first n+1 digits of the parent’s identifier are equal to the corresponding digits of the child’s • The (n+2)th digit of the child’s identifier determines the child’s position in the parent’s child list
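The parent/child relation above can be sketched as two small checks on hexadecimal identifier strings. The function names are hypothetical, and identifiers are assumed to be strings of hex digits indexed from zero, so the (n+2)th digit is at index n+1.

```python
def is_parent_of(parent_id: str, parent_level: int, child_id: str) -> bool:
    """True if the first n+1 digits match, where n is the parent's level."""
    k = parent_level + 1
    return child_id[:k] == parent_id[:k]


def child_slot(parent_level: int, child_id: str) -> int:
    """Position (0..15) in the parent's child list: the (n+2)th digit."""
    return int(child_id[parent_level + 1], 16)
```

For a parent `a3f0` at level 1, a child `a37c` shares the first two digits `a3`, and its third digit `7` places it in slot 7.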

Routing Table • Three sets of neighborhood nodes • Basic: main table for routing • Upper: allows routing to nodes of a higher level (when the parent node is unreachable) • Lower: allows routing to nodes of a lower level (when child nodes fail) • Each node is responsible for updating or repairing its routing table when nodes • Enter • Leave • Fail to communicate • Maximum steps required: O(log_b N)
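To make the logarithmic bound concrete, the sketch below counts how many tree levels are needed to hold N nodes. It assumes that b is the tree arity (16) and that every level is completely full, holding 16^(n+1) nodes at level n. This is a best case: with hashed identifiers a real tree would be less evenly filled. The function names are hypothetical.

```python
def level_capacity(level: int, arity: int = 16) -> int:
    """Maximum number of nodes at a level: arity^(level+1)."""
    return arity ** (level + 1)


def levels_needed(num_nodes: int, arity: int = 16) -> int:
    """Smallest number of levels whose combined capacity holds num_nodes,
    assuming levels are filled completely from level 0 downwards."""
    levels, total = 0, 0
    while total < num_nodes:
        total += level_capacity(levels, arity)
        levels += 1
    return levels
```

Under these assumptions, 16 nodes fit in one level, 272 in two and 4368 in three, so the depth, and with it the path length between two nodes, grows with the logarithm of the population.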

Main Algorithms – Insert • The new node contacts an already connected node and issues a request for insertion • The established node checks whether the first n+1 digits of the new identifier match its own, where n is the level at which the established node resides • If not, the insertion message is forwarded to the node’s parent • If yes, the message is forwarded to the child whose (n+2)th digit matches that of the new node • If no such child exists, the new node is placed as a child of the current node • The new node is informed of its new neighbors and vice versa
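The insertion steps above can be sketched as a loop over an in-memory tree. This is a minimal illustration, not the authors' implementation: the `Node` class and `insert` function are hypothetical, message passing is replaced by direct object references, and a single root at level -1 is assumed so that the prefix check at the top of the tree always succeeds (the slides do not say how the level-0 nodes are connected). Neighbor notification is omitted.

```python
class Node:
    def __init__(self, ident, level=-1, parent=None):
        self.ident = ident      # hex-digit identifier string
        self.level = level      # assumed root sits at level -1
        self.parent = parent
        self.children = {}      # slot digit -> Node


def insert(contact, new_id):
    """Place a new node, starting from any already connected node."""
    node = contact
    while True:
        k = node.level + 1                  # digits that must match
        if new_id[:k] != node.ident[:k]:
            node = node.parent              # mismatch: forward upwards
            continue
        digit = new_id[k]                   # the (n+2)th digit
        child = node.children.get(digit)
        if child is None:                   # free slot: attach here
            new_node = Node(new_id, node.level + 1, node)
            node.children[digit] = new_node
            return new_node
        node = child                        # slot taken: forward downwards
```

The request climbs only until it reaches a node whose prefix matches and then descends one digit per hop, which is where the logarithmic hop count comes from.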

Main Algorithms – Publish • If the first n+1 digits of the content’s identifier differ from those of the node, the publish message is forwarded to the parent node • If they match, it is forwarded to the child whose (n+2)th digit matches • If no such child exists, the node publishes the content itself

Main Algorithms – Search • The node first checks for the keyword in its list of published keywords • If it exists, the search terminates • If not, it checks whether the first n+1 digits are identical to those of its own identifier • If not, the message is forwarded to its parent • If yes, it is forwarded to the child whose (n+2)th digit matches • If no such child exists, the search fails
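Publishing and searching follow the same forwarding rule as insertion, so both can be sketched on one small in-memory tree. As before, this is an illustrative sketch: `Node`, `publish` and `search` are hypothetical names, a single root at level -1 is assumed, and each node's published keywords are assumed to be stored in a dictionary keyed by content identifier.

```python
class Node:
    def __init__(self, ident, level=-1, parent=None):
        self.ident, self.level, self.parent = ident, level, parent
        self.children = {}      # slot digit -> Node
        self.keywords = {}      # content identifier -> keyword
        if parent is not None:
            parent.children[ident[level]] = self


def publish(contact, content_id, keyword):
    """Store the keyword at the deepest node on the matching path."""
    node = contact
    while True:
        k = node.level + 1
        if content_id[:k] != node.ident[:k]:
            node = node.parent                      # forward to parent
            continue
        child = node.children.get(content_id[k])
        if child is None:                           # no matching child
            node.keywords[content_id] = keyword
            return node
        node = child                                # forward to child


def search(contact, content_id):
    """Return the node holding the content, or None if the search fails."""
    node = contact
    while True:
        if content_id in node.keywords:             # found locally
            return node
        k = node.level + 1
        if content_id[:k] != node.ident[:k]:
            node = node.parent
            continue
        child = node.children.get(content_id[k])
        if child is None:
            return None                             # search fails
        node = child
```

Because every node checks its own keyword list before forwarding, content published at a node is still found after new nodes are later inserted below it: the holder stays on the search path.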

Main Algorithms – Departure • If the node has no children, all of its keywords are forwarded to its parent and it informs all its neighbors of its departure • If it has children, it randomly picks one and copies all of its neighborhood and keyword information to it before departing • The chosen child moves up a level and substitutes the departing node • If the chosen child has children of its own, the previous step is repeated recursively until a node with no children is reached; the first step is then executed, ending the algorithm
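The departure steps above can be sketched recursively. In this illustration each `Node` object stands for a position in the tree whose occupant (identifier and keywords) can change, so "moving a child up" becomes copying the child's identity into the vacated position and then vacating the child's old position in the same way. The names `Node` and `depart`, the root at level -1 and the in-memory links are all assumptions of the sketch.

```python
import random


class Node:
    def __init__(self, ident, level=-1, parent=None):
        self.ident, self.level, self.parent = ident, level, parent
        self.children = {}      # slot digit -> Node
        self.keywords = {}      # content identifier -> keyword
        if parent is not None:
            parent.children[ident[level]] = self


def depart(pos, rng=random):
    """Remove the occupant of position `pos` from the tree."""
    if not pos.children:
        # First step: hand keywords to the parent and unlink.
        if pos.parent is not None:
            pos.parent.keywords.update(pos.keywords)
            del pos.parent.children[pos.ident[pos.level]]
        return
    heir = rng.choice(list(pos.children.values()))
    # The heir takes over this position. It shares the leading digits
    # with the departing node, so the slot in the parent stays valid.
    pos.ident = heir.ident
    pos.keywords.update(heir.keywords)  # departing node's keys plus the heir's
    heir.keywords = {}
    depart(heir, rng)                   # the heir's old position is now vacant
```

The recursion ends at a childless position, which is simply unlinked, so exactly one position disappears from the tree per departure.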

Enhanced Algorithms • System liable to sudden node departures • Voluntary departure without calling appropriate mechanism • Sudden departures due to client errors • Network disconnections • Treat all of the above cases in the same manner • Changes in the algorithms already presented • Allow the system to bypass node failures • Most changes are based on using the upper and lower set • The upper set is utilized to forward messages to nodes of a higher level • The lower set for nodes on a lower level

Enhanced Algorithms – Parent Failure • Forward requests, in turn, to: • The parent’s parent node (field Up2 on the upper set) • The node to the right of the parent node (field Right2 on the upper set) • The node to the left of the parent node (field Left2 on the upper set) • Whichever of the above succeeds first terminates the mechanism

Enhanced Algorithms – Child Failure • Forward requests, in turn, to: • One of the child’s children (field Umbrella2 on the lower set) • The node on the right of the child (field Umbrella on the basic set) • The node on the left of the child (field Umbrella on the basic set) • A child of the node right of the issuing node (field Right3) • A child of the node left of the issuing node (field Left3) • Whichever of the above succeeds first terminates the mechanism
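Both failure cases above amount to trying a fixed list of routing-table fields in order and stopping at the first that answers. A minimal sketch, assuming the routing table is a dictionary from field name to node address and `send` is a hypothetical callable that returns True when delivery succeeds. The keys `UmbrellaRight` and `UmbrellaLeft` are placeholder labels for the two entries that the slides both call "field Umbrella on the basic set".

```python
from typing import Callable, Dict, Optional, Sequence

# Order used when the parent is unreachable (fields from the upper set).
PARENT_FALLBACK = ("Up2", "Right2", "Left2")

# Order used when a child is unreachable. "UmbrellaRight"/"UmbrellaLeft"
# are hypothetical labels for the Umbrella entries right and left of the child.
CHILD_FALLBACK = ("Umbrella2", "UmbrellaRight", "UmbrellaLeft",
                  "Right3", "Left3")


def forward_with_fallback(table: Dict[str, str],
                          order: Sequence[str],
                          send: Callable[[str], bool]) -> Optional[str]:
    """Try each field in order; return the first field that accepted
    the message, or None if every candidate failed."""
    for field in order:
        target = table.get(field)
        if target is not None and send(target):
            return field
    return None
```

Returning the field name rather than the address lets the caller record which entry is still alive, which could be useful when a repair is triggered afterwards.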

Repair Mechanism • We have designed a repair mechanism • Invoked whenever a node failure is detected • The algorithm utilizes the delete algorithm in order to repair a child failure • All failures can be transformed into a child failure by contacting nodes in the neighborhood table and forwarding a repair message • Once the appropriate node is reached and informed of the child failure, a variation of the delete algorithm is invoked in order to repair the failure • Substituting the failed node with one of its children • Deleting it if none is available • Each node is responsible for checking its neighborhood table periodically • Issuing ping messages to all node entries • Invoking the repair mechanism whenever a failure is detected • This mechanism substantially increases the system’s stability and fault tolerance

Repair Mechanism • Check if the node had children • If it didn’t have any then just contact all of its neighbors by utilizing the neighborhood table and inform them of the new structure • If it did then one of them must be in the Umbrella2 entry • Pick a random entry in the Umbrella2 field and inform all neighbors of the change • The new child is informed and gathers the appropriate new neighborhood settings from nearby nodes
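The repair steps above can be sketched from the point of view of the parent that has detected a failed child. This is an illustration under several assumptions: `children` maps slot digits to child addresses, `umbrella2` maps each slot to the grandchildren known under that child (standing in for the Umbrella2 field), and the function names are hypothetical. Informing the neighbors and letting the promoted node collect its new neighborhood are left out.

```python
import random


def detect_failures(entries, ping):
    """Return the table entries that did not answer a ping."""
    return [entry for entry in entries if not ping(entry)]


def repair_child(children, slot, umbrella2, rng=random):
    """Repair the failed child in `slot`; return its replacement or None."""
    grandchildren = umbrella2.get(slot, [])
    if not grandchildren:
        # The failed node had no children: simply delete it.
        children.pop(slot, None)
        umbrella2.pop(slot, None)
        return None
    # Otherwise promote a random Umbrella2 entry into the failed slot.
    heir = rng.choice(grandchildren)
    children[slot] = heir
    # The remaining entries now sit directly below the promoted node.
    umbrella2[slot] = [g for g in grandchildren if g != heir]
    return heir
```

A node could run `detect_failures` over its neighborhood table once per repair period and call `repair_child` for each failed child entry.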

Simulation Results • Extended the NeuroGrid simulator • Implemented the Umbrella algorithms • Two sets of results • Without repair mechanism • With repair mechanism • Variable network size • Random node failures

No Failures - Hops • Prove the integrity of our design under normal conditions • Conducted simulations with node populations varying from 10 nodes up to 6000 nodes • Investigated the number of hops required for a successful insertion and lookup with a varying population of nodes • Number of hops grows logarithmically with the node population in all mechanisms

No Failures - Messages • Investigated the overall traffic generated by our architecture • Total messages per request • Low number of messages exchanged • Due to the small number of hops required for each successful request • Also due to the limited (constant) number of neighbors maintained by each node • Total number increases linearly with the node population

Failures With No Repair • Conducted a second set of simulations to test the system’s tolerance to node failures • Progressively caused node failures from 0 up to 80% of the total node population, in steps of 5% • For a rate of up to 22% of failing nodes, the success rate is kept high (over 80%) • Slowly degrades up to a mid-point of 50% • Beyond that point our system becomes unstable and success rates drop dramatically

Failures With Repair – Success Rate • Conducted a third set of simulations with repair • Progressively caused node failures from 0 up to 80% of the total node population, in steps of 5% • 3T, 6T and 20T repair periods, where T is a constant representing communication activity • This ensures that an inactive node will not flood the network with repair messages • Repair mechanism dramatically increases the success rate • Regardless of the node population

Failures With Repair – Messages • Total number of messages was expected to increase • Remains almost constant for failure rates of up to 50% • Increases linearly from then on • In all cases, the average per node is kept reasonably low

Conclusions • Novel protocol • Based on a distributed hash table • Supports key publishing • Retrieval on top of an overlay network • For content distribution • Analysed our system • Proved its correctness and efficacy • Its main strengths are • Fixed-size routing table • Provides efficient routing in O(log_b N) steps • Even when more than half of the system’s population suddenly fails

Questions?
