Enhancing DHT Reliability with MultiRouting: Design and Implementation

Increasing the Reliability of DHTs using MultiRouting James Newell CS598ig Scattered Systems Jan 28, 2005

Distributed Hash Tables • A DHT's ID space spans many nodes • Stores objects on nodes (files, addresses, caches) • Routes a key to the correct node storing a replica (Figure: ID space split into per-node ranges 0x00 – 0x1F, 0x20 – 0x3F, 0x40 – 0x5F, 0x60 – 0x7F)
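The range-based key placement on this slide can be illustrated with a minimal sketch. The node names and the 0x00 to 0x7F ID space are assumptions taken from the four ranges listed on the slide; real DHTs such as Pastry or Kelips assign ranges differently.

```python
from bisect import bisect_right

# Hypothetical node table: each node owns the keys from its range start
# up to (but not including) the next node's range start.
RANGE_STARTS = [0x00, 0x20, 0x40, 0x60]
NODES = ["node-A", "node-B", "node-C", "node-D"]
ID_SPACE = 0x80  # assumption: IDs run from 0x00 to 0x7F, as in the slide's ranges


def responsible_node(key: int) -> str:
    """Return the node whose ID range contains the key."""
    if not 0 <= key < ID_SPACE:
        raise ValueError("key outside the ID space")
    # bisect_right counts the range starts that are <= key; the owner is the last one.
    return NODES[bisect_right(RANGE_STARTS, key) - 1]
```

Calling `responsible_node(0x25)` selects the node that owns the 0x20 – 0x3F range.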

Overcoming Churn • Routing tables are susceptible to churn, node and link failures • Standard techniques to mitigate problem • Retrying alternate routes • Adding additional replicas • Result: High complexity with mediocre improvement • Constrained to underlying ID space • Routing infrastructure is difficult to modify

MultiRouting • Build an additional layer on top of multiple DHT substrates • Independent replica placement • Differing routing behavior • Predict which substrate will most likely succeed • Adaptive properties to increase availability during transient failures • Increased lookup performance by exploiting opposing DHT strengths

Design Description • User application interfaces with MultiRouter API • Multiple underlying DHTs • Transparent to application • Run concurrently and independently • Customizable DHT combinations • “Plug-and-Play” ensures compatibility with traditional networks (Figure: layered stack, top to bottom: User App, MultiRouter, DHT 1 / DHT 2 / DHT 3, Physical Layer)

MultiRouter User Operations • Simple user operations • Join network • Insert object key • Lookup object key • Remove object key • Join: the MultiRouter simultaneously joins all networks; once the join is finished, it can handle inserts and queries • Insert: the MultiRouter replicates object keys on all DHTs to improve availability • Lookup: the MultiRouter uses the DHT(s) that it predicts will perform best
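The four user operations can be sketched as a thin layer over several substrates. This is an illustrative sketch only: `InMemoryDHT`, `MultiRouter`, and the `choose` callback are hypothetical names, the substrates are plain dictionaries rather than real DHTs, and the joins run one after another here although the slide describes them as simultaneous.

```python
class InMemoryDHT:
    """Hypothetical stand-in for a real substrate (not Pastry or Kelips)."""

    def __init__(self, name):
        self.name = name
        self.joined = False
        self.store = {}

    def join(self):
        self.joined = True

    def insert(self, key, value):
        self.store[key] = value

    def lookup(self, key):
        return self.store.get(key)

    def remove(self, key):
        self.store.pop(key, None)


class MultiRouter:
    """Sketch of the join / insert / lookup / remove operations."""

    def __init__(self, dhts, choose=None):
        self.dhts = list(dhts)
        # choose(key, dhts) returns the DHT(s) predicted to perform best.
        # Assumption: with no predictor supplied, every DHT is tried in order.
        self.choose = choose or (lambda key, dhts: dhts)

    def join(self):
        for dht in self.dhts:  # sequential here; concurrent in the described design
            dht.join()

    def insert(self, key, value):
        for dht in self.dhts:  # replicate the object key on all DHTs
            dht.insert(key, value)

    def lookup(self, key):
        for dht in self.choose(key, self.dhts):
            value = dht.lookup(key)
            if value is not None:
                return value
        return None

    def remove(self, key):
        for dht in self.dhts:
            dht.remove(key)
```

Because inserts are replicated on every substrate, a lookup can still succeed when one substrate has lost the key.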

MultiRouter Lookups • How does the MultiRouter decide which DHT(s) to use? • Previous object metrics are maintained in a StatsTable • Stats are entered into a cost function $C_{DHT}(x)$ • A set of rules R interprets the results of the cost functions • Each rule returns a Boolean that indicates whether a DHT should be included in the set of DHT(s) to use

Formal Notation • Given an array of statistical data $d_i$ • $M$ is a set of metric functions, $(m_1, m_2, \ldots, m_n) \in M$, where $m_i(d_i, t) \to \mathbb{R}$ • The cost function for $DHT_i$ is $C_i = w_1 m_1 + w_2 m_2 + \cdots + w_n m_n$, where $w_1, \ldots, w_n$ are the metric weights • $R$ is the set of rules such that $R_i(C, C_i) \to \mathrm{Bool}$, where $C$ is the set of all cost functions • The union of all rules is the final set of DHT(s)
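A minimal sketch of this notation, under stated assumptions: the cost of a DHT is a weighted sum of its metric values, each rule is a predicate over all costs and one DHT, and the selected set is the union of the DHTs accepted by any rule. The two example rules (`rule_cheapest`, `rule_within_margin`) and the `margin` parameter are hypothetical and are not the prototype's rules.

```python
def cost(metric_values, weights):
    """Weighted sum of metric values for one DHT."""
    if len(metric_values) != len(weights):
        raise ValueError("need exactly one weight per metric")
    return sum(w * m for w, m in zip(weights, metric_values))


def rule_cheapest(all_costs, name):
    """Hypothetical rule: accept the DHT(s) with the lowest cost."""
    return all_costs[name] == min(all_costs.values())


def rule_within_margin(all_costs, name, margin=1.5):
    """Hypothetical rule: accept DHTs whose cost is within `margin` times the lowest cost."""
    return all_costs[name] <= margin * min(all_costs.values())


def select_dhts(all_costs, rules):
    """Union of the DHTs accepted by at least one rule."""
    return {name for name in all_costs
            if any(rule(all_costs, name) for rule in rules)}


# Example with two metrics per DHT and equal weights (made-up numbers).
costs = {
    "pastry": cost([4, 2], [1, 1]),
    "kelips": cost([3, 1], [1, 1]),
}
chosen = select_dhts(costs, [rule_cheapest, rule_within_margin])
```

With these made-up numbers both DHTs are selected: the second rule admits "pastry" because its cost of 6 does not exceed 1.5 times the lowest cost of 4.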

MultiRouter StatsTable • Information stored in the StatsTable is customizable to the application • Metrics are updated upon a query request or timeout • StatsTable is initially cold but can optionally be warmed up by a neighbor (Table: Example Metrics)

Implementation • Fully developed MultiRouter prototype • Java 1.4.2 • Link-layer discrete-event simulator • FreePastry derivation (including filetuple storage system) • Kelips derivation • Simple MultiRouter overlay • Objects are simple filetuples <key, address>

Network Simulator • Discrete-event, link-layer simulator • Nodes pass full messages after a certain delay • Time measured in discrete rounds (10 msec) • Routing uses an ideal shortest-path algorithm (no routing failures or queuing effects) • Transient and permanent node and link failures • Uses GT-ITM generated transit-stub topologies and an event file
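The core loop of a discrete-event simulator with round-based time can be sketched as follows. This is not the prototype's simulator: the `Simulator` class and its `send`/`run` methods are hypothetical, and topology, shortest-path routing, and failures are left out.

```python
import heapq

ROUND_MS = 10  # one simulated round, matching the 10 msec rounds on the slide


class Simulator:
    """Delivers messages in order of their scheduled round."""

    def __init__(self):
        self.now = 0      # current time, in rounds
        self._queue = []  # heap of (deliver_round, sequence, dest, message)
        self._seq = 0     # tie-breaker so equal rounds keep send order

    def send(self, delay_rounds, dest, message):
        """Schedule a full message for delivery after a delay in rounds."""
        heapq.heappush(self._queue,
                       (self.now + delay_rounds, self._seq, dest, message))
        self._seq += 1

    def run(self, handler):
        """Pop events in time order; the handler may send further messages."""
        while self._queue:
            self.now, _, dest, message = heapq.heappop(self._queue)
            handler(self, dest, message)


log = []
sim = Simulator()
sim.send(3, "node-a", "ping")
sim.send(1, "node-b", "pong")
sim.run(lambda s, dest, msg: log.append((s.now * ROUND_MS, dest, msg)))
```

The message with the shorter delay is delivered first even though it was sent second, which is the property a link-layer event queue needs.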

DHT Substrates • Pastry • Uses routing table, leaf set, neighborhood set, and file table • Pastry routes messages to the node with the closest ID to the key [in $\lceil \log_{2^b} N \rceil$ hops] • Low message overhead but higher latency and failure rate • Kelips • Uses $\sqrt{N}$ affinity groups and contacts • Actively gossips “heartbeats” to maintain up-to-date information • Constant time lookup and high fault-tolerance, but high message overhead and slow steady-state
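To make the two scaling terms concrete, the sketch below evaluates the Pastry hop bound and the Kelips affinity-group count for a given network size. The value b = 4 is an assumption (the slide does not state the prototype's b), and the integer square root is used as an approximation of the group count.

```python
import math


def pastry_hop_bound(n, b=4):
    """Smallest h with (2**b)**h >= n, i.e. the ceiling of log base 2**b of n."""
    hops, reach = 0, 1
    while reach < n:
        reach <<= b  # each hop resolves b more bits of the key
        hops += 1
    return hops


def kelips_group_count(n):
    """Integer part of sqrt(n), an approximation of the number of affinity groups."""
    return math.isqrt(n)
```

For a 500-node network this gives a bound of 3 Pastry hops with b = 4 and about 22 Kelips affinity groups.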

MultiRouter Prototype • Similar in design to Kelips and Pastry but interacts only with the DHTs • Inserts filetuples on all DHTs • Uses old information on rejoins • Maintains two metrics: latency and success rate • Common to most applications • Encompass long-term and short-term performance

Metric Functions • Use an aging function to smooth out variations: $\mathit{latency}_i(l) = \alpha \cdot l + (1 - \alpha) \cdot \mathit{latency}_{i-1}$ • Use a decaying function to effectively ignore old failures: $\mathit{failure}_i(f, t) = f + \beta \cdot \mathit{failure}_{i-1}$ • Strong spikes of failure allow the MultiRouter to respond to failures promptly • Quick decay of failure prevents transient problems from having long-term effects on MultiRouter behavior
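The two update rules can be sketched directly. The coefficient names `alpha` and `beta` and their default value of 0.5 are assumptions made for illustration, and the time argument t of the failure function is not modeled here.

```python
def update_latency(previous, sample, alpha=0.5):
    """Aging function: blend the new latency sample with the previous estimate."""
    return alpha * sample + (1 - alpha) * previous


def update_failure(previous, failed, beta=0.5):
    """Decaying function: add the new failure indicator to the decayed history."""
    return failed + beta * previous


# A single failure followed by two successes: the score spikes, then decays.
score = 0.0
history = []
for failed in (1, 0, 0):
    score = update_failure(score, failed)
    history.append(score)
```

Here `history` ends up as `[1.0, 0.5, 0.25]`: the failure registers immediately and its influence halves with every later update, which matches the spike-then-decay behavior described on the slide.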

Cost and Rule Functions • Each metric holds equal weight • Use rule sets to “interpret” the cost results • Prototype rules • A DHT is cold when • Filetuple was recently added • currenttime - lastsent > threshold
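The "cold" rule can be sketched as a predicate over timestamps. The slide lists two conditions without saying how they combine, so treating either one as sufficient is an assumption, as are the parameter names `recent_window` and `idle_threshold`.

```python
def is_cold(now, inserted_at, last_sent, recent_window, idle_threshold):
    """Return True when a DHT's statistics for a filetuple should be treated as cold.

    Assumption: either condition alone marks the DHT as cold.
    All arguments share one time unit (for example, simulator rounds).
    """
    recently_added = (now - inserted_at) < recent_window
    idle_too_long = (now - last_sent) > idle_threshold
    return recently_added or idle_too_long
```

For example, `is_cold(now=100, inserted_at=95, last_sent=99, recent_window=10, idle_threshold=50)` is true because the filetuple was added only five time units earlier.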

Experimental Results • Use simulation runs and trace-based experiments • Micro-benchmark: “proof-of-concept” • Generic churn • Overnet trace-based churn • Message Overhead Analysis • DHT parameters are consistent across experiments

Micro-benchmark • “Proof-of-concept” of MultiRouter’s adaptability properties • Inserted one filetuple and queried the filetuple every 0.5 sec for 2 mins • MultiRouter uses Pastry until Kelips disseminates the information through the affinity group • MultiRouter has generally lower latency than both DHTs

Generic Churn • Two queries per second drawn from 100 inserted filetuples, while two members leave and join every second • MultiRouter's success rate is 10% better than Kelips and 35% better than Pastry • Latency is not sacrificed for increased success rate

Overnet-trace Churn • Mapped Overnet trace files to 500 nodes with 100 inserted filetuples • Same querying behavior • Scaled down and varied interval between trace files • Similar results as generic churn

Message Overhead • Increased message overhead is equivalent to the sum of its substrates' overheads • Unavoidable due to “plug-and-play” aspect • An intelligent choice of DHT combinations does not drastically increase overhead

Conclusion • MultiRouter is an “über-overlay” • Improves both success rate and performance under stressful conditions • Takes advantage of differing properties of independent DHT substrates • Calculates which DHT(s) are most likely to succeed • Does not drastically increase message overhead • Future work includes an expressive metric/rule language and a mechanism to easily “bridge” multiple DHT networks.
