Increasing the Reliability of DHTs using MultiRouting. James Newell CS598ig Scattered Systems Jan 28, 2005. Distributed Hash Tables. DHTs ID space spans over many nodes Stores objects on nodes (files, address, caches) Routes key to correct node storing replica. 0x00 – 0x1F. 0x20 – 0x3F.
Increasing the Reliability of DHTs using MultiRouting James Newell CS598ig Scattered Systems Jan 28, 2005
Distributed Hash Tables • A DHT's ID space is partitioned across many nodes • Stores objects on nodes (files, addresses, caches) • Routes a key to the node storing its replica [Figure: ID space split across four nodes: 0x00 – 0x1F, 0x20 – 0x3F, 0x40 – 0x5F, 0x60 – 0x7F]
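The key-to-node mapping on this slide can be sketched as follows. This is only an illustration, assuming a 7-bit ID space (0x00 to 0x7F) split into the four ranges shown in the figure; the node names and the SHA-1-based `key_id` helper are hypothetical and not taken from the talk.

```python
import hashlib
from bisect import bisect_right

# First ID of each node's range, matching the four ranges in the figure.
RANGE_STARTS = [0x00, 0x20, 0x40, 0x60]
NODES = ["node-A", "node-B", "node-C", "node-D"]  # hypothetical node names


def key_id(key: str) -> int:
    """Hash a key into the 7-bit ID space (assumption: SHA-1, first byte)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return digest[0] & 0x7F


def responsible_node(ident: int) -> str:
    """Return the node whose ID range contains ident."""
    if not 0x00 <= ident <= 0x7F:
        raise ValueError("ID outside the 7-bit space")
    # bisect_right finds the first range start greater than ident;
    # the range just before it is the one that owns ident.
    return NODES[bisect_right(RANGE_STARTS, ident) - 1]
```

A lookup for an object would then be `responsible_node(key_id("some-file"))`; a real DHT reaches that node through multi-hop routing rather than a local table.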
Overcoming Churn • Routing tables are susceptible to churn, node and link failures • Standard techniques to mitigate problem • Retrying alternate routes • Adding additional replicas • Result: High complexity with mediocre improvement • Constrained to underlying ID space • Routing infrastructure is difficult to modify
MultiRouting • Build an additional layer on top of multiple DHT substrates • Independent replica placement • Differing routing behavior • Predict which substrate will most likely succeed • Adaptive properties to increase availability during transient failures • Increased lookup performance by exploiting opposing DHT strengths
Design Description • User application interfaces with the MultiRouter API • Multiple underlying DHTs • Transparent to the application • Run concurrently and independently • Customizable DHT combinations • “Plug-and-Play” ensures compatibility with traditional networks [Figure: layered stack, top to bottom: User App, MultiRouter, DHT 1 / DHT 2 / DHT 3, Physical Layer]
MultiRouter User Operations • Simple user operations • Join network: simultaneously joins all networks; once the join is finished, the MultiRouter can handle inserts and queries • Insert object key: the MultiRouter replicates object keys on all DHTs to improve availability • Lookup object key: the MultiRouter uses the DHT(s) that it predicts will perform the best • Remove object key
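The four user operations could be sketched as below. This is a sketch under stated assumptions, not the prototype's API: `InMemoryDHT` is a hypothetical stand-in with no networking, the join is performed sequentially rather than simultaneously, the `choose` callback stands in for the prediction step, and removing from every DHT is an assumption the slide does not state.

```python
class InMemoryDHT:
    """Hypothetical stand-in for a DHT substrate (no networking)."""

    def __init__(self, name):
        self.name = name
        self.joined = False
        self.store = {}

    def join(self):
        self.joined = True

    def insert(self, key, value):
        self.store[key] = value

    def lookup(self, key):
        return self.store.get(key)

    def remove(self, key):
        self.store.pop(key, None)


class MultiRouter:
    def __init__(self, dhts, choose=None):
        self.dhts = list(dhts)
        # choose(key, dhts) -> DHTs to query; the default queries all of them.
        self.choose = choose or (lambda key, dhts: dhts)

    def join(self):
        # The slide describes a simultaneous join; a loop keeps the sketch simple.
        for dht in self.dhts:
            dht.join()

    def insert(self, key, value):
        # Replicate the object key on every substrate.
        for dht in self.dhts:
            dht.insert(key, value)

    def lookup(self, key):
        # Query only the substrates picked by the prediction step.
        for dht in self.choose(key, self.dhts):
            value = dht.lookup(key)
            if value is not None:
                return value
        return None

    def remove(self, key):
        # Assumption: removal is applied to every substrate.
        for dht in self.dhts:
            dht.remove(key)
```

For example, `MultiRouter([InMemoryDHT("pastry"), InMemoryDHT("kelips")])` followed by `join()`, `insert("k", "10.0.0.1")`, and `lookup("k")` returns the stored address from whichever substrate answers first.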
MultiRouter Lookups • How does the MultiRouter decide which DHT(s) to use? • Previous object metrics are maintained in a StatsTable • Stats are entered into a cost function $C_{DHT}(x)$ • A set of rules R interprets the results of the cost functions • Each rule returns a Boolean; together the rules indicate the set of DHT(s) that should be included
Formal Notation • Given an array of statistical data $d_i$ • M is a set of metric functions $(m_1, m_2, \dots, m_n) \in M$, where $m_i(d_i, t) \to \mathbb{R}$ • The cost function for $DHT_i$ is $C_i = w_1 m_1 + w_2 m_2 + \dots + w_n m_n$, where $w_j$ is the weight of metric $m_j$ • R is the set of rules s.t. $R_i(C, C_i) \to \text{Bool}$, where C is the set of all cost functions • The union of the DHTs accepted by the rules is the final set of DHT(s)
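The notation above can be illustrated with a small sketch: a weighted sum of metric functions gives each DHT a cost, and a DHT is selected if any rule accepts it. The metric names, the stats dictionary layout, and the two example rules are hypothetical; only the overall shape (metrics, cost, rules, union) comes from the slide.

```python
def cost(metrics, weights, data, t):
    """Weighted sum of metric values for one DHT's statistics."""
    return sum(w * m(data, t) for w, m in zip(weights, metrics))


def select_dhts(stats, metrics, weights, rules, t):
    """Return the union of DHTs accepted by at least one rule."""
    costs = {name: cost(metrics, weights, d, t) for name, d in stats.items()}
    return {
        name
        for name, c in costs.items()
        if any(rule(costs, c) for rule in rules)
    }


# Hypothetical metrics reading precomputed values from the stats entry.
def latency_metric(data, t):
    return data["latency"]


def failure_metric(data, t):
    return data["failure"]


# Hypothetical rules: accept the cheapest DHT, or any DHT under a cost cap.
def cheapest(costs, c):
    return c == min(costs.values())


def under_cap(costs, c):
    return c <= 1.0
```

With equal weights, `select_dhts(stats, [latency_metric, failure_metric], [1, 1], [cheapest], t=0)` returns only the lowest-cost DHT, while adding `under_cap` to the rule list can widen the set to several DHTs.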
MultiRouter StatsTable • Information stored in the StatsTable is customizable to the application • Metrics are updated upon a query request or timeout • StatsTable is initially cold but can optionally be warmed up by a neighbor [Table: Example Metrics]
Implementation • Fully developed MultiRouter prototype • Java 1.4.2 • Link-layer discrete-event simulator • FreePastry derivation (including filetuple storage system) • Kelips derivation • Simple MultiRouter overlay • Objects are simple filetuples <key, address>
Network Simulator • Discrete-event, link-layer simulator • Nodes pass full messages after a certain delay • Time measured in discrete rounds (10 msec) • Routing uses an ideal shortest-path algorithm (no routing failures or queuing effects) • Transient and permanent node and link failures • Uses GT-ITM-generated transit-stub topologies and an event file
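A minimal sketch of a round-based discrete-event loop in the spirit of this slide, assuming 10 msec rounds and a priority queue of pending deliveries. The `Simulator` class, its method names, and the round-up of delays to whole rounds are assumptions for illustration; topology, failures, and the event file are left out.

```python
import heapq

ROUND_MS = 10  # one simulation round, as stated on the slide


class Simulator:
    def __init__(self):
        self.now = 0          # current time, in rounds
        self._queue = []      # entries: (deliver_round, seq, dst, message)
        self._seq = 0         # tie-breaker so equal-time events stay FIFO
        self.delivered = []   # log of (round, dst, message)

    def send(self, dst, message, delay_ms):
        # Round the delay up to whole rounds, with a minimum of one round.
        delay_rounds = max(1, -(-delay_ms // ROUND_MS))
        entry = (self.now + delay_rounds, self._seq, dst, message)
        heapq.heappush(self._queue, entry)
        self._seq += 1

    def run(self):
        # Deliver events in time order, advancing the clock to each event.
        while self._queue:
            self.now, _, dst, message = heapq.heappop(self._queue)
            self.delivered.append((self.now, dst, message))
```

Messages sent with delays of 25 ms and 10 ms are delivered in rounds 3 and 1 respectively, so the shorter-delay message arrives first regardless of send order.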
DHT Substrates • Pastry • Uses routing table, leaf set, neighborhood set, and file table • Pastry routes messages to the node with the ID closest to the key [in $\lceil \log_{2^b} N \rceil$ hops] • Low message overhead but higher latency and failure rate • Kelips • Uses √N affinity groups and contacts • Actively gossips “heartbeats” to maintain up-to-date information • Constant-time lookup and high fault tolerance, but high message overhead and slow to reach steady state
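To make the two scaling claims concrete, the following sketch computes the Pastry hop bound and the Kelips affinity-group count for a given network size. The default b = 4 is an assumption (a commonly used Pastry setting), and rounding √N to the nearest integer is a simplification; the function names are hypothetical.

```python
import math


def pastry_hops(n, b=4):
    """Smallest h with (2**b)**h >= n, i.e. ceil(log base 2**b of n)."""
    hops, reach = 0, 1
    while reach < n:
        reach *= 2 ** b
        hops += 1
    return hops


def kelips_groups(n):
    """Number of affinity groups, approximately sqrt(n)."""
    return round(math.sqrt(n))
```

For a 1000-node network with b = 4 this gives a bound of 3 Pastry hops and about 32 Kelips affinity groups, which illustrates the trade-off between routing state and lookup path length.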
MultiRouter Prototype • Similar in design to Kelips and Pastry, but interacts only with the DHTs • Inserts filetuples on all DHTs • Uses old information on rejoins • Maintains two metrics: latency and success rate • Common to most applications • Encompass long-term and short-term performance
Metric Functions • Use an aging function to smooth out variations: $\text{latency}_i(l) = \alpha \cdot l + (1 - \alpha) \cdot \text{latency}_{i-1}$, where $\alpha$ is the smoothing weight • Use a decaying function to effectively ignore old failures: $\text{failure}_i(f, t) = f + \delta \cdot \text{failure}_{i-1}$, where $\delta$ is the decay factor • Strong spikes of failure allow the MultiRouter to respond to failures promptly • Quick decay of failure prevents transient problems from having long-term effects on MultiRouter behavior
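The two update rules can be sketched as plain functions. The parameter names `alpha` and `decay` and their default values of 0.5 are assumptions chosen for illustration; the talk's actual constants are not given, and how the decay depends on the time argument t is not recoverable from the slide, so the sketch uses a fixed factor per update.

```python
def update_latency(prev, sample, alpha=0.5):
    """Aging (exponentially weighted) average of observed latency."""
    return alpha * sample + (1 - alpha) * prev


def update_failure(prev, failed, decay=0.5):
    """Add 1 for a new failure and decay the accumulated failure score."""
    return (1.0 if failed else 0.0) + decay * prev
```

Starting from a failure score of 0, one failure raises it to 1.0, and two subsequent successes bring it down to 0.5 and then 0.25: a sharp spike followed by a quick decay, matching the behavior the slide describes.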
Cost and Rule Functions • Each metric holds equal weight • Use rule sets to “interpret” the cost results • Prototype rules • A DHT is cold when • Filetuple was recently added • currenttime - lastsent > threshold
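The "cold" rule above could be expressed as a predicate like the one below. This is a sketch only: the slide does not say whether the two conditions are combined with OR or AND (OR is assumed here), and `recent_window` is a hypothetical parameter standing in for "recently added".

```python
def is_cold(current_time, last_sent, added_at, threshold, recent_window):
    """Return True if the DHT's statistics for a filetuple are considered cold."""
    # Condition 1: the filetuple was added only a short time ago.
    recently_added = current_time - added_at < recent_window
    # Condition 2: no query has been sent to this DHT for too long.
    idle_too_long = current_time - last_sent > threshold
    # Assumption: either condition alone is enough to mark the DHT cold.
    return recently_added or idle_too_long
```

A rule set might treat a cold DHT specially, for example by not trusting its stale statistics when choosing where to send a lookup.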
Experimental Results • Use simulation runs and trace-based experiments • Micro-benchmark: “proof-of-concept” • Generic churn • Overnet trace-based churn • Message overhead analysis • DHT parameters are consistent across experiments
Micro-benchmark • “Proof-of-concept” of MultiRouter’s adaptability properties • Inserted one filetuple and queried the filetuple every 0.5 sec for 2 mins • MultiRouter uses Pastry until Kelips disseminates the information through the affinity group • MultiRouter has generally lower latency than both DHTs
Generic Churn • Two queries per second, drawn from 100 inserted filetuples, while two members leave and join every second • MultiRouter’s success rate is 10% better than Kelips and 35% better than Pastry • Latency is not sacrificed for increased success rate
Overnet-trace Churn • Mapped Overnet trace files to 500 nodes with 100 inserted filetuples • Same querying behavior • Scaled down and varied the interval between trace files • Results similar to generic churn
Message Overhead • The MultiRouter’s message overhead is equal to the sum of its substrates’ overheads • Unavoidable due to the “plug-and-play” aspect • An intelligent choice of DHT combinations does not drastically increase overhead
Conclusion • MultiRouter is an “über-overlay” • Improves both success rate and performance under stressful conditions • Takes advantage of differing properties of independent DHT substrates • Calculates which DHT(s) are most likely to succeed • Does not drastically increase message overhead • Future work includes an expressive metric/rule language and a mechanism to easily “bridge” multiple DHT networks.