LHT: A Low-Maintenance Indexing Scheme over DHTs • Yuzhe Tang (湯宇哲) and Shuigeng Zhou (周水庚), Fudan University • The 28th International Conference on Distributed Computing Systems • Presented by 張睿元 (first-year Ph.D. student, ID 89721001)
Outline • Introduction • Preliminaries • Algorithms • Experimental Results • Conclusion
Introduction • DHTs have several outstanding advantages: --(1) Scalability and efficiency. --(2) Robustness. --(3) Load balance.
Introduction • Basic DHTs do not support complex query processing. • However, complex queries are highly desired and actually gaining popularity in many P2P applications.
Introduction • The popularity of complex queries creates an urgent need for DHT-based indexing schemes. • Generally speaking, when designing an indexing structure, query efficiency is the first priority. • As a result, P2P systems have to pay a high maintenance cost to keep adjusting their index structures.
Introduction • This problem, however, has not been effectively resolved in existing P2P indexing schemes. • Instead, they focus on improving query efficiency and, as a trade-off, sacrifice maintenance efficiency. • For example, in the Prefix Hash Tree (PHT), each leaf keeps links to its neighboring leaves, and these links must be maintained.
Introduction • Based on this observation, this paper proposes LHT, a Low-maintenance Hash Tree for data indexing in DHT-based P2P systems. • LHT requires no modification of the underlying DHT and can be easily adapted to any DHT substrate.
Algorithms • Incremental Tree Growth: --When an insertion makes the number of data items in a leaf bucket exceed θsplit, that leaf bucket is split. --Example (θsplit = 2): leaf #010 holds 0.62 and 0.70. Inserting 0.72 splits it into leaves #0100 (holding 0.62) and #0101 (holding 0.70 and 0.72).
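The split rule above can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: it assumes that `#0` labels the root covering [0, 1) and that each extra bit halves the interval (which matches the labels in the slide's example), and the names `Leaf`, `label_interval`, and `insert` are hypothetical. It also ignores where the two children are stored in the DHT.

```python
from dataclasses import dataclass, field

THETA_SPLIT = 2  # split threshold used in the slide's example


def label_interval(label: str) -> tuple[float, float]:
    """Interval covered by a label such as '#010'.

    Assumption: '#0' is the root covering [0, 1); every further bit
    selects the lower (0) or upper (1) half of the parent's interval.
    """
    low, width = 0.0, 1.0
    for bit in label[2:]:  # skip the '#0' root prefix
        width /= 2
        if bit == "1":
            low += width
    return low, low + width


@dataclass
class Leaf:
    label: str
    keys: list[float] = field(default_factory=list)


def insert(leaves: dict[str, Leaf], label: str, key: float) -> None:
    """Insert key into the given leaf; split the leaf once if it overflows."""
    leaf = leaves[label]
    leaf.keys.append(key)
    if len(leaf.keys) <= THETA_SPLIT:
        return
    low, high = label_interval(label)
    mid = (low + high) / 2
    # Redistribute the keys between the two child leaves.
    left = Leaf(label + "0", sorted(k for k in leaf.keys if k < mid))
    right = Leaf(label + "1", sorted(k for k in leaf.keys if k >= mid))
    del leaves[label]
    leaves[left.label] = left
    leaves[right.label] = right


# The slide's example: leaf #010 holds 0.62 and 0.70, then 0.72 arrives.
leaves = {"#010": Leaf("#010", [0.62, 0.70])}
insert(leaves, "#010", 0.72)
for lbl in sorted(leaves):
    print(lbl, leaves[lbl].keys)
```

The sketch performs a single split per insertion; a skewed bucket whose keys all fall into one child would need repeated splitting, which is left out here.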
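Algorithms • LHT Lookup: Consider a lookup for 0.9 with depth 14. --The full-depth label of the key is λ = #01110011001100. --(Figure: lookup trace. Candidate labels λ are mapped by the naming function fn(λ) to DHT keys; the labels shown are #01110011001100, #0111001, #011100, #01111, #01110, #011 and #0, with DHT-lookup outcomes NULL and Not Exist.)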
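As a check on the lookup example, the full-depth label of a key can be computed from its binary expansion. The sketch below assumes keys lie in [0, 1), that a label is `#0` followed by the leading bits of the key's binary fraction, and that "depth 14" means 14 bits after the `#`; the function name `key_label` is hypothetical. Under these assumptions it reproduces the label shown on the slide.

```python
def key_label(key: float, depth: int) -> str:
    """Label of `key` at the given depth: '#0' plus (depth - 1) fraction bits.

    Assumption: the root '#0' counts as depth 1 and covers [0, 1).
    """
    if not 0.0 <= key < 1.0:
        raise ValueError("key must lie in [0, 1)")
    bits = []
    x = key
    for _ in range(depth - 1):
        x *= 2          # shift the next binary digit in front of the point
        bit = int(x)    # that digit is the integer part (0 or 1)
        bits.append(str(bit))
        x -= bit
    return "#0" + "".join(bits)


print(key_label(0.9, 14))  # #01110011001100, the label on the slide
```

Every label visited during the lookup is a prefix of this string, or a sibling of such a prefix, so the search only has to decide how many of these bits the actual leaf uses.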
Algorithms • Range Queries: --Simple case: --General case:
Algorithms • Min/Max Queries: --Min: A DHT-Lookup of # returns the result. --Max: A DHT-Lookup of #0 returns the result.
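The Min/Max rule is consistent with a naming function that strips the trailing run of identical bits from a leaf label. That reading is inferred from the labels on these slides (for instance #0111001 paired with #011100) and is an assumption; the paper's exact definition may differ. A minimal sketch, with the hypothetical name `naming_function`:

```python
def naming_function(label: str) -> str:
    """Map a leaf label to the DHT key under which the leaf is stored.

    Assumption (inferred from the slides): drop the trailing run of
    identical bits, e.g. '#0111001' -> '#011100'.
    """
    bits = label[1:]  # everything after '#'
    if not bits:
        return "#"
    return "#" + bits.rstrip(bits[-1])


# Leftmost leaf (all zeros) and rightmost leaf (all ones below the root):
print(naming_function("#0000"))     # '#'   -> Min is one DHT-Lookup of '#'
print(naming_function("#0111"))     # '#0'  -> Max is one DHT-Lookup of '#0'
print(naming_function("#0111001"))  # '#011100'
```

Under this assumption the leftmost and rightmost leaves always land on the fixed keys `#` and `#0` no matter how deep the tree grows (as long as it has more than one leaf), which is why each query needs only a single DHT-Lookup.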
Experimental Results • LHT needs no periodic maintenance to keep the index intact and consistent. • LHT's maintenance cost is paid only for tree-structure adjustment caused by data insertion/deletion. • This structural adjustment consists of leaf splits and merges. • Compared with PHT, LHT saves at least 50% and up to 75% of the maintenance cost.
Experimental Results • Both uniform and Gaussian datasets were used. • Maintenance Cost: --Data-Movement Cost: LHT's is about 50% lower than PHT's. --DHT-Lookup Cost: LHT's is about 25% lower than PHT's.
Experimental Results • Lookup Performance: --For uniform data: LHT's lookup cost is about 20% lower than PHT's. --For Gaussian data: LHT's lookup cost is about 30% lower than PHT's. • When the data size equals certain special values (e.g., 2^12, 2^16, 2^20, which make the tree depth equal to D/2, D/4, 3D/8), the binary search can be resolved within the first few DHT-lookups.
Experimental Results • Range Query Performance: --Bandwidth: PHT(parallel) > PHT(sequential) ≈ LHT --Latency: PHT(parallel) > PHT(sequential) ≈ LHT --LHT is about 18% lower than PHT(parallel). • PHT(sequential)'s near-optimal bandwidth consumption is due to its B+-tree-like leaf links, which incur extra maintenance cost. • PHT(parallel) can achieve competitive latency, which however deteriorates as the data distribution becomes skewed (as in the Gaussian data).
Conclusion • This paper proposes LHT for data indexing over DHTs. • Compared with PHT, LHT saves at least 50% and up to 75% of the maintenance cost, and achieves better performance in exact-match and range query processing. • LHT is adaptable to any generic DHT and is easy to implement and deploy.