IMAGINE-P2P: A Scalable P2P Platform for the Knowledge Grid
Hai Zhuge, Xiaoping Sun et al.
China Knowledge Grid Research Group, Institute of Computing Technology, Chinese Academy of Sciences
Main work
• IMAGINE-P2P: Integrated Multi-disciplinary Autonomous Global Innovation Networking Environment on a P2P network
• A platform that efficiently supports index-based path queries by incorporating a semantic overlay on a structured P2P network
• A scalable distributed trie index deployed for broadcast queries on key strings
• A decentralized load balancing method for improving system utilization
• A replication method for improving the availability of the distributed index
Outline • Background • Design Rationale • Architecture of IMAGINE-P2P • Deployment of Distributed Trie Index • Performance Improvements • Experiment Results • Conclusion
Background • Motivation: • Sharing: Extend resource sharing and cooperation services from local distributed systems to large-scale, geographically distributed systems.
Background • A Challenge: • Scalability: “A SIMPLE GOAL”(Jim Gray, 2003) to scale up and scale out systems in large-scale and dynamic distributed environments.
Background • Current Situation:
[Figure: existing systems arranged by function level (vertical axis: computation, data, information, knowledge) against scalability level (horizontal axis: local distributed, global distributed, global decentralized). Labels on the slide: C/S systems, GRID, P2P. Knowledge: knowledge base, Knowledge Grid, "to be explored". Information: MIS, Web, Information Grid. Data: database, distributed DB, Data Grid, file sharing (Gnutella…). Computation: cluster, Computational Grid, whatever@home.]
Background • Our Goal: To build a scalable P2P platform for the Knowledge Grid: IMAGINE-P2P
• Provide architectural extensibility for different types of complex queries
• Achieve scalable query performance
• Improve utilization and availability
Design Rationale • Make reasonable trade-offs to achieve acceptable scalability of the whole system.
• Distributed index: topology dependent vs. topology independent
• Topology: complexity vs. efficiency/robustness
• Query routing: complexity vs. store/query efficiency
• Utilization: load balancing vs. query efficiency
• Availability: fault tolerance vs. store/query efficiency
Architecture of IMAGINE-P2P • Layered Architecture (top layer first)
• Knowledge Grid Applications: future Knowledge Grid applications built on various distributed indexes
• Distributed Trie Index: a distributed trie index supporting scalable wildcard and broadcast queries on objects
• Semantic Overlay: distributed indexes supporting scalable, semantics-rich path queries on objects
• Object Overlay: a P2P overlay network providing scalable management of resources
Architecture of IMAGINE-P2P • Object Overlay: Topology Consideration
• Theorem 1: Comparison-based structured overlays have to build a linear-order relation on their ID spaces to allow deterministic routing.
• Theorem 2: Constructing a comparison-based structured overlay is equivalent to sorting the IDs of nodes and objects by a linear-order relation, which has a lower bound of Ω(N log N) comparisons, where N is the number of nodes.
• Decision: A ring topology is the most direct and simple way to build a comparison-based structured overlay network. Chord is one such case.
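Architecture of IMAGINE-P2P • Object Overlay: Topology
• Chord routes in O(log N) hops and has proven correctness of stabilization in dynamic environments.
• An object Oi is placed on the node Ni that follows it on the ring, i.e. there is no node Nj with Nj < Ni and Nj > Oi.
[Figure: Chord ring with positions labeled 1, 2, …, 2K, 2K+1, physical nodes N1, N2, …, Ni, and an object Oi stored on node Ni.]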
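The placement rule on this slide can be illustrated with a minimal sketch. It covers only the successor rule (which node stores an object), not Chord's finger-table routing or stabilization. The function name `successor`, the ring size of 64 and the node IDs are assumptions chosen for illustration; they do not come from the slides.

```python
from bisect import bisect_left


def successor(node_ids, object_id, ring_size):
    """Return the ID of the node that stores object_id on a Chord-style ring.

    The object goes to the first node whose ID is >= the object's ID,
    wrapping around to the smallest node ID at the end of the ring.
    """
    ids = sorted(node_ids)
    position = bisect_left(ids, object_id % ring_size)
    return ids[position % len(ids)]


# Hypothetical ring of size 64 with ten nodes (illustrative values only).
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
print(successor(NODES, 10, 64))  # first node at or after ID 10
print(successor(NODES, 60, 64))  # wraps around the ring
```

A sorted list with binary search is enough here because the sketch has global knowledge of all nodes; a real Chord node only knows its fingers and reaches the same successor in O(log N) hops.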
Architecture of IMAGINE-P2P • Semantic Overlay: Basic structure
• Semantic object: SO = (a, R, b)
• Semantic path: sp(a1R1a2R2…an-1Rn-1an)
[Figure: the object overlay (a ring of physical nodes N1, N2, …, Ni holding objects O1 to O7) with the semantic overlay on top as a distributed indexing structure. Indexing nodes hold SO1(O1, R, O3), SO2(O3, R, O6) and SO3(O6, R, O7); the example is a query for sp(O1O2O6O7). Legend: physical node, object, indexing node, key.]
Architecture of IMAGINE-P2P • Semantic Overlay: Querying
• Semantic object: SO = (a, R, b). Either a or b, or both, can be used as the key by the DHT function.
• Semantic path: a query q = a1R1a2R2…an-1Rn-1an is decomposed into n−1 subqueries: q1 = a1R1a2, q2 = a1a2R2a3, …, qn-1 = a1a2…an-1Rn-1an.
• Cost: O(log N) for a semantic object; O(log N + L) for a semantic path of length L in the best case; O(L log N) for a semantic path of length L in the worst case.
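The decomposition of a semantic path into subqueries can be sketched as follows. This is an illustration only: the list encoding of a path and the tuple shape `(path_key, relation, target)` are assumptions made for the example, not the platform's actual message format.

```python
def decompose(path):
    """Split a semantic path [a1, R1, a2, R2, ..., an] into n-1 subqueries.

    Subquery k is returned as (path_key, relation, target), where path_key is
    the tuple (a1, ..., ak), relation is Rk and target is a(k+1). This mirrors
    q1 = a1R1a2, q2 = a1a2R2a3, ... from the slide.
    """
    if len(path) < 3 or len(path) % 2 == 0:
        raise ValueError("expected a1, R1, a2, ..., an with at least one relation")
    objects = path[0::2]    # a1, a2, ..., an
    relations = path[1::2]  # R1, R2, ..., Rn-1
    return [
        (tuple(objects[: k + 1]), relation, objects[k + 1])
        for k, relation in enumerate(relations)
    ]


for subquery in decompose(["O1", "R", "O3", "R", "O6", "R", "O7"]):
    print(subquery)
```

Each subquery costs one DHT lookup in the worst case, which gives the O(L log N) bound; the O(log N + L) best case corresponds to consecutive subqueries being answered without a fresh lookup.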
Architecture of IMAGINE-P2P • Semantic Overlay— Basic query operations
Deployment of Distributed Trie Index • Distributed Trie Index: Basic Structure
• A full trie index over the keys big, back, create, computer, computing and dark.
• Query = dark follows the trie path tp(dark): SO1(d, S, a), SO2(da, S, r), SO3(dar, S, k), SO4(dark, S, e).
• Path length L = O(log_m N), where m is the size of the attribute set and N is the number of keys.
[Figure: the full trie drawn next to the ring of physical nodes N1, N2, …, Ni, with SO1 to SO4 stored on different indexing nodes.]
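As a rough illustration of how a key maps to trie objects, the sketch below enumerates the objects SO(prefix, S, next character) for a full trie. The `END` constant stands in for the end marker written as `e` on the slide; the tuple representation and the function names are assumptions for this example, and no DHT publishing is modeled.

```python
END = "e"  # end-of-key marker, written as "e" on the slide (assumed encoding)


def trie_path(key):
    """Return the trie path tp(key) as a list of (prefix, "S", next_symbol)."""
    steps = [(key[:i], "S", key[i]) for i in range(1, len(key))]
    steps.append((key, "S", END))
    return steps


def full_trie_objects(keys):
    """Return the set of all trie objects needed to index the given keys."""
    objects = set()
    for key in keys:
        objects.update(trie_path(key))
    return objects


KEYS = ["big", "back", "create", "computer", "computing", "dark"]
print(trie_path("dark"))
print(len(full_trie_objects(KEYS)))
```

Keys that share a prefix share the corresponding objects, which is why the set is smaller than the total number of characters in the keys.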
Deployment of Distributed Trie Index • Trie Index: Two basic types
• A full trie index
• A pruned trie index
[Figure: the keys big, back, create, computer, computing and dark indexed once as a full trie and once as a pruned trie.]
Deployment of Distributed Trie Index • Trie Index: Compressed pruned trie index
• Purpose: to avoid splitting and moving existing indexing nodes.
• A newly added key string is published in one message together with its new indexing node (node o in the figure).
• A key object is defined as KO(a1a2…aj, S, K), where key K = a1a2…aj…an and aj is the leaf trie node of the trie path of K.
[Figure: a pruned trie index and the corresponding compressed pruned trie index for the keys big, back, create, computer, computing and dark.]
Deployment of Distributed Trie Index • Trie Index: Publish compressed pruned trie index
To publish a key K = a1a2…an:
• If there is neither SO(a1, S, e) nor SO(a1, S, a2), SO(a1, S, e) is published and the key K is published by KO(a1, S, K).
• If there is SO(a1, S, e) but no KO(a1, S, K1) where K1 = a1b2b3…bn (b2 ≠ a2), the key K is published by KO(a1, S, K).
• If there are already SO(a1, S, e) and a KO(a1, S, K1) that shares a prefix with K, where K1 = a1a2…ajbj+1…bm, j ≥ 2, and bj+1 ≠ aj+1, SO(a1, S, e) is changed to SO(a1, S, a2) and two objects are published: SO(a1a2, S, e) and KO(a1a2, S, K).
• If there is already an SO(a1, S, a2), the key K is forwarded along the trie path tp(a1a2…ame) until it reaches SO(a1a2…am, S, e) (m ≤ n). If there is no KO(a1a2a3…am, S, K2) with K2 = a1a2…amam+1bm+2…bp, KO(a1a2a3…am, S, K) is published. Otherwise SO(a1a2…am, S, e) is changed to SO(a1a2…am, S, am+1), and the objects SO(a1a2a3…amam+1, S, e) and KO(a1a2a3…amam+1, S, K) are published.
• Objects drawn in the same color share the same prefix and can therefore be published in one message.
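The publishing rules above can be approximated by a small single-process model. This is a simplified interpretation, not the authors' implementation: it keeps one dictionary from trie-node prefix to the keys stored there, treats a prefix as an existing trie node when it appears in that dictionary, and never moves keys that are already stored. The class and method names are hypothetical.

```python
class CompressedPrunedTrie:
    """Toy model of the compressed pruned trie (no network, no DHT)."""

    def __init__(self):
        # prefix of a trie node -> keys stored at that node (the KO objects)
        self.keys_at = {}

    def publish(self, key):
        """Store key and return the prefix of the trie node that holds it."""
        if not key:
            raise ValueError("key must be non-empty")
        prefix = key[:1]
        if prefix not in self.keys_at:
            # No trie node for a1 yet: create it and store the key there.
            self.keys_at[prefix] = [key]
            return prefix
        # Follow the existing trie path as far as it matches the key.
        while len(prefix) < len(key) and key[: len(prefix) + 1] in self.keys_at:
            prefix = key[: len(prefix) + 1]
        longer = key[: len(prefix) + 1]
        shares_next_symbol = len(prefix) < len(key) and any(
            stored != key and stored.startswith(longer)
            for stored in self.keys_at[prefix]
        )
        if shares_next_symbol:
            # Extend the path by one node; the new key moves in, old keys stay.
            self.keys_at[longer] = [key]
            return longer
        if key not in self.keys_at[prefix]:
            self.keys_at[prefix].append(key)
        return prefix


trie = CompressedPrunedTrie()
for k in ["big", "back", "create", "computer", "computing", "dark"]:
    print(k, "->", trie.publish(k))
```

In this model only the newly published key is placed on the new node, which reflects the stated goal of avoiding splits and moves of existing indexing nodes. A lookup therefore has to check the keys stored on every node along the trie path, not only on the last one.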
Deployment of Distributed Trie Index • Trie Index: Multi-access on physical nodes (on a full trie index and a pruned trie index)
• Query = abcde
• Subqueries: q1 = aSb, q2 = abSc, q3 = abcSd, q4 = abcdSe, q5 = abcdeSe
[Figure: the trie objects (a, b), (ab, c), (abc, d), (abcd, e) and (abcde, e) are stored on physical nodes A, B and C, and the key abcde is reached after the five subqueries.]
Deployment of Distributed Trie Index • Trie Index: Avoiding multi-access (on a full trie index and a pruned trie index)
• Query = abcde
• Subqueries: q1 = aSb, q2 = abcdSe, q3 = abcdeSe
[Figure: the same trie objects (a, b), (ab, c), (abc, d), (abcd, e) and (abcde, e) on physical nodes A, B and C, now reached with three subqueries.]
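One way to read the reduction from five subqueries to three is that consecutive trie steps stored on the same physical node are resolved in a single access. The sketch below illustrates that idea only. The placement table is an assumption chosen to be consistent with the three subqueries listed on the slide (the exact placement in the figure is not recoverable), and `plan_accesses` is a hypothetical helper, not part of IMAGINE-P2P.

```python
from itertools import groupby

# Assumed placement of trie nodes on physical nodes (illustrative only).
PLACEMENT = {"a": "A", "ab": "A", "abc": "A", "abcd": "C", "abcde": "B"}


def trie_steps(query):
    """Prefixes visited on the trie path of query: a, ab, abc, ..."""
    return [query[:i] for i in range(1, len(query) + 1)]


def plan_accesses(query, owner):
    """Merge consecutive trie steps held by the same node into one access.

    owner maps a prefix to the physical node that stores its trie object.
    Returns a list of (node, prefixes_resolved_locally).
    """
    return [
        (node, list(prefixes))
        for node, prefixes in groupby(trie_steps(query), key=owner)
    ]


for node, prefixes in plan_accesses("abcde", PLACEMENT.__getitem__):
    print(node, prefixes)
```

Only consecutive steps are merged, so a node that holds two non-adjacent parts of the path would still be contacted twice under this simple scheme.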
Deployment of Distributed Trie Index • Trie Index: Multi-access (on a compressed pruned trie index)
• Query = abcdef
• Subqueries: q1 = aSb, q2 = abSc, q3 = abcSd, q4 = abcdSe
[Figure: the trie objects (a, b), (ab, c), (abc, d), (abcd, e) and (abcde, e) on physical nodes A, B and C, with the key abcdef stored as a key object.]
Deployment of Distributed Trie Index • Trie Index: Avoiding multi-access (on a compressed pruned trie index)
• Query = abcdef
• Subqueries: q1 = aSb, q2 = abS~cdef, q2' = abcdSef
[Figure: the same trie objects on physical nodes A, B and C, with the key abcdef stored as a key object.]
Performance Improvements • Utilization Improvement: Decentralized load balancing
• Target: for each node ni (i = 1, 2, …, N): [formula not preserved in the extracted slide]
• Action: ni moves load to neighbor nodes nj selected from its neighbor node set according to:
• Which object should be moved: [formula not preserved]
• When the object should be moved: [formula not preserved]
• Where the object should be moved: [formula not preserved]
Performance Improvements • Availability Improvement: Using path key replication to improve the availability of semantic paths and distributed trie paths.
• A semantic object SO(a, R, b) is duplicated by publishing it under both key a and key b.
• The path key of a semantic object contains the path information of the objects published before it on the same path.
• A semantic object can therefore be recovered from any semantic object published later on the same semantic path.
Experiment Results • An event-driven simulation environment • Simulation on a ring network with 200 and 2000 nodes. • Different distributions of object loads and node capacities are tested.
Experiment Results Trie index properties compared with those of the B-tree and the B+-tree. The compressed trie index has a very short average depth.
Experiment Results The size of a trie index is sensitive only to the key string distribution. Its independence from the network size and the number of keys makes it scalable in large-scale and dynamic environments.
Experiment Results Average search hops of a broadcast query for all the keys on the network using distributed trie indexes, for networks of different sizes and different numbers of keys.
Experiment Results An optimized search on trie indexes with 2349 PDF file names as keys
Experiment Results The load balancing process: the variance of the system load decreases with the number of load balancing iterations under different load distributions.
Experiment Results Chord uses virtual servers to improve load balance: each physical node holds more than one virtual server, and data objects are mapped by the DHT function to virtual servers instead of physical nodes. The Chord authors proposed that log N virtual servers per physical node can be optimal with high probability when only the number of keys is considered.
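The virtual-server scheme described above can be sketched with a small consistent-hashing model. It is an illustration under stated assumptions, not the simulator used in the experiments: SHA-1 truncated to 32 bits stands in for the DHT function, node and key names are made up, and the helper names are hypothetical.

```python
import hashlib
import math
from bisect import bisect_left
from collections import Counter


def ring_id(value, bits=32):
    """Hash a string onto a ring of 2**bits positions (assumed DHT function)."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[: bits // 8], "big")


def build_ring(nodes, virtual_per_node):
    """Place virtual_per_node virtual servers per physical node on the ring."""
    return sorted(
        (ring_id(f"{node}#{v}"), node)
        for node in nodes
        for v in range(virtual_per_node)
    )


def key_counts(nodes, virtual_per_node, keys):
    """Count how many keys each physical node stores."""
    ring = build_ring(nodes, virtual_per_node)
    ids = [virtual_id for virtual_id, _ in ring]
    counts = Counter({node: 0 for node in nodes})
    for key in keys:
        # A key goes to the first virtual server at or after its ring ID.
        position = bisect_left(ids, ring_id(key)) % len(ring)
        counts[ring[position][1]] += 1
    return counts


nodes = [f"node{i}" for i in range(200)]
keys = [f"key{i}" for i in range(20000)]
for v in (1, max(1, round(math.log2(len(nodes))))):
    counts = key_counts(nodes, v, keys)
    print(v, "virtual servers per node, max keys on one node:", max(counts.values()))
```

This model balances only the number of keys, which is the condition stated on the slide; it says nothing about object sizes, node capacities or query load.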
Experiment Results The load balancing process works effectively for distributed trie indexes that cause heavily imbalanced load distributions.
Experiment Results If each extra hop incurred by the load balancing does not significantly delay a query, the average query latency under load balancing can be reduced when only considering storage consumption of objects.
Experiment Results With replication, the availability of the full trie is better than that of the pruned trie, because the pruned trie has a much shorter path length and therefore fewer copies under path key replication. Without replication, however, the pruned trie has better availability, because its much shorter search paths are less likely to be broken under the same failure distribution.
Conclusion
• Publishing distributed indexes using semantic overlay methods can be a solution for supporting complex queries with high-level semantics.
• Many conflicting factors have to be traded off when designing a P2P system to achieve a scalable solution.
• The distributed trie index can be scalable in large-scale and dynamic environments where the key string distribution is relatively stable.
• Decentralized load balancing can work effectively in large-scale and dynamic distributed systems.
• Future work still faces challenges in building more efficient distributed indexes, relieving hot spots on distributed indexes, and improving availability while keeping the system decentralized and scalable.
• Future theoretical work should show to what extent the trade-offs can be made while still achieving acceptable scalability.
This work has been published in IEEE Transactions on Knowledge and Data Engineering.
Questions and Comments Thanks! Full paper is available at IEEE Transactions on Knowledge and Data Engineering