An Optimal Broadcast Algorithm for Content-Addressable Networks. Ludovic Henrio, Fabrice Huet, Justine Rochas. Outline: Background, Efficient Algorithm, Experiments.
18/12/2013 - OPODIS (Nice) An Optimal Broadcast Algorithm for Content-Addressable Networks Ludovic Henrio, Fabrice Huet, Justine Rochas
General Motivation – RDF Storage • Context • Semantic Web: RDF data • Challenge • Store and retrieve RDF data • Large-scale setting • Our solution • Content-Addressable Network
Content-Addressable Networks (CAN) • Overlay network • Nodes are peers • Structured organization • Multidimensional Cartesian space • Entirely partitioned • A zone is managed by one peer • A zone = a (hyper)rectangle • Neighborhood based on adjacent zones • Routing = successively approaching the target value in all dimensions [Figure: a 2D CAN over the unit square (dim #1, dim #2), partitioned into zones A to E]
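The zone model on this slide can be sketched in a few lines of Python. This is only an illustration: the `Zone` class, its `split` method and the `are_neighbors` helper are hypothetical names, and the half-open interval convention `[lo, hi)` is an assumption made so that every point belongs to exactly one zone.

```python
from dataclasses import dataclass
from typing import Tuple

Interval = Tuple[float, float]  # half-open interval [lo, hi), an assumed convention


@dataclass(frozen=True)
class Zone:
    """A (hyper)rectangle managed by one peer: one interval per dimension."""
    bounds: Tuple[Interval, ...]

    def contains(self, point) -> bool:
        # A point belongs to the zone when every coordinate falls in its interval.
        return all(lo <= x < hi for (lo, hi), x in zip(self.bounds, point))

    def split(self, dim: int):
        # Cut the zone in half along `dim`, as happens when a new peer joins.
        lo, hi = self.bounds[dim]
        mid = (lo + hi) / 2
        lower = self.bounds[:dim] + ((lo, mid),) + self.bounds[dim + 1:]
        upper = self.bounds[:dim] + ((mid, hi),) + self.bounds[dim + 1:]
        return Zone(lower), Zone(upper)


def are_neighbors(a: Zone, b: Zone) -> bool:
    """Zones are adjacent when they abut in exactly one dimension
    and their intervals overlap in every other dimension."""
    touching = 0
    for (alo, ahi), (blo, bhi) in zip(a.bounds, b.bounds):
        if ahi == blo or bhi == alo:
            touching += 1            # the zones share a border on this dimension
        elif not (alo < bhi and blo < ahi):
            return False             # separated on this dimension
    return touching == 1             # corner-only contact does not count


unit = Zone(((0.0, 1.0), (0.0, 1.0)))
left, right = unit.split(0)
bottom_left, top_left = left.split(1)
bottom_right, top_right = right.split(1)
```

With this partition, `bottom_left` and `bottom_right` are neighbors, while `bottom_left` and `top_right` only meet at a corner and are not.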
Problem: Cost of Queries • Naive broadcast does not scale • 1 query over 1 variable: OK • 1 query over 2 variables: OK • 2 queries over 2 variables (conjunction of two 2-dimensional broadcasts): NOT OK
Problem: Duplicated Messages • Duplicated messages • 11 peers, 40 messages! • How to eliminate duplicates? • For each peer P • Find the peer that is responsible for sending the message to P
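To see where duplicates come from, the following sketch counts the messages of a naive flood in which each peer, on first reception, forwards to every neighbor except the one it received from. Everything here is an assumption made for illustration: the function names, the random choice of the split dimension, and the flooding rule itself. The exact count depends on the partition, so it will not necessarily reproduce the 40 messages quoted for the slide's 11-peer example.

```python
import random
from collections import deque


def build_can(n_peers, dims=2, seed=0):
    """Build a CAN by repeatedly halving a randomly chosen zone.
    A zone is a tuple of half-open (lo, hi) intervals, one per dimension."""
    rng = random.Random(seed)
    zones = [tuple((0.0, 1.0) for _ in range(dims))]
    while len(zones) < n_peers:
        z = zones.pop(rng.randrange(len(zones)))
        d = rng.randrange(dims)          # split dimension: assumed random
        lo, hi = z[d]
        mid = (lo + hi) / 2
        zones.append(z[:d] + ((lo, mid),) + z[d + 1:])
        zones.append(z[:d] + ((mid, hi),) + z[d + 1:])
    return zones


def are_neighbors(a, b):
    """Adjacent = abut in exactly one dimension, overlap in all the others."""
    touching = 0
    for (alo, ahi), (blo, bhi) in zip(a, b):
        if ahi == blo or bhi == alo:
            touching += 1
        elif not (alo < bhi and blo < ahi):
            return False
    return touching == 1


def flood(zones, initiator=0):
    """Naive broadcast: return (messages sent, receptions per peer)."""
    n = len(zones)
    neighbors = [[j for j in range(n)
                  if j != i and are_neighbors(zones[i], zones[j])]
                 for i in range(n)]
    received = [0] * n
    seen = {initiator}
    queue = deque((initiator, j) for j in neighbors[initiator])
    messages = 0
    while queue:
        src, dst = queue.popleft()
        messages += 1
        received[dst] += 1
        if dst not in seen:              # forward only on first reception
            seen.add(dst)
            queue.extend((dst, k) for k in neighbors[dst] if k != src)
    return messages, received


zones = build_can(11)
messages, received = flood(zones)
print(messages, "messages for", len(zones), "peers")
```

Under this rule the total is twice the number of neighbor links minus (n - 1), whereas a duplicate-free broadcast needs exactly n - 1 messages.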
Existing Solutions • Use the CAN structure to route messages • Meghdoot [1]: « upperLeft » predicate, start from a corner • M-CAN [2] • M-CAN principles • Initiator peer sends to all neighbors • Other peers forward to neighbors on • Same dimension on opposite side • Lower dimensions on all sides • Forwarding on the last dimension depends on a constraint [1] A. Gupta, O. D. Sahin, D. Agrawal, A. El Abbadi: Meghdoot: Content-Based Publish/Subscribe over P2P Networks. Middleware 2004
M-CAN Execution [Figure legend: INIT (initiator), message, message that leads to duplication, corner constraint] [2] S. Ratnasamy, M. Handley, R. M. Karp, S. Shenker: Application-Level Multicast Using Content-Addressable Networks. Networked Group Communication 2001
Preliminary Work • Existence of an optimal algorithm proved [3] • A solution to exhibit existence • Valid for a very generic definition of CAN • Not efficient (execution time) • Parallelizes message sending only when reaching a « border » [3] Francesco Bongiovanni, Ludovic Henrio: A Mechanized Model for CAN Protocols. FASE 2013
Hypothesis and Goals • CAN = adjacent rectangles • No additional structure • Tolerate churn between two broadcasts • Not implementation-dependent • Do not tolerate churn during a broadcast • Optimal in number of messages and good parallelization [Figure: a spanning tree rooted at INIT]
Efficient Algorithm – Principle • Removes all duplicates • In all dimensions • How? • Uses the corner constraint • Plus a spatial constraint • A set of fixed values • Reduces the problem • Applies recursively [Figures: spatial constraint in a 2D CAN; spatial constraint in a 3D CAN]
Efficient Algorithm • Observation #1 • Easy to forward in 1D • Observation #2 • Only one zone touches a corner • Idea of the algorithm • Suppose an efficient broadcast in dimension N • Apply on a hyperplane of dimension N - 1 • Send to both sides of this hyperplane using the corner constraint • Repeat until the hyperplane is just a line (dimension 1)
Efficient Algorithm – Execution [Figure legend: INIT (initiator), message, message that leads to duplication, corner constraint, spatial constraint]
Efficient Algorithm – Properties • Proved to be correct • All peers receive a broadcast message at least once • Proved to be minimal • All peers receive a broadcast message at most once • Elements of proof – when receiving on dimension D: • dim < D: spatial constraint is satisfied • dim = D: ascending or descending direction • dim > D: corner constraint is satisfied • This algorithm is optimal • All peers receive a broadcast message exactly once
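The interplay of the two constraints can be sketched as a forwarding predicate. This is a reconstruction from the constraints named on the slides, not the authors' implementation: the function names, the dimension indexing, and the choice of the initiator's lower corner as the "set of fixed values" are all assumptions. For a receiver whose first off-origin dimension is D, the sender must abut it on dimension D on the side facing the origin, contain the origin's coordinates on dimensions below D (spatial constraint), and contain the receiver's lower corner on dimensions above D (corner constraint).

```python
import random
from collections import deque


def build_can(n_peers, dims, seed=0):
    """Random CAN: zones are tuples of half-open (lo, hi) intervals."""
    rng = random.Random(seed)
    zones = [tuple((0.0, 1.0) for _ in range(dims))]
    while len(zones) < n_peers:
        z = zones.pop(rng.randrange(len(zones)))
        d = rng.randrange(dims)          # split dimension: assumed random
        lo, hi = z[d]
        mid = (lo + hi) / 2
        zones.append(z[:d] + ((lo, mid),) + z[d + 1:])
        zones.append(z[:d] + ((mid, hi),) + z[d + 1:])
    return zones


def contains(interval, x):
    lo, hi = interval
    return lo <= x < hi


def is_responsible(sender, receiver, origin):
    """True when `sender` is the one zone that must forward to `receiver`."""
    dims = range(len(origin))
    # D = first dimension on which the receiver misses the origin's coordinate.
    d = next((j for j in dims if not contains(receiver[j], origin[j])), None)
    if d is None:
        return False                     # the receiver is the initiator
    if receiver[d][0] > origin[d]:       # ascending direction
        if sender[d][1] != receiver[d][0]:
            return False
    elif sender[d][0] != receiver[d][1]:  # descending direction
        return False
    for j in dims:
        if j < d and not contains(sender[j], origin[j]):
            return False                 # spatial constraint
        if j > d and not contains(sender[j], receiver[j][0]):
            return False                 # corner constraint
    return True


def broadcast(zones, initiator):
    """Simulate forwarding; return (messages sent, receptions per peer)."""
    origin = tuple(lo for lo, _ in zones[initiator])
    received = [0] * len(zones)
    queue = deque([initiator])
    messages = 0
    while queue:
        src = queue.popleft()
        # A real peer would test only its neighbors; scanning every zone is
        # equivalent here because the predicate already implies adjacency.
        for dst in range(len(zones)):
            if dst != src and is_responsible(zones[src], zones[dst], origin):
                messages += 1
                received[dst] += 1
                queue.append(dst)
    return messages, received
```

The predicate uses only the origin carried in the message and the coordinates of the two zones involved, so each peer can evaluate it locally for its own neighbors.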
Experimental Setup • Using the Grid5000 platform • Multi-site experimentation • Deployment • From 50 to 1500 peers • Up to 200 physical machines • CAN setting • Successively split zones in half • Zone to split is chosen randomly
Number of Messages • Maximum gain of 5.3 MB
Execution Time • Significant speedup
Conclusion: Broadcast on CAN • We found an optimal solution • Proved to be correct and optimal • Efficient on large-scale settings • Supports range multicast • Currently in use in the EventCloud project [4] • Management of RDF data • Algorithm used for one year • Tested and approved! [Figure: a range multicast] [4] EventCloud: http://www.play-project.eu/solutions/event-cloud