A replication-driven overlay UIUC / INRIA Collaboration GDS meeting - LIP6
Goal: Large-scale replication • What are the difficulties? • Manage a large number of peers • IDs • Parameters • Efficiency of group communications • Solution: partial view of the group • Easier group management • Sub-group communications enabled
Solution 1: hierarchical architecture • Selected solution for • JuxMem (cluster-groups / juxmem-group) • Hierarchical failure detectors (local / global) • Principle: deterministically partition groups into sub-groups • Takes advantage of the underlying architecture • Hard to create and maintain (need for representative nodes for upper levels)
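The deterministic partitioning principle can be illustrated with a minimal Python sketch. It assumes each peer is tagged with the name of the cluster it runs in, and that the representative for the upper level is simply the smallest peer ID; neither choice comes from the slides, and `partition_by_cluster` and the cluster names are hypothetical.

```python
from collections import defaultdict


def partition_by_cluster(peers):
    """Group peers into sub-groups, one per cluster.

    peers: iterable of (peer_id, cluster_name) pairs (hypothetical input).
    Returns cluster_name -> {"members": [...], "representative": peer_id}.
    """
    groups = defaultdict(list)
    for peer_id, cluster in peers:
        groups[cluster].append(peer_id)
    # Deterministic result: members are sorted and the smallest ID is
    # elected representative (an assumption made for this sketch).
    return {
        cluster: {"members": sorted(members), "representative": min(members)}
        for cluster, members in groups.items()
    }


peers = [("p3", "cluster-A"), ("p1", "cluster-A"), ("p2", "cluster-B")]
hierarchy = partition_by_cluster(peers)
```

Because the result depends only on the set of peers and not on their arrival order, every node computes the same sub-groups. The maintenance cost mentioned on the slide shows up as soon as a representative fails and has to be replaced.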
Hierarchical architecture
Solution 2: randomly-chosen neighbors • Principle: • Neighbors are chosen randomly • Weights are used to drive the random choice • Self-organization • Hard to provide strong guarantees
Existing architectures • Structured overlays (DHT based) • Unstructured overlays (random neighbors) • No overlay takes the application into account • Overlay should be malleable, application-driven • Link peers with common interests
Application-driven overlay • None exists at the moment • General-purpose overlay: difficult to know which peer will communicate with which one • When using replication • Communications expected within replica-groups • All members of a replica-group should be close (hop count) in the overlay
Basic idea
Basic overlay: random-links • Random-links created to maintain peer connectivity • Soft limit n: maximum number of links • Hard limit according to the node resources • p: number of “close” neighbors (RTT) • n-p: number of “long-distance” links
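The split between p close neighbors and n-p long-distance links can be sketched as follows. This is only an illustration: it assumes the peer already holds RTT measurements for a set of candidates and that long-distance links are drawn uniformly at random from the remaining candidates. The function name and the input format are hypothetical, and the hard limit is not modeled.

```python
import random


def choose_random_links(rtt_ms, n, p, rng=None):
    """Pick up to n neighbors: the p lowest-RTT peers plus n-p random others.

    rtt_ms: candidate peer id -> measured round-trip time (hypothetical input).
    Returns (close_neighbors, long_distance_links).
    """
    if not 0 <= p <= n:
        raise ValueError("need 0 <= p <= n")
    rng = rng or random.Random()
    by_rtt = sorted(rtt_ms, key=rtt_ms.get)  # peer ids, lowest RTT first
    close = by_rtt[:p]
    remaining = by_rtt[p:]
    # Assumption: long-distance links are a uniform sample of the rest.
    far = rng.sample(remaining, min(n - p, len(remaining)))
    return close, far


rtt = {"a": 5, "b": 80, "c": 12, "d": 200, "e": 150, "f": 40}
close, far = choose_random_links(rtt, n=4, p=2, rng=random.Random(0))
```

Here `close` holds the two lowest-RTT peers (`a` and `c`), while `far` holds two of the four others.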
Overlay malleability: data-links • Creation of new replicas • Transforming random-links into data-links • Adding dataID to existing data-links • Adding data-links overriding the soft limit • Malleability • Overlay self-adaptation to make replicas closer
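A possible per-peer link table covering the three cases above is sketched here. It assumes a link is represented by the set of data IDs it carries, with an empty set meaning a plain random-link; the `LinkTable` class is hypothetical, and the hard limit tied to node resources is left out.

```python
class LinkTable:
    """Per-peer link table (illustrative sketch, not the authors' code)."""

    def __init__(self, soft_limit):
        self.soft_limit = soft_limit
        self.links = {}  # neighbor id -> set of data IDs (empty = random-link)

    def add_random_link(self, peer):
        """Random-links respect the soft limit; returns False when refused."""
        if peer in self.links:
            return True
        if len(self.links) >= self.soft_limit:
            return False
        self.links[peer] = set()
        return True

    def add_data_link(self, peer, data_id):
        """Turn a random-link into a data-link, add an ID to an existing
        data-link, or create a new data-link even past the soft limit."""
        self.links.setdefault(peer, set()).add(data_id)

    def data_links(self, data_id):
        """Neighbors reachable through data-links for data_id, sorted."""
        return sorted(p for p, ids in self.links.items() if data_id in ids)


table = LinkTable(soft_limit=2)
table.add_random_link("a")
table.add_random_link("b")
refused = not table.add_random_link("c")  # soft limit reached
table.add_data_link("a", "d1")  # random-link becomes a data-link
table.add_data_link("a", "d2")  # second dataID on the same link
table.add_data_link("z", "d1")  # new data-link overrides the soft limit
```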
New replica insertion
Refreshing the graph • New node insertion, replica creation and failures => pathological topologies • Periodically the graph is refreshed • Peers drop links and start random walks to create new ones
Random-walk • Data-independent • Using random-links and data-links • Useful for peer insertion, failure repair • Data-dependent • Using data-links only • Useful for new replica insertion • Principle • Max hop count • Next hop chosen randomly (weighted by stored data, RTT, number of links)
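Both walk variants can be sketched with a single function. The graph representation (each link labeled with the data IDs it carries) and the `weight` callback are assumptions made for this example; the slides do not say how stored data, RTT, and link count are combined into a weight, so the caller supplies that function.

```python
import random


def random_walk(graph, start, max_hops, weight, data_id=None, rng=None):
    """Weighted random walk bounded by max_hops; returns the visited path.

    graph: node -> {neighbor: set of data IDs carried by that link}
           (an empty set stands for a plain random-link).
    weight: function (node, neighbor) -> positive number (hypothetical hook).
    data_id: None for a data-independent walk over all links; otherwise only
             data-links carrying data_id are followed.
    """
    rng = rng or random.Random()
    path = [start]
    node = start
    for _ in range(max_hops):
        candidates = [nb for nb, ids in graph.get(node, {}).items()
                      if data_id is None or data_id in ids]
        if not candidates:
            break  # dead end: stop before reaching max_hops
        weights = [weight(node, nb) for nb in candidates]
        node = rng.choices(candidates, weights=weights, k=1)[0]
        path.append(node)
    return path


g = {
    "a": {"b": {"d1"}, "c": set()},
    "b": {"a": {"d1"}, "d": {"d1"}},
    "c": {"a": set()},
    "d": {"b": {"d1"}},
}
uniform = lambda node, neighbor: 1.0
any_path = random_walk(g, "a", 5, uniform, rng=random.Random(1))
data_path = random_walk(g, "a", 4, uniform, data_id="d1", rng=random.Random(1))
```

In the data-dependent call the walk never reaches `c`, because the link between `a` and `c` carries no data ID.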
Propagating updates • Use data-links • Data sent and forwarded on data-links with a version number to avoid cycles • No knowledge of the whole group, not even its size • => Concurrent writes may occur • Need to detect write conflicts
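How the version number stops an update from circulating forever can be shown with a small sequential simulation. It is a sketch under simplifying assumptions: one piece of data, one integer version per update, and a peer that forwards to all of its data-link neighbors, including the one it received the update from. The `propagate` function is hypothetical.

```python
from collections import deque


def propagate(data_links, origin, version, applied=None):
    """Flood one update over data-links.

    data_links: peer -> list of neighbors on data-links for this data.
    applied: peer -> highest version already applied (updated in place).
    Returns (applied, number_of_messages_sent).
    """
    applied = {} if applied is None else applied
    applied[origin] = version
    queue = deque([origin])
    messages = 0
    while queue:
        peer = queue.popleft()
        for neighbor in data_links.get(peer, []):
            messages += 1
            if applied.get(neighbor, 0) >= version:
                continue  # version already seen: drop it, which breaks cycles
            applied[neighbor] = version
            queue.append(neighbor)
    return applied, messages


# a, b and c form a cycle; d hangs off c.
links = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
state, sent = propagate(links, "a", version=1)
_, resent = propagate(links, "a", version=1, applied=state)
```

Each peer forwards a given version at most once, so the flood terminates despite the cycle. No peer needs to know the group membership or its size.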
Conflict detection • Each update is sent with the previous data hash • Each peer keeps a hash history for each piece of data • Upon update request reception • If the received hash matches the local one, the update is applied and forwarded • Else, the last local hash is sent back • If it is in the sender’s hash history, some updates are missing • Else, it’s a conflict => reported to the application
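The hash-history check can be sketched as follows. The slides do not name a hash function, so SHA-256 over the data content is an assumption, as are the `Replica` class and its method names; forwarding and the recovery of missing updates are left out.

```python
import hashlib


def digest(value):
    """Hash of a piece of data (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


class Replica:
    def __init__(self, value=""):
        self.value = value
        self.history = [digest(value)]  # oldest first; last = current hash

    def _apply(self, new_value):
        self.value = new_value
        self.history.append(digest(new_value))

    def write(self, new_value):
        """Local write; returns the (previous_hash, new_value) update to send."""
        previous = self.history[-1]
        self._apply(new_value)
        return previous, new_value

    def receive(self, previous_hash, new_value):
        """Apply the update if it builds on the local state.

        Returns ("applied", None) or ("rejected", last_local_hash).
        """
        if previous_hash == self.history[-1]:
            self._apply(new_value)
            return "applied", None
        return "rejected", self.history[-1]

    def classify_rejection(self, remote_hash):
        """Sender side: interpret the hash sent back by the receiver."""
        if remote_hash in self.history:
            return "missing-updates"  # the receiver is behind the sender
        return "conflict"  # divergent histories: report to the application


a, b, c, d = (Replica("v0") for _ in range(4))
first = b.receive(*a.write("v1"))   # b is up to date and applies v1
_, c_hash = c.receive(*a.write("v2"))  # c never saw v1
late = a.classify_rejection(c_hash)
_, b_hash = b.receive(*d.write("x1"))  # d wrote concurrently from v0
clash = d.classify_rejection(b_hash)
```

In this run `late` is `"missing-updates"`, because the hash returned by `c` appears in the history of `a`. `clash` is `"conflict"`, because `b` has already moved to a state that `d` has never seen.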
Simulation • Discrete event simulator • Using accurate topology files • Application fully configurable (replicate, read, write, time) • Try to make it “JuxMem-compliant” • 38 Java classes / ~3200 lines
Ongoing work • Need simulation data • Many parameters to tune • How the overlay is modified by data-links • Degree distribution • #random-links and #data-links wrt #data and #nodes • Refine algorithms • Theoretical analysis • JuxMem integration? Does it make sense?