Correctness of Gossip-Based Membership under Message Loss Maxim Gurevich Idit Keidar Technion
The Setting • Many nodes – n • 10,000s, 100,000s, 1,000,000s, … • Come and go • Churn • Fully connected network • Like the Internet • Every joining node knows some others • (Initial) Connectivity
Membership: Each node needs to know some live nodes • Each node has a view • Set of node ids • Supplied to the application • Constantly refreshed • Typical size – log n
Applications • Gossip-based algorithms • Unstructured overlay networks • Gathering statistics • Work best with random node samples • Gossip algorithms converge fast • Overlay networks are robust, good expanders • Statistics are accurate
Modeling Membership Views • Views modeled as a directed graph: an edge u → v means v is in u's view [figure: example view graph on nodes u, v, w, y]
Modeling Protocols: Graph Transformations • View is used for maintenance • Example: push protocol [figure: push step among nodes u, v, w, z]
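The push step above can be sketched in a few lines of Python (an illustrative sketch, not the authors' code; the `views` map and `push_round` are hypothetical names): a node copies its own id into the view of a randomly chosen view member.

```python
import random

def push_round(views, u):
    """One push step (sketch): u sends its own id to a random node v
    in its view; v adds u to its view. One-way message, nothing removed."""
    if not views[u]:
        return
    v = random.choice(sorted(views[u]))  # pick a random view member
    views[v].add(u)                      # v learns about u

# Tiny example: three nodes, each knowing one other.
views = {"u": {"v"}, "v": {"w"}, "w": {"u"}}
push_round(views, "u")   # u's only view member is v, so v learns u
```

Note that a push step only duplicates the sender's id and never removes anything, which is why indegrees can grow without bound.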
Desirable Properties? • Randomness • View should include random samples • Holy grail for samples: IID • Each sample uniformly distributed • Each sample independent of other samples • Avoid spatial dependencies among view entries • Avoid correlations between nodes • Good load balance among nodes
What About Churn? • Views should constantly evolve • Remove failed nodes, add joining ones • Views should evolve to IID from any state • Minimize temporal dependencies • Dependence on the past should decay quickly • Useful for applications requiring fresh samples
Global Markov Chain • A global state – all n views in the system • A protocol action – a transition between global states • Global Markov chain G [figure: two global states of nodes u, v linked by a transition]
Defining Properties Formally • Small views • Bounded dout(u) • Load balance • Low variance of din(u) • From any starting state, eventually (in the stationary distribution of the MC on G): • Uniformity • Pr(v ∈ u.view) = Pr(w ∈ u.view) • Spatial independence • Pr(v ∈ u.view | y ∈ w.view) = Pr(v ∈ u.view) • Perfect uniformity + spatial independence ⇒ load balance
Temporal Independence • Time to obtain views independent of the past • From an expected state • Refresh rate in the steady state • Would have been much longer had we considered starting from an arbitrary state • O(n^14) [Cooper09]
Existing Work: Practical Protocols • Push protocol • Tolerates asynchrony, message loss • Studied only empirically • Good load balance [Lpbcast, Jelasity et al 07] • Fast decay of temporal dependencies [Jelasity et al 07] • Induces spatial dependence [figure: push steps among nodes u, v, w, z]
Existing Work: Analysis • Shuffle protocol • Analyzed theoretically [Allavena et al 05, Mahlmann et al 06] • Uniformity, load balance, spatial independence • Weak (worst-case) bounds on temporal independence • Unrealistic assumptions – hard to implement • Atomic actions with bi-directional communication • No message loss [figure: shuffle step between nodes u, v, w, z]
Our Contribution: Bridge This Gap • A practical protocol • Tolerates message loss, churn, failures • No complex bookkeeping for atomic actions • Formally prove the desirable properties • Including under message loss
Send & Forget Membership • The best of push and shuffle • Some view entries may be empty [figure: u sends two ids from its view to v]
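Based on the description in these slides, one plausible reading of a Send & Forget step can be sketched as follows (an assumption-laden sketch, not the paper's exact pseudocode; the message contents ⟨u, w⟩ and all names are hypothetical). Views are treated as multisets (lists), since duplicate entries can arise:

```python
import random

def send_and_forget_round(views, u):
    """One Send & Forget step (illustrative reading): u picks two distinct
    ids v, w from its view, sends the message <u, w> to v, and forgets
    both sent entries; on delivery, v adds u and w to its view."""
    candidates = sorted(set(views[u]))
    if len(candidates) < 2:
        return
    v, w = random.sample(candidates, 2)
    views[u].remove(v)   # "forget": drop one occurrence of each sent id
    views[u].remove(w)
    views[v] += [u, w]   # delivery: v learns u and w

# Example: u knows v and w; after one step u's view empties and the
# randomly chosen target learns two ids.
views = {"u": ["v", "w"], "v": [], "w": []}
send_and_forget_round(views, "u")
```

No acknowledgement or reply is needed, which is what makes the step one-way and loss-tolerant.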
S&F: Message Loss • Sent ids are dropped when the message is lost • Or when there are no empty entries in v's view [figure: views of u and v before and after a lost message]
S&F: Compensating for Loss • Edges (view entries) disappear due to loss • Need to prevent views from emptying out • Keep the sent ids when too few ids remain in the view • Push-like when views are too small [figure: u keeps the sent ids]
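The compensation rule can be sketched by adding a size threshold to the send step (a hedged sketch: `VIEW_THRESHOLD`, the message format, and the protocol reading are all assumptions, not the paper's exact parameters):

```python
import random

VIEW_THRESHOLD = 4  # hypothetical lower bound on view size

def sf_send_compensating(views, u):
    """Send step with loss compensation (sketch): when u's view is small,
    u keeps the ids it sends (push-like) instead of forgetting them, so
    repeated message loss cannot empty the views out."""
    candidates = sorted(set(views[u]))
    if len(candidates) < 2:
        return None
    v, w = random.sample(candidates, 2)
    if len(views[u]) > VIEW_THRESHOLD:
        views[u].remove(v)   # normal Send & Forget: forget the sent ids
        views[u].remove(w)
    # else: view too small -- keep v and w, tolerating possible loss
    return v, (u, w)         # message <u, w> addressed to v

# Example: u's view is below the threshold, so the sent ids are kept.
views = {"u": ["v", "w"], "v": [], "w": []}
msg = sf_send_compensating(views, "u")
```

The threshold trades a little of the shuffle-like behavior for push-like robustness only when it is actually needed.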
S&F: Advantages over Other Protocols • No bi-directional communication • No complex bookkeeping • Tolerates message loss • Simple • Without unrealistic assumptions • Amenable to formal analysis • Easy to implement
Key Contribution: Analysis • Degree distribution • Closed-form approximation without loss • Degree Markov Chain with loss • Stationary distribution of MC on the global graph G • Uniformity • Spatial Independence • Temporal Independence • Hold even under (reasonable) message loss!
Degree Distribution without Loss • In all reachable graphs: • dout(u) + 2·din(u) = const • Better than in a random graph – indegree bounded • Uniform stationary distribution on reachable states in G • Combinatorial approximation of the degree distribution • The fraction of reachable graphs with a specified node degree • Ignoring dependencies among nodes
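One reading of a lossless Send & Forget step — u forgets two ids v and w, and the target v learns u and w — preserves dout(u) + 2·din(u) for every node, and a short simulation can check this (an illustrative sketch; views are treated as multisets, and all names are hypothetical):

```python
import random

def sf_step(views, u):
    """One lossless Send & Forget step under an illustrative reading:
    u forgets two distinct ids v, w; the target v learns u and w."""
    candidates = sorted(set(views[u]))
    if len(candidates) < 2:
        return
    v, w = random.sample(candidates, 2)
    views[u].remove(v)
    views[u].remove(w)
    views[v] += [u, w]

def potential(views, u):
    """dout(u) + 2*din(u), counting multiplicities (views as multisets)."""
    d_out = len(views[u])
    d_in = sum(view.count(u) for view in views.values())
    return d_out + 2 * d_in

nodes = list(range(8))
views = {u: [x for x in nodes if x != u] for u in nodes}  # start: a clique
before = [potential(views, u) for u in nodes]             # 7 + 2*7 = 21 each
for _ in range(200):
    sf_step(views, random.choice(nodes))
after = [potential(views, u) for u in nodes]
assert before == after   # the invariant holds without message loss
```

Intuitively, the sender trades two outgoing edges for one incoming edge (its id lands in the target's view), and the target trades one incoming edge for two outgoing ones, so the weighted sum is unchanged at both ends.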
Degree Distribution without Loss: Results • Similar to (in fact better than) that of a random graph • Validated by a more accurate Markov model
Setting Degree Thresholds to Compensate for Loss • Note: the dout(u) + 2·din(u) = const invariant no longer holds – indegree is not bounded
Degree Markov Chain • Given loss rate, degree thresholds, and degree distributions • Iteratively compute the stationary distribution [figure: Markov chain over (indegree, outdegree) states; the state (0, 0) corresponds to an isolated node; some transitions occur without loss, others are due to loss]
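"Iteratively compute the stationary distribution" amounts to power iteration on the chain's transition matrix. A generic sketch on a toy two-state chain (the matrix below is made up for illustration and is not the degree chain from the analysis):

```python
def stationary(P, iters=1000):
    """Power iteration: repeatedly apply pi <- pi * P until pi converges
    to the stationary distribution of the row-stochastic matrix P."""
    n = len(P)
    pi = [1.0 / n] * n   # start from the uniform distribution
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Toy 2-state chain (hypothetical numbers, not the paper's degree chain).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary(P)   # converges to [5/6, 1/6] for this matrix
```

The real chain has one state per (indegree, outdegree) pair, but the fixed-point computation is the same.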
Results • Outdegree is bounded by the protocol • Decreases with increasing loss • Indegree is not bounded by the protocol • Still, its variance is low, even under loss • Typical overload at most 2x
Uniformity • Simple! • Nodes are identical • Graphs where u → v are isomorphic to graphs where u → w • Same probability in the stationary distribution
Decay of Spatial Dependencies • Assume initially > 2/3 of view entries are independent ⇒ good expander • For uniform loss < 15%, dependencies decay faster than they are created [figure: dependencies arise when u does not delete the sent ids]
Decay of Spatial Dependencies: Results • A 1 – 2·(loss rate) fraction of view entries are independent • E.g., for a loss rate of 3%, more than 90% of entries are independent
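The quoted numbers follow directly from the bound (a trivial check; the function name is ours):

```python
def independent_fraction(loss_rate):
    """Lower bound from the analysis: a (1 - 2*loss_rate) fraction of
    view entries remain independent."""
    return 1 - 2 * loss_rate

# 3% uniform loss still leaves more than 90% of entries independent:
# 1 - 2*0.03 = 0.94.
assert independent_fraction(0.03) > 0.90
```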
Temporal Independence • Start from an expected state • Uniform and spatially independent views • High “expected conductance” of G • Short mixing time • While staying in the “good” component
Temporal Independence: Results • Ids travel fast enough • Reach random nodes in O(log n) hops • Due to “sufficiently many” independent ids in views • Dependence on past views decays within O(log n · view size) time
Conclusions • Formalized the desired properties of a membership protocol • Send & Forget protocol • Simple for both implementation and analysis • Analysis under message loss • Load balance • Uniformity • Spatial Independence • Temporal Independence