
Gossip Algorithms

Gossip Algorithms. Presented by George Frederick. Introduction. Designing scalable P2P application-level protocols isn’t straightforward and is still being actively researched




  1. Gossip Algorithms Presented by George Frederick

  2. Introduction • Designing scalable P2P application-level protocols isn’t straightforward and is still being actively researched • Gossip algorithms are effective solutions for information dissemination in large-scale systems, especially for P2P networks • Gossip algorithms are inherently easy to deploy, robust, and resilient to failures

  3. Gossip Algorithm Properties • Mimic the spread of contagious diseases • Difficult to destroy once set loose into a population • Much research has gone into stopping epidemics, but little into aiding them • Nodes randomly choose other nodes to pass information to, cascading it throughout the system along many channels at once

  4. Gossip Algorithm Properties • Buffer Capacity (b) • The maximum number of messages each node holds in its buffer • Relay Count (t) • The number of times a node relays a given message • Fanout (f) • The number of recipients a node picks, from the pool of processes visible to it, each time it relays a message

  5. Gossip Algorithm Properties • Most gossip algorithms are distinguished by varying values for b, t, and f • These parameters may be independent of the number (n) of processes in the system but work best when they scale with it • If b, t, and f are properly tuned, the same guarantees that deterministic algorithms make can also apply to gossip algorithms
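The effect of the parameters can be explored with a small simulation. The following is a minimal sketch, not a reference implementation: `simulate_gossip` and its arguments are hypothetical names, the fanout is taken to be the number of targets chosen per relay, every node is assumed to know all n nodes, and the buffer capacity b is ignored because only one message is in flight.

```python
import random

def simulate_gossip(n, fanout, relay_count, seed=0):
    """Push-gossip one message from node 0; return how many nodes it reaches.

    Assumptions of this sketch: full membership knowledge, synchronous
    rounds, no message loss, and a single message (so b plays no role).
    """
    rng = random.Random(seed)
    informed = {0}
    # remaining[node] = rounds this node will still relay the message (t)
    remaining = {0: relay_count}
    while remaining:
        next_remaining = {}
        for node, left in remaining.items():
            others = [p for p in range(n) if p != node]
            # choose f distinct recipients uniformly at random
            for target in rng.sample(others, min(fanout, len(others))):
                if target not in informed:
                    informed.add(target)
                    next_remaining[target] = relay_count
            if left > 1:
                next_remaining[node] = left - 1
        remaining = next_remaining
    return len(informed)

if __name__ == "__main__":
    for f in (1, 2, 4):
        print(f, simulate_gossip(n=1000, fanout=f, relay_count=3))
```

Running it for several values of `fanout` and `relay_count` gives a feel for how coverage depends on f and t relative to n.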

  6. Practical Issues • Membership Maintenance • How do processes become aware of other processes? • Network Awareness • How do processes maintain a consistent view of overall network topology?

  7. Practical Issues • Buffer Management • How do processes handle messages when the message buffer becomes full? • Message Filtering • How do processes know which messages are relevant to send to which nodes?

  8. Membership Maintenance • Important because the view each process has of other processes affects the behavior of the entire algorithm • Maintaining a full list of all processes at every process costs too much storage and network traffic • Therefore each process keeps only a partial view: a subset of the processes in the system

  9. Membership Maintenance • Tradeoff must be made between reliability and scalability • The more knowledge a process has about the system, the more storage and network traffic it requires • Knowing too little about the system increases the odds that processes become isolated and cannot effectively spread their messages throughout the system
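One common way to keep views small is to merge the view of a gossip partner into the local one and then trim back to a fixed size. The sketch below illustrates that idea only; `merge_views` and `max_size` are hypothetical names, and uniform random trimming is an assumption rather than something the slides prescribe.

```python
import random

def merge_views(own_id, own_view, peer_id, peer_view, max_size, rng):
    """Merge a peer's partial view into ours and trim back to max_size."""
    # Everything either side knows about, plus the peer itself, minus ourselves.
    candidates = (set(own_view) | set(peer_view) | {peer_id}) - {own_id}
    if len(candidates) <= max_size:
        return candidates
    # Sort first so the sample is reproducible for a seeded rng.
    return set(rng.sample(sorted(candidates), max_size))

if __name__ == "__main__":
    rng = random.Random(42)
    print(merge_views("a", {"b", "c"}, "d", {"a", "e", "f"}, 3, rng))
```

A smaller `max_size` saves memory and traffic but raises the risk of isolated processes, which is the reliability/scalability tradeoff described above.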

  10. Network Awareness • Membership alone doesn’t account for the fact that not all processes are equally cheap to reach • Some processes run at remote sites, and sending traffic to geographically distant processes just to communicate with nearby ones is wasteful • A possible solution is to organize processes into hierarchies based on geographical distance
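A very simple form of network awareness is to bias target selection toward nearby peers. The sketch below assumes a flat two-level grouping by region instead of a full hierarchy; `pick_target`, `region_of`, and `p_local` are hypothetical names introduced for illustration.

```python
import random

def pick_target(self_id, region_of, p_local, rng):
    """Pick a gossip target, preferring peers in the sender's own region.

    region_of maps process id -> region label; p_local is the probability
    of staying inside the local region when both choices are available.
    """
    my_region = region_of[self_id]
    local = sorted(p for p, r in region_of.items()
                   if r == my_region and p != self_id)
    remote = sorted(p for p, r in region_of.items() if r != my_region)
    if local and (not remote or rng.random() < p_local):
        return rng.choice(local)
    if remote:
        return rng.choice(remote)
    return None  # no other process is known

if __name__ == "__main__":
    regions = {"a": "eu", "b": "eu", "c": "us", "d": "us"}
    rng = random.Random(7)
    print([pick_target("a", regions, 0.8, rng) for _ in range(10)])
```

Keeping `p_local` below 1 matters: some cross-region traffic is needed, otherwise messages never leave the region where they originate.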

  11. Buffer Management • How to deal with new messages when the message buffer is full • If new messages are rejected, new information never gets disseminated • If the oldest message is discarded instead, it may not yet have been fully propagated • Can be addressed with a couple of methods • Prioritization • Time stamping
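The two methods can be combined in a bounded buffer. The following sketch is one possible policy, stated as an assumption: when the buffer is full it evicts the message this node has already relayed most often, breaking ties by the oldest timestamp. `BufferedMessage` and `GossipBuffer` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class BufferedMessage:
    msg_id: str
    created_at: float   # timestamp assigned by the original sender
    relays: int = 0     # times this node has already forwarded it

class GossipBuffer:
    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.messages = {}

    def add(self, message):
        """Insert a message; return the evicted message, or None."""
        if message.msg_id in self.messages:
            return None  # duplicate, nothing to do
        evicted = None
        if len(self.messages) >= self.capacity:
            # Priority: most-relayed first, then oldest timestamp.
            victim = max(self.messages.values(),
                         key=lambda m: (m.relays, -m.created_at))
            evicted = self.messages.pop(victim.msg_id)
        self.messages[message.msg_id] = message
        return evicted

if __name__ == "__main__":
    buf = GossipBuffer(capacity=2)
    buf.add(BufferedMessage("m1", created_at=1.0, relays=3))
    buf.add(BufferedMessage("m2", created_at=2.0))
    print(buf.add(BufferedMessage("m3", created_at=3.0)))
```

Evicting well-relayed messages first protects fresh information, while the timestamp tie-break keeps the policy deterministic.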

  12. Message Filtering • It seems obvious that messages unnecessary to a node should be filtered • It is difficult to know which processes are interested in what messages • If a process isn’t interested, other processes in its view still might be • Can be addressed through the aforementioned hierarchy by also grouping by interest type

  13. Modeling • Gossip algorithms are predominantly evaluated empirically rather than theoretically • Theoretical models do not adequately encompass all factors present in practical situations • Real world networks can change dynamically, but most models assume a static network structure

  14. Key Elements • Problem • A distributed computing problem for the gossip algorithm to solve • System • The communication network and operating environment • Complexity • Time Complexity • Connectivity Complexity • Space Complexity

  15. Time Complexity • Total number of delivery rounds, measured from start state to termination • Because gossip algorithms are nondeterministic, it is useful to measure time complexity as the probability that a given number of nodes will have been reached by a given round
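That probability can be estimated empirically by repeating a simulated run many times. The sketch below is a rough Monte Carlo estimate under simplifying assumptions (synchronous rounds, full membership, no loss, unlimited relays); `informed_after` and `prob_reached` are hypothetical helper names.

```python
import random

def informed_after(n, fanout, rounds, rng):
    """Push gossip from node 0; return how many nodes are informed after `rounds`."""
    informed = {0}
    for _ in range(rounds):
        newly = set()
        for node in informed:
            others = [p for p in range(n) if p != node]
            newly.update(rng.sample(others, min(fanout, len(others))))
        informed |= newly
    return len(informed)

def prob_reached(n, fanout, rounds, threshold, trials=1000, seed=0):
    """Estimate P(at least `threshold` nodes are informed by round `rounds`)."""
    rng = random.Random(seed)
    hits = sum(informed_after(n, fanout, rounds, rng) >= threshold
               for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for r in range(1, 9):
        print(r, prob_reached(n=100, fanout=2, rounds=r, threshold=95, trials=200))
```

Printing the estimate for increasing round counts shows the typical sharp transition from "almost never" to "almost always" reaching the threshold.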

  16. Connectivity Complexity • Total communication channels established throughout the course of execution • Connections can be gained and lost dynamically throughout execution

  17. Space Complexity • The total amount of memory devoted to the algorithm throughout its execution • Space is used for views, message buffering, and history management • View measurements should be a function of network size • Message and history data should be a function of the total amount of data transferred over network channels throughout execution

  18. Problem Families • Three typical groupings of gossip algorithms • Information Spread • How to effectively relay a message to the rest of the network • Aggregate Computation • How to gather data from each node to perform a computation • Overlay Management • How to organize the network in such a way that it exhibits some gestalt property
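Aggregate computation can be illustrated with gossip-based averaging, a well-known technique that is not named in the slides: in each exchange two nodes replace their values with the pair's mean, so the global sum is preserved while all values drift toward the global average. The function name and the synchronous round structure below are assumptions of this sketch.

```python
import random

def gossip_average(values, rounds, seed=0):
    """Pairwise averaging; requires at least two nodes.

    Returns the per-node estimates after `rounds` rounds, where in each
    round every node averages its value with one random partner.
    """
    rng = random.Random(seed)
    state = list(values)
    n = len(state)
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # uniform choice among the other n - 1 nodes
            mean = (state[i] + state[j]) / 2
            state[i] = state[j] = mean
    return state

if __name__ == "__main__":
    print(gossip_average([0.0, 10.0, 20.0, 30.0], rounds=20))
```

No node ever sees all the data, yet every local estimate approaches the true mean.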

  19. Internal Process • Communication Phase • Choose subset of communication partners from local view and exchange information • Processing Phase • Perform state transition from current state to new state, determined by internal structure and new message information • Determines what message will be sent in the next round

  20. Node Structure and Behavior • Transition Model • Two basic modes: push or pull • Push mode spreads information to other nodes • Pull mode gathers information from other nodes • Communication Strategy • Determines how many nodes to communicate with and which to choose in each round • Can be deterministic or random
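The difference between the two modes can be made concrete with one round of each. This is a minimal sketch assuming a random communication strategy with a single partner per round, full membership, and at least two nodes; `push_round` and `pull_round` are hypothetical names.

```python
import random

def _random_other(node, n, rng):
    """Uniformly pick a node id in range(n) other than `node`."""
    other = rng.randrange(n - 1)
    return other + 1 if other >= node else other

def push_round(informed, n, rng):
    """Each informed node sends the message to one random other node."""
    result = set(informed)
    for node in informed:
        result.add(_random_other(node, n, rng))
    return result

def pull_round(informed, n, rng):
    """Each uninformed node asks one random other node for the message."""
    result = set(informed)
    for node in range(n):
        if node not in informed and _random_other(node, n, rng) in informed:
            result.add(node)
    return result

if __name__ == "__main__":
    rng = random.Random(3)
    state = {0}
    for r in range(6):
        state = push_round(state, 64, rng)
        print("push round", r + 1, len(state))
```

In this model push tends to do most of its work while few nodes are informed, and pull while few nodes are still uninformed, which is why the two are often combined.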

  21. Node Structure and Behavior • Buffer Management and Message Size • Determines how long to keep sharing a message • What messages to send when • What to do with duplicates • History Buffer Size • Related to buffer management • Determines if message has already been received and what to do if it has or hasn’t
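Duplicate detection with a bounded history can be sketched as follows. `SeenHistory` and `max_ids` are hypothetical names, and forgetting the oldest identifier first is an assumed policy.

```python
from collections import OrderedDict

class SeenHistory:
    """Remembers the most recent message ids, up to max_ids of them."""

    def __init__(self, max_ids):
        self.max_ids = max_ids
        self._seen = OrderedDict()

    def first_time(self, msg_id):
        """Return True if msg_id is new (and record it), False for a duplicate."""
        if msg_id in self._seen:
            return False
        self._seen[msg_id] = True
        if len(self._seen) > self.max_ids:
            self._seen.popitem(last=False)  # forget the oldest id
        return True

if __name__ == "__main__":
    history = SeenHistory(max_ids=2)
    for msg in ["a", "a", "b", "c", "a"]:
        print(msg, history.first_time(msg))
```

The last call in the demo shows the cost of a small history: once an identifier has been forgotten, a late duplicate is treated as new and relayed again.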

  22. Network Topology • Degree Distribution • More edges leaving a node means more candidates to receive its messages • More edges entering a node means a higher likelihood of receiving messages • Scale-free networks may spread information very efficiently or not efficiently at all, depending on the tuning of the gossip algorithm

  23. Network Topology • Closeness • Average distance between nodes • Each edge traversed has associated with it a probability that the message will be dropped • The closer a node is to a desired destination node, the more likely it is that the destination node will receive the message
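The effect of distance can be quantified under a simplifying assumption that the slides do not state: every edge drops a message independently with the same probability p (a symbol introduced here). A message forwarded along a single path of d edges then arrives only if it survives every hop:

```latex
\[
  P(\text{delivered over } d \text{ edges}) = (1 - p)^{d},
  \qquad \text{e.g. } p = 0.1,\ d = 3:\ (0.9)^{3} = 0.729 .
\]
```

The probability decays geometrically with d, which is why a smaller average distance improves the odds of delivery along any one path.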

  24. Network Topology • Betweenness • How many shortest paths a node lies on • A high-betweenness node can act as a choke point • Gossip algorithms perform best when many routes are available

  25. Network Topology • Eigenvector Centrality • Describes how well connected a node is to “popular” nodes • “Popular” nodes are generally more likely to receive messages due to their high degree • The more connections a node has to popular nodes, the higher the chance that it will receive messages

  26. Network Topology • Assortativity/Disassortativity • Describes tendency for nodes to form connections to similar or dissimilar nodes • High assortativity could help facilitate faster spread of specialized information among groups of nodes that need it • High disassortativity could hinder the spread of information, as the nodes desiring the information may be spread far apart

  27. Network Topology • Connectivity and Density • Nodes in disconnected subgraphs cannot be reached • Higher edge density can increase the rate of spread, as more routes are available for traversal • Conversely, high density can hinder spread when it produces a large volume of redundant messages • The outcome depends on gossip algorithm tuning and implementation
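Several of the topology measures discussed above can be computed for a small graph with a plain breadth-first search. The sketch below assumes the network is given as an adjacency-list dictionary; `bfs_distances` and `topology_summary` are hypothetical names, and only degree, reachability, and average distance are covered.

```python
from collections import deque

def bfs_distances(graph, source):
    """Hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def topology_summary(graph):
    """Per-node out-degree, reachable-node count, and average distance."""
    summary = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        reachable = len(dist) - 1  # exclude the node itself
        avg = sum(dist.values()) / reachable if reachable else None
        summary[node] = {"out_degree": len(graph[node]),
                         "reachable": reachable,
                         "avg_distance": avg}
    return summary

if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
    for node, stats in topology_summary(graph).items():
        print(node, stats)
```

In the demo graph, node `d` has no edges, so its `reachable` count is 0: no tuning of the gossip parameters can deliver a message across a disconnected component.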

  28. Questions?

  29. References • P. T. Eugster, R. Guerraoui, A.-M. Kermarrec, and L. Massoulié. "Epidemic Information Dissemination in Distributed Systems." Computer, vol. 37, no. 5, pp. 60-67, May 2004. • Y. Fernandess, A. Fernández, and M. Monod. "A Generic Theoretical Framework for Modeling Gossip-Based Algorithms." SIGOPS Oper. Syst. Rev., vol. 41, no. 5, pp. 19-27, Oct. 2007. DOI: http://doi.acm.org/10.1145/1317379.1317384 • D. J. Watts, P. S. Dodds, and M. E. J. Newman. "Identity and Search in Social Networks." Science, vol. 296, no. 5571, 2002. • I. Gupta, A.-M. Kermarrec, and A. J. Ganesh. "Efficient Epidemic-Style Protocols for Reliable and Scalable Multicast." 21st IEEE Symposium on Reliable Distributed Systems (SRDS'02), p. 180, 2002. • R. Karp, C. Schindelhauer, S. Shenker, and B. Vöcking. "Randomized Rumor Spreading." 41st Annual Symposium on Foundations of Computer Science (FOCS), p. 565, 2000. • L. Alvisi, J. M. Doumen, R. Guerraoui, B. Koldehofe, H. Li, R. van Renesse, and G. Tredan. "How Robust Are Gossip-Based Communication Protocols?" Operating Systems Review, vol. 41, no. 5, pp. 14-18, 2007. ISSN 0163-5980.
