Epidemics

Epidemics. Presented By: Lucas Cook and Wade Fagen CS 525, The University of Illinois (UIUC) 6 February 2007. History. Two schools of algorithms multicast: Proactive Reactive Existing Algorithms/Implementations: SRM IP Multicast (best-effort) NNTP (gossip) IRC (hierarchical multicast)


  1. Epidemics Presented By: Lucas Cook and Wade Fagen CS 525, The University of Illinois (UIUC) 6 February 2007

  2. History • Two schools of multicast algorithms: • Proactive • Reactive • Existing Algorithms/Implementations: • SRM • IP Multicast (best-effort) • NNTP (gossip) • IRC (hierarchical multicast) • PSYC (multicast; more “web”-like than IRC)

  3. Multicast Routing • “Reliable” Multicast Routing: • Ensure that a message sent from any node is received by all other nodes within any distributed system. • …but we don’t live in an “ideal” world.

  4. Multicast Routing • Three general categories: • Algorithms that provide “strong” reliability properties. • Ex: atomic multicasts [Figure: timeline of rounds #1, #2, #3, …]

  5. Atomic Multicast • Nodes only process messages at the beginning of rounds • A new round doesn’t start until all messages from previous rounds are received [Figure: timeline of rounds #1, #2, #3, …]

  6. Multicast Routing • Three general categories: • “strong” reliability algorithms • “best-effort” reliability algorithms • Ex: MUSE algorithm • Provide no end-to-end reliability assurance • Problems and solutions exist both at the physical network layer and at the overlay layer • Focus of distributed systems: overlay

  7. “Best Effort” Multicast Routing • Many algorithms implement some neighbor-based approach…

  8. “Best Effort” Multicast Routing • End-to-end assurance may be lost by one node’s failure:

  9. Multicast Routing • Three general categories: • “strong” reliability algorithms • “best-effort” reliability algorithms • “proactively probabilistic” multicast algorithms • Provides predictable reliability • Goal: Achieve better reliability than “best-effort” without the overhead of “strong” reliability • Method: Epidemics

  10. Epidemic Algorithms • Epidemics provide probabilistic end-to-end reliability with an “almost all” or “almost none” outcome: a message reaches nearly every node or nearly none • Tradeoff between scale and reliability: epidemics allow for expansive scale with near-perfect reliability

  11. Epidemic Algorithms • For brevity, the following citations are used throughout the presentation: • [1]: Bimodal multicast, K. Birman et al., ACM TOCS 1999 • [2]: Epidemic algorithms for replicated database maintenance, A. Demers et al., PODC 1987 • [3]: Gossip-based ad hoc routing, Z. Haas et al., Infocom 2002

  12. Epidemic Algorithms in Databases • Site updating has been a key problem since the beginning of distributed database work: • Data is injected at one site • Data needs to be updated at every site [Figure: incoming transaction arriving at one site]

  13. Epidemic Algorithms in Databases • Classic Examples: • NNTP • First use of e-mail servers • etc…

  14. Epidemic Algorithms in Databases • Three core concepts: • Direct Communication • Bottleneck! • Anti-Entropy Measure • Possible full comparison (slow!) • Rumor Management • Only updates!

  15. Epidemic Algorithms in Databases • Three states of a message/node: • Susceptible: Message not received at node • Infective: Message is actively propagated by node • Removed: Message is no longer actively propagated by node

  16. Epidemic Algorithms in Databases • Two-phase algorithm: • Phase 1: Rumor Mongering • Probabilistic spread of messages to (hopefully) nearly all nodes • Tradeoffs between Push/Pull models

  17. Epidemic Algorithms in Databases • Two-phase algorithm: • Phase 1: Rumor Mongering • Phase 2: Epidemic (Anti-Entropy) • Runs periodically in the background • Runs at each node
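
The two phases can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the algorithm from [2]: the function names, the `stop_prob` parameter (the chance that a node stops spreading after contacting a peer that already has the update), and uniform random peer selection are all hypothetical choices made for illustration.

```python
import random
from enum import Enum


class State(Enum):
    SUSCEPTIBLE = "susceptible"  # has not received the update
    INFECTIVE = "infective"      # actively spreading the update
    REMOVED = "removed"          # has the update, no longer spreading it


def random_peer(node, n_nodes, rng):
    """Pick a uniformly random node other than `node` (needs n_nodes >= 2)."""
    peer = rng.randrange(n_nodes - 1)
    return peer + 1 if peer >= node else peer


def rumor_mongering(n_nodes, stop_prob=0.25, seed=0):
    """Phase 1: spread from node 0 until no node is infective."""
    rng = random.Random(seed)
    states = [State.SUSCEPTIBLE] * n_nodes
    states[0] = State.INFECTIVE
    while State.INFECTIVE in states:
        spreaders = [i for i, s in enumerate(states) if s is State.INFECTIVE]
        for node in spreaders:
            peer = random_peer(node, n_nodes, rng)
            if states[peer] is State.SUSCEPTIBLE:
                states[peer] = State.INFECTIVE
            elif rng.random() < stop_prob:
                # Peer already knew the rumor: lose interest with some probability.
                states[node] = State.REMOVED
    return states


def anti_entropy_round(has_update, rng):
    """Phase 2: every node syncs with one random peer (push and pull)."""
    n_nodes = len(has_update)
    for node in range(n_nodes):
        peer = random_peer(node, n_nodes, rng)
        if has_update[node] or has_update[peer]:
            has_update[node] = has_update[peer] = True
```

Rumor mongering may leave a few nodes susceptible; repeated calls to `anti_entropy_round` on a list such as `[s is not State.SUSCEPTIBLE for s in states]` pick up that residue.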

  18. Epidemic Push/Pull • Generic Epidemic Message • An epidemic message contains a summary of recent events • Two types: “push” and “pull” • The different types of messages allow formalization of the mathematics

  19. Epidemic Push/Pull • A “push” is a message sent from some infected site to a susceptible site. [Figure: an infected site pushes an incoming transaction to a susceptible site]

  20. Epidemic Push/Pull • A “pull” is a message sent from some susceptible site to an infected site. [Figure: push and pull messages spreading an incoming transaction]
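
The difference between the two message types can be sketched as one round of each. This is an illustrative model only: the function names, the use of integer node IDs, and uniform random peer choice are assumptions, not details taken from [2].

```python
import random


def random_peer(node, n_nodes, rng):
    """Pick a uniformly random node other than `node` (needs n_nodes >= 2)."""
    peer = rng.randrange(n_nodes - 1)
    return peer + 1 if peer >= node else peer


def push_round(infected, n_nodes, rng):
    """Each infected site sends the update to one random peer."""
    newly_infected = {random_peer(node, n_nodes, rng) for node in infected}
    return infected | newly_infected


def pull_round(infected, n_nodes, rng):
    """Each susceptible site asks one random peer and gets the update if the peer has it."""
    newly_infected = {
        node
        for node in range(n_nodes)
        if node not in infected and random_peer(node, n_nodes, rng) in infected
    }
    return infected | newly_infected
```

In this model a push round can at most double the infected set, while a pull round can do nothing until at least one site is infected, which is one reason the two are often analyzed separately.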

  21. Epidemic Algorithms in Databases • With the general idea, the specifics of [2] relate to databases: • Two primary distributed operations • “Data Insertion”: INSERT, UPDATE • “Data Deletion”: DELETE • Epidemic messages for DELETE are augmented with a “death certificate” • In [2], SELECT is simply done locally at each distributed end point of the database
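
Why a DELETE needs a death certificate can be shown with a toy replica. The sketch below is hypothetical: the `Replica` class, its method names, and the integer timestamps with newest-wins merging are assumptions for illustration, not the data model of [2].

```python
DEATH = object()  # marker stored in place of a value: the "death certificate"


class Replica:
    def __init__(self):
        self.entries = {}  # key -> (timestamp, value or DEATH)

    def _apply(self, key, timestamp, value):
        # Newest timestamp wins; older information is ignored.
        current = self.entries.get(key)
        if current is None or timestamp > current[0]:
            self.entries[key] = (timestamp, value)

    def write(self, key, value, timestamp):
        self._apply(key, timestamp, value)

    def delete(self, key, timestamp):
        # Keep a timestamped certificate instead of simply dropping the key.
        self._apply(key, timestamp, DEATH)

    def read(self, key):
        # Reads are purely local, as with SELECT on the slide.
        entry = self.entries.get(key)
        if entry is None or entry[1] is DEATH:
            return None
        return entry[1]

    def merge_from(self, other):
        # One direction of an anti-entropy exchange.
        for key, (timestamp, value) in other.entries.items():
            self._apply(key, timestamp, value)
```

If `delete` merely removed the key, a later exchange with a replica still holding the old value would bring the item back; the certificate lets the deletion win the merge.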

  22. Epidemic Algorithms in Databases • Results published in [2] • Key Result: Replacing deterministic algorithms for database consistency • Actual Results: Simulation-based solution only • Showed internal based results • Simulation of traditional schemes wasn’t done for accurate comparison

  23. Bimodal Multicast • [1] presents a bimodal multicast algorithm called pbcast • pbcast := “probabilistic broadcast”

  24. The pbcast Algorithm (from [1]) • Six Properties: • Atomicity (probabilistically) • Throughput Stability • Ordering (FIFO) • Multicast Stability • Detection of Lost Messages • Scalability • [Acceptability of soft failures]

  25. The pbcast Algorithm (from [1]) • Two sub-protocols: • Part 1: Hierarchical broadcast • Unreliable, “best-effort” approach • Part 2: Anti-entropy to correct packet loss if needed • Results in predictable end-to-end assurances

  26. The pbcast Algorithm (from [1]) • Basic Hierarchical Broadcast: [Figure: broadcast of m1] • Node 1: {1} • Node 2: {1} • Node 3: {1} • Node 4: {1}

  27. The pbcast Algorithm (from [1]) • Basic Hierarchical Broadcast: [Figure: broadcasts of m1 and m2] • Node 1: {1, 2} • Node 2: {1, 2} • Node 3: {1, 2} • Node 4: {1, 2}

  28. The pbcast Algorithm (from [1]) • Basic Hierarchical Broadcast: [Figure: broadcasts of m1, m2, and m3] • Node 1: {1, 2} • Node 2: {1, 2} • Node 3: {1, 2, 3} • Node 4: {1, 2}

  29. The pbcast Algorithm (from [1]) • Basic Hierarchical Broadcast: [Figure: broadcasts of m1, m2, m3, and m4] • Node 1: {1, 2, 4} • Node 2: {1, 2, 4} • Node 3: {1, 2, 3, 4} • Node 4: {1, 2, 4}

  30. The pbcast Algorithm (from [1]) • Basic Hierarchical Broadcast: [Figure: broadcasts of m1 through m5] • Node 1: {1, 2, 4} • Node 2: {1, 2, 4, 5} • Node 3: {1, 2, 3, 4, 5} • Node 4: {1, 2, 4, 5}

  31. The pbcast Algorithm (from [1]) • The anti-entropy protocol runs simultaneously with the broadcast messages • Protocol runs in rounds: • Runs at every process • Rounds longer than round-trip time • Paper suggests: ~100ms • … maybe a traffic-based metric would be better?

  32. The pbcast Algorithm (from [1]) • Anti-entropy round: [Figure: timeline of messages m1 through m5 across nodes] • Rounds need not be synchronized across nodes!

  33. The pbcast Algorithm (from [1]) • Anti-entropy round: [Figure: timeline of messages m1 through m5 across nodes] • For the sake of the example, we’ll assume the rounds happen at the same time across all nodes

  34. The pbcast Algorithm (from [1]) • Anti-entropy round: • Gossip Messages: • Each process chooses another random process and sends a summary of its recent messages

  35. The pbcast Algorithm (from [1]) • Gossip Messages: [Figure: timeline of messages m1 through m5 across nodes] • Node 1 → Node 3: {1} • Node 2 → Node 1: {1, 2} • Node 3 → Node 2: {1, 2} • Node 4 → Node 2: {1}

  36. The pbcast Algorithm (from [1]) • Gossip Messages: [Figure: timeline of messages m1 through m5 across nodes] • Node 1 → Node 4: {1, 2} • Node 2 → Node 1: {1, 2, 4} • Node 3 → Node 4: {1, 2, 3, 4} • Node 4 → Node 3: {1, 2}

  37. The pbcast Algorithm (from [1]) • Summary contains missed messages: [Figure: messages m1 through m5] • Node 1 → Node 4: {1, 2} • Node 2 → Node 1: {1, 2, 4} • Node 3 → Node 4: {1, 2, 3, 4} • Node 4 → Node 3: {1, 2}

  38. The pbcast Algorithm (from [1]) • Anti-entropy round: • Solicitation Messages: • Messages sent back to the sender of the gossip message (not necessarily the original source) requesting a resend of a given set of messages • Message Resend: • Upon reception of a solicitation message, the gossip sender resends the requested messages
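
The gossip, solicitation, and resend steps can be sketched as a single exchange between two processes. This is a simplified illustration, not the pbcast implementation from [1]: the `Process` class and its method names are hypothetical, messages are identified by plain sequence numbers, and the limits and optimizations discussed in the paper are left out.

```python
class Process:
    def __init__(self, name):
        self.name = name
        self.messages = {}  # sequence number -> payload

    def deliver(self, seq, payload):
        self.messages[seq] = payload

    def summary(self):
        """Gossip message: a digest of the messages this process holds."""
        return set(self.messages)

    def solicit(self, digest):
        """Solicitation: the advertised messages this process is missing."""
        return digest - set(self.messages)

    def resend(self, wanted):
        """Resend the solicited messages that are still held."""
        return {seq: self.messages[seq] for seq in wanted if seq in self.messages}


def gossip_exchange(sender, receiver):
    """One gossip -> solicitation -> resend exchange; returns what was solicited."""
    digest = sender.summary()
    wanted = receiver.solicit(digest)
    for seq, payload in sender.resend(wanted).items():
        receiver.deliver(seq, payload)
    return wanted
```

For example, if Node 3 holds {1, 2, 3, 4} and gossips to Node 4 holding {1, 2}, `gossip_exchange(node3, node4)` returns `{3, 4}` and Node 4 ends up with all four messages. Note that the resend comes from the gossip sender, which need not be the original source of the message.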

  39. The pbcast Algorithm (from [1]) • Summary contains missed messages: [Figure: Node 4 solicits “What was m3?”; Node 3 resends “This is m3…”] • Node 1 → Node 4: {1, 2} • Node 2 → Node 1: {1, 2, 4} • Node 3 → Node 4: {1, 2, 3, 4} • Node 4 → Node 3: {1, 2, 3}

  40. The pbcast Algorithm (from [1]) • Summary contains missed messages: [Figure: messages m1 through m5]

  41. The pbcast Algorithm (from [1]) • Anti-entropy Protocol: • [1] suggests a number of optimizations • Reduce the number of rounds required to gossip about messages • Reduce redundant messages • [1] also suggests a number of extensions: • Gossip about a message to a fraction of all nodes (ex: 100 in a 10,000 node system)

  42. The pbcast Algorithm (from [1]) • Key Results:

  43. The pbcast Algorithm (from [1])

  44. The pbcast Algorithm (from [1])

  45. The pbcast Algorithm (from [1])

  46. Epidemics • With [1] and [2], a good overview of key epidemic behavior and strategies has been established. • In [1] and [2], domain-specific optimizations were often applied.

  47. Ad-Hoc Routing Epidemics • [3] focuses on reachability of epidemics for wireless ad hoc routing • need to broadcast (multicast) to find routes • theoretical/abstract simulation analysis • 1 basic protocol, 4 extensions • Goal: find optimal configurations based on reachability vs. "load"

  48. Ad-Hoc Routing Epidemics • GOSSIP1(p, k) • k := The number of rounds of gossiping about the message with 100% probability • p := The probability of gossiping about the message after k rounds • Optimal: • 1000 nodes: (0.75, 4) to (0.65, 4) • Backpropagation Effects
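
The GOSSIP1(p, k) forwarding rule can be written as a single decision function. This is a minimal sketch: the function name and the `hops` argument (how far the message has travelled so far, counted in rounds or hops) are hypothetical, and only the roles of p and k come from the slide.

```python
import random


def gossip1_should_forward(hops, p, k, rng=random):
    """Decide whether a node rebroadcasts a message under GOSSIP1(p, k)."""
    if hops < k:
        # During the first k rounds the message is always forwarded.
        return True
    # Afterwards it is forwarded with probability p.
    return rng.random() < p
```

With the setting quoted above, a call such as `gossip1_should_forward(hops, 0.65, 4)` always forwards for the first four rounds, which keeps a message from dying out near its source.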

  49. Ad-Hoc Routing Epidemics • GOSSIP1(p, k)

  50. Ad-Hoc Routing Epidemics • GOSSIP2(p1, k, p2, n) • Just like GOSSIP1(p1, k), except that the neighbors of a node with fewer than n neighbors gossip with probability p2, as in GOSSIP1(p2, k) • Intuition: • Nodes with low degrees will have the hardest time receiving information • Optimal: • GOSSIP1(0.8, 4) performs like GOSSIP2(0.6, 4, 1, 6) but with 13% more messages
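
The GOSSIP2 rule can be sketched as a function that picks the forwarding probability for a node. The sketch reads the rule as described in [3]: a node gossips with probability p2 instead of p1 when one of its neighbors has fewer than n neighbors. The function name, the `hops` argument, and the adjacency-dictionary graph format are assumptions made for illustration.

```python
def gossip2_probability(node, hops, graph, p1, k, p2, n):
    """Forwarding probability for `node` under GOSSIP2(p1, k, p2, n).

    `graph` maps each node to a list of its neighbors.
    """
    if hops < k:
        # Same as GOSSIP1: always forward during the first k rounds.
        return 1.0
    if any(len(graph[neighbor]) < n for neighbor in graph[node]):
        # A poorly connected neighbor depends on this node, so use p2.
        return p2
    return p1
```

In a graph where node "a" has a single neighbor "b", calling `gossip2_probability("b", 5, graph, 0.6, 4, 1.0, 2)` returns `1.0`, so "b" always forwards and "a" is not cut off by an unlucky coin flip.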
