Epidemics • Presented by: Lucas Cook and Wade Fagen • CS 525, The University of Illinois (UIUC) • 6 February 2007
History • Two schools of multicast algorithms: • Proactive • Reactive • Existing algorithms/implementations: • SRM • IP Multicast (best-effort) • NNTP (gossip) • IRC (hierarchical multicast) • PSYC (multicast; more “web”-like than IRC)
Multicast Routing • “Reliable” multicast routing: • Ensure that a message sent from any node is received by all other nodes in the distributed system. • …but we don’t live in an “ideal” world.
Multicast Routing • Three general categories: • Algorithms that provide “strong” reliability properties • Ex: atomic multicasts (diagram: synchronized rounds #1, #2, #3, …)
Atomic Multicast • Nodes only process messages at the beginning of rounds • A new round doesn’t start until all of the previous round’s messages are received (diagram: rounds #1, #2, #3, …) • A minimal sketch of this round structure follows below
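A minimal sketch of this round-based structure, in Python; the Network object with broadcast() and gather() calls is a hypothetical stand-in, not something described in the slides.

```python
# Sketch of "strong" (round-based) multicast: a node delivers a round's
# messages only after every node's messages for that round have arrived.
class RoundBasedNode:
    def __init__(self, node_id, network):
        self.node_id = node_id
        self.network = network   # assumed to expose broadcast() and gather()
        self.round = 0

    def run_round(self, message=None):
        self.round += 1
        if message is not None:
            # Send this round's message to every other node.
            self.network.broadcast(self.round, self.node_id, message)
        # Block until this round's messages from all nodes have arrived;
        # round r+1 cannot begin until round r has completed everywhere,
        # which is where the "strong" guarantee's overhead comes from.
        for sender, msg in sorted(self.network.gather(self.round)):
            self.deliver(msg)

    def deliver(self, msg):
        print(f"node {self.node_id} delivers {msg!r} in round {self.round}")
```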
Multicast Routing • Three general categories: • “strong” reliability algorithms • “best-effort” reliability algorithms • Ex: MUSE algorithm • Provide no end-to-end reliability assurance • Problems and solutions exist both at the physical network layer and in the overlay • Focus of distributed systems: the overlay
“Best Effort” Multicast Routing • Many algorithms implement some neighbor-based approach…
“Best Effort” Multicast Routing • End-to-end assurance may be lost through a single node’s failure.
Multicast Routing • Three general categories: • “strong” reliability algorithms • “best-effort” reliability algorithms • “proactively probabilistic” multicast algorithms • Provide predictable reliability • Goal: Achieve better reliability than “best-effort” without the overhead of “strong” reliability • Method: Epidemics
Epidemic Algorithms • Epidemics provide probabilistic end-to-end reliability with a bimodal guarantee: a message reaches either “almost all” or “almost none” of the nodes • Tradeoff between scale and reliability: epidemics allow expansive scale with near-perfect reliability
Epidemic Algorithms • To be less verbose, the following citations are used throughout the presentation: • [1]: K. Birman et al., “Bimodal multicast,” ACM TOCS, 1999 • [2]: A. Demers et al., “Epidemic algorithms for replicated database maintenance,” PODC, 1987 • [3]: Z. Haas et al., “Gossip-based ad hoc routing,” IEEE INFOCOM, 2002
Epidemic Algorithms in Databases • Site updating has been a key problem since the beginning of distributed database work: • Data is injected at one site • Data needs to be updated at every site (diagram: an incoming transaction arrives at a single site)
Epidemic Algorithms in Databases • Classic examples: • NNTP • Early e-mail servers • etc.
Epidemic Algorithms in Databases • Three core mechanisms: • Direct mail • Bottleneck at the sender! • Anti-entropy • May require a full database comparison (slow!) • Rumor mongering • Propagates only recent updates!
Epidemic Algorithms in Databases • Three states of a message at a node: • Susceptible: Message not yet received at the node • Infective: Message is actively propagated by the node • Removed: Message is no longer actively propagated by the node
Epidemic Algorithms in Databases • [2] settles on a two-phase algorithm: • Phase 1: Rumor mongering • Probabilistic spread of messages to (hopefully) nearly all nodes • Considerations between push/pull models
Epidemic Algorithms in Databases • [2] settles on a two-phase algorithm: • Phase 1: Rumor mongering (sketch below) • Phase 2: Epidemic (anti-entropy) • Run periodically in the background • Run at each node
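A minimal sketch of the Phase 1 (rumor mongering) step at a single node, in Python, using the susceptible/infective/removed states from the earlier slide; the peer interface and the loss-of-interest probability are illustrative assumptions rather than the exact mechanism of [2].

```python
import random

LOSE_INTEREST_PROBABILITY = 0.25   # illustrative: chance of going "removed" after a wasted push

def rumor_step(state, rumor, peers):
    """Returns the node's next state for this rumor after one gossip step."""
    if state != "infective":
        return state                       # only infective nodes spread the rumor
    peer = random.choice(peers)
    already_knew = peer.receive(rumor)     # push the rumor to one random peer (hypothetical interface)
    if already_knew and random.random() < LOSE_INTEREST_PROBABILITY:
        return "removed"                   # lose interest: stop gossiping this rumor
    return "infective"
```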
Epidemic Push/Pull • Generic epidemic message • An epidemic message contains a summary of recent events • Two types: “push” and “pull” • Distinguishing the two types makes the spread easier to model mathematically
Epidemic Push/Pull • A “push” is a message sent from an infected site to a susceptible site.
Epidemic Push/Pull • A “pull” is a message sent from a susceptible site to an infected site.
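A minimal sketch of one push/pull anti-entropy exchange between two sites; each site's database is assumed to be a dict of key -> (value, timestamp) pairs, and resolving conflicts by newest timestamp is an assumption made for illustration.

```python
def anti_entropy_exchange(local_db, remote_db):
    # Compare the two replicas item by item and ship the newer copy each way.
    for key in set(local_db) | set(remote_db):
        local, remote = local_db.get(key), remote_db.get(key)
        if remote is None or (local is not None and local[1] > remote[1]):
            remote_db[key] = local        # "push": remote is missing this item or has an older copy
        elif local is None or remote[1] > local[1]:
            local_db[key] = remote        # "pull": we are missing this item or have an older copy

# Example: after the exchange, both sites hold x=("v2", 12) and y=("w", 5).
site_a = {"x": ("v1", 10)}
site_b = {"x": ("v2", 12), "y": ("w", 5)}
anti_entropy_exchange(site_a, site_b)
```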
Epidemic Algorithms in Databases • With the general idea in place, the specifics of [2] relate to databases: • Two primary distributed operations • “Data insertion”: INSERT, UPDATE • “Data deletion”: DELETE • Epidemic messages for DELETE are augmented with a “death certificate” (sketch below) • In [2], SELECT is simply done locally at each distributed endpoint of the database
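A minimal sketch of how a DELETE can travel as an epidemic update via a death certificate (tombstone); the retention window and helper names are illustrative, not values taken from [2].

```python
import time

CERT_RETENTION_SECONDS = 30 * 24 * 3600   # illustrative retention window for tombstones

def apply_delete(store, death_certificates, key):
    store.pop(key, None)
    # Record a death certificate so the deletion itself can be gossiped;
    # without it, anti-entropy against a stale replica would resurrect the item.
    death_certificates[key] = time.time()

def expire_death_certificates(death_certificates):
    # Certificates are garbage-collected once they are old enough to have spread.
    cutoff = time.time() - CERT_RETENTION_SECONDS
    for key in [k for k, t in death_certificates.items() if t < cutoff]:
        del death_certificates[key]
```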
Epidemic Algorithms in Databases • Results published in [2] • Key result: replacing deterministic algorithms for database consistency with epidemics • Actual results: simulation-based evaluation only • Reported internal results only • Traditional schemes were not simulated for an accurate comparison
Bimodal Multicast • [1] presents a bimodal multicast algorithm called pbcast • pbcast := “probabilistic broadcast”
The pbcast Algorithm (from [1]) • Six Properties: • Atomicity (probabilistically) • Throughput Stability • Ordering (FIFO) • Multicast Stability • Detection of Lost Messages • Scalability • [Acceptability of soft failures]
The pbcast Algorithm (from [1]) • Two sub-protocols: • Part 1: Hierarchical broadcast • Unreliable, “best-effort” approach • Part 2: Anti-entropy to correct packet loss if needed • Results in predictable end-to-end assurances
The pbcast Algorithm (from [1]) • Basic hierarchical broadcast (diagram sequence; delivery is best-effort):
• After m1: Node 1: {1}, Node 2: {1}, Node 3: {1}, Node 4: {1}
• After m2: Node 1: {1, 2}, Node 2: {1, 2}, Node 3: {1, 2}, Node 4: {1, 2}
• After m3 (received only at Node 3): Node 1: {1, 2}, Node 2: {1, 2}, Node 3: {1, 2, 3}, Node 4: {1, 2}
• After m4: Node 1: {1, 2, 4}, Node 2: {1, 2, 4}, Node 3: {1, 2, 3, 4}, Node 4: {1, 2, 4}
• After m5 (missed by Node 1): Node 1: {1, 2, 4}, Node 2: {1, 2, 4, 5}, Node 3: {1, 2, 3, 4, 5}, Node 4: {1, 2, 4, 5}
The pbcast Algorithm (from [1]) • The anti-entropy protocol runs concurrently with the broadcast messages • Protocol runs in rounds: • Run at every process • Rounds are longer than the round-trip time • Paper suggests: ~100 ms • … maybe a traffic-based metric would be better?
The pbcast Algorithm (from [1]) • Anti-entropy round (diagram): Rounds need not be synchronized across nodes!
The pbcast Algorithm (from [1]) • Anti-entropy round (diagram): For the sake of the example, we’ll assume rounds happen to occur at the same time across all nodes
The pbcast Algorithm (from [1]) • Anti-entropy round: • Gossip Messages: • Each process chooses another random process and sends a summary of its recent messages
The pbcast Algorithm (from [1]) • Gossip messages (diagram): Node 1 → Node 3: {1}; Node 2 → Node 1: {1, 2}; Node 3 → Node 2: {1, 2}; Node 4 → Node 2: {1}
The pbcast Algorithm (from [1]) • Gossip messages (diagram): Node 1 → Node 4: {1, 2}; Node 2 → Node 1: {1, 2, 4}; Node 3 → Node 4: {1, 2, 3, 4}; Node 4 → Node 3: {1, 2}
The pbcast Algorithm (from [1]) • Summary contains missed messages (diagram): Node 1 → Node 4: {1, 2}; Node 2 → Node 1: {1, 2, 4}; Node 3 → Node 4: {1, 2, 3, 4}; Node 4 → Node 3: {1, 2}
The pbcast Algorithm (from [1]) • Anti-entropy round: • Solicitation messages: • Sent back to the sender of the gossip message (not necessarily the original source), requesting a resend of a given set of messages • Message resend: • Upon receiving a solicitation message, the gossiper resends the requested messages (sketch of the full round below)
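A minimal sketch of the gossip, solicitation, and retransmission steps of one anti-entropy round, in the spirit of [1]; the Process class and its method names are illustrative, not the paper's interfaces.

```python
import random

class Process:
    def __init__(self, name):
        self.name = name
        self.buffer = {}                      # message id -> message body (recent messages)

    def anti_entropy_round(self, processes):
        # Gossip: send a digest (summary) of recent messages to one random peer.
        peer = random.choice([p for p in processes if p is not self])
        peer.on_gossip(self, digest=set(self.buffer))

    def on_gossip(self, sender, digest):
        # Solicitation: ask the gossiper (not necessarily the original source)
        # to retransmit anything listed in the digest that we are missing.
        missing = digest - set(self.buffer)
        if missing:
            sender.on_solicitation(self, missing)

    def on_solicitation(self, requester, wanted):
        # Retransmission: resend the requested messages out of our own buffer.
        for msg_id in wanted:
            requester.buffer[msg_id] = self.buffer[msg_id]
```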
The pbcast Algorithm (from [1]) • Summary contains missed messages (diagram): Node 4, missing m3, asks Node 3 “What was m3?” and receives “This is m3…”; gossip summaries: Node 1 → Node 4: {1, 2}; Node 2 → Node 1: {1, 2, 4}; Node 3 → Node 4: {1, 2, 3, 4}; Node 4 → Node 3: {1, 2, 3}
The pbcast Algorithm (from [1]) • Anti-entropy protocol: • [1] suggests a number of optimizations • Reduce the number of rounds required to gossip about messages • Reduce redundant messages • [1] also suggests a number of extensions: • Gossip about a message to only a fraction of all nodes (ex: 100 in a 10,000-node system)
The pbcast Algorithm (from [1]) • Key Results:
Epidemics • With [1] and [2], a good overview of key epidemic behaviors and strategies has been established. • In [1] and [2], domain-specific optimizations were often applied.
Ad-Hoc Routing Epidemics • [3] focuses on the reachability of epidemics for wireless ad hoc routing • Need to broadcast (multicast) to find routes • Theoretical/abstract simulation analysis • One basic protocol, four extensions • Goal: find optimal configurations based on reachability vs. “load”
Ad-Hoc Routing Epidemics • GOSSIP1(p, k) • k := the number of initial rounds during which the message is gossiped with 100% probability • p := the probability of gossiping about the message after those k rounds • Optimal: • 1000 nodes: (0.75, 4) to (0.65, 4) • Backpropagation effects • (sketch of the forwarding rule below)
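A minimal sketch of the GOSSIP1(p, k) forwarding decision as described on this slide; the hop/round counter carried with the message and the function name are assumptions for illustration.

```python
import random

def gossip1_should_forward(p, k, hops, already_gossiped):
    if already_gossiped:
        return False             # a node gossips about a given message at most once
    if hops < k:
        return True              # always forward during the first k rounds
    return random.random() < p   # afterwards, forward only with probability p

# Example with the ~1000-node optimum reported in [3]:
forward = gossip1_should_forward(p=0.65, k=4, hops=7, already_gossiped=False)
```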
Ad-Hoc Routing Epidemics • GOSSIP1(p, k)
Ad-Hoc Routing Epidemics • GOSSIP2(p1, k, p2, n) • Just like GOSSIP1(p1, k), unless a node has fewer than n neighbors, in which case it behaves like GOSSIP1(p2, k) • Intuition: • Nodes with low degree will have the hardest time receiving information • Optimal: • GOSSIP1(0.8, 4) performs like GOSSIP2(0.6, 4, 1, 6) but with 13% more messages • (sketch below)
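A minimal sketch of the GOSSIP2(p1, k, p2, n) decision, mirroring the GOSSIP1 sketch above; the "fewer than n neighbors" reading and the function name are assumptions based on the slide's intuition.

```python
import random

def gossip2_should_forward(p1, k, p2, n, hops, neighbor_count, already_gossiped):
    if already_gossiped:
        return False
    if hops < k:
        return True                        # first k rounds: always forward
    p = p2 if neighbor_count < n else p1   # boost sparsely connected (low-degree) nodes
    return random.random() < p

# Example with the parameters compared in [3]: GOSSIP2(0.6, 4, 1, 6) matches
# GOSSIP1(0.8, 4) reachability while sending fewer messages.
forward = gossip2_should_forward(p1=0.6, k=4, p2=1.0, n=6,
                                 hops=5, neighbor_count=3, already_gossiped=False)
```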