
Diffusing updates without false rumors

Presented by Alex Kogan.


Presentation Transcript


  1. Diffusing updates without false rumors Presented by Alex Kogan

  2. Byzantine environment • Uncorrupted hosts follow their protocol • Corrupted hosts can behave arbitrarily • Send conflicting updates • Fake forwarded messages • Conspire and form coalitions • Stop sending messages for a limited or unlimited time • A corruption may occur due to • hardware/software failure • virus/hacker attack

  3. First shot: signatures • Every message sent is signed • Prevents malicious changes to forwarded messages • Does not solve every problem • corrupted hosts can still send conflicting updates • Computationally expensive • key distribution • signing every message

  4. Outline • Motivation • Problem definition • Performance measures • Direct verification • Lower bounds • Random propagation • l-Tree propagation • Path verification • Direct diffusion • Youngest diffusion • Bundle sampling • Short-Path diffusion

  5. Diffusion problem • Synchronous fully-connected network • n hosts (or replicas, or nodes) • α correct hosts hold an initial update u • up to t corrupted hosts, t < α • In every round, a correct host h sends up to Fout messages • Goal: cause u to be accepted by all correct hosts • Safety: No correct host accepts an update other than u • Liveness: Every correct host eventually accepts some update, with probability 1

  6. Performance measures • Delay (diffusion time), D • Expected number of rounds until all correct hosts accept u • Fan-in, Fin • Expected maximum number of messages any correct host receives from correct hosts in any round • Amortized fan-in is defined analogously over multiple rounds

  7. General framework • When h receives u: • Accepts only once it has evidence of u’s veracity • what is the evidence? • if h accepts, it is called active for u • so all initially updated nodes are active • Forwards u to other hosts? • If done “carelessly”, may violate safety • Conservative approach: only active nodes forward updates • Liberal approach: forward an update without accepting it

  8. Direct verification (by Malkhi et al.) • h accepts u when it receives t+1 copies of u from different sources • Conservative approach • only active nodes forward updates • Partitions the diffusion into two phases • Phase 1: only initially updated hosts are active • slow! • Phase 2: other hosts become active and start forwarding updates • exponentially fast
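The acceptance rule on this slide can be sketched in a few lines of Python. This is only an illustration of "accept after t+1 copies from different sources"; the class name `DirectVerifier` and its methods are hypothetical and do not come from the presentation.

```python
class DirectVerifier:
    """Hypothetical host-side state for direct verification."""

    def __init__(self, t):
        self.t = t              # maximum number of corrupted hosts
        self.senders = {}       # update -> set of distinct source ids
        self.accepted = None    # the accepted update, once the host is active

    def receive(self, source, update):
        """Record one copy of `update`; return True once the host is active."""
        self.senders.setdefault(update, set()).add(source)
        # Repeated copies from one source do not add to the count.
        if self.accepted is None and len(self.senders[update]) >= self.t + 1:
            self.accepted = update
        return self.accepted is not None
```

With t = 2, two corrupted hosts that both send a fake update cannot activate the host, because at most t sources are corrupted; three distinct senders of u can.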

  9. Gossip partner selection • Two protocols: • Random propagation • l-Tree propagation • They trade off host load (fan-in) against diffusion time (delay)

  10. General results • For any direct verification algorithm A: • D is Ω(t log(n/α) / Fout) • D * Fin = Ω(tn/α), for t ≥ 2 log n • Fin: D-amortized fan-in • D * Fin: maximum number of incoming messages at h in D rounds • inherent tradeoff between the two measures • good delay must incur a high load, and vice versa • very discouraging • with crash-stop failures, epidemic diffusion achieves D * Fin = O(log n)

  11. D is Ω(t log(n/α) / Fout) Proof: • m_k: number of times u is sent by correct hosts in the first k rounds • α_k: number of hosts that accepted u by the end of the first k rounds • Then we have: • α_k ≤ α + m_k / (t + 1) • t+1 copies must reach a host before it accepts

  12. D is Ω(t log(n/α) / Fout) Proof: • m_k: number of times u is sent by correct hosts in the first k rounds • α_k: number of hosts that accepted u by the end of the first k rounds • Then we have: • α_k ≤ α + m_k / (t + 1) • m_{k+1} ≤ m_k + Fout·α_k ≤ Fout·Σ_{j≤k} α_j • every correct host sends at most Fout messages per round

  13. D is Ω(t log(n/α) / Fout) Proof: • m_k: number of times u is sent by correct hosts in the first k rounds • α_k: number of hosts that accepted u by the end of the first k rounds • Then we have: • α_k ≤ α + m_k / (t + 1) • m_{k+1} ≤ m_k + Fout·α_k ≤ Fout·Σ_{j≤k} α_j • Hence, α_k ≤ α(1 + Fout/(t+1))^k • Comparing with n, we get that when k < (t log(n/α)) / Fout, α_k < n ■
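To get a feel for the bound α_k ≤ α(1 + Fout/(t+1))^k, the following sketch counts how many rounds must pass before that upper bound can reach n. The function name `min_rounds` and the sample parameters are illustrative assumptions, not values from the presentation.

```python
def min_rounds(n, alpha, t, f_out):
    """Smallest k with alpha * (1 + f_out/(t+1))**k >= n.

    Before round k the upper bound on accepted hosts is still below n,
    so no direct verification algorithm can have finished earlier.
    """
    growth = 1 + f_out / (t + 1)
    k, bound = 0, float(alpha)
    while bound < n:
        bound *= growth
        k += 1
    return k
```

For example, with n = 1024, α = 16 and Fout = 8, raising t from 7 to 15 lowers the per-round growth factor from 2 to 1.5 and raises the minimum from 6 to 11 rounds, which matches the linear dependence on t in the lower bound.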

  14. D * Fin = Ω(tn/α) Proof sketch: • In D rounds, with Pr = 0.9, h receives fewer than 10·D·Fin messages • Markov’s inequality • I_u: random subset of hosts that are initially active (|I_u| = α) • X_h: number of hosts in I_u that send a message to h during D rounds • Show that if 10·D·Fin is too small, w.h.p. h does not become active, i.e., X_h ≤ t

  15. D * Fin = Ω(tn/α) (cont.) • When D ≤ nk/(20e·Fin), Pr(X_h ≥ k) ≤ 2(10e·D·Fin/(kn))^k • When t ≥ 2 log n, Pr(X_h ≥ t) = O(1/n^2) • For any host h, when D ≤ nt/(20e·Fin), Pr(X_h < t) ≥ 1 - O(1/n) • With Pr = 0.9, at least (nt/Fin) rounds are required

  16. Random propagation • At every round, active h selects Fout targets uniformly and randomly

  17. l-Tree propagation • Partition all hosts into blocks of size l ≥ 4t

  18. l-Tree propagation • Partition all hosts into blocks of size l ≥ 4t • Arrange blocks as nodes of a binary tree

  19. l-Tree propagation • Partition all hosts into blocks of size l ≥ 4t • Arrange blocks as nodes of a binary tree • At every round, active h selects Fout targets randomly out of a candidate set • l hosts at the root • l hosts at h’s node • 2l hosts in the children of h’s node • 4l hosts in total

  20. l-Tree propagation • Partition all hosts into blocks of size l ≥ 4t • Arrange blocks as nodes of a binary tree • At every round, active h selects Fout targets randomly out of a candidate set • l hosts at the root • l hosts at h’s node • 2l hosts in the children of h’s node • 4l hosts in total • In Random propagation, l = n
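A minimal sketch of how a candidate set could be computed follows. It assumes hosts are numbered 0..n-1, that block b holds hosts b·l to (b+1)·l - 1, and that blocks are laid out as a binary tree in heap order (block 0 is the root, and the children of block b are 2b+1 and 2b+2). The slides do not fix a numbering, so all of these choices, and the name `candidate_set`, are assumptions for illustration.

```python
def candidate_set(h, n, l):
    """Hosts that active host h may pick gossip targets from in an l-tree."""
    num_blocks = (n + l - 1) // l
    block = h // l
    # Root block, h's own block, and the two child blocks of h's block.
    blocks = {0, block, 2 * block + 1, 2 * block + 2}
    return sorted(
        x
        for b in blocks
        if b < num_blocks
        for x in range(b * l, min((b + 1) * l, n))
    )
```

An inner host gets the full 4l candidates, while root and leaf hosts get fewer because some of the four blocks coincide or do not exist. Setting l = n leaves a single block, which reduces to Random propagation over all hosts.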

  21. Complexity properties - Fan-in • Theorem: Fin is O(nFout / l + log n) • Proof: • r: a root host • Root hosts have the highest Fin • Case 1: 4t ≤ l ≤ nFout/(12 log n) • Pr(r gets a message from h) ≤ Fout / (2l) • Exp(r’s Fin) ≤ n*Fout / (2l) • Pr(r’s Fin > 2n*Fout / (2l)) ≤ … (Chernoff) • Pr(r’s Fin > 2n*Fout / (2l)) ≤ 1/n^2 • w.h.p., any root host gets at most 2n*Fout / (2l) messages

  22. Fin is O(nFout / l + log n) • Case 2: nFout/(12 log n) < l • Pr(r’s Fin ≥ k messages) ≤ … • when k = 2 * 18 log n, Pr(r’s Fin ≥ k messages) ≤ 1/n^2 • w.h.p., any root host gets at most O(log n) messages ■

  23. Fin is O(nFout / l + log n) (cont.) • Corollary: Fin in Random propagation is O(Fout + log n) • Theorem: the (log n)-amortized Fin in Random propagation is O(Fout) • Proof: • Pr(h gets ≥ k msgs in log n rounds) ≤ … • For k = 6·Fout·log n, this Pr ≤ 1/n^2 ■

  24. Complexity properties - Delay • Theorem: D is O((t/Fout)(l/α)^(1-1/(3t)) + log(l)/Fout + (t/Fout) log(n/l)) • Proof sketch: • Split the analysis into two stages: • Expected delay until all root hosts are active • Expected delay of propagating updates down the tree

  25. Activating root hosts • Split into phases • in each phase, the number of active hosts is doubled • Count the number of messages required • to get the number of rounds, divide by (number of active nodes * Fout) • Use coupon collector analysis • bounds the number of messages that must be received to collect t + 1 messages from distinct sources • This stage contributes most to the total delay • especially its first phase
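The coupon-collector step can be made concrete with a small calculation. The sketch below assumes a simplified model in which every message a host receives comes from a uniformly random one of the currently active sources; this simplification and the name `expected_messages` are assumptions made here, not part of the presented analysis.

```python
from fractions import Fraction


def expected_messages(active, t):
    """Expected number of uniform draws from `active` sources
    until t + 1 distinct sources have been seen."""
    if active < t + 1:
        raise ValueError("need at least t + 1 active sources")
    # Once i distinct sources are collected, a draw is new with
    # probability (active - i) / active.
    return sum(Fraction(active, active - i) for i in range(t + 1))
```

With t = 2, a host needs 5.5 messages in expectation when only 3 sources are active, but barely more than 3 when 1000 are active. This is one way to see why the first phase, where active hosts are scarce, dominates the delay.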

  26. Propagating updates down the tree • Bound Pr(s is active after O((t + log l)/Fout) rounds | p is active), for a tree node p and its child s • Each leaf node has log(n/l) nodes on the path to the root • Split into meta-rounds, each of O((t + log l)/Fout) rounds • in each meta-round, another node in the path becomes active • This stage contributes a logarithmic factor to the total delay

  27. Direct verification: summary of results • 4t-Tree: low delay, high fan-in • Random propagation: high delay, low fan-in

  28. Path verification (by Minsky and Schneider) • Liberal approach • allows forwarding u without accepting it • Track the path through which u is gossiped • nodes exchange proposals <u, path> • accept only after receiving t+1 proposals with the same u but disjoint paths

  29. Example • [Figure: overlapping paths vs. disjoint paths]

  30. Proposal management • Storing and exchanging all possible proposals is impractical • Two sub-protocols for managing proposals • selection protocol • chooses a single distinguished proposal at each host • sampling protocol • gathers selected proposals from a set of selected hosts • targets are selected uniformly at random

  31. General scheme • selection • sampling • if a satisfying subset of proposals is found, accept

  32. Direct diffusion • Selection: if h accepted u then d_h = (u, ø) else d_h = ⊥ • Sampling (given partner j): if d_j = (u, ø) then D_h = D_h ∪ {(u, [j])} • This is Random propagation! • Every proposal with a non-empty path is discarded

  33. Improving performance • Non-⊥ proposals selected by random nodes should contain disjoint paths • Such proposals should change quickly • this prevents “bad” distributions from persisting • Shorter paths tend to have fewer nodes in common • Try: select the proposal with the shortest path • many nodes may hold the same update with the same short path • Better: select a proposal based on its “age”

  34. Youngest selection • If h is a source: d_h = (u, ø), age_h = 0 • Otherwise: • Initially: d_h = ⊥, age_h = ∞ • Given partner j: • if (age_j ≤ age_h) then d_h = d_j::j • age_h = min(age_h, age_j) + 1
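One possible reading of the Youngest selection rule, as a Python sketch. Two points are interpretations rather than facts from the slide: the age update is applied on every exchange (not only when the proposal is replaced), and an empty proposal from the partner is never adopted. The class name `YoungestSelector` and the tuple encoding of paths are hypothetical.

```python
import math


class YoungestSelector:
    """Hypothetical per-host state for Youngest selection."""

    def __init__(self, update=None):
        self.is_source = update is not None
        # A source holds (u, empty path) at age 0; others start empty.
        self.proposal = (update, ()) if self.is_source else None
        self.age = 0 if self.is_source else math.inf

    def gossip(self, j, d_j, age_j):
        """Process partner j's current proposal d_j and its age."""
        if self.is_source:
            return  # a source keeps d_h = (u, empty path) and age 0
        if d_j is not None and age_j <= self.age:
            u, path = d_j
            self.proposal = (u, path + (j,))  # d_j::j, append j to the path
        # Assumed unconditional: the host ages even if it kept its proposal.
        self.age = min(self.age, age_j) + 1
```

A host that hears directly from source 1 ends up with path (1,) at age 1; a host that then hears from that host, numbered 2, gets path (1, 2) at age 2.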

  35. Simple sampling • Queue y_h holds the S most recent proposals • If h is not a source: • Initially: y_h is empty • Given partner j with d_j ≠ ⊥: • y_h.enqueue(d_j::j) • if (|y_h| > S) then y_h.dequeue() • Youngest diffusion = Youngest selection + Simple sampling

  36. More efficient sampling • Simple sampling obtains at most one proposal per round • Hosts may share their sampled proposals • sampling may obtain multiple proposals • Which proposals to keep? • based on path length? • remaining samples are not likely to change • based on proposal’s age? • corrupted nodes may displace many legitimate proposals • based on sample’s age!

  37. Bundle accumulation • Initially: bundle_h is empty • Given partner j: bundle_h = UpdateBundle(bundle_h ∪ bundle_j::j ∪ {(d_h, 0)}) • UpdateBundle(bundle) = {(d, sAge+1) | (d, sAge) ∈ bundle ∧ sAge < SA} • SA controls the age of the oldest sample
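The bundle update can be sketched with Python sets. Here a proposal is encoded as a tuple `(u, path)` and a sample as `(proposal, sAge)`; `bundle_j::j` is read as "append j to the path of every proposal in j's bundle". That reading, the encoding, and the helper name `accumulate` are assumptions for illustration.

```python
def update_bundle(bundle, sa):
    """Age every sample by one and drop samples whose age already reached sa."""
    return {(d, s_age + 1) for (d, s_age) in bundle if s_age < sa}


def accumulate(bundle_h, bundle_j, j, d_h, sa):
    """One accumulation step of host h after contacting partner j."""
    # bundle_j::j, assumed to mean: extend each sampled path with j.
    extended = {((u, path + (j,)), s_age) for ((u, path), s_age) in bundle_j}
    # h contributes its own selected proposal as a fresh sample, if it has one.
    own = {(d_h, 0)} if d_h is not None else set()
    return update_bundle(bundle_h | extended | own, sa)
```

With SA = 2, a sample picked up from a partner survives two accumulation steps and is dropped on the third, which is how SA bounds the age of the oldest sample.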

  38. Bundle accumulation (cont.) • Samples are collected from ≤ SA hosts • SA must be greater than t • otherwise, termination is impossible • Space complexity is exponential in t (2^t) • Solution: keep a queue of bundles • Requires O(t * 2^SA) space • but now SA can be arbitrarily small!

  39. Bundle sampling • Initially: y_h is empty • Given partner j: • y_h.enqueue(bundle_j::j) • if (|y_h| > S) then y_h.dequeue()

  40. Final protocol • Run: • Youngest selection • Bundle accumulation • Bundle sampling

  41. Local computation time • Example on a graph with vertices 1 to 4, where each p_i lists the edges incident to vertex i: p1: (u, {(1,2), (1,3)}), p2: (u, {(1,2), (2,3), (2,4)}), p3: (u, {(1,3), (2,3), (3,4)}), p4: (u, {(2,4), (3,4)}) • How do we find t+1 disjoint paths? • Independent set ⇔ disjoint proposals • NP-hard problem • practical only for small values of t and a small number of proposals
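A brute-force search makes the cost visible. The sketch below tries every (t+1)-subset of proposals for the same update and checks pairwise disjointness, so its running time grows combinatorially with the number of proposals. It is an illustrative exhaustive search, not the procedure used in the cited papers, and `find_disjoint` is a hypothetical name.

```python
from itertools import combinations


def find_disjoint(proposals, t):
    """Return t + 1 proposals for one update with pairwise disjoint paths,
    or None if no such subset exists.

    Each proposal is a pair (u, path) where path is a tuple of host ids.
    """
    by_update = {}
    for u, path in proposals:
        by_update.setdefault(u, []).append((u, path))
    for group in by_update.values():
        # Exhaustive search: C(len(group), t + 1) subsets in the worst case.
        for subset in combinations(group, t + 1):
            if all(set(a[1]).isdisjoint(b[1])
                   for a, b in combinations(subset, 2)):
                return list(subset)
    return None
```

For the proposals ("u", (1, 2)), ("u", (2, 3)), ("u", (4,)) and ("u", (5, 6)) with t = 2, the search skips the subsets containing both (1, 2) and (2, 3), which share host 2, and returns the other three-proposal subset that uses (1, 2).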

  42. Direct vs. Path verification • [Comparison table not preserved in the transcript] • * an algorithm with the same analytical bounds, using much larger messages, exists

  43. Short-Path diffusion (by Malkhi et al.) • Keep and send all proposals with path length < log(n/(t+1)) • similar to Direct diffusion, but with longer paths • Optimal analytical delay and delay*fan-in product: O(t + log n) • But message and storage size is O((n/t)^O(log(t + log n))) • non-exponential, but grows faster than any polynomial in n • finding disjoint paths is computationally expensive

  44. Delay analysis outline • Denote: b = t+1, b_k = b/2^k • gossip-circle C(p,d,r): set of correct hosts that received u originated at p over “good” paths of length up to d in r rounds • Initially active b_k hosts create disjoint low-diameter (up to 2 log n/(b·b_k)) gossip-circles of size n/(4b_k) in O(b + log n/(b·b_k)) rounds

  45. Delay analysis outline (cont.) • Initially active b_k hosts create disjoint low-diameter (up to 2 log n/(b·b_k)) gossip-circles of size n/(4b_k) in O(b + log n/(b·b_k)) rounds • Given such disjoint gossip-circles, it takes an expected 4b_k rounds for a correct h to receive u from b_k/2 disjoint gossip-circles • Coupon-collector (b_k/2 coupons out of b_k) • It takes O(b + log n) rounds to receive b disjoint-path copies of u • By induction on b_k, for k = 0, …, log b - 1

  46. Delay analysis outline (cont.) • For any constant c, (n-b)·(1-1/c) hosts are active in exp. O(b + log n) rounds • Markov’s inequality • Choose a particular value for c, e.g., c = 2 • Assuming b < n/60, (n-b)/2 > 2n/5 • If at least 2n/5 correct hosts are active, then within exp. O(b + log n) rounds all hosts become active • Chernoff bound • The expected delay is O(b + log n)

  47. References • “On diffusing updates in a Byzantine environment”, by D. Malkhi, Y. Mansour and M.K. Reiter, SRDS, 1999 • “Diffusion without false rumors: on propagating updates in a Byzantine environment”, by D. Malkhi, Y. Mansour and M.K. Reiter, Theoretical Computer Science, 2003 • “Tolerating malicious gossip”, by Y.M. Minsky and F.B. Schneider, Distributed Computing, 2003 • “Optimal Unconditional Information Diffusion”, by D. Malkhi, E. Pavlov and Y. Sella, SRDS, 2001

  48. Questions? Thank you!
