Diffusing updates without false rumors Presented by Alex Kogan
Byzantine environment • Uncorrupted hosts follow their protocol • Corrupted hosts can behave arbitrarily • Send conflicting updates • Fake forwarded messages • Conspire and form coalitions • Stop sending messages for a limited or unlimited time • A corruption may occur due to • hardware/software failure • virus/hacker attack
First shot: signatures • Every message sent is signed • Prevents malicious changes in forwarded messages • Does not solve all problems • corrupted hosts can still send conflicting updates • Computationally expensive • key distribution • signing every message
Outline • Motivation • Problem definition • Performance measures • Direct verification • Lower bounds • Random propagation • l-Tree propagation • Path verification • Direct diffusion • Youngest diffusion • Bundle sampling • Short-Path diffusion
Diffusion problem • Synchronous fully-connected network • n hosts (or replicas, or nodes) • α correct hosts hold an initial update u • up to t corrupted hosts, t < α • In every round, a correct host h sends up to Fout messages • Goal: cause u to be accepted by all correct hosts • Safety: No correct host accepts an update other than u • Liveness: Every correct host eventually accepts some update, with probability 1
Performance measures • Delay (diffusion time), D • Expected number of rounds until all correct hosts accept u • Fan-in, Fin • Expected max number of messages any correct host receives in any round from correct hosts • Amortized fan-in is defined analogously over multiple rounds
General framework • When h receives u: • Accepts only when it has evidence of u’s veracity • what is the evidence? • if h accepts, it is called active for u • so all initially updated nodes are active • Forwards u to other hosts? • If done “carelessly”, may violate safety • Conservative approach: only active nodes forward updates • Liberal approach: forward an update without accepting it
Direct verification (by Malkhi et al.) • h accepts u when it receives t+1 copies of u from different sources • Conservative approach • only active nodes forward updates • Partitions the diffusion into two phases • first, only initially updated hosts are active • slow! • then, other hosts become active and start forwarding updates • exponentially fast
Gossip partners selection • Two protocols: • Random propagation • l-Tree propagation • Trade-off host load (fan-in) and diffusion time (delay)
General results For any direct verification algorithm A: • D is Ω(t log(n/α) / Fout) • D * Fin = Ω(tn/α), for t ≥ 2 log n • Fin: the D-amortized fan-in • D * Fin: max # incoming messages at h in D rounds • inherent trade-off between the two measures • good delay must incur a high load, and vice versa • very discouraging • with crash-stop failures, epidemic diffusion achieves D * Fin = O(log n)
D is Ω(t log(n/α) / Fout) Proof: • m_k: # times u is sent by correct hosts in the first k rounds • α_k: # hosts that have accepted u by the end of the first k rounds • Then we have: • α_k ≤ α + m_k / (t + 1) • t+1 copies must reach a host before it accepts • m_{k+1} ≤ m_k + Fout α_k ≤ Fout ∑_{j=0..k} α_j • every correct host sends at most Fout messages per round • Hence, α_k ≤ α(1 + Fout/(t+1))^k • Comparing with n, we get that when k < (t log(n/α)) / Fout, α_k < n ■
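The growth bound in this proof can be checked numerically. The sketch below is only an illustration: the function names and the sample parameters are made up, and it uses the natural logarithm. It iterates the envelope α(1 + Fout/(t+1))^k until it reaches n and compares the resulting round count with t log(n/α) / Fout.

```python
import math


def min_rounds_allowed_by_envelope(n, alpha, t, f_out):
    """Smallest k with alpha * (1 + f_out/(t+1))**k >= n.

    No direct verification run can activate all n hosts in fewer rounds,
    because the number of active hosts never exceeds this envelope.
    """
    growth = 1 + f_out / (t + 1)
    bound = float(alpha)
    k = 0
    while bound < n:
        bound *= growth
        k += 1
    return k


def closed_form_lower_bound(n, alpha, t, f_out):
    """The expression t * ln(n/alpha) / f_out from the slide."""
    return t * math.log(n / alpha) / f_out


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    print(min_rounds_allowed_by_envelope(1024, 16, 7, 1))
    print(closed_form_lower_bound(1024, 16, 7, 1))
```

Because ln(1 + x) ≤ x, the iterated round count is never smaller than the closed-form expression, which is what the last step of the proof uses.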
D * Fin = Ω(tn/α) Proof sketch: • In D rounds, with Pr = 0.9, h receives fewer than 10DFin messages • Markov’s inequality • I_u: random subset of hosts that are initially active (|I_u| = α) • X_h: number of hosts in I_u that send a message to h during D rounds • Show that if 10DFin is too small, then w.h.p. h does not become active, i.e., X_h ≤ t
D * Fin = Ω(tn/α) (cont.) • When D ≤ nk/(20e Fin), Pr(X_h ≥ k) ≤ 2(10e D Fin/(kn))^k • When t ≥ 2 log n, Pr(X_h ≥ t) < O(1/n^2) • For any host h, when D ≤ nt/(20e Fin), Pr(X_h < t) ≥ 1 - O(1/n) • With Pr = 0.9, at least (nt/Fin) rounds are required
Random propagation • At every round, active h selects Fout targets uniformly and randomly
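A toy simulation can make the combination of Random propagation and direct verification concrete. The sketch below rests on simplifying assumptions that are not in the slides: all n simulated hosts are correct (corrupted hosts are treated as silent and left out), targets are drawn without replacement and never include the sender, and the function name and parameters are hypothetical.

```python
import random


def simulate_random_propagation(n, alpha, t, f_out, seed=0, max_rounds=100_000):
    """Return the number of synchronous rounds until every host is active.

    Hosts 0..alpha-1 start active. An inactive host becomes active once it
    has received the update from at least t+1 distinct active sources.
    """
    rng = random.Random(seed)
    active = set(range(alpha))
    sources = [set() for _ in range(n)]  # distinct senders seen by each host
    rounds = 0
    while len(active) < n and rounds < max_rounds:
        rounds += 1
        # Only active hosts forward (conservative approach).
        for sender in sorted(active):
            others = [h for h in range(n) if h != sender]
            for target in rng.sample(others, f_out):
                sources[target].add(sender)
        # Acceptance is evaluated at the end of the round.
        for h in range(n):
            if h not in active and len(sources[h]) >= t + 1:
                active.add(h)
    return rounds


if __name__ == "__main__":
    print(simulate_random_propagation(n=60, alpha=6, t=2, f_out=2, seed=1))
```

Running it with a small α relative to t shows the slow first phase described earlier: until new hosts collect t+1 distinct sources, only the initially active hosts are sending.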
l-Tree propagation • Partition all hosts into blocks of size l ≥ 4t • Arrange blocks as nodes of a binary tree • At every round, active h selects Fout targets randomly out of a candidate set • l hosts at the root • l hosts at h’s node • 2l hosts in the children of h’s node • 4l hosts in total • In Random propagation, l = n
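The candidate set can be sketched in a few lines. The layout below is an assumption made for illustration only: hosts are numbered 0..n-1, block b holds hosts b*l .. (b+1)*l - 1, and the blocks form a binary tree in heap order (children of block b are blocks 2b+1 and 2b+2). The slides do not fix any of these details.

```python
def candidate_set(h, n, l):
    """Hosts that host h may pick as gossip targets in l-Tree propagation.

    Union of the root block, h's own block, and the children of h's block.
    """
    num_blocks = (n + l - 1) // l

    def block_hosts(b):
        return range(b * l, min((b + 1) * l, n))

    own = h // l
    blocks = {0, own}  # root block and h's own block
    for child in (2 * own + 1, 2 * own + 2):
        if child < num_blocks:
            blocks.add(child)
    return sorted(x for b in blocks for x in block_hosts(b))


if __name__ == "__main__":
    # 28 hosts in 7 blocks of size 4; host 5 lives in block 1.
    print(candidate_set(5, 28, 4))
```

With l = n there is a single block, so the candidate set is the whole network, which matches the remark that Random propagation is the l = n case.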
Complexity properties - Fan-in Theorem: Fin is O(nFout / l + log n) Proof: • r: a root host • Root hosts have the highest Fin • Case 1: 4t ≤ l ≤ nFout/(12 log n) • Pr(r gets a message from h) ≤ Fout / (2l) • Exp(r’s Fin) ≤ n*Fout / (2l) • By a Chernoff bound, Pr(r’s Fin > 2n*Fout / (2l)) ≤ 1/n^2 • w.h.p., any root host gets at most 2n*Fout / (2l) messages
Fin is O(nFout / l + log n) (cont.) • Case 2: nFout/(12 log n) < l • Bound Pr(r’s Fin ≥ k messages) • when k = 2 * 18 log n, Pr(r’s Fin ≥ k messages) ≤ 1/n^2 • w.h.p., any root host gets at most O(log n) messages ■
Fin is O(nFout / l + log n) (cont.) Corollary: Fin in Random propagation is O(Fout + log n) Theorem: the (log n)-amortized Fin in Random propagation is O(Fout) Proof: • Bound Pr(h gets ≥ k msgs in log n rounds) • For k = 6 Fout log n, this Pr ≤ 1/n^2 ■
Complexity properties - Delay Theorem: D is O((t/Fout)(l/α)^(1-1/(3t)) + log(l)/Fout + (t/Fout) log(n/l)) Proof sketch: • Split the analysis into two stages: • Expected delay until all root hosts are active • Expected delay of propagating updates down the tree
Activating root hosts • Split into phases • in each phase, the number of active hosts doubles • Count # messages required • to get # rounds, divide by (# active nodes * Fout) • Use coupon-collector analysis • bounds # messages that must be received to collect t + 1 messages from distinct sources • This stage contributes most to the total delay • especially its first phase
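The coupon-collector step can be illustrated with the standard partial coupon-collector formula. The sketch assumes, as a simplification not stated in the slides, that every received copy comes from one of α sources chosen uniformly and independently; the function name is hypothetical.

```python
from fractions import Fraction


def expected_copies_for_distinct_sources(alpha, t):
    """Expected number of received copies until t+1 distinct sources are seen.

    Under the uniform assumption, once i distinct sources have been seen the
    next new source arrives after alpha / (alpha - i) copies on average.
    """
    if t + 1 > alpha:
        raise ValueError("need at least t+1 sources")
    return sum(Fraction(alpha, alpha - i) for i in range(t + 1))


if __name__ == "__main__":
    print(expected_copies_for_distinct_sources(4, 3))  # all 4 of 4 sources
    print(float(expected_copies_for_distinct_sources(100, 9)))
```

When α is close to t+1 the last few distinct sources dominate the sum, which is one way to see why the first phase, with the fewest active hosts, is the expensive one.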
Propagating updates down the tree • Bound Pr(s is active after O((t + log l)/Fout) rounds | p is active) • Each leaf node has log(n/l) nodes on the path to the root • Split into meta-rounds, each of O((t + log l)/Fout) rounds • in each meta-round, another node on the path becomes active • This stage contributes a logarithmic factor to the total delay
Direct verification: summary of results • 4t-Tree: low delay, high fan-in • Random propagation: high delay, low fan-in
Path verification (by Minsky and Schneider) • Liberal approach • allows forwarding u without accepting it • Track the path through which u is gossiped • nodes exchange proposals <u, path> • accept only after receiving t+1 proposals with the same u but disjoint paths
Example (figure) • Overlapping paths • Disjoint paths
Proposals management • Storing and exchanging all possible proposals is impractical • Two sub-protocols for managing proposals • selection protocol • chooses a single distinguished proposal at each host • sampling protocol • gathers selected proposals from a set of selected hosts • targets are selected uniformly at random
General scheme • selection • sampling • if a satisfying subset is found, accept
Direct diffusion
Selection: if h accepted u then d_h = (u, ø) else d_h = ⊥
Sampling (given partner j): if d_j = (u, ø) then D_h = D_h ∪ {(u, [j])}
• Random propagation!
• Every proposal with a non-empty path is discarded
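The two rules of Direct diffusion are small enough to write out. In this sketch the representation is an assumption: a proposal is a pair (update, path) with the path as a tuple of host ids, the empty proposal is `None`, and `can_accept` applies the t+1 rule to the collected set D_h.

```python
BOTTOM = None  # stands for the empty proposal


def select(accepted, u):
    """Selection: an active host offers (u, empty path), others offer nothing."""
    return (u, ()) if accepted else BOTTOM


def sample(D_h, d_j, j):
    """Sampling: keep partner j's proposal only if its path is empty."""
    if d_j is not BOTTOM and d_j[1] == ():
        D_h.add((d_j[0], (j,)))
    return D_h


def can_accept(D_h, u, t):
    """True once u was received directly from at least t+1 distinct hosts."""
    return len({path for (v, path) in D_h if v == u}) >= t + 1


if __name__ == "__main__":
    D = set()
    sample(D, select(True, "u"), 3)
    sample(D, ("u", (1,)), 4)  # non-empty path: discarded
    print(D, can_accept(D, "u", 0))
```

Since every stored path has length one, two stored proposals are disjoint exactly when they came from different hosts, so the acceptance rule coincides with direct verification.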
Improving performance • Non-⊥ proposals selected by random nodes should contain disjoint paths • Such proposals should change quickly • so that “bad” distributions do not persist • Shorter paths tend to have fewer common nodes • Try: select the proposal with the shortest path • many nodes may hold the same update with the same short path • Better: select a proposal based on its “age”
Youngest selection
If h is a source: d_h = (u, ø), age_h = 0
Otherwise:
Initially: d_h = ⊥, age_h = ∞
Given partner j:
if (age_j ≤ age_h) then d_h = d_j::j
age_h = min(age_h, age_j) + 1
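Youngest selection can be sketched as a small class. Several details here are assumptions for illustration: proposals are (update, path) pairs with tuple paths, the empty proposal is `None`, an unset age is infinity, the age update runs on every exchange (the `min` in the rule suggests this), and an empty partner proposal is never copied.

```python
import math

BOTTOM = None  # stands for the empty proposal


class YoungestSelector:
    def __init__(self, host_id, update=None):
        self.host_id = host_id
        self.is_source = update is not None
        # Sources hold (u, empty path) with age 0; others start empty.
        self.proposal = (update, ()) if self.is_source else BOTTOM
        self.age = 0 if self.is_source else math.inf

    def gossip_with(self, partner):
        """Apply the selection rule after contacting `partner`."""
        if self.is_source:
            return
        # Adopt the partner's proposal if it is at least as young as ours,
        # appending the partner's id to the path (d_j::j).
        if partner.age <= self.age and partner.proposal is not BOTTOM:
            u, path = partner.proposal
            self.proposal = (u, path + (partner.host_id,))
        self.age = min(self.age, partner.age) + 1


if __name__ == "__main__":
    s, a, b = YoungestSelector(1, "u"), YoungestSelector(2), YoungestSelector(3)
    a.gossip_with(s)
    b.gossip_with(a)
    print(a.proposal, a.age, b.proposal, b.age)
```

Because ages grow by one on every exchange, a held proposal is soon displaced by a younger one, which is the quick turnover the previous slide asks for.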
Simple sampling • Queue y_h holds the S most recent proposals
If h is not a source:
Initially: y_h is empty
Given partner j with d_j:
y_h.enqueue(d_j::j)
if ( |y_h| > S ) then y_h.dequeue()
Youngest diffusion = Youngest selection + Simple sampling
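The bounded queue used by Simple sampling maps directly onto a double-ended queue. In this sketch proposals are (update, path) pairs with tuple paths, and skipping an empty (`None`) partner proposal is an added assumption.

```python
from collections import deque


class SimpleSampler:
    def __init__(self, S):
        self.S = S            # maximum number of proposals kept
        self.queue = deque()  # y_h, oldest proposal on the left

    def sample(self, d_j, j):
        """Record partner j's proposal, extended with j (d_j::j)."""
        if d_j is None:
            return
        u, path = d_j
        self.queue.append((u, path + (j,)))
        if len(self.queue) > self.S:
            self.queue.popleft()  # drop the oldest sample


if __name__ == "__main__":
    y = SimpleSampler(S=2)
    y.sample(("u", ()), 1)
    y.sample(("u", (1,)), 2)
    y.sample(("u", (4,)), 5)
    print(list(y.queue))
```

`deque(maxlen=S)` would give the same eviction behavior implicitly; the explicit check is kept here to mirror the enqueue and dequeue steps on the slide.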
More efficient sampling • Simple sampling obtains at most one proposal per round • Hosts may share their sampled proposals • sampling may obtain multiple proposals • Which proposals to keep? • based on path length? • remaining samples are not likely to change • based on proposal’s age? • corrupted nodes may displace many legitimate proposals • based on sample’s age!
Bundle accumulation
Initially: bundle_h is empty
Given partner j: bundle_h = UpdateBundle(bundle_h ∪ bundle_j::j ∪ {(d_h, 0)})
UpdateBundle(bundle) = {(d, sAge+1) | (d, sAge) ∈ bundle ∧ sAge < SA}
• SA controls the age of the oldest sample
Bundle accumulation (cont.) • Samples are collected from ≤ SA hosts • SA must be greater than t • otherwise, termination is impossible • Space complexity is exponential in t (2^t) • Solution: keep a queue of bundles • Requires O(t * 2^SA) space • but now SA can be arbitrarily small!
Bundle sampling
Initially: y_h is empty
Given partner j:
y_h.enqueue(bundle_j::j)
if ( |y_h| > S ) then y_h.dequeue()
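The bundle update can be written with set comprehensions. The sketch assumes a bundle is a set of (proposal, sample_age) pairs, a proposal is an (update, path) pair with a tuple path, `bundle_j::j` appends j to every path in the partner's bundle, and an empty own proposal (`None`) is not added. The names `extend`, `update_bundle`, `accumulate` and `sa` are illustrative.

```python
def extend(bundle_j, j):
    """bundle_j::j, i.e. append partner j to the path of every sample."""
    return {((u, path + (j,)), s_age) for ((u, path), s_age) in bundle_j}


def update_bundle(bundle, sa):
    """Age every sample by one and drop samples whose age had reached sa."""
    return {(d, s_age + 1) for (d, s_age) in bundle if s_age < sa}


def accumulate(bundle_h, bundle_j, j, d_h, sa):
    """One accumulation step for host h after contacting partner j."""
    merged = set(bundle_h) | extend(bundle_j, j)
    if d_h is not None:
        merged.add((d_h, 0))  # h's own selected proposal enters with age 0
    return update_bundle(merged, sa)


if __name__ == "__main__":
    mine = {(("u", (1,)), 1)}
    theirs = {(("u", (4,)), 2), (("u", ()), 0)}
    print(accumulate(mine, theirs, 7, ("u", (1,)), sa=2))
```

In the demo the partner's sample of age 2 is dropped because it has reached the limit, while the remaining samples survive with their ages increased by one.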
Final protocol Run: • Youngest selection • Bundle accumulation • Bundle sampling
Local computation time • How do we find t+1 disjoint paths? • Independent set ⇔ disjoint proposals • NP-hard problem • practical only for small values of t and a small number of proposals • Example (graph on nodes 1-4 with edges (1,2), (1,3), (2,3), (2,4), (3,4)): • p1: (u, {(1,2), (1,3)}) • p2: (u, {(1,2), (2,3), (2,4)}) • p3: (u, {(1,3), (2,3), (3,4)}) • p4: (u, {(2,4), (3,4)})
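A brute-force search shows what the local computation involves and why it only scales to small inputs. The sketch below is an illustration rather than the method used in the cited work: it tries every subset of k proposals and checks pairwise disjointness, using the four proposals p1..p4 from this slide as data. The function name is hypothetical.

```python
from itertools import combinations


def find_disjoint(proposals, k):
    """Return names of k pairwise-disjoint proposals, or None if none exist.

    `proposals` maps a proposal name to the set of elements on its path.
    The search is exponential in the number of proposals.
    """
    for combo in combinations(sorted(proposals), k):
        if all(proposals[a].isdisjoint(proposals[b])
               for a, b in combinations(combo, 2)):
            return combo
    return None


if __name__ == "__main__":
    P = {
        "p1": {(1, 2), (1, 3)},
        "p2": {(1, 2), (2, 3), (2, 4)},
        "p3": {(1, 3), (2, 3), (3, 4)},
        "p4": {(2, 4), (3, 4)},
    }
    print(find_disjoint(P, 2))
    print(find_disjoint(P, 3))
```

In this example only p1 and p4 share no path element, so a host could accept with t = 1 but not with t = 2.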
Direct vs. Path verification * an algorithm with the same analytical bounds using much larger messages exists
Short-Path diffusion (by Malkhi et al.) • Keep and send all proposals with path length < log(n/(t+1)) • similar to Direct diffusion, with longer paths • Optimal analytical delay and delay*fan-in product: O(t + log n) • But message and storage size is O((n/t)^O(log(t + log n))) • non-exponential, but grows faster than any polynomial in n • finding disjoint paths is computationally expensive
Delay analysis outline • Denote: b = t+1, b_k = b/2^k • gossip-circle C(p,d,r): set of correct hosts that received u originated at p over “good” paths of length up to d in r rounds
Delay analysis outline (cont.) • Initially active b_k hosts create disjoint low-diameter (up to 2logn/(bb_k)) gossip-circles of size n/(4b_k) in O(b + logn/(bb_k)) rounds • Given such disjoint gossip-circles, it takes exp. 4b_k rounds for a correct h to receive u from b_k/2 disjoint gossip-circles • Coupon-collector (b_k/2 coupons out of b_k) • It takes O(b + log n) rounds to receive b disjoint-path copies of u • By induction on b_k, for k = 0 … log b - 1
Delay analysis outline (cont.) • For any constant c, (n-b)(1-1/c) hosts are active in exp. O(b + log n) rounds • Markov’s inequality • Choose a particular value for c, e.g., c = 2 • Assuming b < n/60, (n-b)/2 > (2/5)n • If at least (2/5)n correct hosts are active, then within exp. O(b + log n) rounds all hosts become active • Chernoff bound • The expected delay is O(b + log n)
References • “On diffusing updates in a Byzantine environment”, by D. Malkhi, Y. Mansour and M.K. Reiter, at SRDS, 1999 • “Diffusion without false rumors: on propagating updates in a Byzantine environment”, by D. Malkhi, Y. Mansour and M.K. Reiter, at Theoretical Computer Science, 2003 • “Tolerating malicious gossip”, by Y.M. Minsky and F.B. Schneider, at Distr. Computing, 2003 • “Optimal Unconditional Information Diffusion”, by D. Malkhi, E. Pavlov and Y. Sella, at SRDS, 2001
Questions? Thank you!