Reconciling Differences: towards a theory of cloud complexity George Varghese UCSD, visiting at Yahoo! Labs
Part 1: Reconciling Sets across a link • Joint with D. Eppstein, M. Goodrich, F. Uyeda • Appeared in SIGCOMM 2011
Motivation 1: OSPF Routing (1990) • After a partition forms and heals, router R1 needs the updates that arrived at R2 during the partition. [Figure: routers R1 and R2; partition heals] • Must solve the Set-Difference problem!
Motivation 2: Amazon S3 storage (2007) • Synchronizing replicas. [Figure: replicas S1 and S2 run a periodic anti-entropy protocol] • Set-Difference across the cloud again!
What is the Set-Difference problem? • Host 1 has {A, B, E, F}; Host 2 has {A, C, D, F} • What objects are unique to host 1? • What objects are unique to host 2?
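On a single machine the problem is trivial; the whole challenge of the talk is computing the answer across a link without shipping the sets. A minimal Python baseline for the example above:

```python
# The set-difference problem: which objects are unique to each host?
host1 = {"A", "B", "E", "F"}
host2 = {"A", "C", "D", "F"}

unique_to_1 = host1 - host2   # {"B", "E"}
unique_to_2 = host2 - host1   # {"C", "D"}
```

The structures in the rest of the talk compute exactly these two sets, but with communication proportional to the difference, not to the sets.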
Use case 1: Data Synchronization • Host 1: {A, B, E, F}; Host 2: {A, C, D, F} • Identify missing data blocks • Transfer blocks to synchronize the sets: host 1 sends B, E; host 2 sends C, D
Use case 2: Data De-duplication • Host 1: {A, B, E, F}; Host 2: {A, C, D, F} • Identify all unique blocks • Replace duplicate data with pointers
Prior work versus ours • Trade a sorted list of keys • Let n be the size of the sets, U the size of the key space • O(n log U) communication, O(n log n) computation • Bloom filters can improve communication to O(n) • Polynomial encodings (Minsky, Trachtenberg) • Let d be the size of the difference • O(d log U) communication, O(dn + d^3) computation • Invertible Bloom Filter (our result) • O(d log U) communication, O(n + d) computation
Difference Digests • Efficiently solves the set-difference problem. • Consists of two data structures: • Invertible Bloom Filter (IBF) • Efficiently computes the set difference. • Needs the size of the difference • Strata Estimator • Approximates the size of the set difference. • Uses IBF’s as a building block.
IBFs: main idea • Sum over random subsets: summarize a set by "checksums" over O(d) random subsets • Subtract: exchange and subtract checksums • Eliminate: hashing decides subset membership, so common elements disappear after subtraction • Invert fast: O(d) equations in d unknowns; randomness allows expected O(d) inversion
"Checksum" details • Array of IBF cells that form "checksum" words • For a set difference of size d, use αd cells (α > 1) • Each element ID is assigned to several IBF cells • Each cell contains: idSum (XOR of the IDs in the cell), hashSum (XOR of their hashes), and a count
IBF Encode • All hosts use the same hash functions (Hash1, Hash2, Hash3) • Each ID is assigned to several cells; to "add" ID A to a cell: idSum ⊕= A, hashSum ⊕= H(A), count++ • Table size is αd cells, not O(n) as in a Bloom filter!
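The per-cell "Add" operation can be sketched in a few lines (the `add_to_cell` helper and the multiplicative stand-in hash are illustrative choices, not from the paper):

```python
def add_to_cell(cell, element_id, h):
    """The three per-cell updates: XOR in the ID, XOR in its
    checksum hash, and bump the count (cell is a dict sketch)."""
    cell["idSum"] ^= element_id
    cell["hashSum"] ^= h(element_id)
    cell["count"] += 1

# Adding the same ID twice cancels both XOR fields; this is exactly
# why common elements vanish when two hosts' IBFs are subtracted.
h = lambda x: (x * 2654435761) & 0xFFFFFFFF   # illustrative hash
cell = {"idSum": 0, "hashSum": 0, "count": 0}
add_to_cell(cell, 42, h)
add_to_cell(cell, 42, h)
assert cell["idSum"] == 0 and cell["hashSum"] == 0 and cell["count"] == 2
```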
Invertible Bloom Filters (IBF) • Host 1 ({A, B, E, F}) builds IBF 1; Host 2 ({A, C, D, F}) builds IBF 2 • Trade IBFs with the remote host
Invertible Bloom Filters (IBF) • "Subtract" the IBF structures: IBF (2 − 1) • Produces a new IBF containing only the unique objects
Disappearing act • After subtraction, elements common to both sets disappear because: • Any common element (e.g. W) is assigned to the same cells on both hosts (same hash functions on both sides) • On subtraction, W XOR W = 0, so W vanishes • Elements in the set difference remain, but they may be randomly mixed, so we need a decode procedure
IBF Decode • Test for purity: a cell is pure if H(idSum) = hashSum • For a pure cell holding only V, H(idSum) = H(V) = hashSum • A mixed cell fails the test, since H(V ⊕ X ⊕ Z) ≠ H(V) ⊕ H(X) ⊕ H(Z) • Recover the ID from each pure cell, remove it from its other cells, and repeat
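Putting encode, subtract, and purity-test peeling together gives a self-contained sketch (the hash choice, cell count, and all names are illustrative; a real deployment would size the table from the strata estimate):

```python
import hashlib

K, CELLS = 3, 40                      # 3 hash functions, ~alpha*d cells

def h(x, salt):
    """Salted hash of an ID (sha256 is an arbitrary sketch choice)."""
    d = hashlib.sha256(f"{salt}:{x}".encode()).digest()
    return int.from_bytes(d[:8], "big")

def cells_for(x):
    # All hosts must use the same hash functions; dedup so an ID never
    # lands in the same cell twice (which would self-cancel the XORs).
    return sorted({h(x, s) % CELLS for s in range(K)})

def encode(ids):
    id_sum, hash_sum, count = [0] * CELLS, [0] * CELLS, [0] * CELLS
    for x in ids:
        for i in cells_for(x):
            id_sum[i] ^= x
            hash_sum[i] ^= h(x, "chk")
            count[i] += 1
    return id_sum, hash_sum, count

def subtract(a, b):
    """Cell-wise IBF subtraction: XOR the sums, subtract the counts."""
    return ([p ^ q for p, q in zip(a[0], b[0])],
            [p ^ q for p, q in zip(a[1], b[1])],
            [p - q for p, q in zip(a[2], b[2])])

def decode(diff):
    """Peel pure cells (count = +/-1 and H(idSum) == hashSum)."""
    id_sum, hash_sum, count = (list(v) for v in diff)
    only_a, only_b = set(), set()
    progress = True
    while progress:
        progress = False
        for i in range(CELLS):
            if count[i] in (1, -1) and id_sum[i] and \
                    hash_sum[i] == h(id_sum[i], "chk"):
                x, sign = id_sum[i], count[i]
                (only_a if sign == 1 else only_b).add(x)
                for j in cells_for(x):    # remove x from all its cells
                    id_sum[j] ^= x
                    hash_sum[j] ^= h(x, "chk")
                    count[j] -= sign
                progress = True
    return only_a, only_b
```

Each host encodes its own IDs; one side ships its fixed-size structure (O(d) cells, independent of n) and the other subtracts and decodes to recover both sides of the difference.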
How many IBF cells? [Plot: space overhead α vs. set difference, for 3 and 4 hash functions, at >99% decode success] • Small differences: 1.4x – 2.3x overhead • Large differences: 1.25x – 1.4x overhead
How many hash functions? • 1 hash function: many pure cells initially, but nothing to undo when an element is removed • Many (say 10) hash functions: too many collisions • By experiment, 3 or 4 hash functions work well. Is there a theoretical reason?
Theory • Let d = difference size, k = # hash functions • Theorem 1: with (k + 1)d cells, failure probability falls exponentially with k • For k = 3, this implies a 4x tax on storage, a bit weak • [Goodrich, Mitzenmacher]: failure is equivalent to finding a 2-core (loop) in a random hypergraph • Theorem 2: with c_k · d cells, failure probability falls exponentially with k • c_4 = 1.3x tax, which agrees with experiments
Recall experiments [Plot: space overhead vs. set difference for 3 and 4 hash functions, at >99% decode success] • Large differences: 1.25x – 1.4x overhead
Connection to Coding • Mystery: IBF decode resembles the peeling procedure used to decode Tornado codes. Why? • Explanation: set difference is equivalent to coding over an insertion/deletion channel • Intuition: given a code for set A, send the checkwords only to B; think of B as a corrupted form of A • Reduction: if the code can correct d insertions/deletions, then B can recover A and hence the set difference • Reed-Solomon <---> polynomial methods • LDPC (Tornado) <---> Difference Digest
Random Subsets = Fast Elimination [Figure: the αd cell equations viewed as a sparse linear system, e.g. X + Y + Z = …, Y = … (pure), X = …] • Pure cells are equations with a single unknown • The system is sparse and roughly upper triangular
Difference Digests • Consists of two data structures: • Invertible Bloom Filter (IBF) • Efficiently computes the set difference. • Needs the size of the difference • Strata Estimator • Approximates the size of the set difference. • Uses IBF’s as a building block.
Strata Estimator • Divide keys into sampled subsets via consistent partitioning, with IBF k holding roughly a 1/2^k fraction of the keys (IBF 1 ~1/2, IBF 2 ~1/4, IBF 3 ~1/8, IBF 4 ~1/16, …) • Encode each subset into an IBF of small fixed size • log(n) IBFs of ~20 cells each
Strata Estimator • Hosts 1 and 2 trade estimators, then attempt to subtract & decode the IBFs at each level • If level k decodes, return 2^k × (the number of IDs recovered)
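The estimator's logic can be sketched with the exact per-stratum symmetric difference standing in for the small fixed-size IBFs; a stratum whose difference exceeds the assumed ~20-cell capacity models an IBF that fails to decode. All names and the trailing-zeros partition are illustrative choices:

```python
import hashlib

LEVELS, CAPACITY = 32, 20   # ~20-cell IBF per stratum, per the slides

def hval(x):
    return int.from_bytes(hashlib.sha256(str(x).encode()).digest()[:8], "big")

def stratum(x):
    """Consistent partitioning by trailing zero bits of the key's hash,
    so stratum k samples roughly a 1/2^(k+1) fraction of the keys."""
    h, k = hval(x), 0
    while k < LEVELS - 1 and (h >> k) & 1 == 0:
        k += 1
    return k

def estimate_diff(set_a, set_b):
    per = [[set(), set()] for _ in range(LEVELS)]
    for x in set_a: per[stratum(x)][0].add(x)
    for x in set_b: per[stratum(x)][1].add(x)
    recovered = 0
    for k in reversed(range(LEVELS)):     # sparsest strata first
        diff = per[k][0] ^ per[k][1]
        if len(diff) > CAPACITY:          # "level k failed to decode"
            return (2 ** (k + 1)) * recovered
        recovered += len(diff)
    return recovered                      # every stratum decoded: exact
```

Because each decoded stratum samples the difference at a known rate, scaling the recovered count by the sampling factor gives an unbiased rough estimate, which then sizes the real IBF.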
KeyDiff Service • Each application talks to a local Key Service through a small API: Add(key), Remove(key), Diff(host1, host2) • Promising applications: file synchronization, P2P file sharing, failure recovery
Difference Digest Summary • Strata Estimator • Estimates Set Difference. • For 100K sets, 15KB estimator has <15% error • O(log n) communication, O(n) computation. • Invertible Bloom Filter • Identifies all ID’s in the Set Difference. • 16 to 28 Bytes per ID in Set Difference. • O(d) communication, O(n+d) computation • Worth it if set difference is < 20% of set sizes
Connection to Sparse Recovery? • If we forget about subtraction, in the end we are recovering a d-sparse vector • The hash check is key for figuring out which cells are pure after differencing • Is there a connection to compressed sensing? Could sensors do the random summing? The hash summing? • A connection the other way: could compressed sensing be used for set differences?
Comparison with Information Theory and Coding • Worst-case complexity versus average-case • Information theory emphasizes communication complexity, not computation; we focus on both • Existential versus constructive: some similar settings (Slepian-Wolf) are existential • Estimators: we want bounds based on the difference, so we start by efficiently estimating it
Aside: IBFs in Digital Hardware [Figure: a stream of set elements (a, b, x, y) feeds read/hash/write logic; a strata hash plus Hash 1–3 map each element into separate memory banks (Bank 1–3)] • Hash to separate banks for parallelism, at a slight cost in space • Decode in software
Part 2: Towards a theory of Cloud Complexity [Figure: objects O1, O2, O3 connected in a cloud] • Complexity of reconciling "similar" objects?
Example: Synching Files [Figure: versions X.ppt.v1, X.ppt.v2, X.ppt.v3 to be reconciled] • Measures: communication bits, computation
So far: Two sets, one link, set difference {a,b,c} {d,a,c}
Mild Sensitivity Analysis: one set much larger than the other • Even with a small difference d, Ω(|A|) bits are needed, not O(d): Patrascu 2008 • Simpler proof: DKS 2011
Asymmetric set difference in the LBFS file system (Mazieres) [Figure: File A with chunks C1, C2, C3, C5, …; chunk set B at the server with C1, C3, …, C97, C98, C99; 1 chunk difference] • LBFS sends all chunk hashes in File A: O(|A|)
More sensitivity analysis: small intersection (database joins) • With a small intersection d, Ω(|A|) bits are needed, not O(d) • Follows from results on the hardness of set disjointness
Sequences under Edit Distance (files, for example) [Figure: File A = A B C D E F versus File B = A C D E F G, edit distance 2] • A single insert/delete can renumber all file blocks
Sequence reconciliation (with J. Ullman) [Figure: File A = A B C D E F versus File B = A C D E F, edit distance 1, with piece hashes H1, H2, H3] • Send 2d + 1 piece hashes; clump unmatched pieces and recurse • Communication: O(d log² N)
21 years of Sequence Reconciliation! • Schwartz, Bowdidge, Burkhard (1990): recurse on unmatched pieces, not the aggregate • Rsync: widely used tool that breaks the file into roughly √N piece hashes, where N is the file length
Sets on graphs? [Figure: a graph whose nodes hold the sets {a,b,c}, {b,c,d}, {d,c,e}, {a,f,g}]
Generalizes rumor spreading, which has disjoint singleton sets [Figure: nodes holding singleton sets {a}, {b}, {d}, {g}] • CLP10, G11: O(|E| n log n / conductance)
Generalized Push-Pull (with N. Goyal and R. Kannan) • Pick a random edge and run 2-party set reconciliation across it [Figure: node sets {a,b,c}, {b,c,d}, {d,c,e}] • Complexity: C + D, with C as before and D = Σ_i (|U| − |S_i|)
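Reading D as the total distance of each node's set from the union U it must converge to (an assumed reconstruction of the slide's formula), a toy check on the slide's node sets:

```python
# Node sets from the slide's example graph; U is their union.
sets = [{"a", "b", "c"}, {"b", "c", "d"}, {"c", "d", "e"}, {"a", "f", "g"}]
U = set().union(*sets)                    # 7 distinct elements a..g

# D = sum over nodes of how many elements each set is still missing.
D = sum(len(U) - len(S) for S in sets)    # 4 nodes * (7 - 3) = 16
```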
Sets on Steiner graphs? [Figure: terminals holding {a} ∪ S and {b} ∪ S, connected through a relay node R1] • Only the terminals need the sets; push-pull through relays is wasteful!
Butterfly example for Sets [Figure: butterfly network carrying S1 and S2; an internal node computes D = Diff(S1, S2) and forwards D to both sinks] • Set difference instead of XOR within the network