1 / 25

Beyond Set Disjointness : The Communication Complexity of Finding the Intersection

Beyond Set Disjointness : The Communication Complexity of Finding the Intersection. Grigory Yaroslavtsev http://grigory.us. Joint with Brody, Chakrabarti , Kondapally and Woodruff. Communication Complexity [Yao’79]. Shared randomness. Bob: . Alice: . ….

garron
Download Presentation

Beyond Set Disjointness : The Communication Complexity of Finding the Intersection

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Beyond Set Disjointness: The Communication Complexity of Finding the Intersection GrigoryYaroslavtsev http://grigory.us Joint with Brody, Chakrabarti, Kondapally and Woodruff

  2. Communication Complexity [Yao’79] Shared randomness Bob: Alice: … • = min. communication (error ) • min. -round communication (error )

  3. Set Intersection = ? (-Intersection) = ?

  4. This talk Let • (-Intersection) = [Brody, Chakrabarti, Kondapally, Woodruff, Y.; PODC’14] • (-Intersection) = [Saglam-Tardos FOCS’13; Brody, Chakrabarti, Kondapally, Woodruff, Y.’13] { times (-Intersection) = for

  5. -Disjointness • , iff • [Razborov’92; Hastad-Wigderson’96] [Folklore + Dasgupta, Kumar, Sivakumar; Buhrman’12, Garcia-Soriano, Matsliah, De Wolf’12] • [Saglam, Tardos’13] • [Braverman, Garg, Pankratov, Weinstein’13]

  6. Applications • : exact Jaccard index ( for -approximate use MinHash[Broder’98; Li-Konig’11; Path-Strokel-Woodruff’14]) • Rarity, distinct elements, joins,… • Multi-party set intersection (later) • Contrast:

  7. 1-round -protocol

  8. Hashing Expected # of elements =# of buckets

  9. Secondary Hashing where = # of hash functions

  10. 2-Round -protocol Total communication = = O()

  11. Collisions

  12. Collisions

  13. Collisions • Second round: • For each bucket send -bit equality check (total -communication) • Correct intersection computed in buckets where • Expected # of items in incorrect buckets • Use 1-round protocol for incorrect buckets • Total communication

  14. Main protocol Expected # of elements =# of buckets

  15. Verification tree -degree … buckets = leaves of the verification tree

  16. Verification bottom-up

  17. Verification bottom-up Incorrect Incorrect Correct EQ() Incorrect Correct

  18. Verification bottom-up Incorrect Incorrect Correct EQ() Correct Correct Incorrect Correct

  19. Verification bottom-up … … …

  20. Analysis of Stage • = [node at stage computed correctly] • Set = • Run equality checks and basic intersection protocols with success probability • Key lemma: [# of restarts per leaf • Cost of Equality = • Cost of Intersection in leafs = • [protocol succeeds] =

  21. Lower Bound • (-Intersection) = [Brody, Chakrabarti, Kondapally, Woodruff, Y.’13] • iff, where • = solving independent instances of • reduces to -Intersection: • Given and • Construct sets with elements and

  22. Communication Direct Sums “Solving m copies of a communication problem requires m times more communication”: • For arbitrary [… Braverman, Rao 10; Barak Braverman, Chen, Rao 11, ….] • In general, can’t go beyond

  23. Specialized Communication Direct Sums Information cost Communication complexity • [Bar Yossef, Jayram, Kumar,Sivakumar’01] Disjointness • Stronger direct sum for bounded-round complexity of Equality-type problems (a.k.a. “union bound is optimal”) [Molinaro, Woodruff, Y.’13]

  24. Extensions • Multi-party: players, , where • Boost error probability to • Average per player (using coordinator): in rounds • Worst-case per player (using a tournament) in rounds

  25. Open Problems • (-Intersection) = • Better protocols for the multi-party setting

More Related