270 likes | 477 Views
Complexity Theory Lecture 2. Lecturer: Moni Naor. Recap of last week. Computational Complexity Theory: What, Why and How Overview: Turing Machines, Church-Turing Thesis Diagonalization an the Time-Hierarchy (also Space) Communication Complexity Definition of protocol
E N D
Complexity TheoryLecture 2 Lecturer:Moni Naor
Recap of last week • Computational Complexity Theory: What, Why and How • Overview: Turing Machines, Church-Turing Thesis • Diagonalization an the Time-Hierarchy (also Space) • Communication Complexity • Definition of protocol • Combinatorial Rectangles • Fooling sets an lower bounds • Connection to Time-Space tradeoffs of Turing Machines • Rank lower bound • Covers an non-determinism This week: • Non-deterministic and probabilistic communication complexity • Space Complexity
Non-deterministic Communication Complexity • A non-deterministic function: a:X {0,1} maps each element to one of the sets {0}, {1} or {0,1}. When evaluating the function the result should be one of the elements in the subset. Let f:X £ Y with range Z. A non-deterministic protocolfor verifying that f(x,y)=z is a binary tree where: • Each internal node v is labeled with a non-deterministic function av:X {0,1} or bv:Y {0,1} • Each leaf is either labeled `accept’ or with `failure’ The inputs (x,y) and the non-deterministic choices define a path from the root to a leaf. • If f(x,y)=z then there should be at least one path leading to an `accept’ leaf • If f(x,y)≠z then there should be no path leading to an `accept’ leaf • The cost of a protocol on input (x,y) is the length of the longest path taken on input (x,y) • The cost of a protocol is the maximum path length • The non-deterministic communication complexity of f for verifying z is the cost of the best protocol. Denote it by Nz(f) • For Boolean f call N1(f) the nondeterministic communication complexity and N0(f) co-nondeterministic communication complexity
Example Equality: Alice and Bob each hold x,y 2 {0,1}n • want to decide whether x=y or not. N0(Equality)= log n What about N1(Equality)? Homework: show that for every Boolean function f and z 2 {0,1} Nz(f) ¸ log D(f)
Covers and Non-determinism There is a 1-1 correspondence with a z-cover: a collection of not necessary disjoint combinatorial rectangles in f:X x Y where for each (x,y) such that f(x,y)=z there is a rectangle that cover it. Let Cz(f) be the size of the smallest z-cover: A non-deterministic protocol for verifying that f(x,y)=z: Alice: Guess a rectangle R intersecting row x, send name to Bob Bob: verify that R intersects column y and report to Alice Accept only if Bob approves • The number of rectangles is the number of leaves Theorem:Nz(f) · log Cz(f)+1 and Cz(f) · 2Nz(f)
Lower bounds • A set Sµ X x Y is a fooling set for f if there exists a z 2 Z where • For every(x,y) 2 S, f(x,y)=z • For everydistinct (x1,y1), (x2,y2) 2 S either • f(x1,y2)≠z or • f(x2,y1)≠z • The fooling set argument is still applicable So N1(Equality) ¸ n Homework: for x,y 2 {0,1}n let GT(x,y) be 1 iff x>y. Show that N1(GT) and N0(GT) are (n)
Deterministic vs. Nondeterministic Communication Complexity • There can be a large gap between D(f) and Nz(f), but not for both0 and 1: Theorem: for any Boolean f:X x Y {0,1} we have D(f) · N0(f) N1(f) Proof: Property: if R0 is a 0-monochromatic rectangle and R1 is a 1-monochromatic rectangle, then either: • R0 and R1 do not intersect in columns, or • R0 and R1 do not intersect in rows Corollary: if R0 is a 0-monochromatic rectangle and C1 is a collection of 1-monochromatic rectangles, then either: • R0 intersects in columns less than half of the rectangles in C1, or • R0 intersects in rows less than half of the rectangles in C1
The protocol Use the 0-cover of size C0(f) and 1-cover of size C1(f) to construct a protocol Maintain a decreasing list L of potential 0-monochromatic rectangles containing (x,y) Start with the full 0-cover At each step: If L is empty declare the result to be 1 Alice: search for a 1-rectangle in the 1-cover that • Contains row x • Intersects in rows at most half the rectangles in L If found report the name (logC1(f) bits) and update L If not found: Bob: search for a 1-rectangle in the 1-cover that • Contains column y • Intersects in columns at most half the rectangles in L If found report the name and update L If not found declare the result to be 0 Row intersection column intersection 1 0 0 0
Proof Lemma: the protocol is correct • If f(x,y)=0 then the 0-rectangle containing (x,y) is never deleted from L • If f(x,y)=1 then the 1-rectangle containing (x,y) can always be used by either Alice or Bob (from corollary) Lemma: the complexity of the protocol is at most logC1(f)logC0(f) • Each iteration L shrinks by ½ - at most logC0(f)rounds • Specifying the 1-rectangle logC1(f)bits • Corollary: if R0 is a 0-rectangle and C1 is a collection of 1-rectangles, then either: • R0 intersects in columns less than half of the rectangles in C1, or • R0 intersects in rows less than half of the rectangles in C1
The result is tight k-Disjointness: let 1 · k ¿ n/2 let x,y µ {1,…,n} be subsets of size k. let • k-DISJ(x,y)=1 if |x y|¸ 1 and • k-DISJ(x,y)=0 otherwise • Note |X|=|Y|= (nk ) • N1(k-DISJ)=O(log n) • N0(k-DISJ)=O(k+loglog n) • Using a functionh:{1,…,n} {1,…,2k} which is perfect hash for x[y • Possible to get lower bound of log (nk ) on D(k-DISJ) using the rank technique
Perfect Hash Functions A family H of functions is (n,k,ℓ) perfect if • 8h 2 H h:{1,…,n} {1,…, ℓ} • 8S ½ {1,…,n} 9h 2 H such that h is 1-1 on S A non-deterministic protocol for k-disjointness using an (n,2k,2k) family H of functions Alice: guess h 2 H and send description hoping that h is perfect for x [ y compute h(i) for all i 2 x and send 2k bit vector Cx where Cx[j]=1 iff 9i 2 x such that h(i)=j Bob: compute h(i) for all i 2 y and see whether Cx and Cy intersect Accept only if do not intersect If |x y|¸ 1 always reject. If |x y|¸ 0 and h is perfect for x [ y – accept Complexity: log|H| + 2k
Existence of Perfect Hash Families:The Probabilistic Method • For a fixed S ½ {1,…,n} of size 2k and choose random h:{1,…,n} {1,…,2k} Pr[h is perfect for S] = k!/(2k)2k¼ e-2k • Suppose we choose m random h:{1,…,n} {1,…,2k} Let event AS be ``no h in the collection is perfect for S” Pr[AS] · (1- e-2k)m We are interested in showing Pr[[S AS] < 1 This implies that there is a choice of the that is a perfect family of hash function Pr[[S AS] ·S Pr[AS] · (n2k) Pr[AS] The probabilistic method! Union Bound
The parameters • Set m= e2k log (n2k). Then Pr[AS] · (1- e-2k)m · (1- e-2k) e2k log (n2k)=1/(n2k) • This means that communication complexity is 2k +log m = 2k +2k +log (k log n) 2 O(k+log log n) • Classical constructions: Fredman Komlos Szemeredi • More modern one: Pugh
Open Problem: rank lower bound tight? Open Problem: Is there a fixed c>0 such that for all f:X x Y {0,1} D(f) is O(logc rank(f)) Not true for non-Boolean functions: f(x,y)= xi yi At least as hard as Inner Product over GF[2]
Probabilistic Communication Complexity • Alice an Bob have each, in addition to their inputs, access to random strings of arbitrary length rA and rB (respectively) A probabilistic protocol P over domain X x Y with range Z is a binary tree where • Each internal node v is labeled with either av(x, rA ) or bv(y, rB ) • Each leaf is labeled with an element z 2 Z Take all probabilities over the choice of rA and rB • P computes f with zero error if for all (x,y) Pr[P(x,y)=f(x,y)]=1 • P computes f with error if for all (x,y) Pr[P(x,y)=f(x,y)]¸ 1- • For Boolean f,P computes f with one-sided error if for all (x,y) s.t. f(x,y)=0 Pr[P(x,y)=0]=1 and for all (x,y) s.t. f(x,y)=1 Pr[P(x,y)=1]¸ 1-
Measuring Probabilistic Communication Complexity For input (x,y) can consider as the cost of protocol P on input (x,y) either • worst-case depth • average depth over rA and rB Cost of a protocol: maximum cost over all inputs (x,y) The appropriate measure of probabilistic communication complexity: • R0(f): minimum (over all protocols) of the average cost of a randomizedprotocol thatcomputes f with zero error. • R(f): minimum (over all protocols) of the worst-case cost of a randomizedprotocol thatcomputes f with error. • Makes sense: if 0< <½ . R(f) = R1/3(f): • R1(f): minimum (over all protocols) of the worst-case cost of a randomizedprotocol thatcomputes f with one-sided error. • Makes sense: if 0< <1. R1(f) = R½1(f):
Equality • Idea: pick a family of hash functions H={h|h:{0,1}n {1…m}} such that for allx≠y, for random h 2RH Pr[(h(x)=h(y)]· Protocol: Alice: pick randomh 2RH and send <h, h(x)> Bob: compare h(x) to h(y) and announce the result This is a one-sided error protocol with cost log|H|+ log m Constructing H: Fact: over any two polynomials of degree d agree on at most d points Fix prime q such that n2· q · 2n2 map x to a polynomial Wx of degree d=n/log q over GF[q] H={hz|z 2 GF[q]} and hz(x)=Wx(z) = d/q= n/q log q · 1/n log n
Relationship between the measures Error reduction (Boolean): • for one-sided errors: k repetitions reduces to k hence Rk1(f) · kR1(f) • for two-sided errors: k repetitions and taking majority reduces the error using Chernoff bounds Derandomization: R(f) = R1/3(f) 2(log D(f)) General Idea: find a small collection of assignments to where the protocol behaves similarly. Problem: to work for all pairs of inputs need to repeat ~n times Instead: jointly evaluate for each leaf ℓ the probability of reaching it, on the given input: Pℓ [x,y] = PℓA[x|Bob follows the path] ¢ PℓB[y|Alice follows the path] Chernoff: ifPr[xi] = 1] =1/2 - then Pr[i=1k xi > k/2] · e-22k Alice computes and sends. Accuracy: log R(f) bits Bob computes
Public coins model • What if Alice and Bob have access to a joint source of bits. Possible view: distribution over deterministic protocols Let Rpub(f): be the minimum cost of a public coins protocol computing f correctly with probability at least 1- for any input (x,y) Example: Rpub(Equality)=(log ) Theorem: for any Boolean f: R+d(f) is Rpub(f)+O(log n + log 1/d) Proof: choose t = 8n/d2 assignments to the public string…
Simulating large sample spaces Collection that should resemble probability of success on ALL inputs • Want to find among all possible public random strings a small collection of strings on which the protocol behave similarly on all inputs • Choose m random strings • For input (x,y) event Ax,y is more than (+) of the m strings fail the protocol Pr[Ax,y] · e-22t < 2-2n Pr[[x,y AS] ·S Pr[AS] <22n 2-2n=1 Bad Good 1-
Distributional Complexity Let be a probability distribution on X x Y and >0. The (,)-distributional complexity of f (D(f)) is the cost of the best deterministic protocol that is correct on 1- of the inputs weighted by Theorem: for any f: Rpub(f)=max D(f)) Is the given protocol correct on the given input Inputs Von Neumann’s Minimax Theorem: For all matricesM: maxv minq pT M q = minq maxp pT M q Protocols of depth d
Discrepancy and distributional complexity For f:X x Y {0,1} and rectangle R and be a probability distribution on X x Y let Discm(R,f)=|Pr[f(x,y)=1 and (x,y) 2 R] -Pr[f(x,y)=0 and (x,y) 2 R]| Discm(f)=maxR Discm(R,f)=| Theorem: For f:X x Y {0,1}, distribution on X x Y and >0 D1/2-e(f)¸ log(2 / Discm(f))
Inner Product • Let x,y 2 {0,1}n IP(x,y)= i=1n xi yi mod 2 Theorem: R(IP) is (n). more accurately R1/2-pub(IP)¸n/2 – log(1/) Show that Discuniform(IP) = 2-n/2. Therefore D1/2-euniform(IP) ¸ n/2-log(1/) And R1/2-pub(IP)¸n/2 – log(1/) Let H(x,y)=(-1)IP(x,y) Claim:||H||=√2n H HT = 2nIn For rectangle S £ T: Discuniform(S £ T, IP) =( 1/22n )x 2 S ,y 2 T H(x,y) = ( 1/22n )|1S¢ H ¢ 1T| |1S¢ H ¢ 1T| ·|| 1S||¢ || H ||¢ || 1T || =√|S|¢√2n¢√|T| · 2n/2 ¢ 2n/2 ¢ 2n/2 ||A|| =max|v|=1 ||A v|| = max{| eignevalue of AAT} Cauchy-Schwartz
Summary of Communication Complexity Classes • For a `robust’ class f • Closed under composition • Example: polylog communication • Nf is not equal Co-Nf • Pf= Nf Å Co-Nf • BPf is not contained in Nf Å Co-Nf
Summary of techniques and ideas • Probabilistic method: to show that a combinatorial object exists: choose a random one and show that it satisfies desired properties with Prob>0 • Constructive vs. non-constructive methods • Union bound: define a collection of bad events A1, A2, … An Pr[[i Ai] ·i Pr[Ai] · n Pr[Ai] • Simulating large sample spaces by small ones • Reversing roles via Minimax Theorem