Complexity Theory, Lecture 2
Lecturer: Moni Naor
Recap of last week
• Computational Complexity Theory: What, Why and How
• Overview: Turing Machines, Church-Turing Thesis
• Diagonalization and the Time Hierarchy (also Space)
• Communication Complexity
• Definition of a protocol
• Combinatorial rectangles
• Fooling sets and lower bounds
• Connection to time-space tradeoffs of Turing Machines
• Rank lower bound
• Covers and non-determinism
This week:
• Non-deterministic and probabilistic communication complexity
• Space Complexity
Non-deterministic Communication Complexity
• A non-deterministic function a: X → {0,1} maps each element to one of the sets {0}, {1} or {0,1}. When evaluating the function, the result may be any element of that subset.
Let f: X × Y → Z. A non-deterministic protocol for verifying that f(x,y)=z is a binary tree where:
• Each internal node v is labeled with a non-deterministic function a_v: X → {0,1} or b_v: Y → {0,1}
• Each leaf is labeled either `accept' or `failure'
The inputs (x,y) and the non-deterministic choices define a path from the root to a leaf.
• If f(x,y)=z then at least one path must lead to an `accept' leaf
• If f(x,y)≠z then no path may lead to an `accept' leaf
• The cost of a protocol on input (x,y) is the length of the longest path taken on input (x,y)
• The cost of a protocol is the maximum cost over all inputs
• The non-deterministic communication complexity of f for verifying z is the cost of the best protocol; denote it by N^z(f)
• For Boolean f, N^1(f) is called the non-deterministic communication complexity and N^0(f) the co-non-deterministic communication complexity
Example
Equality: Alice and Bob hold x, y ∈ {0,1}^n and want to decide whether x=y or not.
N^0(Equality) = O(log n)
What about N^1(Equality)?
Homework: show that for every Boolean function f and z ∈ {0,1}: N^z(f) ≥ log D(f)
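A minimal Python sketch (not from the slides; the function names are illustrative) of the protocol behind N^0(Equality) = O(log n): the certificate that x ≠ y is an index where the strings differ, together with Alice's bit there, i.e. log n + 1 bits.

```python
def verify_unequal(x: str, y: str, i: int) -> bool:
    """Alice's certificate is (i, x[i]); Bob accepts iff y[i] != x[i]."""
    return y[i] != x[i]

def exists_accepting_path(x: str, y: str) -> bool:
    """Nondeterministic acceptance = 'some certificate is accepted'."""
    return any(verify_unequal(x, y, i) for i in range(len(x)))

# Completeness: if x != y, some index certifies it.
assert exists_accepting_path("0110", "0111")
# Soundness: if x == y, no certificate is accepted.
assert not exists_accepting_path("0110", "0110")
```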
Covers and Non-determinism
Non-deterministic protocols correspond to z-covers: a collection of (not necessarily disjoint) combinatorial rectangles of f: X × Y such that every (x,y) with f(x,y)=z is covered by some rectangle. Let C^z(f) be the size of the smallest z-cover.
A non-deterministic protocol for verifying that f(x,y)=z:
Alice: guess a rectangle R of the cover intersecting row x, send its name to Bob
Bob: verify that R intersects column y and report to Alice
Accept only if Bob approves
• The number of rectangles needed is at most the number of leaves
Theorem: N^z(f) ≤ log C^z(f) + 1 and C^z(f) ≤ 2^{N^z(f)}
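The guess-a-rectangle protocol can be simulated on a toy example. Below, Equality on the domain {0,1,2,3} is covered for z = 1 by the four diagonal singleton rectangles; this is a hypothetical illustration, not a construction from the slides.

```python
# A 1-cover of Equality on {0,1,2,3}: the diagonal singletons {x} x {x}.
# Cover size 4, so the protocol costs at most log2(4) + 1 = 3 bits.
cover = [({i}, {i}) for i in range(4)]

def nondet_verify(x: int, y: int) -> bool:
    # Alice guesses any rectangle whose row set contains x and sends its name;
    # Bob accepts iff that rectangle's column set contains y.
    return any(x in rows and y in cols for rows, cols in cover)

# The protocol accepts exactly the 1-inputs of Equality.
assert all(nondet_verify(x, y) == (x == y) for x in range(4) for y in range(4))
```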
Lower bounds
• A set S ⊆ X × Y is a fooling set for f if there exists a z ∈ Z where
• For every (x,y) ∈ S, f(x,y)=z
• For every distinct (x1,y1), (x2,y2) ∈ S either f(x1,y2)≠z or f(x2,y1)≠z
• The fooling set argument is still applicable: a fooling set of size t forces C^z(f) ≥ t, hence N^z(f) ≥ log t
So N^1(Equality) ≥ n
Homework: for x, y ∈ {0,1}^n let GT(x,y) be 1 iff x > y. Show that N^1(GT) and N^0(GT) are Ω(n)
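To make the fooling-set argument concrete, here is a small check of the diagonal fooling set for Equality (implicit in the slide's claim N^1(Equality) ≥ n); a sketch, with n fixed small:

```python
from itertools import product

n = 3
S = [(x, x) for x in product((0, 1), repeat=n)]  # candidate fooling set, size 2^n

def f(x, y):  # Equality
    return int(x == y)

# Condition 1: every pair in S has value z = 1.
assert all(f(x, y) == 1 for x, y in S)
# Condition 2: mixing any two distinct pairs breaks the value z.
assert all(f(x1, y2) != 1 or f(x2, y1) != 1
           for (x1, y1) in S for (x2, y2) in S if (x1, y1) != (x2, y2))
# Hence C^1(Equality) >= |S| = 2^n, so N^1(Equality) >= n.
```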
Deterministic vs. Non-deterministic Communication Complexity
• There can be a large gap between D(f) and N^z(f), but not for both z=0 and z=1:
Theorem: for any Boolean f: X × Y → {0,1} we have D(f) ≤ N^0(f) · N^1(f)
Proof:
Property: if R0 is a 0-monochromatic rectangle and R1 is a 1-monochromatic rectangle, then either
• R0 and R1 do not intersect in columns, or
• R0 and R1 do not intersect in rows
Corollary: if R1 is a 1-monochromatic rectangle and L is a collection of 0-monochromatic rectangles, then either
• R1 intersects in columns at most half of the rectangles in L, or
• R1 intersects in rows at most half of the rectangles in L
The protocol
Use the 0-cover of size C^0(f) and the 1-cover of size C^1(f) to construct a protocol.
Maintain a shrinking list L of potential 0-monochromatic rectangles containing (x,y); start with the full 0-cover.
At each step:
If L is empty, declare the result to be 1
Alice: search for a 1-rectangle in the 1-cover that
• contains row x
• intersects in rows at most half the rectangles in L
If found, report its name (log C^1(f) bits) and update L (keep only the rectangles that intersect it in rows)
If not found:
Bob: search for a 1-rectangle in the 1-cover that
• contains column y
• intersects in columns at most half the rectangles in L
If found, report its name and update L (keep only the rectangles that intersect it in columns)
If not found, declare the result to be 0
Proof
Lemma: the protocol is correct
• If f(x,y)=0 then the 0-rectangle containing (x,y) is never deleted from L
• If f(x,y)=1 then the 1-rectangle containing (x,y) can always be used by either Alice or Bob (by the corollary)
Lemma: the complexity of the protocol is at most log C^1(f) · log C^0(f)
• Each iteration shrinks L by half, so there are at most log C^0(f) rounds
• Specifying a 1-rectangle takes log C^1(f) bits
The result is tight
k-Disjointness: let 1 ≤ k ≪ n/2 and let x, y ⊆ {1,…,n} be subsets of size k. Let
• k-DISJ(x,y)=1 if |x ∩ y| ≥ 1, and
• k-DISJ(x,y)=0 otherwise
• Note |X| = |Y| = (n choose k)
• N^1(k-DISJ) = O(log n)
• N^0(k-DISJ) = O(k + log log n), using a function h: {1,…,n} → {1,…,2k} which is a perfect hash for x ∪ y
• It is possible to get a lower bound of log (n choose k) on D(k-DISJ) using the rank technique
Perfect Hash Functions
A family H of functions is (n,k,ℓ)-perfect if
• ∀h ∈ H: h: {1,…,n} → {1,…,ℓ}
• ∀S ⊆ {1,…,n} with |S| = k, ∃h ∈ H such that h is 1-1 on S
A non-deterministic protocol for k-disjointness using an (n,2k,2k)-perfect family H:
Alice: guess h ∈ H and send its description, hoping that h is perfect for x ∪ y; compute h(i) for all i ∈ x and send the 2k-bit vector Cx where Cx[j]=1 iff ∃i ∈ x such that h(i)=j
Bob: compute h(i) for all i ∈ y, form Cy similarly, and check whether Cx and Cy intersect
Accept only if they do not intersect
If |x ∩ y| ≥ 1: always reject. If |x ∩ y| = 0 and h is perfect for x ∪ y: accept
Complexity: log|H| + 2k
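A sketch of this protocol in Python, with the non-deterministic guess replaced by a search over random functions (the pool of random maps stands in for H; it is not one of the explicit constructions mentioned later, and all names are illustrative; the universe is 0-indexed).

```python
import random

def run_protocol(x, y, n, k, trials=1000, seed=0):
    """Search for a perfect h among random maps, then run the protocol."""
    rng = random.Random(seed)
    m = 2 * k  # range size of the hash
    union = x | y
    for _ in range(trials):
        h = [rng.randrange(m) for _ in range(n)]  # random h: {0..n-1} -> {0..2k-1}
        if len({h[i] for i in union}) == len(union):  # h is 1-1 (perfect) on x ∪ y
            cx = {h[i] for i in x}  # Alice's 2k-bit characteristic vector
            cy = {h[i] for i in y}  # Bob's
            return cx.isdisjoint(cy)  # accept iff the vectors do not intersect
    return False  # no perfect h was guessed on any path

# Disjoint sets: some guess leads to acceptance.
assert run_protocol({0, 1}, {4, 5}, n=8, k=2)
# Intersecting sets: every guess rejects (a common element always collides).
assert not run_protocol({0, 1}, {1, 5}, n=8, k=2)
```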
Existence of Perfect Hash Families: The Probabilistic Method
• Fix S ⊆ {1,…,n} of size 2k and choose a random h: {1,…,n} → {1,…,2k}:
Pr[h is perfect for S] = (2k)!/(2k)^{2k} ≈ e^{-2k}
• Suppose we choose m random functions h: {1,…,n} → {1,…,2k}.
Let A_S be the event ``no h in the collection is perfect for S''
Pr[A_S] ≤ (1 - e^{-2k})^m
We want to show Pr[∪_S A_S] < 1; this implies that some choice of the m functions forms a perfect family of hash functions.
Union bound: Pr[∪_S A_S] ≤ Σ_S Pr[A_S] ≤ (n choose 2k) · Pr[A_S]
The probabilistic method!
The parameters
• Set m = e^{2k} log (n choose 2k). Then
Pr[A_S] ≤ (1 - e^{-2k})^m = (1 - e^{-2k})^{e^{2k} log (n choose 2k)} ≈ 1/(n choose 2k)
• This means the communication complexity is 2k + log m ≈ 2k + 2k + log(2k log n) ∈ O(k + log log n)
• Classical constructions: Fredman, Komlós and Szemerédi
• A more modern one: Pugh
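The union-bound calculation can be checked numerically for a concrete (n, k); the sketch below (illustrative, not from the slides) uses the exact success probability (2k)!/(2k)^{2k} and natural logarithms.

```python
import math

n, k = 100, 3
# Pr[a random h: {1..n} -> {1..2k} is 1-1 on a fixed S of size 2k]
p = math.factorial(2 * k) / (2 * k) ** (2 * k)
# The slide's choice of family size: m = e^{2k} * log C(n, 2k)
m = math.ceil(math.exp(2 * k) * math.log(math.comb(n, 2 * k)))
# Union bound over all C(n, 2k) sets S of size 2k
failure = math.comb(n, 2 * k) * (1 - p) ** m
assert failure < 1  # so a perfect (n, 2k, 2k) family of size m exists
```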
Open Problem: is the rank lower bound tight?
Open Problem: is there a fixed c > 0 such that for all f: X × Y → {0,1}, D(f) is O(log^c rank(f))?
Not true for non-Boolean functions: f(x,y) = Σ_i x_i y_i (over the integers) is at least as hard as Inner Product over GF(2)
Probabilistic Communication Complexity
• Alice and Bob each have, in addition to their inputs, access to random strings of arbitrary length, r_A and r_B respectively.
A probabilistic protocol P over domain X × Y with range Z is a binary tree where
• Each internal node v is labeled with either a_v(x, r_A) or b_v(y, r_B)
• Each leaf is labeled with an element z ∈ Z
All probabilities are over the choice of r_A and r_B:
• P computes f with zero error if for all (x,y): Pr[P(x,y)=f(x,y)] = 1
• P computes f with error ε if for all (x,y): Pr[P(x,y)=f(x,y)] ≥ 1-ε
• For Boolean f, P computes f with one-sided error ε if for all (x,y) s.t. f(x,y)=0: Pr[P(x,y)=0] = 1, and for all (x,y) s.t. f(x,y)=1: Pr[P(x,y)=1] ≥ 1-ε
Measuring Probabilistic Communication Complexity
For input (x,y), the cost of protocol P on (x,y) can be taken as either
• the worst-case depth, or
• the average depth over r_A and r_B
Cost of a protocol: maximum cost over all inputs (x,y)
The appropriate measures of probabilistic communication complexity:
• R_0(f): minimum (over all protocols) of the average cost of a randomized protocol that computes f with zero error
• R_ε(f): minimum (over all protocols) of the worst-case cost of a randomized protocol that computes f with error ε
• Makes sense for any 0 < ε < ½; by convention R(f) = R_{1/3}(f)
• R^1_ε(f): minimum (over all protocols) of the worst-case cost of a randomized protocol that computes f with one-sided error ε
• Makes sense for any 0 < ε < 1; by convention R^1(f) = R^1_{1/2}(f)
Equality
• Idea: pick a family of hash functions H = {h | h: {0,1}^n → {1,…,m}} such that for all x ≠ y, for a random h ∈_R H: Pr[h(x)=h(y)] ≤ ε
Protocol:
Alice: pick a random h ∈_R H and send <h, h(x)>
Bob: compare h(x) to h(y) and announce the result
This is a one-sided error protocol with cost log|H| + log m
Constructing H:
Fact: over any field, two distinct polynomials of degree d agree on at most d points
Fix a prime q such that n² ≤ q ≤ 2n², and map x to a polynomial W_x of degree d = n/log q over GF(q) (the bits of x, in blocks of log q, are the coefficients)
H = {h_z | z ∈ GF(q)} where h_z(x) = W_x(z)
ε = d/q = n/(q log q) ≤ 1/(n log n)
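A sketch of the fingerprinting protocol in Python, simplified by taking the n bits of x directly as the coefficients of a degree-(n-1) polynomial over GF(q) rather than packing them into blocks of log q bits (so the error bound in this variant is (n-1)/q); function names are illustrative.

```python
import random

def fingerprint(bits, z, q):
    """Evaluate W_x(z) = sum_i bits[i] * z^i mod q, via Horner's rule."""
    acc = 0
    for b in reversed(bits):
        acc = (acc * z + b) % q
    return acc

def equality_protocol(x, y, q=10007, seed=None):
    """Alice picks a random point z, sends (z, W_x(z)); Bob compares."""
    rng = random.Random(seed)
    z = rng.randrange(q)
    return fingerprint(x, z, q) == fingerprint(y, z, q)

x = [1, 0, 1, 1]
assert equality_protocol(x, list(x))  # x == y: always accepts (one-sided error)
# x != y in the constant coefficient: the difference polynomial is the
# nonzero constant 1, so every evaluation point rejects.
y = [0, 0, 1, 1]
assert not any(equality_protocol(x, y, seed=s) for s in range(20))
```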
Relationship between the measures
Error reduction (Boolean):
• for one-sided error: k repetitions reduce the error from ε to ε^k, hence R^1_{ε^k}(f) ≤ k·R^1_ε(f)
• for two-sided error: k repetitions followed by a majority vote reduce the error, by a Chernoff bound:
if Pr[x_i = 1] = 1/2 - ε then Pr[Σ_{i=1}^k x_i > k/2] ≤ e^{-2ε²k}
Derandomization: R(f) = R_{1/3}(f) ∈ Ω(log D(f))
General idea: find a small collection of assignments to the random strings on which the protocol behaves similarly.
Problem: to make this work for all pairs of inputs one would need ~n repetitions.
Instead: jointly evaluate, for each leaf ℓ, the probability of reaching it on the given input:
P_ℓ[x,y] = P^A_ℓ[x] · P^B_ℓ[y]
where P^A_ℓ[x] is the probability that Alice's choices follow the path to ℓ (given that Bob follows it), and P^B_ℓ[y] likewise.
Alice computes and sends her factors for every leaf, each to sufficient accuracy; Bob computes the products and outputs accordingly.
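A small simulation (illustrative, not from the slides) of two-sided error reduction by repetition and majority vote, matching the Chernoff-bound argument above:

```python
import random
from collections import Counter

def noisy_answer(truth, eps, rng):
    """One run of a protocol that errs with probability eps."""
    return truth if rng.random() > eps else 1 - truth

def majority_of_k(truth, eps, k, rng):
    """Repeat k times and output the majority vote (k odd, so no ties)."""
    votes = Counter(noisy_answer(truth, eps, rng) for _ in range(k))
    return votes.most_common(1)[0][0]

rng = random.Random(0)
trials = 2000
base_err = sum(noisy_answer(1, 1/3, rng) != 1 for _ in range(trials)) / trials
amp_err = sum(majority_of_k(1, 1/3, 15, rng) != 1 for _ in range(trials)) / trials
# Chernoff: the majority's error drops exponentially in k.
assert amp_err < base_err
```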
Public coins model
• What if Alice and Bob have access to a joint (public) source of random bits?
Possible view: a distribution over deterministic protocols.
Let R^pub_ε(f) be the minimum cost of a public-coin protocol computing f correctly with probability at least 1-ε on every input (x,y).
Example: R^pub_ε(Equality) = O(log 1/ε)
Theorem: for any Boolean f: R_{ε+δ}(f) ≤ R^pub_ε(f) + O(log n + log 1/δ)
Proof: choose t = 8n/δ² assignments to the public string…
Simulating large sample spaces
• Goal: among all possible public random strings, find a small collection on which the protocol behaves similarly on ALL inputs (the collection should resemble the true success probability everywhere)
• Choose t random strings
• For input (x,y), let A_{x,y} be the event that more than an (ε+δ) fraction of the t strings make the protocol fail
Pr[A_{x,y}] ≤ e^{-2δ²t} < 2^{-2n}
Pr[∪_{x,y} A_{x,y}] ≤ Σ_{x,y} Pr[A_{x,y}] < 2^{2n} · 2^{-2n} = 1
Distributional Complexity
Let μ be a probability distribution on X × Y and ε > 0. The (μ,ε)-distributional complexity D^μ_ε(f) is the cost of the best deterministic protocol that is correct on a 1-ε fraction of the inputs, weighted by μ.
Theorem: for any f: R^pub_ε(f) = max_μ D^μ_ε(f)
Von Neumann's Minimax Theorem: for all matrices M: max_p min_q pᵀMq = min_q max_p pᵀMq
Here M is indexed by deterministic protocols of depth d on one side and by inputs on the other; an entry records whether the given protocol is correct on the given input.
Discrepancy and distributional complexity
For f: X × Y → {0,1}, a rectangle R, and a probability distribution μ on X × Y, let
Disc_μ(R,f) = |Pr_μ[f(x,y)=1 and (x,y) ∈ R] - Pr_μ[f(x,y)=0 and (x,y) ∈ R]|
Disc_μ(f) = max_R Disc_μ(R,f)
Theorem: for f: X × Y → {0,1}, a distribution μ on X × Y, and ε > 0:
D^μ_{1/2-ε}(f) ≥ log(2ε / Disc_μ(f))
Inner Product
• Let x, y ∈ {0,1}^n and IP(x,y) = Σ_{i=1}^n x_i y_i mod 2
Theorem: R(IP) is Ω(n); more precisely, R^pub_{1/2-ε}(IP) ≥ n/2 - log(1/ε)
Show that Disc_uniform(IP) ≤ 2^{-n/2}. Therefore D^uniform_{1/2-ε}(IP) ≥ n/2 - log(1/ε), and hence R^pub_{1/2-ε}(IP) ≥ n/2 - log(1/ε)
Let H(x,y) = (-1)^{IP(x,y)}
Claim: ||H|| = √(2^n), since H Hᵀ = 2^n I
For a rectangle S × T:
Disc_uniform(S × T, IP) = (1/2^{2n}) |Σ_{x∈S, y∈T} H(x,y)| = (1/2^{2n}) |1_Sᵀ H 1_T|
By Cauchy-Schwarz: |1_Sᵀ H 1_T| ≤ ||1_S|| · ||H|| · ||1_T|| = √|S| · √(2^n) · √|T| ≤ 2^{n/2} · 2^{n/2} · 2^{n/2}
(recall ||A|| = max_{||v||=1} ||Av|| = the square root of the largest eigenvalue of AAᵀ)
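The discrepancy bound can be verified exhaustively for a tiny n (here n = 2, so all 15 × 15 pairs of non-empty row/column subsets are enumerated); an illustrative check, not part of the proof:

```python
from itertools import chain, combinations, product

n = 2
pts = list(product((0, 1), repeat=n))
N = len(pts)  # 2^n

def ip(x, y):
    return sum(a * b for a, b in zip(x, y)) % 2

H = [[(-1) ** ip(x, y) for y in pts] for x in pts]

# Rows of H are orthogonal: H H^T = 2^n I, hence ||H|| = sqrt(2^n).
for i in range(N):
    for j in range(N):
        assert sum(H[i][k] * H[j][k] for k in range(N)) == (N if i == j else 0)

def nonempty_subsets(r):
    return chain.from_iterable(combinations(range(r), s) for s in range(1, r + 1))

# Disc_uniform(S x T, IP) = |sum of H over S x T| / 2^{2n}; take the max.
disc = max(abs(sum(H[i][j] for i in S for j in T))
           for S in nonempty_subsets(N) for T in nonempty_subsets(N)) / N ** 2
assert disc <= 2 ** (-n / 2) + 1e-12
```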
Summary of Communication Complexity Classes
• For a `robust' class of functions
• closed under composition
• example: polylog communication
• N is not equal to co-N
• P = N ∩ co-N
• BP is not contained in N ∩ co-N
Summary of techniques and ideas
• Probabilistic method: to show that a combinatorial object exists, choose a random one and show that it satisfies the desired properties with probability > 0
• Constructive vs. non-constructive methods
• Union bound: for a collection of bad events A_1, A_2, …, A_n:
Pr[∪_i A_i] ≤ Σ_i Pr[A_i] ≤ n · max_i Pr[A_i]
• Simulating large sample spaces by small ones
• Reversing roles via the Minimax Theorem