Tight Hardness Results for Some Approximation Problems [mostly Håstad]

Adi Akavia, Dana Moshkovitz, S. Safra

“Road-Map” for Chapter I: Gap-3-SAT → (expanders) → Gap-3-SAT-7 → (parallel repetition lemma) → par[φ, k] → X→Y is NP-hard

Maximum Satisfaction
Def: Max-SAT
• Instance:
  • A set of variables Y = { Y1, …, Ym }
  • A set of Boolean functions (local-tests) over Y: Φ = { φ1, …, φl }
• Maximization: we define Υ(Φ) = the maximum, over all assignments to Y, of the fraction of the φi that are satisfied
• Structure: various versions of SAT impose structural properties on Y, on Y's range, and on Φ
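The maximization in this definition can be made concrete with a brute-force sketch. The function name `max_sat_value` and the toy instance are illustrative assumptions, not part of the slides; the enumeration is exponential and only meant for tiny instances.

```python
from itertools import product

def max_sat_value(num_vars, tests):
    """Maximum, over all Boolean assignments, of the fraction of local tests satisfied.

    Each test is a function taking an assignment tuple and returning a bool.
    """
    best = 0.0
    for assignment in product([False, True], repeat=num_vars):
        satisfied = sum(1 for t in tests if t(assignment))
        best = max(best, satisfied / len(tests))
    return best

# Hypothetical toy instance over Y1, Y2: the four tests cannot all hold at once.
toy_tests = [
    lambda a: a[0] or a[1],
    lambda a: a[0] or not a[1],
    lambda a: (not a[0]) or a[1],
    lambda a: (not a[0]) and (not a[1]),
]
toy_value = max_sat_value(2, toy_tests)
```

Here at most three of the four tests can be satisfied simultaneously, so the value of the toy instance is 3/4.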

Max-E3-Lin-2
Def: Max-E3-Lin-2
• Instance: a system of linear equations L = { E1, …, En } over Z2, each equation over exactly 3 variables (whose sum is required to equal either 0 or 1)
• Problem: compute Υ(L), the maximum fraction of equations that can be satisfied simultaneously

2-Variables Functional SAT
Def [X→Y]: Φ over
• variables X, Y of ranges R_X, R_Y respectively
• each φ ∈ Φ is of the form φ_{x→y}: R_X → R_Y
An assignment A (mapping each x into R_X and each y into R_Y) satisfies φ_{x→y} iff φ_{x→y}(A(x)) = A(y)
[Namely, every value to x determines exactly 1 satisfying value for y]
Thm: distinguishing between
• ∃A that satisfies all of Φ
• ∀A satisfies < ε fraction of Φ
is NP-hard as long as |R_X|, |R_Y| > ε^{-O(1)}

Proof Outline
Def: 3SAT is SAT where every φi is a disjunction of 3 literals.
Def: gap-3SAT-7 is gap-3SAT with the additional restriction that every variable appears in exactly 7 local-tests.
Theorem: gap-3SAT-7 is NP-hard.

Expanders
Def: a graph G(V,E) is a c-expander if for every S ⊆ V with |S| ≤ ½|V|: |N(S)\S| ≥ c·|S| [where N(S) denotes the set of neighbors of S]
Lemma: For every m, one can construct in poly-time a 3-regular, m-vertex c-expander, for some constant c > 0
Corollary: a cut between S and V\S, for |S| ≤ ½|V|, must contain at least c·|S| edges
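To illustrate the expander definition, the following brute-force sketch computes the smallest ratio |N(S)\S| / |S| over all non-empty S with |S| ≤ ½|V|. The function names and the two toy graphs are assumptions for illustration; the slides rely on an explicit poly-time construction, which this sketch does not provide.

```python
from itertools import combinations

def vertex_expansion(vertices, edges):
    """Smallest |N(S) minus S| / |S| over non-empty S with |S| <= |V|/2 (exponential time)."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    worst = float("inf")
    for size in range(1, len(vertices) // 2 + 1):
        for subset in combinations(vertices, size):
            s = set(subset)
            boundary = set().union(*(neighbors[v] for v in s)) - s
            worst = min(worst, len(boundary) / len(s))
    return worst

def is_c_expander(vertices, edges, c):
    """True iff every small set S has at least c*|S| neighbors outside S."""
    return vertex_expansion(vertices, edges) >= c

# Toy graphs: K4 (3-regular) and a 6-cycle (2-regular, weaker expansion).
k4_vertices = [0, 1, 2, 3]
k4_edges = list(combinations(k4_vertices, 2))
cycle_vertices = list(range(6))
cycle_edges = [(i, (i + 1) % 6) for i in range(6)]
```

In K4 the worst sets are pairs of vertices, which have exactly two outside neighbors, so the expansion is 1; in the 6-cycle three consecutive vertices have only two outside neighbors.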

Reduction Using Expanders
Assume φ' for which Υ(φ') is either 1 or less than 1 − 20ε/c. φ is φ' with the following changes:
• an occurrence of y in φ'i is replaced by a variable x_{y,i}
• Let G_y, for every y, be a 3-regular c-expander over all occurrences x_{y,i} of y [constructible by the Lemma]
• For every edge connecting x_{y,i} to x_{y,j} in G_y, add to φ the clauses (x_{y,i} ∨ ¬x_{y,j}) and (¬x_{y,i} ∨ x_{y,j}) [together these ensure equality of x_{y,i} and x_{y,j}]
It is easy to see that:
• |φ| ≤ 10·|φ'|
• Each variable x_{y,i} of φ appears in exactly 7 clauses φi ∈ φ
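The bookkeeping of this reduction (one fresh variable per occurrence, two equality clauses per graph edge) can be sketched as follows. The helper `three_regular_graph` is a hypothetical stand-in: it returns a cycle plus an antipodal matching, which is 3-regular but is NOT guaranteed to be a c-expander, so the sketch only illustrates the counting claims, not the soundness argument. Literals are signed integers, and new variables are written as pairs (y, i).

```python
from collections import defaultdict

def three_regular_graph(n):
    """Hypothetical stand-in for the expander of the Lemma: cycle plus antipodal matching.

    3-regular for even n >= 4, but not necessarily a good expander.
    """
    if n < 4 or n % 2:
        raise ValueError("need an even number of occurrences, at least 4")
    edges = {frozenset((i, (i + 1) % n)) for i in range(n)}
    edges |= {frozenset((i, i + n // 2)) for i in range(n // 2)}
    return [tuple(sorted(e)) for e in edges]

def spread_occurrences(clauses):
    """Replace each occurrence of y in clause i by (y, i) and add equality clauses."""
    occurrences = defaultdict(list)  # y -> indices of the clauses containing y
    for i, clause in enumerate(clauses):
        for lit in clause:
            occurrences[abs(lit)].append(i)
    # A literal is (is_positive, variable); the variable is the pair (y, i).
    new_clauses = [tuple((lit > 0, (abs(lit), i)) for lit in clause)
                   for i, clause in enumerate(clauses)]
    for y, occ in occurrences.items():
        for a, b in three_regular_graph(len(occ)):
            u, v = (y, occ[a]), (y, occ[b])
            new_clauses.append(((True, u), (False, v)))   # u or not v
            new_clauses.append(((False, u), (True, v)))   # not u or v
    return new_clauses

# Toy input in which every variable occurs exactly 4 times.
toy_clauses = [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3)]
new_formula = spread_occurrences(toy_clauses)
```

On the toy input each new variable sits in its original clause plus 3 edges times 2 clauses, and the formula grows by exactly a factor of 10, matching the two counting bullets.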

Correctness of the Reduction
• φ is completely satisfiable iff φ' is
• In case φ' is unsatisfiable: Υ(φ') < 1 − 20ε/c
  Let A be an optimal assignment to φ
  Let A_maj assign x_{y,i} the value assigned by A to the majority, over j, of the variables x_{y,j}
  Let F_A and F_Amaj be the sets of clauses of φ unsatisfied by A and A_maj respectively:
  |φ|·(1 − Υ(φ)) = |F_A| = |F_A ∩ F_Amaj| + |F_A \ F_Amaj| ≥ |F_A ∩ F_Amaj| + ½c·|F_Amaj \ F_A| ≥ ½c·|F_Amaj|
  and since A_maj is in fact an assignment to φ':
  Υ(φ) ≤ 1 − ½c·(1 − Υ(φ'))/10 < 1 − ½c·(20ε/c)/10 = 1 − ε

Notations
Def: For a 3SAT formula φ over Boolean variables Z,
• Let Z^k be the set of all k-sequences of φ's variables
• Let φ^k be the set of all k-sequences of φ's clauses
Def: For any V ∈ Z^k and C ∈ φ^k, let
• R_Y be the set of all assignments to V
• R_X be the set of all satisfying assignments to C
Def: For any sequence of k variables V ∈ Z^k and any sequence of k clauses C ∈ φ^k, denote V ◁ C if V is a choice of one variable of each clause in C.

Parallel SAT
Def: for a 3SAT formula φ over Boolean variables Z, define par[φ, k]:
par[φ, k] has two types of variables:
• y_V for every V ∈ Z^k, where y_V's range is the set R_Y of all assignments to V (|R_Y| = 2^k)
• x_C for every C ∈ φ^k, where x_C's range is the set R_X of all satisfying assignments to all clauses in C (|R_X| = 7^k)
par[φ, k] has a local-test φ[C,V] for each V that is a choice of one variable of each clause in C, which accepts if x_C's value restricted to V is y_V's value (namely, if the assignments to T[C] and T[V] are consistent)
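A small sketch of the two ranges and the consistency test may help. It assumes the k clauses are over pairwise disjoint variables, which is the case in which the count |R_X| = 7^k is exact; the function names and the example clauses are illustrative only.

```python
from itertools import product

def satisfying_assignments(clauses):
    """All assignments (as dicts) to the variables of `clauses` satisfying every clause."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    result = []
    for bits in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, bits))
        if all(any(a[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses):
            result.append(a)
    return result

def restrict(assignment, V):
    """Restriction of an assignment to the chosen variables V."""
    return tuple(assignment[v] for v in V)

def local_test(x_value, y_value, V):
    """Accept iff x_C's value restricted to V equals y_V's value."""
    return restrict(x_value, V) == y_value

C = [(1, 2, 3), (-4, 5, -6)]   # k = 2 clauses on disjoint variables (assumption)
V = (1, 5)                      # one variable chosen from each clause
R_X = satisfying_assignments(C)
R_Y = list(product([False, True], repeat=len(V)))
```

Every value of x_C is consistent with exactly one value of y_V, which is the functional X→Y structure used later.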

Gap Increases with k
Note that if Υ(φ) = 1 then Υ(par[φ, k]) = 1
On the other hand, if φ is not satisfiable:
Lemma: Υ(par[φ, k]) ≤ Υ(φ)^{c·k} for some c > 0
Proof: first note that 1 − Υ(par[φ, 1]) ≥ (1 − Υ(φ))/3
[In any assignment to φ's variables, any unsatisfied clause in φ “induces” at least 1 (out of the corresponding 3) unsatisfied local-test of par[φ, 1]]
Now, to prove the lemma, apply the Parallel-Repetition lemma [Raz] to par[φ, 1]

Conclusion: X→Y is NP-hard
• Denote:
  • Φ = par[φ, k]
  • X = { x_C }
  • Y = { y_V }
Then it is NP-hard to distinguish between:
• ∃A that satisfies all of Φ
• ∀A satisfies < ε fraction of Φ
as long as |R_X|, |R_Y| > ε^{-O(1)}

“Road-Map” for Chapter II: X→Y is NP-hard → (long code) → L_Φ
• Υ_Δ(Φ) = Υ(Φ), where Υ_Δ is the value under distributional assignments
• LLC-Lemma: Υ(L_Φ) = ½ + δ/2 ⇒ Υ(par[φ, k]) > 4εδ²

Main Theorem
Thm: gap-Max-E3-Lin-2(1−ε, ½+ε) is NP-hard.
That is, for every constant 0 < ε < ¼ it is NP-hard to distinguish between the case where at least 1−ε of the equations are satisfiable and the case where at most ½+ε are.
[It is therefore NP-hard to approximate Max-E3-Lin-2 to within factor 2−ε for any constant 0 < ε < ¼]

This bound is tight
• A random assignment satisfies half of the equations in expectation.
• Deciding whether a set of linear equations has a common solution is in P (Gaussian elimination).
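Both bullets can be checked on small systems. The sketch below, with assumed function names and equations given as (variables, right-hand side) pairs, decides consistency by Gaussian elimination over Z2 and computes the exact average fraction of equations satisfied by a uniformly random assignment.

```python
from itertools import product

def is_consistent_gf2(equations, num_vars):
    """Gaussian elimination over Z2; True iff all equations hold simultaneously.

    Each equation is (variables, rhs): the sum of the listed variables equals rhs mod 2.
    """
    rows = []
    for variables, rhs in equations:
        mask = 0
        for v in variables:
            mask ^= 1 << v          # bit v set iff variable v appears
        rows.append((mask, rhs))
    for col in range(num_vars):
        pivot = next((i for i, (m, _) in enumerate(rows) if (m >> col) & 1), None)
        if pivot is None:
            continue
        pm, pr = rows.pop(pivot)    # the pivot row can always be satisfied via variable col
        rows = [((m ^ pm, r ^ pr) if (m >> col) & 1 else (m, r)) for m, r in rows]
    # Every remaining row now reads 0 = rhs.
    return all(r == 0 for _, r in rows)

def average_satisfied_fraction(equations, num_vars):
    """Exact expected fraction of equations satisfied by a uniformly random assignment."""
    total = 0
    for bits in product([0, 1], repeat=num_vars):
        total += sum(1 for vs, rhs in equations if sum(bits[v] for v in vs) % 2 == rhs)
    return total / (2 ** num_vars * len(equations))

solvable = [((0, 1, 2), 0), ((0, 1, 3), 1), ((1, 2, 3), 1)]
contradictory = [((0, 1, 2), 0), ((0, 1, 2), 1)]
```

The contradictory system still has value ½, since any assignment satisfies exactly one of its two equations.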

XY isNP-hard Long code L  LLC-Lemma: (L) = ½+/2  (par[,k]) > 42 Distributional Assignments Let  be a SAT instance over variablesZ of range R. Let (R) be all distributions over R Def: a distributional-assignment to  is A: Z (R)Denote by () themaximum over distributional-assignments A of theaverage probability for  to be satisfied,if variables` values are chosen according to A Clearly()  (). Moreover Prop: ()  ()

Distributional-assignment to Φ
[Figure: two example rows of 0/1 values over the variables x1, x2, x3, …, xn]

Restriction and Extension
Def: For any y ∈ Y over R_Y and x ∈ X over R_X s.t. φ_{x→y} ∈ Φ
• The natural restriction of an a ∈ R_X to R_Y is denoted a|y
• The elevation of a subset F ∈ P[R_Y] to R_X is the subset F* ∈ P[R_X] of all members a of R_X for which φ_{x→y}(a) ∈ F: F* = { a | a|y ∈ F }

XY isNP-hard Long-Code Long code L In the long-code the set of legal-words consists of all monotone dictatorships This is the most extensive binary code, as its bits represent all possible binary values over n elements LLC-Lemma: (L) = ½+/2  (par[,k]) > 42

Long-Code
• Encoding an element e ∈ [n]:
• E_e legally-encodes an element e if E_e = f_e
[Figure: example row of truth values T F F T T]

Long-Code over Range R
BP[R] ≡ the set of all subsets of R of size ≤ ½|R|; |BP[R]| = 2^{|R|−1} − 1
• Our long-code: in our context there are two types of domains “R”: R_X and R_Y.
Def: an R-long-code has 1 bit for each F ∈ P[R]; namely, it is any Boolean function P[R] → {−1, 1}
Def: a legal long-code-word of an element e ∈ R is a long-code E^R_e: P[R] → {−1, 1} that assigns the truth value of e ∈ F to every subset F ∈ P[R]

Linearity of a Legal-Encoding
An assignment A: BP[R] → {−1, 1}, if legal, is a linear function, i.e., ∀ F, G ∈ BP[R]: A(F)·A(G) = A(FΔG)
Unfortunately, any character is linear as well!
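The linearity property, and the caveat that characters are linear too, can be checked exhaustively for a small range. This sketch is an illustration under two assumptions: it indexes code-words by all of P[R] rather than the folded table BP[R], and it uses the multiplicative convention True ↦ −1, False ↦ 1. The function names are not from the slides.

```python
from itertools import combinations

def all_subsets(R):
    """All subsets of R as frozensets."""
    return [frozenset(c) for r in range(len(R) + 1) for c in combinations(R, r)]

def legal_codeword(e, R):
    """Legal long-code-word of e: each subset F gets -1 if e is in F, else +1."""
    return {F: (-1 if e in F else 1) for F in all_subsets(R)}

def character(S, R):
    """The character of S: product over e in S of the legal code-words of e."""
    return {F: (-1) ** len(S & F) for F in all_subsets(R)}

def is_linear(A, R):
    """Check A(F) * A(G) == A(F symmetric-difference G) for all F, G."""
    subsets = all_subsets(R)
    return all(A[F] * A[G] == A[F ^ G] for F in subsets for G in subsets)

R = [0, 1, 2]
pair_character = character(frozenset({0, 1}), R)
```

The character of {0, 1} passes the linearity check although it is not the legal encoding of any single element, which is why a linearity test alone cannot certify legality.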

XY isNP-hard The Variables of L Long code L Consider  (xy) forlarge constant k (to befixed later) L has 2 types of variables: • a variable z[y,F] for every variable yY and a subset F  BP[Ry] • a variable z[x,F] for every variable xX and a subset F  BP[Rx] LLC-Lemma: (L) = ½+/2  (par[,k]) > 42

The Distribution μ_ε
Def: denote by μ_ε the distribution over all subsets of R_X which assigns probability to a subset H as follows: independently, for each a ∈ R_X, let
• a ∉ H with probability 1−ε
• a ∈ H with probability ε
One should think of μ_ε as a multiset of subsets in which every subset H appears with the appropriate probability
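The noise distribution is easy to sample and to evaluate exactly. In this sketch the names `sample_noise_set` and `noise_probability` are illustrative, and `eps` stands for the noise parameter (each element joins H independently with that probability).

```python
import random
from itertools import combinations

def sample_noise_set(R, eps, rng):
    """Draw H: each a in R joins H independently with probability eps."""
    return frozenset(a for a in R if rng.random() < eps)

def noise_probability(H, R, eps):
    """Exact probability that the distribution assigns to the subset H of R."""
    return eps ** len(H) * (1 - eps) ** (len(R) - len(H))

def all_subsets(R):
    return [frozenset(c) for r in range(len(R) + 1) for c in combinations(R, r)]

R_X = ["a1", "a2", "a3", "a4"]
rng = random.Random(0)                       # fixed seed for reproducibility
samples = [sample_noise_set(R_X, 0.1, rng) for _ in range(10000)]
frequency_a1 = sum(1 for H in samples if "a1" in H) / len(samples)
```

The probabilities of all subsets sum to 1, and the empirical frequency with which a fixed element lands in H should be close to the noise parameter.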

Linear equation
L_Φ's multiplicative equations are the union, over all φ_{x→y} ∈ Φ, of the following:
∀ F ∈ P[R_Y], G ∈ P[R_X] and H drawn from μ_ε over the subsets of R_X: z(y,F) · z(x,G) = z(x, F*ΔGΔH)

Multiplicative Representation
• True ↦ −1
• False ↦ 1
• L_Φ: z[X,*], z[Y,*] ∈ {−1, 1}, and each equation reads z[X,f] · z[Y,g] · z[X,'f•g•h'] = 1

XY isNP-hard L  Prop: if () = 1 then (L) = 1- Proof:Let A be a satisfying assignment to . Assign all variables of Laccording to the legal encoding of A’s values.A linear equation of L, corresponding to X,Y,F,G,H, would be unsatisfied exactly if xH, which occurs with probability  over the choice of H. LLC-Lemma: (L) = ½+/2() > 42 LLC-Lemma: (L) = ½+/2  (par[,k]) > 42 Note: independent of k! (Later we use that fact to define k large enough for our needs).  = 2(L) -1

Hardness of approximating Max-E3-Lin-2
Main Theorem: For any constant ε > 0: gap-Max-E3-Lin-2(1−ε, ½+ε) is NP-hard.
Proof: Let φ' be a gap-3SAT-7(1, 1−ε) instance. By the proposition, Υ(φ') = 1 ⇒ Υ(L_φ') ≥ 1−ε

Lemma  Main Theorem Prop:Let  be a constant >0 s.t.:(1-)/(½+/2)  2-Let k be large enough s.t.:43 > (’)c·kThen (’) < 1 (L’)  ½+/2  ½+  Proof: Assume, by way of contradiction, that (L)  ½+/2 then:43 > (’)c·k  () > 42, which implies that  > .Contradiction! of the parallel repetition lemma

Long-Code as an inner product space
Def: { A : BP[R] → {−1, 1} } is an inner-product space: ∀ A, B ∈ { A : BP[R] → {−1, 1} }

An Assignment σ to L_Φ
For any variable x ∈ X: the set z[x,*] of variables of L_Φ represents the long-code of x. Let x̂(S) be the Fourier coefficient ⟨σ|z[x,*], χ_S⟩.
For any variable y ∈ Y: the set z[y,*] of variables of L_Φ represents the long-code of y. Let ŷ(S) be the Fourier coefficient ⟨σ|z[y,*], χ_S⟩.

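The Distributional Assignment
Def: Let τ be a distributional-assignment to Φ as follows:
• For any variable x
  • Choose a set S ⊆ R_X with probability x̂(S)², the squared Fourier coefficient at χ_S of the assignment to z[x,*]
  • Uniformly choose a random assignment a ∈ S
• For any variable y
  • Choose a set S ⊆ R_Y with probability ŷ(S)², the squared Fourier coefficient at χ_S of the assignment to z[y,*]
  • Uniformly choose a random assignment b ∈ S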

Longcode and Fourier Coefficients
Auxiliary Lemmas:
1. For any F, G ∈ BP[R] and S ⊆ R: χ_S(F·G) = χ_S(F)·χ_S(G)
2. For any F ∈ BP[R] and S, S' ⊆ R: χ_S(F)·χ_S'(F) = χ_{SΔS'}(F)
3. For a random F (uniformly chosen) and S ≠ ∅: E[χ_S(F)] = 0, and E[χ_∅(F)] = 1
Proof notes, writing χ_S(f) = ∏_{x∈S} f(x):
1. apply multiplication's commutative and associative properties
2. χ_S(f)·χ_S'(f) = ∏_{x∈S} f(x) · ∏_{x∈S'} f(x) = ∏_{x∈S∩S'} f(x)² · ∏_{x∈SΔS'} f(x) = 1·χ_{SΔS'}(f)
3. for x ∈ S, f(x) is 1 or −1 with probability ½
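The three auxiliary lemmas can be confirmed by exhaustive enumeration over a small range. This sketch assumes the function view of the code (each position is a function f: R → {−1, 1}, unfolded), with the character of S defined as the product of f(x) over x in S; the helper names are illustrative.

```python
from itertools import combinations, product

def all_functions(R):
    """All functions f: R -> {-1, 1}, each represented as a dict."""
    return [dict(zip(R, values)) for values in product([1, -1], repeat=len(R))]

def all_subsets(R):
    return [frozenset(c) for r in range(len(R) + 1) for c in combinations(R, r)]

def chi(S, f):
    """Character of S evaluated at f: the product of f(x) over x in S."""
    result = 1
    for x in S:
        result *= f[x]
    return result

def multiply(f, g):
    """Pointwise product of two functions."""
    return {x: f[x] * g[x] for x in f}

def check_auxiliary_lemmas(R):
    functions, subsets = all_functions(R), all_subsets(R)
    lemma1 = all(chi(S, multiply(f, g)) == chi(S, f) * chi(S, g)
                 for S in subsets for f in functions for g in functions)
    lemma2 = all(chi(S, f) * chi(T, f) == chi(S ^ T, f)
                 for S in subsets for T in subsets for f in functions)
    lemma3 = all(sum(chi(S, f) for f in functions) == (len(functions) if not S else 0)
                 for S in subsets)
    return lemma1 and lemma2 and lemma3
```

Lemma 3 is checked by summing each character over all functions: the sum is zero unless S is empty, in which case every term is 1.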

Home Assignment
• Given an assignment to a long-code A: BP[R] → {−1, 1}, show that for any (constant) δ > 0 there is a constant h(δ), which depends on δ but does not depend on R, such that:
|{ e ∈ R | Δ(E_e, A) > ½ + δ }| ≤ h(δ)
where Δ(A1, A2) is the fraction of bits on which A1 and A2 differ.

What’s Ahead: We show that the distributional assignment's expected success on φ_{x→y} is > 4εδ² in two steps: first we show (Claim 1) a lower bound on its success probability for any φ_{x→y}; then we show (Claim 3) that this bound is at least 4εδ².

Claim 1
Claim 1: The success probability of the distributional assignment on φ_{x→y} ∈ Φ is
Proof: That success probability is at least
and if S' = S|y there is at least one b ∈ S s.t. b|y ∈ S'. So the success probability is at least |S|^{-1} times the probability of the case in which the chosen S' and S satisfy S|y = S', i.e. at least

Lemma’s Proof - Claim 2 (1)
Claim 2:
Proof: The test accepts iff z[y,F]•z[x,G]•z[x,F*•G•H] = 1. By our assumption, this happens with probability ½ + δ/2. Now, according to the definition of the expectation:
E_{φ_{x→y}, F, G, H}[ z[y,F]•z[x,G]•z[x,F*•G•H] ] = 1•(½ + δ/2) + (−1)•(1 − (½ + δ/2)) = δ

Lemma’s Proof - Claim 2 (2) We next show that Hence,

Lemma’s Proof - Proposition

Lemma’s Proof - Claim 3
Claim 3: The expected success of the distributional assignment on φ_{x→y} ∈ Φ is at least 4εδ²
Proof: Claim 1 gives us the initial lower bound for the expected success:

Lemma’s Proof - Claim 3 As we’ve already seen, . Hence, our lower-bound takes the form of Or alternatively, Which allows us to use the known inequality E[x²] ≥ E[x]² and get

Lemma’s Proof - Claim 3 By auxiliary lemmas, (4ε|S|)^{-1/2} ≥ e^{-2ε|S|} ≥ (1−2ε)^{|S|}, i.e. |S|^{-1/2} ≥ (4ε)^{1/2}·(1−2ε)^{|S|}, which yields the following bound That is, Now applying Claim 2 results in the desired lower bound
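The inequality chain used in this step can be sanity-checked numerically. The sketch below, with an assumed function name and with `eps` and `size` standing for the noise parameter and the set size, evaluates the three expressions on a grid; it is a numeric check, not a proof (the chain follows from x^{-1/2} ≥ e^{-x/2} for x > 0 and e^{-t} ≥ 1 − t).

```python
import math

def inequality_chain_holds(eps, size):
    """Check (4*eps*size)^(-1/2) >= exp(-2*eps*size) >= (1 - 2*eps)^size."""
    left = (4 * eps * size) ** -0.5
    middle = math.exp(-2 * eps * size)
    right = (1 - 2 * eps) ** size
    return left >= middle >= right

def rearranged_bound_holds(eps, size):
    """Check the rearranged form size^(-1/2) >= (4*eps)^(1/2) * (1 - 2*eps)^size."""
    return size ** -0.5 >= (4 * eps) ** 0.5 * (1 - 2 * eps) ** size

grid = [(eps, size) for eps in (0.01, 0.05, 0.1, 0.2, 0.24) for size in range(1, 51)]
```

Multiplying the outer inequality by the square root of 4·eps gives the rearranged form, which is what lets the factor |S|^{-1} from Claim 1 be traded for powers of (1 − 2·eps).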

Lemma’s Proof - Conclusion
We showed that there is an assignment scheme with expected success of at least 4εδ²
⇒ There exists an assignment that satisfies at least 4εδ² of the tests in Φ
⇒ Υ(Φ) > 4εδ²
Q.E.D.

Home Assignment
Show it is NP-hard, for any ε > 0, given a 3SAT instance φ, to distinguish between the case where Υ(φ) = 1 and the case in which Υ(φ) < 7/8 + ε
Hint: Let φ's variables be as in L_Φ, and let φ's clauses take the form
F OR G OR F*•G•H
for F and G chosen in the same way as in L_Φ, while H is chosen as follows:
• H(b) = 1 for b such that F(b|V) and G(b) are both FALSE
• For all other b's, independently for each b, H(b) = 1 with probability ε, and −1 with probability 1−ε
