Introduction to PCP Slides by Asaf Shapira & Michael Lewin & Boaz Klartag & Oded Schwartz. Adapted from things beyond us.
Introduction In this lecture we’ll cover: • The definition of PCP • Proofs of some classical inapproximability results • A review of some more recent ones
Review: Decision, Optimization Problems • A decision problem is a Boolean function ƒ(X), or alternatively a language L ⊆ {0, 1}* comprising all strings for which ƒ is TRUE: L = { X ∈ {0, 1}* | ƒ(X) } • An optimization problem is a function ƒ(X, Y) which, given X, is to be maximized (or minimized) over all possible Y’s: maxY[ ƒ(X, Y) ] • A threshold version of max-ƒ(X, Y) is the language Lt of all strings X for which there exists Y such that ƒ(X, Y) ≥ t (transforming an optimization problem into a decision problem)
Review: The Class NP The classical definition of the class NP is as follows: We say that a language L ⊆ {0, 1}* belongs to the class NP, if there exists a Turing machine VL [referred to as a verifier] such that X ∈ L ⟺ there exists a witness Y such that VL(X, Y) accepts, in time |X|^O(1) That is, VL can verify a membership proof of X in L in time polynomial in the length of X
Review: NP-Hardness • A language L is said to be NP-hard if an efficient (polynomial-time) procedure for L can be utilized to obtain an efficient procedure for any NP language • This definition allows the more general Cook reductions, which may invoke the procedure for L many times. An efficient algorithm translating any NP problem to a single instance of L, thereby showing L NP-hard, is referred to as a Karp reduction.
Review: Characterizing NP Thm [Cook, Levin]: For any L ∈ NP there is an algorithm that, on input X, constructs in time |X|^O(1) a set of Boolean functions, local-tests ΦL,X = { φ1, ..., φl } over variables y1,...,ym s.t.: • each of φ1, ..., φl depends on O(1) variables • and X ∈ L ⟺ there exists an assignment A: { y1, ..., ym } → { 0, 1 } satisfying all of φ1, ..., φl [ note that m and l must be at most polynomial in |X| ]
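A minimal sketch (not from the slides) of what a set of local tests looks like: each test reads O(1) variables, here 3, as in 3-SAT clauses. The sample formula and helper names are illustrative assumptions.

```python
# Local tests a la Cook-Levin: each test touches O(1) variables.
# Here every test is a 3-SAT clause over Boolean variables y0..y2.

def make_clause_test(lits):
    """lits: list of (var_index, wanted_bool); passes if some literal holds."""
    def test(assignment):
        return any(assignment[v] == b for v, b in lits)
    return test

# (y0 or not y1) and (y1 or y2) and (not y0 or not y2)
local_tests = [
    make_clause_test([(0, True), (1, False)]),
    make_clause_test([(1, True), (2, True)]),
    make_clause_test([(0, False), (2, False)]),
]

def satisfies_all(assignment, tests):
    """X is in L iff some assignment makes this True."""
    return all(t(assignment) for t in tests)
```

A witness here is simply an assignment making every local test pass.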
Approximation - Some definitions Definition: g-approximation A g-approximation of a maximization (similar for minimization) function f is an algorithm that on input X outputs f’(X) such that: f’(X) ≥ f(X)/g(|X|). Definition: PTAS (polynomial time approximation scheme) We say that a maximization function f has a PTAS if, for every g > 1, there is a polynomial pg and a g-approximation for f whose running time is pg(|X|).
Approximation - NP-hard? • We know that by using Cook/Karp reductions, we can show many decision problems to be NP-hard. • Can an approximation problem be NP-hard? • One can easily show that if there is any constant g for which there is a g-approximation algorithm for TSP, then P = NP.
Strong, PCP Characterizations of NP Thm [AS, ALMSS]: For any L ∈ NP there is a polynomial-time algorithm that, on input X, outputs ΦL,X = { φ1, ..., φl } over y1,...,ym s.t. • each of φ1, ..., φl depends on O(1) variables • X ∈ L ⟹ ∃ assignment A: { y1, ..., ym } → { 0, 1 } satisfying all of ΦL,X • X ∉ L ⟹ ∀ assignments A: { y1, ..., ym } → { 0, 1 } satisfy < ½ fraction of ΦL,X
Probabilistically-Checkable-Proofs • Hence, the Cook-Levin theorem states that a verifier can efficiently verify membership proofs for any NP language • The PCP characterization of NP, in contrast, states that a membership proof can be verified probabilistically: • choose one local test at random, • access the small set of variables it depends on, • accept or reject accordingly • erroneously accepting a non-member only with small probability
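The three steps above can be sketched as code (a hedged illustration, not from the slides): local tests are (variable-indices, predicate) pairs, and `proof` plays the role of the assignment A. All names are assumptions made for the example.

```python
import random

def pcp_verify(local_tests, proof, rng=random):
    """One round of a PCP-style verifier."""
    idxs, pred = rng.choice(local_tests)   # choose one local test at random
    values = [proof[i] for i in idxs]      # read only O(1) positions of the proof
    return pred(values)                    # accept or reject accordingly
```

If the proof satisfies fewer than half the tests, one round rejects with probability above ½; repeating k independent rounds drives the error below 2^-k.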
Gap Problems • A gap problem is a maximization (or minimization) problem ƒ(X, Y) with two thresholds t1 > t2: X must be accepted if maxY[ ƒ(X, Y) ] ≥ t1 X must be rejected if maxY[ ƒ(X, Y) ] ≤ t2 other X’s may be accepted or rejected (don’t care) (almost a decision problem; relates to approximation)
Reducing gap-Problems to Approximation Problems • Using an efficient approximation algorithm for ƒ(X, Y) to within a factor g, one can efficiently solve the corresponding gap problem gap-ƒ(X, Y), as long as t1/t2 > g • Simply run the approximation algorithm. The outcome clearly determines which side of the gap the given input falls in. (Hence, proving a gap problem NP-hard translates to hardness of its approximation version, for appropriate factors)
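The argument above in a few lines (a sketch under stated assumptions: `approx` is a hypothetical g-approximation with guarantee f(X)/g ≤ approx(X) ≤ f(X)):

```python
def decide_gap(approx, X, t1, t2, g):
    """Decide gap-f with thresholds t1 > t2; valid whenever t1/t2 > g."""
    assert t1 / t2 > g
    out = approx(X)
    # If max f >= t1 then out >= t1/g > t2; if max f <= t2 then out <= t2.
    return out > t2   # True = accept (the t1 side), False = reject (the t2 side)
```

The two cases cannot overlap precisely because t1/g > t2, which is the t1/t2 > g condition.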
gap-SAT • Def: gap-SAT[D, V, ε] is as follows: • instance: a set Φ = { φ1, ..., φl } of Boolean functions (local tests) over variables y1,...,ym of range 2^V • locality: each of φ1, ..., φl depends on at most D variables • Maximum-Satisfied-Fraction is the largest fraction of Φ satisfied by an assignment A: { y1, ..., ym } → 2^V; if this fraction • = 1 accept • < ε reject • D, V and ε may be functions of l
The PCP Hierarchy Def: L ∈ PCP[ D, V, ε ] if L is efficiently reducible to gap-SAT[ D, V, ε ] • Thm [AS, ALMSS]: NP = PCP[ O(1), 1, ½ ] [ the PCP characterization theorem above ] • Thm [RaSa]: NP = PCP[ O(1), m, 2^-m ] for m = log^c n for some c > 0 • Thm [DFKRS]: NP = PCP[ O(1), m, 2^-m ] for m = log^c n for any c < 1 • Conjecture [BGLR]: NP = PCP[ O(1), m, 2^-m ] for m = log n
Optimal Characterization • One cannot expect the error probability to be less than exponentially small in the number of bits each local test reads • since a random assignment satisfies at least such a fraction of the local tests in expectation • One cannot hope for smaller than polynomially small (in l) error probability • since it would imply that, for non-members, less than one local test is satisfied; hence each local test, being rather easy to compute, completely determines the outcome [ the BGLR conjecture is hence optimal in that respect ]
Approximating MAX-CLIQUE is NP-hard We will reduce gap-SAT to gap-CLIQUE. Given an instance Φ = { φ1, ..., φl } of Boolean functions over variables y1,...,ym of range 2^V, where each of φ1, ..., φl depends on at most D variables, we must determine whether all the functions can be satisfied, or only a fraction less than ε. We will construct a graph GΦ such that it has a clique of size r ⟺ there exists an assignment satisfying r of the functions φ1, ..., φl.
Definition of GΦ For each φi, GΦ has a vertex for every satisfying assignment of φi [Figure: for each test φ1, ..., φi, ..., φl, the column of all assignments to φi’s variables, partitioned into those satisfying φi and those not satisfying φi]
Definition of GΦ Two vertices are connected iff their assignments are consistent [Figure: edges join vertices of different tests whose assignments agree on all shared variables; vertices giving different values to the same variable are NOT connected]
Properties of GΦ Lemma: ω(GΦ) = l ⟺ X ∈ L • Consider an assignment A satisfying Φ. For each i, consider A's restriction to φi’s variables. The corresponding l vertices form a clique in GΦ. • Conversely, any clique of size m in GΦ implies an assignment satisfying m of φ1, ..., φl
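A minimal sketch (not from the slides) of the construction of GΦ described above; local tests are (variable-indices, predicate) pairs, and every function and variable name here is an illustrative assumption.

```python
from itertools import combinations, product

def build_G(local_tests, domain=(0, 1)):
    """FGLSS-style graph: vertices are satisfying partial assignments,
    edges join consistent vertices of different tests."""
    vertices = []                       # (test index, partial assignment dict)
    for i, (idxs, pred) in enumerate(local_tests):
        for vals in product(domain, repeat=len(idxs)):
            if pred(vals):
                vertices.append((i, dict(zip(idxs, vals))))
    edges = set()
    for a, b in combinations(range(len(vertices)), 2):
        ia, pa = vertices[a]
        ib, pb = vertices[b]
        consistent = all(pa[v] == pb[v] for v in pa if v in pb)
        if ia != ib and consistent:     # consistent assignments, different tests
            edges.add((a, b))
    return vertices, edges
```

A clique picks one satisfying assignment per test, all mutually consistent, which is exactly a global assignment satisfying that many tests.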
Hardness of approximation of Max-Clique Each of the following theorems gives a hardness of approximation result for Max-Clique: • Thm [AS, ALMSS]: NP = PCP[ O(1), 1, ½ ] • Thm [RaSa]: NP = PCP[ O(1), m, 2^-m ] for m = log^c n for some c > 0 • Thm [DFKRS]: NP = PCP[ O(1), m, 2^-m ] for m = log^c n for any c < 1 • Conjecture [BGLR]: NP = PCP[ O(1), m, 2^-m ] for m = log n
Hardness of approximation of Max-3SAT We will show that if Life Is Meaningful (P ≠ NP), Max-3SAT does not have a PTAS. Given an instance of gap-SAT, Φ = { φ1, ..., φl }, we will transform each of the φi’s into a 3-SAT expression ψi. • As each of the φi’s depends on up to D variables, the equivalent ψi expressions require exp(D) clauses. Since D = O(1), we still remain with a blow-up of O(1). • We define the equivalent 3-SAT expression to be Ψ = ψ1 ∧ ... ∧ ψl • The number of clauses in Ψ is ≤ exp(D)·l
Hardness of approximation of Max-3SAT • If X ∈ L then there is an assignment satisfying all l Boolean functions of Φ. Such an assignment satisfies all clauses of Ψ. • If X ∉ L then no assignment satisfies more than εl Boolean functions of Φ. Therefore no assignment satisfies more than |Ψ| - (1-ε)l clauses. • Therefore solving gap-3SAT with thresholds t1 = 1 and t2 = 1 - (1-ε)l/|Ψ| ≤ 1 - (1-ε)/exp(D) is NP-hard. • We conclude that there can be no PTAS for Max-3SAT. • In fact, gap-3SAT is NP-hard with thresholds 1 and 7/8+ε, yet can be solved with thresholds 1 and 7/8.
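Why thresholds 1 and 7/8 are solvable: a uniformly random assignment satisfies each width-3 clause with probability 7/8, and the method of conditional expectations turns this into a deterministic assignment achieving at least 7/8 of the clauses. A minimal sketch (not from the slides; clause encoding as in the examples above):

```python
from fractions import Fraction

def expected_satisfied(clauses, partial):
    """Expected number of satisfied clauses when unset variables are uniform."""
    total = Fraction(0)
    for cl in clauses:
        p_unsat = Fraction(1)
        for v, b in cl:
            if v in partial:
                if partial[v] == b:      # literal already true: clause satisfied
                    p_unsat = Fraction(0)
                    break
                # literal already false: it contributes factor 1 to p_unsat
            else:
                p_unsat /= 2             # unset literal is false w.p. 1/2
        total += 1 - p_unsat
    return total

def seven_eighths_assignment(clauses, n):
    """Fix variables one by one, never letting the expectation drop."""
    partial = {}
    for v in range(n):
        partial[v] = True
        e_true = expected_satisfied(clauses, partial)
        partial[v] = False
        e_false = expected_satisfied(clauses, partial)
        partial[v] = e_true >= e_false
    return partial
```

Since the initial expectation is (7/8)·|Ψ| for width-3 clauses and never decreases, the final assignment meets the 7/8 threshold.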
More Results Related to PCP The PCP theorem has ushered in a new era of hardness of approximation results. Here we list a few: • We showed that Max-Clique (and equivalently Max-Independent-Set) does not have a PTAS. It is known, in addition, that approximating it within a factor of n^(1-ε) is hard unless co-RP = NP. • Chromatic Number - NP-hard to approximate within a factor of n^(1-ε) unless co-RP = NP. There is a simple reduction from Max-Clique which shows that it is NP-hard to approximate within factor n^ε. • Chromatic Number for 3-colorable graphs - NP-hard to approximate within factor 5/3 - ε (i.e., to differentiate between 4 and 3 colors). Can be approximated within O(n^(1/4) log^O(1) n).
More Results Related to PCP • Set Cover - can be approximated within a factor of ln n (greedy). Cannot be approximated within factor (1-ε) ln n unless NP ⊆ DTIME(n^(log log n)).
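The classical greedy algorithm matching the ln n upper bound cited above, as a short sketch (not from the slides; it assumes the given sets do cover the universe):

```python
def greedy_set_cover(universe, sets):
    """Repeatedly pick the set covering the most uncovered elements.
    Uses at most (ln n + 1) times the optimal number of sets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        i = max(range(len(sets)), key=lambda j: len(sets[j] & uncovered))
        chosen.append(i)
        uncovered -= sets[i]
    return chosen
```

Feige’s result says this simple algorithm is essentially optimal: beating (1-ε) ln n would collapse NP into quasi-polynomial deterministic time.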
More Results Related to PCP • Maximum Satisfying Linear Sub-System - The problem: given a linear system Ax = b (A an n x m matrix) over a field F, find the largest number of equations that can be satisfied by some x. • If all equations can be satisfied, the problem is in P (Gaussian elimination). • If F = Q: NP-hard to approximate within factor m^ε. Can be approximated within O(m/log m). • If F = GF(q): can be approximated within factor q (even a random assignment gives such a factor in expectation). NP-hard to approximate within q - ε. Also NP-hard for equations with only 3 variables. • For equations with only 2 variables: NP-hard to approximate within 1.0909, but can be approximated within 1.383.
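The first bullet above, that full satisfiability is in P, can be sketched over GF(2) (a hedged illustration, not from the slides): Gaussian elimination decides whether Ax = b has any solution.

```python
def gf2_solvable(A, b):
    """Decide whether Ax = b is consistent over GF(2).
    A: list of 0/1 rows, b: list of 0/1 right-hand sides."""
    rows = [(row[:], bi) for row, bi in zip(A, b)]
    ncols = len(A[0])
    r = 0                                        # rank so far
    for col in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][0][col]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]  # bring pivot row up
        for i in range(len(rows)):
            if i != r and rows[i][0][col]:       # eliminate column elsewhere
                rows[i] = ([x ^ y for x, y in zip(rows[i][0], rows[r][0])],
                           rows[i][1] ^ rows[r][1])
        r += 1
    # Inconsistent iff some all-zero row has a nonzero right-hand side.
    return all(any(row) or bi == 0 for row, bi in rows)
```

When the system is inconsistent, Gaussian elimination tells us nothing about the maximum satisfiable subset; that is exactly where the hardness results above apply.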