Fall 2010 The Chinese University of Hong Kong CSCI 3130: Formal languages and automata theory NP and NP-completeness Andrej Bogdanov http://www.cse.cuhk.edu.hk/~andrejb/csc3130
Some more problems
• A clique is a subset of vertices that are all interconnected: {1, 4}, {2, 3, 4}, {1} are cliques
• An independent set is a subset of vertices in which no pair is connected: {1, 2}, {1, 3}, {4} are independent sets; there is no independent set of size 3
• A vertex cover is a set of vertices that touches (covers) all edges: {2, 4}, {3, 4}, {1, 2, 3} are vertex covers
[Figure: graph G on vertices 1, 2, 3, 4]
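The three definitions can be written as small predicates. This is only a sketch: the edge list of G below is an assumption, inferred from the cliques and independent sets listed on the slide rather than read off the figure, and the function names are illustrative.

```python
from itertools import combinations

# Assumed edges of the 4-vertex example graph G (inferred, not given in the text).
EDGES = {frozenset(e) for e in [(1, 4), (2, 3), (2, 4), (3, 4)]}

def is_clique(edges, s):
    """Every pair of vertices in s is joined by an edge."""
    return all(frozenset(p) in edges for p in combinations(s, 2))

def is_independent_set(edges, s):
    """No pair of vertices in s is joined by an edge."""
    return not any(frozenset(p) in edges for p in combinations(s, 2))

def is_vertex_cover(edges, s):
    """Every edge has at least one endpoint in s."""
    return all(e & set(s) for e in edges)
```

With this assumed edge list, the sets named on the slide pass the corresponding checks, and no 3-vertex subset is independent.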
Boolean formula satisfiability
• A boolean formula is an expression made up of variables, ands, ors, and negations, like (x1 ∨ ¬x2) ∧ (x2 ∨ x3 ∨ ¬x4) ∧ (¬x1)
• The formula is satisfiable if one can assign values to the variables so the expression evaluates to true
• The above formula is satisfiable because this assignment makes it true: x1 = F, x2 = F, x3 = T, x4 = T
3SAT
• SAT = {〈f〉: f is a satisfiable Boolean formula}
• 3SAT = {〈f〉: f is a satisfiable Boolean formula in conjunctive normal form with 3 literals per clause}
• literal: xi or ¬xi
• clause: an OR of literals
• CNF (conjunctive normal form): AND of ORs of literals
• 3CNF: CNF with 3 literals per clause (repetitions are allowed)
• Example: (x1∨x2∨x2) ∧ (x2∨x3∨x4)
Status of these problems
CLIQUE = {〈G, k〉: G is a graph with a clique of k vertices}
IS = {〈G, k〉: G is a graph with an independent set of k vertices}
VC = {〈G, k〉: G is a graph with a vertex cover of k vertices}
SAT = {〈f〉: f is a satisfiable Boolean formula}
problem | running time of best-known algorithm
CLIQUE | 2^O(n)
3SAT | 2^O(n)
VC | 2^O(n)
IS | 2^O(n)
What do these problems have in common?
Checking solutions efficiently
• We don’t know how to solve them efficiently
• But if someone told us the solution, we would be able to verify it very quickly
Example: Is (G, 5) in CLIQUE? Proposed solution: 1, 5, 9, 12, 14
[Figure: graph G on vertices 1 to 15]
Example: Formula satisfiability
f = (x1 ∨ ¬x2) ∧ (x2 ∨ x3 ∨ ¬x4) ∧ (¬x1)
• Finding a solution: try all possible assignments
FFFF FFFT FFTF FFTT FTFF FTFT FTTF FTTT TFFF TFFT TFTF TFTT TTFF TTFT TTTF TTTT
For n variables, there are 2^n possible assignments, so this takes exponential time
• Verifying a solution: substitute the candidate FFTT (x1 = F, x2 = F, x3 = T, x4 = T) and evaluate the formula
f = (F ∨ T) ∧ (F ∨ T ∨ F) ∧ (T)
This can be done in linear time
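The contrast between searching and verifying can be sketched in a few lines. The encoding is an assumption made for illustration: a CNF is a list of clauses, and the integer i stands for x_i while -i stands for its negation. The formula F is read as (x1 ∨ ¬x2) ∧ (x2 ∨ x3 ∨ ¬x4) ∧ (¬x1), which is the reading implied by the evaluation (F ∨ T) ∧ (F ∨ T ∨ F) ∧ (T) on the slide.

```python
from itertools import product

# Assumed encoding: literal i means x_i, literal -i means NOT x_i.
F = [[1, -2], [2, 3, -4], [-1]]

def satisfies(cnf, assignment):
    """Verify: one pass over the formula, so linear in its size.
    assignment maps each variable index to True or False."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def brute_force_sat(cnf, n):
    """Search: try all 2^n assignments; return a satisfying one, or None."""
    for values in product([False, True], repeat=n):
        assignment = dict(zip(range(1, n + 1), values))
        if satisfies(cnf, assignment):
            return assignment
    return None
```

Calling `satisfies(F, {1: False, 2: False, 3: True, 4: True})` checks the assignment from the slide in a single pass, while `brute_force_sat(F, 4)` may have to visit up to 16 assignments.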
The class NP
• A verifier for L is a TM V such that x ∈ L if and only if V accepts 〈x, s〉 for some s
• s is a potential solution for x
• We say V runs in polynomial time if on every input x, it runs in time polynomial in |x| (for every s)
• NP is the class of all languages that have polynomial-time verifiers
Examples
• 3SAT is in NP: V := On input a 3CNF f and a candidate assignment a: if a satisfies f, accept; otherwise reject. Running time = O(m + n) ✔, where m = number of clauses and n = number of variables
• CLIQUE is in NP: V := On input 〈G, k〉 and a set of vertices C: if C has size k and all edges between vertices of C are present in G, accept; otherwise reject. Running time = O(n^2) ✔
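A sketch of the CLIQUE verifier V, together with the "for some s" quantifier from the definition of a verifier. The graph encoding (vertices 1..n, edges as a set of frozensets) and the function names are assumptions made for illustration. The second function decides CLIQUE by trying every candidate certificate, which is correct but takes exponential time in general.

```python
from itertools import combinations

def verify_clique(n, edges, k, c):
    """V: accept iff c is a set of k vertices of G that are pairwise adjacent.
    At most O(n^2) pairs are checked."""
    c = set(c)
    if len(c) != k or not c <= set(range(1, n + 1)):
        return False
    return all(frozenset(p) in edges for p in combinations(c, 2))

def in_clique_by_search(n, edges, k):
    """<G, k> is in CLIQUE iff V accepts <G, k> with SOME certificate c."""
    return any(
        verify_clique(n, edges, k, c)
        for c in combinations(range(1, n + 1), k)
    )
```

The verifier itself is fast; only the search over certificates is slow, which is the gap between NP membership and an efficient algorithm.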
P versus NP
• P is contained in NP, because the verifier can ignore the solution
• Conceptually, finding solutions can only be harder than checking them
[Figure: nested regions, decidable containing NP (efficiently checkable) containing P (efficient); IS, SAT, VC, and CLIQUE are marked in NP; PATH and L01 are marked in P]
Millennium prize problems
• Recall how in 1900, Hilbert gave 23 problems that guided mathematics in the 20th century
• In 2000, the Clay Mathematics Institute gave 7 problems for the 21st century, each with a $1,000,000 prize
1 P versus NP (computer science)
2 The Hodge conjecture
3 The Poincaré conjecture (Perelman 2006, refused money)
4 The Riemann hypothesis (Hilbert’s 8th problem)
5 Yang–Mills existence and mass gap
6 Navier–Stokes existence and smoothness
7 The Birch and Swinnerton-Dyer conjecture
P versus NP
Is P equal to NP? ($1,000,000)
• The answer to the question is not known. But one reason it is believed to be negative is that, intuitively, searching is harder than verifying
• For example, solving homework problems (searching for solutions) is harder than grading (verifying the solution is correct)
Searching versus verifying
• Mathematician: Given a mathematical claim, come up with a proof for it.
• Scientist: Given a collection of data on some phenomenon, find a theory explaining it.
• Engineer: Given a set of constraints (on cost, physical laws, etc.), come up with a design (of an engine, bridge, etc.) which meets them.
• Detective: Given the crime scene, find “who’s done it”.
P and NP
• P = languages that can be decided on a TM with polynomial running time (problems that admit efficient algorithms)
• NP = languages whose solutions can be verified on a TM with polynomial running time (solutions can be checked efficiently)
• We believe that NP is bigger than P, but we are not 100% sure
[Figure: nested regions, decidable containing NP containing P]
Evidence that NP is bigger than P
CLIQUE = {〈G, k〉: G is a graph with a clique of k vertices}
IS = {〈G, k〉: G is a graph with an independent set of k vertices}
VC = {〈G, k〉: G is a graph with a vertex cover of k vertices}
SAT = {〈f〉: f is a satisfiable Boolean formula}
• These (and many others) are in NP
• Their solutions, once found, are easy to verify
• But no efficient algorithms are known for any of them
• The fastest known programs take time ≈ 2^n
Equivalence of certain NP languages
• We strongly suspect that problems like CLIQUE, SAT, etc. require time ≈ 2^n to solve
• We do not know how to prove this, but what we can prove is that if any one of them can be solved on a polynomial-time TM, then all of them can be solved on a polynomial-time TM
Equivalence of some NP languages
• All these problems are as hard as one another
• Moreover, they are at the “frontier” of NP
• They are at least as hard as any problem in NP
[Figure: NP containing P, with clique, independent set, vertex-cover, and satisfiability marked in NP]
Polynomial Time Reduction
How to show that a problem R is not easier than a problem Q? Informally: if R can be solved efficiently, then we can solve Q efficiently.
• Formally, we say Q polynomially reduces to R if:
• Given an instance q of problem Q
• There is a polynomial time transformation to an instance f(q) of R
• q is a “yes” instance if and only if f(q) is a “yes” instance
If R is polynomial time solvable, then Q is polynomial time solvable. Equivalently, if Q is not polynomial time solvable, then R is not polynomial time solvable.
Polynomial-time reductions
• What do we mean when we say, for example, “IS is at least as hard as CLIQUE”?
• We mean that if CLIQUE has no polynomial-time TM, then neither does IS
• Or, we can convert any polynomial-time TM for IS into one for CLIQUE
Polynomial-time reductions
• Theorem: If IS has a polynomial-time TM, so does CLIQUE
CLIQUE = {〈G, k〉: G is a graph with a clique of k vertices}
IS = {〈G, k〉: G is a graph with an independent set of k vertices}
[Figure: a graph on vertices 1, 2, 3, 4 in which {1, 4}, {2, 3, 4}, {1} are independent sets and {1, 2}, {1, 3}, {4} are cliques]
Polynomial-time reductions
If IS has a polynomial-time TM, so does CLIQUE
• Proof: Suppose IS has a poly-time TM A
• We want to use it to solve CLIQUE
[Figure: 〈G, k〉 is converted to 〈G’, k’〉 and passed to A for IS. A accepts if G’ has an IS of size k’ and rejects if not; the combined machine should accept if G has a clique of size k and reject if not]
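Reducing CLIQUE to IS
• We look for a polynomial-time TM R that turns the question “Does G have a clique of size k?” into “Does G’ have an IS of size k’?”
R: 〈G, k〉 → 〈G’, k’〉
• Flip all edges of G to get G’ (edges become non-edges and non-edges become edges) and set k’ = k
• Cliques of size k in G correspond to ISs of size k’ in G’
[Figure: G and G’ on vertices 1, 2, 3, 4]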
Reducing CLIQUE to IS
R: 〈G, k〉 → 〈G’, k’〉
R: On input 〈G, k〉
Construct G’ by flipping all edges of G
Set k’ = k
Output 〈G’, k’〉
• Cliques in G are exactly the independent sets in G’
• If G’ has an IS of size k, then G has a clique of size k ✓
• If G’ does not have an IS of size k, then G has no clique of size k ✓
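The reduction R and its use in the proof can be sketched as follows. The graph encoding (vertices 1..n, edges as a set of frozensets) is an assumption, and `has_independent_set` is only a brute-force stand-in for the hypothetical polynomial-time TM A for IS; it is not itself polynomial-time.

```python
from itertools import combinations

def reduce_clique_to_is(n, edges, k):
    """R: map <G, k> to <G', k'>, where G' has exactly the edges missing
    from G (all edges flipped) and k' = k. Runs in O(n^2) time."""
    all_pairs = {frozenset(p) for p in combinations(range(1, n + 1), 2)}
    return n, all_pairs - edges, k

def has_independent_set(n, edges, k):
    """Stand-in for A: does the graph have an independent set of size k?"""
    return any(
        not any(frozenset(p) in edges for p in combinations(s, 2))
        for s in combinations(range(1, n + 1), k)
    )

def has_clique(n, edges, k):
    """Decide CLIQUE by running R and then A, as in the proof."""
    return has_independent_set(*reduce_clique_to_is(n, edges, k))
```

If A were replaced by a genuinely polynomial-time procedure, `has_clique` would also run in polynomial time, since R only adds an O(n^2) preprocessing step.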
Reduction recap
• We showed that if IS has a polynomial-time TM, so does CLIQUE, by converting an imaginary TM for IS into one for CLIQUE
• To do this, we came up with a reduction that transforms instances of CLIQUE into ones of IS
Polynomial-time reductions
• Language L polynomial-time reduces to L’ if there exists a polynomial-time TM R that takes an instance x of L into an instance y of L’ such that x ∈ L if and only if y ∈ L’
• Example: L = CLIQUE and L’ = IS. R maps x = 〈G, k〉 to y = 〈G’, k’〉, and x ∈ L (G has a clique of size k) if and only if y ∈ L’ (G’ has an IS of size k’)
The meaning of reductions
• Saying L reduces to L’ means L is no harder than L’
• In other words, if we can solve L’, then we can also solve L
• Therefore: if L reduces to L’ and L’ ∈ P, then L ∈ P
[Figure: x is fed to R, which outputs y; y is fed to a poly-time TM for L’, which accepts or rejects. x ∈ L if and only if y ∈ L’ if and only if the TM accepts]
The direction of reductions
• The direction of the reduction is very important
• Saying “A is easier than B” and “B is easier than A” mean different things
• However, it is possible that L reduces to L’ and L’ reduces to L
• This means that L and L’ are as hard as one another
• For example, IS and CLIQUE reduce to one another
The Cook-Levin Theorem
• Every L ∈ NP reduces to SAT
SAT = {f: f is a satisfiable Boolean formula}, e.g. (x1 ∨ ¬x2) ∧ (x2 ∨ x3 ∨ ¬x4) ∧ (¬x1)
• So every problem in NP is no harder than SAT
• But SAT itself is in NP, so SAT must be the “hardest problem” in NP
• If SAT ∈ P, then P = NP
[Figure: NP containing P, with SAT marked in NP]
NP-completeness
• A language C is NP-complete if:
1. C is in NP, and
2. For every L in NP, L reduces to C.
• Cook-Levin Theorem: 3SAT is NP-complete
[Figure: NP containing P, with C marked in NP]
Our picture of NP
[Figure: NP containing P; labels: NP-complete, SAT, IS, CLIQUE, PATH, 0^n1^n, any CFL; an arrow from A to B means A reduces to B]
More NP-complete problems
[Figure: NP containing P; labels: NP-complete, 3SAT, IS, CLIQUE, PATH, 0^n1^n; an arrow from A to B means A reduces to B]
In practice, most problems in NP are either in P (easy) or NP-complete (probably hard)
Interpretation of Cook-Levin Theorem
• Optimistic view: If we manage to solve SAT, then we can also solve CLIQUE, scheduling, and almost anything
• Pessimistic view: Since we do not believe P = NP, it is unlikely that we will ever have a fast algorithm for SAT
Clique
CLIQUE = {(G, k): G is a graph with a clique of k vertices}
• Theorem: CLIQUE is NP-hard
A clique is a subset of vertices so that all pairs are connected
{1, 2, 3}, {1, 4}, {4} are cliques
[Figure: graph on vertices 1, 2, 3, 4; roadmap of reductions among SAT, 3SAT, CLIQUE, IS, VC, with ✓ marking the steps completed so far]
Reducing 3SAT to CLIQUE
3SAT = {f: f is a satisfiable Boolean formula in 3CNF}
CLIQUE = {(G, k): G is a graph with a clique of k vertices}
• Proof: We give a reduction R from 3SAT to CLIQUE
R: 3CNF formula f → (G, k), such that f is satisfiable if and only if G has a clique of size k
Reducing 3SAT to CLIQUE
• Example: f = (x1∨x1∨x2) ∧ (x1∨x2∨x2) ∧ (x1∨x2∨x3)
• Put a vertex for every literal
• Put an edge for every consistent pair
[Figure: graph with one vertex per literal occurrence of f]
Reducing 3SAT to CLIQUE
R: 3CNF formula f → (G, k)
R: On input f, where f is a 3CNF formula with m clauses
Construct the following graph G:
G has 3m vertices, one for each literal occurrence in f, divided into m groups, one for each clause
If a and b are in different groups and a ≠ ¬b (a and b are not negations of each other), put an edge (a, b)
Output (G, m)
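Reducing 3SAT to CLIQUE
R: 3CNF formula f → (G, m); f is satisfiable if and only if G has a clique of size m
f = (x1∨x1∨x2) ∧ (x1∨x2∨x2) ∧ (x1∨x2∨x3)
[Figure: graph G for f, with the literal vertices labelled by their truth values under one assignment: T T F F F T F F T]
Reducing 3SAT to CLIQUE
R: 3CNF formula f → (G, m); f is satisfiable if and only if G has a clique of size m
f = (x1∨x1∨x2) ∧ (x1∨x2∨x2) ∧ (x1∨x2∨x3)
[Figure: graph G for f, with the literal vertices labelled by their truth values under another assignment: F F T T F F T T T]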
Reducing 3SAT to CLIQUE
R: 3CNF formula f → (G, m); f is satisfiable if and only if G has a clique of size m
• Every satisfying assignment of f gives a clique of size m in G
• Conversely, every clique of size m in G gives a consistent satisfying assignment of f
[Figure: roadmap of reductions among SAT, 3SAT, CLIQUE, IS, VC, with ✓ marking the steps completed so far]
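The construction of G can be sketched as below. The encoding is an assumption made for illustration: a 3CNF is a list of clauses, with the integer i standing for x_i and -i for its negation, and each vertex is a tuple (clause index, position, literal) so that repeated literals in a clause stay distinct. The brute-force `has_clique` is only there to check the equivalence on tiny formulas.

```python
from itertools import combinations

def reduce_3sat_to_clique(cnf):
    """R: one vertex per literal occurrence, grouped by clause. Two vertices
    are joined iff they lie in different groups and their literals are not
    negations of each other. Returns (vertices, edges, m)."""
    vertices = [
        (j, pos, lit)
        for j, clause in enumerate(cnf)
        for pos, lit in enumerate(clause)
    ]
    edges = {
        frozenset((a, b))
        for a, b in combinations(vertices, 2)
        if a[0] != b[0] and a[2] != -b[2]
    }
    return vertices, edges, len(cnf)

def has_clique(vertices, edges, k):
    """Brute-force clique check, used only to test the reduction."""
    return any(
        all(frozenset(p) in edges for p in combinations(c, 2))
        for c in combinations(vertices, k)
    )
```

For instance, with the hypothetical formula `[[1, 1, 2], [-1, -2, -2], [-1, 2, 3]]` (chosen for illustration, not necessarily the one on the slide), the graph has 9 vertices and a clique of size 3, matching the satisfying assignment x1 = F, x2 = T. For `[[1, 1, 1], [-1, -1, -1]]`, which is unsatisfiable, the graph has no edges at all.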
Vertex cover
VC = {(G, k): G is a graph with a vertex cover of size k}
• Theorem: VC is NP-hard
A vertex cover is a set of vertices that touches (covers) all edges
{2, 4}, {3, 4}, {1, 2, 3} are vertex covers
[Figure: graph on vertices 1, 2, 3, 4; roadmap of reductions among SAT, 3SAT, CLIQUE, IS, VC, with ✓ marking the steps completed so far]
Reducing IS to VC
• Proof: We describe a reduction from IS to VC
R: (G, k) → (G’, k’), such that G has an IS of size k if and only if G’ has a VC of size k’
• Example
independent sets: ∅, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}
vertex covers: {2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}
[Figure: graph on vertices 1, 2, 3, 4]
Reducing IS to VC
• Claim: S is an independent set of G if and only if the complement of S (the vertices of G not in S) is a vertex cover of G
IS | VC
∅ | {1, 2, 3, 4}
{1} | {2, 3, 4}
{2} | {1, 3, 4}
{3} | {1, 2, 4}
{4} | {1, 2, 3}
{1, 2} | {3, 4}
{1, 3} | {2, 4}
• Proof: S is an independent set of G
if and only if no edge has both endpoints in S
if and only if every edge has an endpoint in the complement of S
if and only if the complement of S is a vertex cover of G
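Reducing IS to VC
R: (G, k) → (G’, k’)
R: On input (G, k), output (G, n – k), where n is the number of vertices of G
G has an IS of size k if and only if G has a VC of size n – k
[Figure: roadmap of reductions among SAT, 3SAT, CLIQUE, IS, VC, with every step marked ✓]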
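The reduction is a one-liner, and the claim behind it can be checked exhaustively on a small graph. This is a sketch under assumptions: vertices are 1..n, edges are a set of frozensets, and the function names are illustrative.

```python
from itertools import combinations

def reduce_is_to_vc(n, edges, k):
    """R: on input (G, k), output (G, n - k). The graph is left unchanged."""
    return n, edges, n - k

def check_claim(n, edges):
    """Check for every subset S of the vertices that
    S is independent  <=>  the complement of S is a vertex cover."""
    vertices = set(range(1, n + 1))
    for r in range(n + 1):
        for s in combinations(sorted(vertices), r):
            independent = not any(
                frozenset(p) in edges for p in combinations(s, 2)
            )
            rest = vertices - set(s)
            covers = all(e & rest for e in edges)
            if independent != covers:
                return False
    return True
```

Because S and its complement have sizes k and n - k, the claim gives exactly the equivalence that R needs.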
The ubiquity of NP-complete problems
• We saw a few examples of NP-complete problems, but there are many more
• A surprising fact of life is that most CS problems are either in P or NP-complete
• A 1979 book by Garey and Johnson lists 100+ NP-complete problems
Practicing Reductions
PARTITION
Instance: A set X and a size s(x) for each x in X.
Question: Is there a subset X’ ⊆ X such that the sum of s(x) over x in X’ equals the sum of s(x) over x in X − X’?
SUBSET-SUM
Instance: A set X and a size s(x) for each x in X, and an integer B.
Question: Is there a subset X’ ⊆ X such that the sum of s(x) over x in X’ equals B?
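One possible reduction from PARTITION to SUBSET-SUM, as a sketch rather than the intended exercise solution. It assumes the usual textbook statements: PARTITION asks for a subset whose sizes add up to half of the total, and SUBSET-SUM asks for a subset whose sizes add up to B. Sizes are given as a list of positive integers, and the brute-force `subset_sum` is only a stand-in for a solver.

```python
from itertools import combinations

def subset_sum(sizes, b):
    """Brute force: is there a sub-collection of sizes that sums to b?"""
    return any(
        sum(c) == b
        for r in range(len(sizes) + 1)
        for c in combinations(sizes, r)
    )

def reduce_partition_to_subset_sum(sizes):
    """Map a PARTITION instance to a SUBSET-SUM instance with B = total / 2.
    An odd total cannot be split evenly, so a fixed 'no' instance is emitted."""
    total = sum(sizes)
    if total % 2:
        return [1], 2  # no subset of {1} sums to 2
    return list(sizes), total // 2

def partition(sizes):
    """Decide PARTITION by reducing to SUBSET-SUM."""
    return subset_sum(*reduce_partition_to_subset_sum(sizes))
```

The transformation itself takes linear time; a subset summing to half the total leaves a complement with the same sum, which is why the answers agree.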
Practicing Reductions
HAMILTONIAN CYCLE
Instance: A graph G = (V, E).
Question: Does G contain a Hamiltonian cycle, i.e. a cycle which visits every vertex exactly once?
HAMILTONIAN PATH
Instance: A graph G = (V, E).
Question: Does G contain a Hamiltonian path, i.e. a path which visits every vertex exactly once?
Techniques for Proving NP-completeness
• Restriction: show that a special case is already NP-complete.
• Local replacement: replace each basic unit by a different structure.
• Component design: design “components” with specific functionality.