CSC 3130: Automata theory and formal languages

NP-complete problems
Andrej Bogdanov
Fall 2009, The Chinese University of Hong Kong
http://www.cse.cuhk.edu.hk/~andrejb/csc3130

Polynomial-time reductions
• Language L polynomial-time reduces to L’ if there exists a polynomial-time computable map R that takes an instance x of L into an instance y of L’ such that x ∈ L if and only if y ∈ L’
• Example: L = CLIQUE and L’ = IS. R maps x = (G, k) to y = (G’, k’) so that x ∈ L (G has a clique of size k) if and only if y ∈ L’ (G’ has an IS of size k’)
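The CLIQUE-to-IS example can be made concrete. The slide does not spell out the map R, so this sketch assumes the standard choice: G’ is the complement graph of G and k’ = k. The function names and the graph encoding (vertices 0..n-1, edges as pairs) are illustrative, and the brute-force checkers exist only to test small cases; they are not polynomial-time.

```python
from itertools import combinations

def complement_graph(n, edges):
    """Edge set of the complement of an n-vertex graph on vertices 0..n-1."""
    present = {frozenset(e) for e in edges}
    all_pairs = {frozenset(p) for p in combinations(range(n), 2)}
    return all_pairs - present

def reduce_clique_to_is(n, edges, k):
    """Assumed map R: (G, k) -> (complement of G, k). Runs in time O(n^2)."""
    return n, complement_graph(n, edges), k

def has_clique(n, edges, k):
    """Brute force: is there a set of k vertices with all pairs connected?"""
    es = {frozenset(e) for e in edges}
    return any(all(frozenset(p) in es for p in combinations(s, 2))
               for s in combinations(range(n), k))

def has_independent_set(n, edges, k):
    """Brute force: is there a set of k vertices with no pair connected?"""
    es = {frozenset(e) for e in edges}
    return any(all(frozenset(p) not in es for p in combinations(s, 2))
               for s in combinations(range(n), k))
```

Only `reduce_clique_to_is` is the reduction itself; it never needs to decide whether a clique exists.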

The Cook-Levin Theorem
Every L ∈ NP reduces to SAT
SAT = {f: f is a satisfiable Boolean formula}
• f = (x1∨x2) ∧ (x2∨x3∨x4) ∧ (x1) is satisfiable, for example by x1 = T, x2 = F, x3 = T, x4 = T
• f = (x1∨¬x2) ∧ (¬x1) ∧ (x2) is not satisfiable
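Membership in SAT can be checked by trying all assignments. The sketch below is a minimal illustration restricted to formulas in CNF (SAT on the slide allows arbitrary Boolean formulas). The encoding is an assumption of this example: a clause is a list of nonzero integers, +i for xi and -i for NOT xi. It takes exponential time, which is consistent with SAT not being known to be in P.

```python
from itertools import product

def satisfies(cnf, assignment):
    """True if every clause has at least one literal made true by the assignment."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf)

def brute_force_sat(cnf):
    """Return a satisfying assignment {variable: bool}, or None if there is none."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(cnf, assignment):
            return assignment
    return None
```

For instance, `[[1, 2], [2, 3, 4], [1]]` encodes (x1∨x2) ∧ (x2∨x3∨x4) ∧ (x1).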

NP-hardness
• Language L is NP-hard if every L’ in NP reduces to L
• Intuitively, NP-hard means “at least as hard as everything in NP”
• The Cook-Levin Theorem says SAT is NP-hard
[Figure: diagram of P, NP and SAT]

NP-complete
• L is NP-complete if L is NP-hard and L ∈ NP
• Intuitively, NP-complete means “hardest in NP”
• Recall that SAT ∈ NP, so SAT is NP-complete
• If SAT ∈ P, then P = NP
[Figure: diagram of P, NP and SAT]

Proof of Cook-Levin Theorem
Every L ∈ NP reduces to SAT
• To prove it, we have to describe a reduction R that maps an input w to a Boolean formula f such that w ∈ L if and only if f is satisfiable

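Proof of Cook-Levin Theorem
• All we know about L: It has a poly-time NTM M
• Let’s look at the computation tableau of M on input w: row T is the configuration of M at time T, and the cell in column S of row T holds the S-th configuration symbol at time T
• Since M is nondeterministic, there may be many possible tableaus
[Figure: computation tableau with first row # q0 w1 … # and last row # … qacc … #]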

Proof of Cook-Levin Theorem
• n = length of input w
• height of tableau is p(n) for some polynomial p
• width is at most p(n)
• k possible tableau symbols
• For 1 ≤ T ≤ p(n), 1 ≤ S ≤ p(n), 1 ≤ u ≤ k:
  xT,S,u = true, if the (T, S) cell of tableau contains u; false, if not
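To see that there are only polynomially many variables, it can help to number them. This is a small sketch, assuming a square p-by-p tableau (narrower rows can be padded) and an arbitrary row-major numbering; the names `var_index` and `tableau_to_assignment` are made up for this illustration.

```python
def var_index(T, S, u, p, k):
    """Number the variable x_{T,S,u} (all indices 1-based) from 1 to p*p*k."""
    return ((T - 1) * p + (S - 1)) * k + u

def tableau_to_assignment(tableau, alphabet):
    """Turn a filled p-by-p tableau into a truth assignment {variable number: bool}.

    x_{T,S,u} is true exactly when cell (T, S) holds the u-th alphabet symbol.
    """
    p, k = len(tableau), len(alphabet)
    return {
        var_index(T, S, u, p, k): tableau[T - 1][S - 1] == alphabet[u - 1]
        for T in range(1, p + 1)
        for S in range(1, p + 1)
        for u in range(1, k + 1)
    }
```

The total number of variables is p(n) · p(n) · k, which is polynomial in n because k depends only on M.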

Proof of Cook-Levin Theorem
• We will design a formula f, computed by R from w, such that f is satisfiable if and only if w ∈ L
• variables of f: xT,S,u
• assignment to xT,S,u ↔ way to fill up the tableau
• satisfying assignment ↔ accepting computation tableau
• f is satisfiable ↔ M accepts w

Proof of Cook-Levin Theorem
• Symbols in the computation tableau come from some alphabet {a1, ..., ak}
• For 1 ≤ T ≤ p(n), 1 ≤ S ≤ p(n), 1 ≤ u ≤ k:
  xT,S,u = true, if the (T, S) cell of tableau contains u; false, if not
• We want to construct (in time poly(n)) a formula f:
  f(x1,1,1, ..., xp(n),p(n),k) = true, if the x’s represent a valid accepting tableau; false, if not

Proof of Cook-Levin Theorem
f = fcell ∧ f0 ∧ fmove ∧ facc
• fcell: “Every cell contains exactly one symbol”
• f0: “The first row is #q0w1w2...wn☐...☐#”
• fmove: “The moves between rows follow the transitions of M”
• facc: “qacc appears somewhere in the last row”

Proof of Cook-Levin Theorem
• Desired meaning: fcell: “Every cell contains exactly one symbol”, or: “For every cell (T, S), exactly one of xT,S,1, ..., xT,S,k is true”
• Implementation: fcell = fcell1,1 ∧ ... ∧ fcellp(n),p(n), where
  fcellT,S = (xT,S,1 ∨ ... ∨ xT,S,k) ∧ ¬(xT,S,1 ∧ xT,S,2) ∧ ¬(xT,S,1 ∧ xT,S,3) ∧ ... ∧ ¬(xT,S,k-1 ∧ xT,S,k)
  The first clause says “at least one symbol”; the negated pairs say “no two symbols”
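The "exactly one symbol per cell" constraint can be generated mechanically. The following is a sketch under assumptions of this example: a variable x_{T,S,u} is written as the tuple (T, S, u), a literal is a pair (variable, polarity), and each "no two symbols" condition ¬(a ∧ b) is written as the equivalent clause (¬a ∨ ¬b) by De Morgan's law.

```python
from itertools import combinations

def fcell_clauses(T, S, k):
    """Clauses saying cell (T, S) holds exactly one of the symbols 1..k."""
    # At least one symbol: (x_{T,S,1} OR ... OR x_{T,S,k}).
    at_least_one = [[((T, S, u), True) for u in range(1, k + 1)]]
    # No two symbols: (NOT x_{T,S,u} OR NOT x_{T,S,v}) for every pair u < v.
    no_two = [[((T, S, u), False), ((T, S, v), False)]
              for u, v in combinations(range(1, k + 1), 2)]
    return at_least_one + no_two

def fcell(p, k):
    """All cell clauses for a p-by-p tableau."""
    return [clause
            for T in range(1, p + 1)
            for S in range(1, p + 1)
            for clause in fcell_clauses(T, S, k)]
```

Each cell contributes 1 + k(k-1)/2 clauses, so fcell has size polynomial in n.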

Proof of Cook-Levin Theorem
• Desired meaning: f0: “The first row is #q0w1w2...wn☐...☐#”
• Implementation: f0 = x1,1,# ∧ x1,2,q0 ∧ x1,3,w1 ∧ ... ∧ x1,p(n),#
• Desired meaning: facc: “qacc appears somewhere in the last row”
• Implementation: facc = xp(n),1,qacc ∨ xp(n),2,qacc ∨ ... ∨ xp(n),p(n),qacc
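The first-row and acceptance constraints are short enough to write down directly. This is a sketch with assumed conventions: variables are tuples (T, S, symbol), the blank is written "_", the state names are the strings "q0" and "qacc", and the tableau width p is at least n + 3 so the row has room for both borders and the state symbol.

```python
def f0_variables(w, p, q0="q0", blank="_", border="#"):
    """Variables whose conjunction says: the first row is # q0 w1 ... wn blank ... blank #."""
    if p < len(w) + 3:
        raise ValueError("tableau too narrow for the start configuration")
    row = [border, q0] + list(w) + [blank] * (p - len(w) - 3) + [border]
    return [(1, S, symbol) for S, symbol in enumerate(row, start=1)]

def facc_variables(p, qacc="qacc"):
    """Variables whose disjunction says: qacc appears somewhere in the last row."""
    return [(p, S, qacc) for S in range(1, p + 1)]
```

Both lists have exactly p entries, one per column of the tableau.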

Valid and invalid windows
Transition of M used in the examples: in state q2 reading a, write x, move left, and go to state q5 (a/xL)
Each window shows three adjacent cells in two consecutive rows (top row / bottom row)
• valid windows: a b x / a b x; a q2 a / q5 a x; # b a / # b q5; a a ☐ / x a ☐
• invalid windows: q2 a b / a b q2; q2 q2 a / q2 q2 x; a q2 a / q5 a b; a q2 a / a q5 x

Proof of Cook-Levin Theorem
• Desired meaning: fmove: “The moves between rows follow transitions of M”
• Implementation: fmove = fmove1,1 ∧ ... ∧ fmovep(n)-1,p(n)-2, where fmoveT,S says that the 2 × 3 window with top-left cell (T, S) is valid:
  fmoveT,S = ∨ over all valid windows (a1 a2 a3 / a4 a5 a6) of (xT,S,a1 ∧ xT,S+1,a2 ∧ xT,S+2,a3 ∧ xT+1,S,a4 ∧ xT+1,S+1,a5 ∧ xT+1,S+2,a6)
[Figure: example tableau with the window checked by fmove2,2 highlighted]
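What fmove asserts can be checked directly on a filled-in tableau. The sketch below evaluates the condition rather than building the formula, and it assumes the set of valid windows is supplied by the caller (deriving that set from the transitions of M is not shown). Indices are 0-based here, unlike on the slide.

```python
def window(tableau, T, S):
    """The 2-by-3 window with top-left cell (T, S), as a 6-tuple a1..a6."""
    return tuple(tableau[T][S:S + 3]) + tuple(tableau[T + 1][S:S + 3])

def fmove_holds(tableau, valid_windows):
    """True if every 2-by-3 window of the tableau is in valid_windows."""
    rows, cols = len(tableau), len(tableau[0])
    return all(window(tableau, T, S) in valid_windows
               for T in range(rows - 1)
               for S in range(cols - 2))
```

A tableau with r rows and c columns has (r - 1)(c - 2) windows, so the corresponding formula has one subformula per window position.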

Other NP-complete problems
CLIQUE = {(G, k): G is a graph with a clique of k vertices}
IS = {(G, k): G is a graph with an independent set of k vertices}
VC = {(G, k): G is a graph with a vertex cover of k vertices}
CLIQUE, IS and VC are NP-complete
[Figure: diagram of NP containing SAT, CLIQUE, IS and VC]

Proving NP-hardness
• To show L is NP-hard, it is enough to reduce from some L’ we already know is NP-hard (reductions compose: every language in NP reduces to L’, and L’ reduces to L)
• For now we can take L’ = SAT
• To show L is NP-complete, we also need to argue that L is in NP
• This is usually the easy part
Roadmap of reductions: SAT → 3SAT → CLIQUE → IS → VC

3SAT
SAT = {f: f is a satisfiable Boolean formula}
3SAT = {f: f is a satisfiable Boolean formula in conjunctive normal form with 3 literals per clause}
• A general Boolean formula is built from gates (AND, OR, NOT), e.g. (x2∨(x1∧¬x2)) ∧ ¬(¬x1∧(x1∨x2))
• literal: xi or ¬xi
• CNF (conjunctive normal form): AND of ORs of literals; each OR of literals is a clause
• 3CNF: CNF with 3 literals per clause (repetitions are allowed), e.g. (x1∨x2∨x2) ∧ (x2∨x3∨x4)

NP-hardness of 3SAT
• Theorem: 3SAT is NP-hard
• Proof: We describe a reduction R from SAT to 3SAT. R maps a Boolean formula f to a 3CNF formula f’ such that f is satisfiable if and only if f’ is satisfiable

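Reducing SAT to 3SAT
• Example: f = (x2∨(x1∧¬x2)) ∧ ¬(¬x1∧(x1∨x2))
• We give extra variables to every gate (“wire”):
  x3 = NOT x2, x4 = NOT x1, x5 = x1 OR x2, x6 = x1 AND x3, x7 = x4 AND x5, x8 = x2 OR x6, x9 = NOT x7, x10 = x8 AND x9 (output)
• For the gate x7 = x4 ∧ x5, list every assignment to (x4, x5, x7) and whether it is consistent with the gate:
  x4 x5 x7 | consistent?
  T  T  T  | T
  T  T  F  | F
  T  F  T  | F
  T  F  F  | T
  F  T  T  | F
  F  T  F  | T
  F  F  T  | F
  F  F  F  | T
• Each inconsistent row is ruled out by one clause:
  (¬x4∨¬x5∨x7) ∧ (¬x4∨x5∨¬x7) ∧ (x4∨¬x5∨¬x7) ∧ (x4∨x5∨¬x7)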

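Turning gates into 3CNFs
For each gate Gj with output z, list which assignments are consistent with the gate, then write one clause ruling out each inconsistent assignment to get fj
• AND gate, z = x ∧ y
  x y z | consistent?
  T T T | T
  T T F | F
  T F T | F
  T F F | T
  F T T | F
  F T F | T
  F F T | F
  F F F | T
  fj: (¬x∨¬y∨z) ∧ (¬x∨y∨¬z) ∧ (x∨¬y∨¬z) ∧ (x∨y∨¬z)
• OR gate, z = x ∨ y
  x y z | consistent?
  T T T | T
  T T F | F
  T F T | T
  T F F | F
  F T T | T
  F T F | F
  F F T | F
  F F F | T
  fj: (¬x∨¬y∨z) ∧ (¬x∨y∨z) ∧ (x∨¬y∨z) ∧ (x∨y∨¬z)
• NOT gate, z = ¬x
  x z | consistent?
  T T | F
  T F | T
  F T | T
  F F | F
  (¬x∨¬z) ∧ (x∨z), padded to 3 literals per clause: fj: (¬x∨¬x∨¬z) ∧ (x∨x∨z)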
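The gate clauses can be checked against the gates' truth tables by enumeration. In this sketch (the encoding is an assumption of the example) variables are positive integers and -v stands for NOT v.

```python
def gate_clauses(kind, z, x, y=None):
    """3CNF clauses that hold exactly when z equals the gate applied to x (and y)."""
    if kind == "AND":
        return [[-x, -y, z], [-x, y, -z], [x, -y, -z], [x, y, -z]]
    if kind == "OR":
        return [[-x, -y, z], [-x, y, z], [x, -y, z], [x, y, -z]]
    if kind == "NOT":
        return [[-x, -x, -z], [x, x, z]]
    raise ValueError(f"unknown gate: {kind}")

def holds(clauses, assignment):
    """True if the assignment {variable: bool} satisfies every clause."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)
```

For example, `holds(gate_clauses("AND", 3, 1, 2), {1: True, 2: False, 3: False})` is true because F = T ∧ F.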

Reducing SAT to 3SAT
R maps a Boolean formula f to a 3CNF formula f’
R: On input f, where f is a Boolean formula with variables x1, ..., xn and gates G1, ..., Gt (Gt is the output gate)
  Construct and output the following 3CNF formula f’:
  Add a variable xn+j for each gate Gj in f
  Write a 3CNF fn+j for each gate Gj, j ∈ {1, ..., t}
  Let f’ = fn+1 ∧ fn+2 ∧ ... ∧ fn+t ∧ (xn+t ∨ xn+t ∨ xn+t)
  The last clause requires that the output of f is true
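The reduction R can be sketched in a few lines. The formula encoding is an assumption of this example: nested tuples ("var", i), ("not", g), ("and", g, h), ("or", g, h) over variables 1..n, with literals written as signed integers. Gates are numbered in the order the recursion visits them, which may differ from the numbering in a drawn circuit. The brute-force `satisfiable` helper is only for checking small cases.

```python
def sat_to_3sat(formula, n):
    """Return (clauses, number of variables) of a 3CNF f' for the given formula f."""
    clauses = []
    next_var = n

    def wire(node):
        """Return the variable carrying the value of this subformula."""
        nonlocal next_var
        kind = node[0]
        if kind == "var":
            return node[1]
        inputs = [wire(child) for child in node[1:]]
        next_var += 1          # one extra variable per gate
        z = next_var
        if kind == "not":
            (x,) = inputs
            clauses.extend([[-x, -x, -z], [x, x, z]])
        elif kind == "and":
            x, y = inputs
            clauses.extend([[-x, -y, z], [-x, y, -z], [x, -y, -z], [x, y, -z]])
        elif kind == "or":
            x, y = inputs
            clauses.extend([[-x, -y, z], [-x, y, z], [x, -y, z], [x, y, -z]])
        else:
            raise ValueError(f"unknown gate: {kind}")
        return z

    output = wire(formula)
    clauses.append([output, output, output])   # the output of f must be true
    return clauses, next_var

def satisfiable(clauses, num_vars):
    """Brute-force check over all 2^num_vars assignments (small inputs only)."""
    for bits in range(2 ** num_vars):
        a = {i + 1: bool((bits >> i) & 1) for i in range(num_vars)}
        if all(any(a[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return True
    return False
```

The output has one variable and at most four clauses per gate, so the reduction runs in time linear in the size of f.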

Reducing SAT to 3SAT
• Every satisfying assignment of f extends uniquely to a satisfying assignment of f’
• Conversely, every satisfying assignment of f’ must contain a satisfying assignment of f
• So f is satisfiable if and only if f’ is satisfiable

Clique
CLIQUE = {(G, k): G is a graph with a clique of k vertices}
• Theorem: CLIQUE is NP-hard
• A clique is a subset of vertices so that all pairs are connected
• Example: in the pictured graph on vertices 1, 2, 3, 4, the sets {1, 2, 3}, {1, 4}, {4} are cliques
Progress: SAT → 3SAT ✓

Reducing 3SAT to CLIQUE
• Proof: We give a reduction R from 3SAT to CLIQUE
3SAT = {f: f is a satisfiable Boolean formula in 3CNF}
CLIQUE = {(G, k): G is a graph with a clique of k vertices}
R maps a 3CNF formula f to (G, k) such that f is satisfiable if and only if G has a clique of size k

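Reducing 3SAT to CLIQUE
• Example: f = (x1∨x1∨x2) ∧ (x1∨x2∨x2) ∧ (x1∨x2∨x3)
• Put a vertex for every literal
• Put an edge for every consistent pair: two literals from different clauses that are not negations of each other
[Figure: graph with nine vertices, one per literal occurrence in f]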

Reducing 3SAT to CLIQUE
R maps a 3CNF formula f to (G, k)
R: On input f, where f is a 3CNF formula with m clauses
  Construct the following graph G:
  G has 3m vertices, one for each literal occurrence in f, divided into m groups, one for each clause
  If a and b are in different groups and a ≠ ¬b, put an edge (a, b)
  Output (G, m)
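The graph construction can be written out as follows. This is a sketch with an assumed encoding: a clause is a list of three signed integers (+i for xi, -i for NOT xi), and a vertex is the tuple (clause index, position in clause, literal). The `has_clique` helper is brute force and only meant for small examples.

```python
from itertools import combinations

def threesat_to_clique(cnf):
    """Map a 3CNF formula with m clauses to (vertices, edges, m)."""
    vertices = [(j, i, lit)
                for j, clause in enumerate(cnf)
                for i, lit in enumerate(clause)]
    # Edge between literals in different groups that are not negations of each other.
    edges = {frozenset((a, b))
             for a, b in combinations(vertices, 2)
             if a[0] != b[0] and a[2] != -b[2]}
    return vertices, edges, len(cnf)

def has_clique(vertices, edges, k):
    """Brute force: is there a set of k vertices with all pairs connected?"""
    return any(all(frozenset(pair) in edges for pair in combinations(s, 2))
               for s in combinations(vertices, k))
```

The graph has 3m vertices and at most (3m)² edges, so the construction takes time polynomial in the size of f.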

Reducing 3SAT to CLIQUE
R maps a 3CNF formula f to (G, m) such that f is satisfiable if and only if G has a clique of size m
• Example: f = (x1∨x1∨x2) ∧ (x1∨x2∨x2) ∧ (x1∨x2∨x3)
[Figure: two copies of the graph G for f, each with its nine vertices labeled T or F according to an assignment of f]

Reducing 3SAT to CLIQUE
• Every satisfying assignment of f gives a clique of size m in G (pick one true literal from each clause)
• Conversely, every clique of size m in G gives a consistent satisfying assignment of f (make every literal in the clique true)
• So f is satisfiable if and only if G has a clique of size m
Progress: SAT → 3SAT ✓, 3SAT → CLIQUE ✓, CLIQUE → IS ✓
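The direction from cliques to assignments can be illustrated in code. This is a sketch under assumptions of the example: the clique is given as the list of its literals (signed integers, one chosen from each clause), and variables that do not occur in the clique are set to false, which is an arbitrary choice.

```python
def assignment_from_clique(clique_literals, num_vars):
    """Make every literal in the clique true; other variables default to False."""
    assignment = {v: False for v in range(1, num_vars + 1)}
    for lit in clique_literals:
        if -lit in clique_literals:
            # Cannot happen for a real clique: x and NOT x are never adjacent.
            raise ValueError("literals are not consistent")
        assignment[abs(lit)] = lit > 0
    return assignment

def satisfies(cnf, assignment):
    """True if every clause has at least one literal made true by the assignment."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf)
```

Because the clique has one vertex in each of the m groups, every clause receives a true literal.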

Vertex cover
VC = {(G, k): G is a graph with a vertex cover of size k}
• Theorem: VC is NP-hard
• A vertex cover is a set of vertices that touches (covers) all edges
• Example: in the pictured graph on vertices 1, 2, 3, 4, the sets {2, 4}, {3, 4}, {1, 2, 3} are vertex covers
Progress: SAT → 3SAT ✓, 3SAT → CLIQUE ✓, CLIQUE → IS ✓

Reducing IS to VC
• Proof: We describe a reduction R from IS to VC. R maps (G, k) to (G’, k’) such that G has an IS of size k if and only if G’ has a VC of size k’
• Example: in the pictured graph on vertices 1, 2, 3, 4
  independent sets: ∅, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}
  vertex covers: {2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}

Reducing IS to VC
• Claim: S is an independent set of G if and only if S̄ (the set of vertices of G not in S) is a vertex cover of G
• Example, each independent set S next to its complement S̄:
  IS S      | VC S̄
  ∅         | {1, 2, 3, 4}
  {1}       | {2, 3, 4}
  {2}       | {1, 3, 4}
  {3}       | {1, 2, 4}
  {4}       | {1, 2, 3}
  {1, 2}    | {3, 4}
  {1, 3}    | {2, 4}
• Proof: S is an independent set of G ⇔ no edge has both endpoints in S ⇔ every edge has an endpoint in S̄ ⇔ S̄ is a vertex cover of G

Reducing IS to VC
R maps (G, k) to (G’, k’)
R: On input (G, k), where G has n vertices
  Output (G, n – k)
G has an IS of size k if and only if G has a VC of size n – k
Progress: SAT → 3SAT ✓, 3SAT → CLIQUE ✓, CLIQUE → IS ✓, IS → VC ✓
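The reduction and the complement claim behind it can be checked on a small graph. In this sketch the function names are illustrative, and the test graph's edges, (1, 4), (2, 3), (2, 4), (3, 4), are inferred from the independent sets and vertex covers listed in the example (the only non-adjacent pairs are {1, 2} and {1, 3}).

```python
from itertools import combinations

def is_independent_set(edges, s):
    """No edge has both endpoints in s."""
    return not any(u in s and v in s for u, v in edges)

def is_vertex_cover(edges, s):
    """Every edge has an endpoint in s."""
    return all(u in s or v in s for u, v in edges)

def reduce_is_to_vc(vertices, edges, k):
    """The map R: (G, k) -> (G, n - k), where n is the number of vertices."""
    return vertices, edges, len(vertices) - k

def has_set_of_size(vertices, edges, k, predicate):
    """Brute force: does some k-subset of the vertices satisfy the predicate?"""
    return any(predicate(edges, set(c))
               for c in combinations(sorted(vertices), k))
```

The reduction only changes the number k, so it clearly runs in polynomial time.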

The ubiquity of NP-complete problems
• We saw a few examples of NP-complete problems, but there are many more
• A surprising fact of life is that most CS problems are either in P or NP-complete
• A 1979 book by Garey and Johnson lists 100+ NP-complete problems
