P and NP
Computational Complexity • Recall from our sorting examples at the start of class that we could prove that any sort would have to do at least some minimal amount of work (lower bound) • We proved this using decision trees (ch 7.8) • (the following decision tree slides are repeated)
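// Sorts an array of 3 items (1-based as in the text: uses s[1..3], s[0] is unused)
void sortthree(int s[]) {
    int a = s[1], b = s[2], c = s[3];
    if (a < b) {
        if (b < c)      { s[1] = a; s[2] = b; s[3] = c; }   // a,b,c
        else if (a < c) { s[1] = a; s[2] = c; s[3] = b; }   // a,c,b
        else            { s[1] = c; s[2] = a; s[3] = b; }   // c,a,b
    } else if (b < c) {
        if (a < c)      { s[1] = b; s[2] = a; s[3] = c; }   // b,a,c
        else            { s[1] = b; s[2] = c; s[3] = a; }   // b,c,a
    } else              { s[1] = c; s[2] = b; s[3] = a; }   // c,b,a
}
Decision tree for sortthree:
a < b?
  Yes: b < c?
    Yes: a,b,c
    No: a < c?
      Yes: a,c,b
      No: c,a,b
  No: b < c?
    Yes: a < c?
      Yes: b,a,c
      No: b,c,a
    No: c,b,a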
Decision Trees • A decision tree can be created for every comparison-based sorting algorithm • The following is a decision tree for a 3 element Exchange sort • Note that “c < b” means that the Exchange sort compares the array item whose current value is c with the one whose current value is b – not that it compares s[3] to s[2].
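Decision tree for the 3-element Exchange sort (initial contents s[1..3] = a, b, c):
b < a?
  Yes: c < b?
    Yes: b < a? (only Yes can occur here)
      Yes: c,b,a
    No: c < a?
      Yes: b,c,a
      No: b,a,c
  No: c < a?
    Yes: a < b? (only Yes can occur here)
      Yes: c,a,b
    No: c < b?
      Yes: a,c,b
      No: a,b,c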
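The Exchange sort tree above can be checked by running the sort and counting comparisons. The following is a minimal sketch, not the book's listing: the function name `exchangeSort3` and the use of `std::array` with 0-based indexing are assumptions made for illustration. Each comparison corresponds to one internal node on the path from the root to a leaf.

```cpp
#include <array>
#include <utility>

// Exchange sort on 3 items, returning the number of comparisons made.
// Uses 0-based indexing, unlike the 1-based s[1..3] on the slides.
int exchangeSort3(std::array<int, 3>& s) {
    int comparisons = 0;
    for (int i = 0; i < 2; ++i) {
        for (int j = i + 1; j < 3; ++j) {
            ++comparisons;                 // one node of the decision tree
            if (s[j] < s[i]) {
                std::swap(s[i], s[j]);     // exchange out-of-order items
            }
        }
    }
    return comparisons;
}
```

Whatever the input order, this version always makes 3 comparisons, which matches a tree in which every leaf sits at depth 3.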
Decision Trees • So what does this tell us… • Note that there are 6 leaves in each of the examples given (N = 3 in both) • In general there will be N! leaves in a decision tree, corresponding to the N! permutations of the array • The number of comparisons (“work”) done on a given input is equal to the depth of the leaf it reaches (the length of the path from the root) • Worst-case behavior is the path from the root to the deepest leaf
Decision Trees • Thus, to get a lower bound on the worst-case behavior we need to find the shallowest tree possible that can still hold N! leaves • No comparison-based sort could do better • A binary tree of depth d can hold at most 2^d leaves • So, what is the minimal d where 2^d >= N! • Solving for d we get d >= log2(N!) • The minimal depth must be at least log2(N!)
Decision Trees • According to Lemma 7.4 (p. 291): log2(N!) >= N log2(N) – 1.45N • Putting that together with the previous result, d must be at least as great as N log2(N) – 1.45N • In asymptotic terms this is a lower bound, so d is in Ω(N log2(N)) • No comparison-based sorting algorithm can have a worst-case running time better than order N log2(N)
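To make the bound concrete, the sketch below computes the smallest depth d with 2^d >= n! and the value n log2(n) – 1.45n from the lemma. The function names `minComparisons` and `lemmaBound` are hypothetical, chosen only for this illustration.

```cpp
#include <cmath>

// Smallest d with 2^d >= n!, i.e. ceil(log2(n!)).
// log2(n!) is accumulated as a sum of logs so n! itself never overflows.
int minComparisons(int n) {
    double log2Factorial = 0.0;
    for (int k = 2; k <= n; ++k) {
        log2Factorial += std::log2(static_cast<double>(k));
    }
    // The small epsilon guards against rounding error when the sum is an integer.
    return static_cast<int>(std::ceil(log2Factorial - 1e-9));
}

// The lower bound quoted from Lemma 7.4: n log2(n) - 1.45 n.
double lemmaBound(int n) {
    return n * std::log2(static_cast<double>(n)) - 1.45 * n;
}
```

For n = 3 this gives a minimum depth of 3, in line with the two 3-element trees shown earlier, and `lemmaBound(n)` stays below `minComparisons(n)` as the lemma requires.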
Decision Trees for other problems? • Unfortunately, this technique only applies directly to comparison-based sorts • What if the problem is Traveling Salesperson? • The book has only shown exponential time (or worse) algorithms for this problem • What if your boss wants a faster implementation? • Do you try to find one? • Do you try to prove one doesn’t exist?
Polynomial-Time Algorithms • A polynomial-time algorithm is one whose worst-case running time is bounded above by a polynomial function of its input size • Poly-time examples: 2n, 3n, n^5, n log(n), n^100000 • Non-poly-time examples: 2^n, 2^(0.000001n), n! • Poly-time is important because for large problem sizes, non-poly-time algorithms take impractically long to execute
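A polynomial can be larger than an exponential for small inputs, but the exponential overtakes it eventually. As a rough illustration, the sketch below finds the first n >= 2 at which 2^n exceeds n^degree. The name `exponentialOvertakes` is hypothetical, and the comparison is done on exponents (n against degree · log2 n) so that no large powers are computed.

```cpp
#include <cmath>

// Smallest n >= 2 at which 2^n exceeds n^degree.
// 2^n > n^degree is equivalent to n > degree * log2(n).
int exponentialOvertakes(int degree) {
    int n = 2;
    while (n <= degree * std::log2(static_cast<double>(n))) {
        ++n;
    }
    return n;
}
```

For example, 2^n first exceeds n^5 at n = 23 (8388608 against 6436343).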
Intractability • In Computer Science, a problem is called intractable if it is impossible to solve it with a polynomial-time algorithm • Let me stress that intractability is a property of the problem, not just of any one algorithm to solve the problem • There can be no poly-time algorithm that solves the problem if the problem is to be considered intractable • And just because one non-poly-time algorithm exists for the problem does not make it intractable
Three Categories of Problems • We can group problems into 3 categories: • Problems for which poly-time algorithms have been found • Problems that have been proven to be intractable • Proven that no poly-time algorithms exist • Problems that have not been proven to be intractable, but for which poly-time algorithms have never been found • No one has found a poly-time algorithm, but no one has proven that one doesn’t exist either • The interesting thing is that most problems in CS fall into the 1st & 3rd categories and very few into the second
Poly-time Category • Any problem for which we have found a poly-time algorithm • Sorting, searching, matrix multiplication, chained matrix multiplication, shortest paths, minimal spanning tree, etc.
Intractable Category • Two types of problems • Those that require a non-polynomial amount of output • Determining all Hamiltonian Circuits • (n – 1)! circuits in the worst case • Those that produce a reasonable amount of output, but the processing time is just too long • Very few of these • Some are undecidable problems, such as the Halting Problem • Others are decidable but provably intractable, such as Presburger Arithmetic
Unknown Category • Not proven to be Intractable but no Poly-time algorithm known • Many problems belong in this category • 0-1 Knapsack, Traveling Salesperson, m-coloring, CNF-Satisfiability, etc. • In general, any problem that we had to solve using backtracking or bounded backtracking falls into this category
The Theory of NP • There is a close and interesting relationship among many of the problems in the Unknown Category • It will be more convenient to develop this theory restricting ourselves to decision problems • Problems that have a yes/no answer • We can always convert non-decision problems into decision problems • In the Traveling Salesperson Problem, instead of just asking for the optimal tour, we can instead ask if the length of the optimal tour is no greater than some number d • In graph coloring, instead of just asking for the minimal number of colors, we can instead ask if the minimal number is no greater than m
The Set P • The set P is the set of all decision problems that can be solved by polynomial-time algorithms • What problems are in P? • Obviously all the ones we have found poly-time solutions for (sorting, etc) • What about problems like Traveling Salesperson?
The Set NP • For Traveling Salesperson: • Verify a tour has a total weight of no greater than x. • For M-Coloring: • Verify that a coloring of a graph is valid and uses at most M colors. • NP = Set of all decision problems for which a proposed solution can be verified by a polynomial-time algorithm • NP = Set of all decision problems that can be solved by a polynomial-time non-deterministic algorithm
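The verification step for Traveling Salesperson can be sketched as follows. This is only an illustration under stated assumptions: the graph is taken to be complete and given as a square weight matrix, vertices are numbered from 0, and the name `verifyTour` and its parameters are hypothetical. The check runs in time linear in the number of vertices, which is what membership in NP needs.

```cpp
#include <cstddef>
#include <vector>

// Verification stage for the Traveling Salesperson decision problem.
// weights[i][j] is the weight of the edge from i to j (complete graph assumed).
// Returns true if `tour` visits every vertex exactly once and the closed
// tour has total weight no greater than `bound`.
bool verifyTour(const std::vector<std::vector<int>>& weights,
                const std::vector<int>& tour, long long bound) {
    const std::size_t n = weights.size();
    if (tour.size() != n) return false;

    // Every vertex must be in range and appear exactly once.
    std::vector<bool> seen(n, false);
    for (int v : tour) {
        if (v < 0 || static_cast<std::size_t>(v) >= n || seen[v]) return false;
        seen[v] = true;
    }

    // Sum the edge weights, wrapping around to close the tour.
    long long total = 0;
    for (std::size_t k = 0; k < n; ++k) {
        total += weights[tour[k]][tour[(k + 1) % n]];
    }
    return total <= bound;
}
```

Note that this only checks a tour someone hands us; it says nothing about how to find a short tour in the first place.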
Non-deterministic Algorithms • 2 Stages • Guessing (Nondeterministic) stage • Simply guesses a solution to the instance • Verification (Deterministic) stage • Takes a proposed solution from the guessing stage and decides yes/no • Note that the purpose of this type of algorithm is for theory and classification – there are usually much better ways to actually implement the algorithm • Non-deterministic algorithm solves a decision problem if …
The Set NP • The set NP is the set of all decision problems that can be solved by a polynomial-time non-deterministic algorithm • The verification stage can be accomplished in poly-time
NP • There are thousands of problems that have been proven to be in NP • Further note that all problems in P are also in NP • The guessing stage can do anything it wants • The verification stage can just run the algorithm • The only problems proven to not be in NP are the intractable ones • And there are only a few of these
P and NP • Here is the way the picture of the sets is usually drawn • We know that P is a subset of NP • We don’t know if it is a proper subset
[Figure: P drawn as a region inside NP]
P and NP • No one has ever proven that there exists a problem in NP that is not in P • So, NP – P could be an empty set • If it is then we say that P = NP • The question of whether P = NP is one of the more intriguing and important questions in all of Computer Science
P = NP? • To prove that P ≠ NP we would have to find a single problem in NP that is not in P • To prove P = NP we would have to find a poly-time algorithm for each problem in NP • If you prove either you get an A in the class, not to mention become famous • And if you prove P = NP then you also become rich!
NP-Complete Problems • Recall: “To prove P = NP we would have to find a poly-time algorithm for each problem in NP” • This can be greatly simplified by looking at the class of NP-Complete Problems. • Examples: CNF-Sat, TSP, 0-1 Knapsack, M-Coloring, Clique,…
CNF-Satisfiability Problem • Given a logical expression in CNF, determine whether there is some truth assignment (some set of assignments of true and false to the variables) that makes the whole expression true • (x1 ∨ x2) ∧ (x2 ∨ ¬x3) ∧ (¬x2) • Yes, x1 = T, x2 = F, x3 = F • (x1 ∨ x2) ∧ ¬x1 ∧ ¬x2 • No
CNF-Satisfiability Problem • It is easy to write a poly-time algorithm that takes as input a logical expression in CNF and a truth assignment and verifies whether the expression is true for that assignment • Therefore, the problem is in the set NP • Further, no one has ever found a poly-time algorithm for this entire problem and no one has ever proven that it cannot be solved in poly-time • So we do not know if it is in P or not
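A verifier of this kind might look like the sketch below. The representation is an assumption made for illustration: each clause is a list of nonzero integers, where +i stands for x_i and -i for NOT x_i, and the names `Clause` and `satisfies` are hypothetical. The running time is linear in the size of the expression.

```cpp
#include <cstdlib>
#include <vector>

// A clause is a list of nonzero literals: +i means x_i, -i means NOT x_i.
using Clause = std::vector<int>;

// Returns true if `assignment` makes every clause of `cnf` true.
// assignment[i] holds the value of x_i; index 0 is unused.
bool satisfies(const std::vector<Clause>& cnf,
               const std::vector<bool>& assignment) {
    for (const Clause& clause : cnf) {
        bool clauseTrue = false;
        for (int literal : clause) {
            const bool value = assignment[std::abs(literal)];
            // A positive literal is true when its variable is true;
            // a negated literal is true when its variable is false.
            if ((literal > 0) == value) {
                clauseTrue = true;
                break;
            }
        }
        if (!clauseTrue) return false;   // one false clause falsifies the CNF
    }
    return true;
}
```

For instance, the expression (x1 ∨ x2) ∧ (x2 ∨ ¬x3) ∧ (¬x2) would be written as `{{1, 2}, {2, -3}, {-2}}`, and the assignment x1 = T, x2 = F, x3 = F as `{false, true, false, false}` (the leading entry is the unused index 0).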
CNF-Satisfiability Problem • However, in 1971 Stephen Cook published a paper proving that if CNF-Satisfiability is in P, then P = NP • WOW! If we can just prove that this simple problem is in P then suddenly all NP problems are in P • And all sorts of things like encryption will break!
NP-Complete Problems • Class of NP problems that all have the same “difficulty” • To prove an NP problem is NP-Complete you must show that every other problem in NP can be “reduced” to this problem in polynomial time. • Cook showed that CNF-Sat was NP-Complete
More NP-Complete Problems • Now we can use transitivity of a reduction to get: • A problem C is NP-complete if: • It is in NP and • For some other NP-complete problem B, B reduces to C • Researchers have spent the last 30 years creating transformations for these problems and we now have a list of hundreds of NP-complete problems • If any of these NP-complete problems can be proven to be in P then P = NP • And, additionally, we also have a way to solve all the NP problems (poly-time transformations to the poly-time problem)
The State of P and NP • All this sounds promising, but… • Over the last 40 years no one has been able to prove that any problem from NP is not in P • Over the last 40 years no one has been able to prove that any problem from NP-complete is in P • Proving either of these things would give us an answer to the open problem of P = NP? • Most people seem to believe that P ≠ NP