Problem Complexity and NP-Complete Problems Presented by Ming-Tsung Hsu
“I can’t find an efficient algorithm, I guess I’m just too dumb”
“I can’t find an efficient algorithm, because no such algorithm is possible”
“I can’t find an efficient algorithm, but neither can all these famous people”
Overview Complexity theory • Part of the theory of computation • Dealing with the resources required to solve a given problem • time (how many steps it takes to solve a problem) • space (how much memory it takes) • Computational difficulty of computable functions
Relations between Problems, Algorithms, and Programs • One problem can be solved by many different algorithms, and one algorithm can be implemented by many different programs (Problem → Algorithms → Programs) • A single "problem" is an entire set of related questions, where each question is a finite-length string • A particular question is called an instance
Time Complexity • Number of steps that it takes to solve an instance of the problem • Function of the size of the input • Using the most efficient algorithm • Exact number of steps will depend on exactly what machine or language is being used • To avoid that problem, generally use Big O notation
Time Complexity (cont’d) • Worst-case • an upper bound on the running time for any input • Average-case • we shall assume that all inputs of a given size are equally likely • Best-case • to get the lower bound
Time Complexity (cont’d) • Sequential search in a list of size n • worst-case: n comparisons • best-case: 1 comparison • average-case: (n + 1)/2 comparisons (assuming the target is equally likely to be at any position)
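As an illustrative sketch (not from the original slides; the function name and sample list are made up), a small Python routine can count the comparisons sequential search actually makes:

```python
def sequential_search(items, target):
    """Scan the list left to right; return (index, comparisons made)."""
    comparisons = 0
    for i, item in enumerate(items):
        comparisons += 1
        if item == target:
            return i, comparisons
    return -1, comparisons

data = [3, 11, 2, 5, 8]
print(sequential_search(data, 8))   # worst case: target is last, n = 5 comparisons
print(sequential_search(data, 3))   # best case: target is first, 1 comparison
```

Searching for an absent element also costs n comparisons, which is why the worst case covers both "last position" and "not present".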
Asymptotic Notation - O-notation (Big O, Order of) • Asymptotic upper bound • Definition: For a given function g(n), O(g(n)) is the set of functions: O(g(n)) = { f(n) : there exist positive constants c and n0 such that f(n) <= c·g(n) for all n >= n0 }
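The definition can be checked numerically for one concrete pair of functions; the choices f(n) = 3n² + 10n, c = 4, and n0 = 10 below are illustrative assumptions, not from the slides:

```python
def f(n): return 3 * n**2 + 10 * n   # the function being bounded
def g(n): return n**2                # the bounding function

# Witnesses for f(n) in O(g(n)): f(n) <= c*g(n) once 10n <= n^2, i.e. n >= 10.
c, n0 = 4, 10
assert all(f(n) <= c * g(n) for n in range(n0, 1000))
assert f(9) > c * g(9)   # below n0 the inequality is allowed to fail
```

Any larger c (with a matching n0) works just as well; the definition only asks for the existence of one such pair.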
Asymptotic Notation - Ω-notation • Asymptotic lower bound • Definition: For a given function g(n), Ω(g(n)) is the set of functions: Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that f(n) >= c·g(n) for all n >= n0 }
Asymptotic Notation – Θ-notation • Definition: For a given function g(n), Θ(g(n)) is the set of functions: Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that c1·g(n) <= f(n) <= c2·g(n) for all n >= n0 }
Cost and Complexity • Algorithm complexity can be expressed in Order notation, e.g. “at what rate does work grow with N?”: • O(1) Constant • O(log N) Sub-linear • O(N) Linear • O(N log N) Nearly linear • O(N^2) Quadratic • O(X^N) Exponential • But, for a given problem, how do we know if a better algorithm is possible?
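To make the growth rates concrete, a short Python sketch (added here, not part of the slides) tabulates the work each order implies for N = 1000:

```python
import math

N = 1_000
growth = {
    "O(1)":       1,
    "O(log N)":   math.log2(N),
    "O(N)":       N,
    "O(N log N)": N * math.log2(N),
    "O(N^2)":     N ** 2,
}
for name, steps in growth.items():
    print(f"{name:10s} ~ {steps:>12,.0f} steps")
# An exponential term like 2^N would dwarf every row above for N = 1000.
```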
Practical Complexities (table of running times, assuming a 10^9 instructions/second computer)
Impractical Complexities (table of running times, assuming a 10^9 instructions/second computer)
Faster Computer vs. Better Algorithm An algorithmic improvement is usually more useful than a hardware improvement, e.g. going from 2^n to n^3
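A rough Python calculation (an added sketch, with an assumed one-second time budget) shows why: a 1000x faster machine barely extends the reach of a 2^n algorithm, but multiplies the reach of an n^3 algorithm tenfold:

```python
import math

budget = 1.0  # seconds of compute time (assumed for this example)
for rate in (1e9, 1e12):  # a 10^9 steps/s machine vs. one 1000x faster
    n_exp = math.log2(rate * budget)      # largest n with 2^n steps feasible
    n_cube = (rate * budget) ** (1 / 3)   # largest n with n^3 steps feasible
    print(f"{rate:.0e} steps/s: 2^n reaches n~{n_exp:.0f}, n^3 reaches n~{n_cube:.0f}")
```

The 1000x speedup adds only about log2(1000) ≈ 10 to the feasible n for the exponential algorithm.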
The Problem of Sorting For example, in discussing the problem of sorting: • Algorithms: • Bubble-sort – O(N^2) • Merge-sort – O(N log N) • Quick-sort – O(N log N) • Can we do better than O(N log N)? What is the problem complexity?
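A minimal merge sort in Python (an illustrative sketch, not the slides' code) shows the divide-and-conquer pattern behind the O(N log N) bound:

```python
def merge_sort(a):
    """O(N log N) comparison sort: split in half, sort halves, merge."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge step: O(N) per level
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 8, 1, 9, 3]))  # -> [1, 2, 3, 5, 8, 9]
```

There are log N levels of splitting and O(N) merge work per level, giving the O(N log N) total.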
Algorithm vs. Problem Complexity • Algorithm complexity is defined by analysis of an algorithm • Problem complexity is defined by • An upper bound – defined by an algorithm (worst case) • A lower bound – defined by a proof (for many problems, the true lower bound is still unknown)
The Upper Bound • Defined by an algorithm • Defines that we know we can do at least this well • Perhaps we can do better • Lowered by a better algorithm • “For problem X, the best algorithm was O(N^3), but my new algorithm is O(N^2).”
The Lower Bound • Defined by a proof • Defines that we know we can do no better than this • It may be worse • Raised by a better proof • “For problem X, the strongest proof showed that it required Ω(N), but my new, stronger proof shows that it requires at least Ω(N^2).” • The proven lower bound might get higher and higher
Upper and Lower Bounds • The Upper bound is the best algorithmic solution that has been found for a problem • “What’s the best that we know we can do?” • The Lower bound is the best solution that is theoretically possible. • “What cost can we prove is necessary?”
Changing the Bounds • Upper bound: lowered by a better algorithm • Lower bound: raised by a better proof
Closed Problems • The upper and lower bounds are identical • The matching bound is the inherent complexity of the problem
Closed Problems (cont’d) • Better algorithms are still possible • Better algorithms will not provide an improvement detectable by “Big O” • Better algorithms can improve the constant costs hidden in “Big O” characterizations
Open Problems • The upper and lower bounds differ • Upper bound: lowered by a better algorithm • Lower bound: raised by a better proof • Unknown: does the algorithm or the proof fall short?
Open Problems (cont’d) • D. Harel. Algorithmics: The Spirit of Computing. Addison-Wesley, 2nd edition, 1992 • “. . . if a problem gives rise to a . . . gap, the deficiency is not in the problem but in our knowledge about it. We have failed either in finding the best algorithm for it or in proving that a better one does not exist, or in both”
Examples - Closed • Problem: Searching an unordered list of n items • Upper bound: O(n) comparisons (from linear search) • Lower bound: Ω(n) comparisons • No gap; so we know the problem complexity: Θ(n) • Problem: Searching an ordered list of n items • Upper bound: O(log n) comparisons (from binary search) • Lower bound: Ω(log n) comparisons • No gap; so we know the problem complexity: Θ(log n)
Examples - Closed (cont’d) • Problem: Sorting n arbitrary elements • Upper bound: O(n log n) comparisons (from, e.g., merge sort) • Lower bound: Ω(n log n) comparisons • No gap; so we know the problem complexity: Θ(n log n) • Problem: Towers of Hanoi for n disks (n > 1) • Upper bound: O(2^n) moves • Lower bound: Ω(2^n) moves • No gap; so we know the problem complexity: Θ(2^n)
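The Towers of Hanoi bound can be checked with a short recurrence (an added sketch): moving n disks costs twice the moves for n - 1 disks, plus one move for the largest disk.

```python
def hanoi_moves(n):
    """Minimum moves for n disks: move n-1 disks aside,
    move the largest disk, then move the n-1 disks back on top."""
    if n == 0:
        return 0
    return 2 * hanoi_moves(n - 1) + 1

# The recurrence solves to 2^n - 1, matching the Theta(2^n) bound.
assert all(hanoi_moves(n) == 2**n - 1 for n in range(1, 15))
```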
Examples - Open • Problem: Multiplication of two integers, where n is the number of digits • Upper bound: O(n log n log log n) • Lower bound: Ω(n) • There’s a gap (but only a small one) • Problem: Finding the minimum spanning tree in a graph of n edges and m vertices • Upper bound: O(n log n) or O(n + m log m) • Lower bound: Ω(n) • There’s a gap (but only a small one) • Do not be fooled by the last two examples into thinking that all gaps are small!
Tractable vs. Intractable Problems are tractable if the upper and lower bounds have only polynomial factors • O(log N) • O(N) • O(N^K) where K is a constant Problems are intractable if the upper and lower bounds have an exponential factor (they are solvable in theory, but can't be solved in practice) • O(N!) • O(N^N) • O(2^N)
Terminology Polynomial algorithms are reasonable Polynomial problems are tractable Exponential algorithms are unreasonable Exponential problems are intractable
Terminology
             Polynomial   Exponential
Problems     Tractable    Intractable
Algorithms   Reasonable   Unreasonable
Definitions of P, NP • A decision problem is a problem where the answer is always “YES/NO” • An optimization problem is a problem that can have many possible solutions. Each solution has a value, and we wish to find a solution with the optimal (maximum or minimum) value • We can re-cast an optimization problem as a decision problem by looping over candidate values • EX: To find the max clique in a graph, ask: is there a clique of size k, for k = n, n-1, …?
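The re-casting can be sketched in Python (all names here are illustrative, and the brute-force clique test is itself exponential):

```python
from itertools import combinations

def has_clique(edges, vertices, k):
    """Decision problem: does the graph contain a clique of size k?"""
    es = {frozenset(e) for e in edges}
    return any(
        all(frozenset(p) in es for p in combinations(group, 2))
        for group in combinations(vertices, k)
    )

def max_clique_size(edges, vertices):
    """Optimization problem recast as a loop over decision problems."""
    for k in range(len(vertices), 0, -1):   # k = n, n-1, ...
        if has_clique(edges, vertices, k):
            return k
    return 0

V = [1, 2, 3, 4]
E = [(1, 2), (1, 3), (2, 3), (3, 4)]
print(max_clique_size(E, V))  # -> 3 (the triangle 1-2-3)
```

The loop adds at most a factor of n, so the decision and optimization versions are polynomially equivalent.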
Definitions (cont’d) • A deterministic algorithm is an algorithm which, in informal terms, behaves predictably. Run on a particular input, it will always produce the same correct output, and the underlying machine will always pass through the same sequence of states • A non-deterministic algorithm has two stages: • 1. Guessing stage (in nondeterministic polynomial time, with correct guessing) • 2. Verification stage (in deterministic polynomial time)
Definitions (cont’d) • EX: A = (3, 11, 2, 5, 8, 16, …, 200), is x = 5 in A? • deterministic algorithm: for i = 1 to n: if A(i) = x then print(i) and return true; next; return false • non-deterministic algorithm: j ← choice(1:n); if A(j) = x then print(j); success; endif; print('0'); failure
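A runnable Python version of the verification stage (an added sketch): checking one guessed index j is a constant-time operation, and a deterministic machine can only simulate the guessing stage by trying every j.

```python
def verify(A, x, j):
    """Deterministic polynomial-time verification stage: check one guess j."""
    return 0 <= j < len(A) and A[j] == x

# Simulating the guessing stage: a nondeterministic machine would "choose"
# the right j in one step; deterministically we must try them all.
A = [3, 11, 2, 5, 8, 16]
assert any(verify(A, 5, j) for j in range(len(A)))       # 5 is in A
assert not any(verify(A, 7, j) for j in range(len(A)))   # 7 is not
```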
Certificates • Returning true: in order to show that a solution exists, we only have to exhibit one solution that works • This is called a certificate • Returning false: in order to show that no solution exists, we must test all solutions
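Reusing the max-clique question as an example, checking a certificate is fast because it only inspects the proposed solution, never the whole search space (an added sketch; names are illustrative):

```python
from itertools import combinations

def check_certificate(edges, certificate, k):
    """Verify a claimed size-k clique in polynomial time:
    just confirm every pair in the certificate is an edge."""
    es = {frozenset(e) for e in edges}
    return len(certificate) == k and all(
        frozenset(p) in es for p in combinations(certificate, 2)
    )

E = [(1, 2), (1, 3), (2, 3), (3, 4)]
assert check_certificate(E, [1, 2, 3], 3)       # a valid certificate
assert not check_certificate(E, [1, 2, 4], 3)   # not a clique: 1-4 missing
```

Showing the "false" answer has no such shortcut: without a better idea, every candidate set must be ruled out.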
Oracles • If we could make the ‘right decision’ at every decision point, then we could determine whether a solution is possible very quickly! • If the found solution is valid, then True • If the found solution is invalid, then False • If we could find the certificates quickly, NP-Complete problems would become tractable • This (magic) process that can always make the right guess is called an Oracle.
Determinism vs. Nondeterminism • Nondeterministic algorithms produce an answer by a series of “correct guesses” • Deterministic algorithms (like those that a computer executes) make decisions based on information.
Definitions (cont’d) • The complexity class P is the set of decision problems that can be solved by a deterministic machine (algorithm) in polynomial time • The complexity class NP is the set of decision problems that can be solved by a non-deterministic machine (algorithm) in polynomial time • Since deterministic algorithms are just a special case of non-deterministic ones, P ⊆ NP • The big question: Does P = NP?
Problems that Cross the Line • What if a problem has: • An exponential upper bound • A polynomial lower bound • We have only found exponential algorithms, so it appears to be intractable • But... we can’t prove that an exponential solution is needed, and we can’t prove that a polynomial algorithm cannot be developed, so we can’t say the problem is intractable...
Reduction • A problem P can be reduced to another problem Q if any instance of P can be rephrased as an instance of Q, the solution to which provides a solution to the instance of P • This rephrasing is called a transformation • If P is polynomial-time reducible to Q, we denote this P ∝ Q • Intuitively: if P reduces in polynomial time to Q, P is “no harder to solve” than Q • EX: Maximum ∝ Sorting, but Sorting cannot be reduced to Maximum
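The Maximum ∝ Sorting reduction is a one-liner in Python (an added sketch): transform the instance (no change needed), solve the Sorting instance, and read the answer off the last element.

```python
def maximum_via_sorting(items):
    """Solve Maximum by reducing it to Sorting: sort the list
    (any O(n log n) sort works), then the last element is the max."""
    return sorted(items)[-1]

assert maximum_via_sorting([3, 11, 2, 5, 8]) == 11
```

The reduction shows Maximum is no harder than Sorting; it says nothing the other way, since no transformation turns a sorting instance into a single maximum query.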
The Boolean Satisfiability Problem (SAT) • An instance of the problem is a Boolean expression written using only “AND”, “OR”, “NOT”, “variables”, and “parentheses” • Question: given the expression, is there some assignment of “TRUE” and “FALSE” values to the variables that will make the entire expression true? • Clause: the “OR” of a group of literals • Conjunctive normal form (CNF): formulas that are a conjunction (AND) of clauses • SAT ∈ NP; brute force takes O(2^n · m) (n: # of variables, m: # of clauses)
SAT (cont’d) • EX: CNF: (a1 ∨ a2 ∨ a3) (clause 1) ∧ (¬a1) (clause 2) ∧ (¬a2) (clause 3) • The assignment (a1, a2, a3) = (0, 0, 1) satisfies it
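A brute-force SAT checker (an added Python sketch; the +i/-i literal encoding is an assumption of this example) tries all 2^n assignments and checks each clause, matching the exponential bound above:

```python
from itertools import product

def brute_force_sat(cnf, n):
    """Try all 2^n assignments of n variables; O(2^n) assignments,
    each checked clause by clause. A clause is a list of literals:
    +i means variable i, -i means NOT variable i."""
    for bits in product([False, True], repeat=n):
        assign = {i + 1: bits[i] for i in range(n)}
        if all(any(assign[abs(l)] == (l > 0) for l in clause) for clause in cnf):
            return assign
    return None   # unsatisfiable

# (a1 v a2 v a3) & (~a1) & (~a2) from the slide: satisfied by (0, 0, 1)
cnf = [[1, 2, 3], [-1], [-2]]
print(brute_force_sat(cnf, 3))  # -> {1: False, 2: False, 3: True}
```

Verifying one given assignment is polynomial, which is exactly why SAT ∈ NP; only the search over assignments is exponential.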
Cook’s Theorem • NP = P iff SAT ∈ P • pf: (⇒) ∵ SAT ∈ NP, NP = P implies SAT ∈ P (⇐) Difficult: every NP problem can be reduced to the SAT problem, so SAT ∈ P implies NP = P
NP-Complete and NP-Hard • A problem Q is NP-Complete if Q ∈ NP and every NP problem can be reduced to Q • i.e., Q ∈ NP and X ∝ Q for all X ∈ NP • If any NP-Complete problem can be solved in polynomial time, then NP = P • The SAT problem is an NP-complete problem • A problem Q is NP-Hard if every NP problem can be reduced to Q • We say Q is NP-Complete if Q is NP-Hard and Q ∈ NP
NP-Complete “NP-Complete” comes from: • Nondeterministic Algorithm in Polynomial time • Complete - “Solve one, Solve them all” There are more NP-Complete problems than provably intractable problems.