2. P vs. NP A mathematical issue, not a legal one
P and NP:
Each is a set of computational problems
Each is described differently
Are they actually the same set?
A million dollar problem
A Clay Millennium Prize
Most everyone thinks P ≠ NP
the problem is to prove it
On August 6, Vinay Deolalikar proposed a proof
3. Taking this proposed proof seriously People claim proofs all the time
Every couple of months, a claimed proof of P = NP appears on arXiv
But:
D. is a Principal Research Scientist at HP.
Steve Cook:
“This appears to be a relatively serious claim...”
Dick Lipton:
“...this is a serious effort...”
Moshe Vardi:
“This looks like a serious paper...”
However:
It doesn't look like the proof goes through.
4. Finding flaws can take time Four-color Theorem
Proven 1879
Bug found 1890
Proven 1976 (using a computer)
Hilbert's 21st problem
Solved 1908
Counterexample 1990
Hilbert's 16th problem, special case
Solved 1923
Gaps 1980
Solved 1991
5. Finding flaws in internet time August 6: Manuscript is sent to 22 people, including Ron Fagin, and put on webpage
7: Blog post [Greg Baker]
8: Slashdot, Lipton’s blog
9: Wikipedia article about D. (deleted later)
10: Wiki for technical discussion established
Based on comment thread on Lipton’s blog
About 340 edits since
Fields Medalists are involved
15: Commemorative blogpost:
The P≠NP “Proof” Is One Week Old
6. Updates in internet time First draft, Aug 6
Overwritten several times
Second draft Aug 9 to Aug 10
Draft 2 + ε, Aug 9 to Aug 11
Third draft, Aug 11 to Aug 17
All drafts removed after Aug 17
D. says: the paper has been sent out for refereeing
Three-page synopsis, Aug 13
Only current public version
7. Elements of the proposed proof Finite Model Theory
Part of mathematical logic
Impact on database theory, combinatorics, and complexity theory
Ron Fagin is the founder of FMT
Ron will introduce P vs. NP, and explain the role of FMT
Random k-SAT
Analogs in statistical physics
Ryan Williams was a key player in the on-line discussions
Post-doc in K53, IBM Raviv Fellow
Ryan gave a beautifully simple counter-argument to this part
8. Discovery vs. Verification Two important tasks for a scientist are discovery of solutions, and verification of other people’s solutions.
It is easier to check that a solution (say, to a puzzle) is correct than to find the solution.
That is, verifying a solution is easier than discovering it.
Example: Sudoku
9. Sudoku
10. Sudoku
11. The P vs. NP question asks whether verification is easier than discovery
12. What is P? Polynomial Time
The class of problems where the solution can be discovered “quickly”
In time polynomial in the size of the input
Example 1: Given a number, is it even?
Example 2: Given a graph, is it connected?
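As an illustration of Example 2, here is a minimal sketch of a polynomial-time connectivity check using breadth-first search. The function name `is_connected` and the representation (vertices numbered 0 to n-1, edges as pairs) are assumptions made for this example, not something from the slides.

```python
from collections import deque


def is_connected(num_vertices, edges):
    """Return True if the undirected graph on vertices 0..num_vertices-1 is connected."""
    if num_vertices == 0:
        return True  # convention chosen here: the empty graph counts as connected
    # Build adjacency lists.
    adjacency = {v: [] for v in range(num_vertices)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    # Breadth-first search from vertex 0; each vertex and edge is handled a bounded number of times.
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == num_vertices
```

The running time grows linearly with the number of vertices plus edges, which is comfortably "polynomial in the size of the input".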
13. What is NP? Nondeterministic Polynomial time
The class of problems where the solution can be verified “quickly”
In time polynomial in the size of the input
Example 1: Sudoku
A filled-in puzzle gives a quick verification.
Example 2: 3-colorability
14. 3-colorability
15. 3-colorability
16. Quick verification of 3-colorability
17. Quick verification of 3-colorability
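Quick verification of a proposed 3-coloring can be sketched in a few lines. The function name `is_valid_3_coloring` and the color labels "R", "G", "B" are illustrative assumptions; the point is only that the check looks at each vertex and each edge once.

```python
def is_valid_3_coloring(vertices, edges, coloring):
    """Check a proposed coloring: every vertex gets one of three colors,
    and no edge joins two vertices of the same color."""
    allowed = {"R", "G", "B"}
    # Every vertex must be assigned one of the three allowed colors.
    for v in vertices:
        if coloring.get(v) not in allowed:
            return False
    # No edge may be monochromatic.
    for u, v in edges:
        if coloring[u] == coloring[v]:
            return False
    return True
```

For example, a triangle colored with three distinct colors passes the check, while any coloring that reuses a color on the triangle fails.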
18. Does P = NP? For our examples (Sudoku and 3-coloring), it is not known if they are in P.
19. P vs. NP Problems in P: efficient discovery of a solution
Problems in NP: efficient verification of a solution
The problem of whether P = NP asks:
Assume it is easy to verify a solution.
Is it easy to discover a solution?
Can always discover a solution by brute-force search
But there are exponentially many candidate solutions to check
Can we do better?
Consider the needle in a haystack metaphor.
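To make the brute-force idea concrete, here is a rough sketch for 3-colorability: try every one of the 3^n colorings and run the quick check on each. The name `find_3_coloring` is an assumption for this example; it is the naive search, not a clever algorithm.

```python
from itertools import product


def find_3_coloring(vertices, edges):
    """Brute-force search: try all 3**n colorings, verify each one quickly.
    Returns a valid coloring as a dict, or None if none exists."""
    for colors in product("RGB", repeat=len(vertices)):
        coloring = dict(zip(vertices, colors))
        # The verification step is fast; the number of candidates is what explodes.
        if all(coloring[u] != coloring[v] for u, v in edges):
            return coloring
    return None
```

Each individual check is cheap, but with 50 vertices there are already 3^50 candidates, which is the haystack in the needle-in-a-haystack metaphor.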
20. NP-complete problems NP-complete problems are the “hardest” problems in NP
Examples: Sudoku and 3-colorability
If there is a fast (polynomial-time) algorithm for one NP-complete problem, then there is a fast algorithm for every problem in NP!
For example, a fast algorithm for Sudoku implies P=NP.
21. Why is a proof that P ≠ NP important? A number of important problems in industry (such as flight scheduling, chip layout, and protein folding) are NP-complete. A proof that P ≠ NP would tell us that we cannot expect to get optimal answers in practice.
Cryptography is based on the assumption that P ≠ NP. Proving that P ≠ NP is a stepping stone towards provably secure cryptography.
A proof that P ≠ NP would give us deep insight into the nature of computation, which would have many ripple effects.
For example, Wiles’ proof of Fermat’s Last Theorem led to other fundamental advances in number theory.
22. Maybe P = NP? Then the world is fundamentally different than is commonly believed.
Bad news: P = NP would destroy the “standard model” of complexity theory
Much previous research would become useless.
Good news: P = NP would probably imply that we can solve problems efficiently that we can’t now.
Bad news: P = NP would probably imply that current cryptographic systems can be broken.
Radically new approaches to security would be needed.
23. The P vs. NP problem has been called “one of the deepest questions ever asked by human beings”.
The blog author who said this “bet his house” against Deolalikar’s proof.
24. SAT Given a logical formula that is an “and of ors”, is there a solution (an assignment of 0’s and 1’s to the variables that makes the formula true)?
Example: (x1 OR NOT(x2) OR x3) AND (x2 OR NOT(x3)) AND (NOT(x1) OR NOT(x4))
A solution: x1 = 1, x2 = 0, x3 = 0, x4 = 0.
The set of such solutions is called the solution space.
Cook’s Theorem (1971): SAT is NP-complete.
k-SAT: each clause has exactly k literals. This problem is also NP-complete for every k ≥ 3.
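The SAT example above can be checked mechanically. In this sketch, a formula is a list of clauses and each literal is a signed integer (3 means x3, -3 means NOT(x3)); this encoding and the names `FORMULA`, `satisfies`, and `solution_space` are assumptions chosen for the illustration.

```python
from itertools import product

# (x1 OR NOT(x2) OR x3) AND (x2 OR NOT(x3)) AND (NOT(x1) OR NOT(x4))
FORMULA = [[1, -2, 3], [2, -3], [-1, -4]]


def satisfies(formula, assignment):
    """assignment is a tuple of 0/1 values; variable xi is assignment[i - 1].
    A positive literal is true when its variable is 1, a negative one when it is 0."""
    return all(
        any((assignment[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        for clause in formula
    )


def solution_space(formula, num_vars):
    """Enumerate all 2**num_vars assignments and keep the satisfying ones."""
    return [a for a in product((0, 1), repeat=num_vars) if satisfies(formula, a)]
```

Checking one assignment with `satisfies` is fast; `solution_space` is the exponential brute-force enumeration and is only practical for a handful of variables.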
25. Strategy of Deolalikar’s Proof If k-SAT were in P, then the solution spaces for all k-SAT formulas would have a “simple structure”.
For some k-SAT formulas, the solution spaces for these formulas do not have a simple structure.
Therefore, k-SAT is not in P, and so P ≠ NP.
The proof Deolalikar gives for the first bullet uses finite model theory
26. Existential second-order logic 3-colorability can be expressed quite informally as:
∃ a coloring (“the coloring is a 3-coloring of the graph”)
A little more formally as:
∃R ∃G ∃B (“Every point is in exactly one of the sets R, G, or B, and no two points that are connected by an edge are both in R, or both in G, or both in B”)
This formula can be expressed formally in existential second-order logic (∃SO)
So 3-colorability can be expressed in ∃SO.
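One possible way to write the quoted condition out in full is sketched below. The binary relation symbol E for "connected by an edge" is an assumption of this sketch; the slides do not fix a notation.

```latex
\[
\exists R\,\exists G\,\exists B\;\Bigl[\;
  \forall x\,\bigl( (R(x)\lor G(x)\lor B(x))
     \land \lnot(R(x)\land G(x))
     \land \lnot(R(x)\land B(x))
     \land \lnot(G(x)\land B(x)) \bigr)
\]
\[
  \qquad\land\;
  \forall x\,\forall y\,\bigl( E(x,y) \rightarrow
     \lnot(R(x)\land R(y))
     \land \lnot(G(x)\land G(y))
     \land \lnot(B(x)\land B(y)) \bigr)
\;\Bigr]
\]
```

The three quantifiers at the front range over sets of points (that is the "second-order" part); everything inside the brackets is ordinary first-order logic.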
27. Capturing NP with logic Fagin’s Theorem (1974): NP = ∃SO
Example: 3-colorability
Surprising, since it characterizes a complexity class purely in terms of logic, where there is no notion of machine, computation, polynomial, or time.
28. How about P? Fagin’s Theorem captures NP in terms of logic.
Can we also capture P in terms of logic?
Answer: Yes (sort of).
29. Capturing P with logic There is a logic called “least fixpoint logic” (LFP).
It is richer than first-order logic (it involves “recursion”).
Immerman-Vardi Theorem (1982): P = LFP (over ordered structures)
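To give a feel for the "recursion" in LFP, here is an informal sketch (not the formal syntax of the logic). It computes the reachability relation T as the least fixpoint of the rule T(x,z) ← E(x,z) OR ∃y (E(x,y) AND T(y,z)), by starting from the empty relation and reapplying the rule until nothing changes. The names `least_fixpoint` and `reachable_pairs` are assumptions made for this example.

```python
def least_fixpoint(step):
    """Iterate a monotone operator on sets, starting from the empty set,
    until the result stops changing."""
    current = frozenset()
    while True:
        following = step(current)
        if following == current:
            return current
        current = following


def reachable_pairs(edges):
    """All pairs (x, z) such that z can be reached from x along directed edges."""
    edge_set = set(edges)

    def step(t):
        # T(x, z) holds if E(x, z), or if E(x, y) and T(y, z) for some y.
        derived = {(x, z) for (x, y) in edge_set for (y2, z) in t if y == y2}
        return frozenset(edge_set | derived)

    return least_fixpoint(step)
```

On a finite graph the relation can only grow a polynomial number of times before it stabilizes, which is the intuition behind LFP-definable properties being computable in polynomial time.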
30. Back to Deolalikar’s proof strategy Recall that the first part of Deolalikar’s proof strategy says that if k-SAT were in P, then the solution spaces for all k-SAT formulas would have a “simple structure”.
Deolalikar’s proof of this first part proceeds as follows:
(1) Assume that k-SAT is in P.
(2) So k-SAT can be expressed in LFP, by the Immerman-Vardi Theorem.
(3) LFP implies a simple structure for solution spaces.
(4) So solution spaces for k-SAT formulas have a simple structure.
Unfortunately, Deolalikar’s proof of step 3 works only for a fragment of LFP (the “monadic case”).
This was pointed out by Immerman in Lipton’s blog.
So k-SAT is not necessarily covered in step 3.
31. Strategy of Deolalikar’s Proof If k-SAT were in P, then the solution spaces for all k-SAT formulas would have a “simple structure”.
For some k-SAT formulas, the solution spaces for these formulas do not have a simple structure.
Therefore, k-SAT is not in P, and so P ≠ NP.
We just saw that there was an error in Deolalikar’s proof of the first bullet.
But maybe the first bullet can be proven another way.
Ryan will now discuss the second bullet.
32. Strategy of Deolalikar’s Proof If k-SAT were in P, then the solution spaces for all k-SAT formulas would have a “simple structure”.
For some k-SAT formulas, the solution spaces for these formulas do not have a simple structure.
Therefore, k-SAT is not in P, and so P ≠ NP.
Deolalikar proposes to choose certain random k-SAT formulas, and use known properties of their solution spaces
33. Random k-SAT Recall k-SAT: Satisfiability of Boolean formulas as AND of ORs
n variables (0-1), m clauses, each clause has k literals
F = (x1 OR NOT(x2) OR x3) AND (x2 OR NOT(x3) OR NOT(x4)) AND (NOT(x1) OR NOT(x2) OR NOT(x3))
Here we have n=4, m=3, k=3
Given a formula F, is F satisfiable? Is there a setting of variables that makes F evaluate to 1?
Random k-SAT: Fix n, m, k, and choose m clauses at random
Study the percentage of random formulas that are satisfiable
“At random” means that all possible clauses are equally likely to be chosen
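The random k-SAT experiment can be sketched as follows. This is one common way to model "at random", assumed here for illustration: each clause picks k distinct variables and negates each with probability 1/2. The names `random_k_sat`, `is_satisfiable`, and `fraction_satisfiable` are hypothetical, and the satisfiability test is plain brute force, so it is only usable for small n.

```python
import random
from itertools import product


def random_k_sat(n, m, k, rng):
    """Draw m clauses over variables 1..n; literals are signed integers."""
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)  # k distinct variables
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula


def is_satisfiable(formula, n):
    """Brute force over all 2**n assignments (small n only)."""
    for bits in product((0, 1), repeat=n):
        if all(
            any((bits[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
            for clause in formula
        ):
            return True
    return False


def fraction_satisfiable(n, m, k, trials, seed=0):
    """Estimate the fraction of random formulas that are satisfiable."""
    rng = random.Random(seed)
    hits = sum(is_satisfiable(random_k_sat(n, m, k, rng), n) for _ in range(trials))
    return hits / trials
```

Calling `fraction_satisfiable` for a fixed small n and a range of m values gives a crude, small-scale version of the satisfiability curve as a function of the clause-to-variable ratio m/n.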
34. Random k-SAT This lovely graph is due to Bart Selman of Cornell University.
Here, k=3.
On the x-axis, we have the clause-to-variable ratio.
The y-axis stands for the fraction of all formulas that are satisfiable, and the relative running time of the usual industrial strength SAT solver for solving the instance.
It certainly looks like the truly hard instances of SAT lie here, around the phase transition point.
(This point is empirically 4.26, but it has not been rigorously proven.)
Phase transition: analogous to the physical transition from liquid to gas: it occurs at a certain critical “temperature”
35. Random k-SAT What do the formulas undergoing this transition from “almost all satisfiable” to “almost all unsatisfiable” look like, on average?
(Mezard et al. Science 2002) For random k-SAT, there are actually three phases: 1. a “replica-symmetric” phase where the solutions are all in one big “cluster” together, then
2. a “replica-symmetry-breaking satisfiable” (RSB) phase with exponentially many clusters of solutions, each cluster being “far” from all the others, and finally
3. a “replica-symmetry-breaking unsatisfiable” phase with no solutions.
Here the distance measure is Hamming distance: e.g. (1,1,1,1) and (0,0,0,0) have distance 4, (1,0,0,0) and (0,0,0,0) have distance 1
Cluster: a set of satisfying assignments that are all very “close” together.
Hamming distance: the distance between two assignments is the number of bits in which they differ
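The Hamming distance definition translates directly into code. The function name `hamming_distance` is an assumption for this small sketch.

```python
def hamming_distance(a, b):
    """Number of positions in which two equal-length assignments differ."""
    if len(a) != len(b):
        raise ValueError("assignments must have the same length")
    return sum(x != y for x, y in zip(a, b))
```

Applied to the two examples above, it gives 4 for (1,1,1,1) versus (0,0,0,0) and 1 for (1,0,0,0) versus (0,0,0,0).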
36. The RSB Satisfiable Phase of k-SAT Exponentially many clusters of solutions, each cluster being “far” from all the others
Deolalikar’s proof focuses on analyzing formulas arising from this RSB satisfiable phase.
Certainly some complex-looking structure here… Can this be the reason that k-SAT is hard?
So think of the space of all satisfying assignments as points in n-dimensional space, and we connect two points with a line if they are within distance 1 of each other. Then here is a “cartoon” of what a typical solution space looks like in the RSB phase: we have many clusters which are all “far” from each other in space.
The RSB satisfiable phase has been rigorously shown to exist for k >= 9
Deolalikar’s proof focuses on formulas arising from this RSB satisfiable phase. These are the ones he considers to have solution spaces with “complex” structure.
And indeed it is this RSB satisfiable phase that is considered to contain “hard to satisfy” formulas, since empirically, the known SAT algorithms tend to get tripped up on these formulas (there is some debate among the statistical physicists about this, though!)
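The picture of connecting solutions that are within distance 1 of each other suggests a simple way to compute clusters for a small, explicitly listed solution space: group the solutions into connected components. The sketch below is only an illustration of that picture; the names `hamming` and `clusters` and the `max_step` parameter are assumptions, and the statistical-physics definition of a cluster is more refined than this.

```python
def hamming(a, b):
    """Number of positions in which two assignments differ."""
    return sum(x != y for x, y in zip(a, b))


def clusters(solutions, max_step=1):
    """Group solutions into connected components, where two solutions are
    linked if their Hamming distance is at most max_step."""
    remaining = set(solutions)
    result = []
    while remaining:
        start = remaining.pop()
        component = {start}
        frontier = [start]
        while frontier:
            current = frontier.pop()
            # Pull in every not-yet-visited solution adjacent to the current one.
            neighbours = {s for s in remaining if hamming(current, s) <= max_step}
            remaining -= neighbours
            component |= neighbours
            frontier.extend(neighbours)
        result.append(component)
    return result
```

For instance, the solutions (0,0,0,0), (0,0,0,1), and (1,1,1,1) form two clusters: the first two are neighbours, while the third is far from both.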
37. Now The Scrutiny Begins… What is the real meaning of “simple structure”?
To try to understand the proof, researchers substituted problems known to be in P in place of k-SAT, to see what goes wrong.
Any real proof of P ≠ NP which relies on the difficulty of k-SAT cannot also work if we replace k-SAT in the proof with an easy problem!
This is a common heuristic that mathematicians use to “sanity check” their proofs.
Make sure your proof doesn’t prove too much! (In particular, make sure it does not prove false statements)
People tried many different tricks to try to get ahold of what’s going on in the proof. One thing they stumbled on was the following.
38. Strategy of Deolalikar’s Proof (Again) If k-SAT were in P, then the solution spaces for all k-SAT formulas would have a “simple structure”.
For some k-SAT formulas, the solution spaces for these formulas do not have a simple structure.
Therefore, k-SAT is not in P, and so P ≠ NP.
39. The SAT0 Objection SAT0: Formulas that are satisfied when you set every variable to zero.
This problem is definitely in P. Very easy.
However, we can show that for every k-SAT formula, there is a SAT0 formula with an isomorphic solution space. All distances between solutions are preserved.
So whatever complex structure you may have in the solution space of a random k-SAT formula, there are always SAT0 formulas with analogous structure!
40. The SAT0 Objection
Take any k-SAT formula F and one of its solutions (A1,…,An) where Ai ∈ {0,1} for all i
Create the formula F’ as follows: for every Ai = 1, change all xi in F to NOT(xi), and all NOT(xi) to xi
Now what does the solution space to F’ look like? Well, a little thought shows that it is nothing more than a translation of the solution space of F! Furthermore, the all-zero assignment satisfies F’, so we have turned a “hard” formula into an “easy” one, without changing the structure of the solution space!
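The construction of F’ can be tried out on small formulas. In this sketch, literals are signed integers (3 means x3, -3 means NOT(x3)) and assignments are tuples of 0/1; the names `translate_to_sat0`, `solution_space`, `satisfies`, and `xor` are assumptions for the illustration. The claim being illustrated is that the solution space of F’ is the solution space of F with every solution XORed with A, so all Hamming distances are preserved.

```python
from itertools import product


def satisfies(formula, assignment):
    """Variable xi is assignment[i - 1]; a positive literal needs 1, a negative one 0."""
    return all(
        any((assignment[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        for clause in formula
    )


def solution_space(formula, n):
    """All satisfying assignments, found by brute force (small n only)."""
    return {a for a in product((0, 1), repeat=n) if satisfies(formula, a)}


def xor(a, b):
    """Bitwise XOR of two assignments: translation of a by b."""
    return tuple(x ^ y for x, y in zip(a, b))


def translate_to_sat0(formula, solution):
    """Flip the sign of every literal whose variable is 1 in the given solution."""
    return [
        [-lit if solution[abs(lit) - 1] == 1 else lit for lit in clause]
        for clause in formula
    ]
```

Taking the formula F from the random k-SAT slide and its solution A = (1,0,0,1), the translated formula is satisfied by the all-zero assignment, and its solution space equals the set of s XOR A for s in the solution space of F.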
41. The SAT0 Objection
What does this say?
The difficulty of k-SAT doesn’t arise from distinguishing satisfiable formulas with “simple structure” from those with “complex structure”, but rather from distinguishing satisfiable formulas from unsatisfiable formulas.
Still, this is just intuition...
42. The intuition is realized Theorem (proved by “vloodin” and Terence Tao): Under the notion of “simple” given in the paper, k-SAT does have simple solution spaces!
Proof Idea: First show that all SAT0 formulas have “simple” solution spaces, then use the SAT0 objection to translate this space over for an arbitrary k-SAT instance.
So unfortunately the proof breaks in its current form.
43. Can we salvage something from it? Terence Tao's car analogy (paraphrased):
…the paper is like a lengthy blueprint for a revolutionary new car, that somehow combines a high-tech engine with advanced fuel injection to get 200 miles to the gallon.
The LFP objections are like a discovery of serious wiring faults in the engine… but the inventor claims this can be fixed using a weak engine
The solution space objections are like a discovery that, according to blueprints, the car would run just as well if gasoline was replaced with ordinary tap water… D.’s response to this has been roughly “That objection is invalid – everyone knows cars can’t run on water.”
The theorem (on the previous slide) is like a discovery that the fuel is in fact being sent to a completely different component of the car than the engine…
Can any parts of this car be salvaged?
We were excited because the proof strategy is new. The mere fact that no one tried it before gave it a chance of working.
44. Concluding remarks Deolalikar’s proof seems to be not only wrong, but unfixable.
Hardness and solution space complexity seem to be orthogonal.
New research question: can random k-SAT be used to prove complexity results?
There is a new world of community refereeing.
Good: every part of the proof had corresponding experts
Bad: those experts spent a great deal of time
The community is still learning how to work effectively in this new world.