Phase Transitions in Proof Complexity and Satisfiability Search
Paul Beame, University of Washington
with Dimitris Achlioptas (Microsoft Research) and Michael Molloy (U. Toronto)
Satisfiability
F = (x1 ∨ x2 ∨ x4) ∧ (x1 ∨ x3) ∧ (¬x3 ∨ x2) ∧ (x4 ∨ ¬x3)
• Satisfying assignment for F: x1, x2, x3, x4
• Given F, does such an assignment exist?
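The question on this slide can be made concrete with a brute-force check. The sketch below is only illustrative: the integer encoding of literals (+i for x_i, -i for its negation), the function names, and the example formula are assumptions, not taken from the slides.

```python
from itertools import product

# Hypothetical example formula (an assumption, not the slide's formula).
# A literal is a nonzero int: +i means x_i, -i means "not x_i".
F = [(1, -2, 4), (-1, 3), (-3, 2), (-4, -3)]

def satisfies(formula, assignment):
    """True if every clause contains at least one literal made true."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

def brute_force_sat(formula, n):
    """Try all 2^n assignments; return a satisfying one, or None."""
    for bits in product([False, True], repeat=n):
        assignment = {i + 1: b for i, b in enumerate(bits)}
        if satisfies(formula, assignment):
            return assignment
    return None
```

Brute force takes time 2^n in the worst case, which is why the rest of the talk looks at smarter complete and incomplete algorithms.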
Satisfiability Algorithms
• Incomplete Algorithms
  • will (likely) find a satisfying assignment but will simply give up if one is not found
• Complete Algorithms
  • will either find a satisfying assignment or determine that no such assignment exists
Satisfiability Algorithms
• Incomplete Algorithms
  • Local search: GSAT [Selman, Levesque, Mitchell 92], Walksat [Kautz, Selman 96]
  • Belief Propagation: SP [Braunstein, Mezard, Zecchina 02]
• Complete Algorithms
  • Backtracking search: DPLL [Davis, Putnam 60] [Davis, Logemann, Loveland 62]
  • DPLL + "clause learning": GRASP, SATO, zchaff
Simplification and Satisfaction
F = (x1 ∨ x2 ∨ x4) ∧ (x1 ∨ x3) ∧ (¬x3 ∨ x2) ∧ (x4 ∨ ¬x3)
Satisfying assignment for F: x1, x2, x3, x4
• Simplifying F after setting literal x3 to true:
  F|x3 = (x1 ∨ x2 ∨ x4) ∧ (x2) ∧ (x4), where (x2) and (x4) are 1-clauses
• F is satisfied if all clauses disappear under simplification given the assignment
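The simplification step can be sketched in a few lines. This is a minimal illustration under assumed conventions: literals are nonzero integers (+i for x_i, -i for its negation), clauses are tuples, and the function name `simplify` is hypothetical. In the example formula, only the polarity of x3 is implied by the slide's result; the other signs are assumptions.

```python
def simplify(formula, lit):
    """Return F|lit: the formula after setting literal `lit` to true.

    Clauses containing `lit` are satisfied and disappear; the opposite
    literal is deleted from every remaining clause.
    """
    result = []
    for clause in formula:
        if lit in clause:
            continue  # clause is satisfied, so it disappears
        result.append(tuple(l for l in clause if l != -lit))
    return result

# Example formula with assumed polarities (see the note above).
F = [(1, 2, 4), (1, 3), (-3, 2), (4, -3)]
F_x3 = simplify(F, 3)  # leaves one 3-clause and two 1-clauses
```

An empty tuple in the result represents an empty clause, meaning the partial assignment has falsified some clause.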
Backtracking search/DPLL
DPLL(F):
    while F contains a 1-clause l':
        F ← F|l'
    if F has no clauses: output 'satisfiable' and halt
    if F has an empty clause: backtrack
    else:
        select a literal l = some x or ¬x
        DPLL(F|l)            (F|l is the residual formula)
        if backtrack then DPLL(F|¬l)
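The pseudocode above can be turned into a small runnable sketch. The encoding (nonzero integers for literals, tuples for clauses), the names `simplify` and `dpll`, and the default choice of branching literal are assumptions made for illustration; returning False plays the role of "backtrack".

```python
def simplify(formula, lit):
    """F|lit: drop satisfied clauses, delete the falsified literal."""
    return [tuple(l for l in c if l != -lit) for c in formula if lit not in c]

def dpll(formula, select=None):
    """Return True if `formula` is satisfiable, False otherwise.

    `select` maps a residual formula to a branching literal; by default
    (an assumption) the first literal of the first clause is used.
    """
    formula = list(formula)
    # Unit propagation: while F contains a 1-clause l', set F <- F|l'.
    while True:
        unit = next((c[0] for c in formula if len(c) == 1), None)
        if unit is None:
            break
        formula = simplify(formula, unit)
    if not formula:
        return True   # no clauses left: satisfiable
    if () in formula:
        return False  # empty clause: backtrack
    lit = select(formula) if select else formula[0][0]
    # Try l first; on backtrack, try the opposite literal.
    return dpll(simplify(formula, lit), select) or \
        dpll(simplify(formula, -lit), select)
```

The recursion depth is at most the number of variables, so this form is only suitable for small instances.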
Some standard select choices for DPLL algorithms
• UC: Unit Clause/Ordered DLL
  • Choose variables in a fixed order
  • Always set True first
• UCwm: Unit Clause with majority
  • Choose variables in a fixed order
  • Apply a majority vote among 3-clauses for assigning each value
• GUC: Generalized Unit Clause
  • Choose a variable v in a shortest clause C
  • Set v to satisfy C
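The three select rules can be sketched as functions from a residual formula to a branching literal. Several details here are assumptions rather than something the slides specify: literals are nonzero integers, the "fixed order" is increasing variable index restricted to variables still present, UCwm breaks ties toward True, and GUC takes the first literal of the first shortest clause.

```python
def select_uc(formula):
    """UC: lowest-index remaining variable, set True first."""
    return min(abs(l) for c in formula for l in c)

def select_ucwm(formula):
    """UCwm: lowest-index variable, sign by majority vote over 3-clauses."""
    v = min(abs(l) for c in formula for l in c)
    pos = sum(1 for c in formula if len(c) == 3 and v in c)
    neg = sum(1 for c in formula if len(c) == 3 and -v in c)
    return v if pos >= neg else -v  # tie goes to True (assumption)

def select_guc(formula):
    """GUC: pick a literal from a shortest clause, satisfying that clause."""
    shortest = min(formula, key=len)
    return shortest[0]

example = [(2, -1, 3), (-1, 4, 5), (-1, 2, 6), (3, 4)]
```

Each function assumes a non-empty formula with no empty clause, which is the situation in which DPLL calls its select rule.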
Random k-CNF formulas
• Distribution F_{k,n}(r)
  • Randomly choose rn clauses over n variables independently, each of size k
  • Each size-k clause is equally likely
• Threshold value r_k*
  • r < r_k*: almost certainly satisfiable
  • r > r_k*: almost certainly unsatisfiable
• Hardest problems near threshold
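Sampling from this distribution might look like the sketch below. The function name `random_kcnf`, the integer literal encoding, and rounding rn to the nearest integer are assumptions made for illustration.

```python
import random

def random_kcnf(k, n, r, rng=None):
    """Sample a formula with round(r*n) independent uniform k-clauses.

    Each clause uses k distinct variables out of n, each negated with
    probability 1/2, so every size-k clause is equally likely.
    """
    rng = rng or random.Random()
    m = round(r * n)
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        formula.append(clause)
    return formula

# Example: a random 3-CNF on 50 variables at ratio 4.2.
sample = random_kcnf(3, 50, 4.2, random.Random(0))
```

Clauses are drawn with replacement, so a clause may repeat; for constant r and large n this is rare.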
DPLL on random 3-CNF
[Figure: probability satisfiable (from 0 to 1) and # of DPLL backtracks, plotted against r, the ratio of clauses to variables, for n = 50 variables; 4.267 is marked on the ratio axis [Mitchell, Selman, Levesque 92]]
• Proof complexity shows 2^Θ(n/r) time is required for unsatisfiable formulas for r > r_3* [B, Karp, Saks, Pitassi 98] [Ben-Sasson 02]
• What about satisfiable formulas below threshold?
Exponential lower bounds for 3-CNF formulas below ratio 4.267
Theorem: Let A ∈ {UC, UCwm, GUC} and let
  r_3^UC = 3.81, r_3^UCwm = 3.83, r_3^GUC = 4.01.
W.h.p. algorithm A takes exponential time on a random F drawn from F_{3,n}(r) for r ≥ r_3^A.
Exponential lower bounds for satisfiable formulas below the k-CNF threshold
Theorem: There exist l_k ~ 2^k/k and u_k ~ 2^k such that for every k ≥ 4 and for F drawn from F_{k,n}(r) with l_k ≤ r ≤ u_k, w.h.p.
• F is satisfiable
• UC takes exponential time on F
Note: These formulas have huge numbers of satisfying assignments (more than 2^{(1-ε)n} out of a possible 2^n) but are still hard.
Ideas
Part I:
• Use differential equations to analyze the trajectory of the algorithm as a function of the clause-variable ratio for r larger than l_k
• Use resolution proof complexity to show that some residual formula along this trajectory requires large DPLL running time
Part II:
• Show that formulas up to ratio u_k are satisfiable [Achlioptas, Peres 03]
  u_k = 2^k ln 2 - (k+4)/2
Algorithmic behavior using simple select choices
• On input F drawn from F_{k,n}(r), before the first backtrack occurs, the residual formula F' is distributed as F_2 ∧ … ∧ F_k where
  • F_j is drawn from F_{j,n'}(r_j) for j = 2, …, k, so F_j only has clauses of size j
  • The F_j are mutually independent
• Values of r_j almost surely follow algorithm-dependent trajectories given by differential equations
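For UC on random 3-CNF the trajectory has a simple closed form, which the sketch below evaluates. The formulas are an assumption brought in from the standard differential-equation analysis of UC and do not appear on the slides: after a fraction t of the variables has been set, the 3-clause ratio is r(1-t)^2 and the 2-clause ratio is (3/2) r t (1-t), both measured per remaining variable. The function names are hypothetical.

```python
import math

def uc_ratios(r, t):
    """(2-clause ratio, 3-clause ratio) per remaining variable at time t."""
    r3 = r * (1 - t) ** 2
    r2 = 1.5 * r * t * (1 - t)
    return r2, r3

def first_crossing(r):
    """Smallest t at which the 2-clause ratio reaches 1, or None.

    Solves 1.5 * r * t * (1 - t) = 1, i.e. t^2 - t + 2/(3r) = 0.
    """
    disc = 1 - 8 / (3 * r)
    if disc < 0:
        return None  # the 2-clause ratio stays below 1 throughout
    return (1 - math.sqrt(disc)) / 2

# Starting at ratio 3.81, find where the 2-clause ratio first hits 1.
t_star = first_crossing(3.81)
r2_star, r3_star = uc_ratios(3.81, t_star)
```

Under these assumed formulas, a run started at r = 3.81 reaches 2-clause ratio 1 when the 3-clause ratio is about 2.28, which matches the constants quoted elsewhere in the talk.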
Proof Complexity
• Study of the number of symbols required for proofs of unsatisfiability (or tautology) in propositional logic
• Does not address the algorithmic issue
  • How would you find short proofs if they existed?
• Existence of short proofs for every unsatisfiable formula is equivalent to NP = co-NP (and is implied by P = NP)
  • Generally believed that such proofs don't exist
• Active research area with rich theory and many open questions
Resolution
• Start with clauses of CNF formula F
• Resolution rule
  • Given (A ∨ x), (B ∨ ¬x) can derive (A ∨ B)
• The empty clause is derivable ⇔ F is unsatisfiable
• Proof size = # of clauses used
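The resolution rule is short enough to write out directly. In this sketch clauses are frozensets of nonzero integers (+i for x_i, -i for its negation); that encoding and the function name `resolve` are assumptions for illustration.

```python
def resolve(c1, c2, x):
    """From (A or x) and (B or not x), derive (A or B).

    `c1` must contain the literal x and `c2` the literal -x.
    """
    if x not in c1 or -x not in c2:
        raise ValueError("clauses do not clash on the given variable")
    return frozenset((c1 - {x}) | (c2 - {-x}))

# A three-clause refutation of (x1) and (not x1 or x2) and (not x2):
step1 = resolve(frozenset({1}), frozenset({-1, 2}), 1)  # derives (x2)
step2 = resolve(step1, frozenset({-2}), 2)              # derives the empty clause
```

Deriving the empty frozenset certifies that the starting clauses are unsatisfiable; the proof size counts the clauses used along the way.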
Resolution and DPLL
• Running DPLL with any select rule on an unsatisfiable formula F generates a Resolution refutation of F
  • # of clauses ≤ running time
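Long-running DPLL Executions
[Figure: DPLL search tree with one node marked and the label 2rn]
• Residual formula at each node is a mix of 2- and 3-clauses
• Residual formula at the marked node is unsatisfiable
• Every resolution proof of its unsatisfiability is exponentially long, so the algorithm's proof is too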
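Satisfiability for mixed random formulas: proven properties
[Figure: phase diagram with the 3-clause ratio on the horizontal axis (ticks at 2.28, 3.52, 4.501) and the 2-clause ratio on the vertical axis (ticks at 2/3, 1); regions are marked SAT, UNSAT, and ? (unresolved); results cited: [Achlioptas et al 96], [Kaporis et al 03], [Dubois 01]]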
Resolution proof complexity of mixed random formulas
Theorem [Chvatal-Reed 91], [Goerdt 91], [De La Vega 91]: A random CNF formula F drawn from F_{2,n}(r_2) is
• Satisfiable w.h.p. if r_2 < 1
• Unsatisfiable w.h.p. if r_2 > 1, and has linear-size resolution proofs
(Easy)
Theorem [Chvatal, Szemeredi 88]: For any constant r_3 > 0, w.h.p. G drawn from F_{3,n}(r_3) requires an exponential-size resolution proof of unsatisfiability (Hard)
Theorem: For any constants r_2 < 1 and r_3 > 0, w.h.p. for F drawn from F_{2,n}(r_2) and G drawn from F_{3,n}(r_3), the combined formula F ∧ G requires an exponential-size resolution proof of unsatisfiability (Easy ∧ Hard = Hard)
Sharp Threshold in Resolution Proof Complexity
• Define distribution H_n(r) on CNF formulas of the form H = F ∧ G where
  • G is drawn from F_{3,n}(r_3) for some r_3 ≥ 2.28, and
  • F is drawn from F_{2,n}(r)
• Then for H drawn from H_n(r), w.h.p.
  • H is unsatisfiable
  • For r > 1, H has O(n)-size resolution proofs
  • For r < 1, H requires 2^Ω(n)-size resolution proofs
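Trajectory on 3-CNF
[Figure: UC algorithm trajectory in the plane of 3-clause ratio (horizontal axis; ticks at 3.52, 3.81, 4.267, 4.51) against 2-clause ratio (vertical axis; tick at 1); regions are labeled "Provably UNSAT & Hard" and "Provably SAT & Easy"]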
UC trajectory for k ≥ 4
• Start with 2.75 · 2^k n/k k-clauses
• Wait until 3n/(k-1) variables remain
• With high probability:
  • The 2-clauses remained satisfiable throughout
  • The residual formula overall is unsatisfiable
  • Its resolution complexity is exponential
Directions
• What price completeness?
• Closing the gap for unsatisfiability of mixed formulas would yield an algorithm-dependent phase transition
  • Below r_A the algorithm runs in linear time
  • Above r_A the algorithm requires exponential time
• Backtracking algorithms for other random problems with phase transitions?
  • e.g. k-colorability on random graphs G(n, r/n)
  • Unsatisfiable phase exp(cn/rak) [B, Culberson, Mitchell, Moore 03]