This study delves into phase transitions in proof complexity and satisfiability search algorithms, examining the behavior of satisfiability algorithms such as GSAT and Walksat along with complete algorithms like DPLL. It explores simplification techniques, residual formulas, hard thresholds for problem difficulty, and the use of differential equations for algorithm analysis. The research also touches on proof complexity, resolution, and the pursuit of short proofs for unsatisfiable formulas in propositional logic. The text aims to shed light on the complexities and theoretical frameworks of these algorithmic processes.
Phase Transitions in Proof Complexity and Satisfiability Search
Paul Beame, University of Washington
with Dimitris Achlioptas, Microsoft Research, and Michael Molloy, U. Toronto
Satisfiability
$F = (x_1 \lor x_2 \lor x_4) \land (x_1 \lor x_3) \land (\lnot x_3 \lor x_2) \land (x_4 \lor \lnot x_3)$
Satisfying assignment for $F$: $x_1, x_2, x_3, x_4$
Given $F$, does such an assignment exist?
Satisfiability Algorithms
• Incomplete Algorithms
  • will (likely) find a satisfying assignment but will simply give up if one is not found
• Complete Algorithms
  • will either find a satisfying assignment or determine that no such assignment exists
Satisfiability Algorithms
• Incomplete Algorithms
  • Local search: GSAT [Selman, Levesque, Mitchell 92], Walksat [Kautz, Selman 96]
  • Belief Propagation: SP [Braunstein, Mezard, Zecchina 02]
• Complete Algorithms
  • Backtracking search: DPLL [Davis, Putnam 60] [Davis, Logemann, Loveland 62]
  • DPLL + “clause learning”: GRASP, SATO, zchaff
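Simplification and Satisfaction
$F = (x_1 \lor x_2 \lor x_4) \land (x_1 \lor x_3) \land (\lnot x_3 \lor x_2) \land (x_4 \lor \lnot x_3)$
Satisfying assignment for $F$: $x_1, x_2, x_3, x_4$
• Simplifying $F$ after setting literal $x_3$ to true: the satisfied clause $(x_1 \lor x_3)$ disappears and $\lnot x_3$ is deleted from the clauses that contain it
  $F|_{x_3} = (x_1 \lor x_2 \lor x_4) \land (x_2) \land (x_4)$, where $(x_2)$ and $(x_4)$ are 1-clauses
• $F$ is satisfied if all clauses disappear under simplification given the assignment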
Backtracking search/DPLL
DPLL(F)
  while F contains a 1-clause l'
    F ← F|l'
  if F has no clauses
    output 'satisfiable'
    halt
  if F has an empty clause
    backtrack
  else
    select a literal l = some x or ¬x
    DPLL(F|l)
    if backtrack then DPLL(F|¬l)
Residual formula: F|l
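The backtracking procedure on this slide can be sketched in Python. This is a minimal illustration, not the implementation used in the talk. The integer encoding of literals (`+v` for a variable, `-v` for its negation), the helper `simplify`, and the default `select` rule are all assumptions introduced here.

```python
from typing import Callable, FrozenSet, Iterable, List, Optional, Set

Clause = FrozenSet[int]  # assumed encoding: +v is variable v, -v is its negation


def simplify(clauses: List[Clause], lit: int) -> List[Clause]:
    """Return F|lit: drop clauses satisfied by lit, delete -lit from the rest."""
    return [c - {-lit} for c in clauses if lit not in c]


def first_literal(clauses: List[Clause]) -> int:
    """Placeholder select rule (an assumption): any literal of the first clause."""
    return next(iter(clauses[0]))


def dpll(clauses: Iterable[Iterable[int]],
         select: Optional[Callable[[List[Clause]], int]] = None) -> Optional[Set[int]]:
    """Return a set of true literals satisfying the formula, or None if unsatisfiable."""
    select = select or first_literal
    residual = [frozenset(c) for c in clauses]
    assigned: Set[int] = set()

    # Unit propagation: repeatedly satisfy 1-clauses.
    while True:
        if any(len(c) == 0 for c in residual):
            return None  # empty clause: backtrack
        unit = next((c for c in residual if len(c) == 1), None)
        if unit is None:
            break
        (lit,) = unit
        assigned.add(lit)
        residual = simplify(residual, lit)

    if not residual:
        return assigned  # no clauses left: satisfiable

    # Branch on the selected literal, then on its negation if the first branch fails.
    lit = select(residual)
    for choice in (lit, -lit):
        result = dpll(simplify(residual, choice), select)
        if result is not None:
            return assigned | {choice} | result
    return None
```

Calling `dpll([[1, 2], [-1, 2], [-2, 3]])` returns a set of literals that satisfies every clause, while `dpll([[1], [-1]])` returns `None`. Variables that never need a value are left out of the returned set.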
Some standard select choices for DPLL algorithms
• UC: Unit Clause/Ordered DLL
  • Choose variables in a fixed order
  • Always set True first
• UCwm: Unit Clause with majority
  • Choose variables in a fixed order
  • Apply a majority vote among 3-clauses for assigning each value
• GUC: Generalized Unit Clause
  • Choose a variable v in a shortest clause C
  • Set v to satisfy C
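The three select rules above could look roughly like this in Python. This is a sketch under stated assumptions: literals are integers (`+v` / `-v`), the "fixed order" is increasing variable index over variables still present in the residual formula, ties in the majority vote go to True, and GUC picks deterministically where the analyzed algorithm would typically pick at random. The function names are hypothetical.

```python
from typing import FrozenSet, List

Clause = FrozenSet[int]  # assumed encoding: +v is variable v, -v is its negation


def lowest_variable(clauses: List[Clause]) -> int:
    """Smallest variable index still occurring in the residual formula."""
    return min(abs(lit) for c in clauses for lit in c)


def uc_select(clauses: List[Clause]) -> int:
    """UC: next variable in the fixed order, always tried True first."""
    return lowest_variable(clauses)


def ucwm_select(clauses: List[Clause]) -> int:
    """UCwm: next variable in the fixed order, sign chosen by majority among 3-clauses."""
    v = lowest_variable(clauses)
    pos = sum(1 for c in clauses if len(c) == 3 and v in c)
    neg = sum(1 for c in clauses if len(c) == 3 and -v in c)
    return v if pos >= neg else -v  # tie broken toward True (assumption)


def guc_select(clauses: List[Clause]) -> int:
    """GUC: a literal from a shortest clause, so that setting it true satisfies the clause."""
    shortest = min(clauses, key=len)
    return min(shortest, key=abs)  # deterministic pick within the clause (assumption)
```

Each function returns the literal to try first; a DPLL driver would branch on its negation after a backtrack.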
Random k-CNF formulas
• Distribution $F_{k,n}(r)$
  • Randomly choose $rn$ clauses over $n$ variables independently, each of size $k$
  • Each size-$k$ clause is equally likely
• Threshold value $r_k^*$
  • $r < r_k^*$: almost certainly satisfiable
  • $r > r_k^*$: almost certainly unsatisfiable
• Hardest problems near threshold
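A sampler for this distribution can be sketched as follows. The function name `random_kcnf`, the integer literal encoding, and rounding `r * n` to the nearest integer are assumptions made for illustration. Each clause uses k distinct variables with independent random signs, and clauses are drawn independently, so repeats are possible.

```python
import random
from typing import FrozenSet, List, Optional


def random_kcnf(n: int, r: float, k: int, seed: Optional[int] = None) -> List[FrozenSet[int]]:
    """Draw round(r * n) clauses of size k over variables 1..n, uniformly and independently."""
    rng = random.Random(seed)
    m = round(r * n)  # number of clauses; rounding is an assumption
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)  # k distinct variables
        clause = frozenset(v if rng.random() < 0.5 else -v for v in variables)
        clauses.append(clause)
    return clauses
```

For example, `random_kcnf(50, 4.0, 3, seed=0)` yields 200 clauses of size 3 over 50 variables.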
DPLL on random 3-CNF
[Figure: number of DPLL backtracks and probability of satisfiability (from 0 to 1) versus the ratio of clauses to variables, for n = 50 variables, with the threshold marked at 4.267. [Mitchell, Selman, Levesque 92]]
• Proof complexity shows that $2^{\Theta(n/r)}$ time is required for unsatisfiable formulas for $r > r_3^*$ [B, Karp, Saks, Pitassi 98] [Ben-Sasson 02]
• What about satisfiable formulas below threshold?
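Exponential lower bounds for 3-CNF formulas below ratio 4.267
Theorem. Let $A \in \{\mathrm{UC}, \mathrm{UCwm}, \mathrm{GUC}\}$ and let
  $r_3^{\mathrm{UC}} = 3.81$, $r_3^{\mathrm{UCwm}} = 3.83$, $r_3^{\mathrm{GUC}} = 4.01$.
Then w.h.p. algorithm $A$ takes exponential time on a random $F$ drawn from $F_{3,n}(r)$ for $r \ge r_3^A$.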
Exponential lower bounds for satisfiable formulas below the k-CNF threshold
Theorem. There exist $l_k \sim 2^k/k$ and $u_k \sim 2^k$ s.t. for every $k \ge 4$ and for $F$ drawn from $F_{k,n}(r)$ with $l_k \le r \le u_k$, w.h.p.
• $F$ is satisfiable
• UC takes exponential time on $F$
Note: These formulas have huge numbers of satisfying assignments (more than $2^{(1-\epsilon)n}$ out of a possible $2^n$) but are still hard
Ideas
Part I (for $r$ larger than $l_k$):
• Use differential equations to analyze the trajectory of the algorithm as a function of the clause-variable ratio
• Use resolution proof complexity to show that some residual formula along this trajectory requires large DPLL running time
Part II:
• Show that formulas up to ratio $u_k$ are satisfiable [Achlioptas, Peres 03]: $u_k = 2^k \ln 2 - (k+4)/2$
Algorithmic behavior using simple select choices
• On input $F$ drawn from $F_{k,n}(r)$, before the first backtrack occurs, the residual formula $F'$ is distributed as $F_2 \land \dots \land F_k$ where
  • $F_j$, drawn from $F_{j,n'}(r_j)$ for $j = 2, \dots, k$, only has clauses of size $j$
  • the $F_j$ are mutually independent
• Values of $r_j$ almost surely follow algorithm-dependent trajectories given by differential equations
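As an illustration of such a trajectory, the sketch below uses the standard flow equations for UC on random 3-CNF. The slides do not spell these out, so treat them as an assumption. With $x$ the fraction of variables set so far and $c_2, c_3$ the numbers of 2- and 3-clauses divided by $n$, the equations $dc_3/dx = -3c_3/(1-x)$ and $dc_2/dx = 3c_3/(2(1-x)) - 2c_2/(1-x)$ have the solution $c_3 = r(1-x)^3$, $c_2 = \tfrac{3}{2} r x (1-x)^2$. Dividing by the remaining fraction $1-x$ gives the ratios per remaining variable used in the code. The function names are hypothetical.

```python
import math
from typing import Optional, Tuple


def uc_trajectory(r: float, x: float) -> Tuple[float, float]:
    """(2-clause ratio, 3-clause ratio) of the residual formula after a fraction x
    of the variables has been set, starting from a random 3-CNF of ratio r.
    Assumes the standard UC flow equations and no backtrack so far."""
    r3 = r * (1 - x) ** 2
    r2 = 1.5 * r * x * (1 - x)
    return r2, r3


def first_crossing(r: float) -> Optional[float]:
    """Smallest x at which the 2-clause ratio reaches 1, or None if it never does.
    Solves 1.5 * r * x * (1 - x) = 1, which has real roots only when r >= 8/3."""
    disc = 1 - 8 / (3 * r)
    if disc < 0:
        return None
    return (1 - math.sqrt(disc)) / 2


if __name__ == "__main__":
    x = first_crossing(3.81)
    print(x, uc_trajectory(3.81, x))
```

Under these equations, a trajectory started at r = 3.81 reaches 2-clause ratio 1 when the 3-clause ratio is about 2.28, which matches the values 3.81 and 2.28 quoted elsewhere in these slides. For r below 8/3 the 2-clause ratio never reaches 1.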
Proof Complexity
• Study of the number of symbols required for proofs of unsatisfiability (or tautology) in propositional logic
• Does not address the algorithmic issue
  • How would you find short proofs if they existed?
• Existence of short proofs for every unsatisfiable formula is equivalent to NP = co-NP (and is implied by P = NP)
  • Generally believed that such proofs don’t exist
• Active research area with rich theory and many open questions
Resolution
• Start with clauses of CNF formula $F$
• Resolution rule
  • Given $(A \lor x)$, $(B \lor \lnot x)$ can derive $(A \lor B)$
• The empty clause is derivable $\iff$ $F$ is unsatisfiable
• Proof size = # of clauses used
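The resolution rule is small enough to write out directly. The sketch below assumes the integer literal encoding `+v` / `-v`; the function name `resolve` is hypothetical.

```python
from typing import FrozenSet, Iterable


def resolve(c1: Iterable[int], c2: Iterable[int], x: int) -> FrozenSet[int]:
    """From (A or x) and (B or not x), derive (A or B).
    c1 must contain the literal x and c2 must contain -x."""
    c1, c2 = frozenset(c1), frozenset(c2)
    if x not in c1 or -x not in c2:
        raise ValueError("c1 must contain x and c2 must contain -x")
    return (c1 - {x}) | (c2 - {-x})


if __name__ == "__main__":
    # Refutation of (x1) and (not x1 or x2) and (not x2), using 2 derived clauses.
    step1 = resolve({1}, {-1, 2}, 1)   # derives (x2)
    step2 = resolve(step1, {-2}, 2)    # derives the empty clause
    print(step1, step2)
```

Deriving the empty clause, as in the last step, certifies that the starting clauses are unsatisfiable.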
Resolution and DPLL
• Running DPLL with any select rule on an unsatisfiable formula $F$ generates a Resolution refutation of $F$
• # of clauses $\le$ running time
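Long-running DPLL Executions
[Figure: DPLL search tree with one highlighted node; one label on the figure reads "2rn".]
• Residual formula at each node is a mix of 2- and 3-clauses
• Residual formula at the highlighted node is unsatisfiable
• Every resolution proof of its unsatisfiability is exponentially long, so the algorithm’s proof of unsatisfiability is exponentially long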
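Satisfiability for mixed random formulas: proven properties
[Figure: phase diagram with the 3-clause ratio on the horizontal axis (ticks at 2/3, 2.28, 3.52, 4.501) and the 2-clause ratio on the vertical axis (tick at 1), showing regions proven SAT and proven UNSAT and an undetermined region marked "?". Citations on the figure: [Achlioptas et al 96], [Kaporis et al 03], [Dubois 01].]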
Resolution proof complexity of mixed random formulas
Theorem (Easy). A random 2-CNF formula $F$ drawn from $F_{2,n}(r_2)$ is
• satisfiable w.h.p. if $r_2 < 1$
• unsatisfiable w.h.p. if $r_2 > 1$, and has linear-size resolution proofs
[Chvatal-Reed 91], [Goerdt 91], [De La Vega 91]
Theorem (Hard). For any constant $r_3 > 0$, w.h.p. $G$ drawn from $F_{3,n}(r_3)$ requires an exponential-size resolution proof of unsatisfiability [Chvatal, Szemeredi 88]
Theorem (Easy $\land$ Hard = Hard). For any constants $r_2 < 1$ and $r_3 > 0$, w.h.p. for $F$ drawn from $F_{2,n}(r_2)$ and $G$ drawn from $F_{3,n}(r_3)$, the combined formula $F \land G$ requires an exponential-size resolution proof of unsatisfiability
Sharp Threshold in Resolution Proof Complexity
• Define distribution $H_n(r)$ on CNF formulas of the form $H = F \land G$ where
  • $G$ is drawn from $F_{3,n}(r_3)$ for some $r_3 \ge 2.28$ and
  • $F$ is drawn from $F_{2,n}(r)$.
• Then for $H$ drawn from $H_n(r)$, w.h.p.
  • $H$ is unsatisfiable
  • For $r > 1$, $H$ has $O(n)$ size resolution proofs
  • For $r < 1$, $H$ requires $2^{\Omega(n)}$ size resolution proofs
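Trajectory on 3-CNF
[Figure: UC algorithm trajectory plotted with the 3-clause ratio on the horizontal axis (ticks at 3.52, 3.81, 4.267, 4.51) and the 2-clause ratio on the vertical axis (tick at 1), with regions labeled "Provably SAT & Easy" and "Provably UNSAT & Hard".]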
UC trajectory for $k \ge 4$
• Start with $2.75 \cdot 2^k n/k$ $k$-clauses
• Wait until $3n/(k-1)$ variables remain
• With high probability:
  • The 2-clauses remained satisfiable throughout
  • The residual formula overall is unsatisfiable
  • Its resolution complexity is exponential
Directions
• What price completeness?
• Closing the gap for unsatisfiability of mixed formulas would yield an algorithm-dependent phase transition
  • Below $r_A$ the algorithm runs in linear time
  • Above $r_A$ the algorithm requires exponential time
• Backtracking algorithms for other random problems with phase transitions?
  • e.g. k-colorability on random graphs $G(n, r/n)$
  • Unsatisfiable phase: exp(cn/rak) [B, Culberson, Mitchell, Moore 03]