Propositional Satisfiability (SAT). Toby Walsh Cork Constraint Computation Centre University College Cork Ireland 4c.ucc.ie/~tw/sat/. Outline. What is SAT? How do we solve SAT? Why is SAT important?. Propositional satisfiability (SAT).
Propositional Satisfiability (SAT) Toby Walsh Cork Constraint Computation Centre University College Cork Ireland 4c.ucc.ie/~tw/sat/
Outline • What is SAT? • How do we solve SAT? • Why is SAT important?
Propositional satisfiability (SAT) • Given a propositional formula, does it have a “model” (satisfying assignment)? • 1st decision problem shown to be NP-complete • Usually focus on formulae in clausal normal form (CNF) • Examples: {P iff Q, Q iff R, -(P iff R)}; {P -> Q, -Q, P}; {P v Q, P & -Q}
Clausal Normal Form • Formula is a conjunction of clauses • C1 & C2 & … • Each clause is a disjunction of literals • L1 v L2 v L3, … • Empty clause contains no literals (=False) • Unit clause contains a single literal • Each literal is a variable or its negation • P, -P, Q, -Q, … • Examples: {P v Q, -P v -Q}; {P v -Q, Q v -R, P v -R}; {P v Q v R, -P v -R}
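The clausal representation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the integer encoding of literals (as in the DIMACS convention) and the names `satisfies` and `truth_table_sat` are assumptions made here. It checks the three example clause sets by brute-force truth table, which is exponential in the number of variables.

```python
from itertools import product

# Assumed encoding: a literal is a nonzero int, +v for variable v and -v for
# its negation. A clause is a list of literals, a formula a list of clauses.

def satisfies(assignment, clauses):
    """True if every clause contains at least one literal made true."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def truth_table_sat(clauses):
    """Return a model as {variable: bool}, or None if there is none."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(assignment, clauses):
            return assignment
    return None

P, Q, R = 1, 2, 3
examples = [
    [[P, Q], [-P, -Q]],
    [[P, -Q], [Q, -R], [P, -R]],
    [[P, Q, R], [-P, -R]],
]
for clauses in examples:
    print(truth_table_sat(clauses))
```

Note that the empty clause is never satisfied (`any` over nothing is false) and the empty formula is trivially satisfied, matching the definitions on the slide.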
Clausal Normal Form • k-CNF • Each clause has k literals • 3-CNF • NP-complete • Best current complete methods are exponential • 2-CNF • Polynomial (indeed, linear time) • Examples: {P v Q, -P v -Q}; {P v -Q, Q v -R, P v -R}; {P v Q v R, -P v -R}
How do we solve SAT? • Systematic methods • Truth tables • Davis Putnam procedure • Local search methods • GSAT • WalkSAT • Tabu search, SA, … • Exotic methods • DNA, quantum computing, …
Procedure DPLL(C)
(SAT) if C = {} then SAT
(Empty) if empty clause in C then UNSAT
(Unit) if unit clause {l} in C then DPLL(C[l/True])
(Split) choose a literal l; if DPLL(C[l/True]) then SAT else DPLL(C[l/False])
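The four rules of the procedure can be sketched directly in Python. This is a minimal sketch under stated assumptions, not the slides' implementation: literals are nonzero integers (negative for negated), the helper `simplify` stands for the substitution C[l/True], and the split rule simply branches on the first literal of the first clause, where real solvers use far better heuristics.

```python
def simplify(clauses, lit):
    """C[lit/True]: drop satisfied clauses, delete the negated literal elsewhere."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue  # clause is satisfied by lit
        result.append([x for x in clause if x != -lit])
    return result

def dpll(clauses):
    """Return True iff the list of clauses is satisfiable."""
    if not clauses:                        # (SAT) no clauses left
        return True
    if any(len(c) == 0 for c in clauses):  # (Empty) empty clause present
        return False
    for c in clauses:                      # (Unit) propagate a unit clause
        if len(c) == 1:
            return dpll(simplify(clauses, c[0]))
    lit = clauses[0][0]                    # (Split) naive choice of literal
    return dpll(simplify(clauses, lit)) or dpll(simplify(clauses, -lit))

print(dpll([[1, 2], [-1, -2]]))
print(dpll([[1, 2], [1, -2], [-1, 2], [-1, -2]]))
```

The recursion depth is bounded by the number of variables, so Python's default recursion limit is only a concern for large instances.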
GSAT [Selman, Levesque, Mitchell AAAI 92] • Repeat MAX-TRIES times or until all clauses are satisfied • T := random truth assignment • Repeat MAX-FLIPS times or until all clauses are satisfied • v := variable whose flip maximizes the number of satisfied clauses • T := T with v’s value flipped
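The GSAT loop above might look as follows in Python. This is a sketch only: the integer literal encoding, the default parameter values, the random tie-breaking among equally good flips, and the optional `rng` argument are all assumptions made for illustration. It assumes at least one variable.

```python
import random

def num_satisfied(t, clauses):
    """Count clauses with at least one literal true under assignment t."""
    return sum(any(t[abs(l)] == (l > 0) for l in c) for c in clauses)

def gsat(clauses, n_vars, max_tries=10, max_flips=100, rng=None):
    """Return a satisfying assignment {var: bool}, or None if none was found."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        t = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(max_flips):
            if num_satisfied(t, clauses) == len(clauses):
                return t
            # Score every variable by the satisfied count after flipping it.
            best_score, best_vars = -1, []
            for v in t:
                t[v] = not t[v]
                score = num_satisfied(t, clauses)
                t[v] = not t[v]
                if score > best_score:
                    best_score, best_vars = score, [v]
                elif score == best_score:
                    best_vars.append(v)
            v = rng.choice(best_vars)  # break ties at random (an assumption)
            t[v] = not t[v]
        if num_satisfied(t, clauses) == len(clauses):
            return t
    return None

print(gsat([[1, 2], [-1, -2]], n_vars=2))
```

Returning `None` only means the search gave up: GSAT is incomplete and cannot prove unsatisfiability.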
WalkSAT [Selman, Kautz, Cohen AAAI 94] • Repeat MAX-TRIES times or until all clauses are satisfied • T := random truth assignment • Repeat MAX-FLIPS times or until all clauses are satisfied • c := unsatisfied clause chosen at random • v := variable in c chosen either greedily or at random • T := T with v’s value flipped • Focuses on UNSAT clauses
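A rough Python sketch of the WalkSAT loop follows. Several details are assumptions not fixed by the slide: the `noise` probability of a random rather than greedy pick, the greedy criterion (here, the flip leaving the fewest unsatisfied clauses), the default limits, and the integer literal encoding. It also assumes the formula contains no empty clause.

```python
import random

def is_sat(clause, t):
    """True if some literal of the clause is true under assignment t."""
    return any(t[abs(l)] == (l > 0) for l in clause)

def walksat(clauses, n_vars, max_tries=10, max_flips=1000, noise=0.5, rng=None):
    """Return a satisfying assignment {var: bool}, or None if none was found."""
    rng = rng or random.Random()

    def cost_after_flip(var, t):
        # Number of unsatisfied clauses if var were flipped.
        t[var] = not t[var]
        n = sum(not is_sat(d, t) for d in clauses)
        t[var] = not t[var]
        return n

    for _ in range(max_tries):
        t = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(max_flips):
            unsat = [c for c in clauses if not is_sat(c, t)]
            if not unsat:
                return t
            c = rng.choice(unsat)               # focus on an UNSAT clause
            if rng.random() < noise:
                v = abs(rng.choice(c))          # random walk step
            else:
                v = min((abs(l) for l in c), key=lambda x: cost_after_flip(x, t))
            t[v] = not t[v]
        if all(is_sat(c, t) for c in clauses):
            return t
    return None

print(walksat([[1, 2], [-1, -2]], n_vars=2))
```

Because the flipped variable always comes from an unsatisfied clause, every flip fixes at least that clause, which is the sense in which the method focuses on UNSAT clauses.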
Why is SAT important? • Computational complexity • 1st problem shown NP-complete • Can therefore be used in theory to solve any NP-complete problem • Many direct applications
Some applications of SAT • Hardware design • Signals • Hi = True • Lo = False • Gates • AND gate = and connective • Inverter gate = not connective • …
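To make the gate-to-connective mapping concrete, here is a small sketch that encodes an AND gate and an inverter as clauses and enumerates the consistent signal values of a NAND circuit built from them. The clause encodings are the standard ones for z <-> (x AND y) and z <-> (NOT x); the function names, the variable numbering, and the brute-force enumeration are assumptions made for illustration.

```python
from itertools import product

def and_gate(x, y, z):
    """Clauses forcing output z to equal (x AND y)."""
    return [[-z, x], [-z, y], [z, -x, -y]]

def inverter(x, z):
    """Clauses forcing output z to equal (NOT x)."""
    return [[z, x], [-z, -x]]

def models(clauses, n_vars):
    """Yield every satisfying assignment as a tuple of bools (index 0 = variable 1)."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            yield bits

# NAND circuit: w = x AND y, out = NOT w, with x=1, y=2, w=3, out=4.
nand = and_gate(1, 2, 3) + inverter(3, 4)
for x, y, w, out in models(nand, 4):
    print(x, y, out)
```

Each model of the clause set corresponds to one row of the circuit's truth table, so asking a SAT solver for a model with a given output value is a way of asking whether the circuit can produce it.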
Some applications of SAT • Hardware design • State of the art • HP verified 1/7th of the DEC Alpha chip using a DP solver • 100,000s of variables • 1,000,000s of clauses • Modelling environment is one of the biggest problems
Some applications of SAT • Planning • But planning is undecidable in general • Even propositional STRIPS planning is PSPACE-complete! • How can a SAT solver, which only solves problems in NP, be used then?
Some applications of SAT • Planning as SAT • Put bound on plan length • If bound too small, UNSAT • Introduce new propositional variables for each time step
Some applications of SAT • Diagnosis as SAT • Otherwise known as “SAT in space” • Deep Space One spacecraft • Propositional theory to monitor, diagnose and repair faults • Runs in LISP!
Computational complexity • Study of “problem hardness” • Typically worst case • Big O analysis • Sorting is easy, O(n log n) • Chess and Go are hard, EXP-time • “Can I be sure to win?” • Need to generalize problem to n by n board • Where do things start getting hard?
Computational complexity • Hierarchy of complexity classes • Polynomial (P), NP, PSpace, … • NP-complete problems mark boundary of tractability • No known polynomial time algorithm • Though it is open whether P =/= NP
NP-complete problems • Non-deterministic Polynomial time • If I guess a solution, I can check it in polynomial time • But no known easy way to guess solution correctly! • Complete • Representative of all problems in this class • If this problem can be solved in polynomial time, all problems in the class can be solved • Any NP-complete problem can be mapped into any other
NP-complete problems • Many examples • Propositional satisfiability (SAT) • Graph colouring • Travelling salesperson problem • Exam timetabling • …
SAT is NP-complete • Cook (1971) showed that the computation of any non-deterministic polynomial-time Turing machine can be encoded in SAT => There is a polynomial reduction of any problem in NP to SAT • But not all SAT problems are equally hard!
SAT phase transition [Mitchell, Selman, Levesque AAAI-92] • Random k-SAT • sample uniformly from space of all possible k-clauses • n variables, l clauses • Rapid transition in satisfiability • 2-SAT transition occurs at l/n=1 [Chvátal & Reed 92, Goerdt 92] • 3-SAT transition occurs at 3.26 < l/n < 4.598
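A random k-SAT instance as described above can be generated in a few lines. This is a sketch under assumptions: each clause uses k distinct variables with independently random signs, clauses are drawn independently (so repeats are possible), and the name `random_ksat` and the integer literal encoding are illustrative choices.

```python
import random

def random_ksat(n, l, k=3, rng=None):
    """Return l random k-clauses over variables 1..n as lists of signed ints."""
    rng = rng or random.Random()
    clauses = []
    for _ in range(l):
        variables = rng.sample(range(1, n + 1), k)  # k distinct variables
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

# An instance near the hard region for 3-SAT: l/n = 4.3 with n = 20.
instance = random_ksat(n=20, l=86, k=3)
print(len(instance), instance[:3])
```

Sweeping the ratio l/n for fixed n and recording the fraction of satisfiable instances is the usual way to observe the transition empirically.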
Random 3-SAT • Which are the hard instances? • around l/n = 4.3 What happens with larger problems? Why are some dots red and others blue?
Random 3-SAT Complexity peak coincides with solubility transition • l/n < 4.3 problems under-constrained and SAT • l/n > 4.3 problems over-constrained and UNSAT • l/n=4.3, problems on “knife-edge” between SAT and UNSAT
Random 3-SAT • Varying problem size, n • Complexity peak appears to be largely invariant of algorithm • backtracking algorithms like Davis-Putnam • local search procedures like GSAT
3SAT phase transition • Lower bounds (hard) • Analyse algorithm that almost always solves problem • Backtracking hard to reason about so typically without backtracking • Complex branching heuristics needed to ensure success • But these are complex to reason about
3SAT phase transition • Upper bounds (easier) • Typically by estimating count of solutions • E.g. Markov (or 1st moment) method
For any non-negative statistic X, prob(X>=1) <= E[X]
No assumptions about the distribution of X except non-negative!
Let X be the number of satisfying assignments for a 3SAT problem
The expected value of X can be easily calculated: E[X] = 2^n * (7/8)^l
If E[X] < 1, then prob(SAT) = prob(X>=1) <= E[X] < 1
E[X] < 1 means 2^n * (7/8)^l < 1
n + l log2(7/8) < 0
l/n > 1/log2(8/7) = 5.19…
For any fixed l/n above this value, E[X] tends to 0 as n grows, so prob(SAT) tends to 0
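The arithmetic in the first moment bound is easy to check numerically. The sketch below is illustrative only; the function names are made up here, and the generalisation from 3 to arbitrary clause length k (where each clause rules out a fraction 2^-k of assignments) is an addition beyond the slide, which treats k = 3.

```python
import math

def log2_expected_models(n, l, k=3):
    """log2 of E[X] = 2^n * (1 - 2^-k)^l for random k-SAT."""
    return n + l * math.log2(1 - 2.0 ** -k)

def first_moment_bound(k=3):
    """Ratio l/n above which E[X] < 1, i.e. 1 / log2(2^k / (2^k - 1))."""
    return 1.0 / math.log2(2.0 ** k / (2.0 ** k - 1))

print(round(first_moment_bound(3), 2))   # 5.19
print(log2_expected_models(100, 400))    # positive: E[X] > 1 at l/n = 4
print(log2_expected_models(100, 600))    # negative: E[X] < 1 at l/n = 6
```

Working with log2 of E[X] avoids overflow for large n and makes the sign change at the bound easy to see.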
3SAT phase transition • Upper bounds (easier) • Typically by estimating count of solutions • To get tighter bounds than 5.19, can refine the counting argument • E.g. not count all solutions but just those maximal under some ordering
SAT phase transition • Shape of transition • “sharp” both for 2-SAT and 3-SAT [Friedgut 99] • Backbone (dis)continuity • 2-SAT transition is "2nd order", continuous • 3-SAT transition is "1st order", discontinuous • backbone = variables whose truth values are fixed when we satisfy as many clauses as possible [Monasson et al. 1998], …
2+p-SAT Morph between 2-SAT and 3-SAT • fraction p of 3-clauses • fraction (1-p) of 2-clauses [Monasson et al 1999]
2+p-SAT • Maps from P to NP • NP-complete for any p>0 • Insight into change from P to NP, continuous to discontinuous, …? [Monasson et al 1999]
2+p-SAT • Observed search cost • linear for p<0.4 • exponential for p>0.4 • But NP-hard for all p>0!
2+p-SAT • [Figure: regions labelled "Discontinuous, 3-SAT like" and "Continuous, 2-SAT like"]
Simple bound • Are the 2-clauses UNSAT? • 2-clauses are more constraining than 3-clauses • For p<0.4, transition occurs at lower bound! • 3-clauses are not contributing
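The simple bound can be written out as a worked inequality. This is a sketch that assumes the 2-clauses of a 2+p-SAT instance behave like a random 2-SAT instance with (1-p)l clauses over n variables, and uses the 2-SAT transition at a clause-to-variable ratio of 1. The 2-clauses alone are then almost always UNSAT once

```latex
\[
\frac{(1-p)\,l}{n} > 1
\quad\Longleftrightarrow\quad
\frac{l}{n} > \frac{1}{1-p},
\qquad\text{e.g. for } p = 0.4:\quad \frac{l}{n} > \frac{1}{0.6} \approx 1.67 .
\]
```

If the 2-clauses are UNSAT then so is the whole formula, so the 2+p-SAT transition cannot lie above 1/(1-p).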
The real world isn’t random? • Very true! Can we identify structural features common in real world problems? • Consider graphs met in real world situations • social networks • electricity grids • neural networks • ...
Real versus Random • Real graphs tend to be sparse • dense random graphs contain lots of (rare?) structure • Real graphs tend to have short path lengths • as do random graphs • Real graphs tend to be clustered • unlike sparse random graphs • L, average path length • C, clustering coefficient (fraction of neighbours connected to each other, cliqueness measure) • mu, proximity ratio is C/L normalized by that of random graph of same size and density
Small world graphs • Sparse, clustered, short path lengths • Six degrees of separation • Stanley Milgram’s famous 1967 postal experiment • recently revived by Watts & Strogatz • shown applies to: • actors database • US electricity grid • neural net of a worm • ...
An example • 1994 exam timetable at Edinburgh University • 59 nodes, 594 edges so relatively sparse • but contains 10-clique • less than 10^-10 chance in a random graph • assuming same size and density • clique totally dominated cost to solve problem
Small world graphs • To construct an ensemble of small world graphs • morph between regular graph (like ring lattice) and random graph • with prob p include edge from ring lattice, with prob 1-p from random graph • real problems often contain similar structure and stochastic components?
Small world graphs • ring lattice is clustered but has long paths • random edges provide shortcuts without destroying clustering
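The effect of shortcuts on path length and clustering can be explored with a small pure-Python sketch. Everything here is an illustrative assumption rather than the construction used in the slides: each ring-lattice edge is kept with probability p and otherwise replaced by an edge from the same node to a random non-neighbour, the clustering coefficient is averaged over all nodes, and path length is averaged over reachable pairs only.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbours on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def small_world(n, k, p, rng=None):
    """Keep each lattice edge with probability p, else swap in a random edge."""
    rng = rng or random.Random()
    adj = ring_lattice(n, k)
    for i in range(n):
        for j in sorted(adj[i]):
            if i < j and rng.random() >= p:
                candidates = [m for m in range(n) if m != i and m not in adj[i]]
                if candidates:
                    m = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(m)
                    adj[m].add(i)
    return adj

def clustering(adj):
    """C: average fraction of a node's neighbour pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        nb = list(nbrs)
        d = len(nb)
        if d < 2:
            continue
        links = sum(1 for a in range(d) for b in range(a + 1, d)
                    if nb[b] in adj[nb[a]])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)

def avg_path_length(adj):
    """L: mean shortest-path length over reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

for p in (1.0, 0.9, 0.0):
    g = small_world(100, 2, p, random.Random(0))
    print(p, round(avg_path_length(g), 2), round(clustering(g), 2))
```

With p = 1.0 the graph is the pure ring lattice; lowering p slightly adds a few shortcuts, which is the regime where short paths and high clustering are expected to coexist.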