SAT Problem Definition KR with SAT Tractable Subclasses DPLL Search Algorithm Slides by: Florent Madelaine Roberto Sebastiani Edmund Clarke Sharad Malik Toby Walsh Kostas Stergiou. Material of lectures on SAT. SAT definitions Tractable subclasses
SAT Problem Definition KR with SAT Tractable Subclasses DPLL Search Algorithm Slides by: Florent Madelaine Roberto Sebastiani Edmund Clarke Sharad Malik Toby Walsh Kostas Stergiou KNOWLEDGE REPRESENTATION & REASONING - SAT
Material of lectures on SAT
• SAT definitions
• Tractable subclasses
  • Horn-SAT
  • 2-SAT
• CNF
• Algorithms for SAT
  • DPLL-based
    • Basic chronological backtracking algorithm
    • Branching heuristics
    • Look-ahead (propagation)
    • Backjumping and learning
  • Local Search
    • GSAT
    • WalkSAT
  • Other enhancements
• Application of SAT
  • Planning as satisfiability
  • Hardware verification
What is SAT?
Given a propositional formula in Conjunctive Normal Form (CNF), find an assignment to Boolean variables that makes the formula true:
c1 = (x2 ∨ x3)
c2 = (x1 ∨ x4)
c3 = (x2 ∨ x4)
A = {x1=0, x2=1, x3=0, x4=1} is a SATisfying assignment!
Why do we study SAT?
• Fundamental problem from a theoretical point of view
  • NP-completeness
  • First problem to be proved NP-complete (Cook's theorem)
  • Reduction to SAT is often used to prove NP-completeness for other problems
  • Studies on tractability
• Numerous applications:
  • CAD, VLSI
  • Combinatorial Optimization
  • Bounded Model Checking and other types of formal software and hardware verification
  • AI, planning, automated deduction
Representing knowledge using SAT
• Embassy ball (a diplomatic problem)
  • King wants to invite PERU or exclude QATAR
  • Queen wants to invite QATAR or ROMANIA
  • King wants to exclude ROMANIA or PERU
• Who can we invite?
(P ∨ ¬Q) ∧ (Q ∨ R) ∧ (¬R ∨ ¬P) is satisfied by P=true, Q=true, R=false and by P=false, Q=false, R=true
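The embassy-ball answer can be checked by brute force. The sketch below is illustrative only: the numbering 1 = PERU, 2 = QATAR, 3 = ROMANIA and the helper names are assumptions, and clauses are written as lists of signed integers (a negative number is a negated variable).

```python
from itertools import product

# Assumed numbering: 1 = PERU, 2 = QATAR, 3 = ROMANIA; -n means "not n".
embassy = [[1, -2], [2, 3], [-3, -1]]

def satisfies(assignment, clauses):
    """True if every clause contains at least one literal made true."""
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)

def all_models(clauses, n_vars):
    """Enumerate all 2^n total assignments and keep the satisfying ones."""
    models = []
    for values in product([True, False], repeat=n_vars):
        assignment = dict(zip(range(1, n_vars + 1), values))
        if satisfies(assignment, clauses):
            models.append(assignment)
    return models

models = all_models(embassy, 3)
```

Enumeration is exponential in the number of variables, which is why the rest of the lecture looks at smarter algorithms.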
Other applications of SAT
• Hardware verification
S = Cin ⊕ (P ⊕ Q), …
Formulation of a famous problem as SAT: k-Coloring
The k-Coloring problem: Given an undirected graph G(V,E) and a natural number k, is there an assignment color: V → {1, …, k} such that color(u) ≠ color(v) for every edge (u,v) ∈ E?
x_{i,j} = node i is assigned the 'color' j (1 ≤ i ≤ n, 1 ≤ j ≤ k)
Constraints:
i) At least one color to each node: (x_{1,1} ∨ x_{1,2} ∨ … ∨ x_{1,k}) for node 1, and likewise for every node i
ii) At most one color to each node: (¬x_{i,j} ∨ ¬x_{i,j'}) for every node i and all 1 ≤ j < j' ≤ k
iii) Coloring constraints: (¬x_{u,j} ∨ ¬x_{v,j}) for every edge (u,v) ∈ E and every color j
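A minimal sketch of a clause generator for this encoding. It assumes the usual pairwise form for the at-most-one and edge constraints, and a hypothetical numbering that maps x_{i,j} to the integer (i-1)·k + j; negative integers are negated variables.

```python
from itertools import combinations

def var(i, j, k):
    """Map (node i, color j), both 1-based, to a variable number (assumed scheme)."""
    return (i - 1) * k + j

def coloring_cnf(n, edges, k):
    """Clauses for k-coloring a graph with nodes 1..n and the given edge list."""
    clauses = []
    for i in range(1, n + 1):
        # i) at least one color for node i
        clauses.append([var(i, j, k) for j in range(1, k + 1)])
        # ii) at most one color for node i (pairwise exclusion)
        for j1, j2 in combinations(range(1, k + 1), 2):
            clauses.append([-var(i, j1, k), -var(i, j2, k)])
    # iii) the two endpoints of an edge never share a color
    for u, v in edges:
        for j in range(1, k + 1):
            clauses.append([-var(u, j, k), -var(v, j, k)])
    return clauses

triangle = coloring_cnf(3, [(1, 2), (2, 3), (1, 3)], 3)
```

For the triangle with k = 3 this gives 3 + 9 + 9 = 21 clauses over 9 variables.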
SAT Notation
• Boolean Formula:
  • T and F are formulas
  • A propositional atom (variable) is a formula
  • If φ1 and φ2 are formulas then ¬φ1, φ1 ∧ φ2, φ1 ∨ φ2, φ1 → φ2, φ1 ↔ φ2 are formulas
• Atoms(φ): the set of atoms appearing in φ
• Literal: either an atom p (positive literal) or its negation ¬p (negative literal)
  • p and ¬p are complementary literals
• Clause: a disjunction L1 ∨ … ∨ Ln, n ≥ 0, of literals
  • Empty clause when n = 0 (the empty clause is false in every interpretation)
  • Unit clause when n = 1
SAT Notation
• Total truth assignment μ for φ:
  • μ: Atoms(φ) → {T, F}
• Partial truth assignment μ for φ:
  • μ: A → {T, F}, A ⊆ Atoms(φ)
• Set and formula representation of an assignment:
  • μ can be represented as a set of literals:
    • E.g. {μ(A1) = T, μ(A2) = F} => {A1, ¬A2}
  • μ can be represented as a formula:
    • E.g. {μ(A1) = T, μ(A2) = F} => A1 ∧ ¬A2
  • both representations are used for sets of clauses (formulas)
SAT Notation
• μ ⊨ φ (μ satisfies φ):
  • μ ⊨ Ai ⟺ μ(Ai) = T
  • μ ⊨ ¬φ ⟺ not μ ⊨ φ
  • μ ⊨ φ1 ∧ φ2 ⟺ μ ⊨ φ1 and μ ⊨ φ2
  • ...
• φ is satisfiable iff μ ⊨ φ for some μ
• φ1 ⊨ φ2 (φ1 entails φ2)
  • iff for every μ, μ ⊨ φ1 => μ ⊨ φ2
• ⊨ φ (φ is valid)
  • iff for every μ, μ ⊨ φ
  • what does this mean for ¬φ?
SAT Notation
• φ1 and φ2 are equivalent iff
  • for every μ, μ ⊨ φ1 iff μ ⊨ φ2
• φ1 and φ2 are equisatisfiable iff
  • there exists μ1 s.t. μ1 ⊨ φ1 iff there exists μ2 s.t. μ2 ⊨ φ2
• If φ1 and φ2 are equivalent then they are also equisatisfiable
  • but the opposite does not hold
• Example:
  • φ1 ∨ φ2 and (φ1 ∨ ¬l) ∧ (l ∨ φ2), where l does not occur in φ1 ∨ φ2, are equisatisfiable but not equivalent
Conjunctive Normal Form (CNF)
• A formula A is in conjunctive normal form, or simply CNF, if it is either T, or F, or a conjunction of disjunctions of literals:
  (L_{1,1} ∨ … ∨ L_{1,n_1}) ∧ … ∧ (L_{m,1} ∨ … ∨ L_{m,n_m})
  (That is, a conjunction of clauses.)
• A formula B is called a conjunctive normal form of a formula A if B is equivalent to A and B is in conjunctive normal form.
Conjunctive Normal Form
• Every sentence in propositional logic can be transformed into conjunctive normal form
  • i.e. a conjunction of disjunctions
Simple Algorithm
• Eliminate → using the rule that (p → q) is equivalent to (¬p ∨ q)
• Use de Morgan's laws so that negation applies to literals only
• Distribute ∨ over ∧ to write the result as a conjunction of disjunctions
Conjunctive Normal Form - Example
¬(p → q) ∨ (r → p)
• Eliminate implication signs
  • ¬(¬p ∨ q) ∨ (¬r ∨ p)
• Apply de Morgan's laws
  • (p ∧ ¬q) ∨ (¬r ∨ p)
• Apply associative and distributive laws
  • (p ∨ ¬r ∨ p) ∧ (¬q ∨ ¬r ∨ p)
  • (p ∨ ¬r) ∧ (¬q ∨ ¬r ∨ p)
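The three steps of the simple algorithm can be sketched in one recursive function. This is an illustration under assumptions: formulas are nested tuples with the made-up tags "var", "not", "and", "or", "imp", clauses are frozensets of strings with "-p" standing for ¬p, and the example is taken to be ¬(p → q) ∨ (r → p).

```python
from itertools import product

def to_cnf(f, positive=True):
    """Clauses equivalent to f, or to the negation of f when positive is False."""
    op = f[0]
    if op == "var":
        return [frozenset({f[1] if positive else "-" + f[1]})]
    if op == "not":                      # push the negation inwards (de Morgan)
        return to_cnf(f[1], not positive)
    if op == "imp":                      # (a -> b) is (not a or b)
        return to_cnf(("or", ("not", f[1]), f[2]), positive)
    left, right = to_cnf(f[1], positive), to_cnf(f[2], positive)
    if (op == "and") == positive:        # behaves as a conjunction
        return left + right
    # behaves as a disjunction: distribute "or" over "and"
    return [a | b for a, b in product(left, right)]

p, q, r = ("var", "p"), ("var", "q"), ("var", "r")
example = ("or", ("not", ("imp", p, q)), ("imp", r, p))
cnf = to_cnf(example)
```

Because clauses are sets, the repeated p in (p ∨ ¬r ∨ p) collapses automatically. Distribution can blow up exponentially on larger formulas, so this is a teaching sketch rather than a practical converter.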
Tractable Subclasses
• SAT is NP-complete
  • therefore it is generally hard to solve!
• Question:
  • In what ways can we restrict the expressiveness of SAT in order to achieve tractability?
• Answer:
  • Horn-SAT
  • 2-SAT
Algorithms for SAT
• The study of algorithms for SAT dates back to 1960!
  • one of the most widely studied NP-complete problems
• There are five general approaches to SAT solving
  • Resolution-based (DP)
  • Complete Search (DPLL)
  • Decision Diagrams
  • Incomplete Local Search
  • Stålmarck's algorithm (breadth-first search)
• Complete Search (DPLL) and Incomplete Local Search are the most widely used in practice and the ones we will study
Algorithms for SAT
• How do we test if a problem is SAT or not?
• Complete methods
  • Return "Yes" if SATisfiable
  • Return "No" if UNSATisfiable
• Incomplete methods
  • If they return "Yes", the problem is SATisfiable
  • Otherwise they time out or run forever, and the problem can be SAT or UNSAT
Algorithms for SAT
• The first algorithm was based on resolution (Davis & Putnam, 1960)
  • exponential space complexity → memory explosion!
• The second algorithm was based on search (Davis, Logemann, Loveland, 1962)
  • usually referred to as DPLL (although Putnam was not involved)
  • still the basis of most modern complete SAT solvers
• Some early DPLL-based SAT solvers:
  • Tableau (NTAB), POSIT, 2cl, CSAT
  • not used any more (many orders of magnitude slower than modern solvers)
Davis-Putnam Algorithm
• Existential abstraction using resolution
• Iteratively select a variable for resolution till no more variables are left.
SAT example:
  (a ∨ b ∨ c)(b ∨ ¬c ∨ f)(¬b ∨ e)
  ∃b: (a ∨ c ∨ e)(¬c ∨ e ∨ f)
  ∃b,c: (a ∨ e ∨ f)
  ∃b,c,a,e,f: T → SAT
UNSAT example:
  (a ∨ b)(a ∨ ¬b)(¬a ∨ c)(¬a ∨ ¬c)
  ∃b: (a)(¬a ∨ c)(¬a ∨ ¬c)
  ∃b,a: (c)(¬c)
  ∃b,a,c: () → UNSAT
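A small sketch of resolution-based variable elimination as described above. It is not the historical implementation: the function names are illustrative, clauses are sets of signed integers, and the variable order is arbitrary. The two slide examples are encoded with the assumed numbering a=1, b=2, c=3, e=4, f=5.

```python
def eliminate(clauses, x):
    """Replace every clause mentioning x by the resolvents on x."""
    pos = [c for c in clauses if x in c]
    neg = [c for c in clauses if -x in c]
    rest = {c for c in clauses if x not in c and -x not in c}
    for p in pos:
        for n in neg:
            resolvent = (p - {x}) | (n - {-x})
            if not any(-l in resolvent for l in resolvent):  # skip tautologies
                rest.add(resolvent)
    return rest

def davis_putnam(clauses):
    """True if satisfiable, False if the empty clause is derived."""
    clauses = {frozenset(c) for c in clauses}
    while clauses:
        if frozenset() in clauses:
            return False
        x = abs(next(iter(next(iter(clauses)))))  # any variable still present
        clauses = eliminate(clauses, x)
    return True

sat_example = [[1, 2, 3], [2, -3, 5], [-2, 4]]
unsat_example = [[1, 2], [1, -2], [-1, 3], [-1, -3]]
```

Each elimination step can square the number of clauses, which is the memory explosion the lecture attributes to this approach.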
DPLL Solvers
• DPLL-based solvers are relatively small pieces of software
  • a few thousand lines of code
  • but they involve quite complex algorithms and heuristics
• The evolution of SAT solvers into the modern ultra-fast tools that can tackle large (and huge) real problems is based on the following enhancements of DPLL (in increasing order of importance?):
  • preprocessing
  • advanced propagation/deduction techniques for look-ahead and preprocessing
  • sophisticated branching heuristics
  • very detailed and fast implementations + smart memory management
  • backjumping and learning methods
DPLL
  status = preprocess();                      // preprocessing
  if (status != UNKNOWN) return status;
  while (1) {
      decide_next_branch();                   // branching heuristics
      while (true) {
          status = deduce();                  // propagation/deduction
          if (status == CONFLICT) {
              blevel = analyze_conflict();    // backjumping/learning
              if (blevel == 0) return UNSATISFIABLE;
              else backtrack(blevel);
          }
          else if (status == SATISFIABLE) return SATISFIABLE;
          else break;
      }
  }
• DPLL is traditionally described in a recursive way
• We will use this modern iterative description due to Zhang and Malik
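For contrast with the iterative loop, here is a minimal sketch of the traditional recursive formulation with chronological backtracking. It is not the Zhang and Malik code: the helper names are illustrative, branching simply takes the first literal of the first clause, and there is no learning or backjumping.

```python
def simplify(clauses, lit):
    """Assign lit to true: drop satisfied clauses, shorten the others."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]

def dpll(clauses, assignment=()):
    """Return a set of true literals satisfying the clauses, or None."""
    while True:                                   # unit propagation
        if any(len(c) == 0 for c in clauses):
            return None                           # empty clause: conflict
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        clauses = simplify(clauses, units[0])
        assignment += (units[0],)
    if not clauses:
        return set(assignment)                    # every clause satisfied
    lit = clauses[0][0]                           # naive branching choice
    for choice in (lit, -lit):                    # try one value, then the other
        result = dpll(simplify(clauses, choice), assignment + (choice,))
        if result is not None:
            return result
    return None

# (x,y,z),(-x,y),(-y,z),(-x,-y,-z) with the assumed numbering x=1, y=2, z=3
model = dpll([[1, 2, 3], [-1, 2], [-2, 3], [-1, -2, -3]])
```

On this formula the branch x = true propagates to a conflict, and the flipped branch x = false leads to a model.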
Unit Propagation
• Unit propagation (UP) is the core deduction method used by all DPLL-based solvers
  • a clause is called unit if it is not yet satisfied and all but one of its literals have been assigned false (i.e. it effectively consists of a single literal)
  • UP repeatedly applies unit resolution (i.e. it resolves unit clauses)
• Most of the time is spent on doing UP! The efficient implementation of UP is of primary importance in a SAT solver
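A deliberately simple sketch of UP over a partial assignment. It rescans all clauses until nothing changes, whereas real solvers use much faster data structures; the status strings mirror the pseudocode names, and the function name is illustrative.

```python
def unit_propagate(clauses, assignment):
    """Extend a partial assignment {var: bool} by unit propagation.

    Returns ("CONFLICT", assignment) if some clause has all literals false,
    otherwise ("UNKNOWN", assignment) once no unit clause remains.
    """
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                          # clause already satisfied
            free = [l for l in clause if abs(l) not in assignment]
            if not free:
                return "CONFLICT", assignment     # every literal is false
            if len(free) == 1:                    # unit clause: force its literal
                assignment[abs(free[0])] = free[0] > 0
                changed = True
    return "UNKNOWN", assignment

# (x,y,z),(-x,y),(-y,z),(-x,-y,-z) with the assumed numbering x=1, y=2, z=3
cnf = [[1, 2, 3], [-1, 2], [-2, 3], [-1, -2, -3]]
status_true, _ = unit_propagate(cnf, {1: True})
status_false, after_false = unit_propagate(cnf, {1: False})
```

Deciding x = true forces y and then z, which falsifies the last clause; deciding x = false leaves no unit clause, so nothing more is deduced.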
DPLL examples
Given in CNF: (x,y,z),(-x,y),(-y,z),(-x,-y,-z)
[Figure: search tree stepped through Decide(), Deduce() and Analyze_Conflict(); failed branches are marked X]
DPLL (roadmap of the iterative loop)
• preprocess(): preprocessing
• decide_next_branch(): branching heuristics
• deduce(): UP
• analyze_conflict() and backtrack(): backjumping/learning
Propagation / Deduction
• Apart from UP, several other deduction methods have been proposed and used during preprocessing (mainly) and search (less frequently)
  • Pure Literal rule
  • Binary Clause reasoning
  • Hyper Resolution
  • Failed Literal Detection
  • Equality Reduction
  • Krom Subsumption Resolution
  • Generalized Subsumption Resolution
  • …
• Most of them are only used for preprocessing the formula because they are expensive
• One notable exception is the pure literal rule
Pure Literal Rule
• The pure literal rule (Davis, Logemann, Loveland, 1962) states the following:
  • if a variable occurs only positively then it can be assigned true
  • if a variable occurs only negatively then it can be assigned false
• Example: Given in CNF: (x,y,z),(-x,y),(y,-w),(-x,y,-z)
  • y is a pure literal, so it can be assigned true
  • w is a pure literal, so it can be assigned false
• Clauses with pure literals or tautologies can be removed!
  • a tautology is a clause of the form (x ∨ ¬x ∨ y)
• The pure literal rule is expensive to apply during search
Pure Literal Rule
• The pure literal rule can be applied sequentially
• Consider the formula (u w x), (-w x y), (-u -x), (v w -y)
  • v is a pure literal, so it can be assigned true
  • The formula becomes (u w x), (-w x y), (-u -x)
  • y is a pure literal, so it can be assigned true
  • The formula becomes (u w x), (-u -x)
  • w is a pure literal, so it can be assigned true
  • The formula becomes (-u -x)
  • both u and x are pure literals, so they can be assigned false
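The sequential application can be sketched as a loop that removes clauses containing pure literals until none is left. The function name and the numbering u=1, v=2, w=3, x=4, y=5 are assumptions made for the example.

```python
def pure_literal_elimination(clauses):
    """Repeatedly assign pure literals and drop the clauses they satisfy.

    Returns (remaining clauses, literals set to true in order of discovery).
    """
    clauses = [set(c) for c in clauses]
    assigned = []
    while True:
        literals = {l for c in clauses for l in c}
        pure = {l for l in literals if -l not in literals}
        if not pure:
            return clauses, assigned
        assigned.extend(sorted(pure, key=abs))
        clauses = [c for c in clauses if not (c & pure)]

# (u w x), (-w x y), (-u -x), (v w -y)
remaining, assigned = pure_literal_elimination(
    [[1, 3, 4], [-3, 4, 5], [-1, -4], [2, 3, -5]]
)
```

The run follows the slide: v, then y, then w become pure and are set to true, and finally u and x are set to false, leaving no clauses.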
Other Deduction Methods
• Weaker versions of UP
  • Binary UP resolves only unit and binary clauses
    • Can be used to solve a 2-SAT problem in quadratic time
  • Fixed-depth UP applies UP only up to a certain depth
• Variants of Binary Resolution
  • BinRes, Equality Reduction, HyperBinRes
• Failed Literal Detection
• Hyper-Resolution
• Krom Subsumption Resolution
• Generalized Subsumption Resolution
• Equivalence Reasoning
• Etc.
Used in: preprocessing, propagation/deduction
Failed Literal Detection
• Failed literal detection (Freeman, 1995) is a one-step lookahead with UP.
• Say we force (assign) literal l and then perform UP. If this process yields a contradiction (the empty clause) then we know that ¬l is entailed by the current input and we can force it (and then perform UP).
• DPLL solvers often perform failed literal detection on a set of likely, heuristically selected literals at each node.
• The SATZ system (Li & Anbulagan, 1997) was the first to show that very aggressive failed literal detection can pay off.
  • but doing it on all literals is too expensive
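A minimal sketch of the one-step lookahead, assuming the candidate literals are supplied by the caller (the selection heuristic is out of scope here) and using a simple formula-rewriting UP. The function names are illustrative.

```python
def propagate(clauses, lit):
    """Assign lit, then run UP. Returns the reduced clauses, or None on conflict."""
    while lit is not None:
        reduced = []
        for c in clauses:
            if lit in c:
                continue                          # clause satisfied, drop it
            c = [x for x in c if x != -lit]
            if not c:
                return None                       # empty clause derived
            reduced.append(c)
        clauses = reduced
        lit = next((c[0] for c in clauses if len(c) == 1), None)
    return clauses

def failed_literals(clauses, candidates):
    """Probe each candidate; when l fails, force its complement and keep going."""
    forced = []
    for l in candidates:
        if propagate(clauses, l) is None:         # l fails, so -l is entailed
            forced.append(-l)
            clauses = propagate(clauses, -l)
            if clauses is None:                   # both polarities fail: UNSAT
                return None, forced
    return clauses, forced

# (x,y,z),(-x,y),(-y,z),(-x,-y,-z) with the assumed numbering x=1, y=2, z=3
reduced, forced = failed_literals([[1, 2, 3], [-1, 2], [-2, 3], [-1, -2, -3]], [1, -3])
```

Probing x fails, so ¬x is forced; probing ¬z on the reduced formula fails as well, so z is forced and the formula is solved without any search.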
Binary Resolution
• One "cheap" form of binary resolution consists of performing all possible resolutions of pairs of binary clauses
  • Such resolutions yield only new binary clauses or new unit clauses
• BinRes (Bacchus, 2002) repeatedly:
  • (a) adds to the formula all new binary or unit clauses producible by resolving pairs of binary clauses, and
  • (b) performs UP on any new unit clauses that appear (which in turn might produce more binary clauses, causing another iteration of (a)),
  until either a contradiction is achieved, or nothing new can be added by a step of (a) or (b).
• BinRes((a,b),(¬a,c),(¬b,c)) produces the new binary clauses (b,c), (a,c), and the unit clause (c). Then unit propagation yields the final reduction.
Hyper Resolution
• A hyper resolution rule resolves more than two clauses at the same time
• HypBinRes is a rule of inference involving hyper-resolution
  • It takes as input a single n-ary clause (n ≥ 2) (l1, l2, ..., ln) and n−1 binary clauses, each of the form (¬li, l) (i = 1, ..., n−1)
  • It produces as output the new binary clause (l, ln)
• For example, using HypBinRes hyper-resolution on the inputs (a, b, c, d), (h, ¬a), (h, ¬c), and (h, ¬d) produces the new binary clause (h, b)
• HypBinRes is equivalent to a sequence of ordinary resolution steps (i.e., resolution steps involving only two clauses). However, such a sequence would generate clauses of intermediate length, while HypBinRes only generates the final binary clause
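One HypBinRes step can be sketched as follows. This is an illustration only: the function name and the brute-force search over candidate literals are assumptions, and degenerate outcomes (a unit or a tautology) are skipped. The example uses the assumed numbering a=1, b=2, c=3, d=4, h=8.

```python
def hyp_bin_res(nary, binaries):
    """Binary clauses derivable from `nary` by a single HypBinRes step."""
    nary = frozenset(nary)
    binaries = {frozenset(b) for b in binaries}
    derived = set()
    for l in {x for b in binaries for x in b}:
        # literals l_i of the n-ary clause that have a partner clause (not l_i, l)
        covered = {li for li in nary if frozenset({-li, l}) in binaries}
        rest = nary - covered
        if len(rest) == 1:                        # exactly one literal l_n is left
            (ln,) = rest
            if ln != l and ln != -l:              # skip units and tautologies
                derived.add(frozenset({l, ln}))
    return derived

# (a, b, c, d) with (h, -a), (h, -c), (h, -d)
derived = hyp_bin_res([1, 2, 3, 4], [[8, -1], [8, -3], [8, -4]])
```

Only the final binary clause (h, b) is produced; the intermediate resolvents of length three and two never materialize.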
Krom Subsumption
• Krom-subsumption resolution (Van Gelder and Tsuji, 1996) takes as input two clauses of the form (x ∨ y) and (¬x ∨ y ∨ Z) and generates the clause (y ∨ Z)
  • where Z is a clause of arbitrary length
  • (y ∨ Z) subsumes (entails) (¬x ∨ y ∨ Z), therefore (¬x ∨ y ∨ Z) can be deleted
• Generalized subsumption resolution takes two clauses (x ∨ Y) and (¬x ∨ Y ∨ Z) and generates (Y ∨ Z)
• We can derive propagation methods by repeatedly applying either form of resolution
Equality Reduction
• If a formula F contains (a,¬b) as well as (¬a,b), then we can form a new formula EqReduce(F) by equality reduction.
• Equality reduction (Bacchus, 2002) involves:
  • (a) replacing all instances of b in F by a (or vice versa),
  • (b) removing all clauses which now contain both a and ¬a,
  • (c) removing all duplicate instances of a (or ¬a) from all clauses.
• This process might generate new binary clauses
• For example, EqReduce((a,¬b),(¬a,b),(a,¬b,c),(b,¬d),(a,b,d)) = ((a,¬d),(a,d))
• EqReduce(F) has a satisfying truth assignment iff F does.
• And any truth assignment for EqReduce(F) can be extended to one for F by assigning b the same value as a.
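Steps (a) to (c) can be sketched in a few lines. The function name is illustrative, the pair (a, b) is assumed to have been detected already, and the example uses the assumed numbering a=1, b=2, c=3, d=4.

```python
def eq_reduce(clauses, a, b):
    """Apply equality reduction for equivalent variables a and b."""
    reduced = set()
    for clause in clauses:
        # (a) replace b by a and -b by -a; building a set also does step (c)
        renamed = frozenset(a if l == b else -a if l == -b else l for l in clause)
        # (b) drop clauses that now contain both a and -a
        if a in renamed and -a in renamed:
            continue
        reduced.add(renamed)
    return reduced

# EqReduce((a,-b),(-a,b),(a,-b,c),(b,-d),(a,b,d))
result = eq_reduce([[1, -2], [-1, 2], [1, -2, 3], [2, -4], [1, 2, 4]], a=1, b=2)
```

The two defining binary clauses and (a,¬b,c) become tautologies and disappear, leaving (a,¬d) and (a,d).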
DPLL (roadmap of the iterative loop)
• preprocess(): HyperRes, BinRes, EqRed etc.
• decide_next_branch(): branching heuristics
• deduce(): UP
• analyze_conflict() and backtrack(): backjumping/learning
Decision heuristics
• DLIS (Dynamic Largest Individual Sum). For a given variable x:
  • Cx,p: # unresolved clauses in which x appears positively
  • Cx,n: # unresolved clauses in which x appears negatively
  • Let x be the literal for which Cx,p is maximal
  • Let y be the literal for which Cy,n is maximal
  • If Cx,p > Cy,n choose x and assign it TRUE
  • Otherwise choose y and assign it FALSE
• Requires l (#literals) queries for each decision.
• (Implemented in some solvers, e.g. Grasp)
Decision heuristics
• DLCS (Dynamic Largest Combined Sum). For a given variable x:
  • Cx,p: # unresolved clauses in which x appears positively
  • Cx,n: # unresolved clauses in which x appears negatively
  • Let x be the variable for which Cx,p + Cx,n is maximal
  • If Cx,p > Cx,n assign x to TRUE
  • Otherwise assign x to FALSE
• Requires l (#literals) queries for each decision.
• (Implemented in some solvers, e.g. Grasp)
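Both counting heuristics can be sketched on top of one helper that counts literal occurrences in unresolved (not yet satisfied) clauses. The names are illustrative, ties are broken by the lowest variable number (an assumption), and at least one unassigned variable is assumed to remain.

```python
from collections import Counter

def unresolved_counts(clauses, assignment):
    """Count unassigned literals over clauses not yet satisfied."""
    counts = Counter()
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue                              # resolved clause, ignore it
        for l in clause:
            if abs(l) not in assignment:
                counts[l] += 1
    return counts

def dlis(clauses, assignment):
    """Return (variable, value) chosen by Dynamic Largest Individual Sum."""
    counts = unresolved_counts(clauses, assignment)
    variables = sorted({abs(l) for l in counts})
    x = max(variables, key=lambda v: counts[v])   # most positive occurrences
    y = max(variables, key=lambda v: counts[-v])  # most negative occurrences
    return (x, True) if counts[x] > counts[-y] else (y, False)

def dlcs(clauses, assignment):
    """Return (variable, value) chosen by Dynamic Largest Combined Sum."""
    counts = unresolved_counts(clauses, assignment)
    variables = sorted({abs(l) for l in counts})
    x = max(variables, key=lambda v: counts[v] + counts[-v])
    return x, counts[x] > counts[-x]

example = [[1, 2], [1, -3], [1, 3], [-2, -3], [-2, 3]]
```

On this example DLIS picks variable 1 (three positive occurrences), while DLCS picks variable 3 (four occurrences in total) and assigns it FALSE because its two polarities are tied.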
Decision heuristics
• Bohm's Heuristic
  • At each step of the backtrack search algorithm, the BOHM heuristic selects a variable with the maximal vector (H1(x), H2(x), …, Hn(x)) in lexicographic order. Each Hi(x) is computed as follows:
    Hi(x) = a · max(hi(x), hi(¬x)) + b · min(hi(x), hi(¬x))
  • where hi(x) is the number of unresolved clauses with i literals that contain literal x. Hence, each selected literal gives preference to satisfying small clauses (when assigned value true) or to further reducing the size of small clauses (when assigned value false).
  • The values of a and b are chosen heuristically.
Decision heuristics
Jeroslow-Wang method
• Compute over every clause ω and every literal l:
  J(l) := Σ_{ω ∈ φ, l ∈ ω} 2^(−|ω|)
• One-sided JW: Choose a literal l that maximizes J(l)
• Two-sided JW: Choose a variable x that maximizes J(x) + J(¬x)
  • Assign it to true if J(x) ≥ J(¬x) and false otherwise
• This gives an exponentially higher weight to literals in shorter clauses.
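A short sketch of both JW variants, assuming the clause list passed in already contains only the unresolved clauses. The function names are illustrative.

```python
from collections import defaultdict

def jw_scores(clauses):
    """J(l): sum of 2^(-|clause|) over the clauses containing literal l."""
    scores = defaultdict(float)
    for clause in clauses:
        for l in clause:
            scores[l] += 2.0 ** -len(clause)
    return dict(scores)

def one_sided_jw(clauses):
    """Literal with the highest J(l)."""
    scores = jw_scores(clauses)
    return max(scores, key=scores.get)

def two_sided_jw(clauses):
    """(variable, value): maximize J(x) + J(-x), true iff J(x) >= J(-x)."""
    scores = jw_scores(clauses)
    variables = sorted({abs(l) for l in scores})
    x = max(variables, key=lambda v: scores.get(v, 0.0) + scores.get(-v, 0.0))
    return x, scores.get(x, 0.0) >= scores.get(-x, 0.0)

example = [[1, 2], [-1, 3, 4], [-1, -2], [1]]
scores = jw_scores(example)
```

Here J(1) = 1/4 + 1/2 = 0.75 and J(¬1) = 1/8 + 1/4 = 0.375, so both variants pick variable 1 with value true; the unit clause dominates because of the exponential weighting.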
Decision heuristics
MOM (Maximum Occurrence of clauses of Minimum size)
• Let f*(x) be the # of unresolved smallest clauses containing x. Choose x that maximizes:
  (f*(x) + f*(¬x)) · 2^k + f*(x) · f*(¬x)
• k is chosen heuristically.
• The idea:
  • Give preference to satisfying small clauses.
  • Among those, give preference to balanced variables (e.g. f*(x) = 3, f*(¬x) = 3 is better than f*(x) = 1, f*(¬x) = 5).
Decision heuristics
VSIDS (Variable State Independent Decaying Sum)
1. Each variable in each polarity has a counter initialized to 0.
2. When a clause is added, the counters are updated.
3. The unassigned variable with the highest counter is chosen.
4. Periodically, all the counters are divided by a constant.
(Implemented in Chaff)
Decision heuristics
VSIDS (cont'd)
• Chaff holds a list of unassigned variables sorted by the counter value.
• Updates are needed only when adding conflict clauses.
• Thus the decision is made in constant time.
Decision heuristics
VSIDS is a 'quasi-static' strategy:
• static because it doesn't depend on the current assignment
• dynamic because it gradually changes. Variables that appear in recent conflicts have higher priority.
This strategy is a conflict-driven decision strategy.
"..employing this strategy dramatically (i.e. an order of magnitude) improved performance ... "
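The four VSIDS steps can be sketched as a small class. This is not Chaff's implementation: the class name, the decay constant, and the decay period are assumptions, and the pick is a linear scan rather than the sorted list that gives Chaff its constant-time decision.

```python
class VSIDS:
    """Per-literal counters bumped on added clauses and decayed periodically."""

    def __init__(self, variables, decay=2.0, period=256):
        # step 1: one counter per variable and polarity, initialized to 0
        self.score = {l: 0.0 for v in variables for l in (v, -v)}
        self.decay = decay        # assumed constant the counters are divided by
        self.period = period      # assumed number of added clauses between decays
        self.added = 0

    def on_clause_added(self, clause):
        for l in clause:          # step 2: bump the literals of the new clause
            self.score[l] += 1.0
        self.added += 1
        if self.added % self.period == 0:
            for l in self.score:  # step 4: periodic decay of all counters
                self.score[l] /= self.decay

    def pick(self, assignment):
        """Step 3: unassigned literal with the highest counter, or None."""
        free = [l for l in self.score if abs(l) not in assignment]
        return max(free, key=self.score.get) if free else None

heuristic = VSIDS(variables=[1, 2, 3], period=2)
heuristic.on_clause_added([1, -2])
heuristic.on_clause_added([-2, 3])
```

Because scores depend only on added clauses and not on the current assignment, nothing needs recomputing when the solver backtracks.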
Conflict Analysis, Learning, Backjumping
• When a conflicting clause is derived (i.e. a clause with all its literals assigned 0), the solver must backtrack
  • conflict analysis finds the reason for a conflict and tries to resolve it
• The DPLL algorithm uses chronological backtracking
  • it backtracks to the most recent decision point where a variable has not yet had both of its values tried, and flips the current assignment
• Modern SAT solvers employ more advanced conflict analysis techniques to identify the actual reasons for the conflict
  • in this way they can achieve non-chronological backjumping
Conflict Analysis, Learning, Backjumping
• Suppose the conflicting clause ω = (¬a ∨ x ∨ ¬c) has been derived
  • i.e. a=1, x=0, c=1
• A set R of value assignments to variables in the problem is called a conflict assignment if, after making these assignments and running UP, clause ω becomes unsatisfiable
  • assignment {a=1, x=0, c=1} is a trivial conflict assignment
  • But it is not of much use
• Question: how can we derive more interesting conflict assignments?
• Answer: determine why and at what decision level a=1, x=0, c=1
• Suppose we find that R = {x=0, y=1, z=1} is also a conflict assignment for clause ω
  • the implied clause (x ∨ ¬y ∨ ¬z), which records the conflict assignment R, is called a conflict clause
Conflict Analysis, Learning, Backjumping
• Suppose that assignment x=0 of R = {x=0, y=1, z=1} is chosen (or implied) at the current decision level v
  • assume that y=1 and z=1 are deduced at nodes v' and v'' respectively
  • suppose that v > v' > v'' (i.e. v'' is closest to the root)
• After adding conflict clause (x ∨ ¬y ∨ ¬z) to the problem, we can backjump from v to v' (skipping the nodes in between)
  • because whatever assignments we make there, the conflict at node v will still exist!
• After we make the backjump, we can deduce x=1. Why?
  • because the added clause (x ∨ ¬y ∨ ¬z) will be a unit clause, forcing x=1
  • without learning this clause, this deduction would not be possible
  • now we can avoid needless search and save time!
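The backjump target in this argument can be computed directly from the conflict clause and the decision levels of its variables. A minimal sketch, assuming (as in the example) that exactly one literal of the clause was assigned at the deepest level; the function name and the concrete levels 5, 3, 1 are illustrative.

```python
def backjump(conflict_clause, level):
    """Return (target level, literal that becomes unit after the jump).

    `level` maps each variable to the decision level of its current value.
    """
    ordered = sorted(conflict_clause, key=lambda l: level[abs(l)], reverse=True)
    asserting = ordered[0]                        # literal from the deepest level
    # jump to the second-deepest level in the clause, or to the root if unit
    target = level[abs(ordered[1])] if len(ordered) > 1 else 0
    return target, asserting

# Conflict clause (x or not y or not z) with the assumed numbering x=1, y=2, z=3
# and assumed levels v=5 for x, v'=3 for y, v''=1 for z.
target, asserting = backjump([1, -2, -3], {1: 5, 2: 3, 3: 1})
```

At level 3 the assignments y=1 and z=1 still hold, so the learned clause is unit and forces literal 1, that is x=1.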
Conflict Analysis, Learning, Backjumping
• During conflict analysis, information about conflicts is usually recorded and added to the problem as new (learned) clauses
  • these conflict clauses are redundant but they often help prune the search space in the future
  • this mechanism is called conflict-directed learning
• Non-chronological backtracking is also called conflict-directed backjumping
  • originally proposed for CSPs (Prosser, 1993)
  • then incorporated in SAT solvers like GRASP (Silva and Sakallah, 1996) and rel_sat (Bayardo and Schrag, 1997)
• Learning and conflict-directed backjumping can be analyzed using implication graphs, or they can be viewed as a resolution process