Non-clausal Reasoning Fahiem Bacchus, Christian Thiffault, Toronto Toby Walsh, UCC & Uppsala (soon UNSW, NICTA, Uppsala)
Every morning … I read the plaque on the wall of this house … Dedicated to the memory of George Boole … Professor of Mathematics at Queen's College (now University College Cork)
George Boole (1815-1864) • Boolean algebra • The Mathematical Analysis of Logic, Cambridge, 1847 • The Calculus of Logic, Cambridge and Dublin Mathematical Journal, 1848 • Reduce propositional logic to algebraic manipulations • Crater on the moon named after him!
Propositional SATisfiability • Rapid progress being made • 10 years ago, < 50 vars • Today, > 1000 vars • Algorithmic advances • Learning • Watched literals • .. • Heuristic advances • VSIDS branching
Propositional SATisfiability • Efficient implementations • Chaff, Berkmin, Forklift, … • SAT competition has new winner almost every year • Practical applications • Hardware verification • Planning • …
SAT folklore • Need to solve in CNF • Everything is a clause • Efficient reasoning • Optimize code with simple data structures … • Effective reasoning • Conversion into CNF does not hinder unit propagation
Overturning SAT folklore • Deciding arbitrary Boolean formulae • Without converting into CNF • Efficient reasoning • Raw speed as good as optimized CNF solvers • Effective reasoning • More inference than unit propagation • Exploit structure • More exotic gates, … Similar ideas being explored in ATPG
Davis Putnam procedure
DPLL(S):
  if S is empty then return SAT
  if S contains the empty clause {} then return UNSAT
  if S contains a unit clause {l} then return DPLL(S ∪ {l})
  else choose a literal l
    if DPLL(S ∪ {l}) = SAT then return SAT
    else return DPLL(S ∪ {-l})
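The procedure above can be sketched in runnable form. This is a minimal illustration, not an efficient solver: clauses are assumed to be sets of non-zero integers (a negative integer is a negated variable), the helper name `simplify` is hypothetical, and "S ∪ {l}" is read as "reduce S under the assumption that l is true".

```python
def simplify(clauses, lit):
    """Assume lit is true: drop satisfied clauses, delete -lit from the rest."""
    return [c - {-lit} for c in clauses if lit not in c]

def dpll(clauses):
    """Return True if the clause set is satisfiable, False otherwise."""
    if not clauses:                          # S empty: SAT
        return True
    if any(len(c) == 0 for c in clauses):    # S contains {}: UNSAT
        return False
    for c in clauses:                        # unit clause: its literal is forced
        if len(c) == 1:
            (lit,) = c
            return dpll(simplify(clauses, lit))
    lit = next(iter(clauses[0]))             # naive branching choice (no heuristic)
    return dpll(simplify(clauses, lit)) or dpll(simplify(clauses, -lit))

# (a)(-a, b, c)(-b)(a, d, e)(-c, d, g) with a..e = 1..5 and g = 6
example = [frozenset(c) for c in ([1], [-1, 2, 3], [-2], [1, 4, 5], [-3, 4, 6])]
contradiction = [frozenset([1]), frozenset([-1])]
```

Real solvers replace the naive branching choice with a heuristic such as VSIDS and never copy the formula at each node.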
Unit Propagation • If the formula has a unit clause then the literal in that clause must be true • Set the literal to true and reduce the formula. • Unit propagation is the most commonly used type of constraint propagation • One of the most important parts of current SAT solvers
Unit Propagation (a)(-a, b, c)(-b)(a, d, e)(-c, d, g)
Unit Propagation (a)(-a, b, c)(-b)(a, d, e)(-c, d, g) a=true
Unit Propagation (a)(-a, b, c)(-b)(a, d, e)(-c, d, g) b=false
Unit Propagation (a)(-a, b, c)(-b)(a, d, e)(-c, d, g) c = true
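The propagation steps on this example can be reproduced with a short sketch. It builds the reduced formula explicitly, which is the naive approach and is used here only for readability. The integer encoding (a..e = 1..5, g = 6) and the function name are assumptions made for the illustration.

```python
def unit_propagate(clauses):
    """Assign literals forced by unit clauses until none remain.

    Returns (assignment, remaining_clauses); remaining_clauses is None
    if an empty clause (a conflict) is derived.
    """
    assignment = {}
    clauses = [set(c) for c in clauses]
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return assignment, clauses
        (lit,) = unit
        assignment[abs(lit)] = lit > 0
        reduced = []
        for c in clauses:
            if lit in c:
                continue                  # clause satisfied: drop it
            c = c - {-lit}                # falsified literal removed
            if not c:
                return assignment, None   # empty clause: conflict
            reduced.append(c)
        clauses = reduced

# (a)(-a, b, c)(-b)(a, d, e)(-c, d, g)
assignment, remaining = unit_propagate(
    [[1], [-1, 2, 3], [-2], [1, 4, 5], [-3, 4, 6]])
```

Here propagation sets a = true, b = false and c = true, and leaves the single clause (d, g), which is not unit, so search would have to branch next.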
Implementing Unit Propagation • UP is the main (often the only) inference rule applied at each search node. • Performing UP occupies most of the time in these solvers. • More efficient implementations of UP have been one of the recent advances.
Implementing Unit Propagation • Most DPLL solvers do not build an explicit representation of the reduced formula • Too expensive in time and space to do this. • Rather they keep original formula and mark the changes made • All changes generated by UP undone when we backtrack.
Tableau [Crawford and Auton 95] • We number the variables and clauses. • Each variable has • a field to store its current value, true, false or unvalued • the list of clauses it appears positively in • the list of clauses it appears negatively in • Each clause has • a list of its literals • a flag to indicate whether or not it is satisfied • the number of unvalued literals it contains
Tableau [Crawford and Auton 95] • Unit propagated literal put on a stack • pop the literal on top of the stack • mark the variable with the appropriate value. • mark each clause it appears positively in as satisfied. • for each clause it appears negatively in • if the clause is not already satisfied decrement the clause’s counter • if the counter is equal to 1, the clause is unit • find the single unvalued literal in the clause and add that literal to the UP stack. • remember all changes so that they can be undone on backtrack.
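The counter-based scheme described on the two slides above might look roughly as follows. This is a sketch in the spirit of the description, not the original Tableau code: the class and method names are hypothetical, a single occurrence map keyed by literal stands in for the separate positive and negative lists, and the undo trail is one possible way to "remember all changes".

```python
class Tableau:
    def __init__(self, clauses):
        self.clauses = [list(c) for c in clauses]
        self.value = {}                                 # var -> bool; absent = unvalued
        self.satisfied = [False] * len(self.clauses)
        self.unvalued = [len(c) for c in self.clauses]  # per-clause counter
        self.occurs = {}                                # literal -> clause indices
        for i, c in enumerate(self.clauses):
            for lit in c:
                self.occurs.setdefault(lit, []).append(i)
        self.trail = []                                 # undo records

    def propagate(self, lit):
        """Make lit true and run UP. Returns False on a conflict."""
        stack = [lit]
        while stack:
            lit = stack.pop()
            var, val = abs(lit), lit > 0
            if var in self.value:
                if self.value[var] != val:
                    return False
                continue
            self.value[var] = val
            self.trail.append(("val", var))
            for i in self.occurs.get(lit, []):          # lit occurs here: satisfied
                if not self.satisfied[i]:
                    self.satisfied[i] = True
                    self.trail.append(("sat", i))
            for i in self.occurs.get(-lit, []):         # -lit occurs here: shrink
                if self.satisfied[i]:
                    continue
                self.unvalued[i] -= 1
                self.trail.append(("cnt", i))
                if self.unvalued[i] == 0:
                    return False                        # every literal is false
                if self.unvalued[i] == 1:               # clause became unit
                    stack.append(next(l for l in self.clauses[i]
                                      if abs(l) not in self.value))
        return True

    def undo_to(self, mark):
        """Backtrack: undo every change recorded after trail position mark."""
        while len(self.trail) > mark:
            kind, x = self.trail.pop()
            if kind == "val":
                del self.value[x]
            elif kind == "sat":
                self.satisfied[x] = False
            else:
                self.unvalued[x] += 1

t = Tableau([[1], [-1, 2, 3], [-2], [1, 4, 5], [-3, 4, 6]])
ok = all(t.propagate(c[0]) for c in t.clauses if len(c) == 1)
values_after_up = dict(t.value)
t.undo_to(0)
```

Note that every clause containing the assigned variable is visited, and every change must be undone on backtrack. These are the two costs that the watch literal technique avoids.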
Watch literals [SATO, Chaff] • Tableau’s technique requires visiting every clause a variable appears in whenever we value that variable. • When clause learning is employed, and hundreds of thousands of long new clauses are added to the original formula, this becomes slow. • The watch literal technique is more efficient.
Watch literals [SATO, Chaff] • For each clause, pick two literals to watch. • At least one of these literals must be false for the clause to be unit. • For each variable instead of lists of all of the clauses it appears in positively and negatively, we only have lists of the clauses it is a watch for. • reduces the total size of these lists from O(kn) to O(n)
Watch literals [SATO, Chaff] • When we assign a value to a variable we • Ignore the clauses it watches positively • For each clause it watches negatively, we search the clause: • if we find an unvalued literal or a true literal not equal to the other watch, we make this literal the new watch in place of the falsified one • otherwise the clause is unit and we UP the other watch literal if it is not already true. • On backtrack we do nothing! • The new watch literals retain the property that at least one of them must become false if the clause is to become unit.
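A minimal sketch of the two-watch scheme described above. It is not the SATO or Chaff implementation: the class name, the convention that the first two positions of each clause hold its watches, and the requirement that every clause has at least two literals are all assumptions of this illustration.

```python
class WatchedClauses:
    def __init__(self, clauses):
        # Assumption: every clause has >= 2 literals; c[0] and c[1] are the watches.
        self.clauses = [list(c) for c in clauses]
        self.value = {}                       # var -> bool; absent = unvalued
        self.watchers = {}                    # literal -> clauses watching it
        for i, c in enumerate(self.clauses):
            for lit in c[:2]:
                self.watchers.setdefault(lit, []).append(i)

    def lit_value(self, lit):
        v = self.value.get(abs(lit))
        return None if v is None else (v == (lit > 0))

    def assign(self, lit):
        """Make lit true and propagate. Returns False on a conflict."""
        queue = [lit]
        while queue:
            lit = queue.pop()
            val = self.lit_value(lit)
            if val is True:
                continue
            if val is False:
                return False
            self.value[abs(lit)] = lit > 0
            false_lit, keep, conflict = -lit, [], False
            for i in self.watchers.get(false_lit, []):
                c = self.clauses[i]
                if c[0] == false_lit:
                    c[0], c[1] = c[1], c[0]   # falsified watch now sits at c[1]
                for k in range(2, len(c)):    # search for a replacement watch
                    if self.lit_value(c[k]) is not False:
                        c[1], c[k] = c[k], c[1]
                        self.watchers.setdefault(c[1], []).append(i)
                        break
                else:                         # none found: unit or conflicting
                    keep.append(i)
                    other = self.lit_value(c[0])
                    if other is False:
                        conflict = True
                    elif other is None:
                        queue.append(c[0])
            self.watchers[false_lit] = keep
            if conflict:
                return False
        return True

    def unassign(self, var):
        """Backtrack one variable: the watch lists are left untouched."""
        del self.value[var]

# (-a, b, c)(a, d, e)(-c, d, g), then assert a and -b as in the earlier example
w = WatchedClauses([[-1, 2, 3], [1, 4, 5], [-3, 4, 6]])
ok = w.assign(1) and w.assign(-2)
```

Only clauses in which the falsified literal is currently watched are visited, and `unassign` touches no clause data at all, which is the point of the "on backtrack we do nothing" remark.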
Solving non-CNF formulae • Convert into CNF • Use efficient DPLL solver like Chaff • Adapt DPLL solver to reason with non-CNF • Exploit structure • Permit complex gates (e.g. counting, XOR, ..)
Encoding into CNF • Most common (and relatively efficient?) is that of [Tseitin 1970]. • Recursively converts a formula by adding a new variable for every subformula. • Linear space
Tseitin Encoding • A ⇒ (C & D)
Tseitin Encoding • A ⇒ (C & D) • V1 ⇔ (C & D): (~V1, C), (~V1, D), (~C, ~D, V1)
Tseitin Encoding • A ⇒ (C & D) • V1 ⇔ (C & D): (~V1, C), (~V1, D), (~C, ~D, V1) • V2 ⇔ (A ⇒ V1): (~V2, ~A, V1), (A, V2), (~V1, V2)
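The recursive construction can be sketched as below. The nested-tuple formula representation, the operator names, and the `V1, V2, …` naming of fresh variables are assumptions chosen to match the slide; input variables are assumed not to clash with the fresh names.

```python
from itertools import count

def tseitin(formula):
    """Return (root_literal, clauses) for a formula given as nested tuples."""
    clauses, fresh = [], count(1)

    def neg(lit):
        return lit[1:] if lit.startswith("~") else "~" + lit

    def encode(f):
        if isinstance(f, str):                 # input variable
            return f
        op, *args = f
        if op == "not":                        # no new variable needed
            return neg(encode(args[0]))
        x, y = (encode(a) for a in args)       # encode subformulas first
        v = f"V{next(fresh)}"                  # new variable for this subformula
        if op == "and":                        # v <=> (x & y)
            clauses.extend([[neg(v), x], [neg(v), y], [neg(x), neg(y), v]])
        elif op == "or":                       # v <=> (x | y)
            clauses.extend([[neg(v), x, y], [neg(x), v], [neg(y), v]])
        elif op == "implies":                  # v <=> (x => y)
            clauses.extend([[neg(v), neg(x), y], [x, v], [neg(y), v]])
        else:
            raise ValueError(f"unknown operator: {op}")
        return v

    return encode(formula), clauses

root, clauses = tseitin(("implies", "A", ("and", "C", "D")))
```

For A ⇒ (C & D) this yields the six clauses shown on the slide, with `root` equal to `V2`. To assert the formula, the unit clause `[root]` would be added.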
Disadvantage of CNF • Structural information is lost • Flattens formulae into clauses. • In a Boolean circuit • Which variables are inputs? • Which are internal wires? • … • Additional variables are added. • Potentially increases the size of the DPLL search.
Structural Information • Not all structural information can be recovered [Lang & Marquis, 1989]. • Recovering structural information can improve performance [EqSatZ, LSAT]. • Why lose this information in the first place? • In addition, we can exploit more complex gates
Extra Variables • Potentially “increase” search space • One option: do not branch on any of the newly introduced “subformula” variables. • Theoretically this can increase exponentially the size of the smallest DPLL proof [Jarvisalo et al. 2004] • Empirically solvers restricted in this way can perform poorly
Extra Variables • The alternative is unrestricted branching. • However, with unrestricted branching, a CNF solver can waste a lot of time branching on variables that have become “irrelevant”.
Irrelevant Variables • A ⇒ (C & D) • A=false • formula satisfied
Irrelevant Variables • Solver must still determine that the remaining clauses are SAT • A ⇒ (C & D) • V1 ⇔ (C & D) • V2 ⇔ (A ⇒ V1)
Converting to CNF is Unnecessary • Search can be performed on the original formula. • This has been noted in previous work on circuit based solvers, e.g. [Ganai et al. 2002] • Reasoning with the original formula may permit other efficiencies • E.g. exploiting structure, & complex gates
DPLL on formulae • View formulae as DAGs • Every node has a label (True/ False/ Unassigned) • Branch on the truth value of any unassigned node • Use Boolean logic to propagate truth values to neighbouring nodes • Contradiction when node is labeled both True and False • Find consistent labeling with truth values that assigns True to root (SAT) • Or exhaust all possibilities (UNSAT)
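Label propagation on such a DAG can be sketched as a fixpoint computation. This is a deliberately simple illustration restricted to AND, OR and NOT gates; it rescans every gate rather than using lazy data structures, and the dictionary representation and the function name are assumptions.

```python
def propagate_labels(gates, labels):
    """gates: name -> (op, [children]); labels: name -> bool (partial).

    Extends labels in place. Returns False if some node would need
    both labels (a contradiction), True otherwise.
    """
    changed = [True]

    def set_label(node, val):
        if node in labels:
            return labels[node] == val
        labels[node] = val
        changed[0] = True
        return True

    while changed[0]:
        changed[0] = False
        for g, (op, kids) in gates.items():
            if op == "not":
                (k,) = kids
                if g in labels and not set_label(k, not labels[g]):
                    return False
                if k in labels and not set_label(g, not labels[k]):
                    return False
                continue
            ctrl = (op == "or")      # child value that decides the gate alone
            vals = [labels.get(k) for k in kids]
            if ctrl in vals:                              # e.g. AND with a false child
                if not set_label(g, ctrl):
                    return False
            elif all(v is not None for v in vals):        # e.g. AND, all children true
                if not set_label(g, not ctrl):
                    return False
            if g in labels and labels[g] != ctrl:         # e.g. AND true: children true
                for k in kids:
                    if not set_label(k, not ctrl):
                        return False
            elif g in labels:                             # e.g. AND false, one child open
                open_kids = [k for k in kids if k not in labels]
                if len(open_kids) == 1 and ctrl not in [labels.get(k) for k in kids]:
                    if not set_label(open_kids[0], ctrl):
                        return False
    return True

# A => (C & D) written as OR(NOT A, AND(C, D))
gates = {"root": ("or", ["nA", "cd"]), "nA": ("not", ["A"]), "cd": ("and", ["C", "D"])}
labels = {"root": True, "A": True}
consistent = propagate_labels(gates, labels)
clash = propagate_labels(gates, {"root": True, "A": True, "C": False})
```

With the root and A labeled True, propagation forces C and D to True. Adding C = False instead produces a contradiction, at which point a DPLL search would backtrack.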
[Figure: example formula DAG with \/, xor and & gate nodes over the inputs A, B, C and D; one node is labeled True and another False]
Labeling ≡ unit propagation • Labeling a node ≡ assigning a truth value to the corresponding variable in the CNF encoding • Propagating labels in the DAG ≡ unit propagation in the CNF encoding
Learning • Once a contradiction is detected a conflict clause can be learned • set of impossible node assignments • can use 1-UIP scheme (as in CNF solvers) • Learned clauses stored and used to unit propagate node truth values
Complex gates • Gates can have arbitrary degree • n-ary AND, n-ary OR, … • Gates can be complicated Boolean functions • n-ary XOR (which requires exponential number of CNF clauses) • cardinality gates (at least one, k out of n, ..)
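The exponential blow-up for n-ary XOR can be seen by enumerating the direct encoding, that is, one without auxiliary variables. The sketch below is only an illustration of that count; the function name and the integer literal convention are assumptions.

```python
from itertools import product

def xor_cnf(variables):
    """Direct CNF for x1 xor ... xor xn = true, with no auxiliary variables.

    One clause is needed for every assignment of even parity, since each
    clause can rule out exactly one total assignment.
    """
    clauses = []
    for bits in product([False, True], repeat=len(variables)):
        if sum(bits) % 2 == 0:          # falsifying assignment: exclude it
            clauses.append([-v if b else v for v, b in zip(variables, bits)])
    return clauses

sizes = {n: len(xor_cnf(list(range(1, n + 1)))) for n in (2, 3, 10)}
```

The clause count is 2^(n-1): 2 clauses for n = 2, 4 for n = 3 and 512 for n = 10. A solver with a native XOR gate avoids this, as does splitting the XOR into a chain of binary XORs at the price of extra variables.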
Label propagation • Use lazy data structures as in CNF solvers • For example, assign one child as a true watch for an AND gate • Don’t check if AND gate can be labeled true until its true watch becomes true • Some benchmarks have AND gates with thousands of children • No intrinsic loss of efficiency in using the DAG over CNF.
Structure based optimizations • We can also exploit the extra structural information the DAG provides • Two such optimizations • Don’t care propagation to deal with irrelevant subformulae • Conflict clause reduction
Don’t Care labeling • Add a third “truth” value to the DAG: “don’t care” • A node C is don’t care wrt a particular parent P • If its truth value can no longer affect the truth value of P nor that of any of its siblings under P. • Or P is don’t care. • A node C is don’t care if it is don’t care wrt all of its parents • No need to branch on don’t cares!
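One possible reading of these rules, sketched for AND and OR gates only. The concrete test used here (a labeled parent whose value is already justified by another child) is an assumption of this illustration rather than a rule stated on the slide, and the function name and gate representation are hypothetical.

```python
def dont_cares(gates, labels):
    """gates: name -> (op, [children]); labels: name -> bool (partial).

    Returns the set of nodes that are don't care wrt all of their parents.
    """
    parents = {}
    for g, (op, kids) in gates.items():
        for k in kids:
            parents.setdefault(k, []).append(g)

    dc = set()

    def dc_wrt(child, parent):
        if parent in dc:                        # "Or P is don't care"
            return True
        op, kids = gates[parent]
        if op in ("and", "or") and parent in labels:
            ctrl = (op == "or")                 # deciding child value
            if labels[parent] == ctrl:          # e.g. OR labeled true ...
                # ... and another child already justifies that label
                return any(k != child and labels.get(k) == ctrl for k in kids)
        return False

    changed = True
    while changed:                              # iterate to a fixpoint
        changed = False
        for node, ps in parents.items():
            if node not in dc and all(dc_wrt(node, p) for p in ps):
                dc.add(node)
                changed = True
    return dc

# A => (C & D) as OR(NOT A, AND(C, D)), after setting A = false
gates = {"root": ("or", ["nA", "cd"]), "nA": ("not", ["A"]), "cd": ("and", ["C", "D"])}
labels = {"root": True, "A": False, "nA": True}
irrelevant = dont_cares(gates, labels)
```

In the A = false example from the earlier slide, the subformula C & D and its inputs C and D all come out as don't care, so the search never needs to branch on them.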