Data Structures for SAT Solvers The 2-Literal Representation. Gábor Kusper gkusper@aries.ektf.hu Eszterházy Károly College Eger, Hungary. Boolean Satisfiability (SAT). Identify truth assignment that satisfies boolean formula or prove it does not exist Well-known NP-complete problem. Outline.
Data Structures for SAT Solvers: The 2-Literal Representation. Gábor Kusper, gkusper@aries.ektf.hu, Eszterházy Károly College, Eger, Hungary
Boolean Satisfiability (SAT) • Identify a truth assignment that satisfies a Boolean formula, or prove that none exists • Well-known NP-complete problem
Outline • Notation • Data structures used by SAT solvers • Literal matrix (Scherzo) • Adjacency lists (GRASP, …) • Head/tail lists (SATO) • Watched literals (Chaff) • New data structure: • 2-Literal Matrix
Notation • Terms labelled on the example: positive literal, negative literal, clause, Conjunctive Normal Form (CNF) • Example: φ = (a + c)(b + c)(¬a + ¬b + ¬c)
Literal & Clause Classification • φ = (a + ¬b)(¬a + b + ¬c)(a + c + d)(¬a + ¬b + ¬c) • a assigned 0, b assigned 1, c and d unassigned • (a + ¬b): unsat • (¬a + b + ¬c): satisfied • (a + c + d): unresolved • (¬a + ¬b + ¬c): satisfied
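The classification above can be checked mechanically. The following is a minimal sketch, not code from the slides: the class name `ClauseStatus`, the signed-integer literal encoding (variable v as `v`, its negation as `-v`) and the `value` array are assumptions made for illustration.

```java
import java.util.Arrays;

final class ClauseStatus {
    enum Status { SATISFIED, UNSATISFIED, UNIT, UNRESOLVED }

    // value[v] is 0 or 1 for an assigned variable and -1 for an unassigned one; index 0 is unused.
    static Status classify(int[] clause, int[] value) {
        int free = 0;
        for (int lit : clause) {
            int v = value[Math.abs(lit)];
            if (v == -1) {
                free++;                       // unassigned literal
            } else if ((lit > 0) == (v == 1)) {
                return Status.SATISFIED;      // one true literal satisfies the clause
            }
        }
        if (free == 0) return Status.UNSATISFIED;   // every literal is false
        return free == 1 ? Status.UNIT : Status.UNRESOLVED;
    }

    public static void main(String[] args) {
        // Hypothetical numbering: a = 1, b = 2, c = 3, d = 4.
        int[][] phi = { {1, -2}, {-1, 2, -3}, {1, 3, 4}, {-1, -2, -3} };
        int[] value = { -1, 0, 1, -1, -1 };   // a = 0, b = 1, c and d unassigned
        for (int[] clause : phi) {
            System.out.println(Arrays.toString(clause) + " -> " + classify(clause, value));
        }
    }
}
```

A unit clause is reported separately here, although by the definition used in this deck it is a special case of an unresolved clause.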
Additional Definitions • Resolution • Example: clause 1 = (¬a + b + c), clause 2 = (a + b + d) • res(1, 2, a) = (b + c + d) • Unit Propagation • An unresolved clause is unit if it has exactly one unassigned literal • Example: φ = (a + c)(b + c)(¬a + ¬b + ¬c); with a and b assigned 1, the clause (¬a + ¬b + ¬c) is unit • A unit clause has exactly one option for being satisfied: c must be set to 0 • Boolean Constraint Propagation: iterated application of unit propagation
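Both definitions are short enough to sketch in code. This is an illustrative sketch under stated assumptions: the class `ResolutionAndBcp`, the signed-integer literal encoding and the naive rescanning loop are hypothetical choices, not the data structures discussed later in the deck. The unit-propagation example assumes a and b are assigned 1, which is the only assignment under which c is forced to 0.

```java
import java.util.TreeSet;

final class ResolutionAndBcp {
    // Resolvent of c1 and c2 on variable v: all literals of both clauses except v and -v.
    // Assumes v occurs positively in one clause and negatively in the other.
    static TreeSet<Integer> resolve(int[] c1, int[] c2, int v) {
        TreeSet<Integer> out = new TreeSet<>();
        for (int lit : c1) if (Math.abs(lit) != v) out.add(lit);
        for (int lit : c2) if (Math.abs(lit) != v) out.add(lit);
        return out;
    }

    // Boolean constraint propagation: apply unit propagation until nothing changes.
    // value[v] is 0, 1, or -1 (unassigned). Returns false if some clause becomes unsat.
    static boolean propagate(int[][] clauses, int[] value) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int[] clause : clauses) {
                int free = 0;
                int freeLit = 0;
                boolean sat = false;
                for (int lit : clause) {
                    int v = value[Math.abs(lit)];
                    if (v == -1) { free++; freeLit = lit; }
                    else if ((lit > 0) == (v == 1)) { sat = true; break; }
                }
                if (sat) continue;
                if (free == 0) return false;          // conflict: all literals are false
                if (free == 1) {                      // unit clause: its last literal is forced
                    value[Math.abs(freeLit)] = freeLit > 0 ? 1 : 0;
                    changed = true;
                }
            }
        }
        return true;
    }
}
```

With a = 1, b = 2, c = 3, d = 4, `resolve(new int[]{-1, 2, 3}, new int[]{1, 2, 4}, 1)` yields the literals of (b + c + d), and `propagate` on (a + c)(b + c)(¬a + ¬b + ¬c) with a and b set to 1 sets c to 0.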
Data Structures • Literal matrix (Scherzo) • View CNF formula as a matrix, where the rows denote the clauses and the columns the variables • 2-Literal matrix, NEW • Adjacency lists (most SAT solvers) • Counter-based state maintenance • Keep counters of sat, unsat and unassigned (free) literals for each clause • Lazy data structures • Head/Tail lists (SATO) • Watched literals (Chaff)
State-of-the-art SAT Solvers • MiniSAT solver: http://www.cs.chalmers.se/Cs/Research/FormalMethods/MiniSat/ • Java SAT solver: http://www.sat4j.org/ • A paper about data structures: "Efficient data structures for backtrack search SAT solvers", Inês Lynce and João Marques-Silva
Literal Matrix • View the CNF formula as a matrix, where the rows denote the clauses and the columns the variables • Assigned variables result in unsat literals • Satisfied clauses are marked sat • Each clause is an array of bits • Each clause contains counters of sat, unsat and unassigned literals • Used in the past in Binate Covering algorithms • E.g.: Scherzo, by Coudert et al., DAC’95 and DAC’96
1-Literal Matrix Representation • We can call the Literal Matrix the 1-Literal Matrix • We encode combinations of 1-clauses; each 1-clause corresponds to a bit: 01: -, 10: +; 01: a, 10: ā • The representation: 00 sat, 10 ā, 01 a, 11 unsat
1-Literal Matrix • φ = (a + ¬b)(¬a + b + ¬c)(a + c + d)(¬a + ¬b + ¬c), columns a b c d • Initial matrix: (a + ¬b): + - x x; (¬a + b + ¬c): - + - x; (a + c + d): + x + +; (¬a + ¬b + ¬c): - - - x • After a is assigned 0: (a + ¬b): x - x x; (¬a + b + ¬c): sat; (a + c + d): x x + +; (¬a + ¬b + ¬c): sat • After b is assigned 1: (a + ¬b): x x x x; (¬a + b + ¬c): sat; (a + c + d): x x + +; (¬a + ¬b + ¬c): sat
Definition of k-clause • A k-clause has k literals • Example: φ = (a + c)(b + c)(¬a + ¬b + ¬c) • The 3-clauses in this formula are: • (¬a + ¬b + ¬c) • The 2-clauses in this formula are: • (a + c) • (b + c) • There is no unit clause, i.e., 1-clause, in this example.
2-Literal Matrix Representation • We encode combinations of 2-clauses; each 2-clause corresponds to a bit: 1000: a + e, 0100: a + ē, 0010: ā + e, 0001: ā + ē • A code is the conjunction of the 2-clauses whose bits are set, so the 16 codes can encode every Boolean function of two variables • The representation: 0 0000 sat; 1 0001 ā + ē; 2 0010 ā + e; 3 0011 ā; 4 0100 a + ē; 5 0101 ē; 6 0110 a ≡ e; 7 0111 ā ∧ ē; 8 1000 a + e; 9 1001 a ⊕ e; A 1010 e; B 1011 ā ∧ e; C 1100 a; D 1101 a ∧ ē; E 1110 a ∧ e; F 1111 unsat
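One way to read the table is to treat each 4-bit code as the conjunction of the 2-clauses whose bits are set. This reading is an interpretation, chosen because it reproduces the entries 1100: a, 1010: e, 0000: sat and 1111: unsat. The sketch below evaluates a code under that assumption; the class and constant names are hypothetical.

```java
final class TwoLiteralCode {
    // One bit per full 2-clause over the variable pair (a, e), as in the table.
    static final int A_E   = 0b1000;  // (a + e)
    static final int A_NE  = 0b0100;  // (a + ¬e)
    static final int NA_E  = 0b0010;  // (¬a + e)
    static final int NA_NE = 0b0001;  // (¬a + ¬e)

    // A code holds under (a, e) if every 2-clause selected by its bits is satisfied.
    static boolean eval(int code, boolean a, boolean e) {
        if ((code & A_E) != 0 && !(a || e)) return false;
        if ((code & A_NE) != 0 && !(a || !e)) return false;
        if ((code & NA_E) != 0 && !(!a || e)) return false;
        if ((code & NA_NE) != 0 && !(!a || !e)) return false;
        return true;
    }

    // Truth table of a code, listed for (a, e) = 00, 01, 10, 11.
    static String truthTable(int code) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            boolean a = (i & 2) != 0;
            boolean e = (i & 1) != 0;
            sb.append(eval(code, a, e) ? '1' : '0');
        }
        return sb.toString();
    }
}
```

Under this reading `truthTable(0b1100)` is `"0011"`, the truth table of a alone, while `truthTable(0b0000)` is `"1111"` (always true, sat) and `truthTable(0b1111)` is `"0000"` (never true, unsat).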
2-Literal Matrix • The column order a b c d is rearranged to a c b d, so that the columns form the pairs (a, c) and (b, d) • Rows in column order a c b d: (a + ¬b): + x - x; (¬a + ¬b + ¬c): - - - x; (¬a + b + ¬c): - - + x; (a + c + d): + + x + • Pair codes: ++ 1000, +- 0100, -+ 0010, -- 0001, xx 1111 • Codes shown in the figure: 1101 0011 0001 1111 0100 0110
2-Literal Matrix • Columns a c b d • Codes in the first matrix: 1101 0011 0001 1111 0100 0110 • Codes in the second matrix: 1100 0011 0000 1111 0100 0110 • Figure labels: (a + c) assigned 1; a assigned 1 • Pair codes: ++ 1000, +- 0100, -+ 0010, -- 0001, xx 1111
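Unit Propagation
    import java.util.BitSet;

    // nLiterals, unSatLit, subsumed and numberOfEffectiveLiterals are fields of the enclosing clause class.
    public void unitPropagation(int column, BitSet unitToProp) {
        // An unsat cell carries no literal, so there is nothing to update.
        if (nLiterals[column].equals(unSatLit))
            return;
        BitSet clone = (BitSet) nLiterals[column].clone();
        clone.and(unitToProp);
        // Every bit of the cell is also set in the unit: the clause is subsumed.
        if (clone.equals(nLiterals[column]))
            subsumed = true;
        nLiterals[column].or(unitToProp);
        // The cell has just become unsat, so the clause loses one effective literal.
        if (nLiterals[column].equals(unSatLit))
            numberOfEffectiveLiterals--;
    }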
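To try the method above on concrete codes, it can be wrapped in a small class. This is a sketch under assumptions: `TwoLiteralRow` is a hypothetical name, plain `int` masks stand in for `BitSet`, and a pair that contains only one literal is coded with the single-variable entries of the table (for example 0011 for ¬b in the pair (b, d)).

```java
final class TwoLiteralRow {
    static final int UNSAT = 0b1111;   // "xx": the pair contributes no literal

    final int[] cells;                 // one 4-bit code per variable pair
    int effectiveLiterals;             // number of cells that are not yet unsat
    boolean subsumed;                  // set once the clause is known to be satisfied

    TwoLiteralRow(int... cells) {
        this.cells = cells.clone();
        for (int c : cells) {
            if (c != UNSAT) effectiveLiterals++;
        }
    }

    // Same steps as the slide's unitPropagation, written with int masks.
    void unitPropagation(int column, int unit) {
        if (cells[column] == UNSAT) return;
        if ((cells[column] & unit) == cells[column]) subsumed = true;
        cells[column] |= unit;
        if (cells[column] == UNSAT) effectiveLiterals--;
    }

    public static void main(String[] args) {
        // (¬a + ¬b + ¬c) over the pairs (a, c) and (b, d): "--" = 0001, ¬b = 0011.
        TwoLiteralRow row = new TwoLiteralRow(0b0001, 0b0011);
        row.unitPropagation(0, 0b1100);   // propagate a
        row.unitPropagation(0, 0b1010);   // propagate c
        row.unitPropagation(1, 0b1100);   // propagate b
        System.out.println(row.effectiveLiterals + " effective cells, subsumed = " + row.subsumed);
    }
}
```

In this run the first cell goes from 0001 to 1101 and then to 1111, and the second from 0011 to 1111, so no effective cell remains and the clause is falsified by a = b = c = 1. Propagating a into a row whose first cell is 1000 (a + c) marks that row as subsumed instead.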
n-Literal Matrix Representation • We encode combinations of n-clauses; each n-clause corresponds to a bit • It can encode every Boolean function of n variables • We need 2^n bits for each group of n variables • The 1-literal and the 2-literal matrix have the same size: 2 bits for one variable, 4 bits for two
1-Literal vs. 2-Literal Matrix • 1-Literal Matrix: • Advantages: • Easy to implement • Unit propagation results in either a sat clause or an unsat literal • Disadvantages: • Wasteful: 4 bits store only 9 distinct pieces of information
1-Literal vs. 2-Literal Matrix • 2-Literal Matrix: • Advantages: • Economical: 4 bits store 15 distinct pieces of information • One can propagate more (1110) or less (1000) information at once than a normal unit (1100) • Disadvantages: • Unit propagation by a 2-literal does not necessarily result in a sat clause or an unsat literal
Standard CNF Representation • Adjacency list representation: • Each clause contains: • A list of literals • Counter of sat, unsat and unassigned (free) literals • Each variable x keeps a list with all clauses with literals on x • Number of references kept in variables equals total number of literals, |L| • Used in some SAT solvers: • GRASP • rel-sat (some versions) • POSIT • etc.
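The adjacency list representation with counters can be sketched as follows. The class `CounterFormula`, the split into positive and negative occurrence lists, and the signed-integer literal encoding are assumptions made for illustration, not the layout of GRASP, rel-sat or POSIT. The sketch assumes no variable occurs twice in a clause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CounterFormula {
    final int[][] clauses;
    final int[] satCount;      // satisfied literals per clause
    final int[] unsatCount;    // falsified literals per clause
    final int[] value;         // -1 unassigned, otherwise 0 or 1
    // Occurrence lists: one clause index per literal, |L| references in total.
    final List<List<Integer>> posOcc = new ArrayList<>();
    final List<List<Integer>> negOcc = new ArrayList<>();

    CounterFormula(int numVars, int[][] clauses) {
        this.clauses = clauses;
        satCount = new int[clauses.length];
        unsatCount = new int[clauses.length];
        value = new int[numVars + 1];
        Arrays.fill(value, -1);
        for (int v = 0; v <= numVars; v++) {
            posOcc.add(new ArrayList<>());
            negOcc.add(new ArrayList<>());
        }
        for (int i = 0; i < clauses.length; i++) {
            for (int lit : clauses[i]) {
                (lit > 0 ? posOcc : negOcc).get(Math.abs(lit)).add(i);
            }
        }
    }

    // Only clauses that mention var are visited; their counters are updated.
    void assign(int var, int val) {
        value[var] = val;
        for (int i : (val == 1 ? posOcc : negOcc).get(var)) satCount[i]++;
        for (int i : (val == 1 ? negOcc : posOcc).get(var)) unsatCount[i]++;
    }

    // Backtracking has to undo the same counter updates.
    void unassign(int var) {
        int val = value[var];
        for (int i : (val == 1 ? posOcc : negOcc).get(var)) satCount[i]--;
        for (int i : (val == 1 ? negOcc : posOcc).get(var)) unsatCount[i]--;
        value[var] = -1;
    }

    int freeCount(int i) { return clauses[i].length - satCount[i] - unsatCount[i]; }
    boolean isSatisfied(int i) { return satCount[i] > 0; }
    boolean isUnsatisfied(int i) { return unsatCount[i] == clauses[i].length; }
    boolean isUnit(int i) { return satCount[i] == 0 && freeCount(i) == 1; }
}
```

The counters make the state of every clause available at any time, at the cost of touching each clause of the assigned variable on both assignment and backtracking.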
Lazy Data Structures • Head/Tail Lists • Each clause contains a list of literals • Each unresolved clause is only referenced in two unassigned variables (but possibly in several assigned variables) • Each time a variable is assigned, referenced clauses either become unit, sat, unsat or a new reference becomes associated with another of the clause’s unassigned variables • Unit and unsat clauses can then be identified in constant time • Clause can be declared unit/unsat by inspection of two references • When backtracking, previous references are recovered • Knowledge of the order of literal assignments is maintained and it is essential
Examples of Lazy Structures • Figure legend: clause literals are drawn as unassigned, satisfied or unsatisfied literals; @d marks a literal assigned at search decision depth d; the figure also marks the literal references kept in variables • Largest number of literal references in variables: |L| • Smallest number of literal references in variables: 2|C|
Head/Tail Lists • [Figure: positions of the head (H) and tail (T) references in a clause whose literals are assigned at decision depths @1 to @5; panels labelled "Unit clause" and "Backtracking"]
Lazy Data Structures • Watched Literals • Each unresolved clause is only referenced in two unassigned variables (and not in any assigned variables) • Each time a variable is assigned, referenced clauses either become unit, sat, unsat or, of the two clause references, one becomes associated with another of the clause’s unassigned variables • Unit and unsat clauses can only be identified in linear time • Must visit all literals to confirm that clause is unit or unsat • When backtracking, do nothing • Knowledge of the order of literal assignments in clause is not (and cannot be) maintained
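A two-watched-literal scheme along these lines can be sketched as follows. This is a simplified illustration, not Chaff's implementation: the class `WatchedLiterals`, the convention that the watched literals sit at positions 0 and 1 of each clause, and the `units` list are assumptions, and every clause is assumed to have at least two literals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WatchedLiterals {
    final int[][] clauses;                    // watched literals are kept at positions 0 and 1
    final int[] value;                        // -1 unassigned, otherwise 0 or 1
    final List<List<Integer>> watchers = new ArrayList<>();   // literal -> clauses watching it
    final List<Integer> units = new ArrayList<>();            // implied literals found so far

    WatchedLiterals(int numVars, int[][] input) {
        clauses = new int[input.length][];
        value = new int[numVars + 1];
        Arrays.fill(value, -1);
        for (int i = 0; i < 2 * numVars + 2; i++) watchers.add(new ArrayList<>());
        for (int i = 0; i < input.length; i++) {
            clauses[i] = input[i].clone();    // literals are reordered, so work on a copy
            watchers.get(idx(clauses[i][0])).add(i);
            watchers.get(idx(clauses[i][1])).add(i);
        }
    }

    static int idx(int lit) { return lit > 0 ? 2 * lit : 2 * (-lit) + 1; }

    boolean isTrue(int lit)  { int v = value[Math.abs(lit)]; return v != -1 && (lit > 0) == (v == 1); }
    boolean isFalse(int lit) { int v = value[Math.abs(lit)]; return v != -1 && (lit > 0) != (v == 1); }

    // Assigns var and visits only the clauses watching the literal that became false.
    // Returns false on conflict; implied literals are appended to units, not assigned.
    boolean assign(int var, int val) {
        value[var] = val;
        int falseLit = val == 1 ? -var : var;
        List<Integer> ws = watchers.get(idx(falseLit));
        List<Integer> keep = new ArrayList<>();
        boolean ok = true;
        for (int ci : ws) {
            int[] c = clauses[ci];
            if (c[0] == falseLit) { c[0] = c[1]; c[1] = falseLit; }   // false watch goes to position 1
            boolean moved = false;
            for (int j = 2; j < c.length; j++) {      // linear scan for a replacement watch
                if (!isFalse(c[j])) {
                    c[1] = c[j];
                    c[j] = falseLit;
                    watchers.get(idx(c[1])).add(ci);
                    moved = true;
                    break;
                }
            }
            if (moved) continue;
            keep.add(ci);                             // no replacement: keep watching falseLit
            if (isTrue(c[0])) continue;               // clause is satisfied
            if (isFalse(c[0])) ok = false;            // all literals false: unsat clause
            else units.add(c[0]);                     // unit clause: c[0] is implied
        }
        ws.clear();
        ws.addAll(keep);
        return ok;
    }

    // Backtracking only clears the value; the watches are left where they are.
    void unassign(int var) { value[var] = -1; }
}
```

For (a + c)(b + c)(¬a + ¬b + ¬c) with a = 1, b = 2, c = 3, assigning a to 1 only moves a watch in the third clause, and assigning b to 1 then reports ¬c as an implied literal.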
Watched Literals • [Figure: positions of the two watched (W) references in a clause whose literals are assigned at decision depths @1 to @5; panels labelled "Unit clause" and "Backtracking"]
HT vs. WL • Head/Tail Lists: • Advantages: • Order relation between the two (H and T) references • More efficient identification of unit and unsat clauses • When one reference attempts to visit the other, clause is either unit or unsat • Better accuracy in characterizing the dynamic size of clauses • Disadvantages: • Larger overhead during backtracking • Worst-case number of references for each clause equals number of literals • Total (worst-case): |L| • Similar to adjacency lists in the worst-case
HT vs. WL • Watched Literals (WL): • Advantages: • Smaller overhead • Constant number (2) of references for each clause • Total (worst-case): 2|C| • Twice the number of clauses, and |C| << |L| • Disadvantages: • Lack of order relation between the two (W) references • Identification of new unit or unsat clauses is always linear in clause size • Worse accuracy in characterizing the dynamic size of clauses
Matrix vs. Lazy Data Structures • Matrix data structures: each clause is an array of bits • Lazy data structures: each clause is a list of literals • Matrix data structures: • Advantages: • Can identify not only unit clauses but also binary and ternary ones • Disadvantages: • Also needs space for literals that do not occur in the clause • Unit propagation takes |C| time • Backtracking takes |C| time
Matrix vs. Lazy Data Structures • Lazy data structures: • Advantages: • Unit propagation takes |P| + |N| time • |P| + |N| <= |C| • Disadvantages: • The current size of a clause is not known, so only unit clauses can be identified