Combining Component Caching and Clause Learning for Effective Model Counting
Tian Sang (University of Washington), Fahiem Bacchus (U Toronto), Paul Beame (UW), Henry Kautz (UW), & Toniann Pitassi (U Toronto)
Why #SAT? • Prototypical #P-complete problem • Natural encoding for counting problems • Test-set size • CMOS power consumption • Can encode probabilistic inference
Generality • NP-complete: SAT • #P-complete: #SAT, Bayesian networks, bounded-alternation quantified Boolean formulas • PSPACE-complete: quantified Boolean formulas, stochastic SAT
Our Approach • Good old Davis-Putnam-Logemann-Loveland • Clause learning ("nogood" caching) • Bounded component analysis • Formula caching
DPLL with Clause Learning

  DPLL(F)
    while F contains a unit clause (y)
      F ← F|y
    if F is empty, report satisfiable and halt
    if F contains the empty clause
      add a conflict clause C to F
      return false
    choose a literal x
    return DPLL(F|x) || DPLL(F|¬x)
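For concreteness, here is a minimal Python sketch of the DPLL skeleton above (satisfiability check only; the clause-learning step is omitted, and the function names are illustrative rather than the authors' code):

```python
# Minimal DPLL sketch (illustrative only, not the authors' implementation).
# A formula is a list of clauses; a clause is a frozenset of non-zero ints,
# where -v denotes the negation of variable v.

def simplify(clauses, lit):
    """Return F|lit: drop satisfied clauses, remove the falsified literal."""
    out = []
    for c in clauses:
        if lit in c:
            continue                      # clause satisfied, drop it
        if -lit in c:
            c = c - {-lit}                # literal falsified, shrink clause
        out.append(c)
    return out

def dpll(clauses):
    """Return True iff the clause set is satisfiable."""
    # Unit propagation: while a unit clause (y) exists, set y.
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        clauses = simplify(clauses, next(iter(unit)))
    if not clauses:
        return True                       # no clauses left: satisfiable
    if any(len(c) == 0 for c in clauses):
        return False                      # empty clause: conflict
    x = next(iter(clauses[0]))            # branch on some literal
    return dpll(simplify(clauses, x)) or dpll(simplify(clauses, -x))

# Example: (a or b) and (not a or b) and (not b or c)
print(dpll([frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2, 3})]))  # True
```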
Conflict Graph (figure)

  Known clauses: (p ∨ q ∨ a), (¬a ∨ ¬b ∨ ¬t), (t ∨ ¬x1), (t ∨ ¬x2), (t ∨ ¬x3), (x1 ∨ x2 ∨ x3 ∨ y), (x2 ∨ ¬y)
  Current decisions: p = false, q = false, b = true
  Unit propagation reaches a conflict (both y and ¬y are implied)
  Decision scheme learns (p ∨ q ∨ ¬b); 1-UIP scheme learns (t)
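Assuming the clause polarities shown in the slide above (a best-effort reading of the figure), the conflict unfolds by unit propagation as follows:

```latex
\begin{aligned}
\lnot p,\ \lnot q &\Rightarrow a && \text{from } (p \lor q \lor a)\\
a,\ b &\Rightarrow \lnot t && \text{from } (\lnot a \lor \lnot b \lor \lnot t)\\
\lnot t &\Rightarrow \lnot x_1,\ \lnot x_2,\ \lnot x_3 && \text{from } (t \lor \lnot x_i)\\
\lnot x_1,\ \lnot x_2,\ \lnot x_3 &\Rightarrow y && \text{from } (x_1 \lor x_2 \lor x_3 \lor y)\\
\lnot x_2 &\Rightarrow \lnot y && \text{from } (x_2 \lor \lnot y);\ \text{conflict with } y
\end{aligned}
```

The decision scheme records the negated decisions, giving (p ∨ q ∨ ¬b); the 1-UIP scheme records the negation of the unique implication point ¬t, giving the unit clause (t).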
Component Analysis • Can use DPLL to count models • Just don’t stop when the first assignment is found • If the formula breaks into separate components (no shared variables), count each separately and multiply the results: #SAT(C1 ∧ C2) = #SAT(C1) × #SAT(C2) (Bayardo & Schrag 1996)
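As an illustration of the decomposition step (not the authors' implementation), components can be found with a union-find pass that merges any two variables appearing in the same clause; the names below are invented:

```python
# Illustrative component splitting: clauses sharing a variable go in the
# same component; disjoint components can be counted independently.
# Assumes no empty clauses (a conflict is detected before splitting).

def to_components(clauses):
    parent = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    def union(u, v):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    for c in clauses:
        vars_ = [abs(l) for l in c]
        for v in vars_:
            parent.setdefault(v, v)
        for v in vars_[1:]:
            union(vars_[0], v)

    groups = {}
    for c in clauses:
        root = find(abs(next(iter(c))))
        groups.setdefault(root, []).append(c)
    return list(groups.values())

# (x1 or x2) and (x3 or x4) splits into two independent components.
print(to_components([frozenset({1, 2}), frozenset({3, 4})]))
```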
Formula Caching • New idea: cache the number of models of residual formulas at each node • Bacchus, Dalmao & Pitassi 2003 • Beame, Impagliazzo, Pitassi, & Segerlind 2003 • Matches the time/space tradeoffs of the best known exact probabilistic inference algorithms
#SAT with Component Caching

  #SAT(F)                       // returns fraction of all truth assignments that satisfy F
    a = 1;
    for each G ∈ to_components(F) {
      if (G == ∅) m = 1;
      else if (∅ ∈ G) m = 0;    // G contains the empty clause
      else if (in_cache(G)) m = cache_value(G);
      else {
        select v ∈ G;
        m = ½ * #SAT(G|v) + ½ * #SAT(G|¬v);
        insert_cache(G, m);
      }
      if (m == 0) return 0;
      a = a * m;
    }
    return a;
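A runnable Python rendering of this pseudocode, reusing simplify and to_components from the sketches above; it mirrors the slide's structure but hoists the empty-clause check before splitting and uses a naive branching and caching policy (illustrative only):

```python
from fractions import Fraction

cache = {}                                  # canonical component -> model fraction

def key(clauses):
    """Canonical, hashable form of a component."""
    return frozenset(clauses)

def count_sat(clauses):
    """Fraction of all truth assignments that satisfy the clause set."""
    if any(len(c) == 0 for c in clauses):
        return Fraction(0)                  # contains the empty clause
    a = Fraction(1)
    for g in to_components(clauses):        # an empty formula yields no components
        k = key(g)
        if k in cache:
            m = cache[k]
        else:
            v = abs(next(iter(g[0])))       # select a branch variable
            m = (count_sat(simplify(g, v)) + count_sat(simplify(g, -v))) / 2
            cache[k] = m
        if m == 0:
            return Fraction(0)
        a *= m
    return a

# (x1 or x2): 3 of the 4 assignments over {x1, x2} are models.
print(count_sat([frozenset({1, 2})]))       # 3/4
```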
Putting it All Together • Goal: combine clause learning, component analysis, and formula caching to create a practical #SAT algorithm • Not quite as straightforward as it looks!
Issue 1: How Much to Cache? • Everything? Infeasible: often > 10,000,000 nodes • Only sub-formulas on the current branch? Linear space; similar to recursive conditioning [Darwiche 2002] • Can we do better?
Age versus Cumulative Hits (plot; age = time elapsed since the entry was cached)
Efficient Cache Management • Age-bounded caching • Separate-chaining hash table • Lazy deletion of entries older than K when searching chains • Constant amortized time
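A sketch of what age-bounded caching with lazy deletion might look like; the bucket count, the age bound, and the global clock are illustrative choices, not taken from the actual implementation:

```python
class AgeBoundedCache:
    """Separate-chaining hash table whose entries expire after max_age
    insertions; stale entries are removed only when their chain is scanned,
    giving constant amortized overhead per operation."""

    def __init__(self, num_buckets=1 << 16, max_age=100_000):
        self.buckets = [[] for _ in range(num_buckets)]
        self.max_age = max_age              # plays the role of the bound K
        self.clock = 0                      # advances on every insertion

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def _prune(self, chain):
        """Lazy deletion: drop entries older than max_age while scanning."""
        chain[:] = [(k, v, t) for (k, v, t) in chain
                    if self.clock - t <= self.max_age]

    def insert(self, key, value):
        self.clock += 1
        chain = self._chain(key)
        self._prune(chain)
        chain.append((key, value, self.clock))

    def lookup(self, key):
        chain = self._chain(key)
        self._prune(chain)
        for k, v, _ in chain:
            if k == key:
                return v
        return None
```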
Issue 2: Interaction of Component Analysis & Clause Learning • As clause learning progresses, the formula becomes huge • 1,000 original clauses can grow to 1,000,000 learned clauses • Finding connected components becomes too costly • Components that use learned clauses are unlikely to recur!
Bounded Component Analysis • Use only clauses derived from original formula for • Component analysis • “Keys” for cached entries • Use all the learned clauses for unit propagation • Can this possibly be sound? Almost!
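One possible rendering of that split, reusing simplify, to_components, and key from the earlier sketches (the structure and names are assumptions, not the authors' code): unit propagation sees original plus learned clauses, while components and cache keys come from the original clauses alone.

```python
def unit_propagate(clauses):
    """Propagate unit clauses to fixpoint; return (residual clauses, implied literals)."""
    implied = []
    while True:
        if any(len(c) == 0 for c in clauses):
            return clauses, implied          # conflict reached
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, implied
        lit = next(iter(unit))
        implied.append(lit)
        clauses = simplify(clauses, lit)

def split_for_counting(original, learned):
    """Learned clauses strengthen propagation only; component analysis and
    cache keys use the residual of the ORIGINAL clauses."""
    _, implied = unit_propagate(list(original) + list(learned))
    residual = list(original)
    for lit in implied:
        residual = simplify(residual, lit)   # restrict originals by implied literals
    components = to_components(residual)
    return components, [key(g) for g in components]
```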
Safety Theorem

  Given: original formula F, learned clauses G, and a partial assignment π such that F|π is satisfiable; Ai is a component of F|π (F|π = A1 ∧ A2 ∧ A3 ∧ ...); σ is an assignment that satisfies Ai
  Then: σ can be extended to an assignment that satisfies G|π
  ⇒ It is safe to use learned clauses for unit propagation on satisfiable sub-formulas
UNSAT Sub-formulas • But if F|π is unsatisfiable, all bets are off... • Without component caching there is still no problem, because the final value is 0 in any case • With component caching, incorrect values could be cached • Solution: flush siblings (& their descendants) of UNSAT components from the cache
Safe Caching + Clause Learning Implementation

  ...
  else if (∅ ∈ G) {
    m = 0;
    add a conflict clause;
  }
  ...
  if (m == 0) {
    flush_cache(siblings(G));
    if (G is not last child of F)
      flush_cache(G);
    return 0;
  }
  a = a * m;
  ...
Evaluation • Implementation based on zChaff (Moskewicz, Madigan, Zhao, Zhang, & Malik 2001) • Benchmarks • Random formulas • Pebbling graph formulas • Circuit synthesis • Logistics planning
Random 3-SAT, 75 Variables (plot; the sat/unsat threshold is marked)
Random 3-SAT Results (plots for 75 variables at R = 1.0, 1.4, 1.6, and 2.0)
Results: Pebbling Formulas X means time-out after 12 hours
Summary • A practical exact model-counting algorithm can be built by the careful combination of • Bounded component analysis • Component caching • Clause learning • Outperforms the best previous algorithm by orders of magnitude
What’s Next? • Better heuristics • component ordering • variable branching • Incremental component analysis • Currently consumes 10-50% of run time! • Applications to Bayesian networks • Compiler for discrete BN to weighted #SAT • Direct BN implementation • Applications to other #P problems • Testing, model-based diagnosis, …
Results: Planning Formulas X means time-out after 12 hours
Results: Circuit Synthesis X means time-out after 12 hours
Bayesian Nets to Weighted Counting • Introduce new variables so that all internal variables are deterministic (figure: two-node network A → B, then augmented with new variables P and Q)
Bayesian Nets to Weighted Counting • Weight of a model is the product of its variable weights • Weight of a formula is the sum of the weights of its models
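In symbols (notation mine, with w(v, b) the weight of assigning value b to variable v):

```latex
\mathrm{weight}(\sigma) = \prod_{v} w\bigl(v, \sigma(v)\bigr),
\qquad
\mathrm{weight}(F) = \sum_{\sigma \,\models\, F} \mathrm{weight}(\sigma)
```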
Bayesian Nets to Weighted Counting • Let F be the formula defining all internal variables • Pr(query) = weight(F ∧ query)
Bayesian Nets to Counting • Unweighted counting is the case where all non-defined variables have weight 0.5 • Introduce sets of variables to define other probabilities to desired accuracy • In practice: just modify the #SAT algorithm to weighted #SAT
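A sketch of that last bullet, reusing simplify from the earlier DPLL sketch: the only change from unweighted counting is that the two branches are combined with the branch variable's literal weights instead of 1/2 and 1/2 (the weights and names here are illustrative):

```python
from fractions import Fraction

def weighted_count(clauses, weight):
    """weight[v] = (w_pos, w_neg) for variable v, with w_pos + w_neg = 1.
    Returns the total weight of satisfying assignments (free variables
    contribute a factor of 1 because their weights sum to 1)."""
    if any(len(c) == 0 for c in clauses):
        return Fraction(0)                  # conflict: weight 0
    if not clauses:
        return Fraction(1)                  # no constraints left
    v = abs(next(iter(clauses[0])))         # pick a branch variable
    w_pos, w_neg = weight[v]
    return (w_pos * weighted_count(simplify(clauses, v), weight) +
            w_neg * weighted_count(simplify(clauses, -v), weight))

# Unweighted counting is the special case w_pos = w_neg = 1/2 for every
# variable; chance variables of a Bayesian network get their probabilities.
w = {1: (Fraction(1, 2), Fraction(1, 2)), 2: (Fraction(3, 4), Fraction(1, 4))}
print(weighted_count([frozenset({1, 2})], w))   # weight of (x1 or x2) = 7/8
```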