Combining Component Caching and Clause Learning for Effective Model Counting
Tian Sang (University of Washington), Fahiem Bacchus (U Toronto), Paul Beame (UW), Henry Kautz (UW), & Toniann Pitassi (U Toronto)
Why #SAT? • Prototypical #P-complete problem • Natural encoding for counting problems • Test-set size • CMOS power consumption • Can encode probabilistic inference
Generality • NP-complete: SAT • #P-complete: #SAT, Bayesian networks, bounded-alternation quantified Boolean formulas • PSPACE-complete: quantified Boolean formulas, stochastic SAT
Our Approach • Good old Davis-Putnam-Logemann-Loveland • Clause learning ("nogood" caching) • Bounded component analysis • Formula caching
DPLL with Clause Learning

  DPLL(F)
    while F contains a unit clause (y)
      F ← F|y
    if F is empty, report satisfiable and halt
    if F contains the empty clause
      add a conflict clause C to F
      return false
    choose a literal x
    return DPLL(F|x) || DPLL(F|¬x)
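For concreteness, here is a minimal Python sketch of the DPLL skeleton above (satisfiability check only; the clause-learning step is omitted, and the function names are illustrative rather than the authors' code):

```python
# Minimal DPLL sketch (illustrative only, not the authors' implementation).
# A formula is a list of clauses; a clause is a frozenset of non-zero ints,
# where -v denotes the negation of variable v.

def simplify(clauses, lit):
    """Return F|lit: drop satisfied clauses, remove the falsified literal."""
    out = []
    for c in clauses:
        if lit in c:
            continue                      # clause satisfied, drop it
        if -lit in c:
            c = c - {-lit}                # literal falsified, shrink clause
        out.append(c)
    return out

def dpll(clauses):
    """Return True iff the clause set is satisfiable."""
    # Unit propagation: while a unit clause (y) exists, set y.
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        clauses = simplify(clauses, next(iter(unit)))
    if not clauses:
        return True                       # no clauses left: satisfiable
    if any(len(c) == 0 for c in clauses):
        return False                      # empty clause: conflict
    x = next(iter(clauses[0]))            # branch on some literal
    return dpll(simplify(clauses, x)) or dpll(simplify(clauses, -x))

# Example: (a or b) and (not a or b) and (not b or c)
print(dpll([frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2, 3})]))  # True
```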
Conflict Graph (figure)

  Known clauses: (p ∨ q ∨ a), (¬a ∨ ¬b ∨ ¬t), (t ∨ ¬x1), (t ∨ ¬x2), (t ∨ ¬x3), (x1 ∨ x2 ∨ x3 ∨ y), (x2 ∨ ¬y)
  Current decisions: p = false, q = false, b = true
  Unit propagation reaches a conflict (both y and ¬y are implied)
  Decision scheme learns (p ∨ q ∨ ¬b); 1-UIP scheme learns (t)
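Assuming the clause polarities shown in the slide above (a best-effort reading of the figure), the conflict unfolds by unit propagation as follows:

```latex
\begin{aligned}
\lnot p,\ \lnot q &\Rightarrow a && \text{from } (p \lor q \lor a)\\
a,\ b &\Rightarrow \lnot t && \text{from } (\lnot a \lor \lnot b \lor \lnot t)\\
\lnot t &\Rightarrow \lnot x_1,\ \lnot x_2,\ \lnot x_3 && \text{from } (t \lor \lnot x_i)\\
\lnot x_1,\ \lnot x_2,\ \lnot x_3 &\Rightarrow y && \text{from } (x_1 \lor x_2 \lor x_3 \lor y)\\
\lnot x_2 &\Rightarrow \lnot y && \text{from } (x_2 \lor \lnot y);\ \text{conflict with } y
\end{aligned}
```

The decision scheme records the negated decisions, giving (p ∨ q ∨ ¬b); the 1-UIP scheme records the negation of the unique implication point ¬t, giving the unit clause (t).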
Component Analysis • Can use DPLL to count models • Just don’t stop when the first assignment is found • If the formula breaks into separate components (no shared variables), count each separately and multiply the results: #SAT(C1 ∧ C2) = #SAT(C1) × #SAT(C2) (Bayardo & Schrag 1996)
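As an illustration of the decomposition step (not the authors' implementation), components can be found with a union-find pass that merges any two variables appearing in the same clause; the names below are invented:

```python
# Illustrative component splitting: clauses sharing a variable go in the
# same component; disjoint components can be counted independently.
# Assumes no empty clauses (a conflict is detected before splitting).

def to_components(clauses):
    parent = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    def union(u, v):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    for c in clauses:
        vars_ = [abs(l) for l in c]
        for v in vars_:
            parent.setdefault(v, v)
        for v in vars_[1:]:
            union(vars_[0], v)

    groups = {}
    for c in clauses:
        root = find(abs(next(iter(c))))
        groups.setdefault(root, []).append(c)
    return list(groups.values())

# (x1 or x2) and (x3 or x4) splits into two independent components.
print(to_components([frozenset({1, 2}), frozenset({3, 4})]))
```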
Formula Caching • New idea: cache the number of models of residual formulas at each node • Bacchus, Dalmao & Pitassi 2003 • Beame, Impagliazzo, Pitassi, & Segerlind 2003 • Matches the time/space tradeoffs of the best known exact probabilistic inference algorithms
#SAT with Component Caching

  #SAT(F)                       // returns fraction of all truth assignments that satisfy F
    a = 1;
    for each G ∈ to_components(F) {
      if (G == ∅) m = 1;
      else if (∅ ∈ G) m = 0;    // G contains the empty clause
      else if (in_cache(G)) m = cache_value(G);
      else {
        select v ∈ G;
        m = ½ * #SAT(G|v) + ½ * #SAT(G|¬v);
        insert_cache(G, m);
      }
      if (m == 0) return 0;
      a = a * m;
    }
    return a;
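A runnable Python rendering of this pseudocode, reusing simplify and to_components from the sketches above; it mirrors the slide's structure but hoists the empty-clause check before splitting and uses a naive branching and caching policy (illustrative only):

```python
from fractions import Fraction

cache = {}                                  # canonical component -> model fraction

def key(clauses):
    """Canonical, hashable form of a component."""
    return frozenset(clauses)

def count_sat(clauses):
    """Fraction of all truth assignments that satisfy the clause set."""
    if any(len(c) == 0 for c in clauses):
        return Fraction(0)                  # contains the empty clause
    a = Fraction(1)
    for g in to_components(clauses):        # an empty formula yields no components
        k = key(g)
        if k in cache:
            m = cache[k]
        else:
            v = abs(next(iter(g[0])))       # select a branch variable
            m = (count_sat(simplify(g, v)) + count_sat(simplify(g, -v))) / 2
            cache[k] = m
        if m == 0:
            return Fraction(0)
        a *= m
    return a

# (x1 or x2): 3 of the 4 assignments over {x1, x2} are models.
print(count_sat([frozenset({1, 2})]))       # 3/4
```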
Putting it All Together • Goal: combine clause learning, component analysis, and formula caching to create a practical #SAT algorithm • Not quite as straightforward as it looks!
Issue 1: How Much to Cache? • Everything? Infeasible: often > 10,000,000 nodes • Only sub-formulas on the current branch? Linear space; similar to recursive conditioning [Darwiche 2002] • Can we do better?
Age versus Cumulative Hits (plot; age = time elapsed since the entry was cached)
Efficient Cache Management • Age-bounded caching • Separate-chaining hash table • Lazy deletion of entries older than K when searching chains • Constant amortized time
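A sketch of what age-bounded caching with lazy deletion might look like; the bucket count, the age bound, and the global clock are illustrative choices, not taken from the actual implementation:

```python
class AgeBoundedCache:
    """Separate-chaining hash table whose entries expire after max_age
    insertions; stale entries are removed only when their chain is scanned,
    giving constant amortized overhead per operation."""

    def __init__(self, num_buckets=1 << 16, max_age=100_000):
        self.buckets = [[] for _ in range(num_buckets)]
        self.max_age = max_age              # plays the role of the bound K
        self.clock = 0                      # advances on every insertion

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def _prune(self, chain):
        """Lazy deletion: drop entries older than max_age while scanning."""
        chain[:] = [(k, v, t) for (k, v, t) in chain
                    if self.clock - t <= self.max_age]

    def insert(self, key, value):
        self.clock += 1
        chain = self._chain(key)
        self._prune(chain)
        chain.append((key, value, self.clock))

    def lookup(self, key):
        chain = self._chain(key)
        self._prune(chain)
        for k, v, _ in chain:
            if k == key:
                return v
        return None
```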
Issue 2: Interaction of Component Analysis & Clause Learning • As clause learning progresses, the formula becomes huge • 1,000 original clauses can grow to 1,000,000 learned clauses • Finding connected components becomes too costly • Components that use learned clauses are unlikely to recur!
Bounded Component Analysis • Use only clauses derived from original formula for • Component analysis • “Keys” for cached entries • Use all the learned clauses for unit propagation • Can this possibly be sound? Almost!
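One possible rendering of that split, reusing simplify, to_components, and key from the earlier sketches (the structure and names are assumptions, not the authors' code): unit propagation sees original plus learned clauses, while components and cache keys come from the original clauses alone.

```python
def unit_propagate(clauses):
    """Propagate unit clauses to fixpoint; return (residual clauses, implied literals)."""
    implied = []
    while True:
        if any(len(c) == 0 for c in clauses):
            return clauses, implied          # conflict reached
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, implied
        lit = next(iter(unit))
        implied.append(lit)
        clauses = simplify(clauses, lit)

def split_for_counting(original, learned):
    """Learned clauses strengthen propagation only; component analysis and
    cache keys use the residual of the ORIGINAL clauses."""
    _, implied = unit_propagate(list(original) + list(learned))
    residual = list(original)
    for lit in implied:
        residual = simplify(residual, lit)   # restrict originals by implied literals
    components = to_components(residual)
    return components, [key(g) for g in components]
```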
Safety Theorem

  Given: original formula F, learned clauses G, and a partial assignment π such that F|π is satisfiable; Ai is a component of F|π (F|π = A1 ∧ A2 ∧ A3 ∧ ...); σ is an assignment that satisfies Ai
  Then: σ can be extended to an assignment that satisfies G|π
  ⇒ It is safe to use learned clauses for unit propagation on satisfiable sub-formulas
UNSAT Sub-formulas • But if F|π is unsatisfiable, all bets are off... • Without component caching there is still no problem, because the final value is 0 in any case • With component caching, incorrect values could be cached • Solution: flush siblings (& their descendants) of UNSAT components from the cache
Safe Caching + Clause Learning Implementation

  ...
  else if (∅ ∈ G) {
    m = 0;
    add a conflict clause;
  }
  ...
  if (m == 0) {
    flush_cache(siblings(G));
    if (G is not last child of F)
      flush_cache(G);
    return 0;
  }
  a = a * m;
  ...
Evaluation • Implementation based on zChaff (Moskewicz, Madigan, Zhao, Zhang, & Malik 2001) • Benchmarks • Random formulas • Pebbling graph formulas • Circuit synthesis • Logistics planning
Random 3-SAT, 75 Variables (plot; the sat/unsat threshold is marked)
Random 3-SAT Results (plots for 75 variables at R = 1.0, 1.4, 1.6, and 2.0)
Results: Pebbling Formulas X means time-out after 12 hours
Summary • A practical exact model-counting algorithm can be built by the careful combination of • Bounded component analysis • Component caching • Clause learning • Outperforms the best previous algorithm by orders of magnitude
What’s Next? • Better heuristics • component ordering • variable branching • Incremental component analysis • Currently consumes 10-50% of run time! • Applications to Bayesian networks • Compiler for discrete BN to weighted #SAT • Direct BN implementation • Applications to other #P problems • Testing, model-based diagnosis, …
Results: Planning Formulas X means time-out after 12 hours
Results: Circuit Synthesis X means time-out after 12 hours
Bayesian Nets to Weighted Counting • Introduce new variables so that all internal variables are deterministic (figure: two-node network A → B, then augmented with new variables P and Q)
Bayesian Nets to Weighted Counting • Weight of a model is the product of its variable weights • Weight of a formula is the sum of the weights of its models
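In symbols (notation mine, with w(v, b) the weight of assigning value b to variable v):

```latex
\mathrm{weight}(\sigma) = \prod_{v} w\bigl(v, \sigma(v)\bigr),
\qquad
\mathrm{weight}(F) = \sum_{\sigma \,\models\, F} \mathrm{weight}(\sigma)
```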
Bayesian Nets to Weighted Counting • Let F be the formula defining all internal variables • Pr(query) = weight(F ∧ query)
Bayesian Nets to Counting • Unweighted counting is the case where all non-defined variables have weight 0.5 • Introduce sets of variables to define other probabilities to desired accuracy • In practice: just modify the #SAT algorithm to weighted #SAT
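A sketch of that last bullet, reusing simplify from the earlier DPLL sketch: the only change from unweighted counting is that the two branches are combined with the branch variable's literal weights instead of 1/2 and 1/2 (the weights and names here are illustrative):

```python
from fractions import Fraction

def weighted_count(clauses, weight):
    """weight[v] = (w_pos, w_neg) for variable v, with w_pos + w_neg = 1.
    Returns the total weight of satisfying assignments (free variables
    contribute a factor of 1 because their weights sum to 1)."""
    if any(len(c) == 0 for c in clauses):
        return Fraction(0)                  # conflict: weight 0
    if not clauses:
        return Fraction(1)                  # no constraints left
    v = abs(next(iter(clauses[0])))         # pick a branch variable
    w_pos, w_neg = weight[v]
    return (w_pos * weighted_count(simplify(clauses, v), weight) +
            w_neg * weighted_count(simplify(clauses, -v), weight))

# Unweighted counting is the special case w_pos = w_neg = 1/2 for every
# variable; chance variables of a Bayesian network get their probabilities.
w = {1: (Fraction(1, 2), Fraction(1, 2)), 2: (Fraction(3, 4), Fraction(1, 4))}
print(weighted_count([frozenset({1, 2})], w))   # weight of (x1 or x2) = 7/8
```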