Performing Bayesian Inference by Weighted Model Counting

Tian Sang, Paul Beame, and Henry Kautz • Department of Computer Science & Engineering, University of Washington, Seattle, WA

Goal • Extend success of “compilation to SAT” work for NP-complete problems to “compilation to #SAT” for #P-complete problems • Leverage rapid advances in SAT technology • Example: Computing permanent of a 0/1 matrix • Inference in Bayesian networks (Roth 1996, Dechter 1999) • Provide practical reasoning tool • Demonstrate relationship between #SAT and conditioning algorithms • In particular: compilation to DNNF (Darwiche 2002, 2004)

Contributions • Simple encoding of Bayesian networks into weighted model counting • Techniques for extending state-of-the-art SAT algorithms for efficient weighted model counting • Evaluation on computationally challenging domains • Outperforms join-tree methods on problems with high tree-width • Competitive with best conditioning methods on problems with high degree of determinism

Outline • Model counting • Encoding Bayesian networks • Related Bayesian inference algorithms • Experiments • Grid networks • Plan recognition • Conclusion

SAT and #SAT • Given a CNF formula, • SAT: find a satisfying assignment • #SAT: count satisfying assignments • Example: (x ∨ y) ∧ (y ∨ ¬z) • 5 models: (0,1,0), (0,1,1), (1,1,0), (1,1,1), (1,0,0) • Equivalently: satisfying probability = 5/2^3 • Probability that formula is satisfied by a random truth assignment • Can modify Davis-Putnam-Logemann-Loveland to calculate this value
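The example count can be checked by brute force. The sketch below assumes the example formula is (x ∨ y) ∧ (y ∨ ¬z), which is the reading consistent with the five listed models; the DIMACS-style integer encoding of literals and the function names are illustrative choices, not part of the talk.

```python
from itertools import product

# Literals as DIMACS-style integers: 1 = x, 2 = y, 3 = z; negative = negated.
FORMULA = [[1, 2], [2, -3]]  # (x OR y) AND (y OR NOT z), assumed example formula

def satisfies(assignment, clauses):
    """assignment maps variable number -> bool; every clause needs one true literal."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

def enumerate_models(clauses, num_vars):
    """Try all 2^n assignments and keep the satisfying ones as 0/1 tuples."""
    models = []
    for bits in product([0, 1], repeat=num_vars):
        assignment = {i + 1: bool(b) for i, b in enumerate(bits)}
        if satisfies(assignment, clauses):
            models.append(bits)
    return models

models = enumerate_models(FORMULA, 3)
print(len(models), len(models) / 2 ** 3)  # prints: 5 0.625
```

Enumeration is exponential in the number of variables, which is why the talk turns to DPLL-style search next.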

DPLL for SAT
DPLL(F)
    if F is empty, return 1
    if F contains an empty clause, return 0
    else choose a variable x to branch
        return (DPLL(F|x=1) ∨ DPLL(F|x=0))

#DPLL for #SAT
#DPLL(F) // computes satisfying probability of F
    if F is empty, return 1
    if F contains an empty clause, return 0
    else choose a variable x to branch
        return 0.5 * #DPLL(F|x=1) + 0.5 * #DPLL(F|x=0)
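The #DPLL recursion can be written out as a short runnable sketch. It has no unit propagation, clause learning, or branching heuristic, so it only illustrates the recurrence; the integer literal encoding and the helper name `condition` are assumptions made for this example.

```python
def condition(clauses, lit):
    """Return F|lit: drop clauses satisfied by lit, delete the falsified literal -lit."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue  # clause already satisfied
        result.append([l for l in clause if l != -lit])
    return result

def sat_probability(clauses):
    """Probability that a uniformly random assignment satisfies the CNF."""
    if not clauses:
        return 1.0                      # empty formula: always satisfied
    if any(len(c) == 0 for c in clauses):
        return 0.0                      # empty clause: unsatisfiable
    x = abs(clauses[0][0])              # naive choice of branch variable
    return (0.5 * sat_probability(condition(clauses, x))
            + 0.5 * sat_probability(condition(clauses, -x)))

# (x OR y) AND (y OR NOT z) with 1 = x, 2 = y, 3 = z
result = sat_probability([[1, 2], [2, -3]])
print(result)  # prints: 0.625
```

Variables that disappear before being branched on need no special handling here, because each contributes a factor of 0.5 + 0.5 = 1 to the probability.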

Weighted Model Counting • Each literal has a weight • Weight of a model = product of the weights of its literals • Weight of a formula = sum of the weights of its models
WMC(F)
    if F is empty, return 1
    if F contains an empty clause, return 0
    else choose a variable x to branch
        return weight(x) * WMC(F|x=1) + weight(¬x) * WMC(F|x=0)
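A minimal sketch of the weighted recursion follows. It assumes the second branch is weighted by the negated literal, weight(¬x), and that weight(x) + weight(¬x) = 1 for every variable, so that returning 1 on an empty formula is correct for variables never branched on. The weight table and names are made up for illustration.

```python
def condition(clauses, lit):
    """Return F|lit: drop satisfied clauses, remove the falsified literal."""
    return [[l for l in clause if l != -lit]
            for clause in clauses if lit not in clause]

def wmc(clauses, weight):
    """Weighted model count; weight maps each signed literal to its weight."""
    if not clauses:
        return 1.0   # valid because each variable's two weights sum to 1
    if any(len(c) == 0 for c in clauses):
        return 0.0
    x = abs(clauses[0][0])
    return (weight[x] * wmc(condition(clauses, x), weight)
            + weight[-x] * wmc(condition(clauses, -x), weight))

# Illustrative weights for x = 1, y = 2, z = 3 (not taken from the talk)
WEIGHT = {1: 0.3, -1: 0.7, 2: 0.5, -2: 0.5, 3: 0.2, -3: 0.8}
formula_weight = wmc([[1, 2], [2, -3]], WEIGHT)
print(round(formula_weight, 6))
```

With every weight set to 0.5 the function reduces to the satisfying-probability computation of #DPLL.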

Cachet • State-of-the-art model counting program (Sang, Bacchus, Beame, Kautz, & Pitassi 2004) • Key innovation: sound integration of component caching and clause learning • Component analysis (Bayardo & Pehoushek 2000): if formulas C1 and C2 share no variables, BWMC(C1 ∧ C2) = BWMC(C1) * BWMC(C2) • Caching (Majercik & Littman 1998; Darwiche 2002; Bacchus, Dalmao, & Pitassi 2003; Beame, Impagliazzo, Pitassi, & Segerlind 2003): save and reuse values of internal nodes of search tree • Clause learning (Marques-Silva 1996; Bayardo & Schrag 1997; Zhang, Madigan, Moskewicz, & Malik 2001): analyze reason for backtracking, store as a new clause
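The component rule can be illustrated on a toy formula. The sketch below splits a clause list into variable-disjoint components and checks that the satisfying probability of the whole equals the product over components. It is not Cachet's implementation: the grouping routine, the brute-force counter, and the example formula are all assumptions made for this illustration.

```python
from itertools import product

def components(clauses):
    """Group clauses into components that share no variables with each other."""
    groups = []  # list of (variable set, clause list), pairwise variable-disjoint
    for clause in clauses:
        clause_vars = {abs(l) for l in clause}
        merged_vars, merged_clauses, untouched = set(clause_vars), [clause], []
        for gvars, gclauses in groups:
            if gvars & clause_vars:          # clause links into this group
                merged_vars |= gvars
                merged_clauses = gclauses + merged_clauses
            else:
                untouched.append((gvars, gclauses))
        groups = untouched + [(merged_vars, merged_clauses)]
    return [gclauses for _, gclauses in groups]

def sat_probability(clauses):
    """Brute-force satisfying probability over the variables that occur."""
    variables = sorted({abs(l) for c in clauses for l in c})
    hits = 0
    for bits in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, bits))
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            hits += 1
    return hits / 2 ** len(variables)

F = [[1, 2], [2, -3], [4, 5]]   # variables {1,2,3} and {4,5} never interact
parts = components(F)
combined = 1.0
for part in parts:
    combined *= sat_probability(part)
print(len(parts), combined, sat_probability(F))
```

In a real counter the components are recomputed after each branching step, and their values are what gets stored in the cache.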

Cachet (continued) • Naïve combination of all three techniques is unsound • Can resolve by careful cache management (Sang, Bacchus, Beame, Kautz, & Pitassi 2004) • New branching strategy (VSADS) optimized for counting (Sang, Beame, & Kautz SAT-2005)

Computing All Marginals • Task: In one counting pass, • Compute number of models in which each literal is true • Equivalently: compute marginal satisfying probabilities • Approach • Each recursion computes a vector of marginals • At branch point: compute left and right vectors, combine with vector sum • Cache vectors, not just counts • Reasonable overhead: 10% - 40% slower than counting
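The vector-of-marginals idea can be sketched on top of the plain #DPLL recursion. Each call returns the satisfying probability together with, for every remaining variable, the probability that the formula is satisfied and that variable is true; the two branch vectors are combined by a weighted sum. Caching and component analysis are omitted, and all names here are illustrative assumptions.

```python
def condition(clauses, lit):
    """Return F|lit: drop satisfied clauses, remove the falsified literal."""
    return [[l for l in clause if l != -lit]
            for clause in clauses if lit not in clause]

def marginals(clauses, variables):
    """Return (p, m): p = satisfying probability over `variables`,
    m[v] = probability that the formula holds AND v is true."""
    if any(len(c) == 0 for c in clauses):
        return 0.0, {v: 0.0 for v in variables}
    if not clauses:
        return 1.0, {v: 0.5 for v in variables}   # free variables are true half the time
    x = abs(clauses[0][0])
    rest = [v for v in variables if v != x]
    p1, m1 = marginals(condition(clauses, x), rest)    # branch x = 1
    p0, m0 = marginals(condition(clauses, -x), rest)   # branch x = 0
    m = {v: 0.5 * m1[v] + 0.5 * m0[v] for v in rest}   # vector sum of both branches
    m[x] = 0.5 * p1                                    # x is true only in the first branch
    return 0.5 * p1 + 0.5 * p0, m

# (x OR y) AND (y OR NOT z) with 1 = x, 2 = y, 3 = z
p, m = marginals([[1, 2], [2, -3]], [1, 2, 3])
counts = {v: m[v] * 2 ** 3 for v in m}   # number of models in which v is true
print(p, counts)
```

Dividing `m[v]` by `p` gives the marginal probability of `v` conditioned on the formula being satisfied.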

B B A 0.2 0.8 A 0.6 0.4 Encoding Bayesian Networks to Weighted Model Counting A A 0.1 B

B B A 0.2 0.8 A 0.6 0.4 Encoding Bayesian Networks to Weighted Model Counting A A 0.1 Chance variable P added with weight(P)=0.2 B

B B A 0.2 0.8 A 0.6 0.4 Encoding Bayesian Networks to Weighted Model Counting A A 0.1 and weight(P)=0.8 B

B B A 0.2 0.8 A 0.6 0.4 Encoding Bayesian Networks to Weighted Model Counting A A 0.1 Chance variable Q added with weight(Q)=0.6 B

B B A 0.2 0.8 A 0.6 0.4 Encoding Bayesian Networks to Weighted Model Counting A A 0.1 and weight(Q)=0.4 B

B B A 0.2 0.8 A 0.6 0.4 Encoding Bayesian Networks to Weighted Model Counting A A 0.1 B

Main Theorem • Let: • F = a weighted CNF encoding of a Bayes net • E = an arbitrary CNF formula, the evidence • Q = an arbitrary CNF formula, the query • Then: P(Q | E) = weight(F ∧ Q ∧ E) / weight(F ∧ E)
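A small worked query shows how the theorem is used, assuming it takes the standard form P(Q | E) = weight(F ∧ Q ∧ E) / weight(F ∧ E). The network (A → B with P(A) = 0.1, P(B | A) = 0.2, P(B | ¬A) = 0.6), its clause encoding, and every name below are assumptions for illustration, and the counter is brute force rather than a DPLL-based one.

```python
from itertools import product

A, B, P, Q = 1, 2, 3, 4   # hypothetical variable numbering
F = [[-A, -P, B], [-A, P, -B], [A, -Q, B], [A, Q, -B]]   # assumed encoding of A -> B
WEIGHT = {A: 0.1, -A: 0.9, P: 0.2, -P: 0.8, Q: 0.6, -Q: 0.4, B: 1.0, -B: 1.0}

def weight_of(clauses):
    """Brute-force weighted model count over the four variables."""
    total = 0.0
    for bits in product([False, True], repeat=4):
        lits = {v if bit else -v for v, bit in zip((A, B, P, Q), bits)}
        if all(any(l in lits for l in clause) for clause in clauses):
            w = 1.0
            for l in lits:
                w *= WEIGHT[l]
            total += w
    return total

def posterior(query, evidence):
    """P(query | evidence); both arguments are CNF clause lists."""
    return weight_of(F + query + evidence) / weight_of(F + evidence)

p_a_given_b = posterior(query=[[A]], evidence=[[B]])          # 0.02 / 0.56
p_a_given_not_b = posterior(query=[[A]], evidence=[[-B]])     # 0.08 / 0.44
print(round(p_a_given_b, 4), round(p_a_given_not_b, 4))
```

Because query and evidence are just extra clauses, the same two counting calls answer any query that can be written in CNF, not only single-variable marginals.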

Exact Bayesian Inference Algorithms • Junction tree algorithm (Shenoy & Shafer 1990) • Most widely used approach • Data structure grows exponentially large in tree-width of underlying graph • To handle high tree-width, researchers developed conditioning algorithms, e.g.: • Recursive conditioning (Darwiche 2001) • Value elimination (Bacchus, Dalmao, Pitassi 2003) • Compilation to d-DNNF (Darwiche 2002; Chavira, Darwiche, Jaeger 2004; Darwiche 2004) • These algorithms become similar to DPLL...

Techniques

Experiments • Our benchmarks (Grid, Plan Recognition): Junction tree (Netica); Recursive conditioning (SamIam); Value elimination (Valelim); Weighted model counting (Cachet) • ISCAS-85 and SATLIB benchmarks: Compilation to d-DNNF (timings from Darwiche 2004); Weighted model counting (Cachet)

Experiments: Grid Networks • [Figure: grid network with nodes S and T labeled] • CPTs are set randomly • A fraction of the nodes are deterministic; this fraction is the parameter "ratio" • T is the query node

Results for ratio = 0.5 • 10 problems of each size; X = memory out or time out

Results of ratio=0.75

Results of ratio=0.9

Plan Recognition • Task: • Given a planning domain described by STRIPS operators, initial and goal states, and time horizon • Infer the marginal probabilities of each action • Abstraction of strategic plan recognition: We know enemy’s capabilities and goals, what will it do? • Modified Blackbox planning system (Kautz & Selman 1999) to create instances

ISCAS/SATLIB Benchmarks

Summary • Bayesian inference by translation to model counting is competitive with best known algorithms for problems with • High tree-width • High degree of determinism • Recent conditioning algorithms already make use of important SAT techniques • Most striking: compilation to d-DNNF • Translation approach makes it possible to quickly exploit future SAT algorithms and implementations
