
SampleSearch Scheme for Consistent Samples

This paper presents the SampleSearch scheme, an algorithm for generating consistent samples in Bayesian networks using importance sampling. The algorithm addresses the rejection problem and provides an approximation of the sampling distribution. Experimental results show its effectiveness.


Presentation Transcript


  1. SampleSearch: A Scheme that Searches for Consistent Samples • Vibhav Gogate and Rina Dechter • University of California, Irvine, USA

  2. Outline • Background • Bayesian Networks with Zero probabilities • Importance Sampling • Rejection Problem • The SampleSearch Scheme • Algorithm • Sampling Distribution and its Approximation • Experimental Results

  3. Bayesian Networks: Representation (Pearl, 1988) • Variables: Smoking (S), Lung Cancer (C), Bronchitis (B), X-ray (X), Dyspnoea (D) • Joint distribution: P(S, C, B, X, D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B) • (A) Probability of evidence: P(smoking=no, dyspnoea=yes) = ? • (B) Belief updating: P(lung cancer=yes | smoking=no, dyspnoea=yes) = ?
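Both queries on this slide reduce to sums over the joint factorization. Written out for the evidence smoking=no, dyspnoea=yes, with lowercase c, b, x ranging over the values of C, B and X (the notation is an assumption of this sketch, not taken from the slide):

```latex
\begin{align*}
P(S{=}\text{no}, D{=}\text{yes})
  &= \sum_{c,\,b,\,x} P(S{=}\text{no})\, P(c \mid S{=}\text{no})\, P(b \mid S{=}\text{no})\,
     P(x \mid c, S{=}\text{no})\, P(D{=}\text{yes} \mid c, b) \\
P(C{=}\text{yes} \mid S{=}\text{no}, D{=}\text{yes})
  &= \frac{P(C{=}\text{yes}, S{=}\text{no}, D{=}\text{yes})}{P(S{=}\text{no}, D{=}\text{yes})}
\end{align*}
```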

  4. Complexity • Belief Updating • NP-hard when zeros are present • General case when all CPTs are positive, not known. • Relative Approximation • Randomized Polynomial time algorithm when all CPTs are positive (Dagum and Luby 1997) • Probability of Evidence • NP-hard when zeros are present • Relative Approximation • Randomized Polynomial time algorithm when all CPTs are positive and (1/P(e)) is polynomial (Karp, Dagum and Luby 1993)

  5. Importance Sampling (Rubinstein ’81)

  6. Importance Sampling for Belief Updating
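Slides 5 and 6 carry only their titles here. For reference, the textbook form of the two estimators is sketched below. The notation is assumed for this sketch and may differ from the slides: f(x) is the product of the CPTs with the evidence e instantiated, Q is the proposal, and x^1, ..., x^N are i.i.d. samples from Q.

```latex
\begin{align*}
P(e) &= \sum_{x} f(x) = \sum_{x} \frac{f(x)}{Q(x)}\, Q(x)
      = \mathbb{E}_{Q}\!\left[\frac{f(x)}{Q(x)}\right]
      \approx \frac{1}{N} \sum_{i=1}^{N} w(x^{i}),
      \qquad w(x^{i}) = \frac{f(x^{i})}{Q(x^{i})} \\
P(X_j = a \mid e) &\approx
      \frac{\sum_{i=1}^{N} w(x^{i})\, \mathbb{1}[x^{i}_{j} = a]}{\sum_{i=1}^{N} w(x^{i})}
\end{align*}
```

The first identity is valid only when Q(x) > 0 wherever f(x) > 0.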

  7. Generating i.i.d. samples from Q • Q(A,B,C) = Q(A) * Q(B|A) * Q(C|A,B) • Q(A) = (0.8, 0.2) • Q(B|A) = (0.4, 0.6, 0.2, 0.8) • Q(C|A,B) = Q(C) = (0.2, 0.8) • [Figure: probability tree from Root over A, B, C, with each branch labeled by its probability under Q]
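Sampling from Q amounts to walking the tree from the root to a leaf, drawing one variable at a time. Below is a minimal sketch, assuming the tuples on the slide list value 0 before value 1, and that Q(B|A) is ordered as (B=0|A=0, B=1|A=0, B=0|A=1, B=1|A=1). The names `draw`, `sample_q` and `q_prob` are illustrative.

```python
import random

# Proposal tables from the slide (value ordering is an assumption, see above).
Q_A = {0: 0.8, 1: 0.2}
Q_B_GIVEN_A = {0: {0: 0.4, 1: 0.6}, 1: {0: 0.2, 1: 0.8}}
Q_C = {0: 0.2, 1: 0.8}


def draw(dist, rng):
    """Draw one value from a {value: probability} table."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values], k=1)[0]


def sample_q(rng):
    """Walk the tree root to leaf: sample A, then B given A, then C."""
    a = draw(Q_A, rng)
    b = draw(Q_B_GIVEN_A[a], rng)
    c = draw(Q_C, rng)  # Q(C|A,B) = Q(C), so C ignores A and B
    return a, b, c


def q_prob(a, b, c):
    """Probability Q(A=a, B=b, C=c) of one leaf of the tree."""
    return Q_A[a] * Q_B_GIVEN_A[a][b] * Q_C[c]


rng = random.Random(0)
samples = [sample_q(rng) for _ in range(5)]
```

Each call to `sample_q` is independent of the others, which is what makes the samples i.i.d.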

  8. Rejection Problem • Importance sampling requirement: f(xi)>0 => Q(xi)>0 • The converse need not hold: Q(xi) can be >0 even if f(xi)=0. • So if the probability of sampling an assignment with f(xi)>0, i.e. the sum of Q(xi) over all xi with f(xi)>0, is very small • A large number of samples will have zero weight • Extreme case: our approximation = zero.
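The effect can be made concrete on the three-variable example. The sketch below assumes the proposal Q from slide 7 (with value 0 listed first) and a hypothetical target f that is zero whenever A=B or A=C, the constraints used on later slides, and 0.05 otherwise. The value 0.05 is invented for illustration only.

```python
import itertools
import random

Q_A = {0: 0.8, 1: 0.2}
Q_B_GIVEN_A = {0: {0: 0.4, 1: 0.6}, 1: {0: 0.2, 1: 0.8}}
Q_C = {0: 0.2, 1: 0.8}


def q_prob(a, b, c):
    return Q_A[a] * Q_B_GIVEN_A[a][b] * Q_C[c]


def f(a, b, c):
    """Hypothetical target: zero when A=B or A=C, 0.05 otherwise."""
    return 0.0 if (a == b or a == c) else 0.05


def sample_q(rng):
    a = rng.choices([0, 1], weights=[Q_A[0], Q_A[1]])[0]
    b = rng.choices([0, 1], weights=[Q_B_GIVEN_A[a][0], Q_B_GIVEN_A[a][1]])[0]
    c = rng.choices([0, 1], weights=[Q_C[0], Q_C[1]])[0]
    return a, b, c


# Exact mass that Q places on assignments with f(x) > 0.
solution_mass = sum(
    q_prob(*x) for x in itertools.product([0, 1], repeat=3) if f(*x) > 0
)

rng = random.Random(0)
n = 10_000
weights = [f(*x) / q_prob(*x) for x in (sample_q(rng) for _ in range(n))]
rejected_fraction = sum(w == 0.0 for w in weights) / n
estimate = sum(weights) / n  # estimates sum_x f(x), which is 0.1 for this f
```

Here Q places 0.8 * 0.6 * 0.8 + 0.2 * 0.2 * 0.2 = 0.392 of its mass on solutions, so roughly 61% of the samples are expected to carry zero weight. In networks with many zeros the solution mass can be far smaller, and all N samples may be rejected.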

  9. Rejection Problem • All blue leaves correspond to solutions, i.e. f(x)>0 • All red leaves correspond to non-solutions, i.e. f(x)=0 • [Figure: probability tree from Root over A, B, C with leaves colored blue or red]

  10. Constraint Networks (Dechter 2003) • Example: map coloring • Variables: countries (A, B, C, etc.) • Values: colors (red, green, blue) • Constraints: [Figure: constraint graph over countries A, B, C, D, E, F, G] • A solution is an assignment that satisfies all constraints

  11. Constraint networks to model “zeros” • Constraints: A=0, C=0 not allowed; A=1, C=1 not allowed; or equivalently A≠C • Why constraints? • For a partial sample, if a constraint is violated, then f(X=x)=0 for any full extension X=x of the sample. • For every full assignment X=x: a solution implies f(X=x)>0, and a non-solution implies f(X=x)=0 • [Figure: constraint graph over A, B, C, D, F, G]

  12. Using Constraints • Constraints: A≠B, A≠C • [Figure: probability tree from Root over A, B, C, with each branch labeled by its probability under Q]

  13. Using Constraints • Constraints: A≠B, A≠C • [Figure: probability tree below A=0 (0.8), with B=0 (0.4) and B=1 (0.6); the B=0 branch is marked "Constraint A≠B violated"]

  14. Outline • Background • Bayesian Networks • Importance Sampling • Rejection Problem • The SampleSearch Scheme • Algorithm • Sampling Distribution and Approximation • Experimental Results

  15. Algorithm SampleSearch • Constraints: A≠B, A≠C • [Figure: probability tree from Root over A, B, C, with each branch labeled by its probability under Q]

  16. Algorithm SampleSearch • Constraints: A≠B, A≠C • [Figure: the same probability tree, with one branch probability at the B level updated to 1]

  17. Algorithm SampleSearch • Constraints: A≠B, A≠C • Resume sampling • [Figure: the probability tree with the rejected B branch removed and the remaining B=1 branch labeled 1]

  18. Algorithm SampleSearch • Constraints: A≠B, A≠C • Constraint violated • Repeat until a solution, i.e. f(x)>0, is found • [Figure: the pruned probability tree with one branch probability at the C level updated to 1]
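The walkthrough on slides 15 to 18 can be sketched as follows. This is a minimal sketch of the naive scheme for the three-variable example, assuming binary domains, the variable order A, B, C, and the proposal values from slide 7 with value 0 listed first. The names `q_table`, `consistent` and `sample_search` are illustrative and not taken from the presentation.

```python
import random

VARS = ["A", "B", "C"]


def q_table(var, assignment):
    """Proposal Q for `var` given the values assigned so far."""
    if var == "A":
        return {0: 0.8, 1: 0.2}
    if var == "B":
        return {0: 0.4, 1: 0.6} if assignment["A"] == 0 else {0: 0.2, 1: 0.8}
    return {0: 0.2, 1: 0.8}  # Q(C|A,B) = Q(C)


def consistent(assignment):
    """Check A != B and A != C on the variables assigned so far."""
    a = assignment.get("A")
    return all(assignment.get(other, not a) != a for other in ("B", "C"))


def sample_search(rng):
    """Return one full assignment satisfying all constraints, or None."""
    assignment = {}
    # remaining[i] maps the not-yet-rejected values of VARS[i] to their Q weights.
    remaining = [dict(q_table(VARS[0], assignment))]
    while remaining:
        i = len(remaining) - 1
        var = VARS[i]
        assignment.pop(var, None)
        options = remaining[i]
        if not options:
            # Dead end: every value failed, so backtrack and reject the parent's value.
            remaining.pop()
            if remaining:
                remaining[-1].pop(assignment[VARS[i - 1]])
            continue
        values = list(options)
        # rng.choices renormalizes the surviving weights implicitly.
        value = rng.choices(values, weights=[options[v] for v in values])[0]
        assignment[var] = value
        if not consistent(assignment):
            options.pop(value)  # set Q(value) to zero, then resample this variable
            continue
        if i == len(VARS) - 1:
            return dict(assignment)
        remaining.append(dict(q_table(VARS[i + 1], assignment)))
    return None


rng = random.Random(1)
results = [sample_search(rng) for _ in range(200)]
```

Every returned assignment is a solution, so no sample is rejected outright. The price is that the samples no longer follow Q.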

  19. Generate more Samples • Constraints: A≠B, A≠C • [Figure: probability tree from Root over A, B, C, with each branch labeled by its probability under Q]

  20. Generate more Samples • Constraints: A≠B, A≠C • [Figure: the same probability tree, with one branch probability at the C level updated to 1]

  21. Traces of SampleSearch • Constraints: A≠B, A≠C • [Figure: several search traces, each ending in the sample A=0, B=1, C=1]

  22. The Sampling Distribution QR of SampleSearch • Did you generate samples from Q? No! • What is the probability of generating A=0? QR(A=0)=0.8. Why? SampleSearch is systematic. • What is the probability of generating B=1? QR(B=1|A=0)=1. Why? SampleSearch is systematic. • What is the probability of generating B=0? Simple: QR(B=0|A=0)=0. All samples generated by SampleSearch are solutions. • QR is the backtrack-free distribution • [Figure: tree below A=0 labeled with QR: A=0 has 0.8, B=0 has 0, B=1 has 1, and the leaves C=0, C=1, C=0, C=1 have 0, 0, 0, 1]
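The backtrack-free distribution can be computed exactly on this small example by brute force: at each variable, keep only the values that extend to a solution and renormalize Q over them. The sketch below assumes the proposal values from slide 7 (value 0 listed first) and the constraints A≠B, A≠C. The enumeration oracle `has_solution` and the other names are illustrative.

```python
import itertools

VARS = ["A", "B", "C"]


def q_table(var, assignment):
    """Proposal Q for `var` given the values assigned so far."""
    if var == "A":
        return {0: 0.8, 1: 0.2}
    if var == "B":
        return {0: 0.4, 1: 0.6} if assignment["A"] == 0 else {0: 0.2, 1: 0.8}
    return {0: 0.2, 1: 0.8}


def consistent(assignment):
    """Check A != B and A != C on the variables assigned so far."""
    a = assignment.get("A")
    return all(assignment.get(other, not a) != a for other in ("B", "C"))


def has_solution(assignment):
    """Brute-force oracle: can this partial assignment be extended to a solution?"""
    free = [v for v in VARS if v not in assignment]
    for values in itertools.product([0, 1], repeat=len(free)):
        if consistent({**assignment, **dict(zip(free, values))}):
            return True
    return False


def qr_table(var, assignment):
    """Q restricted to values that extend to a solution, then renormalized."""
    q = q_table(var, assignment)
    alive = {v: p for v, p in q.items() if has_solution({**assignment, var: v})}
    total = sum(alive.values())
    return {v: (alive[v] / total if v in alive else 0.0) for v in q}


def qr_prob(sample):
    """Backtrack-free probability QR(sample) of a full assignment."""
    prob, prefix = 1.0, {}
    for var in VARS:
        prob *= qr_table(var, prefix)[sample[var]]
        prefix[var] = sample[var]
    return prob
```

With these tables, `qr_table("B", {"A": 0})` puts all mass on B=1 and `qr_prob` of the sample A=0, B=1, C=1 is 0.8 up to rounding, matching the values on the slide. The oracle is exponential in general, which is why an approximation is needed.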

  23. Computing QR • Invoke an oracle or a complete search procedure O(n) times per sample • [Figure: the sample A=0, B=1, C=1, with a "?? Solution" query on the alternative branch at each level]

  24. Approximation AR of QR • IF Hole THEN AR=Q • IF No solutions on the other branch THEN AR=1 • [Figure: tree below A=0 (0.8), with the unexplored branch marked "Hole: don't know" and the explored dead branches marked "No solutions here"]
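The two rules on this slide can be sketched for binary domains as follows. The representation of a trace as a set of (variable, value) pairs that the search proved to have no solutions along the sample's path is an assumption of this sketch, as are the names `q_table` and `ar_prob`. The proposal values are those of slide 7 with value 0 listed first.

```python
VARS = ["A", "B", "C"]


def q_table(var, assignment):
    """Proposal Q for `var` given the values assigned so far."""
    if var == "A":
        return {0: 0.8, 1: 0.2}
    if var == "B":
        return {0: 0.4, 1: 0.6} if assignment["A"] == 0 else {0: 0.2, 1: 0.8}
    return {0: 0.2, 1: 0.8}


def ar_prob(sample, dead_branches):
    """Approximate QR(sample) from the trace that produced the sample."""
    prob, prefix = 1.0, {}
    for var in VARS:
        other = 1 - sample[var]  # binary domains: the one alternative value
        if (var, other) in dead_branches:
            factor = 1.0  # no solutions on the other branch
        else:
            factor = q_table(var, prefix)[sample[var]]  # hole: fall back to Q
        prob *= factor
        prefix[var] = sample[var]
    return prob


sample = {"A": 0, "B": 1, "C": 1}
# Trace that tried B=0 first, backed off, and then drew C=1 directly.
ar_one_dead = ar_prob(sample, {("B", 0)})            # 0.8 * 1 * 0.8
# Trace that rejected both B=0 and C=0.
ar_two_dead = ar_prob(sample, {("B", 0), ("C", 0)})  # 0.8 * 1 * 1
```

Different traces of the same sample give different values of AR, because each trace leaves different holes.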

  25. Approximation AR of QR • Problem: Can’t guarantee convergence • [Figure: traces for the sample A=0, B=1, C=1, with known branch probabilities and "?" marking holes]

  26. Guarantee convergence in the limit • Store all possible traces • Approximation ARN: IF Hole THEN ARN=Q; IF No solutions on other branch THEN ARN=1 • [Figure: the stored traces merged into a single tree, with "?" marking the remaining holes]
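One way to picture the stored traces is as a shared set of dead branches that grows as more samples are drawn. The sketch below is a simplification for binary domains. The store layout, keyed by (prefix, variable, value), and the names `record_trace` and `arn_prob` are assumptions made for illustration, not the data structure of the paper.

```python
VARS = ["A", "B", "C"]


def q_table(var, assignment):
    """Proposal Q for `var` given the values assigned so far."""
    if var == "A":
        return {0: 0.8, 1: 0.2}
    if var == "B":
        return {0: 0.4, 1: 0.6} if assignment["A"] == 0 else {0: 0.2, 1: 0.8}
    return {0: 0.2, 1: 0.8}


def record_trace(store, sample, rejected):
    """Merge one trace into the store.

    `rejected` lists (var, value) pairs that the trace proved to have
    no solutions under the sample's own prefix.
    """
    for var, value in rejected:
        prefix = tuple(sample[v] for v in VARS[: VARS.index(var)])
        store.add((prefix, var, value))


def arn_prob(sample, store):
    """Approximate QR(sample) using the dead branches from all stored traces."""
    prob = 1.0
    for i, var in enumerate(VARS):
        prefix = tuple(sample[v] for v in VARS[:i])
        if (prefix, var, 1 - sample[var]) in store:
            continue  # no solutions on the other branch: factor 1
        prob *= q_table(var, dict(zip(VARS[:i], prefix)))[sample[var]]  # hole
    return prob


store = set()
sample = {"A": 0, "B": 1, "C": 1}
record_trace(store, sample, [("B", 0)])   # first trace rejected B=0
after_first = arn_prob(sample, store)     # 0.8 * 1 * 0.8
record_trace(store, sample, [("C", 0)])   # a later trace rejected C=0
after_second = arn_prob(sample, store)    # 0.8 * 1 * 1
```

After both traces are merged, the value for this sample equals the backtrack-free probability 0.8, which illustrates how holes are filled as traces accumulate.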

  27. Improving Naive SampleSearch • Handle non-binary domains • See the paper; the proof is complicated. • Better search strategy • Can use any state-of-the-art CSP/SAT solver, e.g. minisat (Sörensson et al. 2006) • All theorems and results hold • Better importance function • Use the output of generalized belief propagation to compute the initial importance function Q (Gogate and Dechter 2005)

  28. Experimental Results • Previous Algorithms • Likelihood weighting (LW) • Proposal=Prior • IJGP-sampling (IJGP-S) (Gogate and Dechter 2005) • Proposal=Output of generalized belief propagation • Adding SampleSearch • SampleSearch with LW (S+LW) • SampleSearch with IJGP-sampling (S+IJGP-S)

  29. Linkage BN_69

  30. Linkage BN_73

  31. Linkage BN_76

  32. Conclusions • Belief networks with zero probabilities lead to the rejection problem in importance sampling. • We presented the SampleSearch scheme, which works with any importance sampling scheme to circumvent the rejection problem. • The sampling distribution of SampleSearch is the backtrack-free distribution QR • Expensive to compute • An approximation of QR based on storing all traces yields an asymptotically unbiased estimator • Empirically, when a substantial number of zero probabilities are present, SampleSearch-based schemes dominate their pure sampling counterparts.
