SampleSearch Scheme for Consistent Samples

SampleSearch: A scheme that searches for Consistent Samples • Vibhav Gogate and Rina Dechter • University of California, Irvine, USA

Outline • Background • Bayesian Networks with Zero probabilities • Importance Sampling • Rejection Problem • The SampleSearch Scheme • Algorithm • Sampling Distribution and its Approximation • Experimental Results

Complexity • Belief Updating • NP-hard when zeros are present • Complexity of the general case when all CPTs are positive is not known • Relative Approximation • Randomized polynomial-time algorithm when all CPTs are positive (Dagum and Luby 1997) • Probability of Evidence • NP-hard when zeros are present • Relative Approximation • Randomized polynomial-time algorithm when all CPTs are positive and (1/P(e)) is polynomial (Karp, Dagum and Luby 1993)

Importance Sampling (Rubinstein ’81)

Importance Sampling for Belief Updating

Generating i.i.d. samples from Q • Q(A,B,C) = Q(A) * Q(B|A) * Q(C|A,B) • Q(A) = (0.8, 0.2) • Q(B|A) = (0.4, 0.6, 0.2, 0.8) • Q(C|A,B) = Q(C) = (0.2, 0.8) • [Figure: sampling tree from Root over A, B, C, with branches labeled by the probabilities of Q]
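A minimal sketch of drawing one sample from Q in the order A, B, C. It assumes that each tuple on the slide lists the probability of value 0 first and value 1 second (so Q(A=0) = 0.8); the names `draw`, `sample_from_q` and `q_prob` are illustrative and not from the talk.

```python
import random

# Proposal Q(A,B,C) = Q(A) * Q(B|A) * Q(C), with the numbers from the slide.
# Assumption: each tuple lists the probability of value 0 first, then value 1.
Q_A = {0: 0.8, 1: 0.2}
Q_B_GIVEN_A = {0: {0: 0.4, 1: 0.6}, 1: {0: 0.2, 1: 0.8}}
Q_C = {0: 0.2, 1: 0.8}


def draw(dist, rng):
    """Draw one value from a {value: probability} table."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values], k=1)[0]


def sample_from_q(rng):
    """Sample A, then B given A, then C, i.e. one root-to-leaf path in the tree."""
    a = draw(Q_A, rng)
    b = draw(Q_B_GIVEN_A[a], rng)
    c = draw(Q_C, rng)
    return a, b, c


def q_prob(a, b, c):
    """Probability that Q assigns to the full assignment (a, b, c)."""
    return Q_A[a] * Q_B_GIVEN_A[a][b] * Q_C[c]


if __name__ == "__main__":
    rng = random.Random(0)
    sample = sample_from_q(rng)
    print(sample, q_prob(*sample))
```

Each call to `sample_from_q` is independent of the previous ones, which is what makes the samples i.i.d.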

Rejection Problem • Importance Sampling requirement • f(xi) > 0 => Q(xi) > 0 • The converse need not hold: Q(xi) can be > 0 even if f(xi) = 0 • So if the total probability under Q of the assignments with f(xi) > 0 is very small • A large number of sampled assignments will have zero weight • Extreme case: every sample has zero weight and our approximation = zero.
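The effect can be illustrated with a small sketch. The proposal Q uses the numbers from the talk (assuming value 0 is listed first), but the target `f` below is hypothetical: it is positive on only two assignments. The estimator averages the weights f(x)/Q(x) and counts how many samples receive weight zero.

```python
import itertools
import random

# Proposal from the slides (assumed ordering: value 0 first, then value 1).
Q_A = {0: 0.8, 1: 0.2}
Q_B_GIVEN_A = {0: {0: 0.4, 1: 0.6}, 1: {0: 0.2, 1: 0.8}}
Q_C = {0: 0.2, 1: 0.8}


def q_prob(a, b, c):
    return Q_A[a] * Q_B_GIVEN_A[a][b] * Q_C[c]


def f(a, b, c):
    # Hypothetical target: positive only when A != B and A != C.
    return 0.5 if (a != b and a != c) else 0.0


def draw(dist, rng):
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values], k=1)[0]


def importance_estimate(num_samples, rng):
    """Estimate sum_x f(x) by averaging f(x)/Q(x) over samples drawn from Q."""
    total, zero_weight = 0.0, 0
    for _ in range(num_samples):
        a = draw(Q_A, rng)
        b = draw(Q_B_GIVEN_A[a], rng)
        c = draw(Q_C, rng)
        weight = f(a, b, c) / q_prob(a, b, c)
        if weight == 0.0:
            zero_weight += 1  # this sample is effectively rejected
        total += weight
    return total / num_samples, zero_weight


def zero_weight_mass():
    """Exact probability under Q of drawing an assignment with f(x) = 0."""
    return sum(q_prob(*x) for x in itertools.product((0, 1), repeat=3)
               if f(*x) == 0.0)


if __name__ == "__main__":
    estimate, rejected = importance_estimate(10000, random.Random(0))
    print(estimate, rejected, zero_weight_mass())
```

In this toy setting about 60.8% of the mass of Q falls on assignments with zero weight; with more variables and more zeros that fraction can approach 1.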

Rejection Problem • All Blue leaves correspond to solutions i.e. f(x) > 0 • All Red leaves correspond to non-solutions i.e. f(x) = 0 • [Figure: sampling tree over A, B, C with leaves colored blue or red]

Constraint Networks (Dechter 2003) • Example: map coloring • Variables - countries (A, B, C, etc.) • Values - colors (red, green, blue) • Constraints: • A Solution is an assignment that satisfies all constraints • [Figure: graph over nodes A, B, C, D, E, F, G]

Constraint networks to model “zeros” • Constraints • A=0, C=0 not allowed • A=1, C=1 not allowed • Or A≠C • Why constraints? • For a partial sample, if a constraint is violated then f(X=x)=0 for any full extension X=x of the sample. • For every full assignment X=x • solution implies f(X=x) > 0 and • non-solution implies f(X=x)=0 • [Figure: network graph with nodes including A, B, C, D, F, G]

Using Constraints • Constraints A≠B, A≠C • [Figure: full sampling tree over A, B, C]

Using Constraints • Constraints A≠B, A≠C • [Figure: sampling tree below A=0, with the branch B=0 marked as Constraint A≠B violated]
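A minimal sketch of checking the constraints A≠B and A≠C on a partial sample, so that a violation is detected before the remaining variables are sampled. The function name `is_consistent` and the dictionary representation of a partial sample are assumptions made for illustration.

```python
# Constraints from the slide: A != B and A != C.
CONSTRAINTS = [("A", "B"), ("A", "C")]


def is_consistent(partial):
    """Return False as soon as a constraint with both variables assigned is violated.

    `partial` maps variable names to values and may leave variables unassigned;
    constraints that mention an unassigned variable are skipped.
    """
    for x, y in CONSTRAINTS:
        if x in partial and y in partial and partial[x] == partial[y]:
            return False
    return True


if __name__ == "__main__":
    print(is_consistent({"A": 0}))          # nothing to check yet
    print(is_consistent({"A": 0, "B": 0}))  # violates A != B
    print(is_consistent({"A": 0, "B": 1}))  # still extendable
```

Once `is_consistent` returns False for a partial sample, every full extension has f(X=x)=0, so there is no reason to sample the rest of the variables.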

Outline • Background • Bayesian Networks • Importance Sampling • Rejection Problem • The SampleSearch Scheme • Algorithm • Sampling Distribution and Approximation • Experimental Results

Algorithm SampleSearch • Constraints A≠B, A≠C • [Figure: full sampling tree over A, B, C]

Algorithm SampleSearch • Constraints A≠B, A≠C • [Figure: sampling tree over A, B, C with one branch probability reset to 1]

Algorithm SampleSearch • Constraints A≠B, A≠C • Resume Sampling • [Figure: sampling tree with the violating branch B=0 below A=0 removed and B=1 given probability 1]

Algorithm SampleSearch • Constraints A≠B, A≠C • Constraint Violated • Until Solution i.e. f(x)>0 found • [Figure: sampling tree with a further violating branch removed and its sibling given probability 1]
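The walkthrough on these slides can be sketched as a sampling loop with backtracking: sample a value from Q, and if the partial sample violates a constraint, discard that value, renormalize over the values that remain, and resume sampling. This is a simplified sketch for the three-variable example, not the authors' implementation; the proposal ordering (value 0 first) and all function names are assumptions.

```python
import random

ORDER = ["A", "B", "C"]
CONSTRAINTS = [("A", "B"), ("A", "C")]  # A != B, A != C


def q_dist(var, assignment):
    """Proposal Q from the slides (assumed ordering: value 0 first)."""
    if var == "A":
        return {0: 0.8, 1: 0.2}
    if var == "B":
        return {0: 0.4, 1: 0.6} if assignment["A"] == 0 else {0: 0.2, 1: 0.8}
    return {0: 0.2, 1: 0.8}


def is_consistent(assignment):
    return all(assignment[x] != assignment[y] for x, y in CONSTRAINTS
               if x in assignment and y in assignment)


def sample_search(rng):
    """Return (sample, trace); the trace lists every (variable, value) tried."""
    assignment, trace = {}, []
    # remaining[i] holds the not-yet-tried values of ORDER[i] on the current path.
    remaining = [None] * len(ORDER)
    remaining[0] = dict(q_dist(ORDER[0], assignment))
    i = 0
    while i < len(ORDER):
        if not remaining[i]:
            # Every value failed here: backtrack to the previous variable.
            i -= 1
            if i < 0:
                raise ValueError("the constraints have no solution")
            del assignment[ORDER[i]]
            continue
        var = ORDER[i]
        values = list(remaining[i])
        # rng.choices renormalizes the weights of the values that are left.
        value = rng.choices(values, weights=[remaining[i][v] for v in values], k=1)[0]
        del remaining[i][value]  # never try the same value twice on this path
        assignment[var] = value
        trace.append((var, value))
        if is_consistent(assignment):
            i += 1
            if i < len(ORDER):
                remaining[i] = dict(q_dist(ORDER[i], assignment))
        else:
            del assignment[var]  # constraint violated: resume sampling
    return assignment, trace


if __name__ == "__main__":
    sample, trace = sample_search(random.Random(0))
    print(sample, trace)
```

Every call returns a solution, so no sample is rejected; the trace records the dead ends that were visited on the way.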

Generate more Samples • Constraints A≠B, A≠C • [Figure: full sampling tree over A, B, C]

Generate more Samples • Constraints A≠B, A≠C • [Figure: sampling tree over A, B, C with one branch probability reset to 1]

Traces of SampleSearch • Constraints A≠B, A≠C • [Figure: three different search traces that each end in the sample A=0, B=1, C=1]

The Sampling distribution QR of SampleSearch • Did you generate samples from Q? - NO! • What is probability of generating A=0? QR(A=0)=0.8 • Why? SampleSearch is systematic • What is probability of generating B=1? QR(B=1|A=0)=1 • Why? SampleSearch is systematic • What is probability of generating B=0? Simple: QR(B=0|A=0)=0 • All samples generated by SampleSearch are solutions • Backtrack-free distribution • [Figure: tree below A=0 with B=0 at probability 0, B=1 at probability 1, and C=1 below B=1 at probability 1]

Computing QR • Invoke an oracle or a complete search procedure O(n) times per sample • [Figure: path A=0, B=1, C=1 with each branch off the path marked "?? Solution"]
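A minimal sketch of the backtrack-free distribution for the running example: at each variable, values that cannot be extended to a solution get probability 0 and the remaining values are renormalized. The oracle here is a brute-force enumeration, which is an assumption made to keep the example small; the proposal ordering (value 0 first) and the function names are also assumptions.

```python
import itertools

ORDER = ["A", "B", "C"]
CONSTRAINTS = [("A", "B"), ("A", "C")]  # A != B, A != C


def q_dist(var, assignment):
    """Proposal Q from the slides (assumed ordering: value 0 first)."""
    if var == "A":
        return {0: 0.8, 1: 0.2}
    if var == "B":
        return {0: 0.4, 1: 0.6} if assignment["A"] == 0 else {0: 0.2, 1: 0.8}
    return {0: 0.2, 1: 0.8}


def has_solution(prefix):
    """Brute-force oracle: can `prefix` be extended to a full solution?"""
    free = [v for v in ORDER if v not in prefix]
    for values in itertools.product((0, 1), repeat=len(free)):
        full = {**prefix, **dict(zip(free, values))}
        if all(full[x] != full[y] for x, y in CONSTRAINTS):
            return True
    return False


def qr_dist(var, prefix):
    """Backtrack-free conditional QR(var | prefix).

    Assumes `prefix` itself can be extended to a solution, so that at least
    one value of `var` survives and the normalizer is positive.
    """
    q = q_dist(var, prefix)
    kept = {v: p for v, p in q.items() if has_solution({**prefix, var: v})}
    total = sum(kept.values())
    return {v: kept.get(v, 0.0) / total for v in q}


if __name__ == "__main__":
    print(qr_dist("A", {}))                # QR(A=0) = 0.8
    print(qr_dist("B", {"A": 0}))          # QR(B=1 | A=0) = 1
    print(qr_dist("C", {"A": 0, "B": 1}))  # QR(C=1 | A=0, B=1) = 1
```

The oracle is queried once per candidate value of every variable along the sample, which is why computing QR exactly is expensive on real networks.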

Approximation AR of QR • IF Hole THEN AR=Q • IF No solutions on the other branch THEN AR=1 • [Figure: tree below A=0 with one branch labeled "Hole: Don't know" and the explored dead ends labeled "No solutions here"]

Approximation AR of QR • Problem: Can’t guarantee convergence • [Figure: three traces of the sample A=0, B=1, C=1 with branch values such as 0.8, 0.6 and 1, and unexplored branches marked "?"]

Guarantee convergence in the limit • Store all possible traces • Approximation ARN • IF Hole THEN ARN=Q • IF No solutions on other branch THEN ARN=1 • [Figure: search traces for the sample A=0, B=1, C=1, with unexplored branches marked "?"]
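The two rules above can be sketched for binary domains as follows. The dead ends discovered by the stored traces are represented as a set of partial assignments known to have no solution; this representation, the proposal ordering (value 0 first) and the function names are assumptions made for illustration.

```python
ORDER = ["A", "B", "C"]


def q_dist(var, assignment):
    """Proposal Q from the slides (assumed ordering: value 0 first)."""
    if var == "A":
        return {0: 0.8, 1: 0.2}
    if var == "B":
        return {0: 0.4, 1: 0.6} if assignment["A"] == 0 else {0: 0.2, 1: 0.8}
    return {0: 0.2, 1: 0.8}


def approx_prob(sample, dead):
    """Approximate the probability of `sample` from the known dead ends.

    `dead` is a set of tuples of (variable, value) pairs, each a partial
    assignment that some stored trace showed to have no solution.
    """
    prob = 1.0
    prefix = ()
    for var in ORDER:
        value = sample[var]
        other = 1 - value  # binary domains only
        if prefix + ((var, other),) in dead:
            factor = 1.0  # no solutions on the other branch
        else:
            factor = q_dist(var, dict(prefix))[value]  # hole: fall back to Q
        prob *= factor
        prefix += ((var, value),)
    return prob


if __name__ == "__main__":
    sample = {"A": 0, "B": 1, "C": 1}
    dead_b = (("A", 0), ("B", 0))
    dead_c = (("A", 0), ("B", 1), ("C", 0))
    print(approx_prob(sample, {dead_b, dead_c}))  # both dead ends known
    print(approx_prob(sample, {dead_c}))          # only the C=0 dead end known
    print(approx_prob(sample, set()))             # nothing known: plain Q
```

With a single trace the result depends on which dead ends that trace happened to visit (0.8, 0.48 or 0.384 here). Pooling the dead ends from all stored traces makes the three cases agree once every dead end has been seen, which matches the motivation for ARN.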

Improving Naive SampleSearch • Handle Non-binary domains • See the paper; the proof is complicated. • Better Search Strategy • Can use any state-of-the-art CSP/SAT solver e.g. minisat (Sorrenson et al 2006) • All theorems and results still hold • Better Importance Function • Use output of generalized belief propagation to compute the initial importance function Q (Gogate and Dechter 2005)

Experimental Results • Previous Algorithms • Likelihood weighting (LW) • Proposal=Prior • IJGP-sampling (IJGP-S) (Gogate and Dechter 2005) • Proposal=Output of generalized belief propagation • Adding SampleSearch • SampleSearch with LW (S+LW) • SampleSearch with IJGP-sampling (S+IJGP-S)

Linkage BN_69

Linkage BN_73

Linkage BN_76

Conclusions • Belief networks with zero probabilities lead to the Rejection Problem in importance sampling. • We presented the SampleSearch scheme, which works with any importance sampling scheme to circumvent the Rejection Problem. • The sampling distribution of SampleSearch is the backtrack-free distribution QR • Expensive to compute • An approximation of QR based on storing all traces yields an asymptotically unbiased estimator • Empirically, when a substantial number of zero probabilities are present, SampleSearch-based schemes dominate their pure sampling counterparts.
