Sampling and Soundness: Can We Have Both?
ISWC’07
Carla Gomes, Bart Selman, Ashish Sabharwal (Cornell University); Jörg Hoffmann (DERI Innsbruck) …and I am: Frank van Harmelen
Talk Roadmap • A Sampling Method with a Correctness Guarantee • Can we apply this to the Semantic Web? • Discussion
How Might One Count?
How many people are present in the hall?
Problem characteristics:
• Space naturally divided into rows, columns, sections, …
• Many seats empty
• Uneven distribution of people (e.g. more near door, aisles, front, etc.)
#1: Brute-Force Counting
Idea:
• Go through every seat
• If occupied, increment counter
Advantage:
• Simplicity, accuracy
Drawback:
• Scalability
#2: Branch-and-Bound (DPLL-style)
Idea:
• Split the space into sections, e.g. front/back, left/right/center, …
• Use smart detection of full/empty sections
• Add up all partial counts
Advantage:
• Relatively faster, exact
Drawback:
• Still “accounts for” every single person present: needs extremely fine granularity
• Scalability
Framework used in DPLL-based systematic exact counters, e.g. Relsat [Bayardo et al. ’00], Cachet [Sang et al. ’04]
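To make the branch-and-bound idea concrete outside the lecture-hall analogy, here is a minimal DPLL-style exact counter. This is a toy sketch only: real systems such as Relsat and Cachet add clause learning, component analysis, and caching. The CNF encoding (clauses as lists of non-zero integers, DIMACS style) is my choice for illustration.

```python
# Toy DPLL-style exact model counter (sketch only).
# A formula is a list of clauses; a clause is a list of non-zero ints,
# where -v denotes the negation of variable v (DIMACS style).

def simplify(clauses, lit):
    """Assign lit = True: drop satisfied clauses, shrink the rest.
    Returns None if some clause becomes empty (conflict)."""
    out = []
    for c in clauses:
        if lit in c:
            continue                      # clause satisfied: a "full section"
        reduced = [l for l in c if l != -lit]
        if not reduced:
            return None                   # conflict: an "empty section"
        out.append(reduced)
    return out

def count_models(clauses, variables):
    """Exact #models over `variables` by splitting the space in two."""
    if clauses is None:
        return 0                          # dead branch contributes nothing
    if not clauses:
        return 2 ** len(variables)        # all remaining variables are free
    v, rest = variables[0], variables[1:]
    return (count_models(simplify(clauses, v), rest) +
            count_models(simplify(clauses, -v), rest))

# Example: (x1 or x2) and (not x1 or x3) over variables 1..3 has 4 models.
print(count_models([[1, 2], [-1, 3]], [1, 2, 3]))
```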
#3: Naïve Sampling Estimate
Idea:
• Randomly select a region
• Count within this region
• Scale up appropriately
Advantage:
• Quite fast
Drawback:
• Robustness: can easily under- or over-estimate
• Scalability in sparse spaces: e.g. 10^60 solutions out of 10^300 means the region would need to be much larger than 10^240 to “hit” any solutions
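For contrast, the slide’s naïve estimate as a sketch (same clause encoding as above): sample uniform random assignments and scale the hit rate by 2^n. On sparse formulas the hit count is almost always zero, which is precisely the robustness and scalability drawback named here.

```python
import random

def naive_count_estimate(clauses, n_vars, samples=10000):
    """Estimate #models as 2^n * (fraction of uniformly random assignments
    that satisfy all clauses). Unbiased, but useless when solutions are
    too sparse for any sample to hit one."""
    hits = 0
    for _ in range(samples):
        assign = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            hits += 1
    return (2 ** n_vars) * hits / samples

print(naive_count_estimate([[1, 2], [-1, 3]], 3))  # true count is 4
```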
Sampling with a Guarantee
Idea:
• Identify a “balanced” row split or column split (roughly equal number of people on each side)
• Use local search for the estimate
• Pick one side at random
• Count on that side recursively
• Multiply the result by 2
This provably yields the true count on average!
• Even when an unbalanced row/column is accidentally picked for the split, e.g. even when samples are biased or insufficiently many
• Surprisingly good in practice, using local search as the sampler
Algorithm SampleCount [Gomes-Hoffmann-Sabharwal-Selman IJCAI’07]
Input: Boolean formula F
Set numFixed = 0, slack = some constant (e.g. 2, 4, 7, …)
Repeat until F becomes feasible for exact counting:
• Obtain s solution samples for F
• Identify the most balanced variable and variable pair
  [“x is balanced”: s/2 samples have x = 0, s/2 have x = 1;
   “(x, y) is balanced”: s/2 samples have x = y, s/2 have x = ¬y]
• If x is more balanced than (x, y), randomly set x to 0 or 1; else randomly replace x with y or ¬y; simplify F
• Increment numFixed
Output: model count ≥ 2^(numFixed − slack) × exactCount(simplified F), with confidence (1 − 2^(−slack))
Note: showing one trial
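A compact sketch of one SampleCount trial, reusing count_models from the earlier block. The simplifications are mine: the variable-pair balancing step is omitted, the sampler is a rejection-sampling stub (the real algorithm uses biased local-search samples, e.g. from SampleSat, and the guarantee holds regardless), and “feasible for exact counting” is approximated by a fixed threshold on the number of free variables.

```python
import random

def sample_solutions(clauses, n_vars, s):
    """Stub sampler via rejection sampling; only viable for toy formulas.
    SampleCount's guarantee is independent of sampler quality, so in
    practice biased local-search samples are used instead."""
    sols = []
    while len(sols) < s:
        a = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            sols.append(a)
    return sols

def sample_count_trial(clauses, n_vars, slack=2, s=20, threshold=2):
    """One SampleCount trial (variable-pair balancing omitted)."""
    free = list(range(1, n_vars + 1))
    num_fixed = 0
    while len(free) > threshold:          # until exact counting is feasible
        sols = sample_solutions(clauses, n_vars, s)
        # most balanced free variable: #samples with v=True closest to s/2
        v = min(free, key=lambda u: abs(sum(a[u] for a in sols) - s / 2))
        lit = v if random.random() < 0.5 else -v
        clauses = clauses + [[lit]]       # fix a randomly chosen side
        free.remove(v)
        num_fixed += 1
    residual = count_models(clauses, list(range(1, n_vars + 1)))
    return 2 ** (num_fixed - slack) * residual

# min over t trials = lower bound with confidence 1 - 2^(-slack*t):
trials = [sample_count_trial([[1, 2], [-1, 3]], 3) for _ in range(4)]
print(min(trials))   # lower bound on the true count, 4
```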
Correctness Guarantee
Theorem: SampleCount with t trials gives a correct lower bound with probability ≥ (1 − 2^(−slack·t)), e.g. slack = 2, t = 4 ⇒ 99% correctness confidence
Key properties:
• Holds irrespective of the quality of the local search estimates
• No free lunch! Bad estimates ⇒ high variance of the trial outcome; min(trials) is high-confidence but not tight
• Confidence grows exponentially with slack and t
Ideas used in the proof:
• Expected model count = true count (for each trial)
• Use Markov’s inequality Pr[X ≥ k·E[X]] ≤ 1/k to bound the error probability (X is the outcome of one trial)
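Spelled out, with X_i the undiscounted outcome 2^numFixed × exactCount of trial i and M the true model count (my reconstruction of how the two proof ideas combine):

```latex
% Unbiasedness: the side of each split is chosen uniformly at random, so
% every model survives the numFixed halvings with probability 2^{-numFixed},
% giving E[X_i] = M regardless of how balanced the splits actually are.
% Markov plus independence of the t trials then gives, for slack \alpha:
\Pr\Big[\, 2^{-\alpha} \min_{1 \le i \le t} X_i > M \,\Big]
  \;=\; \prod_{i=1}^{t} \Pr\big[ X_i > 2^{\alpha}\, \mathbb{E}[X_i] \big]
  \;\le\; \big( 2^{-\alpha} \big)^{t} \;=\; 2^{-\alpha t}.
% With alpha = 2 and t = 4, the reported bound 2^{-alpha} min_i X_i is a
% correct lower bound with probability >= 1 - 2^{-8}, about 99.6%.
```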
Circuit Synthesis, Random CNFs

Instance   | True Count   | SampleCount (99% conf.)   | Relsat (exact)           | Cachet (exact)
2bitmax_6  | 2.1 × 10^29  | ≥ 2.4 × 10^28 (29 sec)    | 2.1 × 10^29 (66 sec)     | 2.1 × 10^29 (2 sec)
3bitadd_32 | ---          | ≥ 5.9 × 10^1339 (32 min)  | --- (12 hrs)             | --- (12 hrs)
wff-3-3.5  | 1.4 × 10^14  | ≥ 1.6 × 10^13 (4 min)     | 1.4 × 10^14 (2 hrs)      | 1.4 × 10^14 (7 min)
wff-3-1.5  | 1.8 × 10^21  | ≥ 1.6 × 10^20 (4 min)     | ≥ 4.0 × 10^17 (12 hrs)   | 1.8 × 10^21 (3 hrs)
wff-4-5.0  | ---          | ≥ 8.0 × 10^15 (2 min)     | ≥ 1.8 × 10^12 (12 hrs)   | ≥ 1.0 × 10^14 (12 hrs)
Talk Roadmap • A Sampling Method with a Correctness Guarantee • Can we apply this to the Semantic Web? [Highly speculative] • Discussion
Counting in the Semantic Web…
• … should certainly be possible with this method
• Example: given RDF database D, count how many triples comply with query q (sketched below):
  • Throw a constraint cutting the set of all triples in half
  • If feasible, count the n remaining triples exactly; return n · 2^(#constraints − slack)
  • Else, iterate
• “Merely” technical challenges:
  • What are “constraints” cutting the set of all triples in half?
  • How to “throw” a constraint?
  • When to stop throwing constraints?
  • How to efficiently count the remaining triples?
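Purely to make the speculation tangible, a sketch of one possible instantiation (my choices, not from the talk): a “constraint” is a salted hash parity over triples, and the matching triples are materialized as a Python list. A real triple store would instead push the constraint into query evaluation so the full result set is never built.

```python
import random

def random_half_constraint(seed):
    """A candidate 'constraint cutting the triple set in half':
    keep a triple iff a salted hash of it is even. (One speculative
    answer to the slide's first question; not from the talk.)"""
    return lambda triple: hash((seed,) + triple) % 2 == 0

def sampled_triple_count(matching_triples, slack=2, threshold=1000):
    """Estimate |matching_triples| in SampleCount style: halve the set
    with random constraints until it is small enough to count exactly,
    then scale back up. `matching_triples` stands in for the triples
    of an RDF store D that comply with a query q."""
    current, thrown = matching_triples, 0
    while len(current) > threshold:           # "when to stop throwing"
        c = random_half_constraint(random.random())
        current = [t for t in current if c(t)]
        thrown += 1
    n = len(current)                          # exact count of the remainder
    return n * 2 ** (thrown - slack)          # lower-bound style scaling

# Toy usage: 10^5 fake triples that "match q".
D_q = [("s%d" % i, "p", "o%d" % i) for i in range(100000)]
print(sampled_triple_count(D_q))  # high-confidence lower bound on 100000
```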
What about Deduction?
• Does φ follow from T?
• Exploit the connection “implication ⇔ UNSAT (of T ∧ ¬φ)” ⇒ upper bounds?
• A similar theorem does NOT hold for upper bounds
• Nutshell: Markov’s inequality Pr[X ≥ k·E[X]] ≤ 1/k has no symmetric counterpart bounding Pr[X ≤ E[X]/k]
• An adaptation is possible but has many problems → does not look too promising
• Heuristic alternative (sketched below):
  • Add constraints to T to obtain T′; check whether T′ implies φ
  • If “no”, stop: a model of T′ ∧ ¬φ is also a model of T ∧ ¬φ, so T ⊭ φ for sure; if “yes”, go to the next trial
  • After t successful trials, output “it’s enough, I believe it”
• No provable confidence, but may work well in practice
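As the slide says, this is heuristic only. A sketch of the loop, with entails() a hypothetical black-box reasoner and strengthen() a hypothetical constraint thrower, both left open by the talk; the toy usage represents theories directly by their model sets, so “strengthening” is literally removing models.

```python
import random

def heuristic_entailment(T, phi, entails, strengthen, t=10):
    """Heuristic check of T |= phi with NO provable confidence.
    entails(T, phi) and strengthen(T) are hypothetical black boxes:
    a sound reasoner and a random-constraint thrower, respectively."""
    for _ in range(t):
        T_prime = strengthen(T)         # T' has a subset of T's models
        if not entails(T_prime, phi):   # a model of T' & ~phi is also a
            return False                #   model of T & ~phi: definite "no"
        # "yes" is inconclusive: the constraint may have removed exactly
        # the countermodels; try again with a fresh constraint
    return True                         # "it's enough, I believe it"

# Toy usage: theories as explicit model sets, entailment as set inclusion.
models_T = {0b00, 0b01, 0b11}
models_phi = {0b00, 0b01, 0b10, 0b11}
entails = lambda mT, mphi: mT <= mphi
strengthen = lambda mT: {m for m in mT if random.random() < 0.5}
print(heuristic_entailment(models_T, models_phi, entails, strengthen))
```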
What about Deduction?
• Does φ follow from T?
• Much more distant adaptation:
  • “Constraint” = something that removes half of T!!
  • Throw some constraints and check whether T′ ⊨ φ; if the weaker T′ already implies φ, then so does T
• Confidence problematic:
  • Can we draw any conclusions if T′ does NOT imply φ?
  • It may be that ψ1, ψ2 ∈ T with ψ1 ∧ ψ2 ⊨ φ, but a constraint separated ψ1 from ψ2
  • It may be that all relevant axioms are thrown out
• Are there interesting cases where we can bound the probability of these events??
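The same loop under this weakening reading, again with the hypothetical entails(); the duality to the previous sketch is that here a “yes” is sound (T′ ⊆ T, so anything the weaker T′ proves, T proves too) while a “no” is the inconclusive answer.

```python
import random

def sampled_kb_entailment(T_axioms, phi, entails, t=10):
    """'Sampling the knowledge base': check phi against random halves of
    the axiom set. T' (a subset of T) is weaker, so T' |= phi soundly
    implies T |= phi; a 'no' proves nothing, since the relevant axioms,
    or a pair psi1, psi2 with psi1 & psi2 |= phi, may have been split up."""
    for _ in range(t):
        T_prime = [ax for ax in T_axioms if random.random() < 0.5]
        if entails(T_prime, phi):
            return True     # sound: a weaker theory already proves phi
    return None             # inconclusive, as the slide warns
```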
Talk Roadmap • A Sampling Method with a Correctness Guarantee • Can we apply this to the Semantic Web? [Highly speculative] • Discussion
Discussion
• In propositional CNF, one can efficiently obtain high-confidence lower bounds on the number of models, by sampling
• Application to the Semantic Web:
  • Adaptation to counting tasks should be possible
  • Adaptation for deduction (T ⊨ φ) via upper bounds is problematic
  • Promising: a heuristic method sacrificing the confidence guarantee
  • Alternative adaptation weakens T instead of strengthening it: “sampling the knowledge base”; confidence guarantees??
Your feedback and thoughts are highly appreciated!!
What about Deduction?
• Does φ follow from T?
• Straightforward adaptation:
  • There is a variant of this algorithm that computes high-confidence upper bounds instead
  • Throw “large” constraints into T ∧ ¬φ and check whether the result is SAT
  • If SAT, no implication; if UNSAT in each of t iterations, confidence in an upper bound on #models
• Many problems:
  • Is the constrained T ∧ ¬φ actually easier to check??
  • “Large” constraints are tough even in the propositional CNF context! (“Large” = involves half of the propositional variables; needed for the confidence guarantee)
  • An upper bound on #models is not confidence in UNSAT!
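What this variant’s loop might look like, as a sketch. The “large” constraints are instantiated here as random XORs over half the variables, in the spirit of the MBound upper-bounding strategy [Gomes et al. ’06]; is_sat() is a hypothetical SAT oracle accepting CNF clauses plus XOR side constraints.

```python
import random

def upper_bound_variant(T_clauses, n_vars, neg_phi_clauses, is_sat, k=5, t=8):
    """Sketch of the upper-bound variant for checking T |= phi.
    F = T & ~phi; add k random 'large' XOR constraints (each over about
    half the variables) and test satisfiability with the hypothetical
    oracle is_sat(F, xors). SAT anywhere disproves the implication
    outright; t all-UNSAT rounds give confidence that #models(F) is
    roughly <= 2^k, which, as the slide warns, is still NOT confidence
    that F is UNSAT."""
    F = T_clauses + neg_phi_clauses
    for _ in range(t):
        xors = []
        for _ in range(k):
            half = random.sample(range(1, n_vars + 1), n_vars // 2)
            parity = random.randint(0, 1)
            xors.append((half, parity))   # constraint: XOR(half) = parity
        if is_sat(F, xors):
            return False      # found a model of T & ~phi: no implication
    return "probably"         # bounded model count, not proven UNSAT
```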