
Lifted First-Order Probabilistic Inference
Rodrigo de Salvo Braz, Eyal Amir and Dan Roth


Goal

• A general system to reason with symbolic data in a probabilistic fashion:
  • being able to describe objects of arbitrary structure: lists, trees, objects with various fields referring to other objects;
  • being able to describe probabilistic facts.

Previous approaches - Logic

• Powerful data representation and powerful knowledge declaration, but no probabilities;
• this prevents us from saying things as simple as "most birds fly", "most capitalized words are proper nouns" or "rectangular objects are common".

Previous approaches - Probabilistic models

• Powerful probabilistic capabilities in models such as Bayesian networks and Markov models, but no structured data such as lists, trees, etc.;
• cannot automatically apply knowledge to more than one object at the same time (for example, probabilistic properties of nouns to all nouns in a piece of text).

Previous approach - Knowledge-driven Construction

• Constructs a propositional probabilistic model based on logic-like probabilistic knowledge;
• builds the model before starting inference, often building unnecessary parts.

Key ideas

• Allows for declaration of probabilistic knowledge (KB) that applies to many objects:
  Prob(properNoun(Word) | noun(Word), capitalized(Word), similarTo(Word, Word2), knownName(Word2)) = 0.7
  holds for all words involved.
• Does not build a Bayesian network in advance of inference; instead, performs inference using KB items as necessary.
• Does not consider individual objects separately unless this is really necessary: similar to theorem proving, but with probabilities.
• Example: given capitalized(Beautiful) and adjective(Beautiful), the algorithm does not have to consider the whole list of known names in order to decide that Beautiful is not a proper noun, because it is not a noun in the first place. An algorithm that constructs a model before doing inference would build a node per known name!

Each atom in a rule is a parameterized random variable: noun(Word) stands for all of its instance random variables noun(w1), noun(w2), ..., noun(wn). We use a First-Order Variable Elimination algorithm in which all instances of a parameterized random variable can often be eliminated at once, regardless of the domain size, since they all have the same structure. This idea was introduced by David Poole in 2003; we formalized it and call it Inversion Elimination (IE). However, IE only works when these instances are independent of each other. For more general cases we introduce Counting Elimination, which does depend on the domain size. A sketch of Inversion Elimination follows below.
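The following is a minimal sketch of the Inversion Elimination step, assuming Boolean variables and a single shared factor; the names (Parfactor, eliminate_by_inversion) and the potentials are illustrative assumptions, not the authors' implementation. It shows why the lifted step costs the same for any domain size: each instance p(xi) appears only in its own copy of the shared factor, so the sum over all 2^n joint assignments factorizes into a per-instance sum raised to the power n.

```python
# Minimal sketch of Inversion Elimination (IE) over Boolean variables.
# Names and potentials are illustrative assumptions, not the authors'
# implementation.

class Parfactor:
    """A factor phi(p(X), q) shared by every instance of the
    parameterized random variable p(X); q is a single ordinary
    (non-parameterized) random variable."""
    def __init__(self, table, n):
        self.table = table  # table[(p_value, q_value)] -> potential
        self.n = n          # domain size |X|

def eliminate_by_inversion(pf):
    """Sum out all n instances p(x1), ..., p(xn) in one lifted step.

    Each instance appears only in its own copy of the shared factor,
    so the sum over joint assignments factorizes:
        sum_{p(x1..xn)} prod_i phi(p(xi), q) = (sum_v phi(v, q)) ** n
    The result is a factor over q alone, computed without enumerating
    the 2**n propositional assignments."""
    return {q: sum(pf.table[(v, q)] for v in (False, True)) ** pf.n
            for q in (False, True)}

# The lifted step costs the same whether |X| is 50 or 50,000.
phi = {(False, False): 0.8, (True, False): 0.3,
       (False, True):  0.4, (True, True):  0.8}
print(eliminate_by_inversion(Parfactor(phi, n=50)))
```

A propositional variable elimination algorithm applied to the same model would instead have to multiply and sum n separate ground factors, touching every instance individually.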
Examples

Smoking facts in a large population:
Prob(friends(X,Y)) = 0.1
Prob(smoker(X)) = 0.3
Prob(smoker(Y) | friends(X,Y), smoker(X)) = 0.6
Prob(cancer(X) | smoker(X)) = 0.6, X ≠ john
Prob(cancer(john)) = 0.01
A possible query: in a population of 40,000, what is the probability of a male having cancer?

A group of people, each with data such as name, age and profession, as well as relationships between them:
Prob(friends(X,Z) | friends(X,Y), friends(Y,Z)) = 0.6
Prob(age(Person) > 20 | degree(Person)) = 0.9

Web pages, with their words, subjects and links, and how they relate to each other:
Prob(subject(Page,coffee) | wordIn(Page,java)) = 0.2
Prob(subject(Page,programming) | wordIn(Page,java)) = 0.8
Prob(subject(Page,Subj) | link(Page,Page2), subject(Page2,Subj)) = 0.7

Comparing Lifted and Propositional Inferences

We compare the lifted methods (Inversion and Counting Elimination) to their propositional counterparts, which produce identical numerical results, on the queries P(sick(Person) | epidemics) and P(death | sick(Person)) in model (I), and P(p(X), p(Y), r) in model (II). A sketch of Counting Elimination follows below.
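Below is a minimal sketch of Counting Elimination for the kind of coupled model where inversion fails: a symmetric pairwise factor phi(p(X), p(Y)) applied to every unordered pair of instances. It is simplified relative to model (II) above (it drops the extra variable r), and the names and potentials are assumptions, not the authors' implementation. The key observation is that the potential of a joint assignment depends only on how many instances are true, so the 2^n assignments collapse into n + 1 counting cases.

```python
# Minimal sketch of Counting Elimination over Boolean variables, for a
# symmetric pairwise parfactor phi(p(X), p(Y)) applied to every
# unordered pair of instances. Names and potentials are illustrative
# assumptions, not the authors' implementation.
from math import comb

def eliminate_by_counting(phi, n):
    """Sum out p(x1), ..., p(xn) when the instances are coupled.

    Inversion does not apply here, but the potential of a joint
    assignment depends only on the number k of true instances, so we
    sum over counts instead of over the 2**n assignments:
        sum_k C(n,k) * phi(T,T)**C(k,2) * phi(F,F)**C(n-k,2)
                     * phi(T,F)**(k*(n-k))
    """
    total = 0.0
    for k in range(n + 1):
        weight = (phi[(True, True)] ** comb(k, 2)
                  * phi[(False, False)] ** comb(n - k, 2)
                  * phi[(True, False)] ** (k * (n - k)))
        total += comb(n, k) * weight
    return total

phi = {(True, True): 0.9, (True, False): 0.5, (False, False): 0.8}
print(eliminate_by_counting(phi, n=30))  # 31 counting cases, not 2**30
```

As noted above, the cost of this step does grow with the domain size n, unlike Inversion Elimination, but in this sketch it is only n + 1 terms rather than 2^n propositional assignments.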
Contributions

• A First-Order Probabilistic Inference algorithm which:
  • provides a language for both structured data and probabilistic information;
  • takes advantage of first-order level information by eliminating many objects in a single step, as opposed to previous work.

Where we are

• We have:
  • predicates (relational structure);
  • no individuals split unnecessarily.
• We do not yet have:
  • function symbols, so no data structures such as lists and trees;
  • full advantage of inference during model construction; this will require approximation.

This research is supported by ARDA's AQUAINT Program, by NSF grant ITR-IIS-0085980 and by DAF grant FA8750-04-2-0222 (DARPA REAL program). AAAI'06, IJCAI'05.