Probabilistic Abduction using Markov Logic Networks
Rohit J. Kate and Raymond J. Mooney
Abduction • Abduction is inference to the best explanation for a given set of evidence • Applications include tasks in which observations need to be explained by the best hypothesis • Plan recognition • Intent recognition • Medical diagnosis • Fault diagnosis • … • Most previous work falls under one of two frameworks for abduction • First-order logic-based abduction • Probabilistic abduction using Bayesian networks
Logical Abduction Given: • Background knowledge, B, in the form of a set of (Horn) clauses in first-order logic • Observations, O, in the form of atomic facts in first-order logic Find: • A hypothesis, H, a set of assumptions (logical formulae) that, together with the background theory, logically entails the observations: B ∪ H ⊨ O • Typically, the best explanation is the one with the fewest assumptions, e.g. the one that minimizes |H|
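To make the definition concrete, here is a minimal propositional sketch of logical abduction that prefers the fewest assumptions, assuming Horn clauses are given as (body, head) pairs over ground atoms; the systems cited on the following slides work in full first-order logic with unification, and all function and variable names here are hypothetical.

from itertools import combinations

def abduce(clauses, assumables, observations, max_size=3):
    """Smallest H (subset of assumables) such that clauses + H entail observations."""
    def closure(facts):
        # Forward-chain the Horn clauses to a fixed point.
        facts = set(facts)
        changed = True
        while changed:
            changed = False
            for body, head in clauses:
                if head not in facts and body <= facts:
                    facts.add(head)
                    changed = True
        return facts

    for size in range(max_size + 1):          # fewest assumptions first
        for H in combinations(assumables, size):
            if observations <= closure(H):
                return set(H)
    return None

# Simplified malaria example (Malaria/Blood arguments dropped for brevity)
clauses = [
    (set(), "Transfuse(Mary,John)"),          # observed fact in B
    ({"Infected(Mary)", "Transfuse(Mary,John)"}, "Infected(John)"),
]
print(abduce(clauses,
             assumables=["Infected(Mary)", "Mosquito(M)"],
             observations={"Infected(John)"}))
# -> {'Infected(Mary)'}

On this simplified version of the malaria example, the smallest hypothesis is the single assumption Infected(Mary), mirroring the explanation on the next slide.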
Sample Logical Abduction Problem
• Background knowledge:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
• Observations:
Infected(John,Malaria)
Transfuse(Blood,Mary,John)
• Explanation:
Infected(Mary,Malaria)
Previous Work in Logical Abduction • Several first-order logic-based approaches [Poole et al. 1987; Stickel 1988; Ng & Mooney 1991; Kakas et al. 1993] • Perform first-order "backward" logical reasoning to determine a set of assumptions sufficient to deduce the observations • Unable to reason under uncertainty to find the most probable explanation
Background knowledge:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   holds 80% of the time
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   holds 40% of the time
Observations:
Infected(John,Malaria)   99% sure
Transfuse(Blood,Mary,John)   60% sure
Previous Work in Probabilistic Abduction • An alternative framework is based on Bayesian networks [Pearl 1988] • Uncertainties are encoded in a directed graph • Given a set of observations, probabilistic inference over the graph computes the posterior probabilities of explanations • Unable to handle structured representations because it is essentially based on propositional logic
Probabilistic Abduction using MLNs • We present a new approach to probabilistic abduction that combines first-order logic and probabilistic graphical models • Uses Markov Logic Networks (MLNs) [Richardson and Domingos 2006], a theoretically sound framework for combining first-order logic and probabilistic graphical models Rest of the talk: • MLNs • Our approach using MLNs • Experiments • Future Work and Conclusions
Markov Logic Networks (MLNs) [Richardson and Domingos 2006] • A logical knowledge base is a set of hard constraints on the set of possible worlds • An MLN is a set of soft constraints: when a world violates a clause, it becomes less probable, not impossible • Give each clause a weight (higher weight → stronger constraint)
Sample MLN Clauses
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   weight: 20
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   weight: 5
MLN Probabilistic Model • An MLN is a template for constructing a Markov network • Ground literals correspond to nodes • Ground clauses correspond to cliques connecting the ground literals in the clause • Probability of a world (truth assignment) x:
P(X = x) = (1/Z) exp(Σᵢ wᵢ nᵢ(x))
where wᵢ is the weight of clause i, nᵢ(x) is the number of true groundings of clause i in x, and Z is the normalizing constant
Sample MLN Probabilistic Model • Clauses with weights:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   weight: 20
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   weight: 5
• Constants: John, Mary, M
• Ground literals: Mosquito(M) Infected(M,Malaria) Bite(M,John) Bite(M,Mary) Infected(John,Malaria) Infected(Mary,Malaria) Transfuse(Blood,John,Mary) Transfuse(Blood,Mary,John)
Sample MLN Probabilistic Model • Clauses with weights:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   weight: 20
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   weight: 5
• Constants: John, Mary, M
• A sample world (truth assignment to the ground literals):
Mosquito(M) true, Infected(M,Malaria) true, Bite(M,John) false, Bite(M,Mary) false, Infected(John,Malaria) true, Infected(Mary,Malaria) true, Transfuse(Blood,John,Mary) true, Transfuse(Blood,Mary,John) false
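To see the model in action, here is a toy computation of the unnormalized probability of the sample world above, assuming the three constants and the listed truth values, with unlisted atoms false; the Malaria and Blood arguments are dropped for brevity, and all names are hypothetical.

from itertools import product

C = ["John", "Mary", "M"]
true_atoms = {
    "Mosquito(M)", "Infected(M)", "Infected(John)", "Infected(Mary)",
    "Transfuse(John,Mary)",
}
holds = lambda a: a in true_atoms          # unlisted atoms are false

def bite_clause(x, y):
    # Mosquito(x) ^ Infected(x) ^ Bite(x,y) -> Infected(y), weight 20
    body = holds(f"Mosquito({x})") and holds(f"Infected({x})") and holds(f"Bite({x},{y})")
    return (not body) or holds(f"Infected({y})")

def transfuse_clause(x, y):
    # Infected(x) ^ Transfuse(x,y) -> Infected(y), weight 5
    body = holds(f"Infected({x})") and holds(f"Transfuse({x},{y})")
    return (not body) or holds(f"Infected({y})")

n1 = sum(bite_clause(x, y) for x, y in product(C, C))        # 9 of 9 groundings true
n2 = sum(transfuse_clause(x, y) for x, y in product(C, C))   # 9 of 9 groundings true
print(n1, n2, 20 * n1 + 5 * n2)   # unnormalized log-weight: 20*9 + 5*9 = 225

The world's probability is then exp(225)/Z; a world that falsified one grounding of the weight-20 clause would lose 20 in the exponent and so be far less probable.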
MLN Inference and Learning • Using probabilistic inference techniques, one can determine the most probable truth assignment, the probability that a clause holds, etc. • Given a database of training examples, appropriate weights for the formulae can be learned to maximize the probability of the training data • An open-source software package for MLNs, called Alchemy, is available
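As an illustration of how such a knowledge base might be handed to Alchemy, the sketch below writes out the malaria clauses in Alchemy's weight-then-formula style; the declaration and formula layout is an assumption based on the Alchemy tutorials, so the exact format (and the type name agent) should be checked against the Alchemy manual.

# A hedged sketch of an Alchemy-style .mln file; syntax details are
# assumptions, not verified Alchemy output.
mln_program = """
// Predicate declarations (the type name 'agent' is an assumption)
Mosquito(agent)
Bite(agent, agent)
Transfuse(agent, agent)
Infected(agent)

// Weighted clauses: a higher weight means a stronger constraint
20  Mosquito(x) ^ Infected(x) ^ Bite(x,y) => Infected(y)
5   Infected(x) ^ Transfuse(x,y) => Infected(y)
"""

with open("malaria.mln", "w") as f:
    f.write(mln_program)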
Abduction using MLNs • Given:
Infected(Mary,Malaria) ∧ Transfuse(Blood,Mary,John) → Infected(John,Malaria)
Transfuse(Blood,Mary,John)
Infected(John,Malaria)
• The clause is satisfied whether Infected(Mary,Malaria) is true or false • Given the observations, a world has the same probability under the MLN whether the explanation is true or false, so explanations cannot be inferred • The MLN inference mechanism is inherently deductive, not abductive
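A quick sanity check of this point, assuming the two observations are fixed to true: the ground clause is satisfied under either truth value of the candidate explanation, so deductive inference receives no signal about it. The variable names are hypothetical.

infected_john = True           # observed
transfuse_mary_john = True     # observed
for infected_mary in (True, False):
    body = infected_mary and transfuse_mary_john
    satisfied = (not body) or infected_john
    print(f"Infected(Mary)={infected_mary}: clause satisfied = {satisfied}")
# Both worlds satisfy the clause, so it contributes the same weight to each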
Adapting MLNs for Abduction • Explicitly include the reverse implications:
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
∀y (Infected(y,Malaria) → ∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria)))
• Existentially quantify the universally quantified variables which appear on the LHS but not on the RHS in the original clause • Now, given Transfuse(Blood,Mary,John) and Infected(John,Malaria), the probability of the world in which Infected(Mary,Malaria) is true will be higher
Adapting MLNs for Abduction • However, there could be multiple explanations for the same observations:
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
∀y (Infected(y,Malaria) → ∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria)))
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀y (Infected(y,Malaria) → ∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y)))
• An observation should be accounted for by a single explanation, not by multiple explanations at once • The system should support "explaining away" [Pearl 1988]
Adapting MLNs for Abduction
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
• Add the disjunction clause and the mutual exclusivity clause for the same RHS term • Since MLN clauses are soft constraints, both explanations can still be true
• Disjunction clause:
∀y (Infected(y,Malaria) → ∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria)) ∨ ∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y)))
• Mutual exclusivity clause:
∀y (Infected(y,Malaria) → ¬(∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria))) ∨ ¬(∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y))))
Adapting MLNs for Abduction • In general, for the Horn clauses P1 → Q, P2 → Q, …, Pn → Q in the background knowledge base, add: • A reverse implication disjunction clause: Q → P1 ∨ P2 ∨ … ∨ Pn • A mutual exclusivity clause for every pair of explanations: Q → ¬P1 ∨ ¬P2, Q → ¬P1 ∨ ¬Pn, …, Q → ¬P2 ∨ ¬Pn • Weights can be learned from training examples or can be set heuristically
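A propositional sketch of this construction is below, assuming each Horn clause arrives as a (body, head) pair of strings; the paper's full algorithm additionally handles quantification, unification, and variable renaming (see the next slide), and the function names here are hypothetical.

from collections import defaultdict
from itertools import combinations

def abductive_clauses(rules):
    """Group rules by head; emit reverse-implication and mutual exclusivity clauses."""
    by_head = defaultdict(list)
    for body, head in rules:
        by_head[head].append(" ^ ".join(body))

    out = []
    for head, bodies in by_head.items():
        # Reverse implication: Q => P1 v P2 v ... v Pn
        out.append(f"{head} => " + " v ".join(f"({b})" for b in bodies))
        # Mutual exclusivity for every pair: Q => !Pi v !Pj
        for b1, b2 in combinations(bodies, 2):
            out.append(f"{head} => !({b1}) v !({b2})")
    return out

rules = [
    (["Mosquito(x)", "Infected(x)", "Bite(x,y)"], "Infected(y)"),
    (["Infected(x)", "Transfuse(x,y)"], "Infected(y)"),
]
for c in abductive_clauses(rules):
    print(c)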
Adapting MLNs for Abduction • There could be constants or variables in the RHS predicate:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀x (Infected(x,Malaria) ∧ Transfuse(Blood,x,John) → Infected(John,Malaria))
• Disjunction clause:
Infected(John,Malaria) → ∃x (Transfuse(Blood,x,John) ∧ Infected(x,Malaria)) ∨ ∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,John))
• Mutual exclusivity clause:
Infected(John,Malaria) → ¬(∃x (Transfuse(Blood,x,John) ∧ Infected(x,Malaria))) ∨ ¬(∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,John)))
• Reverse implication for the general clause:
∀y (Infected(y,Malaria) → ∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y)))
• The formal algorithm is described in the paper; it requires appropriate unifications and variable renamings
Experiments: Dataset • Plan recognition dataset used to evaluate abductive systems [Ng & Mooney 1991; Charniak & Goldman 1991] • Characters' higher-level plans must be inferred to explain their observed actions in a narrative text • "Fred went to the supermarket. He pointed a gun at the owner. He packed his bag." => robbing • "Jack went to the supermarket. He found some milk on the shelf. He paid for it." => shopping • Dataset contains 25 development examples [Goldman 1990] and 25 test examples [Ng & Mooney 1992]
Experiments: Dataset contd. • A background knowledge base of 107 rules was constructed for the ACCEL system [Ng & Mooney 1991] to work with the 25 development examples:
instance_shopping(s) ^ go_step(s,g) → instance_going(g)
instance_shopping(s) ^ go_step(s,g) ^ shopper(s,p) → goer(g,p)
• Narrative text is represented in first-order logic; 12.6 literals on average • "Bill went to the store. He paid for some milk."
instance_going(Go1) goer(Go1,Bill) destination_go(Store1) instance_paying(Pay1) payer(Pay1,Bill) thing_paid(Pay1,Milk1)
• Assumptions explaining the above actions:
instance_shopping(S1) shopper(S1,Bill) go_step(S1,Go1) pay_step(S1,Pay1) thing_shopped_for(S1,Milk)
Experiments: Methodology • Our algorithm automatically adds clauses to the knowledge base for performing abduction using MLNs • We found that the 25 development examples were too few to learn weights for MLNs, so we set the weights heuristically • Small negative weights on unit clauses so that they are not assumed for no reason • Medium weights on reverse implication clauses • Large weights on mutual exclusivity clauses • Given a set of observations, we use Alchemy's probabilistic inference to determine the most likely truth assignment for the remaining literals
Experiments: Methodology contd. • We compare with the ACCEL system [Ng & Mooney 1992], a purely logic-based system for abduction that selects the best explanation using a metric • Simplicity metric: selects the explanation of smallest size • Coherence metric: selects the explanation that maximally connects the observations (specifically geared towards this narrative-understanding task) • "John took the bus. He bought milk." => John took the bus to the store where he bought the milk.
Experiments: Methodology contd. • Besides finding the assumptions, a deductive system like an MLN also derives other facts that follow from the assumptions • We deductively expand ACCEL's output and the gold-standard answers for a fair comparison • We measure • Precision: the fraction of predicted ground literals that are in the gold-standard answers • Recall: the fraction of ground literals in the gold-standard answers that were predicted • F-measure: the harmonic mean of precision and recall
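These metrics reduce to simple set arithmetic; below is a minimal sketch, assuming predictions and gold answers are given as sets of ground-literal strings (the literals shown are illustrative, not actual system output).

def prf(predicted, gold):
    """Precision, recall, and F-measure over sets of ground literals."""
    tp = len(predicted & gold)                       # correctly predicted literals
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

predicted = {"instance_shopping(S1)", "shopper(S1,Bill)", "go_step(S1,Go1)"}
gold = {"instance_shopping(S1)", "shopper(S1,Bill)", "pay_step(S1,Pay1)"}
print(prf(predicted, gold))   # (0.667, 0.667, 0.667) up to rounding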
Experiments: Results • Development Set (results table not reproduced)
Experiments: Results contd. • Test Set (results table not reproduced)
Experiments: Results contd. • MLN performs better than ACCEL-simplicity, particularly on the development set • ACCEL-coherence performs the best, but was specifically tailored for the narrative-understanding task • The dataset does not require a full probabilistic treatment because there is little uncertainty in the knowledge base or observations • MLNs did not need any heuristic metric but simply found the most probable explanation
Future Work • Evaluate probabilistic abduction using MLNs on a task in which uncertainty plays a bigger role • Evaluate on a larger dataset on which the weights could be learned, to automatically adapt to a particular domain • Previous abductive systems like ACCEL have no learning mechanism • Perform probabilistic abduction using other frameworks for combining first-order logic and graphical models [Getoor & Taskar 2007], for example Bayesian Logic Programming [Kersting & De Raedt 2001], and compare with the presented approach
Conclusions • A general method for probabilistic first-order logical abduction using MLNs • The existing off-the-shelf deductive inference machinery of MLNs is employed to do abduction by suitably reversing the implications • Handles uncertainty using probabilities and an unbounded number of related entities using first-order logic, and is capable of learning • Experiments on a small plan recognition dataset demonstrated that it compares favorably with special-purpose logic-based abductive systems
Thanks! Questions?