Intro to AI Resolution & Prolog

Intro to AIResolution & Prolog Ruth Bergman Fall 2004

West Wessex Marathon Race (by Martin Hollis) • With a field of five for the west wessex marathon race, there was little to interest the bettors. So, Peter Piper opened one of his ingenious books where he accepts multiple bets at high odds. To place a multiple bet, you must bet on two propositions, and you will win only if you are wholly successful. Peter Piper is now on an extended vacation because of the bettors who lost these bets: • A will not win the gold, nor B the silver. • C will win a medal, and D will not. • D and E will both win medals. • D will not win the silver, nor E the bronze. • A will win a medal, and C will not. Who won which of the medals?

West Wessex Marathon Race • A racer medals iff he receives a gold, silver or bronze medal x medal(x)  (gold(x) v silver(x) v bronze(x)) • A racer can only receive one medal • x gold(x)  medal(x) ^ (silver(x) v bronze(x)) • x silver(x)  medal(x) ^ (gold(x) v bronze(x)) • x bronze(x)  medal(x) ^ (silver(x) v gold(x)) • Only one race can receive a gold (silver, bronze) medal • x,y gold(x) ^ x != y   gold(y) • x,y silver(x) ^ x != y   silver(y) • x,y bronze(x) ^ x != y   bronze(y) • Some participant must win a gold (silver, bronze) medal • x gold(x) • x silver(x) • x bronze(x)

West Wessex Marathon Race • A will not win the gold, nor B the silver is false gold(A) v silver(B) • C will win a medal, and D will not is false  medal(C) v medal(D) • D and E will both win medals is false  medal(D) v  medal(E) • D will not win the silver, nor E the bronze is false silver(D) v bronze(E) • A will win a medal, and C will not is false  medal(A) v medal(C) • Who won which of the medals? Proof in class.

Inference Rules for FOL • Inference Rules • Modus Ponens, And-Elimination, And-Introduction, Or-Introduction, Resolution • Rules for substituting variables • Universal-Elimination, Existential-Elimination, Existential-Introduction

Theorem Proving Strategies • Strategy 1: Deduce everything possible from the KB, and hope to derive what you want to prove. • A forward chaining approach • Strategy 2: Find out what facts you need in order to prove theorem. Try to prove those facts. Repeat until there are no facts to prove. • A backward chaining approach

Forward Chaining • A “data-driven” strategy • This inference procedure will find a proof for any true sentence • For false sentences this procedure may not halt. This procedure is semi-decidable. • Generates many irrelevant conclusions. • Forward-chain(KB, a) • If a is in KB return yes • Apply inferences rules on KB • Add new sentences to KB • Go to 1

Backward Chaining • back-chain(KB, alist)* • If alist is empty return yes • Let a = First(alist) • for each sentence(p1… pn)ain KB • back-chain(KB,[p1, …, pn,Rest(alist)]) • * Simplified version ignores variable substitution. • Applied when there is a goal to prove • Apply modus-ponens backwards • Modus Ponens is incomplete x P(x)  Q(x), xP(x)  R(x) x Q(x)  S(x), x R(x)  S(x) • cannot conclude S(A)

Resolution : A Complete Inference Procedure • Simple resolution (for ground terms): • Modus ponens derives only atomic conclusions, but resolution derives new implications ,  ,   

p1  … pj …  pm , • q1  … qk …  qn • SUBST(, (p1  … pj-1  pj+1 … pm  q1  … qk-1  qk+1 …  qn)) The Resolution Inference Rule • Generalized Resolution : • for literals pi and qi, where UNIFY(pj , qk) = 

Modus Ponens ,     Resolution True  ,    True   Resolution v.s. Modus Ponens • Resolution is a generalization of modus ponens

Resolution Proofs • w  P(w) v Q(w) • x P(x) v R(x) • y  Q(y) v S(y) • x  R(x) v S(x)  Q(y) v S(y)  P(w) v Q(w) {y/w} P(x)  R(x)  P(w) v S(w) {x/w}  R(z) v S(z) Premise S(w)  R(w) {w/A, z/A} Conclusion, resolvent S(A) Still not complete

Resolution Proof with Refutation • Complete inference procedure • refutation, proof by contradiction, reductio ad absurdum • To proof P, • assume P is false (add P to KB) • prove a contradiction • (KB  P  False)  (KB  P)

Resolution Principle • The resolution procedure: • To prove p, add P to KB • convert all sentence in KB to cannonical form • use the resolution rule (repeatedly) • try to derive an empty clause

Proof by Refutation  Q(y) v S(y)  P(w) v Q(w) {y/w} • w  P(w) v Q(w) • x P(x) v R(x) • y  Q(y) v S(y) • x  R(z) v S(z) •  S(A) P(x)  R(x)  P(w) v S(w) {x/w}  R(z) v S(z) S(w)  R(w) {w/A, z/A}  S(A) S(A) contradiction

CNF P(w)  Q(w) P(x)  R(x) Q(y)  S(y) R(z)  S(z) Canonical Form for Resolution • Conjunctive Normal Form (CNF) : conjunctions of disjunctions • Implied universal quantifier on variables • Remove existential quantifiers • Quantifiers can be tricky: •  y  x p(x,y)   x p(x,A) where A is a new symbol •  x  y p(x,y)  ??? • intuitively,  x p(x,A) is wrong • we want to capture the idea that the existential quantifier is somehow dependent on the universal scoped outside of it

Conversion to Normal Form • Any first-order logic sentence can be converted to CNF • Conversion procedure • Eliminate implication • Reduce the scope of negations () to single predicates • Standardize variable • Move quantifiers left • Skolemize : remove existential quantifiers • Distribute  over  • Remove  to form separate wff

Conversion to CNF Example: Convert  ( y  x p(x,y)   x  y p(x,y)) to CNF • Eliminate implication: Replace A  B by  A  B  (  y  x p(x,y) v  x  y p(x,y)) • Move  inwards : negations are only allowed on atoms  y  x p(x,y) ^  x  y  p(x,y) • Standardize variables : unique variable name  y  x p(x,y) ^  w  z  p(w,z) • Move quantifier left without changing meaning  y  x  w  z p(x,y) ^ p(w,z)

Conversion to CNF: Skolemization 5. Skolemization (named after the Polish logician Skolem) • replace each existentially quantified variable with a new function () with arguments that are any universally quantified variable scoped outside of it •  y  x p(x,y)   x p(x,sk1) •  x  y p(x,y)   x p(x,sk2(x)) •  x  y  z p(x,y) ^ q(x,z)   x p(x,sk3(x)) ^ q(x,sk4(x)) •  x  y  z p(x,y) ^ q(x,z)   x y p(x,y) ^ q(x,sk5(x,y)) • These function are often referred to as Skolem functions • We can now remove  as implicit. All remaining variables are universally quantified. p(x,sk1) ^ p(sk2(x),z)

Conversion to CNF • Distribute  over  • (a  b)  c • Remove  • p(x,sk1), p(sk2(x),z) • As a result, we have sets of clauses to which we can apply the resolution rule • find a set with a positive term that matches a negative term in another set • to do so, we need to understand how to match things

Unification • The process of matching is called unification: • p(x) matches p(Jack) with x/Jack • q(fatherof(x),y) matches q(y,z) with y/fatherof(x) and z /y • note the result of the match is q(fatherof(x),fatherof(x)) • p(x) matches p(y) with • x/Jack and y/Jack • x/John and y/John • or x/y • The match that makes the least commitment is called the most general unifier (MGU)

Substitution • We use the notation subst(t,s) to denote the application of the substitution s = {v1/t1,v2/t2 ... vn/tn} to t. • Composition of substitutions • subst(p, COMPOSE(1, 2)) = subst(subst(p, 1), 2)

Unification Algorithm • Assume input formulas p and q have standardized variables parentof(fatherof(John),y) parentof(y,z) • parentof(fatherof(John),v1) parentof(v2,v3) • Unification takes two atomic sentences, p and q, and returns a substitution that would make p and q look the same.

Unification Algorithm • Unify(p, q, q) returns a substitution or fail • If q =fail return fail • If p = q return q • If p is a variable return Unify-var(p, q, q) • If q is a variable return Unify-var(q, p, q) • If either p or q is a constant if p = q return q else return fail • If the predicate symbols of p and q are different return fail • If p and q have different arity return fail • for each pair of arguments t1, t2 of p, q • Let S = Unify (t1,t2, q) • If S = fail return fail • Apply S to the remaining arguments of p and q • Set q = COMPOSE(S, q) • Return q

Unification Algorithm • Unify-var(var, x, q) returns a substitution or fail • If {var/val} in q return Unify(val, x, q) • If {x/val} in q return Unify(var, val, q) • If var occurs anywhere in x return fail • Return {var/x} U q • Note how this works with skolemization • The occurs check takes time linear in the size of the expression.  Thus time complexity of Unify is O(n2)

Resolution Example •  y  x p(x,y)   x  y p(x,y) •  x  y p(x,y)   y  x p(x,y) • From our previous axiomitization •  x male(x) v female(x) ^ (male(x) ^ female(x)) •  x y z parentof(x,y) ^ ancesterof(y,z)  ancesterof(x,z) •  x y parentof(x,y)  ancesterof(x,y) •  x parentof(fatherof(x),x) ^ parentof(motherof(x),x) ^ male(fatherof(x)) ^ female(motherof(x)) •  x  y1, y2 parentof(y1,x) ^ parentof(y2,x) ^ (y1 = y2) • x,y childof(y,x)  parentof(x,y) • male(jacob), female(rebecca), parentof(rebecca,jacob) ... • show •  x  y ancesterof(y,x) ^ female(y)

Example • Relevant clausal forms from KB • parentof(x,y) v ancesterof(x,y) • parentof(motherof(x),x) • female(motherof(x)) • negated goal •  x  y ancesterof(y,x) ^ female(y) • ancesterof(y,A) v female(y) • Proof .... (done in class)

Resolution Strategies • We know that repeated application of the resolution rule will find a proof if one exists • How efficient is this process? • The resolution procedure may be viewed as a search problem • The state space is very large • Some strategies for guiding the search toward a proof … • A resolution strategy is complete if its use will result in a procedure that will find a contradiction whenever one exists.

Proof by Refutation 2  S(A)  R(z) v S(z) {z/A}  R(A) P(x)  R(x) • w  P(w) v Q(w) • x P(x) v R(x) • y  Q(y) v S(y) • x  R(z) v S(z) •  S(A) {x/A}  P(w) v Q(w)  S(A) {w/A}  Q(y) v S(y)  S(A) {y/A}  S(A) S(A) contradiction

Horn Clauses: A Special Case • Horn clauses are a special subset of logic: • A x P(x) • A x,y P(x)  Q(x)^ R(y) • All universal variables • At most one positive atomic sentence P(x) v  Q(x) v  R(x) • Note that pure backward chaining suffices to prove statements: • To prove P(Y), either • unify with atomic P(x) • unify with P(x) v  Q(x) v  R(x); Q(x) v  R(x) become subgoals

Unit Resolution • Aka Unit preference • prefers to do resolution where one of the sentence is a single literal (unit clause) • we are trying to prove a sentence of length 0, so it might be a good idea to prefer inferences that produce shorter sentences • it is a useful heuristic that can be combined with other strategies • Complete for Horn clauses.

Input Resolution • every resolution combines one of the input sentences (either query or sentence in KB) • shape of a diagonal “spine” • input resolution is equivalent to unit resolution • complete for Horn form but incomplete in general case • Example: Q(u) v P(u) • Q(w) v P(w) Q(y) v  P(y)  Q(x) v  P(x)

An Input Resolution Proof  S(A)  R(z) v S(z) {z/A}  R(A) P(x)  R(x) • w  P(w) v Q(w) • x P(x) v R(x) • y  Q(y) v S(y) • x  R(z) v S(z) •  S(A) {x/A}  P(w) v Q(w)  S(A) {w/A}  Q(y) v S(y)  S(A) {y/A}  S(A) S(A) contradiction

Linear Resolution • a generalization of input resolution • allow P and Q to be resolved together if P is an input sentence or P is an ancestor of Q in the proof tree • linear resolution is complete

A Linear Resolution Proof  Q(x) v  P(x) Q(y) v  P(y) • Q(u) v P(u) • Q(w) v P(w) •  Q(x) v  P(x) • Q(y) v  P(y) • Q(w) v P(w)  P(x)  Q(w) Q(u) v P(u) P(u) contradiction

Set of support • This strategy tries to eliminate many resolutions altogether by identifying a subset of sentences called the set of support • Every resolution combines a sentence from set of support with another sentence • add resolvent into set of support • if set of support is small relative to KB, reduce search space significantly • A bad choice for the set of support will make the algorithm incomplete • If the remainder of the sentences are jointly satisfiable • Use the negated query as the set of support • goal-directed : easy to understand • Assumes the KB is consistent

An Set-of-Support Resolution Proof • w  P(w) v Q(w) • x P(x) v R(x) • y  Q(y) v S(y) • x  R(x) v S(x) • S(A) • Q(A) • P(A) • R(A) P(A) contradiction

Simplification Strategies • Elimination of Tautologies • A clause containing P(x) v  P(x) may be removed • Subsumption • eliminate all sentences that are subsumed by an existing sentence in KB • remove P(A), P(A)  Q(B), P(y) v Q(z) if P(x) is in KB • subsumption keeps the KB smaller thus reducing the branching factor

Prolog: a Logic Programming Language • A program is a sequence of sentences • Sentences are represented as Horn clauses, no negated antecedents • The Prolog representation has the consequent, or head, on the left hand side, and the antecedents, or body, on the right. • A query is a conjunction of terms; it is called the goal

Prolog: Example • fatherof(x,y) :- male(x), parentof(x,y). • motherof(x,y) :- female(x), parentof(x,y). • ancesterof(x,y) :- parentof(x,y). • ancesterof(x,y) :- parentof(x,z), ancesterof(z,y). • childof(x,y) :- parentof(y,x). • sonof(x,y) :- male(x), childof(x,y). • daughterof(x,y) :- female(x), childof(x,y). • male(john), male(joe),female(mary), female(jane) • parentof(john,mary), parentof(mary,jane), parentof(joe, john)

How does it work • male(john), male(joe),female(mary), female(jane) • parentof(john,mary), parentof(mary,jane), parentof(joe, john) • How would ancesterof(x,y) be answered? • ancesterof(john,mary) • ancesterof(mary,jane) • ancesterof(joe,john) • ancesterof(john,jane) • . • . • .

How Does it Work? • Queries are answered by the following algorithm: • The first clause with a head that matches is chosen • if it has subgoals, they become new queries, answered in left to right order • when all goals are satisfied, an answer is returned • if another answer is requested, the search continues where it left off • When no more matches are possible with chosen clause, next clause is chosen • This is effectively a depth-first search of the solution tree • note that we need to avoid left recursion!!

The Difference Between Logic and Prolog • Handling of negation: • Prolog handles negation as failure to prove, e.g. • not(fatherof(bill,mike)) will return true in Prolog, but not provable in formal logic • To save time, Prolog often runs without the occurs check. Can cause unsound inferences. • Prolog programs can side-effect as they run, and possibly do computations (e.g. arithmetic) • Clause ordering affects what can be shown; ordering doesn’t matter in logic

Rules in Production Systems • A rule based system • A rule has the format if condition then action “if the condition is met in the world then take the action” • the condition states when the rule applies • the action describes the result of applying the rule

Production System A production system consists of • A knowledge base • Working memory : short-term memory • contains set of positive literals with no variable • Rule memory: long-term memory • contains set of inference rules • A control strategy • A rule interpreter

Forward-Chaining Production Systems The forward-chaining production system loop: Repeat • Match phase • computes subset of rules whose left-hand side is satisfied by current content of working memory • Conflict resolution phase • decide which rule(s) should be executed • Act phase • modifies the knowledge base, usually working memory

Match phase • A unification exercise • c cycles, w elements in WM, r rules with n elements in LHS  c*w*r*n unifications • The RETE algorithm • used in the OPS-5 production system • Construct a RETE network using the working memory and the rule base • eliminate duplication btn. Rules • eliminate duplication over time

Conflict Resolution Phase • More than one rule passes the match phase • Do every actions or select one (some) • Control strategy • No duplication • Recency • Specificity • Mammal(x)  add Legs(x,4) • Mammal(x)  Human(x)  add Legs(x,2) • Operation priority : Action(Dust(p)) v.s. Action(Evacuate)  More on production systems when we discuss uncertain reasoning

Frame Systems & Semantic Networks Whether the language uses strings or nodes and links, and whether it is called a semantic network of a logic, has no effect on its meaning or on its implementation Semantic networks has • Graphical notation easy for people to visualize • very simple execution model easy to predict the behavior of the inference engine simple query language

Some Uses of Logic • Otter theorem prover • current system 4th generation, dating back to the 60’s • first order logic plus equality • used as a proving assistant • To see what’s been done, go to the otter home page!

Intro to AI Resolution & Prolog