Horn clauses
Intro
• A Horn clause is a disjunction of literals of which at most one is positive. E.g. {¬lawyer(x), rich(x)}
• Every Horn clause can be written as an implication whose premise is a conjunction of positive literals and whose conclusion is a single positive literal (or is empty, when the clause has no positive literal). E.g. lawyer(x) ⇒ rich(x)
• Horn clauses with exactly one positive literal are called definite clauses. E.g. the above.
• A definite clause with no negative literals simply asserts a given proposition and is sometimes called a fact. E.g. cat(tuna)
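The definitions above can be checked mechanically. Here is a minimal Python sketch, assuming a clause is represented as a list of (is_positive, atom) pairs; the representation and the name `classify` are illustrative and not taken from the slides.

```python
def classify(clause):
    """Classify a clause given as a list of (is_positive, atom) pairs."""
    positives = sum(1 for is_positive, _ in clause if is_positive)
    if positives > 1:
        return "not Horn"
    if positives == 0:
        return "Horn, no positive literal"
    # Exactly one positive literal: a definite clause.
    return "fact" if len(clause) == 1 else "definite"


print(classify([(False, "lawyer(x)"), (True, "rich(x)")]))  # definite
print(classify([(True, "cat(tuna)")]))                      # fact
print(classify([(True, "p"), (True, "q")]))                 # not Horn
```

A fact is reported separately here only for readability; it is still a definite clause.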
Lawyers Axiomatization
• lawyer(john)
• ∀x lawyer(x) ⇒ rich(x)
• ∀x rich(x) ⇒ ∃y house(x,y)
• ∀x,y rich(x) ∧ house(x,y) ⇒ big(y)
• ∀x,y ( house(x,y) ∧ big(y) ⇒ work(y) )
• Conclusion we want to show: John has at least one house that needs a lot of work, i.e. ∃y (house(john,y) ∧ work(y))
• We will negate it and add it to the database: ¬∃y (house(john,y) ∧ work(y))
Example (concluded)
Here are all the clauses we got through INSEADO (the clausal-form conversion) from the premises and the negated conclusion. All of them are Horn clauses. Clauses 1 to 5 are definite clauses; clause 6, the negated conclusion, has no positive literal.
1. {lawyer(john)}
2. {¬lawyer(x1), rich(x1)}
3. {¬rich(x2), house(x2,houseof(x2))}
4. {¬rich(x3), ¬house(x3,y1), big(y1)}
5. {¬house(x4,y2), ¬big(y2), work(y2)}
6. {¬house(john,y3), ¬work(y3)}
Note: We rename the variables so that there are no variable name clashes between clauses.
Resolution by two finger method
1. {lawyer(john)}
2. {¬lawyer(x1), rich(x1)}
3. {¬rich(x2), house(x2,houseof(x2))}
4. {¬rich(x3), ¬house(x3,y1), big(y1)}
5. {¬house(x4,y2), ¬big(y2), work(y2)}
6. {¬house(john,y3), ¬work(y3)}
7. {rich(john)} 1,2 mgu = {x1←john}
8. {¬lawyer(x2), house(x2,houseof(x2))} 2,3 mgu = {x1←x2}
9. {¬lawyer(x3), ¬house(x3,y1), big(y1)} 2,4
10. {¬rich(x3), big(houseof(x3))} 3,4
11. {¬rich(x2), ¬big(houseof(x2)), work(houseof(x2))} 3,5
12. {…} 4,5
13. {…} 3,6
14. {…} 5,6
… you can continue with the two finger method, but it takes too long.
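Each resolution step above relies on a most general unifier (mgu). The following is a small Python sketch of unification, under assumptions that are not in the slides: variables are strings starting with "?", constants are plain strings, and compound terms or atoms are tuples of the form (functor, arg1, ...). The function names are illustrative.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def walk(t, s):
    """Follow variable bindings in substitution s until an unbound term is reached."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    """True if variable v occurs inside term t (under substitution s)."""
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])


def unify(a, b, s=None):
    """Return a substitution (dict) unifying a and b, or None if there is none."""
    s = dict(s or {})
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        if occurs(a, b, s):
            return None
        s[a] = b
        return s
    if is_var(b):
        return unify(b, a, s)
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None


# Step 7 above: lawyer(x1) against lawyer(john)
print(unify(("lawyer", "?x1"), ("lawyer", "john")))
# Step 10 above: house(x3,y1) against house(x2,houseof(x2))
print(unify(("house", "?x3", "?y1"), ("house", "?x2", ("houseof", "?x2"))))
```

The substitution is kept in triangular form (a binding may point to another variable), so `walk` is needed to read off final values.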
Ordered Resolution Strategy
• Each clause is treated as an ordered set.
• Resolution is permitted only on the first literal of each clause.
• Intuition: To derive the empty clause, every literal must be eliminated. So, work on the first literal until it is gone before starting to work on the other literals.
• The literals in the conclusion preserve the order from the parent clauses, with the literals from the positive parent followed by the literals from the negative parent (i.e., the one with the negated atom).
1. {lawyer(john)}
2. {¬lawyer(x1), rich(x1)}
3. {¬rich(x2), house(x2,houseof(x2))}
4. {¬rich(x3), ¬house(x3,y1), big(y1)}
5. {¬house(x4,y2), ¬big(y2), work(y2)}
6. {¬house(john,y3), ¬work(y3)}
7. {rich(john)} 1,2
8. {house(john,houseof(john))} 3,7
9. {¬house(john,y1), big(y1)} 4,7
10. {¬big(houseof(john)), work(houseof(john))} 5,8
11. {¬work(houseof(john))} 6,8
12. {big(houseof(john))} 8,9
13. {work(houseof(john))} 10,12
14. {} 11,13
Ordered resolution is refutation complete for Horn clauses, but not in general.
Directed Resolution
• A directed clause is a Horn clause in which the positive literal occurs either at the beginning or the end of the clause.
• When we order the clause to have the positive literal at the end, the clause is called a forward clause.
• When we order the clause to have the positive literal at the beginning, the clause is called a backward clause.
• We can use a bit of syntactic sugar: write forward clauses using ⇒ and backward clauses using ⇐.
• E.g.
{¬φ1,…, ¬φn, ψ} can be written as φ1,…, φn ⇒ ψ (forward)
{ψ, ¬φ1,…, ¬φn} can be written as ψ ⇐ φ1,…, φn (backward)
{¬φ1,…, ¬φn} can be written as φ1,…, φn ⇒ (forward)
{¬φ1,…, ¬φn} can be written as ⇐ φ1,…, φn (backward)
Positioning the positive literal
• The possibility of controlling the direction of resolution by positioning the positive literal at one or the other end of a clause raises the question of which direction is more efficient.
Example.
insect(x) ⇒ animal(x)
mammal(x) ⇒ animal(x)
ant(x) ⇒ insect(x)
bee(x) ⇒ insect(x)
spider(x) ⇒ insect(x)
lion(x) ⇒ mammal(x)
tiger(x) ⇒ mammal(x)
zebra(x) ⇒ mammal(x)
Assuming that Zeke is a zebra, is Zeke an animal? I.e. is the following entailed? zebra(zeke) ⇒ animal(zeke)
Negated goal: zebra(zeke) ∧ ¬animal(zeke), which gives the clauses {zebra(zeke)} and {¬animal(zeke)}.
Let’s try considering them forward
1. {¬insect(x), animal(x)}
2. {¬mammal(x), animal(x)}
3. {¬ant(x), insect(x)}
4. {¬bee(x), insect(x)}
5. {¬spider(x), insect(x)}
6. {¬lion(x), mammal(x)}
7. {¬tiger(x), mammal(x)}
8. {¬zebra(x), mammal(x)}
9. {zebra(zeke)}
10. {¬animal(zeke)}
11. {mammal(zeke)} 8,9
12. {animal(zeke)} 2,11
13. {} 10,12
We use ordered resolution. Only three steps needed!!
Let’s try considering them backward
1. {animal(x), ¬insect(x)}
2. {animal(x), ¬mammal(x)}
3. {insect(x), ¬ant(x)}
4. {insect(x), ¬bee(x)}
5. {insect(x), ¬spider(x)}
6. {mammal(x), ¬lion(x)}
7. {mammal(x), ¬tiger(x)}
8. {mammal(x), ¬zebra(x)}
9. {zebra(zeke)}
10. {¬animal(zeke)}
11. {¬insect(zeke)} 1,10
12. {¬mammal(zeke)} 2,10
13. {¬ant(zeke)} 3,11
14. {¬bee(zeke)} 4,11
15. {¬spider(zeke)} 5,11
16. {¬lion(zeke)} 6,12
17. {¬tiger(zeke)} 7,12
18. {¬zebra(zeke)} 8,12
19. {} 9,18
• Now we did 9 steps!!
• So, should we conclude that if we write clauses forward, then resolution will be more efficient?
• No! Look at the next example.
Another example
Consider the following database of information about zebras: Zebras are mammals, striped, and medium in size. Mammals are animals and warm-blooded. Striped things are nonsolid and nonspotted. Things of medium size are neither small nor large.
zebra(x) ⇒ mammal(x)
zebra(x) ⇒ striped(x)
zebra(x) ⇒ medium(x)
mammal(x) ⇒ animal(x)
mammal(x) ⇒ warm(x)
striped(x) ⇒ nonsolid(x)
striped(x) ⇒ nonspotted(x)
medium(x) ⇒ nonsmall(x)
medium(x) ⇒ nonlarge(x)
Assuming that Zeke is a zebra, is Zeke nonlarge? I.e. is the following entailed? zebra(zeke) ⇒ nonlarge(zeke)
Negated goal: zebra(zeke) ∧ ¬nonlarge(zeke), which gives the clauses {zebra(zeke)} and {¬nonlarge(zeke)}.
Let’s try backward resolution
1. {mammal(x), ¬zebra(x)}
2. {striped(x), ¬zebra(x)}
3. {medium(x), ¬zebra(x)}
4. {animal(x), ¬mammal(x)}
5. {warm(x), ¬mammal(x)}
6. {nonsolid(x), ¬striped(x)}
7. {nonspotted(x), ¬striped(x)}
8. {nonsmall(x), ¬medium(x)}
9. {nonlarge(x), ¬medium(x)}
10. {zebra(zeke)}
11. {¬nonlarge(zeke)}
12. {¬medium(zeke)} 9,11
13. {¬zebra(zeke)} 3,12
14. {} 10,13
So, backward resolution needs only three steps!! What about forward resolution?
Let’s try forward resolution
1. {¬zebra(x), mammal(x)}
2. {¬zebra(x), striped(x)}
3. {¬zebra(x), medium(x)}
4. {¬mammal(x), animal(x)}
5. {¬mammal(x), warm(x)}
6. {¬striped(x), nonsolid(x)}
7. {¬striped(x), nonspotted(x)}
8. {¬medium(x), nonsmall(x)}
9. {¬medium(x), nonlarge(x)}
10. {zebra(zeke)}
11. {¬nonlarge(zeke)}
12. {mammal(zeke)} 1,10
13. {striped(zeke)} 2,10
14. {medium(zeke)} 3,10
15. {animal(zeke)} 4,12
16. {warm(zeke)} 5,12
17. {nonsolid(zeke)} 6,13
18. {nonspotted(zeke)} 7,13
19. {nonsmall(zeke)} 8,14
20. {nonlarge(zeke)} 9,14
21. {} 11,20
• Forward resolution needs 10 steps!!
Forward vs. backward resolution
• The fact is that forward resolution is best for some clause sets, and backward resolution is best for others.
• To determine which is best for which, we need to look at the branching factor of the clauses.
• E.g. The search space branches backward in the animal example and forward in the zebra example.
• Consequently, we should use forward resolution in the animal example (3 steps instead of 9) and backward resolution in the zebra example (3 steps instead of 10).
• Of course, things aren’t always this simple. Sometimes, it’s best to use some clauses in the forward direction and others in the backward direction.
• Deciding which clauses to use in which direction is NP-complete.
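The step counts for the two examples can be reproduced with a small Python sketch of ordered resolution. It is deliberately restricted: every predicate is unary, the only variable is `x`, and clauses are saturated level by level until the empty clause appears. The representation (tuples of (is_positive, predicate, argument)) and all function names are assumptions made for illustration.

```python
VAR = "x"  # the only variable used in these unary examples


def rest(clause, value):
    """Literals after the first one, with x replaced by value if the clause is non-ground."""
    tail = clause[1:]
    if clause[0][2] != VAR:
        return tail
    return tuple((sign, pred, value if arg == VAR else arg) for sign, pred, arg in tail)


def resolve_first(c1, c2):
    """Ordered resolution: resolve only on the first literal of each clause."""
    if not c1 or not c2:
        return None
    (s1, p1, a1), (s2, p2, a2) = c1[0], c2[0]
    if s1 == s2 or p1 != p2:
        return None
    if a1 != a2 and VAR not in (a1, a2):
        return None  # two different constants do not unify
    value = a2 if a1 == VAR else a1
    pos, neg = (c1, c2) if s1 else (c2, c1)
    # Literals of the positive parent first, then those of the negative parent.
    return rest(pos, value) + rest(neg, value)


def steps_to_refute(clauses):
    """Number of clauses derived up to and including the empty clause, or None."""
    known, frontier, derived = list(clauses), list(clauses), 0
    while frontier:
        new = []
        for c in frontier:
            for d in known:
                r = resolve_first(c, d)
                if r is None or r in known or r in new:
                    continue
                new.append(r)
                derived += 1
                if r == ():
                    return derived
        known += new
        frontier = new
    return None


def kb(rules, fact, goal, forward):
    """Build the ordered clause set for rules body(x) => head(x), a fact and a negated goal about zeke."""
    clauses = []
    for body, head in rules:
        neg, pos = (False, body, VAR), (True, head, VAR)
        clauses.append((neg, pos) if forward else (pos, neg))
    clauses.append(((True, fact, "zeke"),))
    clauses.append(((False, goal, "zeke"),))
    return clauses


ANIMALS = [("insect", "animal"), ("mammal", "animal"), ("ant", "insect"), ("bee", "insect"),
           ("spider", "insect"), ("lion", "mammal"), ("tiger", "mammal"), ("zebra", "mammal")]
ZEBRAS = [("zebra", "mammal"), ("zebra", "striped"), ("zebra", "medium"), ("mammal", "animal"),
          ("mammal", "warm"), ("striped", "nonsolid"), ("striped", "nonspotted"),
          ("medium", "nonsmall"), ("medium", "nonlarge")]

for name, rules, goal in [("animal", ANIMALS, "animal"), ("zebra", ZEBRAS, "nonlarge")]:
    fwd = steps_to_refute(kb(rules, "zebra", goal, forward=True))
    bwd = steps_to_refute(kb(rules, "zebra", goal, forward=False))
    print(name, "forward:", fwd, "backward:", bwd)
```

Under these assumptions the sketch reports 3 forward and 9 backward steps for the animal example, and 10 forward and 3 backward steps for the zebra example, matching the derivations on the slides.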
Databases and queries
• We will use ordered resolution for fill-in-the-blank queries.
• The query is posed as a conjunction of positive literals, containing some number of variables.
• The database consists entirely of positive ground literals.
• The task is to find bindings for the variables.
• Consider the following DB:
parent(art,john) parent(ann,john) parent(bob,kim) parent(bea,kim) parent(cap,lem) parent(coe,lem)
carpenter(ann) carpenter(cap)
senator(john) senator(kim)
Query: ∃x,y. parent(x,y) ∧ carpenter(x) ∧ senator(y)
SQL: SELECT parent.x, parent.y FROM parent, carpenter, senator WHERE parent.x = carpenter.x AND parent.y = senator.y;
Let’s use ordered resolution…
• First let’s negate the query and write it in clausal form:
¬∃x,y (parent(x,y) ∧ carpenter(x) ∧ senator(y))
{¬parent(x,y), ¬carpenter(x), ¬senator(y)}
• In order to record the bindings of the variables we add an answer literal ans(x,y), which always stays at the end. (Don’t confuse this with forward vs. backward clauses: ans(x,y) is an artificial literal that will never be resolved.)
{¬parent(x,y), ¬carpenter(x), ¬senator(y), ans(x,y)}
• Now let’s use ordered resolution to derive an answer, using the DB shown previously.
Let’s use ordered resolution…
parent(art,john) parent(ann,john) parent(bob,kim) parent(bea,kim) parent(cap,lem) parent(coe,lem)
carpenter(ann) carpenter(cap)
senator(john) senator(kim)
{¬parent(x,y), ¬carpenter(x), ¬senator(y), ans(x,y)}
{¬carpenter(art), ¬senator(john), ans(art,john)}
{¬carpenter(ann), ¬senator(john), ans(ann,john)}
{¬carpenter(bob), ¬senator(kim), ans(bob,kim)}
{¬carpenter(bea), ¬senator(kim), ans(bea,kim)}
{¬carpenter(cap), ¬senator(lem), ans(cap,lem)}
{¬carpenter(coe), ¬senator(lem), ans(coe,lem)}
{¬senator(john), ans(ann,john)}
{¬senator(lem), ans(cap,lem)}
{ans(ann,john)}
From the standpoint of correctness it doesn’t matter if we change the order of the literals and rewrite the query as:
{¬senator(y), ¬carpenter(x), ¬parent(x,y), ans(x,y)}
Efficiency
• From the standpoint of efficiency, one of the key questions is the order of the literals in the query. Suppose we do:
parent(art,john) parent(ann,john) parent(bob,kim) parent(bea,kim) parent(cap,lem) parent(coe,lem)
carpenter(ann) carpenter(cap)
senator(john) senator(kim)
{¬senator(y), ¬carpenter(x), ¬parent(x,y), ans(x,y)}
{¬carpenter(x), ¬parent(x,john), ans(x,john)}
{¬carpenter(x), ¬parent(x,kim), ans(x,kim)}
{¬parent(ann,john), ans(ann,john)}
{¬parent(cap,john), ans(cap,john)}
{ans(ann,john)}
So, 5 steps instead of 9 previously (7 if the two dead-end resolvents of the kim branch, {¬parent(ann,kim), ans(ann,kim)} and {¬parent(cap,kim), ans(cap,kim)}, are also counted).
Efficiency (continued)
• Previously, we gained 4 steps. Well, 4 is not so impressive!
• But let’s consider a real census database with the following properties.
• There are 100 senators.
• There are 100,000 carpenters.
• There are 10,000,000 parent-child pairs.
• If we use {¬parent(x,y), ¬carpenter(x), ¬senator(y), ans(x,y)}, we will have a search space of more than 10,000,000 possibilities.
• However, if we use {¬senator(y), ¬parent(x,y), ¬carpenter(x), ans(x,y)}, the search space will be at most 100*2, because there are only 100 senators and only 2 parents for each senator.
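The effect of literal order can be measured on the small database from the "Databases and queries" slide. The sketch below is a Python approximation, not the resolution procedure itself: it extends partial bindings one query literal at a time and counts every successful match as one resolvent, dead ends included. The names `FACTS` and `evaluate` are illustrative.

```python
FACTS = {
    "parent": [("art", "john"), ("ann", "john"), ("bob", "kim"),
               ("bea", "kim"), ("cap", "lem"), ("coe", "lem")],
    "carpenter": [("ann",), ("cap",)],
    "senator": [("john",), ("kim",)],
}


def evaluate(query):
    """query: list of (predicate, variable names). Returns (resolvents produced, answers)."""
    partial = [{}]  # bindings that satisfy the literals processed so far
    steps = 0
    for pred, args in query:
        extended = []
        for binding in partial:
            for fact in FACTS[pred]:
                candidate = dict(binding)
                # A fact matches if it agrees with every variable that is already bound.
                if all(candidate.setdefault(var, val) == val for var, val in zip(args, fact)):
                    extended.append(candidate)
                    steps += 1
        partial = extended
    return steps, [(b["x"], b["y"]) for b in partial]


parent, carpenter, senator = ("parent", ("x", "y")), ("carpenter", ("x",)), ("senator", ("y",))
print(evaluate([parent, carpenter, senator]))  # parent first
print(evaluate([senator, carpenter, parent]))  # senator first
print(evaluate([senator, parent, carpenter]))  # senator, then parent
```

With parent first the sketch produces 9 resolvents; with senator first it produces 7 for either of the two remaining orders, because it also counts the dead-end clauses from the kim branch. All three orders return the single answer (ann, john).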
Heuristic?!
• Heuristic: Cheapest literal first!
• Unfortunately, the rule doesn’t always produce the optimal ordering.
• E.g. Consider that instead of “parent(x,y)” we have “represents(y,x).”
• For x a variable and a and b constants, suppose that ||represents(a,x)|| = 10,000 and ||represents(x,b)|| = 1.
• Now, the heuristic suggests that we should order as {¬senator(y), ¬represents(y,x), ¬carpenter(x), ans(x,y)}. After y is bound, there are 10,000 possibilities for represents(…). In total, we have 100*10,000 = 1,000,000 possibilities to check.
• The problem is that there is a better ordering: {¬carpenter(x), ¬represents(y,x), ¬senator(y), ans(x,y)}. After x is bound, there is only 1 possibility for represents(…). In total, we have 100,000*1 = 100,000 possibilities to check.
• One way of guaranteeing the optimal ordering is to search exhaustively through all possible orderings (very expensive). Better ways exist…
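The arithmetic above can be replayed with a toy cost model in Python. The model multiplies the estimated number of matches of each literal given the variables already bound. The sizes 100, 100,000, 10,000 and 1 come from the slides; the figure of 1,000,000 for represents(y,x) with nothing bound is an assumption (100 senators times 10,000), and the function names are illustrative.

```python
from itertools import permutations

VARS = {"senator(y)": {"y"}, "carpenter(x)": {"x"}, "represents(y,x)": {"x", "y"}}


def fanout(literal, bound):
    """Estimated number of matches for a literal, given the set of bound variables."""
    if literal == "senator(y)":
        return 1 if "y" in bound else 100
    if literal == "carpenter(x)":
        return 1 if "x" in bound else 100_000
    # represents(y,x)
    if "x" in bound:
        return 1
    if "y" in bound:
        return 10_000
    return 1_000_000  # assumption: total number of represents pairs


def cost(order):
    """Product of fan-outs along the ordering."""
    bound, total = set(), 1
    for literal in order:
        total *= fanout(literal, bound)
        bound |= VARS[literal]
    return total


def cheapest_first():
    """Greedy heuristic: always pick the literal with the smallest current fan-out."""
    remaining, bound, order = list(VARS), set(), []
    while remaining:
        literal = min(remaining, key=lambda l: fanout(l, bound))
        remaining.remove(literal)
        order.append(literal)
        bound |= VARS[literal]
    return order


greedy = cheapest_first()
best = min(permutations(VARS), key=cost)
print(greedy, cost(greedy))
print(list(best), cost(best))
```

Under this model the greedy order starts with senator(y) and costs 1,000,000, while the exhaustive search over the six orderings finds the carpenter-first order with cost 100,000.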
Backward chaining for definite clauses
• Prolog proceeds recursively, in a depth-first fashion.
• This is called “backward chaining.”
1. {zebra(zeke)}
2. {lion(bob)}
3. {tiger(quincy)}
4. {spider(sp)}
5. {animal(x), ¬mammal(x)}
6. {animal(y), ¬insect(y)}
7. {insect(z), ¬spider(z)}
8. {mammal(w), ¬lion(w)}
9. {mammal(v), ¬tiger(v)}
10. {mammal(o), ¬zebra(o)}
Goal: {¬animal(zeke)}
{¬mammal(zeke)}
{¬lion(zeke)} (fails, backtrack)
{¬tiger(zeke)} (fails, backtrack)
{¬zebra(zeke)}
{}
Binding variables
1. {zebra(zeke)}
2. {lion(bob)}
3. {tiger(quincy)}
4. {spider(sp)}
5. {animal(x), ¬mammal(x)}
6. {animal(y), ¬insect(y)}
7. {insect(z), ¬spider(z)}
8. {mammal(w), ¬lion(w)}
9. {mammal(v), ¬tiger(v)}
10. {mammal(o), ¬zebra(o)}
Goal: {¬animal(t)}
{¬mammal(x)} subst {t←x}
{¬lion(w)} subst {x←w}
{} subst {w←bob}, i.e. t←bob
{¬tiger(v)} subst {x←v}
{} subst {v←quincy}, i.e. t←quincy
…
Prolog
• Prolog uses backward chaining as its inference procedure.
• Very fast!!
• Uses the implication (with symbol :-) for the clauses. Commas mean “and.”
• Constants start with a lower-case letter, while variables start with an upper-case letter.
• Variables are implicitly universally quantified.
• All sentences end with a period.
zebra(zeke).
lion(bob).
tiger(quincy).
spider(sp).
animal(X) :- mammal(X).
animal(Y) :- insect(Y).
insect(Z) :- spider(Z).
mammal(W) :- lion(W).
mammal(V) :- tiger(V).
mammal(O) :- zebra(O).
SWI-Prolog
• Create a file “zebra.pl” containing the program above.
• Double click on it to open SWI-Prolog, which also loads the file.
• Then, you can ask queries. E.g. (type ; to see more bindings)
?- animal(X).
X = bob ;
X = quincy ;
X = zeke ;
X = sp
Yes
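To see why the answers come out in this order, here is a minimal backward chainer in Python that mimics Prolog's strategy: leftmost goal first, clauses tried from top to bottom, and every clause renamed before use. It is a sketch under stated assumptions, not how SWI-Prolog is implemented: atoms are tuples such as ("animal", "X"), strings starting with an upper-case letter are variables, and there is no occurs check. All names are illustrative.

```python
import itertools

_fresh = itertools.count()


def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    """Extend substitution s so that a and b become equal, or return None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None


def rename(clause):
    """Give the clause fresh variable names (standardizing apart)."""
    n = next(_fresh)

    def r(t):
        if is_var(t):
            return f"{t}_{n}"
        return tuple(r(x) for x in t) if isinstance(t, tuple) else t

    head, body = clause
    return r(head), [r(g) for g in body]


def solve(goals, s, program):
    """Depth-first, left-to-right search; yields one substitution per proof."""
    if not goals:
        yield s
        return
    first, rest = goals[0], goals[1:]
    for clause in program:
        head, body = rename(clause)
        s2 = unify(first, head, s)
        if s2 is not None:
            yield from solve(body + rest, s2, program)


PROGRAM = [
    (("zebra", "zeke"), []), (("lion", "bob"), []),
    (("tiger", "quincy"), []), (("spider", "sp"), []),
    (("animal", "X"), [("mammal", "X")]),
    (("animal", "Y"), [("insect", "Y")]),
    (("insect", "Z"), [("spider", "Z")]),
    (("mammal", "W"), [("lion", "W")]),
    (("mammal", "V"), [("tiger", "V")]),
    (("mammal", "O"), [("zebra", "O")]),
]

answers = [walk("X", s) for s in solve([("animal", "X")], {}, PROGRAM)]
print(answers)
```

The mammal rules are tried before the insect rule because animal(X) :- mammal(X) comes first in the program, which is why sp is reported last.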
Prolog Search Trees
• The simplest cases are ground queries, those that do not contain variables, e.g.
(1) book(principia)?
(2) wrote(gottlob,begriffsschrift)?
• Prolog goes through the data base, starting at the top, trying to match the query with a fact listed there (in order to apply resolution).
• If a matching fact is found, Prolog replies Yes, otherwise No.
• If the ground query is complex (or conjunctive), like (3), Prolog tries to match one goal after the other, going from left to right.
(3) book(principia),wrote(gottlob,begriffsschrift)?
• For queries that contain variables, Prolog also records the variable binding under which a given goal matches a certain fact listed in the data base.
wrote(terry,shrdlu).
wrote(bill,lunar).
wrote(roger,sam).
wrote(gottlob,begriffsschrift).
wrote(bertrand,principia).
wrote(alfred,principia).
book(begriffsschrift).
book(principia).
program(lunar).
program(sam).
program(shrdlu).
This tree is to be read from top to bottom and from left to right as follows. “When Prolog is presented with the conjunctive query book(What),wrote(bertrand,What), it first finds a clause in the data base matching book(What), namely under the assignment What=begriffsschrift. The first goal, which is now book(begriffsschrift), is therefore taken care of, as indicated by crossing out book(begriffsschrift). Also, the assignment What=begriffsschrift has turned the second goal into the ground goal wrote(bertrand,begriffsschrift). But since this goal does not appear in the data base, it cannot be crossed out, and so Prolog does not report the variable assignment What=begriffsschrift, as this answer is not correct according to the data base.”
So Prolog backtracks to the initial query in the topmost box, trying to find an alternative match for book(What) in the data base. Indeed, Prolog finds another clause that matches, this time under the assignment What=principia. The first goal is again crossed out. Also, the assignment What=principia has turned the second goal into the ground goal wrote(bertrand,principia). This goal does appear in the data base, so it can be crossed out, too. Therefore, Prolog reports the variable assignment What=principia.
Prolog may backtrack to find additional answers to a given query – provided the user types a semicolon. If every goal in a leaf box is crossed out, then Prolog has found an answer. The answer consists of all the assignments that appear on branches going from the leaf to the top.
Goal order
• The order of goals can have an effect on the size of the search tree.
• Suppose, for example, we reverse the order of goals in the previous example.
• In that case, we get a somewhat smaller search tree.
Recursion
parent(katherine,bertrand).
parent(bertrand,kate).
parent(kate,sue).
ancestor(Old,Young) :- parent(Old,Young).
ancestor(Old,Young) :- parent(Old,Middle), ancestor(Middle,Young).
The second ancestor rule exhibits recursion.
Variable renaming (standardizing variables apart)
• The previous search tree reveals another feature of Prolog’s processing of queries, namely renaming of variables.
• The second time Prolog uses the recursive ancestor rule, it must change the name of the variable Middle occurring in that rule. Without doing so, Prolog would end up recording that Middle=bertrand and Middle=kate, which is a plain contradiction!
• Identical variables within a clause stand for the same objects.
• But variables in different clauses, even if they look the same, stand for possibly different objects.
• Moreover, identical variables in different applications of the same recursive rule may stand for different objects.
• So the Middle variable in the recursive ancestor rule stands for bertrand in the first application and for kate in the second. Variable renaming ensures that this is possible.
Rule ordering (again)
• ancestor(Old,Young) :- parent(Old,Middle), ancestor(Middle,Young).
• ancestor(Old,Young) :- parent(Old,Young).
• For this program, reversing the rule order has no effect on the answers Prolog gives (except possibly for the order in which multiple answers are reported).
• However, reversing the rule order has an effect on the search tree.
• Reversing the rule order reverses the left-to-right ordering of the boxes in which the goals appear.
• We get the mirror-image search tree.
Clause ordering
• The order of the goals in the body of a rule is of greater importance than the order of the rules themselves.
• Let’s reverse the order of the goals in the body of the recursive rule.
• ancestor(Old,Young) :- ancestor(Middle,Young), parent(Old,Middle).
• ancestor(Old,Young) :- parent(Old,Young).
• Prolog won’t terminate!! The recursive rule now calls ancestor again before any parent fact has been consulted.
Prolog as a theorem prover
• Prolog can prove some theorems by luck…
• E.g. Suppose we load this database:
link(a,b).
link(b,c).
path(X,Y) :- link(X,Y).
path(X,Y) :- path(X,Z), path(Z,Y).
And the query is: path(a,c).
We will have:
1. {link(a,b)}
2. {link(b,c)}
3. {path(X,Y), ¬link(X,Y)}
4. {path(X,Y), ¬path(X,Z), ¬path(Z,Y)}
Goal: {¬path(a,c)}
{¬link(a,c)} (using 3) Stuck! Try next clause.
{¬path(a,Z), ¬path(Z,c)} (using 4)
{¬link(a,Z), ¬path(Z,c)} (using 3)
{¬path(b,c)} (using 1) subst = {Z←b}
{¬link(b,c)} (using 3)
{} (using 2)
Ok, we proved it!
Prolog as a theorem prover
• Prolog is not complete as a theorem prover.
• Suppose we write the previous database with the order of the path clauses changed.
link(a,b).
link(b,c).
path(X,Y) :- path(X,Z), path(Z,Y).
path(X,Y) :- link(X,Y).
And the query is the same: path(a,c).
Now, we will have:
1. {link(a,b)}
2. {link(b,c)}
3. {path(X,Y), ¬path(X,Z), ¬path(Z,Y)}
4. {path(X,Y), ¬link(X,Y)}
Goal: {¬path(a,c)}
{¬path(a,Z), ¬path(Z,c)} (using 3)
{¬path(a,Z′), ¬path(Z′,Z), ¬path(Z,c)} (using 3)
{¬path(a,Z″), ¬path(Z″,Z′), ¬path(Z′,Z), ¬path(Z,c)} (using 3)
…forever
So, by a simple rearrangement of clauses we can’t prove it anymore: the recursive clause is always tried first, and clause 4 is never reached.
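The sensitivity to clause order can be reproduced with a small depth-first backward chainer in Python. This is a sketch under assumptions, not Prolog itself: atoms are tuples such as ("path", "X", "Y"), upper-case strings are variables, and a depth limit (`limit`, a made-up parameter) stands in for "runs forever" by raising a hypothetical `DepthExceeded` exception.

```python
import itertools

_fresh = itertools.count()


class DepthExceeded(Exception):
    """Raised when a branch gets deeper than the limit (a stand-in for non-termination)."""


def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None


def rename(clause):
    n = next(_fresh)
    head, body = clause
    r = lambda atom: tuple(f"{t}_{n}" if is_var(t) else t for t in atom)
    return r(head), [r(g) for g in body]


def solve(goals, s, program, depth, limit):
    """Leftmost goal first, clauses in program order, as Prolog does."""
    if not goals:
        yield s
        return
    if depth > limit:
        raise DepthExceeded
    for clause in program:
        head, body = rename(clause)
        s2 = unify(goals[0], head, s)
        if s2 is not None:
            yield from solve(body + goals[1:], s2, program, depth + 1, limit)


def proves(program, query, limit=50):
    """True if a proof is found, False if the search fails, None if the limit is hit first."""
    try:
        return next(solve([query], {}, program, 0, limit), None) is not None
    except DepthExceeded:
        return None


LINKS = [(("link", "a", "b"), []), (("link", "b", "c"), [])]
BASE = (("path", "X", "Y"), [("link", "X", "Y")])
RECURSIVE = (("path", "X", "Y"), [("path", "X", "Z"), ("path", "Z", "Y")])

print(proves(LINKS + [BASE, RECURSIVE], ("path", "a", "c")))  # link rule first
print(proves(LINKS + [RECURSIVE, BASE], ("path", "a", "c")))  # recursive rule first
```

With the link rule first the query is proved; with the recursive rule first the first branch only grows, so the sketch gives up at the depth limit without ever trying the link rule.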