This article discusses the limitations of current first-order theorem provers and introduces OSHL, a propositional prover that uses semantics to attack hard problems. It covers unit resolution and general resolution, the role of human interaction in theorem proving, a DPLL example, and strategy selection in existing provers.
OSHL: A Propositional Prover with Semantics for First-Order Logic • David A. Plaisted, UNC Chapel Hill
Current theorem provers • Largely syntactic • Resolution or ME (tableau) based • First-order provers are often poor on non-Horn clauses • Rarely can solve hard problems • Human interaction needed for hard problems
Unit Resolution and General Resolution • Resolution is efficient for Horn and renameable Horn problems. • Resolution is efficient if the proof can be found by UR resolution. • Hard problems tend not to be Horn, renameable Horn, or UR resolvable. • Of 1697 TPTP problems provable by Otter in 30 seconds, 1042 can be proved by UR resolution.
Unit Resolution and General Resolution • Of the 1697 problems provable by Otter, only 297 were both non-Horn and rated above zero. • Of these 297, at most 215 are not UR resolvable. • Otter can do hundreds of thousands of resolutions in 30 seconds on this machine. • Resolution is inefficient on hard problems that are not UR resolvable. • New approaches are needed.
How do humans prove theorems? • Semantics • Case analysis • Sequential search through space of possible structures • Focus on the theorem
“Systematic methods can now routinely solve verification problems with thousands or tens of thousands of variables, while local search methods can solve hard random 3SAT problems with millions of variables.” (from a conference announcement)
DPLL Example • Clause set: {p,r}, {¬p,q,r}, {p,¬r} • Branch p=T: {T,r}, {¬T,q,r}, {T,¬r}, which simplifies to {q,r} • Branch p=F: {F,r}, {¬F,q,r}, {F,¬r}, which simplifies to {r}, {¬r} and then to {}
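The branching and simplification steps in the DPLL example can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by any particular prover. The integer encoding of literals (p=1, q=2, r=3, with a negative number for a negated atom), the signs chosen for the example clauses, and the function names `simplify` and `dpll` are all assumptions made for the sketch.

```python
def simplify(clauses, lit):
    """Make lit true: drop satisfied clauses, delete the complementary literal."""
    out = set()
    for clause in clauses:
        if lit in clause:
            continue  # clause is satisfied, so it disappears
        out.add(clause - {-lit})  # the false literal is removed
    return out


def dpll(clauses, assignment=None):
    """Return a satisfying assignment (dict atom -> bool) or None."""
    assignment = dict(assignment or {})
    if not clauses:
        return assignment  # every clause has been satisfied
    if frozenset() in clauses:
        return None  # the empty clause {} was derived on this branch
    # Unit propagation: a one-literal clause forces its literal.
    for clause in clauses:
        if len(clause) == 1:
            (lit,) = clause
            assignment[abs(lit)] = lit > 0
            return dpll(simplify(clauses, lit), assignment)
    # Case split on the smallest remaining atom: try true, then false.
    atom = min(abs(lit) for clause in clauses for lit in clause)
    for lit in (atom, -atom):
        result = dpll(simplify(clauses, lit), {**assignment, atom: lit > 0})
        if result is not None:
            return result
    return None


# Assumed encoding of the slide's clause set: {p,r}, {not p, q, r}, {p, not r}.
example = {frozenset({1, 3}), frozenset({-1, 2, 3}), frozenset({1, -3})}
```

Calling `simplify(example, 1)` leaves the single clause {q,r}, while `simplify(example, -1)` leaves {r} and {¬r}, from which `dpll` derives the empty clause.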
Eliminating Duplication with the Hyper-Linking Strategy, Shie-Jue Lee and David A. Plaisted, Journal of Automated Reasoning 9 (1992) 25-42.
Replacement Rules with Definition Detection, David A. Plaisted and Yunshan Zhu, in Caferra and Salzer, eds., Automated Deduction in Classical and Non-Classical Logics, LNAI 1761 (1998) 80-94.
More Definitions • S1 ∪ S2 ∪ … ∪ Sn = Sn ∪ Sn-1 ∪ … ∪ S1 • Left associative
More Definitions • Similar results for other definitions: • S1 ∪ S2 ∪ … ∪ Sn = Sn ∪ Sn-1 ∪ … ∪ S1, left side left associated, right side right associated • S1 ∪ S2 ∪ … ∪ Sn = S1 ∪ S2 ∪ … ∪ Sn ∪ S1 ∪ S2 ∪ … ∪ Sn, both sides associated to the left • S1 ∪ S2 ∪ … ∪ Sn = S1 ∪ S2 ∪ … ∪ Sn ∪ S1 ∪ S2 ∪ … ∪ Sn, left side left associated, right side right associated • Similar results for ∩
Later propositional strategies • Billon’s disconnection calculus, derived from hyper-linking • Disconnection calculus theorem prover (DCTP), derived from Billon’s work • FDPLL
Performance of DCTP on TPTP, 2003 • DCTP 1.3 first in EPS and EPR (largely propositional) • DCTP 10.2p third in FNE (first-order, no equality) solving same number as best provers • DCTP 10.2p fourth in FOF and FEQ (all first-order formulae, and formulae with equality) • DCTP 1.3 is a single strategy prover.
Strategy Selection • Schulz, Stephan, E - A Brainiac Theorem Prover, Journal of AI Communications 15(2/3):111-126, 2002.
Strategy Selection • The Vampire kernel provides a fairly large number of features for strategy selection. The most important ones are: • Choice of the main saturation procedure : (i) OTTER loop, with or without the Limited Resource Strategy, (ii) DISCOUNT loop. • A variety of optional simplifications. • Parameterised reduction orderings. • A number of built-in literal selection functions and different modes of comparing literals. • Age-weight ratio that specifies how strongly lighter clauses are preferred for inference selection. • Set-of-support strategy.
Strategy Selection • The automatic mode of Vampire 7.0 is derived from extensive experimental data obtained on problems from TPTP v2.6.0. Input problems are classified taking into account simple syntactic properties, such as being Horn or non-Horn, presence of equality, etc. Additionally, we take into account the presence of some important kinds of axioms, such as set theory axioms, associativity and commutativity. Every class of problems is assigned a fixed schedule consisting of a number of kernel strategies called one by one with different time limits.
Various Provers • PTTP solved 999 of 2200 tested problems. • Otter proved 1595. • leanCoP proved 745. • Source: Jens Otten and Wolfgang Bibel. leanCoP: Lean Connection-Based Theorem Proving. Journal of Symbolic Computation, Volume 36, pages 139-161. Elsevier Science, 2003. • Vampire 6.0: 3286 refutations of 7267 problems, more solved
DCTP Strategy Selection • DCTP 1.31 has been implemented as a monolithic system in the Bigloo dialect of the Scheme language. • DCTP 1.31 is a single-strategy prover. Individual strategies are started by DCTP 10.21p using the schedule-based resource allocation scheme known from the E-SETHEO system. Different schedules have been precomputed for the syntactic problem classes. The problem classes are more or less identical to the sub-classes of the competition organisers. • In CASC-J2 DCTP 10.21p performed substantially better.
Semantics • Gelernter 1959 Geometry Theorem Prover • Adapt semantics to clause form: • An interpretation (semantics) I is an assignment of truth values to literals so that I assigns opposite truth values to L and ¬L for atoms L. • The literals L and ¬L are said to be complementary.
Semantics • We write I ⊨ C (I satisfies C) to indicate that semantics I makes the clause C true. • If C is a ground clause then I satisfies C if I satisfies at least one of its literals. • Otherwise I satisfies C if I satisfies all ground instances D of C. (Herbrand interpretations.) • If I does not satisfy C then we say I falsifies C.
Example Semantics • Specify I by interpreting symbols • Interpret predicate p(x,y) as x = y • Interpret function f(x,y) as x + y • Interpret a as 1, b as 2, c as 3 • Then p(f(a,b),c) interprets to TRUE but p(a,b) interprets to FALSE • Thus I satisfies p(f(a,b),c) but I falsifies p(a,b)
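The example semantics can be written out as a small evaluator. This is only an illustrative sketch: the nested-tuple representation of terms and atoms and the helper names `eval_term` and `satisfies_atom` are assumptions, while the interpretation of p, f, a, b and c is taken from the slide.

```python
def eval_term(term, funcs, consts):
    """Evaluate a ground term: a constant name or (function_name, arg, ...)."""
    if isinstance(term, str):
        return consts[term]
    name, *args = term
    return funcs[name](*(eval_term(arg, funcs, consts) for arg in args))


def satisfies_atom(atom, preds, funcs, consts):
    """True if the interpretation makes the ground atom (pred, arg, ...) true."""
    name, *args = atom
    return preds[name](*(eval_term(arg, funcs, consts) for arg in args))


# The interpretation I from the slide.
consts = {"a": 1, "b": 2, "c": 3}
funcs = {"f": lambda x, y: x + y}   # f(x,y) is x + y
preds = {"p": lambda x, y: x == y}  # p(x,y) is x = y

p_fab_c = ("p", ("f", "a", "b"), "c")  # p(f(a,b),c): 1 + 2 = 3
p_a_b = ("p", "a", "b")                # p(a,b): 1 = 2
```

Here `satisfies_atom(p_fab_c, preds, funcs, consts)` is true and `satisfies_atom(p_a_b, preds, funcs, consts)` is false, matching the slide.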
Obtaining Semantics • Humans using mathematical knowledge • Automatic methods (finite models) • Trivial semantics
Goal of OSHL • First-order logic • Clause form • Propositional efficiency • Semantics • Requires ground decidability
Structure of OSHL • Goal sensitivity if semantics chosen properly • Choose initial semantics to satisfy axioms • Use of natural semantics • For group theory problems, can specify a group • Sequential search through possible interpretations • Thus similar to Davis and Putnam’s method • Propositional Efficiency • Constructs a semantic tree
Ordered Semantic Hyperlinking (OSHL) • [Figure: a sequence of interpretations I0, I1, I2, I3, … and instances D0, D1, D2, … collected in T until T is unsatisfiable] • Reduce first-order logic problem to propositional problem • Imports propositional efficiency into first-order logic • The algorithm • Imposes an ordering on clauses • Progresses by generating instances and refining interpretations
OSHL • I0 is specified by the user • Di is chosen minimal so that Ii falsifies Di • Di is an instance of a clause in S • Ii is chosen minimal so that Ii satisfies Dj for all j < i • Let Ti be {D0,D1, …, Di-1}. • Ii falsifies Di but satisfies Ti • When Ti is unsatisfiable OSHL stops and reports that S is unsatisfiable.
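The loop described above can be sketched for the purely propositional case. The sketch makes several simplifying assumptions that are not in the slides: S is already a finite set of ground clauses (so instance generation reduces to picking a falsified clause), literals are nonzero integers, atoms are ordered by their number, "minimal" Di means fewest literals, and the minimal model is found by brute-force enumeration. The names `oshl_ground` and `minimal_model` are hypothetical.

```python
from itertools import product


def holds(interp, lit):
    """True if the interpretation (dict atom -> bool) satisfies the literal."""
    return interp[abs(lit)] == (lit > 0)


def satisfies(interp, clause):
    """A ground clause is satisfied if at least one of its literals is."""
    return any(holds(interp, lit) for lit in clause)


def minimal_model(i0, t):
    """Model of t that agrees with i0 as long as possible on the atom order."""
    atoms = sorted({abs(lit) for clause in t for lit in clause})
    # product varies the last position fastest, so candidates are enumerated
    # with the smallest atom most significant and "agree with i0" tried first.
    for flips in product((False, True), repeat=len(atoms)):
        cand = dict(i0)
        for atom, flip in zip(atoms, flips):
            if flip:
                cand[atom] = not i0[atom]
        if all(satisfies(cand, clause) for clause in t):
            return cand
    return None  # no assignment to the atoms of t works: t is unsatisfiable


def oshl_ground(s, i0):
    """Return ("unsatisfiable", T) or ("satisfiable", model) for ground S."""
    t = []
    interp = dict(i0)
    while True:
        falsified = [c for c in s if not satisfies(interp, c)]
        if not falsified:
            return "satisfiable", interp
        # Choose D_i minimal among the clauses falsified by I_i.
        d = min(falsified, key=lambda c: (len(c), sorted(map(abs, c))))
        t.append(d)
        interp = minimal_model(i0, t)
        if interp is None:
            return "unsatisfiable", t
```

Each chosen clause is falsified by an interpretation that satisfies everything already in T, so no clause is added twice and the loop terminates on a finite S. The initial interpretation `i0` must assign a value to every atom occurring in S.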
Clause Ordering • ||L||lin • ||P(f(x),g(x,c))||lin = 6 • ||L||dag • ||P(f(x),f(x))||dag = 4 • Extend to clauses additively, ignoring negations • OSHL chooses Di minimal in such an ordering
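The linear size measure can be illustrated as follows, assuming that ||L||lin counts every symbol occurrence in the atom; this assumption agrees with the value 6 given for P(f(x),g(x,c)) but is not stated on the slide. The dag measure is not sketched here because the slide does not say how shared subterms are counted. The nested-tuple term representation and the names `lin_size` and `clause_lin_size` are hypothetical.

```python
def lin_size(term):
    """Count symbol occurrences in a term given as a name or (name, arg, ...)."""
    if isinstance(term, str):
        return 1  # a variable or constant counts once
    _name, *args = term
    return 1 + sum(lin_size(arg) for arg in args)


def clause_lin_size(atoms):
    """Extend additively to a clause, given the atoms of its literals."""
    return sum(lin_size(atom) for atom in atoms)


atom = ("P", ("f", "x"), ("g", "x", "c"))  # P(f(x),g(x,c))
```

Because negations are ignored, a clause is passed to `clause_lin_size` as the list of its atoms.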
Alternate version of OSHL • Want to keep the size of T small • Do this by throwing away clauses of T subject to the condition: • The minimal model of Ti+1 is larger than the minimal model of Ti for all i. • This guarantees completeness. • Leads to a formulation using sequences of clauses and resolutions between clauses.
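Rules of OSHL • Start with the empty sequence • From (C1,C2, …, Cn), where I is the minimal model of the sequence and D is minimal among instances contradicting I, derive (C1,C2, …, Cn,D) • From (C1,C2, …, Cn,D), where Cn is not needed, derive (C1,C2, …, Cn-1,D) • From (C1,C2, …, Cn,D), where a max resolution is possible, derive (C1,C2, …, Cn-1,res(Cn,D,L)) • Proof if the empty clause is derived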
Propositional Example (I0 ⊨ p for every atom p) • () • ({-p1, -p2, -p3}), I0[-p3] • ({-p1, -p2, -p3}, {-p4, -p5, -p6}), I0[-p3,-p6] • ({…}, {…}, {-p7}), I0[-p3,-p6,-p7] • ({…}, {…}, {-p7}, {p3, p7}) • ({…}, {-p4, -p5, -p6}, {p3}) • ({-p1, -p2, -p3}, {p3}) • ({-p1, -p2}), I0[-p2]
Semantics • Trivial semantics: • Positive: Choose I0 to falsify all atoms, first D is all positive. Forward chaining. • Negative: Choose I0 to satisfy all atoms, first D is all negative. Backward chaining. • Natural semantics: I0 chosen by user
Semantics Ordering • <t a well founded ordering on atoms, extended to literals • Extend <t to interpretations as follows: • I and J agree on L if they interpret L the same • Suppose I0 is given • I <t J if I and J are not identical, A is the minimal atom on which they disagree, and I agrees with I0 on A
Semantics Ordering • <t is not a well-founded ordering on interpretations, but <t-minimal models of T always exist. • Ii is always chosen as the <t-minimal model of T. • Theorem: Such Ii always has the form I0[L1 … Lm] where the Li are literals of clauses of T. • I0[L1 … Lm] ⊨ L iff at(L) ∉ {at(L1), …, at(Lm)} and I0 ⊨ L, or L = Li for some i.
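The truth condition for an updated interpretation I0[L1 … Lm] translates directly into code. This is a sketch under assumed conventions: literals are nonzero integers, at(L) is the absolute value, I0 is given as a function from literals to truth values, and the name `updated_satisfies` is hypothetical.

```python
def updated_satisfies(i0_holds, overrides, lit):
    """Does I0[L1 ... Lm] satisfy lit, where overrides is the list L1 ... Lm?"""
    override_atoms = {abs(l) for l in overrides}
    if abs(lit) in override_atoms:
        # The atom of lit is overridden: lit is true only if it is some Li.
        return lit in overrides
    # Otherwise fall back to the initial interpretation I0.
    return i0_holds(lit)


def all_atoms_true(lit):
    """An initial interpretation that satisfies every atom (assumed example)."""
    return lit > 0
```

With `all_atoms_true` as I0 and overrides `[-3, -6]`, the literal -3 is satisfied, 3 is falsified, and every atom not mentioned in the overrides keeps its value under I0.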
Instantiation Example • Suppose I0 interprets arithmetic in the standard way. • Suppose S contains axioms of arithmetic and the clause X+3 ≠ 5. • Then the first instance chosen could be 2+3 ≠ 5, (1+1)+3 ≠ 5, (3-1)+3 ≠ 5 et cetera, but it could not be 3+3 ≠ 5, nor could it be an instance of an axiom.
Instantiation Example • Suppose the first instance chosen is 2+3 ≠ 5. • Then I1 is I0[2+3 ≠ 5], which interprets all atoms as in standard arithmetic except that the statement 2+3 ≠ 5 is true. • The next instance chosen might be 2+3-1 = 5-1 → 2+3 = 5. This contradicts I1. It is an instance of the clause X-1 = Y-1 → X = Y and corresponds to generating the subgoal 2+3-1 = 5-1.
U Rules • Choose clause instances to match existing literals. Look for a contradiction. • Basic clauses and U clauses • Basic clauses are used in the three rules given • The sequence can also have U clauses on the end • U clauses have a selected literal • In basic clauses the maximal literal is selected • In U clauses other literals can be selected. • Significant performance enhancement.
U Rules • UR resolution: Find C in S having a ground UR resolvent with selected literals. Let C' be the corresponding instance of C. Add C' to the end of the sequence of clauses and select the UR resolvent from it. • Filtering: Find C in S such that NIL is derivable by unit resolution from selected literals and C. Let C' be the corresponding instance of C. Add C' to the end of the sequence of clauses. Select a literal from it.
U Rules • Case Analysis: Find C in S and L in C such that L has all the variables of C. Find instance L' of L that is complementary to a selected literal of some clause in the sequence. Let C' be the corresponding instance of C. Add C' to the end of the sequence and select a literal from it. • This rule expands definitions.
Examples of U Rules • UR resolution: Given the sequence ({s(a), p(b)}, {t(a), q(b)}) and the clause {not p(X), not q(X), r(X)} create the sequence ({s(a), p(b)}, {t(a), q(b)}, {not p(b), not q(b), r(b)} ) • Filtering: Given the sequence ({s(a), p(b)}, {t(a), q(b)}) and the clause {not p(X), not q(X)} create the sequence ({s(a), p(b)}, {t(a), q(b)}, {not p(b), not q(b)} )
Examples of U Rules • Case analysis: Given the sequence ({s(a), p(b)}, {t(a), q(b)}) and the clause {not q(X), r(X), s(X)} create the sequence ({s(a), p(b)}, {t(a), q(b)}, {not q(b), r(b), s(b)} )
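The UR resolution example can be reproduced with a small matcher. This is a sketch, not the OSHL implementation: it assumes that the selected literals of the sequence are p(b) and q(b), that arguments are plain constants or variables (names starting with an upper-case letter are variables), and it uses hypothetical helper names such as `ur_resolvents`.

```python
def is_var(symbol):
    """Assumed convention: variables start with an upper-case letter."""
    return symbol[0].isupper()


def match_complement(lit, unit, subst):
    """Extend subst so that lit becomes the complement of the ground unit."""
    pos, pred, args = lit
    upos, upred, uargs = unit
    if pos == upos or pred != upred or len(args) != len(uargs):
        return None
    out = dict(subst)
    for arg, ground in zip(args, uargs):
        if is_var(arg):
            if out.setdefault(arg, ground) != ground:
                return None  # variable already bound to something else
        elif arg != ground:
            return None
    return out


def instantiate(lit, subst):
    pos, pred, args = lit
    return (pos, pred, tuple(subst.get(arg, arg) for arg in args))


def ur_resolvents(clause, units):
    """Return (ground instance C', UR resolvent) pairs for clause and units."""
    results = []
    for keep in range(len(clause)):
        others = clause[:keep] + clause[keep + 1:]

        def extend(i, subst):
            if i == len(others):
                inst = [instantiate(lit, subst) for lit in clause]
                # Only ground instances are accepted.
                if all(not is_var(a) for _, _, args in inst for a in args):
                    results.append((inst, inst[keep]))
                return
            for unit in units:
                new_subst = match_complement(others[i], unit, subst)
                if new_subst is not None:
                    extend(i + 1, new_subst)

        extend(0, {})
    return results


# Literals are (is_positive, predicate, arguments).
selected = [(True, "p", ("b",)), (True, "q", ("b",))]
clause = [(False, "p", ("X",)), (False, "q", ("X",)), (True, "r", ("X",))]
```

For the clause {not p(X), not q(X), r(X)}, the only result is the instance {not p(b), not q(b), r(b)} with r(b) as the UR resolvent, which is the clause appended to the sequence in the example.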
Example Proof Using U Rules • All positive semantics • Clauses: • A1. X ⊈ Y, Y ⊈ X, X = Y • A2. Z ∉ X, X ⊈ Y, Z ∈ Y • A3. g(X,Y) ∈ X, X ⊆ Y • A4. g(X,Y) ∉ Y, X ⊆ Y • A5. Z ∉ X, Z ∈ X ∪ Y • A6. Z ∉ Y, Z ∈ X ∪ Y • A7. Z ∉ X ∪ Y, Z ∈ X, Z ∈ Y • T. A ∪ B ≠ B ∪ A
Example Proof Using U Rules • 1. {A ∪ B ≠ B ∪ A} (T) • 2. {A ∪ B ⊈ B ∪ A, B ∪ A ⊈ A ∪ B, A ∪ B = B ∪ A} (Case Analysis, A1) • 3. {g(A ∪ B, B ∪ A) ∉ B ∪ A, A ∪ B ⊆ B ∪ A} (UR resolution, A4) • 4. {g(A ∪ B, B ∪ A) ∈ B ∪ A, g(…) ∉ B} (UR resolution, A5) • 5. {g(A ∪ B, B ∪ A) ∈ B ∪ A, g(…) ∉ A} (UR resolution, A6) • 6. {g(…) ∈ B, g(…) ∈ A, g(…) ∉ A ∪ B} (UR resolution, A7) • 7. {A ∪ B ⊆ B ∪ A, g(…) ∈ A ∪ B} (Filtering, A3)
Example Proof Using U Rules • 1. {A ∪ B ≠ B ∪ A} • 2. {A ∪ B ⊈ B ∪ A, B ∪ A ⊈ A ∪ B, A ∪ B = B ∪ A} (Case Analysis) • 3. {g(A ∪ B, B ∪ A) ∉ B ∪ A, A ∪ B ⊆ B ∪ A} (UR resolution) • 4. {g(A ∪ B, B ∪ A) ∈ B ∪ A, g(…) ∉ B} (UR resolution) • 5. {g(A ∪ B, B ∪ A) ∈ B ∪ A, g(…) ∉ A} (UR resolution) • 8. {g(…) ∈ B, g(…) ∈ A, A ∪ B ⊆ B ∪ A} (Resolution of 6. and 7.)
Example Proof Using U Rules • 1. {A ∪ B ≠ B ∪ A} • 2. {A ∪ B ⊈ B ∪ A, B ∪ A ⊈ A ∪ B, A ∪ B = B ∪ A} (Case Analysis) • 3. {g(A ∪ B, B ∪ A) ∉ B ∪ A, A ∪ B ⊆ B ∪ A} (UR resolution) • 4. {g(A ∪ B, B ∪ A) ∈ B ∪ A, g(…) ∉ B} (UR resolution) • 9. {g(A ∪ B, B ∪ A) ∈ B ∪ A, g(…) ∈ B, A ∪ B ⊆ B ∪ A} (Resolution of 8. and 5.)
Example Proof Using U Rules • 1. {A ∪ B ≠ B ∪ A} • 2. {A ∪ B ⊈ B ∪ A, B ∪ A ⊈ A ∪ B, A ∪ B = B ∪ A} (Case Analysis) • 3. {g(A ∪ B, B ∪ A) ∉ B ∪ A, A ∪ B ⊆ B ∪ A} (UR resolution) • 10. {g(A ∪ B, B ∪ A) ∈ B ∪ A} (Resolution of 9. and 4.)
Example Proof Using U Rules • 1. {A ∪ B ≠ B ∪ A} • 2. {A ∪ B ⊈ B ∪ A, B ∪ A ⊈ A ∪ B, A ∪ B = B ∪ A} (Case Analysis) • 11. {A ∪ B ⊆ B ∪ A} (Resolution of 10. and 3.)
Example Proof Using U Rules • 1. {A ∪ B ≠ B ∪ A} • 12. {B ∪ A ⊈ A ∪ B, A ∪ B = B ∪ A} (Resolution of 11. and 2.) • Now the other half of the proof will be done. • Note that OSHL constructs only one ascending sequence of clauses, and only part of it is shown here.