Automatic Generation of First Order Theorems. Simon Colton Universities of Edinburgh and York Funded by EPSRC grant GR/M98012 and the Calculemus Network. Overview of Talk. Automated Theory Formation Principles Implementation in the HR system Applications Application to Theorem Generation
Automatic Generation of First Order Theorems Simon Colton Universities of Edinburgh and York Funded by EPSRC grant GR/M98012 and the Calculemus Network
Overview of Talk • Automated Theory Formation • Principles • Implementation in the HR system • Applications • Application to Theorem Generation • HR adds to the TPTP library • HR becomes a MathWeb service • Future Directions
Scientific Theories • Scientific theories about a domain contain: • Concepts, examples, definitions, • hypotheses, explanations, etc. • e.g. chemistry: acids • Concepts: Acid, Base, Salt • Hypothesis: Acid + Base → Salt + Water • Experiments for plausibility/evidence • Reaction pathways for explanation
Theories in Pure Mathematics • Concepts have examples and definitions • Hypotheses are “conjectures” • Explanations are proofs • Conjectures become “theorems” • e.g. pure maths: group theory • Concepts: cyclic groups, Abelian groups • Conjecture: cyclic groups are Abelian • Examples provide empirical evidence • Proof for explanation
HR: Theory Formation Cycle • Start with background knowledge • user-supplied axioms + concepts • Invent a new concept (machine learning) • Look for conjectures empirically (data mining) • Prove the conjectures (theorem proving) • Disprove the conjectures (model generation) • Assess all concepts w.r.t. the new concept • Invent a new concept • Build it from the most interesting old concepts
Inventing New Concepts • Ten General Production Rules (PR) • Work in all domains (math + non math) • Build new concept from one (or two) old ones • Example: Abelian groups • Given: [G,a,b,c] : a*b=c • Compose PR: [G,a,b,c] : a*b=c & b*a=c • Exists PR: [G,a,b] : ∃c (a*b=c & b*a=c) • Forall PR: [G] : ∀a ∀b (∃c (a*b=c & b*a=c))
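The Abelian-group example above can be sketched in a few lines of Python. This is an illustration only, not HR's implementation: concepts are represented here as plain sets of tuples over a finite Cayley table, and all function names (`compose`, `exists`, `forall`, `is_abelian`) are hypothetical.

```python
from itertools import permutations, product

def cyclic_group(n):
    """Cayley table of Z_n (addition mod n) as a dict (a, b) -> a*b."""
    return {(a, b): (a + b) % n for a, b in product(range(n), repeat=2)}

def s3():
    """Cayley table of S3: permutations of (0, 1, 2) under composition."""
    elems = list(permutations(range(3)))
    return {(p, q): tuple(p[q[i]] for i in range(3))
            for p, q in product(elems, repeat=2)}

def elements(table):
    return sorted({a for a, _ in table})

def multiplication(table):
    # Given concept [G,a,b,c] : a*b=c, stored as a set of (a, b, c) triples.
    return {(a, b, c) for (a, b), c in table.items()}

def compose(triples):
    # Compose PR: keep the triples with a*b=c & b*a=c.
    return {(a, b, c) for (a, b, c) in triples if (b, a, c) in triples}

def exists(triples):
    # Exists PR: project away c, leaving pairs (a, b) that have a witness c.
    return {(a, b) for (a, b, _) in triples}

def forall(pairs, elems):
    # Forall PR: the group satisfies the concept if every pair (a, b) is present.
    return all((a, b) in pairs for a, b in product(elems, repeat=2))

def is_abelian(table):
    return forall(exists(compose(multiplication(table))), elements(table))

print(is_abelian(cyclic_group(4)))  # Z4 commutes
print(is_abelian(s3()))             # S3 does not
```

Each production rule only filters or projects an existing set of examples, which is why the same three steps could in principle be reused on concepts from other domains.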
Making Conjectures • Theory formation step • Attempt to invent a new concept • Concept has same examples as previous one • HR makes an equivalence conjecture • Concept has no examples • HR makes a non-existence conjecture • HR can also make implication conjectures • Examples of one concept are all examples of another concept
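A minimal sketch of this empirical step, assuming each concept is summarised by the set of examples that satisfy it. The function name, the return format, and the small table of groups are illustrative assumptions, not HR's data structures.

```python
def make_conjecture(new_name, new_examples, old_concepts):
    """Compare a new concept's examples with those of the old concepts.

    old_concepts maps a concept name to a frozenset of its examples.
    Returns a (kind, left, right) triple describing what was found.
    """
    new_examples = frozenset(new_examples)
    if not new_examples:
        # No examples at all: conjecture that nothing satisfies the definition.
        return ("non-existence", new_name, None)
    for name, examples in old_concepts.items():
        if examples == new_examples:
            return ("equivalence", new_name, name)
    for name, examples in old_concepts.items():
        if new_examples < examples:
            return ("implication", new_name, name)   # new => old
        if examples < new_examples:
            return ("implication", name, new_name)   # old => new
    return ("new concept", new_name, None)

# Groups of order up to 6, labelled by name (V4 is the Klein four-group).
old = {"abelian": frozenset({"Z1", "Z2", "Z3", "Z4", "V4", "Z5", "Z6"})}
print(make_conjecture("cyclic", {"Z1", "Z2", "Z3", "Z4", "Z5", "Z6"}, old))
```

On this data the sketch proposes the implication "cyclic groups are Abelian"; the conjecture is only empirical at this point and still has to be passed to a prover.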
Proving Theorems • HR relies on third party theorem provers • Equivalence conjectures: • Sets of implication conjectures • From which prime implicates are extracted • E.g. ∀a (a*a=a ↔ a=id) • a*a=a → a=id, a=id → a*a=a • HR uses the Otter theorem prover • William McCune • Only uses this for finite algebras
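One way to picture the step from an equivalence to prime implicates is sketched below. The prover is abstracted as a callback `proves(premises, goal)`; in HR that role is played by Otter, but the callback interface, the function names, and the toy oracle are assumptions made for illustration.

```python
from itertools import combinations

def split_equivalence(lhs, rhs):
    """Split an equivalence between two conjunctions into implication conjectures.

    lhs and rhs are tuples of conjuncts; each result is (premises, goal).
    """
    return [(lhs, goal) for goal in rhs] + [(rhs, goal) for goal in lhs]

def prime_implicates(premises, goal, proves):
    """Return the minimal subsets of premises from which `proves` derives goal."""
    found = []
    for size in range(len(premises) + 1):
        for subset in combinations(premises, size):
            # Skip supersets of a subset that already suffices.
            if any(set(smaller) <= set(subset) for smaller in found):
                continue
            if proves(subset, goal):
                found.append(subset)
    return found

# Toy oracle standing in for a real prover: it "proves" a=id exactly when
# a*a=a is among the premises.
def toy_proves(premises, goal):
    return goal == "a=id" and "a*a=a" in premises

print(split_equivalence(("a*a=a",), ("a=id",)))
print(prime_implicates(("a*a=a", "b*a=b"), "a=id", toy_proves))
```

The search tries smaller premise sets first, so every subset it reports is minimal with respect to the oracle used.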
Disproving Non-Theorems • Any conjectures which Otter can’t prove • HR looks for a counterexample • Using the MACE model generator • Also written by William McCune • Other possibilities: CAS, CSP • Counterexamples are added to the theory • Fewer similar non-theorems are made later
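The idea of counterexample search can be illustrated with a naive brute-force enumeration of small multiplication tables. This is only a stand-in: MACE uses far more sophisticated model generation, and the axioms, conjecture, and function names below are chosen for the example rather than taken from HR.

```python
from itertools import product

def all_tables(n):
    """Yield every binary operation on {0, ..., n-1} as a dict (a, b) -> a*b."""
    cells = list(product(range(n), repeat=2))
    for values in product(range(n), repeat=n * n):
        yield dict(zip(cells, values))

def is_associative(t, n):
    return all(t[t[a, b], c] == t[a, t[b, c]]
               for a, b, c in product(range(n), repeat=3))

def is_commutative(t, n):
    return all(t[a, b] == t[b, a] for a, b in product(range(n), repeat=2))

def find_counterexample(axioms, conjecture, max_size=3):
    """Return (size, table) for the first model of the axioms falsifying the conjecture."""
    for n in range(1, max_size + 1):
        for t in all_tables(n):
            if all(axiom(t, n) for axiom in axioms) and not conjecture(t, n):
                return n, t
    return None

# Non-theorem: "every associative operation is commutative".
print(find_counterexample([is_associative], is_commutative))
```

Enumeration grows as n^(n*n), so this sketch is only practical up to size 3; it is meant to show why a stored counterexample lets later, similar conjectures be rejected by a cheap table lookup instead of a new search.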
Assessing Interestingness • New concepts from interesting old ones • Concepts measured in terms of: • Intrinsic values, e.g. complexity of definition • Relational values, e.g. novelty of categorisation • Concepts also assessed by conjectures • Quality, quantity of conjectures involving the concept • Conjectures also assessed • Difficulty of proof (proof length from Otter) • Surprisingness (of lhs and rhs definitions)
Applications of ATF • Machine Learning • Learn concept definitions: e.g. sequence extrapolation • Theory for prediction tasks • Theory for puzzle generation • Constraint Satisfaction Problems • Conjectures: induced constraints • Concepts: implied constraints • Mathematical Discovery • Exploration of new domains • Invention of Integer Sequences (NWN)
Application to ATP • Big project: using ATF to improve ATP • Sub-project: • Using ATF to assess ATP programs • Compare first order ATP programs • Using a large set of HR’s conjectures • Facilitate comparison: • Using MathWeb (Zimmer, Franke, …) • Using SystemOnTPTP (Sutcliffe)
First Attempt • Aim: add to the TPTP library • 5882 test problems for first order provers • Otter, SPASS, E, Vampire, etc. • New provers are tested using TPTP • HR produced 46,000 group conjectures • In ten minutes. • Around 200 of these were worthy of TPTP • All provable by SPASS in 120 seconds • 153 provable only by SPASS and E • 42 provable only by SPASS
Example Theorem • Otter and E could not prove this: • ∀x ∀y ((∃z (inv(z)=x & z*y=x) & ∃u (x*u=y & ∃v (v*x=u & inv(v)=x))) ↔ (∃a (inv(a)=x & a*y=x) & ∃b (b*y=x & inv(b)=y))) [about pairs of identity elements]
Interface of HR into MathWeb • MathWeb project in Saarbrücken • Has access to many first order ATP programs • E, Otter, SPASS, Vampire, Bliksem, … • Idea: HR passes conjectures to MathWeb • MathWeb translates conjectures using tptp2x • MathWeb calls the provers • Interface • Via sockets at the moment • Later by XML-RPC for better standardization
Additional Implementation • By Zimmer, Colton and Franke • Changes to HR • Improvements in quantity of theorems • Ability to write conjectures in TPTP format • Changes to MathWeb • Calling one prover after another (1000s of times in a row) • Quicker interaction with tptp2x • Integration of the E system
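As an illustration of writing a conjecture in TPTP format, the sketch below emits a problem in the present-day TPTP `fof(name, role, formula).` syntax. HR's actual output routine is not shown in the talk and the TPTP syntax at the time may have differed, so the helper names, the axiom names, and the chosen axiomatisation of groups are all assumptions.

```python
def tptp_fof(name, role, formula):
    """Format one first-order formula as a TPTP `fof` annotated formula."""
    return f"fof({name}, {role}, ( {formula} ))."

# A standard left-identity / left-inverse axiomatisation of groups
# (chosen for this example; TPTP variables start with an upper-case letter).
GROUP_AXIOMS = [
    ("left_identity", "! [X] : multiply(identity, X) = X"),
    ("left_inverse", "! [X] : multiply(inverse(X), X) = identity"),
    ("associativity",
     "! [X, Y, Z] : multiply(multiply(X, Y), Z) = multiply(X, multiply(Y, Z))"),
]

def tptp_problem(conjecture_name, conjecture):
    """Return a complete problem: the group axioms followed by one conjecture."""
    lines = [tptp_fof(name, "axiom", formula) for name, formula in GROUP_AXIOMS]
    lines.append(tptp_fof(conjecture_name, "conjecture", conjecture))
    return "\n".join(lines)

# The conjecture "for all a, a*a=a if and only if a=id" from the Proving Theorems slide.
print(tptp_problem("idempotent_is_identity",
                   "! [A] : ( multiply(A, A) = A <=> A = identity )"))
```

Emitting one self-contained problem per conjecture keeps each prover call independent, which suits running one prover after another over thousands of conjectures.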
Experiments • Possible experiments: • Which prover is first to prove most of HR’s theorems • Compare the average times • How many timeouts for each prover • Watch this space for results… • Saturday: 9000 group theory theorems proved by SPASS, E & Otter, before a crash! • Preliminary (unsurprising) result • Average times: SPASS < E < Otter
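The three comparisons listed above could be computed from a table of proof times along the following lines. The prover names and timings in the usage example are made-up placeholders, not measurements from these experiments, and the function names are hypothetical.

```python
def summarise(results, timeout):
    """Per-prover counts and mean time.

    results maps a prover name to a list of proof times in seconds, one entry
    per conjecture, with None meaning no proof was found within the timeout.
    """
    summary = {}
    for prover, times in results.items():
        solved = [t for t in times if t is not None and t <= timeout]
        summary[prover] = {
            "proved": len(solved),
            "timeouts": len(times) - len(solved),
            "mean_time": sum(solved) / len(solved) if solved else None,
        }
    return summary

def fastest_counts(results, timeout):
    """Count, for each prover, the conjectures it proved before every other prover."""
    provers = list(results)
    wins = {prover: 0 for prover in provers}
    n_conjectures = len(next(iter(results.values())))
    for i in range(n_conjectures):
        timed = [(results[p][i], p) for p in provers
                 if results[p][i] is not None and results[p][i] <= timeout]
        if timed:
            wins[min(timed)[1]] += 1
    return wins

# Placeholder timings for two hypothetical provers on three conjectures.
placeholder = {"prover_a": [0.5, 2.0, None], "prover_b": [1.0, 1.0, 3.0]}
print(summarise(placeholder, timeout=120))
print(fastest_counts(placeholder, timeout=120))
```

Mean times are taken over solved conjectures only, so they should be read alongside the timeout counts rather than on their own.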
Future Work: MathWeb #1 • Try HR on more provers in MathWeb • Vampire, Bliksem • Offer HR as a new MathWeb service • User says: “Give me 1,000 theorems which SPASS and E take over 10 secs. to prove” • Interface HR and model generators in MW • Use MACE, etc. to disprove theorems • Interface HR and CSP, CAS in MW • Infinite Group theory with Bundy and Sorge
Future Work: MathWeb #2 • Aim: Beat SPASS…… • SPASS is too good for HR in group theory • 46,000 theorems and SPASS proved them all! • Part two of my Calculemus project: • With Jacques Calmet & Clemens Ballarin in Karlsruhe • HR invents new domains • Adds and constrains new operators for finite algebras • “Grow” difficult theorems from prime implicates
Future Work: HR Project • Colton: Express HR as a ML program • Try domains other than maths • Walsh: Integrate HR • With every maths program ever written • Bundy: • Build an automated mathematician
Web Pages • MathWeb: • www.mathweb.org • HR: • www.dai.ed.ac.uk/~simonco/research/hr • NumbersWithNames program: • www.machine-creativity.com/programs/nwn • Demonstration: Tomorrow @ 2pm? Room 208.