10 likes | 143 Views
James Bridge supervised by Professor Larry Paulson. The Challenge to be Addressed. Logic Systems and Theorem Provers. In logic there is a trade off between expressive power in stating theorems and the decidability of the proof process.
E N D
James Bridge supervised by Professor Larry Paulson The Challenge to be Addressed Logic Systems and Theorem Provers In logic there is a trade off between expressive power in stating theorems and the decidability of the proof process. The simplest logic is propositional logic where theorems are Boolean expressions and the proof process is described as satisfiability or SAT. SAT is decidable but propositional logic is too simple to express many mathematical ideas and problems. Even at this level, for general expressions SAT is NP-complete and there is unlikely to be a polynomial time algorithm to solve it. First order logic adds existential and universal quantifiers and allows functions so is significantly more powerful than propositional logic. It is semi-decidable. Most automatic theorem provers work in first order logic. Beyond first order logic, higher order logics are widely used and are particularly useful for expressing complex hardware and software verification conditions but cannot generally be solved automatically. Instead proof assistants are used which require expert human intervention in the process. Much work is being done on combining automated first order logic provers with higher order logic proof assistants. Using computers to search for logical proofs of theorems has many applications ranging from pure maths to hardware and software verification. But the power of computer theorem provers falls well short of the initial promise of early work. The main challenge is the enormous size of the search space containing the proof being looked for. Much progress in search heuristics has been made but different heuristics work better with different theorems and determining which heuristic to pick for a new theorem is not straight forward. Applying Machine Learning to Automatic Theorem Proving Automatic Theorem Proving and the E-prover Automated theorem proving does not reproduce the complex but relatively short proofs that human mathematicians produce. Instead simple logic rules are applied to many clauses until a contradiction is reached. Theorems are proved by demonstrating a contradiction if the negation of the theorem is assumed. One such rule that may be applied is resolution (showed to be complete and sound in 1965 by Robinson). Though more complicated rules are now applied, such solvers are often described as being resolution based. The problem with only applying resolution as a rule in a naïve manner is that the number of new clauses generated can rapidly grow to an enormous size and this is particularly true where the clauses are encoding equality as a logical concept. One answer is to treat equality as a special case. Other improvements to make provers practical are to impose order on the parts of the logical expression (literals and terms), use a set of rules rather than just resolution and remove redundant clauses efficiently at an early stage. An efficient solver using such an approach is “E” written mainly by Stephan Schulz at Munich. This is the prover that I have been working with. The Nature of the Research The aim of my research is to use machine learning techniques to detect patterns within logical theorems and use them to guide automated theorem provers. The Intended Outcome of the Research The primary aim is to extend the range of theorems that can be proved automatically in reasonable time. A secondary aim is to develop an understanding of structure within theorems and how this may be exploited in theorem proving heuristics. Applying Theorem Proving The obvious and original application is checking mathematical theorems as a tool for mathematicians but this has proved an elusive goal (with some exceptions one of which is described below). But theorem provers can be applied to any problem where you want to prove that a complicated process doesn’t alter some desired invariant or an outcome follows from the input conditions. Computer software and hardware are obvious candidates but other applications include security protocols (for example Cohen’s TAPS system), commonsense reasoning in AI and even geometric proofs and the generation of consistent three dimensional models from images. The Present Status of the Work A Proof in Robbins Algebra – A Success for Automated Theorem Proving! In initial experiments I’ve run a theorem prover for a short time, collected data, then run to completion. I then applied machine learning to generate a classifier to predict whether or not a theorem will be proved without needing to run the proof process to completion. Results are promising but inconclusive. I now plan to extend the work to learning heuristic selection from theorem data. Despite becoming increasingly sophisticated and useful in a range of fields such as software verification and security protocols, automated theorem proving has not become a tool for mathematicians in the way that computer algebra packages have. A notable exception was the proof in 1996 by McCune of a conjecture by Robbins in Boolean algebra which had eluded human mathematicians for over sixty years. This story made the New York Times. Though the likes of Andrew Wiles are probably safe for the time being, the original aim of theorem proving, to do maths, has been given a fillip. Machine Learning – Support Vector Machines Machine learning involves using a computer to find a relationship between measurable features of a problem (eg temperature of a patient etc) and some outcome, typically a classification (the patient has measles or does not have measles). Rather than analytically model the relationship; instead, parameter values in a model are estimated by looking at historic data (the learning process). If this is done correctly, the same parameters can be used to classify previously unseen data. In a support vector machine the numerical values are transformed to new values in a multi-dimensional space which may be of different dimension to the original feature set. In the new space the values can be partitioned in a linear fashion and classification is determined by which side of the partition a transformed point lies. Following advice from Sean Holden, an expert in the field, I have so far used the program SVMlite to generate classifiers in the form of support vector machines. At the same time I have passed the data on to Sean and to Andrew Naish at the Cambridge Computer Lab who are able to apply more sophisticated methods. Preliminary Results and Conclusions Working with approximately 1200 theorems in the SET classification of the TPTP library led to a classifier that was correct 70% of the time but the sample was skewed so the base line would be 57%. The tentative conclusion is that useful results may be obtained from machine learning but that more work is needed on feature selection and learning methods. Rather than remain with classifying theorems into solvable or not, the new work will be applied to the more useful area of heuristic selection. Finding Features in Proof States In theorem proving a “proof state” is the collection of logical clauses, divided into processed and unprocessed sets, that provide a snap shot of the proof process. The initial proof state generally consists of the negated theorem to be proved together with the associated axioms all of which are in an unprocessed state. As the search for a proof progresses the number of clauses increases and some are moved to a processed set. The proof state also contains information as to how the generated clauses arose. To apply machine learning to theorem proving requires a means of measuring features in the proof state, either for the original theorem and axioms or by running the prover for a short while to reach a more complex proof state. Possible features are clause related (eg clause length), or involve whole sets of clauses. My initial runs used 16 features, the number was increased to 60 in later runs. Acknowledgements I have received much guidance from my supervisor Larry Paulson and from Sean Holden. Andrew Naish has analysed the data I have generated using his own software and Stephan Schulz, the author of E has provided ideas and patiently answered my e-mails despite having a full-time job outside of academia. 5th June 2007