Theory Revision Chris Murphy
The Problem • Sometimes we: • Have theories for existing data that do not match new data • Do not want to repeat learning every time we update data • Believe that our rule learners could perform much better if given basic theories to build off of
Two Types of Errors in Theories • Over-generalization • Theory covers negative examples • Caused by incorrect rules in theory or by existing rules missing necessary constraints • Example: uncle(A,B) :- brother(A,C). • Solution: uncle(A,B) :- brother(A,C), parent(C,B).
Two Types of Errors in Theories • Over-specialization • Theory does not cover all positive examples • Caused by rules having additional, unnecessary constraints, or by the theory missing rules that are needed to prove some examples • Example: uncle(A,B) :- brother(A,C), mother(C,B). • Solution: uncle(A,B) :- brother(A,C), parent(C,B).
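The two error types above can be illustrated with a toy coverage check. This is a minimal Python sketch, not real ILP machinery: the family facts and names are made up, and each Prolog rule is hand-translated into a membership test.

```python
# Toy check of the faulty "uncle" rules from the slides, using a small
# hand-made family database (all names are illustrative assumptions).

brother = {("tom", "ann"), ("tom", "bob")}   # brother(A, C)
parent  = {("ann", "joe")}                   # parent(C, B)
mother  = set()                              # ann is a parent, but not recorded as a mother

def over_general(a, b):
    # uncle(A,B) :- brother(A,C).  B is unconstrained, so it covers too much.
    return any(a == a2 for (a2, c) in brother)

def correct(a, b):
    # uncle(A,B) :- brother(A,C), parent(C,B).
    return any(a == a2 and (c, b) in parent for (a2, c) in brother)

def over_special(a, b):
    # uncle(A,B) :- brother(A,C), mother(C,B).  Misses parents not listed as mothers.
    return any(a == a2 and (c, b) in mother for (a2, c) in brother)

# positive example: uncle(tom, joe); negative example: uncle(tom, ann)
print(over_general("tom", "joe"), over_general("tom", "ann"))  # True True  (covers a negative)
print(correct("tom", "joe"),      correct("tom", "ann"))       # True False
print(over_special("tom", "joe"), over_special("tom", "ann"))  # False False (misses a positive)
```

The over-general rule derives the negative example, and the over-special rule fails to derive the positive one, exactly the two failure modes the slides describe.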
What is Theory Refinement? • “…learning systems that have a goal of making small changes to an original theory to account for new data.” • Combination of two processes: • Using a background theory to improve rule effectiveness and adequacy on data • Using problem detection and correction processes to make small adjustments to said theories
Basic Issues Addressed • Is there an error in the existing theory? • What part of the theory is incorrect? • What correction needs to be made?
Theory Refinement Basics • System is given a beginning theory about the domain • Can be incorrect or incomplete (and often is) • A well-refined theory will: • Be accurate on new/updated data • Make as few changes as possible to the original theory • Changes are monitored by a “Distance Metric” that keeps a count of every change made
The Distance Metric • Counts every addition, deletion, or replacement of clauses • Used to: • Measure the syntactic corruption of the original theory • Determine how good a learning system is at replicating human-created theories • Drawback is that it does not recognize equivalent literals such as less(X,Y) and greq(Y,X) • Table on the right shows examples of distance between theories, as well as its relationship to accuracy
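The distance metric above can be sketched in a few lines. This is an illustrative Python approximation, not any system's actual code: theories are modeled as sets of clause strings, and pairing one deletion with one addition as a single "replacement" is a simplifying assumption.

```python
# Illustrative sketch of the distance metric: every clause addition,
# deletion, or replacement counts as one change.

def theory_distance(original, revised):
    original, revised = set(original), set(revised)
    deleted = original - revised
    added = revised - original
    # Pair one deletion with one addition and count the pair as a "replacement".
    replacements = min(len(deleted), len(added))
    return replacements + abs(len(deleted) - len(added))

t0 = {"uncle(A,B) :- brother(A,C)."}
t1 = {"uncle(A,B) :- brother(A,C), parent(C,B)."}
print(theory_distance(t0, t1))  # 1 -- one clause replaced
```

Note that plain string comparison exhibits exactly the drawback mentioned: equivalent literals such as less(X,Y) and greq(Y,X) would be counted as a full replacement.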
Why Preserve the Original Theory? • If you understood the original theory, you’ll likely understand the new one • Similar theories will likely retain the ability to use abstract predicates from the original theory
Theory Refinement Systems • EITHER • FORTE • AUDREY II • KBANN • FOCL, KR-FOCL, A-EBL, AUDREY, and more
EITHER • Explanation-based and Inductive Theory Extension and Revision • First system with the ability to fix both over-generalization and over-specialization • Able to correct multiple faults • Uses one or more failings at a time to learn one or more corrections to a theory • Able to correct intermediate points in theories • Uses positive and negative examples • Able to learn disjunctive rules • Specialization algorithm does not allow positives to be eliminated • Generalization algorithm does not allow negatives to be admitted
FORTE • Attempts to prove all positive and negative examples using the current theory • When errors are detected: • Identify all clauses that are candidates for revision • Determine whether clause needs to be specialized or generalized • Determine what operators to test for various revisions • Best revision is determined based on its accuracy when tested on complete training set • Process repeats until system perfectly classifies the training set or until FORTE finds that no revisions improve the accuracy of the theory
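FORTE's outer loop, as described above, is a hill climb over candidate revisions scored on the complete training set. The skeleton below is a hedged Python sketch; `propose_revisions` and `accuracy` are assumed stand-ins for FORTE's actual revision operators and theorem prover.

```python
# Sketch of FORTE's outer loop: propose candidate revisions, score each on
# the full training set, keep the best, and stop when the theory classifies
# everything correctly or no revision improves accuracy.

def revise(theory, examples, propose_revisions, accuracy):
    best_score = accuracy(theory, examples)
    while best_score < 1.0:                      # not yet a perfect classifier
        candidates = propose_revisions(theory, examples)
        if not candidates:
            break
        scored = [(accuracy(t, examples), t) for t in candidates]
        top_score, top_theory = max(scored, key=lambda s: s[0])
        if top_score <= best_score:
            break                                # no revision improves accuracy
        best_score, theory = top_score, top_theory
    return theory
```

As a toy usage, a "theory" that is just an integer climbing toward a perfect score of 1.0: `revise(0, None, lambda t, ex: [t + 1], lambda t, ex: min(t / 3, 1.0))` returns 3.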
Specializing a Theory • Needs to happen when one or more negatives are covered • Ways to fix the problem: • Delete a clause: simple, just delete and retest • Add new antecedents to existing clause • More difficult • FORTE uses two methods... • Add one antecedent at a time, like FOIL, choosing the antecedent that provides the best info gain at any point • Relational Pathfinding – uses graph structures to find new relations in data
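FORTE's first antecedent-adding method uses FOIL's information-gain heuristic. Below is the standard FOIL gain formula in Python (a common textbook formulation, not FORTE's actual code): the gain of an antecedent is p1 * (log2(p1/(p1+n1)) - log2(p0/(p0+n0))), where p0/n0 count covered positives/negatives before the addition and p1/n1 after.

```python
import math

def foil_gain(p0, n0, p1, n1):
    """FOIL information gain of adding one antecedent.
    p0, n0: positives/negatives covered before; p1, n1: after.
    Higher gain means a better antecedent to add."""
    if p1 == 0:
        return 0.0
    return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

# An antecedent that keeps all 10 positives and drops all 8 negatives
# scores higher than one that also loses half the positives.
print(foil_gain(10, 8, 10, 0), foil_gain(10, 8, 5, 0))
```

An antecedent that changes nothing scores zero, so the search naturally prefers literals that exclude negatives while retaining positives.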
Generalizing a Theory • Need to generalize when positives are not covered • Ways FORTE generalizes: • Delete antecedents from an existing clause (either singly or in groups) • Add a new clause • Copy clause identified at the revision point • Purposely over-generalize • Send over-general rule to specialization algorithm • Use inverse relation operators “identification” and “absorption” • These use intermediate rules to provide more options for alternative definitions
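The first generalization operator, deleting antecedents singly or in groups, amounts to enumerating the proper non-empty subsets of a clause's body. A minimal sketch (clauses as tuples of antecedent strings; a real system would test each candidate against its prover and reject any that admit negatives):

```python
from itertools import combinations

def deletion_candidates(antecedents):
    """Yield every proper, non-empty subset of the clause body,
    largest subsets first (i.e., smallest deletions first)."""
    for k in range(len(antecedents) - 1, 0, -1):
        for subset in combinations(antecedents, k):
            yield subset

clause = ("brother(A,C)", "mother(C,B)")
print(list(deletion_candidates(clause)))
# [('brother(A,C)',), ('mother(C,B)',)]
```

Keeping the largest surviving subsets first biases the search toward the smallest change, consistent with the distance metric's goal of preserving the original theory.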
AUDREY II • Runs in two main phases: • Initial domain theory is specialized to eliminate negative coverage • At each step, a best clause is chosen, it is specialized, and the process repeats • Best clause is the one that contributes the most negative examples being incorrectly classified and is required by the fewest number of positives • If best clause covers no positives, it is deleted, otherwise, literals are added in a FOIL-like manner to eliminate covered negatives
AUDREY II • Revised theory is generalized to cover all positives (without covering any negatives) • Uncovered positive example is randomly chosen, and theory is generalized to cover the example • Process repeats until all remaining positives are covered • If assumed literals can be removed without decreasing positive coverage, that is done • If not, AUDREY II tries replacing literals with a new conjunction of literals (also uses a FOIL-type process) • If deletion and replacement fail, system uses a FOIL-like method of determining entirely new clauses for proving the literal
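AUDREY II's phase 1 can be sketched as a loop over offending clauses. The toy below assumes each clause carries precomputed sets of the positives and negatives it covers; clearing a clause's negatives stands in for the FOIL-like literal addition, and all helper names here are mine, not AUDREY II's.

```python
# Toy sketch of AUDREY II phase 1 (specialization): repeatedly pick the
# "best clause" (covers the most negatives), delete it if no positives
# need it, otherwise specialize it until the theory covers no negatives.

def phase1_specialize(theory):
    """theory: dict mapping clause -> (positives_covered, negatives_covered)."""
    theory = dict(theory)
    while True:
        offenders = [c for c in theory if theory[c][1]]
        if not offenders:
            return theory                       # no negative coverage remains
        worst = max(offenders, key=lambda c: len(theory[c][1]))
        pos, neg = theory[worst]
        if not pos:
            del theory[worst]                   # covers no positives: delete it
        else:
            theory[worst] = (pos, set())        # stand-in for FOIL-style literal addition

t = {"c1": ({"p1"}, {"n1", "n2"}), "c2": (set(), {"n3"})}
print(phase1_specialize(t))  # {'c1': ({'p1'}, set())}
```

Clause c1 is specialized (it is needed by a positive), while c2 is simply deleted, matching the two cases in the bullet above.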
KBANN • System that takes a domain theory of Prolog-style clauses and transforms it into a knowledge-based neural network (KNN) • Uses the knowledge base (background theory) to determine the topology and initial weights of the KNN • Different units and links within the KNN correspond to various components of the domain theory • Topologies of KNNs can be different from topologies that we have seen in neural networks
KBANN • KNNs are trained on example data, and rules are extracted using an N-of-M method (saves time) • Domain theories for KBANN need not contain all intermediate terms necessary to learn certain concepts • Adding hidden units along with units specified by the domain theory allows the network to induce necessary terms not stated in background info • Problems arise when interpreting intermediate rules learned from hidden nodes • Difficult to label them based on the inputs they resulted from • In one case, programmers labeled rules based on the section of info that they were attached to in that topology
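KBANN's rules-to-network translation can be sketched for a single conjunctive clause: the unit's weights and bias are set so it fires only when all of its antecedents are active. The constant W = 4 and the exact bias placement below are illustrative assumptions in the spirit of the published scheme, not KBANN's verbatim parameters.

```python
# Sketch of turning one Prolog-style clause (an AND of antecedents) into
# an initialized network unit, KBANN-style. Disjunctive rule sets would
# similarly become OR units feeding a shared consequent.

W = 4.0  # assumed initial link weight

def and_unit(n_antecedents):
    """Weights and bias so the unit activates only when all n inputs are ~1."""
    weights = [W] * n_antecedents
    bias = -(n_antecedents - 0.5) * W   # threshold sits between n-1 and n active inputs
    return weights, bias

def activates(weights, bias, inputs):
    return sum(w * x for w, x in zip(weights, inputs)) + bias > 0

# e.g. uncle(A,B) :- brother(A,C), parent(C,B). -> a 2-input AND unit
w, b = and_unit(2)
print(activates(w, b, [1, 1]))  # True  (both antecedents satisfied)
print(activates(w, b, [1, 0]))  # False (one antecedent missing)
```

Because the weights start near these symbolic values rather than at random, training on examples nudges the theory rather than relearning it from scratch, which is the point of the knowledge-based initialization.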
System Comparison • AUDREY II is better than FOCL at theory revision, but it still has room for improvement • Its revised theories are closer to both the original theory and the human-created correct theory
System Comparison • AUDREY II is slightly more accurate than FORTE, and its revised theories are closer to the original and correct theories • KR-FOCL addresses some issues of other systems by allowing user to decide among changes that have the same accuracy
Applications of Theory Refinement • Used to identify different parts of both DNA and RNA sequences • Used to debug student-written basic Prolog programs • Used to maintain working theories as new data is obtained