Combining Inductive and Analytical Learning
Ch. 12 in Machine Learning, Tom M. Mitchell
Natural Language Processing Lab, Korea University
Kyung-Soo Han
July 9, 1999
Contents
• Motivation
• Inductive-Analytical Approaches to Learning
• Using Prior Knowledge to Initialize the Hypothesis
  • The KBANN Algorithm
• Using Prior Knowledge to Alter the Search Objective
  • The TANGENTPROP Algorithm
  • The EBNN Algorithm
• Using Prior Knowledge to Augment Search Operators
  • The FOCL Algorithm
Motivation (1/2)
• Inductive & Analytical Learning
                 Inductive Learning                Analytical Learning
  Goal:          Hypothesis fits data              Hypothesis fits domain theory
  Justification: Statistical inference             Deductive inference
  Advantages:    Requires little prior knowledge   Learns from scarce data
  Pitfalls:      Scarce data, incorrect bias       Imperfect domain theory
• A spectrum of learning tasks
  • Most practical learning problems lie somewhere between these two extremes of the spectrum.
Motivation (2/2)
• What kinds of learning algorithms can we devise that make use of approximate prior knowledge, together with available data, to form a general hypothesis?
  • domain-independent algorithms that employ explicitly input domain-dependent knowledge
• Desirable Properties
  • Given no domain theory, learn at least as well as purely inductive methods.
  • Given a perfect domain theory, learn at least as well as purely analytical methods.
  • Given an imperfect domain theory & imperfect training data, combine the two to outperform either purely inductive or purely analytical methods.
  • Accommodate arbitrary and unknown errors in the domain theory.
  • Accommodate arbitrary and unknown errors in the training data.
The Learning Problem
• Given:
  • A set of training examples D, possibly containing errors
  • A domain theory B, possibly containing errors
  • A space of candidate hypotheses H
• Determine:
  • A hypothesis that best fits the training examples & domain theory
Hypothesis Space Search
• Learning as a task of searching through a hypothesis space
  • hypothesis space H
  • initial hypothesis
  • the set of search operators O: define individual search steps
  • the goal criterion G: specifies the search objective
• Methods for using prior knowledge: use prior knowledge to
  • derive an initial hypothesis from which to begin the search
  • alter the objective G of the hypothesis space search
  • alter the available search steps O
Using Prior Knowledge to Initialize the Hypothesis
• Two Steps
  1. Initialize the hypothesis to perfectly fit the domain theory.
  2. Inductively refine this initial hypothesis as needed to fit the training data.
• KBANN (Knowledge-Based Artificial Neural Network)
  1. Analytical Step
    • create an initial network equivalent to the domain theory
  2. Inductive Step
    • refine the initial network (using BACKPROPAGATION)
• Given:
  • A set of training examples
  • A domain theory consisting of nonrecursive, propositional Horn clauses
• Determine:
  • An artificial neural network that fits the training examples, biased by the domain theory
(Table 12.2, p. 341)
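Below is a minimal Python sketch of KBANN's analytical step, showing how one propositional Horn clause can be encoded as a sigmoid unit: a large weight W for each non-negated antecedent, −W for negated ones, and a threshold set so the unit fires only when the whole clause body holds. The function names, the value of W, and the use of a single Cup clause are illustrative assumptions, not code from the book.

```python
import numpy as np

W = 4.0  # "large" weight used to encode the domain theory (illustrative value)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def clause_to_unit(antecedents, attributes):
    """Analytical step for one Horn clause: build the weights and bias of a sigmoid
    unit so that it outputs a high value exactly when the clause body is satisfied.
    antecedents: list of (attribute_name, negated?) pairs; attributes: ordered inputs."""
    weights = np.zeros(len(attributes))          # zero-weight links to irrelevant inputs
    n_positive = 0
    for name, negated in antecedents:
        i = attributes.index(name)
        weights[i] = -W if negated else W
        n_positive += 0 if negated else 1
    bias = -(n_positive - 0.5) * W               # threshold: all antecedents must hold
    return weights, bias

# One clause of the Cup domain theory: Cup <- Stable, Liftable, OpenVessel
attributes = ["Stable", "Liftable", "OpenVessel"]
w, b = clause_to_unit([("Stable", False), ("Liftable", False), ("OpenVessel", False)],
                      attributes)
print(sigmoid(w @ np.array([1.0, 1.0, 1.0]) + b))   # high output: clause body satisfied
print(sigmoid(w @ np.array([1.0, 1.0, 0.0]) + b))   # low output: one antecedent missing
```

The inductive step then refines these weights (together with the near-zero links to the remaining attributes) using ordinary BACKPROPAGATION on the training examples.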
Example: The Cup Learning Task
• [Figures: the neural network equivalent to the Cup domain theory, and the result of inductively refining that network]
Remarks
• KBANN vs. Backpropagation
  • when given an approximately correct domain theory & scarce training data, KBANN generalizes more accurately than Backpropagation
  • Classifying promoter regions in DNA
    • Backpropagation: error rate 8/106 examples
    • KBANN: error rate 4/106 examples
• Bias
  • KBANN: domain-specific theory
  • Backpropagation: domain-independent syntactic bias toward small weight values
Using Prior Knowledge to Alter the Search Objective
• Use of prior knowledge
  • incorporate it into the error criterion minimized by gradient descent
  • the network must fit a combined function of the training data & domain theory
• Form of prior knowledge
  • derivatives of the target function
  • certain types of prior knowledge can be expressed quite naturally as derivatives
  • example: recognizing handwritten characters
    • "the identity of the character is independent of small translations and rotations of the image."
The TANGENTPROP Algorithm
• Domain Knowledge
  • expressed as derivatives of the target function with respect to transformations of its inputs
• Training Derivatives
  • TANGENTPROP assumes various training derivatives of the target function are provided.
• Error Function
  E = Σ_i [ (f(x_i) − f̂(x_i))² + μ Σ_j ( ∂f̂(s_j(α, x_i))/∂α − ∂f(s_j(α, x_i))/∂α )², evaluated at α = 0 ]
  • s_j(α, x): the jth transformation of input x (e.g., rotation or translation) by parameter α
  • μ: constant that determines the relative importance of fitting the training values vs. fitting the training derivatives
(Table 12.4, p. 349)
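As an illustration only, here is a small Python sketch of the TANGENTPROP objective for a single example, using finite differences to estimate the derivative of the network output with respect to each transformation parameter α; `net` and the `transforms` functions are hypothetical stand-ins, and the invariance prior (true derivative = 0) follows the handwritten-character example above.

```python
def tangentprop_loss(net, x, y, transforms, mu=0.1, eps=1e-3):
    """Combined objective for one example: squared error on the training value plus
    mu times the squared error on the training derivatives at alpha = 0.
    transforms: functions s(alpha, x), e.g., small rotations or translations of x."""
    value_err = (net(x) - y) ** 2
    deriv_err = 0.0
    for s in transforms:
        # finite-difference estimate of d net(s(alpha, x)) / d alpha at alpha = 0
        d_net = (net(s(+eps, x)) - net(s(-eps, x))) / (2 * eps)
        d_target = 0.0            # prior knowledge: the target is invariant under s
        deriv_err += (d_net - d_target) ** 2
    return value_err + mu * deriv_err
```

Summing this quantity over all training examples and minimizing it by gradient descent is what trades off fitting the observed values against fitting the asserted derivatives; the constant μ controls that trade-off.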
Remarks
• TANGENTPROP combines the prior knowledge with observed training data by minimizing an objective function that measures both
  • the network's error with respect to the training example values
  • the network's error with respect to the desired derivatives
• TANGENTPROP is not robust to errors in the prior knowledge
  • the relative weight μ needs to be selected automatically → the EBNN Algorithm
The EBNN Algorithm (1/2)
• Input
  • A set of training examples of the form ⟨x_i, f(x_i)⟩
  • A domain theory represented by a set of previously trained neural networks
• Output
  • A new neural network that approximates the target function f
• Algorithm
  1. Create a new, fully connected feedforward network to represent the target function.
  2. For each training example, determine the corresponding training derivatives.
  3. Use the TANGENTPROP algorithm to train the target network.
The EBNN Algorithm (2/2)
• Computation of training derivatives
  • EBNN computes them itself for each observed training example
    • explain each training example in terms of the given domain theory
    • extract training derivatives from this explanation
  • they provide important information for distinguishing relevant from irrelevant features
• How to weight the relative importance of the inductive & analytical components of learning
  • μ_i is chosen independently for each training example
  • consider how accurately the domain theory predicts the training value for this particular example
• Error Function
  E = Σ_i [ (f(x_i) − f̂(x_i))² + μ_i Σ_j ( ∂A(x)/∂x^j − ∂f̂(x)/∂x^j )², evaluated at x = x_i ],
  where μ_i ≡ 1 − |A(x_i) − f(x_i)| / c
  • A(x): domain theory prediction for input x
  • x_i: ith training instance
  • x^j: jth component of the vector x
  • c: normalizing constant (so that 0 ≤ μ_i ≤ 1)
(Figure 12.7, p. 353)
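The following Python sketch shows how the per-example weight μ_i and the training derivatives could be obtained for one example ⟨x_i, f(x_i)⟩. The names and the finite-difference extraction are illustrative assumptions; EBNN itself extracts the derivatives analytically from its neural-network domain theory.

```python
import numpy as np

def ebnn_example_terms(A, x_i, f_i, c, eps=1e-3):
    """For one training example: weight the analytical component by how accurately the
    domain-theory network A predicts this example, and extract training derivatives
    dA/dx^j at x = x_i (approximated here with finite differences)."""
    mu_i = 1.0 - abs(A(x_i) - f_i) / c           # accurate prediction -> mu_i near 1
    derivs = np.zeros_like(x_i)
    for j in range(len(x_i)):
        dx = np.zeros_like(x_i)
        dx[j] = eps
        derivs[j] = (A(x_i + dx) - A(x_i - dx)) / (2 * eps)
    return mu_i, derivs                          # passed to the TANGENTPROP-style update
```

Examples the domain theory explains poorly get a small μ_i, so for them the fit is driven almost entirely by the inductive (value-error) term.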
Remarks
• EBNN vs. Symbolic Explanation-Based Learning
  • the domain theory consists of neural networks rather than Horn clauses
  • relevant dependencies take the form of derivatives
  • accommodates imperfect domain theories
  • learns a fixed-size neural network
    • requires constant time to classify new instances
    • may be unable to represent sufficiently complex functions
Using Prior Knowledge to Augment Search Operators
• The FOCL Algorithm
  • Two operators for generating candidate specializations
    1. Add a single new literal.
    2. Add a set of literals that constitute logically sufficient conditions for the target concept, according to the domain theory.
      • Select one of the domain theory clauses whose head matches the target concept.
      • Unfolding: each nonoperational literal is replaced, until the sufficient conditions have been restated in terms of operational literals.
      • Pruning: each literal is removed unless its removal reduces classification accuracy over the training examples.
  • FOCL selects among all these candidate specializations based on their performance over the data.
    • the domain theory is used in a fashion that biases the learner
    • final search choices are still made based on performance over the training data
(Figures 12.8, p. 358 and 12.9, p. 361)
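To make the two specialization operators concrete, here is a hedged Python sketch; the representation (a dict mapping each nonoperational literal to a single clause body) and the helper names `unfold`, `accuracy`, and `candidate_specializations` are simplifying assumptions for illustration, not FOCL's actual data structures.

```python
def unfold(literal, theory, operational):
    """Rewrite a literal into operational literals using the domain theory clauses."""
    if literal in operational:
        return [literal]
    expanded = []
    for body_literal in theory[literal]:          # clause whose head matches the literal
        expanded += unfold(body_literal, theory, operational)
    return expanded

def candidate_specializations(rule_body, target, theory, operational, literals, accuracy):
    """FOCL's two operators for specializing the current rule body."""
    # Operator 1: add any single new literal
    candidates = [rule_body + [lit] for lit in literals if lit not in rule_body]
    # Operator 2: add the domain theory's sufficient conditions, unfolded and pruned
    sufficient = unfold(target, theory, operational)
    for lit in list(sufficient):
        trimmed = [l for l in sufficient if l != lit]
        # prune the literal unless removing it reduces accuracy over the training data
        if accuracy(rule_body + trimmed) >= accuracy(rule_body + sufficient):
            sufficient = trimmed
    candidates.append(rule_body + sufficient)
    return candidates   # FOCL then chooses among these based on performance over the data
```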