
Combining Inductive and Analytical Learning



  1. Combining Inductive and Analytical Learning
     Ch. 12 in Machine Learning, Tom M. Mitchell
     Natural Language Processing Lab, Korea University
     한 경 수, 1999. 7. 9.

  2. Contents
  • Motivation
  • Inductive-Analytical Approaches to Learning
  • Using Prior Knowledge to Initialize the Hypothesis
    • The KBANN Algorithm
  • Using Prior Knowledge to Alter the Search Objective
    • The TANGENTPROP Algorithm
    • The EBNN Algorithm
  • Using Prior Knowledge to Augment Search Operators
    • The FOCL Algorithm

  3. Motivation (1/2)
  • Inductive vs. Analytical Learning
                      Inductive Learning                Analytical Learning
    Goal:             Hypothesis fits data              Hypothesis fits domain theory
    Justification:    Statistical inference             Deductive inference
    Advantages:       Requires little prior knowledge   Learns from scarce data
    Pitfalls:         Scarce data, incorrect bias       Imperfect domain theory
  • A spectrum of learning tasks
    • Most practical learning problems lie somewhere between these two extremes of the spectrum.

  4. Motivation (2/2)
  • What kinds of learning algorithms can we devise that make use of approximate prior knowledge, together with available data, to form general hypotheses?
    • domain-independent algorithms that employ explicitly input domain-dependent knowledge
  • Desirable properties
    • no domain theory → learn as well as inductive methods
    • perfect domain theory → learn as well as analytical methods
    • imperfect domain theory & imperfect training data → combine the two to outperform either inductive or analytical methods alone
    • accommodate arbitrary and unknown errors in the domain theory
    • accommodate arbitrary and unknown errors in the training data

  5. The Learning Problem
  • Given:
    • a set of training examples D, possibly containing errors
    • a domain theory B, possibly containing errors
    • a space of candidate hypotheses H
  • Determine:
    • a hypothesis that best fits both the training examples and the domain theory

  6. Hypothesis Space Search
  • Learning as a task of searching through a hypothesis space
    • hypothesis space H, initial hypothesis
    • the set of search operators O: define individual search steps
    • the goal criterion G: specifies the search objective
  • Methods for using prior knowledge: use prior knowledge to
    • derive an initial hypothesis from which to begin the search
    • alter the objective G of the hypothesis space search
    • alter the available search steps O
    (a minimal sketch of this framing follows below)
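The following is a minimal Python sketch of the search framing on this slide, not code from the chapter: a learner searches a hypothesis space H from an initial hypothesis, applying search operators O, guided by a goal criterion G. The class and function names (SearchProblem, greedy_search) are illustrative assumptions; the three slots correspond to the three ways of injecting prior knowledge listed above.

```python
# Illustrative framing only: each of the three methods in this chapter plugs
# prior knowledge into exactly one slot of this search problem.
from dataclasses import dataclass
from typing import Any, Callable, List

Hypothesis = Any  # e.g. a rule set or a neural network

@dataclass
class SearchProblem:
    initial: Hypothesis                                        # initialize-the-hypothesis methods alter this slot
    operators: List[Callable[[Hypothesis], List[Hypothesis]]]  # augment-search-operators methods alter this slot
    goal: Callable[[Hypothesis], float]                        # alter-the-objective methods alter this slot

def greedy_search(problem: SearchProblem, steps: int = 10) -> Hypothesis:
    """Generic greedy hill-climbing over the hypothesis space."""
    h = problem.initial
    for _ in range(steps):
        candidates = [h2 for op in problem.operators for h2 in op(h)]
        if not candidates:
            break
        best = min(candidates, key=problem.goal)  # lower goal score = better fit
        if problem.goal(best) >= problem.goal(h):
            break
        h = best
    return h
```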

  7. Using Prior Knowledge to Initialize the Hypothesis
  • Two steps
    1. initialize the hypothesis to perfectly fit the domain theory
    2. inductively refine this initial hypothesis as needed to fit the training data
  • KBANN (Knowledge-Based Artificial Neural Network)
    1. Analytical step: create an initial network equivalent to the domain theory
    2. Inductive step: refine the initial network (using BACKPROPAGATION)
  • Given:
    • a set of training examples
    • a domain theory consisting of nonrecursive, propositional Horn clauses
  • Determine:
    • an artificial neural network that fits the training examples, biased by the domain theory
  • See Table 12.2 (p. 341); a sketch of the analytical step follows below.
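Below is a hedged Python sketch of KBANN's analytical step: each nonrecursive, propositional Horn clause becomes a sigmoid unit whose weights are W for unnegated antecedents, -W for negated ones, and whose bias is -(n - 0.5)W (n = number of unnegated antecedents), so the unit behaves like an AND gate over its antecedents. The data layout (clauses as lists of positive/negative literal names) and function names are my own conventions, not the book's.

```python
# Minimal sketch of turning one Horn clause into an equivalent sigmoid unit.
import numpy as np

W = 4.0  # large weight so the sigmoid unit approximates a hard AND gate

def clause_to_unit(positives, negatives, feature_index):
    """Build weights and bias for one unit equivalent to a single Horn clause."""
    weights = np.zeros(len(feature_index))
    for lit in positives:
        weights[feature_index[lit]] = W
    for lit in negatives:
        weights[feature_index[lit]] = -W
    bias = -(len(positives) - 0.5) * W  # unit fires only if every antecedent holds
    return weights, bias

def unit_output(weights, bias, x):
    return 1.0 / (1.0 + np.exp(-(weights @ x + bias)))

# Example clause from the cup domain theory: Cup :- Stable, Liftable, OpenVessel
features = {"Stable": 0, "Liftable": 1, "OpenVessel": 2}
w, b = clause_to_unit(["Stable", "Liftable", "OpenVessel"], [], features)
print(unit_output(w, b, np.array([1.0, 1.0, 1.0])))  # high (~0.88): all antecedents true
print(unit_output(w, b, np.array([1.0, 0.0, 1.0])))  # low (~0.12): one antecedent false
```

The inductive step then adds near-zero-weight connections between the remaining units and refines all weights with backpropagation, so the domain theory acts as a starting point rather than a constraint.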

  8. Example: The Cup Learning Task
  [Figures: the neural network equivalent to the domain theory, and the result of inductively refining that network]

  9. Remarks
  • KBANN vs. Backpropagation
    • when given an approximately correct domain theory and scarce training data, KBANN generalizes more accurately than Backpropagation
    • classifying promoter regions in DNA
      • Backpropagation: error rate 8/106
      • KBANN: error rate 4/106
  • Bias
    • KBANN: domain-specific theory
    • Backpropagation: domain-independent syntactic bias toward small weight values

  10. Using Prior Knowledge to Alter the Search Objective
  • Use of prior knowledge
    • incorporate it into the error criterion minimized by gradient descent
    • the network must fit a combined function of the training data and the domain theory
  • Form of prior knowledge
    • derivatives of the target function
    • certain types of prior knowledge can be expressed quite naturally in this form
    • example: recognizing handwritten characters
      • "the identity of the character is independent of small translations and rotations of the image."

  11. The TANGENTPROP Algorithm
  • Domain knowledge
    • expressed as derivatives of the target function with respect to transformations of its inputs
  • Training derivatives
    • TANGENTPROP assumes various training derivatives of the target function are provided
  • Error function
    E = \sum_i \Big[ \big(f(x_i) - \hat{f}(x_i)\big)^2 + \mu \sum_j \Big( \frac{\partial \hat{f}(s_j(\alpha, x_i))}{\partial \alpha} - \frac{\partial f(s_j(\alpha, x_i))}{\partial \alpha} \Big)^2 \Big|_{\alpha = 0} \Big]
    • s_j(\alpha, x): a transformation (e.g., rotation or translation) of the input, parameterized by \alpha
    • \mu: a constant that determines the relative importance of fitting training values vs. fitting training derivatives
  • See Table 12.4 (p. 349); a numeric sketch of this objective follows below.
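The snippet below is a minimal numeric sketch of the TANGENTPROP objective above, not the book's implementation: squared error on training values plus \mu times squared error on the directional derivatives along each transformation s_j(\alpha, x) at \alpha = 0. The network f_hat, the transformations, and the target derivatives are placeholders supplied by the caller; the finite-difference estimate stands in for the analytic derivatives a real implementation would propagate.

```python
# Sketch of the TangentProp error: value error + mu * derivative error.
def tangentprop_error(f_hat, examples, transforms, target_derivs, mu=0.1, eps=1e-4):
    """examples: list of (x, f_x); transforms: list of callables s_j(alpha, x);
    target_derivs[i][j]: known derivative of the target f along s_j at x_i."""
    total = 0.0
    for i, (x, f_x) in enumerate(examples):
        total += (f_x - f_hat(x)) ** 2
        for j, s in enumerate(transforms):
            # finite-difference estimate of d f_hat(s(alpha, x)) / d alpha at alpha = 0
            dfhat = (f_hat(s(eps, x)) - f_hat(s(-eps, x))) / (2 * eps)
            total += mu * (target_derivs[i][j] - dfhat) ** 2
    return total

# Usage idea: if character identity is invariant to small rotations, the prior
# knowledge supplies target derivatives of 0 along the rotation transform.
```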

  12. Remarks
  • TANGENTPROP combines the prior knowledge with observed training data by minimizing an objective function that measures both
    • the network's error with respect to the training example values
    • the network's error with respect to the desired derivatives
  • TANGENTPROP is not robust to errors in the prior knowledge
    • the weight \mu needs to be selected automatically → the EBNN algorithm

  13. The EBNN Algorithm (1/2)
  • Input
    • a set of training examples of the form <x_i, f(x_i)>
    • a domain theory represented by a set of previously trained neural networks
  • Output
    • a new neural network that approximates the target function f
  • Algorithm
    1. create a new, fully connected feedforward network to represent the target function
    2. for each training example, determine the corresponding training derivatives
    3. use the TANGENTPROP algorithm to train the target network

  14. The EBNN Algorithm (2/2)
  • Computation of training derivatives
    • EBNN computes them itself for each observed training example
      • explain each training example in terms of the given domain theory
      • extract training derivatives from this explanation
    • the derivatives provide important information for distinguishing relevant from irrelevant features
  • How to weight the relative importance of the inductive and analytical components of learning
    • \mu_i is chosen independently for each training example
    • consider how accurately the domain theory predicts the training value for this particular example
  • Error function (see Figure 12.7, p. 353)
    E = \sum_i \Big[ \big(f(x_i) - \hat{f}(x_i)\big)^2 + \mu_i \sum_j \Big( \frac{\partial A(x)}{\partial x^j} - \frac{\partial \hat{f}(x)}{\partial x^j} \Big)^2 \Big|_{x = x_i} \Big], \quad \mu_i = 1 - \frac{|A(x_i) - f(x_i)|}{c}
    • A(x): domain theory prediction for input x
    • x_i: ith training instance
    • x^j: jth component of the vector x
    • c: normalizing constant
  • A sketch of the derivative extraction and per-example weighting follows below.
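The following is a hedged Python sketch of the two EBNN ingredients above: the domain-theory network A is differentiated with respect to each input component at x_i (finite differences here, for brevity), and \mu_i is scaled down when A mispredicts the observed training value, so inaccurate prior knowledge contributes less to the objective. Function names are illustrative, not from the chapter.

```python
# Sketch of EBNN's training derivatives and per-example weight mu_i.
import numpy as np

def training_derivatives(net, x, eps=1e-4):
    """d net(x) / d x^j for every input component j, estimated numerically."""
    derivs = np.zeros_like(x, dtype=float)
    for j in range(len(x)):
        step = np.zeros_like(x, dtype=float)
        step[j] = eps
        derivs[j] = (net(x + step) - net(x - step)) / (2 * eps)
    return derivs

def example_weight(A, x, f_x, c):
    """mu_i = 1 - |A(x_i) - f(x_i)| / c: trust the domain theory for this example
    in proportion to how well it predicts the observed training value."""
    return 1.0 - abs(A(x) - f_x) / c

def ebnn_error(f_hat, A, examples, c):
    """Combined objective: value error + mu_i * derivative error per example."""
    total = 0.0
    for x, f_x in examples:
        mu_i = example_weight(A, x, f_x, c)
        dA = training_derivatives(A, x)
        dfhat = training_derivatives(f_hat, x)
        total += (f_x - f_hat(x)) ** 2 + mu_i * np.sum((dA - dfhat) ** 2)
    return total
```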

  15. Remarks
  • EBNN vs. symbolic explanation-based learning
    • the domain theory consists of neural networks rather than Horn clauses
    • relevant dependencies take the form of derivatives
    • accommodates imperfect domain theories
    • learns a fixed-size neural network
      • requires constant time to classify new instances
      • but may be unable to represent sufficiently complex functions

  16. Using Prior Knowledge to Augment Search Operators
  • The FOCL Algorithm
  • Two operators for generating candidate specializations
    1. add a single new literal
    2. add a set of literals that constitute logically sufficient conditions for the target concept, according to the domain theory
      • select one of the domain theory clauses whose head matches the target concept
      • unfolding: each nonoperational literal is replaced, until the sufficient conditions have been restated in terms of operational literals
      • pruning: each literal is removed unless its removal reduces classification accuracy over the training examples
  • FOCL selects among all these candidate specializations based on their performance over the data
    • the domain theory is used in a fashion that biases the learner
    • final search choices are still made based on performance over the training data
  • See Figure 12.8 (p. 358) and Figure 12.9 (p. 361); a sketch of the unfolding and pruning steps follows below.
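Below is a simplified Python sketch of FOCL's second operator: unfold one domain-theory clause for the target concept into operational literals, then greedily prune literals whose removal does not hurt training-set accuracy. The theory here is a dict mapping a nonoperational literal to a single body, roughly following the chapter's cup theory; real FOCL handles multiple clauses per head, first-order variables, and bindings, so this is an illustration rather than the algorithm.

```python
# Sketch of FOCL's "unfold then prune" candidate-specialization operator.
def unfold(body, theory, operational):
    """Replace each nonoperational literal by its clause body until only
    operational literals remain (assumes a nonrecursive theory)."""
    result = []
    for lit in body:
        if lit in operational:
            result.append(lit)
        else:
            result.extend(unfold(theory[lit], theory, operational))
    return result

def prune(literals, accuracy):
    """Drop each literal unless dropping it reduces accuracy on the training data."""
    kept = list(literals)
    for lit in list(kept):
        trial = [l for l in kept if l != lit]
        if accuracy(trial) >= accuracy(kept):
            kept = trial
    return kept

# Usage sketch with a cup-like domain theory (literal names are illustrative):
theory = {"Cup": ["Stable", "Liftable", "OpenVessel"],
          "Stable": ["BottomIsFlat"],
          "Liftable": ["Graspable", "Light"],
          "Graspable": ["HasHandle"],
          "OpenVessel": ["HasConcavity", "ConcavityPointsUp"]}
operational = {"BottomIsFlat", "HasHandle", "Light", "HasConcavity", "ConcavityPointsUp"}
print(unfold(theory["Cup"], theory, operational))
# -> ['BottomIsFlat', 'HasHandle', 'Light', 'HasConcavity', 'ConcavityPointsUp']
```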
