
Inductive Learning of Rules



  1. Inductive Learning of Rules

     Spores   Spots   Color   Edible?
     Y        N       Brown   N
     Y        Y       Grey    Y
     N        Y       Black   Y
     N        N       Brown   N
     Y        N       White   N
     Y        Y       Brown   Y
     Y        N       Brown   N
     N        ?       Red     ?

     Don't try this at home...
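
     The table invites exactly this kind of rule induction. Below is a minimal sketch, assuming the column assignment reconstructed above (Spores, Spots, Color, Edible?); the one-attribute-rule scoring scheme is an illustrative choice, not something from the slides.

     ```python
     # Illustrative sketch: induce a one-attribute rule from the mushroom table.
     # The column assignment is a reconstruction of the slide's table.

     examples = [                      # (spores, spots, color), edible?
         (("Y", "N", "Brown"), "N"),
         (("Y", "Y", "Grey"),  "Y"),
         (("N", "Y", "Black"), "Y"),
         (("N", "N", "Brown"), "N"),
         (("Y", "N", "White"), "N"),
         (("Y", "Y", "Brown"), "Y"),
         (("Y", "N", "Brown"), "N"),
     ]
     attributes = ["Spores", "Spots", "Color"]

     for i, name in enumerate(attributes):
         # For each value of the attribute, predict the majority label seen with it.
         by_value = {}
         for features, label in examples:
             by_value.setdefault(features[i], []).append(label)
         rule = {v: max(set(labels), key=labels.count) for v, labels in by_value.items()}
         correct = sum(rule[f[i]] == label for f, label in examples)
         print(f"{name}: {correct}/{len(examples)} correct, rule = {rule}")

     # "Spots" classifies all seven training mushrooms correctly (Spots = Y <=> edible),
     # which is the sort of rule an inductive learner would propose for the red one.
     ```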

  2. Types of Learning
     • What is learning?
       • Improved performance over time/experience
       • Increased knowledge
     • Speedup learning
       • No change to the set of theoretically inferable facts
       • Change to the speed with which the agent can infer them
     • Inductive learning
       • More facts can be inferred

  3. Mature Technology
     • Many applications:
       • Detect fraudulent credit card transactions
       • Information filtering systems that learn user preferences
       • Autonomous vehicles that drive public highways (ALVINN)
       • Decision trees for diagnosing heart attacks
       • Speech synthesis (correct pronunciation) (NETtalk)
       • Data mining: huge datasets, scaling issues

  4. Defining a Learning Problem
     • Experience:
     • Task:
     • Performance measure:
     A program is said to learn from experience E with respect to task T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.

  5. Example: Checkers
     • Task T: playing checkers
     • Performance measure P: percent of games won against opponents
     • Experience E: playing practice games against itself

  6. Example: Handwriting Recognition
     • Task T: recognizing and classifying handwritten words within images
     • Performance measure P:
     • Experience E:

  7. Example: Robot Driving
     • Task T: driving on a public four-lane highway using vision sensors
     • Performance measure P:
     • Experience E:

  8. Example: Speech Recognition
     • Task T: identification of a word sequence from audio recorded from arbitrary speakers ... noise
     • Performance measure P:
     • Experience E:

  9. Issues
     • What feedback (experience) is available?
     • What kind of knowledge is being increased?
     • How is that knowledge represented?
     • What prior information is available?
     • What is the right learning algorithm?
     • How to avoid overfitting?

  10. Choosing the Training Experience
     • Credit assignment problem:
       • Direct training examples: e.g., individual checker boards + the correct move for each
       • Indirect training examples: e.g., a complete sequence of moves and the final result
     • Which examples: random, teacher chooses, learner chooses
     • Supervised learning
     • Reinforcement learning
     • Unsupervised learning

  11. Choosing the Target Function
     • What type of knowledge will be learned?
     • How will the knowledge be used by the performance program?
     • E.g., a checkers program:
       • Assume it knows the legal moves
       • Needs to choose the best move
       • So learn the function F: Boards -> Moves (hard to learn)
       • Alternative: F: Boards -> R (a real-valued evaluation)

  12. The Ideal Evaluation Function
     • V(b) = 100 if b is a final, won board
     • V(b) = -100 if b is a final, lost board
     • V(b) = 0 if b is a final, drawn board
     • Otherwise, if b is not final, V(b) = V(s), where s is the best reachable final board
     This V is nonoperational... we want an operational approximation of V: V̂
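
     To make the slide's point concrete, here is a toy, runnable illustration (my own construction, not from the slide) of why the ideal V is nonoperational: computing it means searching all the way down to final boards.

     ```python
     # Toy game tree (hypothetical): each non-final board maps to its successors,
     # and final boards carry the slide's values (+100 won, -100 lost, 0 drawn).
     successors = {"start": ["a", "b"], "a": ["win", "draw"], "b": ["loss"]}
     outcome = {"win": 100, "loss": -100, "draw": 0}

     def V(board):
         if board in outcome:                     # final board
             return outcome[board]
         # Non-final board: value of the best reachable final board,
         # which requires full lookahead -- hence "nonoperational".
         return max(V(s) for s in successors[board])

     print(V("start"))   # 100, but only because the whole tree was searched
     ```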

  13. How to Represent the Target Function
     • x1 = number of black pieces on the board
     • x2 = number of red pieces on the board
     • x3 = number of black kings on the board
     • x4 = number of red kings on the board
     • x5 = number of black pieces threatened by red
     • x6 = number of red pieces threatened by black
     V(b) = a + b·x1 + c·x2 + d·x3 + e·x4 + f·x5 + g·x6
     Now we just need to learn 7 numbers!
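
     A short, self-contained sketch of this linear evaluation. The Board structure, the sample weights, and the LMS-style update for learning the 7 numbers are illustrative assumptions; the slide itself only gives the six features and the linear form.

     ```python
     from dataclasses import dataclass

     @dataclass
     class Board:
         black_pieces: int          # x1
         red_pieces: int            # x2
         black_kings: int           # x3
         red_kings: int             # x4
         black_threatened: int      # x5: black pieces threatened by red
         red_threatened: int        # x6: red pieces threatened by black

     def features(b: Board):
         return [b.black_pieces, b.red_pieces, b.black_kings,
                 b.red_kings, b.black_threatened, b.red_threatened]

     def v_hat(b: Board, weights):
         """Linear evaluation: weights = [a, b, c, d, e, f, g] (the 7 numbers)."""
         return weights[0] + sum(w * xi for w, xi in zip(weights[1:], features(b)))

     def lms_update(b: Board, target, weights, lr=0.001):
         """One LMS step nudging the 7 weights toward a training value for b
         (a standard way to 'learn the 7 numbers'; not spelled out on the slide)."""
         error = target - v_hat(b, weights)
         x = [1.0] + features(b)
         return [w + lr * error * xi for w, xi in zip(weights, x)]

     # Usage: evaluate a board, then take one training step toward a target value.
     board = Board(black_pieces=12, red_pieces=11, black_kings=1,
                   red_kings=0, black_threatened=2, red_threatened=3)
     weights = [0.0] * 7
     print(v_hat(board, weights))                        # 0.0 before any learning
     weights = lms_update(board, target=25.0, weights=weights)
     print(v_hat(board, weights))                        # 7.0: moved toward 25.0
     ```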

  14. Target Function
     • Profound formulation: can express any type of inductive learning as approximating a function
       • E.g., checkers: V: boards -> evaluation
       • E.g., handwriting recognition: V: image -> word
       • E.g., mushrooms: V: mushroom-attributes -> {E, P}
     • Inductive bias

  15. Theory of Inductive Learning

  16. Theory of Inductive Learning
     • Suppose our examples are drawn with a probability distribution Pr(x), and that we have learned a hypothesis f to describe a concept C.
     • We can define Error(f) to be:
       Error(f) = Σx∈D Pr(x)
       where D is the set of all examples on which f and C disagree.
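
     To make the definition concrete, here is a small runnable sketch (my construction, with a toy domain and distribution) that computes Error(f) exactly as the probability mass of the disagreement set D, and also estimates it by sampling from Pr(x).

     ```python
     import random

     # Toy domain {0,...,9} with a non-uniform distribution Pr(x).
     domain = list(range(10))
     weights = [0.25, 0.15, 0.15, 0.10, 0.10, 0.05, 0.05, 0.05, 0.05, 0.05]

     def C(x): return x < 5        # the true concept
     def f(x): return x < 6        # learned hypothesis; disagrees with C only at x = 5

     # Exact value: Error(f) = sum of Pr(x) over the disagreement set D.
     exact_error = sum(p for x, p in zip(domain, weights) if f(x) != C(x))

     # Estimate by drawing examples from Pr(x) and counting disagreements.
     samples = random.choices(domain, weights=weights, k=100_000)
     estimated_error = sum(f(x) != C(x) for x in samples) / len(samples)

     print(exact_error)        # 0.05
     print(estimated_error)    # close to 0.05
     ```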

  17. PAC Learning
     • We're not perfect (in more than one way), so why should our programs be perfect?
     • What we want is: Error(f) < ε, for some chosen ε.
     • But sometimes we're completely clueless (hopefully with low probability), so what we really want is:
       Prob(Error(f) > ε) < δ.
     • As the number of examples grows, ε and δ should decrease.
     • We call this Probably Approximately Correct (PAC) learning.

  18. Definition of PAC Learnability
     • Let C be a class of concepts.
     • We say that C is PAC learnable by a hypothesis space H if:
       • there is a polynomial-time algorithm A and a polynomial function p,
       • such that for every concept c in C, every probability distribution Pr, and every ε and δ,
       • if A is given at least p(1/ε, 1/δ) examples,
       • then A returns, with probability 1 - δ, a hypothesis whose error is less than ε.
     • k-DNF and k-CNF are PAC learnable.
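
     The slide leaves p(1/ε, 1/δ) abstract. One standard instance (not stated on the slide) is the bound for a consistent learner over a finite hypothesis space H: m ≥ (1/ε)(ln|H| + ln(1/δ)) examples suffice. The sketch below just evaluates that bound for boolean conjunctions, whose hypothesis count 3^n is assumed as the usual illustration.

     ```python
     # Standard sample-size bound for a consistent learner over a finite
     # hypothesis space H (one common instantiation of p; not from the slide).
     from math import log, ceil

     def sample_bound(h_size, epsilon, delta):
         """Examples sufficient so that, with probability >= 1 - delta, every
         hypothesis in H consistent with them has error < epsilon."""
         return ceil((log(h_size) + log(1.0 / delta)) / epsilon)

     # Conjunctions over n boolean attributes: each attribute is required true,
     # required false, or ignored, so |H| = 3**n.
     n = 10
     print(sample_bound(3 ** n, epsilon=0.1, delta=0.05))   # 140 examples
     ```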

  19. Version Spaces: A Learning Algorithm
     • Key idea: maintain the most specific and most general hypotheses at every point, and update them as examples come in.
     • We describe objects in the space by attributes:
       • faculty, staff, student
       • 20's, 30's, 40's
       • male, female
     • Concepts: boolean combinations of attribute-values, e.g.:
       • faculty, 30's, male
       • female, 20's

  20. Generalization and Specialization
     • A concept C1 is more general than C2 if it describes a superset of the objects:
       • C1 = {20's, faculty} is more general than C2 = {20's, faculty, female}.
       • C2 is a specialization of C1.
     • Immediate specializations (generalizations).
     • The version space algorithm maintains the most specific and most general boundaries at every point of the learning.

  21. Example
     [Generalization lattice: the most general concept T at the top; below it the single-attribute concepts (faculty, student, male, female, 20's, 30's); then two-attribute conjunctions (male,fac; male,stud; female,fac; female,stud; fac,20's; fac,30's); then three-attribute conjunctions (male,fac,20; male,fac,30; fem,fac,20; male,stud,30).]
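
     To tie slides 19-21 together, here is a simplified, runnable sketch of the boundary-maintenance idea (a stripped-down candidate-elimination loop, with some of the full algorithm's bookkeeping omitted). The "?" wildcard notation, the helper names, and the example sequence are my own illustrative choices.

     ```python
     # Simplified version-space sketch for conjunctive concepts over the slide's
     # attributes (role, age, sex).  '?' means "any value".  Pruning of redundant
     # members of G is omitted, so treat this as an illustration of the S/G idea.

     WILDCARD = "?"

     def covers(hypothesis, example):
         """True if every non-wildcard constraint in the hypothesis matches."""
         return all(h == WILDCARD or h == e for h, e in zip(hypothesis, example))

     def generalize(s, example):
         """Minimally generalize S to cover a positive example."""
         return tuple(sv if sv == ev else WILDCARD for sv, ev in zip(s, example))

     def specializations(g, example, domains):
         """Minimal specializations of g that exclude a negative example."""
         out = []
         for i, gv in enumerate(g):
             if gv == WILDCARD:
                 for value in domains[i]:
                     if value != example[i]:
                         out.append(g[:i] + (value,) + g[i + 1:])
         return out

     def candidate_elimination(labeled_examples, domains):
         """Maintain the most specific (S) and most general (G) boundaries."""
         S = None                                  # set by the first positive example
         G = [tuple([WILDCARD] * len(domains))]
         for x, positive in labeled_examples:
             if positive:
                 G = [g for g in G if covers(g, x)]
                 S = x if S is None else generalize(S, x)
             else:
                 new_G = []
                 for g in G:
                     if covers(g, x):
                         new_G.extend(h for h in specializations(g, x, domains)
                                      if S is None or covers(h, S))
                     else:
                         new_G.append(g)
                 G = new_G
         return S, G

     # Usage with the attribute values from slide 19.
     domains = [("faculty", "staff", "student"),
                ("20's", "30's", "40's"),
                ("male", "female")]
     labeled_examples = [(("faculty", "30's", "male"), True),
                         (("student", "20's", "female"), False),
                         (("faculty", "30's", "female"), True)]
     S, G = candidate_elimination(labeled_examples, domains)
     print("S:", S)   # ('faculty', "30's", '?')
     print("G:", G)   # [('faculty', '?', '?'), ('?', "30's", '?')]
     ```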
