
Methods for Learning Rule Sets in Computer Science

An overview of methods for learning sets of rules: the sequential covering algorithm, genetic algorithms, first-order rule learning with FOIL, and induction as inverted deduction.


Presentation Transcript


  1. Chap. 10 Learning Sets of Rules 박성배, Department of Computer Science, Seoul National University

  2. Learning Disjunctive Sets of Rules • Method 1: Learn a Decision Tree, then Translate the Tree into Rules • Method 2: Genetic Algorithm • Method 3: Learn Rule Sets Directly with the Sequential Covering Algorithm

  3. Sequential Covering Algorithm (1) • SEQUENTIAL-COVERING(Target_attribute, Attributes, Examples, Threshold) • Learned_rules ← {} • Rule ← LEARN-ONE-RULE(Target_attribute, Attributes, Examples) • while PERFORMANCE(Rule, Examples) > Threshold, do • Learned_rules ← Learned_rules + Rule • Examples ← Examples - {examples correctly classified by Rule} • Rule ← LEARN-ONE-RULE(Target_attribute, Attributes, Examples) • Learned_rules ← sort Learned_rules according to PERFORMANCE over Examples • return Learned_rules
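
A minimal Python sketch of this loop follows. The callables learn_one_rule, performance, and covers_correctly are hypothetical stand-ins for the subroutines named above; any concrete representation of rules and examples would do.

    def sequential_covering(target_attribute, attributes, examples,
                            threshold, learn_one_rule, performance,
                            covers_correctly):
        # Greedily accumulate rules until no new rule performs well enough.
        learned_rules = []
        rule = learn_one_rule(target_attribute, attributes, examples)
        while performance(rule, examples) > threshold:
            learned_rules.append(rule)
            # Remove the examples the new rule already classifies correctly.
            examples = [e for e in examples if not covers_correctly(rule, e)]
            rule = learn_one_rule(target_attribute, attributes, examples)
        # Sort so that better-performing rules are consulted first.
        learned_rules.sort(key=lambda r: performance(r, examples), reverse=True)
        return learned_rules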

  4. Sequential Covering Algorithm (2) 1. Learn one rule with high accuracy, any coverage 2. Remove positive examples covered by this rule 3. Repeat • Greedy Search • No Guarantee of the Smallest or Best Set of Rules

  5. Learn-One-Rule (1) • General to Specific Search • Greedy Depth-First Search • No Backtracking • Begin with the Most General Rule • Greedily Add the Attribute Test that Most Improves Rule Performance • High Accuracy, Possibly Incomplete Coverage
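
A sketch of that greedy loop under simple assumptions: an example is a dict of attribute values plus a boolean target field, and a rule is a dict mapping attributes to required values (the empty dict is the most general rule, covering everything).

    def rule_covers(rule, example):
        # A rule covers an example when every attribute test matches.
        return all(example.get(a) == v for a, v in rule.items())

    def rule_accuracy(rule, examples, target):
        # Fraction of covered examples that are positive for the target.
        covered = [e for e in examples if rule_covers(rule, e)]
        return sum(e[target] for e in covered) / len(covered) if covered else 0.0

    def learn_one_rule(target, attributes, examples):
        # Begin with the most general rule; greedily add the one attribute
        # test that most improves accuracy; stop when nothing improves.
        rule = {}
        while True:
            tests = {(a, e[a]) for e in examples
                     for a in attributes if a not in rule}
            if not tests:
                break
            best = max(tests, key=lambda t: rule_accuracy(
                {**rule, t[0]: t[1]}, examples, target))
            current = rule_accuracy(rule, examples, target)
            extended = rule_accuracy({**rule, best[0]: best[1]}, examples, target)
            if extended <= current:
                break  # no backtracking: a non-improving step ends the search
            rule[best[0]] = best[1]
        return rule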

  6. Learn-One-Rule (2)

  7. General to Specific Beam Search • To Reduce the Risk of a Suboptimal Greedy Choice • Maintain a List of the k Best Candidates at Each Step
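
The beam-search variant as a sketch, reusing rule_accuracy from the sketch above; the beam width k is the only new parameter.

    def learn_one_rule_beam(target, attributes, examples, k=5):
        # Track the k best candidate rules instead of a single greedy one,
        # reducing the risk of committing early to a suboptimal test.
        frontier = [{}]               # start from the most general rule
        best = {}
        while frontier:
            successors = [{**r, a: e[a]} for r in frontier
                          for e in examples
                          for a in attributes if a not in r]
            # Deduplicate, then keep only the k highest-scoring candidates.
            unique = {tuple(sorted(r.items())): r for r in successors}.values()
            frontier = sorted(unique, key=lambda r: rule_accuracy(
                r, examples, target), reverse=True)[:k]
            if frontier and (rule_accuracy(frontier[0], examples, target)
                             > rule_accuracy(best, examples, target)):
                best = dict(frontier[0])
            else:
                break  # the beam no longer improves on the best rule found
        return best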

  8. Learning Rule Sets (1) • Sequential Covering Algorithm • Learns One Rule at a Time • Partitions Data by Attribute-Value Pair • ID3 • Learns the Entire Set of Disjunctive Rules Simultaneously • Partitions Data by Attribute • Sequential covering makes many more independent choices, so when data is plentiful it tends to be the better fit.

  9. Learning Rule Sets (2) • Sequential Covering Algorithm • General to Specific Search • Maintains a Single Maximally General Hypothesis • Generate-then-Test Search • Robust: the Impact of Noisy Data is Minimized • Find-S Algorithm • Specific to General Search • Example-Driven, so More Easily Misled by a Single Noisy Example

  10. Learning Rule Sets (3) • Rule Post-Pruning, as for Decision Trees • Choices for Rule PERFORMANCE • Relative Frequency • m-Estimate of Accuracy • Entropy (the Basis of Information Gain)
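
Sketches of the three measures, written over counts only: n_pos positives among the n_cov examples a rule covers; for the m-estimate, p is the prior probability of the positive class and m the equivalent sample size.

    import math

    def relative_frequency(n_pos, n_cov):
        # n_c / n: fraction of covered examples classified correctly.
        return n_pos / n_cov

    def m_estimate(n_pos, n_cov, p, m):
        # (n_c + m*p) / (n + m): shrinks the estimate toward the prior p,
        # which matters when the rule covers only a few examples.
        return (n_pos + m * p) / (n_cov + m)

    def entropy(n_pos, n_cov):
        # Entropy of the class distribution over the covered examples
        # (two classes); lower entropy means a purer rule.
        fractions = (n_pos / n_cov, (n_cov - n_pos) / n_cov)
        return -sum(f * math.log2(f) for f in fractions if f > 0)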

  11. Learning First-Order Rules • Motivation for First-Order Rules • More Expressive than Propositional Rules • Inductive Logic Programming (ILP) • Inductive Learning of First-Order Rules ≈ Automatically Inferring PROLOG Programs • First-Order Horn Clauses • Horn Clause: a Clause Containing at Most One Positive Literal • H ∨ ¬L1 ∨ … ∨ ¬Ln • Equivalently, H ← (L1 ∧ … ∧ Ln)

  12. Learning Sets of First-Order Rules: FOIL • FOIL • A Natural Extension of SEQUENTIAL-COVERING and LEARN-ONE-RULE • Literals May Not Contain Function Symbols (Reduces the Hypothesis Space) • Literals in the Rule Body May Be Negated (More Expressive than Horn Clauses)

  13. FOIL (1)

  14. FOIL (2) • Seek Rules that Predict When the Target Literal is True • Hill-Climbing Search • Outer Loop: Generalize the Current Disjunctive Hypothesis by Adding a New Rule (Specific to General Search) • Inner Loop: Search a Hypothesis Space of Conjunctions of Literals (General to Specific, Hill-Climbing Search)
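
A structural sketch of the two loops in Python. Every helper here is an assumed stand-in: make_most_general_rule builds a rule with an empty body, new_literals generates the candidate specializations of slide 15, foil_gain is the measure of slide 18, and rule.covers tests coverage.

    def foil(target_predicate, predicates, pos, neg):
        # Outer loop (specific to general): keep adding rules until every
        # positive example is covered by some rule.
        rules = []
        while pos:
            rule = make_most_general_rule(target_predicate)
            remaining_neg = list(neg)
            # Inner loop (general to specific): add the literal with the
            # highest gain until no negative example is still covered.
            while remaining_neg:
                candidates = new_literals(rule, predicates)
                best = max(candidates,
                           key=lambda lit: foil_gain(lit, rule, pos, remaining_neg))
                rule.body.append(best)
                remaining_neg = [e for e in remaining_neg if rule.covers(e)]
            rules.append(rule)
            pos = [e for e in pos if not rule.covers(e)]
        return rules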

  15. Generating Candidate Specializations in FOIL (1) • Suppose the Current Rule is P(x1, x2, …, xk) ← L1 ∧ … ∧ Ln • Add a New Literal Ln+1 that Fits One of the Following Forms: • Q(v1, …, vr) • Q: a Predicate Name Occurring in Predicates • Each vi: a New Variable or a Variable Already Present in the Rule • At Least One vi Must Already Exist in the Current Rule • Equal(xj, xk), where xj and xk are Variables Already Present in the Rule • The Negation of Either of the Above Forms

  16. Generating Candidate Specializations in FOIL (2) • Example • Begin with the Most General Rule • GrandDaughter(x, y) ← • Generate the Following Candidate Literals • Equal(x, y), Female(x), Female(y), Father(x, y), Father(x, z), Father(z, x), Father(y, z), Father(z, y), and the Negations of These Literals • Suppose Father(y, z) is the Most Promising • GrandDaughter(x, y) ← Father(y, z) • Iterate • GrandDaughter(x, y) ← Father(y, z) ∧ Father(z, x) ∧ Female(y)
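
A sketch of candidate generation for this example. Literals are modeled as (predicate, args) tuples, predicates maps names to arities, and a single fresh variable 'z' is assumed; the function returns a slight superset of the slide's list, and FOIL would also consider each literal's negation.

    from itertools import product

    def candidate_literals(rule_vars, predicates, new_var='z'):
        variables = rule_vars + [new_var]
        literals = []
        for name, arity in predicates.items():
            for args in product(variables, repeat=arity):
                # At least one argument must already occur in the rule.
                if any(a in rule_vars for a in args):
                    literals.append((name, args))
        # Equality tests between variables already in the rule.
        literals += [('Equal', (a, b))
                     for a in rule_vars for b in rule_vars if a < b]
        return literals

    # candidate_literals(['x', 'y'], {'Female': 1, 'Father': 2}) includes
    # Female(x), Female(y), Father(x, y), Father(x, z), Father(z, x),
    # Father(y, z), Father(z, y), and Equal(x, y).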

  17. Guiding Search in FOIL • To Select the Most Promising Literal • Consider the Performance of the Rule over the Training Data • Consider All Possible Bindings of Each Variable

  18. Guiding Search in FOIL • Information Gain in FOIL: FoilGain(L, R) = t · ( log2(p1 / (p1 + n1)) − log2(p0 / (p0 + n0)) ) where • L is the candidate literal to add to rule R • p0 = number of positive bindings of R • n0 = number of negative bindings of R • p1 = number of positive bindings of R + L • n1 = number of negative bindings of R + L • t = number of positive bindings of R also covered by R + L • Interpretation: the Reduction in the Number of Bits Needed to Encode the Positive Bindings, due to L
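
The same measure as a small function over binding counts (unlike the structural sketch after slide 14, which assumed a wrapper computing these counts from the data):

    import math

    def foil_gain(p0, n0, p1, n1, t):
        # t * (log2(p1/(p1+n1)) - log2(p0/(p0+n0))): the reduction in bits
        # needed to encode the t positive bindings still covered after
        # adding the candidate literal L to rule R.
        before = math.log2(p0 / (p0 + n0))
        after = math.log2(p1 / (p1 + n1))
        return t * (after - before)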

  19. Induction As Inverted Deduction (1) • Induction is Finding h such that (∀ ⟨xi, f(xi)⟩ ∈ D) (B ∧ h ∧ xi) ⊢ f(xi) where • xi is the ith training instance • f(xi) is the target function value for xi • B is other background knowledge

  20. Induction As Inverted Deduction (2) • Designing Inverse Entailment Operators O(B, D) = h such that (∀ ⟨xi, f(xi)⟩ ∈ D) (B ∧ h ∧ xi) ⊢ f(xi) • Minimum Description Length Principle • Used to Choose Among the Many Hypotheses Satisfying the Constraint • Practical Difficulties • Does Not Allow Noisy Training Data • The Number of Hypotheses Satisfying the Constraint is Very Large • The Complexity of the Hypothesis Space Increases as B Grows

  21. Deduction: Resolution Rule • From P ∨ L and ¬L ∨ R, conclude P ∨ R 1. Given initial clauses C1 and C2, find a literal L from clause C1 such that ¬L occurs in clause C2. 2. Form the resolvent C by including all literals from C1 and C2, except for L and ¬L: C = (C1 - {L}) ∪ (C2 - {¬L})
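
A sketch of the propositional step, with a clause represented as a frozenset of string literals and a leading '~' marking negation:

    def negate(lit):
        # '~L' <-> 'L'
        return lit[1:] if lit.startswith('~') else '~' + lit

    def resolve(c1, c2):
        # Return every resolvent of the two clauses.
        return [(c1 - {lit}) | (c2 - {negate(lit)})
                for lit in c1 if negate(lit) in c2]

    # resolve(frozenset({'P', 'L'}), frozenset({'~L', 'R'}))
    #   -> [frozenset({'P', 'R'})]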

  22. Inverse Resolution Operator • Not Deterministic • Multiple C2 Exist Such that C1 and C2 Produce C • Prefer the Shorter One 1. Given the initial clause C1 and the resolvent C, find a literal L that occurs in C1 but not in C. 2. Form the second clause C2 by including the following literals: C2 = (C - (C1 - {L})) ∪ {¬L}
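
The inverse operator in the same representation; for each literal of C1 absent from the resolvent C, it proposes the corresponding shortest candidate C2:

    def inverse_resolve(c, c1):
        # Propose clauses c2 such that resolving c1 with c2 yields c.
        return [(c - (c1 - {lit})) | {negate(lit)}
                for lit in c1 if lit not in c]

    # inverse_resolve(frozenset({'P', 'R'}), frozenset({'P', 'L'}))
    #   -> [frozenset({'R', '~L'})]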

  23. Rule-Learning Algorithm Based on Inverse Entailment Operators • Use the Sequential Covering Algorithm 1. Select a Training Example ⟨xi, f(xi)⟩ Not Yet Covered 2. Apply Inverse Resolution to Generate Candidate Hypotheses hi that Satisfy (B ∧ hi ∧ xi) ⊢ f(xi) 3. Iterate

  24. First-Order Resolution • θ is a Unifying Substitution for Two Literals L1 and L2 if L1θ = L2θ 1. Find a literal L1 from C1, a literal L2 from C2, and a substitution θ such that L1θ = ¬L2θ. 2. Form the resolvent C by including all literals from C1θ and C2θ, except for L1θ and ¬L2θ: C = (C1 - {L1})θ ∪ (C2 - {L2})θ
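
A small helper showing how a substitution θ acts on a literal, with literals as (predicate, args) tuples and θ a dict from variable to term (the representation is an assumption):

    def substitute(literal, theta):
        # Replace each variable in the argument list according to theta.
        name, args = literal
        return (name, tuple(theta.get(a, a) for a in args))

    # substitute(('Father', ('x', 'Tom')), {'x': 'Shannon'})
    #   -> ('Father', ('Shannon', 'Tom'))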

  25. Inverting First-Order Resolution (1) • C = (C1 - {L1})θ1 ∪ (C2 - {L2})θ2, where θ = θ1θ2 • By Definition, L2 = ¬L1θ1θ2⁻¹, so • C2 = (C - (C1 - {L1})θ1)θ2⁻¹ ∪ {¬L1θ1θ2⁻¹}

  26. Inverting First-Order Resolution (2) • Training Data D = {GrandChild(Bob, Shannon)}, Background Info B = {Father(Shannon, Tom), Father(Tom, Bob)}
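
One possible derivation, applying the inverse operator from the previous slide twice (the inverse substitutions chosen here are one admissible choice):

Step 1. Take C = GrandChild(Bob, Shannon) and C1 = Father(Shannon, Tom). With inverse substitution {Shannon/x}, the operator yields C2 = GrandChild(Bob, x) ∨ ¬Father(x, Tom).

Step 2. Take that C2 as the new resolvent C, with C1 = Father(Tom, Bob). With inverse substitutions {Bob/y, Tom/z}, the operator yields GrandChild(y, x) ∨ ¬Father(x, z) ∨ ¬Father(z, y), i.e. the Horn rule GrandChild(y, x) ← Father(x, z) ∧ Father(z, y).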
