Mining the Most Interesting Rules

Presentation Transcript


  1. Mining the Most Interesting Rules • Roberto J. Bayardo Jr., Rakesh Agrawal • Presented by: Mohamed G. Elfeky

  2. Introduction • Algorithms for mining rules: • Constraint-based • Heuristic (Predictive rules) • Interestingness-metric • Several interestingness metrics: • confidence, support, laplace, gain, conviction

  3. Generic Problem Statement • The rule: A → C • The input is a tuple (U, D, ≤, C, N): • U is a set of conditions for the rule antecedent. • D is a data-set. • ≤ is a total order on rules. • C is a condition for the rule consequent. • N is a set of constraints on rules.

  4. Optimized Rule Mining • Find a set A1 ⊆ U such that: • A1 satisfies N, • ¬∃ A2 ⊆ U: A2 satisfies N ∧ A1 < A2. • Any rule A → C whose antecedent A is such a set A1 is optimal. • In general, this is an NP-hard problem.
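
The problem statement above can be made concrete with a brute-force sketch on a toy instance. Everything specific here is an assumption for illustration and does not come from the slides: the five records in `D`, the condition names, the constraint set N (non-empty antecedent, support of at least 0.2), and the total order (confidence first, support as tie-breaker). The enumeration is exponential in |U|, which is consistent with the problem being NP-hard in general.

```python
from itertools import combinations

# Hypothetical toy data: each record is the set of condition names that hold for it.
D = [
    {"a", "b", "c"},
    {"a", "c"},
    {"a", "b"},
    {"b"},
    {"a", "b", "c"},
]
U = ["a", "b"]  # candidate conditions for the antecedent
C = "c"         # condition for the consequent


def sup_conf(antecedent):
    """Return (support, confidence) of the rule antecedent -> C.

    Support is the fraction of D satisfying both antecedent and C;
    confidence is that count divided by the records satisfying the antecedent.
    """
    covered = [rec for rec in D if set(antecedent) <= rec]
    hits = [rec for rec in covered if C in rec]
    sup = len(hits) / len(D)
    conf = len(hits) / len(covered) if covered else 0.0
    return sup, conf


def satisfies_n(antecedent):
    """Assumed constraint set N: non-empty antecedent, support >= 0.2."""
    return len(antecedent) > 0 and sup_conf(antecedent)[0] >= 0.2


def order_key(antecedent):
    """Assumed total order: compare by confidence, then by support."""
    sup, conf = sup_conf(antecedent)
    return (conf, sup)


# Enumerate every subset of U, keep those satisfying N, take a maximal one.
candidates = [
    subset
    for size in range(len(U) + 1)
    for subset in combinations(U, size)
    if satisfies_n(subset)
]
best = max(candidates, key=order_key)
```

With this toy data, `best` is the antecedent for which no other admissible antecedent ranks strictly higher, which is the optimality condition from the slide.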

  5. Partial-Order Optimized Rule Mining • Partial order vs. total order • Some rules may be incomparable. • Several equivalence classes for optimal rules. • Find a set O ⊆ P(U) such that: • ∀ A ∈ O: A is optimal, • For each equivalence class that contains an optimal rule, exactly one member of this class is in O.

  6. Monotonicity • f(x) is said to be monotone in x if: x1 < x2 ⇒ f(x1) ≤ f(x2) • f(x) is said to be anti-monotone in x if: x1 < x2 ⇒ f(x1) ≥ f(x2)

  7. Optimality • SC-Optimality • PC-Optimality • Definition • Theoretical Implications • Practical Implications

  8. SC-Optimality: Definition • The partial order ≤sc • For rules r1 and r2: r1 <sc r2 if and only if: • sup(r1) ≤ sup(r2) ∧ conf(r1) < conf(r2), or • sup(r1) < sup(r2) ∧ conf(r1) ≤ conf(r2). • Also, r1 =sc r2 if and only if: • sup(r1) = sup(r2) ∧ conf(r1) = conf(r2).
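
As a minimal sketch, the support-confidence partial order can be written as a pair of predicates. The `Rule` type and the function names are hypothetical helpers introduced here, not part of the presentation; a rule is reduced to its (support, confidence) pair.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical container: a rule reduced to its support and confidence."""
    sup: float
    conf: float


def sc_less(r1: Rule, r2: Rule) -> bool:
    """r1 <sc r2: r2 is at least as good on both axes and strictly better on one."""
    return (r1.sup <= r2.sup and r1.conf < r2.conf) or (
        r1.sup < r2.sup and r1.conf <= r2.conf
    )


def sc_equal(r1: Rule, r2: Rule) -> bool:
    """r1 =sc r2: identical support and identical confidence."""
    return r1.sup == r2.sup and r1.conf == r2.conf


def sc_comparable(r1: Rule, r2: Rule) -> bool:
    """False when neither rule precedes the other and they are not equivalent."""
    return sc_less(r1, r2) or sc_less(r2, r1) or sc_equal(r1, r2)
```

For example, `Rule(0.4, 0.5)` and `Rule(0.2, 0.9)` are incomparable: one has more support, the other more confidence.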

  9. SC-Optimality: Definition (cont.) • The partial order ≤s¬c • For rules r1 and r2: r1 <s¬c r2 if and only if: • sup(r1) ≤ sup(r2) ∧ conf(r1) > conf(r2), or • sup(r1) < sup(r2) ∧ conf(r1) ≥ conf(r2). • Also, r1 =s¬c r2 if and only if: • sup(r1) = sup(r2) ∧ conf(r1) = conf(r2).

  10. SC-Optimality: Definition (cont.) • [Figure: rules plotted by support (horizontal axis) and confidence (vertical axis); legend: sc-optimal rule, s¬c-optimal rule, non-optimal rule.] • No optimal rules fall outside the borders.
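
The upper border in this picture is the set of rules that no other rule exceeds under the support-confidence order. A minimal sketch of extracting it from a list of (support, confidence) pairs follows; the quadratic scan and the function name `sc_optimal` are illustrative assumptions, not the algorithm proposed in the paper.

```python
def sc_optimal(rules):
    """Return the sc-optimal (support, confidence) pairs, sorted by support.

    Duplicated pairs form one equivalence class, so each appears once.
    """

    def dominated(r, other):
        # True when r <sc other: other is no worse on both axes, better on one.
        return (r[0] <= other[0] and r[1] < other[1]) or (
            r[0] < other[0] and r[1] <= other[1]
        )

    frontier = []
    for r in sorted(set(rules)):
        if not any(dominated(r, other) for other in rules):
            frontier.append(r)
    return frontier


# Hypothetical (support, confidence) values for six mined rules.
rules = [(0.1, 0.9), (0.3, 0.7), (0.3, 0.6), (0.5, 0.4), (0.2, 0.5), (0.5, 0.4)]
border = sc_optimal(rules)
```

Here `(0.3, 0.6)` and `(0.2, 0.5)` are dropped because `(0.3, 0.7)` is at least as good on both axes.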

  11. SC-Optimality: Theoretical Implications • A total order ≤t is implied by ≤sc if: • r1 ≤sc r2 ⇒ r1 ≤t r2, and r1 =sc r2 ⇒ r1 =t r2 • If ≤t is implied by ≤sc, a rule that is optimal for ≤t can be found among the rules that are optimal for ≤sc. • ≤t defined by f(r) is implied by ≤sc if: • f(r) is monotone in support, and • f(r) is monotone in confidence.

  12. SC-Optimality: Theoretical Implications (cont.) • Interestingness metrics: • laplace(r) = (sup(r) + 1) / (sup(r)/conf(r) + k) • gain(r) = sup(r) (1 - θ/conf(r)) • conviction(r) = (1 - sup(C)) / (1 - conf(r)), where the numerator is constant because the consequent C is fixed

  13. PC-Optimality: Definition • The partial order ≤pc • For rules r1 and r2: r1 <pc r2 if and only if: • pop(r1) ⊆ pop(r2) ∧ conf(r1) < conf(r2), or • pop(r1) ⊂ pop(r2) ∧ conf(r1) ≤ conf(r2). • Also, r1 =pc r2 if and only if: • pop(r1) = pop(r2) ∧ conf(r1) = conf(r2).

  14. PC-Optimality: Definition (cont.) • pop(A → C) is the set of records from D that satisfy both A and C. • |pop(r)| = sup(r) × |D| • The partial order ≤p¬c is defined analogously.

  15. PC-Optimality: Theoretical Implications • ≤sc is implied by ≤pc, and ≤s¬c by ≤p¬c. • ≤pc results in more incomparable rule pairs. • A pc-optimal rule set will contain more rules than an sc-optimal rule set.
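
The claim about incomparable pairs can be checked on a tiny example. This sketch represents pop(r) as a `frozenset` of record identifiers; the identifiers, confidence values, and function names are hypothetical. Two rules whose populations are disjoint are incomparable under the population-confidence order even when one precedes the other under the support-confidence order.

```python
def pc_less(pop1, conf1, pop2, conf2):
    """r1 <pc r2, using subset tests on the populations of the two rules."""
    return (pop1 <= pop2 and conf1 < conf2) or (pop1 < pop2 and conf1 <= conf2)


def sc_less(sup1, conf1, sup2, conf2):
    """r1 <sc r2, using only support and confidence."""
    return (sup1 <= sup2 and conf1 < conf2) or (sup1 < sup2 and conf1 <= conf2)


# Hypothetical rules over a data-set with record ids 1..5.
pop_r1, conf_r1 = frozenset({1, 2}), 0.5
pop_r2, conf_r2 = frozenset({3, 4, 5}), 0.9
pop_r3, conf_r3 = frozenset({1, 2, 3}), 0.5

# r1 vs r2: ordered under sc (fewer records, lower confidence) ...
r1_before_r2_sc = sc_less(len(pop_r1), conf_r1, len(pop_r2), conf_r2)
# ... but not under pc, because pop(r1) is not a subset of pop(r2).
r1_before_r2_pc = pc_less(pop_r1, conf_r1, pop_r2, conf_r2)
r2_before_r1_pc = pc_less(pop_r2, conf_r2, pop_r1, conf_r1)

# r1 vs r3: pop(r1) is a proper subset of pop(r3), so pc orders them as well.
r1_before_r3_pc = pc_less(pop_r1, conf_r1, pop_r3, conf_r3)
```

Support is taken as |pop(r)| here; dividing by |D| would not change any comparison.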

  16. Optimality: Practical Implications • Two algorithms are proposed, one for each type of optimality. • Each algorithm produces a set of optimal rules without requiring an interestingness metric to be specified. • The produced set is guaranteed to identify the most interesting rules according to several metrics.

  17. Optimality: Practical Implications (cont.) • These algorithms facilitate interactivity: • Examine the optimal rules according to some metric without additional querying or mining. • Find the most interesting rule that characterizes any given subset of the population.
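
The first interactivity point can be illustrated with a short sketch: once a set of optimal rules is available, switching metrics is only a re-ranking of that set. The border values, the metric table, and the threshold 0.5 in the gain entry are hypothetical.

```python
# Hypothetical sc-optimal rules as (support, confidence) pairs, mined once.
border = [(0.1, 0.9), (0.3, 0.7), (0.5, 0.4)]

# Metrics expressed over (support, confidence); the gain threshold is assumed.
metrics = {
    "confidence": lambda sup, conf: conf,
    "support": lambda sup, conf: sup,
    "gain": lambda sup, conf: sup * (1 - 0.5 / conf),
}


def best_by(metric_name):
    """Pick the top rule from the precomputed border; no further mining needed."""
    score = metrics[metric_name]
    return max(border, key=lambda rule: score(*rule))


top_confidence = best_by("confidence")
top_support = best_by("support")
top_gain = best_by("gain")
```

Each call scans only the stored border, so a user can compare metrics without touching the data-set again.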