
Marzena Kryszkiewicz DaWaK 2009

Non-Derivable Item Set and Non-Derivable Literal Set Representations of Patterns Admitting Negation. Marzena Kryszkiewicz, DaWaK 2009.


Presentation Transcript


  1. Non-Derivable Item Set and Non-Derivable Literal Set Representations of Patterns Admitting Negation • Marzena Kryszkiewicz • DaWaK 2009

  2. Outline • Motivation • Preliminary • Representing Frequent Itemsets with Non-derivable itemsets • Patterns admitting negation • Properties of Derivable and Non-derivable Lisets • Representing frequent positive and negative patterns • Conclusion

  3. Motivation • Patterns and association rules can be generalized by admitting negation. • E.g. 75% of customers who buy coke also buy chips and neither beer nor milk. • Admitting negation in patterns usually results in an abundance of mined patterns, which makes analysis of the discovered knowledge infeasible. • It is preferable to discover and store a possibly small fraction of patterns, from which one can derive all other significant patterns when required.

  4. (Cont.) • In this paper, the properties of derivable and non-derivable patterns are examined. • Important relationships among patterns admitting negation that have the same canonical variation are established. • Two lossless representations of frequent positive and negative patterns are discussed: NDLR (the non-derivable liset representation) and NDIR (the non-derivable itemset representation, a concise representation).

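  5. Downward Closed Sets • A family of itemsets $\mathcal{F}$ is defined as downward closed if every subset of each of its members also belongs to it: $X \in \mathcal{F}$ and $Y \subseteq X$ imply $Y \in \mathcal{F}$. • Property • Let $X, Y \subseteq I$, where I is the set of all items. If $X \subseteq Y$, then $\mathrm{sup}(X) \geq \mathrm{sup}(Y)$. • The set of all frequent itemsets is downward closed.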
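
The downward-closure property can be checked by brute force on a small example. The sketch below is only an illustration: the toy database `D`, the threshold `min_sup=2`, and the helper names `sup`, `frequent_itemsets`, and `is_downward_closed` are assumptions made here, not taken from the slides.

```python
from itertools import combinations

# Hypothetical toy database: each transaction is a set of items.
D = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]

def sup(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

def frequent_itemsets(transactions, min_sup):
    """Brute-force enumeration of all itemsets with support >= min_sup."""
    items = sorted(set().union(*transactions))
    return {
        frozenset(c)
        for k in range(len(items) + 1)
        for c in combinations(items, k)
        if sup(c, transactions) >= min_sup
    }

def is_downward_closed(family):
    """True if every subset of every member also belongs to the family."""
    return all(
        frozenset(sub) in family
        for x in family
        for k in range(len(x) + 1)
        for sub in combinations(x, k)
    )

F = frequent_itemsets(D, min_sup=2)
print(is_downward_closed(F))
```

Because support can only shrink when items are added, any itemset that passes the threshold drags all of its subsets along, which is what the check confirms.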

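  6. Generalized Disjunctive Rules • Let $Z \subseteq I$ be an itemset. $X \rightarrow \bigvee A$ is defined as a generalized disjunctive rule based on Z if $X \subset Z$ and $A = Z \setminus X$ (so A is non-empty). • $\mathrm{sup}(X \rightarrow \bigvee A)$ is defined as the number of transactions in D in which X occurs together with at least one item from A. • E.g. [example formulas not preserved in the transcript]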

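  7. (Cont.) • Thm: Let $X \rightarrow \bigvee A$ be a generalized disjunctive rule. Then: $\mathrm{sup}(X \rightarrow \bigvee A) = \sum_{\emptyset \neq Y \subseteq A} (-1)^{|Y|-1}\, \mathrm{sup}(X \cup Y)$ • E.g. [example not preserved in the transcript] • $\mathrm{err}(X \rightarrow \bigvee A)$ is defined as the number of transactions containing X that do not contain any item from A. • $X \rightarrow \bigvee A$ is defined as a certain rule if $\mathrm{err}(X \rightarrow \bigvee A) = 0$.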
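
As a rough illustration of the definitions on this slide, the sketch below counts the support of a generalized disjunctive rule directly and compares it with an inclusion-exclusion sum over the supports of ordinary itemsets. The toy database and all function names are assumptions for illustration; the inclusion-exclusion form is the standard identity for "at least one item from A", and the slide's own formula image is not preserved.

```python
from itertools import combinations

# Hypothetical toy database.
D = [{"x", "a"}, {"x", "b"}, {"x", "a", "b"}, {"x"}, {"a", "b"}]

def sup(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

def rule_sup(X, A, transactions):
    """Direct count: transactions with all of X and at least one item of A."""
    return sum(1 for t in transactions if set(X) <= t and set(A) & t)

def rule_sup_incl_excl(X, A, transactions):
    """Same quantity via inclusion-exclusion over non-empty subsets Y of A."""
    total = 0
    for k in range(1, len(A) + 1):
        for Y in combinations(sorted(A), k):
            total += (-1) ** (k - 1) * sup(set(X) | set(Y), transactions)
    return total

def err(X, A, transactions):
    """Transactions containing X but no item from A."""
    return sup(X, transactions) - rule_sup(X, A, transactions)

print(rule_sup({"x"}, {"a", "b"}, D), rule_sup_incl_excl({"x"}, {"a", "b"}, D))
print(err({"x"}, {"a", "b"}, D))
```

In this toy data the rule x → a ∨ b is not certain, because one transaction contains x without a or b.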

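  8. (Cont.) • Let $X \rightarrow \bigvee A$ be a generalized disjunctive rule. Then: [formula not preserved in the transcript] • Let $X \rightarrow \bigvee A$ be a generalized disjunctive rule. Then: [formula not preserved in the transcript] • doubt!! • E.g. [example not preserved in the transcript]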

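  9. Using Generalized association rules to estimate supports of itemsets • For a generalized disjunctive rule $X \rightarrow \bigvee Y$, the fact that $\mathrm{err}(X \rightarrow \bigvee Y) \geq 0$ bounds $\mathrm{sup}(X \cup Y)$: • $\mathrm{sup}(X \cup Y) \geq \sum_{Z \subset Y} (-1)^{|Y \setminus Z|+1}\, \mathrm{sup}(X \cup Z)$, when |Y| is even • $\mathrm{sup}(X \cup Y) \leq \sum_{Z \subset Y} (-1)^{|Y \setminus Z|+1}\, \mathrm{sup}(X \cup Z)$, when |Y| is odd • Given itemset B, we obtain the following set of $2^{|B|}$ inequalities bounding sup(B):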
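
A minimal sketch of how such bounds could be evaluated: for every split of B into X and Y = B \ X, the supports of the proper subsets of B give either a lower bound (|Y| even) or an upper bound (|Y| odd) on sup(B). The function names, the toy database, and the use of `l`/`u` as the tightest lower and upper bound are assumptions for illustration, following the usual non-derivable itemset construction rather than the slide's lost formulas.

```python
from itertools import combinations

def sup(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

def bounds(B, transactions):
    """Return (l(B), u(B)) for a non-empty itemset B.

    One inequality per subset Y of B (2^|B| in total), with X = B - Y.
    """
    B = frozenset(B)
    lower, upper = [], []
    for k in range(len(B) + 1):
        for Y in combinations(sorted(B), k):
            X = B - set(Y)
            value = sum(
                (-1) ** (k - j + 1) * sup(X | set(Z), transactions)
                for j in range(k)              # proper subsets Z of Y only
                for Z in combinations(Y, j)
            )
            # Even |Y| yields a lower bound, odd |Y| an upper bound.
            (lower if k % 2 == 0 else upper).append(value)
    return max(lower), min(upper)

# Hypothetical toy database.
D = [{"a", "b"}, {"a", "b"}, {"a"}]
print(bounds({"a", "b"}, D))  # bounds coincide: the support is derivable
print(bounds({"a"}, D))       # bounds differ: the support must be stored
```

When the two bounds coincide, the support of B follows from the supports of its proper subsets and need not be stored.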

  10. (Cont.)

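  11. Representing Frequent Itemsets with Non-derivable itemsets • An itemset X is defined as non-derivable if $l(X) \neq u(X)$, where l(X) and u(X) are the tightest lower and upper bounds on sup(X) given by the inequalities above. • NDR is defined as the set of all frequent non-derivable itemsets stored together with their supports: $NDR = \{(X, \mathrm{sup}(X)) \mid X \text{ is frequent and } l(X) \neq u(X)\}$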

  12. Patterns admitting negation • A liset is defined as a set consisting of non-contradictory literals • A liset is called positive if all literals contained in it are positive.

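  13. (Cont.) • A canonical variation of a liset X is defined as the itemset obtained from X by replacing all negative literals in X with their positive counterparts. That is, each negative literal ¬a in X is replaced by the item a. • All lisets having the same canonical variation as liset X are denoted by [symbol not preserved in the transcript]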
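
The notions of canonical variation and of the lisets sharing it can be sketched as follows. The encoding of a literal as an `(item, is_positive)` pair, the function names, and the toy database are assumptions made for this illustration only.

```python
from itertools import product

def canonical_variation(liset):
    """Itemset obtained by dropping the sign of every literal."""
    return frozenset(item for item, _ in liset)

def variations(liset):
    """All lisets with the same canonical variation (every sign pattern)."""
    items = sorted(canonical_variation(liset))
    return {
        frozenset(zip(items, signs))
        for signs in product([True, False], repeat=len(items))
    }

def sup(liset, transactions):
    """Transactions containing every positive item and no negated item."""
    return sum(
        1 for t in transactions
        if all((item in t) == positive for item, positive in liset)
    )

# Hypothetical toy database and the liset {a, not b}.
D = [{"a", "b"}, {"a"}, {"b"}, set(), {"a", "c"}]
X = frozenset({("a", True), ("b", False)})

print(sorted(canonical_variation(X)))
print(len(variations(X)))
print(sum(sup(V, D) for V in variations(X)) == len(D))
```

Each transaction matches exactly one sign pattern over the items of the canonical variation, so the supports of all variations add up to |D|.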

  14. (Cont.) • Example:

  15. Properties of Derivable and Non-derivable Lisets • Thm: Let B be a liset. [statement not preserved in the transcript] • Bound on the length of non-derivable lisets: if Z is a non-derivable liset, then at least $2^{|Z|-1}$ variations of Z have supports greater than 0. Hence $2^{|Z|-1} \leq |D|$, so $|Z| \leq \lfloor \log_2 |D| \rfloor + 1$.
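
The length bound can be spelled out step by step. The derivation below assumes that the flattened expression on the slide reads $2^{|Z|-1}$ (not $2^{|Z|}-1$) and uses $\mathcal{V}(Z)$ as a hypothetical symbol for the set of variations of Z; neither is confirmed by the transcript.

```latex
\begin{align*}
\sum_{V \in \mathcal{V}(Z)} \mathrm{sup}(V) &= |D|
  && \text{(each transaction supports exactly one variation of } Z\text{)} \\
2^{|Z|-1} &\le |D|
  && \text{(at least } 2^{|Z|-1} \text{ variations have support} \ge 1\text{)} \\
|Z| - 1 &\le \log_2 |D| \\
|Z| &\le \lfloor \log_2 |D| \rfloor + 1
  && \text{(since } |Z| \text{ is an integer)}
\end{align*}
```

For instance, with |D| = 1000 transactions, $\lfloor \log_2 1000 \rfloor = 9$, so under this reading a non-derivable liset would contain at most 10 literals.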

  16. Representing frequent positive and negative patterns • NDLR (non-derivable liset representation of frequent patterns admitting negation) is defined as the family of all frequent non-derivable lisets stored together with their supports. • NDIR (non-derivable itemset representation of frequent patterns admitting negation) is defined as the family of non-derivable itemsets, each of which has at least one frequent variation, stored together with their supports.
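
One reason a representation built from itemsets can still describe patterns with negation is that the support of a liset follows from the supports of positive itemsets by inclusion-exclusion. The sketch below illustrates that identity on toy data; it is not claimed to be the derivation procedure used in the paper. The function names and the database are assumptions, and in an actual representation the itemset supports would come from stored or derived values instead of a scan of `D`.

```python
from itertools import combinations

def sup(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

def sup_with_negation(P, N, transactions):
    """Support of 'all items of P and none of N' from itemset supports only."""
    P, N = frozenset(P), sorted(N)
    return sum(
        (-1) ** k * sup(P | set(Z), transactions)
        for k in range(len(N) + 1)
        for Z in combinations(N, k)
    )

def sup_direct(P, N, transactions):
    """Direct count, used here only to cross-check the identity."""
    return sum(
        1 for t in transactions
        if set(P) <= t and not (set(N) & t)
    )

# Hypothetical toy database echoing the coke/chips example.
D = [
    {"coke", "chips"},
    {"coke", "chips", "beer"},
    {"coke", "chips"},
    {"coke", "milk"},
    {"beer"},
]
P, N = {"coke", "chips"}, {"beer", "milk"}
print(sup_with_negation(P, N, D), sup_direct(P, N, D))
```

Both counts agree on the toy data, which is the behaviour the identity predicts for any database.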

  17. (Cont.)

  18. Conclusion • The paper introduced two lossless representations of frequent patterns admitting negation: NDLR and NDIR. • doubt!!
