Anomalous Association Rules

Anomalous Association Rules Máster Oficial en Soft Computing y Sistemas Inteligentes Universidad de Granada

Introduction Association rule: X ⇒ Y. Frequent: Supp(X ⇒ Y) ≡ Supp(X ∪ Y) ≥ ε (5%). Confident: Conf(X ⇒ Y) = Supp(X ∪ Y) / Supp(X) ≥ θ (80%). Goal: find all the frequent and confident associations. Applications: market basket analysis, CRM, etc.
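A minimal sketch of the two measures, assuming each transaction is a set of item names and that confidence is the usual ratio Supp(X ∪ Y) / Supp(X). The function names and the toy baskets are illustrative and do not come from the slides.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def count(transactions: List[Transaction], itemset: Iterable[str]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def support(transactions: List[Transaction], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    return count(transactions, itemset) / len(transactions)


def confidence(transactions: List[Transaction],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Conf(X => Y): among transactions containing X, the share that also contain Y."""
    x = frozenset(antecedent)
    covered = count(transactions, x)
    if covered == 0:
        return 0.0
    return count(transactions, x | frozenset(consequent)) / covered


# Made-up toy data: five market baskets.
baskets = [frozenset(b) for b in (
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"bread", "eggs"},
    {"milk"},
)]

supp = support(baskets, {"bread", "milk"})       # 3 of 5 baskets
conf = confidence(baskets, {"bread"}, {"milk"})  # 3 of the 4 baskets with bread
print(f"supp={supp:.2f} conf={conf:.2f}")
```

With the thresholds quoted on the slide (ε = 5%, θ = 80%), bread ⇒ milk would be frequent but not confident on this toy data.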

Introduction Problem: Thousands of rules are found. Unmanageable for any user! There are too many spurious associations. Possible solutions: • Subjective measures • Objective measures. The main problem is the type of knowledge an association rule represents.

Introduction The crucial problem is to determine which kind of events we are interested in, so that we can appropriately characterize them. It is often more interesting to find surprising non-frequent events than frequent ones. The type of interesting events is application dependent

Introduction • Infrequent itemsets in intrusion detection systems • Exceptions to associations for the detection of conflicting medicine therapies • Unusual short sequences of nucleotides in genome sequencing • Etc.

Introduction Our Objective To introduce the concept of anomalous association rule as a confident rule representing homogeneous deviations from common behavior.

Related Work Suzuki, Hussain & Suzuki: "Exception Rules". X ⇒ Y is an association rule. X ∧ I ⇒ ¬Y is the exception rule, where I is the "interacting" itemset. The reference rule relates X and I. Drawback: too many exceptions.

Our Definition X usually implies Y (dominant rule): X ⇒ Y is frequent and confident. When X does not imply Y, then it usually implies A (the anomaly): X ∧ ¬Y ⇒ A is confident (anomalous association rule) and X ∧ Y ⇒ ¬A is confident.
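The three conditions of the definition can be checked directly on a small data set. The following is only a sketch under stated assumptions: transactions are sets of items, ¬Y means "the transaction does not contain all of Y", and the helper names, thresholds and toy data are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List

Transaction = FrozenSet[str]
Predicate = Callable[[Transaction], bool]


def conf(transactions: List[Transaction], condition: Predicate, conclusion: Predicate) -> float:
    """Share of the transactions satisfying `condition` that also satisfy `conclusion`."""
    covered = [t for t in transactions if condition(t)]
    if not covered:
        return 0.0
    return sum(1 for t in covered if conclusion(t)) / len(covered)


def is_anomalous(transactions: List[Transaction],
                 x: Iterable[str], y: Iterable[str], a: Iterable[str],
                 min_supp: float, min_conf: float) -> bool:
    """True if X => Y is dominant and X, not Y => A is an anomalous rule."""
    X, Y, A = frozenset(x), frozenset(y), frozenset(a)
    n = len(transactions)
    if n == 0:
        return False
    supp_xy = sum(1 for t in transactions if (X | Y) <= t) / n
    # 1) Dominant rule X => Y: frequent and confident.
    dominant = (supp_xy >= min_supp
                and conf(transactions, lambda t: X <= t, lambda t: Y <= t) >= min_conf)
    # 2) X and not Y => A is confident.
    anomaly = conf(transactions,
                   lambda t: X <= t and not Y <= t,
                   lambda t: A <= t) >= min_conf
    # 3) X and Y => not A is confident.
    exclusive = conf(transactions,
                     lambda t: (X | Y) <= t,
                     lambda t: not A <= t) >= min_conf
    return dominant and anomaly and exclusive


# Made-up data: 17 transactions {x, y} and 3 transactions {x, a}.
data = [frozenset({"x", "y"})] * 17 + [frozenset({"x", "a"})] * 3

found = is_anomalous(data, {"x"}, {"y"}, {"a"}, min_supp=0.05, min_conf=0.8)
not_found = is_anomalous(data, {"x"}, {"y"}, {"b"}, min_supp=0.05, min_conf=0.8)
print(found, not_found)
```

Here x ⇒ y holds in 17 of 20 cases, and every one of the 3 deviating cases contains a, so a is reported as the anomaly; an item b that never appears is not.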

Our Definition

Our Definition X ⇒ Y is the dominant rule.

Our Definition X ⇒ A when ¬Y is the anomalous rule.

Our Definition Some overlapping cases may appear.

Our Definition If symptoms-X then disease-Y. If symptoms-X then disease-A when not disease-Y. Disease-A does not occur at the same time as symptoms-X and disease-Y.

Algorithm Based on TBAR, "Tree-based association rules", Data & Knowledge Engineering (2001), Berzal, Cubero, Marín, Serrano.

Algorithm (assoc. rules) Possible items: A, B, C, D, E, F. [Figure: itemset tree with levels L1, L2 and L3.] L1: A #7, B #9, C #7, D #8 (e.g. 7 instances with A). L2: 6 instances with AB, 5 instances with AD, 6 instances with BC. L3: 5 instances with ABD.
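The level-by-level counting (L1, L2, L3) can be sketched with a generic Apriori-style loop. This is not TBAR itself, which stores the itemsets in a tree; it is a plain illustration using dictionaries, and the function name, the minimum count and the toy transactions are assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, List

Itemset = FrozenSet[str]


def frequent_itemsets(transactions: List[Itemset], min_count: int) -> List[Dict[Itemset, int]]:
    """Return [L1, L2, ...], where Lk maps each frequent k-itemset to its count."""
    levels: List[Dict[Itemset, int]] = []
    # L1: count single items in one pass over the data.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: c for s, c in counts.items() if c >= min_count}
    k = 1
    while current:
        levels.append(current)
        k += 1
        prev = set(current)
        # Candidates: unions of two frequent (k-1)-itemsets that have exactly k items...
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # ...and whose (k-1)-subsets are all frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # One more pass over the data to count the surviving candidates.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: n for s, n in counts.items() if n >= min_count}
    return levels


# Made-up transactions; they do not reproduce the counts shown in the figure.
data = [frozenset(t) for t in ("ABD", "ABD", "AB", "BC", "BCD", "C")]
levels = frequent_itemsets(data, min_count=2)
for k, level in enumerate(levels, start=1):
    print(f"L{k}:", {"".join(sorted(s)): n for s, n in level.items()})
```

Each level needs one pass over the data, and a candidate is counted only if all of its subsets survived the previous level.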

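Algorithm (anomalous rules) Possible items: A, B, C, D, E, F. [Figure: tree after the first and second scans.] First scan: A #7, B #9, C #7, D #8. Second scan, extensions of A: AB #6, AC #4, AD #5, AE #3, AF #3; B #6 and D #5 stay under A as frequent, and the non-frequent extensions go under A*.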

Algorithm (anomalous rules) Possible items: A, B, C, D, E, F. [Figure: tree after the first scan, the second scan and candidate generation, with a starred node (A*, B*, C*, D*) next to each frequent item A #7, B #9, C #7, D #8.]

Algorithm (anomalous rules) Rule generation: immediate from the frequent items.
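To make the output of the procedure concrete, here is a brute-force sketch that enumerates anomalous rules with single-item X, Y and A by applying the definition directly. It is not the tree-based algorithm of the slides, it makes no attempt at efficiency, and the function name, thresholds and medical toy data are hypothetical.

```python
from itertools import permutations
from typing import FrozenSet, List, Tuple


def anomalous_rules(transactions: List[FrozenSet[str]],
                    min_supp: float, min_conf: float) -> List[Tuple[str, str, str]]:
    """Return triples (x, y, a): x => y is dominant and x, not y => a is anomalous."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    found = []
    for x, y in permutations(items, 2):
        with_x = [t for t in transactions if x in t]
        with_xy = [t for t in with_x if y in t]
        # Dominant rule x => y must be frequent and confident.
        if not with_x or len(with_xy) / n < min_supp or len(with_xy) / len(with_x) < min_conf:
            continue
        # Deviations from the dominant rule: x holds but y does not.
        without_y = [t for t in with_x if y not in t]
        if not without_y:
            continue
        for a in items:
            if a in (x, y):
                continue
            # x, not y => a must be confident.
            conf_anomaly = sum(1 for t in without_y if a in t) / len(without_y)
            # x, y => not a must be confident.
            conf_exclusive = sum(1 for t in with_xy if a not in t) / len(with_xy)
            if conf_anomaly >= min_conf and conf_exclusive >= min_conf:
                found.append((x, y, a))
    return found


# Made-up patients: 17 with disease_Y and 3 with disease_A, all showing the symptoms.
patients = ([frozenset({"symptoms", "disease_Y"})] * 17
            + [frozenset({"symptoms", "disease_A"})] * 3)

rules = anomalous_rules(patients, min_supp=0.05, min_conf=0.8)
for x, y, a in rules:
    print(f"if {x} then {a} when not {y}")
```

On this toy data the only rule reported has disease_Y as the usual consequent of the symptoms and disease_A as the anomaly.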

Experimentation The "core" of X ⇒ Y|A is Y|A.

Experimentation X ⇒ Y, where Y is the usual consequent. If X then A when not Y, where A is the "anomaly": X ∧ ¬Y ⇒ A.

Experimentation Nursery: if NURSERY:very_crit and HEALTH:priority then CLASS:priority (9 out of 9) when not CLASS:spec_prior. Here CLASS:priority is the "anomaly" and CLASS:spec_prior is the usual consequent.

Experimentation Census: if WORKCLASS: Local-gov then CAPGAIN: [99999.0 , 99999.0] (7 out of 7) when not CAPGAIN: [0.0 , 20051.0]. Here CAPGAIN: [99999.0 , 99999.0] is the "anomaly" and CAPGAIN: [0.0 , 20051.0] is the usual consequent.

Conclusions We have introduced an alternative type of interesting knowledge: anomalous association rules. We have given an efficient algorithm to detect all the anomalies.

Conclusions Future work: • To complete the experimentation • To filter the anomalies, eliminating redundant rules • To introduce measures of interest for the anomalies, allowing their ordering
