Association Rules
• Transactional data
• Algorithm
• Applications
CSE591: Data Mining by H. Liu
Market Basket Analysis
• Transactional data
• Sparse matrix: thousands of columns, but each row has only dozens of values
• Items
• Itemsets: transactions, each identified by a transaction ID (TID)
• The most-cited example: “diapers and beer”
Association rule mining
• Finding interesting association or correlation relationships
• Defining interesting association rules
  • Support: P(AB), the probability that a transaction contains both A and B
  • Confidence: P(B|A), the probability that a transaction containing A also contains B
• An association rule
  • A -> B
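
The two measures above can be estimated directly from transaction counts. The following is a minimal sketch, not part of the original slides: the toy transactions and the function names `support_count`, `support`, and `confidence` are invented for illustration.

```python
# Hypothetical toy transactions, invented for illustration only.
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diapers", "beer", "eggs"}),
    frozenset({"milk", "diapers", "beer", "cola"}),
    frozenset({"bread", "milk", "diapers", "beer"}),
    frozenset({"bread", "milk", "diapers", "cola"}),
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Estimate P(AB): the fraction of transactions containing the itemset."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(B|A) as count(A and B together) / count(A)."""
    a = frozenset(antecedent)
    b = frozenset(consequent)
    count_a = support_count(a, transactions)
    if count_a == 0:
        raise ValueError("antecedent does not occur in any transaction")
    return support_count(a | b, transactions) / count_a
```

On this toy data, `support({"diapers", "beer"}, TRANSACTIONS)` is 3/5 and `confidence({"diapers"}, {"beer"}, TRANSACTIONS)` is 3/4, since three of the four transactions with diapers also contain beer.
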
Finding association rules
• Finding frequent itemsets
  • Downward closure property (also called anti-monotone): every subset of a frequent itemset is also frequent
• Finding association rules from frequent itemsets
• Frequent itemsets
  • minsup
  • from 1-itemsets to k-itemsets
• Association rules
  • minconf
  • satisfying minimum confidence
• Level-wise search
  • Anti-monotone property
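
As a rough sketch of the level-wise search, the function below grows frequent itemsets one size at a time and uses downward closure to discard candidates early. It is an illustration under stated assumptions, not the slides' own code: the toy data, the name `frequent_itemsets`, and the use of an absolute count threshold (`minsup_count`) instead of a fraction are all choices made here. Candidates are formed by a simple pairwise union rather than the ordered join.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def frequent_itemsets(transactions, minsup_count):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep those meeting the threshold.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= minsup_count}
    result = dict(level)

    k = 2
    while level:
        prev = set(level)
        candidates = set()
        for a, b in combinations(prev, 2):
            union = a | b
            # Keep a size-k union only if all its (k-1)-subsets are frequent
            # (downward closure); otherwise it cannot be frequent.
            if len(union) == k and all(
                frozenset(sub) in prev for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # One pass over the data to count the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {s: c for s, c in counts.items() if c >= minsup_count}
        result.update(level)
        k += 1
    return result
```

With `minsup_count=3` on the toy data, the search stops after level 3: the only 3-item candidate that survives pruning, {bread, milk, diapers}, occurs in just two transactions.
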
Apriori candidate set generation
• For k = 1, C_1 = all 1-itemsets.
• For k > 1, generate C_k from L_{k-1} as follows:
  • The join step: C_k = join of L_{k-1} with itself on the first k-2 items. If both {a_1, …, a_{k-2}, a_{k-1}} and {a_1, …, a_{k-2}, a_k} are in L_{k-1}, then add {a_1, …, a_{k-2}, a_{k-1}, a_k} to C_k (items are kept sorted).
  • The prune step: remove {a_1, …, a_{k-2}, a_{k-1}, a_k} if it contains a non-frequent (k-1)-subset.
• An example
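
The join and prune steps can be sketched as follows. This is a minimal illustration, assuming itemsets are represented as sorted tuples of mutually comparable items; the function name `apriori_gen` and the sample L_3 used in the usage note are choices made here, not taken from the slides.

```python
from itertools import combinations


def apriori_gen(prev_frequent, k):
    """Build the candidate set C_k from the frequent (k-1)-itemsets L_{k-1}."""
    # Normalise to sorted tuples and sort the list so that itemsets sharing
    # the same first k-2 items sit next to each other.
    prev = sorted(tuple(sorted(s)) for s in prev_frequent)
    prev_set = set(prev)
    candidates = []
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            # Join step: a and b must agree on their first k-2 items.
            if a[:k - 2] != b[:k - 2]:
                break  # list is sorted, so no later b shares the prefix
            cand = a + (b[-1],)
            # Prune step: drop the candidate if any (k-1)-subset is not frequent.
            if all(sub in prev_set for sub in combinations(cand, k - 1)):
                candidates.append(cand)
    return candidates
```

For instance, with L_3 = {abc, abd, acd, ace, bcd}, the join step produces abcd and acde; the prune step removes acde because its subset ade is not in L_3, leaving C_4 = {abcd}.
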
Derive rules from frequent itemsets
• Frequent itemsets != association rules
• One more step is required to find association rules
• For each frequent itemset X, for each proper nonempty subset A of X:
  • Let B = X - A
  • A -> B is an association rule if confidence(A -> B) ≥ minconf, where support(A -> B) = support(AB) and confidence(A -> B) = support(AB) / support(A)
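
The rule-derivation loop above can be sketched as below. It assumes the frequent itemsets and their support counts are already available in a dictionary that contains every frequent itemset (and therefore, by downward closure, all of its subsets). The name `derive_rules` and the counts in the usage note are hypothetical.

```python
from itertools import combinations


def derive_rules(support_counts, minconf):
    """Return (A, B, confidence) for each rule A -> B with confidence >= minconf.

    `support_counts` maps frozenset itemsets to their support counts.
    """
    rules = []
    for itemset, count in support_counts.items():
        if len(itemset) < 2:
            continue  # a rule needs a nonempty A and a nonempty B
        items = sorted(itemset)
        # Enumerate every proper nonempty subset A of X.
        for size in range(1, len(items)):
            for subset in combinations(items, size):
                antecedent = frozenset(subset)
                # confidence(A -> B) = support(AB) / support(A)
                conf = count / support_counts[antecedent]
                if conf >= minconf:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules
```

As a usage example, suppose diapers occurs in 4 transactions, beer in 3, and both together in 3. With `minconf=0.8`, only beer -> diapers (confidence 3/3) is kept; diapers -> beer (confidence 3/4) is rejected.
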
Issues
• Efficiency and thresholding for minsup
• Number of association rules
  • size of data vs. size of association rules
• Post-processing
• Applications
  • combining association rules with classification
  • emerging patterns
Types of association rules
• Single-dimensional association rules
• Multidimensional association rules
• Multi-level association rules
• Many other research activities on association rules
  • Creating association rules without candidate set generation
  • Speeding up rule generation
  • Interestingness measures