A Parameterised Algorithm for Mining Association Rules
Nuansri Denwattana and Janusz R. Getta
Proceedings of the 12th Australasian Database Conference (ADC 2001), 29 Jan.–2 Feb. 2001, pp. 45-51.
Department of Information & Computer Education, NTNU
Advisor: Jia-Ling Koh
Speaker: Chen-Yi Lin
Outline • Introduction • Problem Definition • Finding Frequent Itemsets • Experimental Results • Conclusion
Introduction (1/2) • Most algorithms for finding frequent itemsets count only one category (size) of itemsets per database pass, e.g. the Apriori algorithm. • The quality of an association rule mining algorithm is determined by: • the number of passes through the input dataset • the number of candidate itemsets
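The level-wise behaviour described above can be sketched as follows. This is a minimal, simplified Apriori in Python (absolute support counts, no optimisations), not the paper's implementation; note that each iteration of the loop costs one full scan of the transactions, which is exactly the cost the (n, p) algorithm tries to reduce.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise Apriori: each pass counts candidates of one size only."""
    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s for s, c in counts.items() if c >= min_support}
    all_frequent = set(frequent)
    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets.
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k}
        # Prune candidates that have an infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        # One full database scan per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            ts = set(t)
            for c in candidates:
                if c <= ts:
                    counts[c] += 1
        frequent = {c for c, cnt in counts.items() if cnt >= min_support}
        all_frequent |= frequent
        k += 1
    return all_frequent
```

With n levels of frequent itemsets present, plain Apriori therefore needs about n database scans, one per level.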
Introduction (2/2) • One objective is to construct an algorithm that makes a good guess about which itemsets are frequent. • The parameterised (n, p) algorithm finds all frequent itemsets in a range of n levels of the itemset lattice using p passes (n >= p) through the input data set.
Problem Definition • Positive candidate itemset: assumed (guessed) to be frequent. • Negative candidate itemset: assumed (guessed) to be infrequent. • Remaining candidate itemset: a candidate whose frequency must be verified in an additional scan.
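The three categories can be sketched in code. The helper names below (`split_by_guess`, `remaining_after_scan`) are hypothetical and the logic is a simplification of the paper's scheme: candidates are split by the guess, both groups are counted in one scan, and wrongly guessed candidates are the ones that force further verification.

```python
def split_by_guess(candidates, assumed_frequent):
    """Split candidates into positive (guessed frequent) and negative
    (guessed infrequent) sets. Hypothetical helper, not the paper's code."""
    positive = {c for c in candidates if c in assumed_frequent}
    return positive, candidates - positive

def remaining_after_scan(positive, negative, support, min_support):
    """After one counting scan, wrongly guessed candidates give rise to
    'remaining' candidates that an additional scan must settle (sketch)."""
    wrong_pos = {c for c in positive if support.get(c, 0) < min_support}
    wrong_neg = {c for c in negative if support.get(c, 0) >= min_support}
    return wrong_pos | wrong_neg
```

If every guess is correct, `remaining_after_scan` returns the empty set and no extra pass is needed, which is where the savings in database scans come from.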
Finding Frequent Itemsets (Guessing Candidate Itemsets)
[Figure: an initial scan of the database builds the statistics table T.]
Guessing from the statistics table T (apriori_gen example): • Item frequency threshold = 80%; m-element transaction threshold = 5 • Number of lattice levels to traverse: n = 3; number of passes through the input data set: p = 2 • 3-element transactions (5 of them): threshold 5 × 80% = 4, guessed itemset {B} • 4-element transactions (2 of them): threshold 2 × 80% = 2 (rounded up), guessed itemset {ABC} • 5-element transactions (3 of them): threshold 3 × 80% = 3 (rounded up), guessed itemset {BCEF}
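The guessing arithmetic on this slide can be sketched as follows. The function `guess_from_statistics` is a hypothetical helper reflecting one reading of the example (within each group of m-element transactions, keep the items occurring in at least 80% of the group, rounding the cutoff up); it is not the paper's actual procedure.

```python
from collections import Counter, defaultdict
from math import ceil

def guess_from_statistics(transactions, item_freq_threshold=0.8):
    """Group transactions by length m; within each group, guess the items
    that occur in at least item_freq_threshold of the group's transactions.
    Sketch of the slide's statistics-table example, with cutoffs rounded up."""
    groups = defaultdict(list)
    for t in transactions:
        groups[len(t)].append(set(t))
    guesses = {}
    for m, ts in groups.items():
        cutoff = ceil(item_freq_threshold * len(ts))
        counts = Counter(item for t in ts for item in t)
        guesses[m] = frozenset(i for i, c in counts.items() if c >= cutoff)
    return guesses
```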
[Figure: apriori_gen generates candidates; pruning removes all subsets of a positive superset.]
Department of Information & Computer Education, NTNU scan DB (1) generate remaining candidate itemsets Finding Frequent Itemsets (Verification of Candidate Itemsets) Minimum support=20%
[Figure: apriori_gen and scan DB (2) verify the remaining candidate itemsets.]
Finding Frequent Itemsets
Experimental Results (1/6) • Parameters: • ntrans – number of transactions in the database • tl – average transaction length • np – number of patterns • sup – minimum support
Experimental Results (2/6) A comparison of the number of database scans between the Apriori and (n, p) algorithms
Experimental Results (3/6) Performance of the Apriori and (n, p) algorithms with tl = 10, np = 10, sup = 20%
Experimental Results (4/6) Performance of the Apriori and (n, p) algorithms with tl = 14, np = 10, sup = 20% Performance of the Apriori and (n, p) algorithms with tl = 20, np = 100, sup = 10%
Experimental Results (5/6) Performance of (n, 3) with an increasing ratio n/p
Experimental Results (6/6) Performance of (8, p) with an increasing parameter p
Conclusion • The main contribution is the reduction of the number of scans through the data set.