An Adaptive Algorithm for Finding Frequent Sets in Landmark Windows

Xuan Hong Dang, Kok-Leong Ong and Vincent Lee • Dept. of Computer Science, Aarhus University, Denmark • School of IT, Deakin University, Australia • Faculty of IT, Monash University, Australia

Applications • Sensors of all sorts are generating a lot of data streams • Many applications consume these data streams to discover evolving knowledge about the data stream

Problem • Data rates can exceed compute capacity • Machine must adapt to produce results on time • HOW?

A solution for finding frequent sets • Our method • Approximate frequency counts • Built-in adaptability in processing through load shedding • Applicable to landmark, forgetful and sliding windows

StreamL • Given a transaction stream • {t1, t2, t3, …, ti, tj, …} • ti = {x1, x2, …}, where each xa is a literal • [Figure: landmark window over the stream]

StreamL • Capacity is bounded by the number of transactions in the window and the size of each transaction • How to measure this capacity? • A simple way is to use MFS to estimate how many itemsets to process in each transaction, i.e.,

StreamL • For n transactions in the window, the number of itemsets to process is • If r is the rate, then the capacity to process each transaction can be

StreamL • When the rate increases, the idea is to introduce a P such that • to maintain a non-overload situation. • To achieve a load of C, the adjustment made by P is achieved by dropping transactions
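The shedding idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' algorithm: the slide's formula for P is not preserved here, so the sketch assumes the load is the rate r times a per-transaction cost and sets P to C divided by that load when the load exceeds C. The names `keep_probability`, `shed`, `estimated_support` and `cost_per_txn` are hypothetical.

```python
import random


def keep_probability(rate, cost_per_txn, capacity):
    """Return an assumed keep probability P for the given load.

    Assumption (not from the slides): load = rate * cost_per_txn,
    and P = capacity / load whenever the load exceeds the capacity.
    """
    load = rate * cost_per_txn
    if load <= capacity:
        return 1.0  # no overload, keep every transaction
    return capacity / load


def shed(transactions, p, rng):
    """Keep each transaction independently with probability p,
    i.e. drop it with probability 1 - p."""
    return [t for t in transactions if rng.random() < p]


def estimated_support(kept, itemset):
    """Estimate the support of `itemset` as its relative frequency
    among the transactions that survived shedding."""
    if not kept:
        return 0.0
    target = set(itemset)
    hits = sum(1 for t in kept if target <= set(t))
    return hits / len(kept)


if __name__ == "__main__":
    rng = random.Random(0)
    window = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "b", "c"}] * 250
    p = keep_probability(rate=400, cost_per_txn=10, capacity=2000)
    kept = shed(window, p, rng)
    print(p, len(kept), estimated_support(kept, {"a", "b"}))
```

Because transactions are dropped at random, the estimate from the kept transactions differs from the true support, which is why an error term is needed.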

StreamL • When transactions are dropped in a window, • {t1, t2, t3, …, ti, tj, …} • Frequency of X becomes inaccurate • Qualify this with an error e • [Figure: landmark window over the stream]

StreamL • Qualify this with an error e, which is the result of dropping transactions with probability 1 - P (< 1) • We can use e to compute a guarantee using the Chernoff bound, i.e., • How confident we are that the true support of X deviates from the estimated support of X by at most ±e
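To make the guarantee concrete, here is a minimal Python sketch. The exact bound used on the slide is not preserved, so this assumes the additive Chernoff-Hoeffding form, Pr(|estimated support - true support| ≥ e) ≤ 2·exp(-2·n·e²), where n is the number of transactions kept after shedding. The function names and the parameter `delta` (the tolerated failure probability) are hypothetical.

```python
import math


def deviation_bound(kept, eps):
    """Upper bound on the probability that the estimated support is
    off by at least eps, given `kept` sampled transactions.

    Assumes the additive Chernoff-Hoeffding bound 2 * exp(-2 * n * eps^2);
    the paper may use a different variant.
    """
    return min(1.0, 2.0 * math.exp(-2.0 * kept * eps * eps))


def kept_needed(eps, delta):
    """Smallest number of kept transactions for which the bound above
    drops to delta or below."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))


if __name__ == "__main__":
    n = kept_needed(eps=0.01, delta=0.05)
    print(n, deviation_bound(n, 0.01))
```

Read this way, a smaller P keeps fewer transactions, so for a fixed error e the confidence that the estimate is within ±e of the true support drops.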

Details • We presented a sketch of the idea • See the paper for the landmark window algorithm • The idea can be extended to other windows; see the technical report for forgetful and sliding windows

Thank You
