This study explores methods for summarizing closed sequential patterns with closed partial orders. It focuses on grouping closed sequential patterns and obtaining compact representations. The experiment compares different datasets for evaluation.
Summarizing Sequential Data with Closed Partial Orders • Gemma Casas-Garriga • Proceedings of the SIAM International Conference on Data Mining (SDM'05) • Advisor: Jia-Ling Koh • Speaker: Chun-Wei Hsieh • 03/10/2006
Introduction • Closed patterns form a compact and significant set • The number of closed patterns may still be quite large • Closed patterns can be summarized in a post-processing step
Motivation • <(A)(C)(C)(C)(A)>, <(C)(A)(C)(C)(A)>: which one is better than the other?
Main steps • Grouping Closed Sequential Patterns • Obtaining Closed Partial Orders
Grouping Closed Sequential Patterns • A valid pair (S, T) • S ⊆ CS is a nonredundant set of closed sequences whose tid lists contain at least T • T ⊆ D is the maximal set of transactions in which all s ∈ S are contained.
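The two halves of the valid-pair definition can be tried out on a toy database. The sketch below is only an illustration under simplifying assumptions: every itemset holds a single item, so a sequence is written as a plain string, and the names `is_subsequence`, `tid_list`, and `common_tids` are hypothetical helpers, not taken from the paper.

```python
def is_subsequence(s, t):
    """True if s can be obtained from t by deleting events (order preserved)."""
    it = iter(t)
    return all(event in it for event in s)


def tid_list(s, database):
    """Ids of the transactions that contain the sequence s."""
    return frozenset(tid for tid, t in enumerate(database) if is_subsequence(s, t))


def common_tids(S, database):
    """Maximal set of transactions in which every s in S is contained."""
    return frozenset.intersection(*(tid_list(s, database) for s in S))


# Toy database (assumption: made up for illustration only).
database = ["ACCCA", "CACCA", "CAC"]

tids_cca = tid_list("CCA", database)
tids_cac = tid_list("CAC", database)
T = common_tids(["CCA", "CAC"], database)
```

Here `T` is the only candidate for the second component of a pair whose first component is {CCA, CAC}; whether that set is also nonredundant and closed has to be checked separately.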
Grouping Closed Sequential Patterns • A naive approach is to group closed sequences with the same tid list • The naive way may miss some elements • Ex: <(C)(A)(C)>?
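Why the naive grouping can miss elements is easier to see on a small case. The following sketch assumes single-item itemsets (sequences written as strings), a made-up three-transaction database, and a list `closed` that is taken as given rather than mined; all function names are hypothetical.

```python
from collections import defaultdict


def is_subsequence(s, t):
    """True if s can be obtained from t by deleting events (order preserved)."""
    it = iter(t)
    return all(event in it for event in s)


def tid_list(s, database):
    return frozenset(i for i, t in enumerate(database) if is_subsequence(s, t))


def naive_groups(closed, database):
    """Naive approach: group sequences whose tid lists are identical."""
    groups = defaultdict(set)
    for s in closed:
        groups[tid_list(s, database)].add(s)
    return dict(groups)


def sequences_for(T, closed, database):
    """Sequences whose tid list is at least T, with redundant ones removed."""
    candidates = [s for s in closed if tid_list(s, database) >= T]
    return {s for s in candidates
            if not any(s != other and is_subsequence(s, other) for other in candidates)}


# Toy data (assumption): closed sequences of this database, listed by hand.
database = ["AB", "BA", "A"]
closed = ["AB", "BA", "A", "B"]

T = frozenset({0, 1})
naive = naive_groups(closed, database)[T]
full = sequences_for(T, closed, database)
```

On this toy input the naive group for T = {0, 1} holds only B, while the definition of a valid pair also admits A, whose tid list {0, 1, 2} is strictly larger than T.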
Grouping Closed Sequential Patterns • Let (S, T) be a valid pair; then S = ⋂_{t ∈ T} t • For all s ∈ S, tid(s) is at least T • The grouping therefore has to use the transactions of the database
Grouping Closed Sequential Patterns • Given two valid pairs (S′, T′) and (S, T), if T ⊆ T′ then for all s′ ∈ S′ there exists s ∈ S s.t. s′ ⊆ s.
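The containment property between valid pairs can be checked mechanically. This is a minimal sketch, assuming single-item itemsets written as strings and three hand-built pairs for the toy database AB, BA, A; `respects_order` is a hypothetical name.

```python
def is_subsequence(s, t):
    """True if s can be obtained from t by deleting events (order preserved)."""
    it = iter(t)
    return all(event in it for event in s)


def respects_order(pair, larger_pair):
    """Check: if T is a subset of T', every s' in S' is contained in some s in S."""
    S, T = pair
    S2, T2 = larger_pair
    if not T <= T2:
        raise ValueError("the property only applies when T is a subset of T'")
    return all(any(is_subsequence(s2, s) for s in S) for s2 in S2)


# Toy valid pairs (assumption) for the database ["AB", "BA", "A"].
pair_0 = ({"AB"}, frozenset({0}))
pair_01 = ({"A", "B"}, frozenset({0, 1}))
pair_012 = ({"A"}, frozenset({0, 1, 2}))

check_small = respects_order(pair_0, pair_01)
check_large = respects_order(pair_01, pair_012)
```

Fewer transactions give longer, more specific sequences, so each sequence of the pair with the larger tid set reappears as a subsequence in the pair with the smaller one.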
Obtaining Closed Partial Orders • Obtain a compact representation from each valid pair (S, T) • A partial order can be modelled as a triple p = (V, E, l)
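One way to hold the triple p = (V, E, l) in code is sketched below. It assumes the usual reading of the notation (V a set of vertices, E a set of directed edges, l a labelling that maps each vertex to an item), which the slide itself does not spell out; the class and its field names are hypothetical. `graphlib` needs Python 3.9 or later.

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter


@dataclass
class PartialOrder:
    vertices: frozenset   # V
    edges: frozenset      # E: pairs (u, v) meaning "u comes before v"
    label: dict           # l: vertex -> item

    def is_acyclic(self):
        """A partial order must not contain a cycle of 'comes before' edges."""
        predecessors = {v: set() for v in self.vertices}
        for u, v in self.edges:
            predecessors[v].add(u)  # TopologicalSorter maps node -> predecessors
        try:
            tuple(TopologicalSorter(predecessors).static_order())
        except CycleError:
            return False
        return True


# Toy order (assumption): A and B are unordered, both come before C.
p = PartialOrder(
    vertices=frozenset({0, 1, 2}),
    edges=frozenset({(0, 2), (1, 2)}),
    label={0: "A", 1: "B", 2: "C"},
)
bad = PartialOrder(frozenset({0, 1}), frozenset({(0, 1), (1, 0)}), {0: "A", 1: "C"})
```

Keeping labels separate from vertex ids matters because the same item can occur at several positions, as C does in <(A)(C)(C)(C)(A)>.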
Obtaining Closed Partial Orders • Given a set of sequences S, let s, s′ ∈ S be two sequences s = <(I_1)…(I_n)>, s′ = <(I′_1)…(I′_m)> • If − I_i = I′_j; and − head(s, i) ⋄ tail(s′, j + 1) ⊆ s″, for some s″ ∈ S; and − head(s′, j) ⋄ tail(s, i + 1) ⊆ s‴, for some s‴ ∈ S, then position i of s matches position j of s′; this is written s[i] ∼ s′[j].
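The matching test can be sketched directly from the three conditions. The code assumes single-item itemsets written as strings, 1-based positions, head(s, i) as the prefix ending at position i, tail(s, k) as the suffix starting at position k, and ⋄ as concatenation; these readings and all function names are assumptions made for illustration.

```python
def is_subsequence(s, t):
    """True if s can be obtained from t by deleting events (order preserved)."""
    it = iter(t)
    return all(event in it for event in s)


def head(s, i):
    """Prefix of s up to and including position i (1-based)."""
    return s[:i]


def tail(s, k):
    """Suffix of s starting at position k (1-based); empty if k > len(s)."""
    return s[k - 1:]


def positions_match(s, i, s2, j, S):
    """Test whether position i of s matches position j of s2 within S."""
    if s[i - 1] != s2[j - 1]:
        return False
    left = head(s, i) + tail(s2, j + 1)
    right = head(s2, j) + tail(s, i + 1)
    return (any(is_subsequence(left, u) for u in S)
            and any(is_subsequence(right, u) for u in S))


def all_matches(s, s2, S):
    return {(i, j)
            for i in range(1, len(s) + 1)
            for j in range(1, len(s2) + 1)
            if positions_match(s, i, s2, j, S)}


S = ["ACCCA", "CACCA"]
matches = all_matches(S[0], S[1], S)
```

For this S the sketch reports matches only between the last three positions of the two sequences; the leading A and C do not match because swapping heads and tails there produces a sequence longer than anything in S.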
Obtaining Closed Partial Orders • S = {<(A)(C)(C)(C)(A)>, <(C)(A)(C)(C)(A)>} • [Figure: head/tail splits of the two sequences used to test position matches]
Obtaining Closed Partial Orders • Using the transitivity property to improve the algorithm • Transitivity: given a valid pair (S, T), let s, s′, s″ ∈ S; if s[i] ∼ s′[j] and s′[j] ∼ s″[k], then s[i] ∼ s″[k].
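Transitivity means matched positions form equivalence classes, so a pair already known to be in the same class need not be tested again. A disjoint-set structure is one way to exploit this; it is a swapped-in standard technique, not necessarily what the paper implements, and the matches fed to it below are hypothetical.

```python
class DisjointSet:
    """Union-find over (sequence name, position) keys."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving: point x at its grandparent while walking up.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

    def same(self, x, y):
        return self.find(x) == self.find(y)


classes = DisjointSet()
# Hypothetical matches found by explicit testing: s[3] ~ s'[3] and s'[3] ~ s''[2].
classes.union(("s", 3), ("s'", 3))
classes.union(("s'", 3), ("s''", 2))

# s[3] ~ s''[2] follows by transitivity, so the explicit test can be skipped.
implied = classes.same(("s", 3), ("s''", 2))
unrelated = classes.same(("s", 3), ("s", 4))
```

Each class of positions can then become a single vertex of the partial order, which is where the saving over pairwise testing comes from.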
Experiment • 3 different sequential databases • Synthetic data (1000 transactions) • The command history of a Unix computer user (607 transactions) • The first chapter of the book “1984” by George Orwell (340 transactions)