Finding Approximate Repeating Patterns from Sequence Data
Jia-Lien Hsu, Arbee L.P. Chen, Hung-Chen Chen
Proceedings: ISMIR 2004
Speaker: Pei-Min Chou
Date: 2005/09/30
Introduction
• Discover universal properties
• Repetitions and trends
• Application in music retrieval
Approximate type
• Longer_length
• Shorter_length
• Equal_length
• Pattern may repeat with some variance
• Example: the pattern ABC matches ABKC (Longer_length, the type studied in this paper)
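Def -- Longer_length_match(P, LL)
• P = (p_1, p_2, …, p_m)
• LL = (s_1, s_2, …, s_n)
• Longer_length_match(P, LL) =
    1, if there are positions b_1 < b_2 < … < b_m in LL with p_i = s_{b_i} for i = 1, 2, …, m
    0, otherwise
• r = n - m: approximation degree
• Example: P = (A,B,C,D), LL = (A,B,K,C,M,D)
• Longer_length_match(P, LL) = 1, r = 2 (K and M are the two extra symbols)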
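The definition above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code. It assumes the match is anchored, meaning the first and last symbols of P line up with the first and last symbols of LL; the slide does not state this, but every example in the deck is consistent with it.

```python
def longer_length_match(p, ll):
    """Return 1 if pattern p matches the longer sequence ll, else 0."""
    m, n = len(p), len(ll)
    if m == 0 or n < m:
        return 0
    # Assumption (not stated on the slide): b_1 = 1 and b_m = n,
    # so the ends of p must coincide with the ends of ll.
    if p[0] != ll[0] or p[-1] != ll[-1]:
        return 0
    if m == 1:
        return 1 if n == 1 else 0
    # The inner symbols of p must appear, in order, inside ll.
    inner = iter(ll[1:-1])
    return 1 if all(sym in inner for sym in p[1:-1]) else 0


print(longer_length_match("ABCD", "ABKCMD"))  # prints 1 (r = 6 - 4 = 2)
```

The `sym in inner` test consumes the iterator as it searches, which is what enforces the left-to-right order of the matched positions.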
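Def -- Freq(P, S, r, longer_length)
• Freq(P, S, r, longer_length) = Σ_i longer_length_match(P, LL_i)
• LL_i: substring of S
• |LL_i| = |P| + r
• Counted substrings must not overlap: for any LL_i = S[a…b] and LL_j = S[c…d] with longer_length_match(P, LL_i) = 1 and longer_length_match(P, LL_j) = 1, either b < c or d < a
• Example (figure; layout lost in extraction): P = ABC, with S = AABCDEA marked X and S = ABCCABCD marked V; substring endpoints are labeled a, b and c, d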
Example1
• P = (ABC)
• S = (AKBCDEABLCF), consider r = 1
• |LL_i| = |P| + r = 3 + 1 = 4
• LL_1 = AKBC, LL_2 = ABLC: AKBC DE ABLC F
• Freq(ABC, S, 1, longer_length) = 2
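The count in Example1 can be reproduced with a short scan over S. This is a sketch under two assumptions that are not on the slides: matches are anchored at both ends of the window, and windows are taken greedily from left to right. For windows of equal length, the greedy scan gives the largest set of non-overlapping matches. The function names are illustrative.

```python
def longer_length_match(p, ll):
    """1 if p matches ll with anchored ends (assumption), else 0."""
    if not p or len(ll) < len(p) or p[0] != ll[0] or p[-1] != ll[-1]:
        return 0
    if len(p) == 1:
        return 1 if len(ll) == 1 else 0
    inner = iter(ll[1:-1])
    return 1 if all(sym in inner for sym in p[1:-1]) else 0


def freq(p, s, r):
    """Count non-overlapping windows of length |p| + r in s that match p."""
    w = len(p) + r
    count, i = 0, 0
    while i + w <= len(s):
        if longer_length_match(p, s[i:i + w]):
            count += 1
            i += w          # jump past this window so matches never overlap
        else:
            i += 1
    return count


print(freq("ABC", "AKBCDEABLCF", 1))  # prints 2 (windows AKBC and ABLC)
```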
Definition
• Pa_i: range of pattern_length
• Pa_r: range of approximation degree, Pa_r = {0, 1, …, max_pa_r}
• Pa_f: minimal repeating frequency
• AT: approximation type
Example2
• S = ABFCDLBMABPFCFD
• Consider Pa_i = {1,2,3,4}, Pa_r = {0,1}, Pa_f = 2, AT = longer_length
• P1 = {"A","B","C","D","F"}
• P2 = {"AB","BF","CD","FC","FD"}
• P3 = {"ABF","BFC","FCD"}
• P4 = {"ABFC"}
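The sets P1 to P4 in Example2 can be cross-checked with a brute-force search. This is not the level-wise method from the talk, only an exhaustive check for small inputs. It assumes anchored matches and assumes that a pattern's frequency counts non-overlapping matches of any degree r in Pa_r together; "BF" only reaches frequency 2 under that reading. All function names are hypothetical.

```python
from itertools import combinations


def longer_length_match(p, ll):
    """1 if p matches ll with anchored ends (assumption), else 0."""
    if not p or len(ll) < len(p) or p[0] != ll[0] or p[-1] != ll[-1]:
        return 0
    if len(p) == 1:
        return 1 if len(ll) == 1 else 0
    inner = iter(ll[1:-1])
    return 1 if all(sym in inner for sym in p[1:-1]) else 0


def frequency(p, s, max_r):
    """Largest number of non-overlapping matches with degree 0..max_r."""
    spans = []
    for r in range(max_r + 1):
        w = len(p) + r
        for i in range(len(s) - w + 1):
            if longer_length_match(p, s[i:i + w]):
                spans.append((i, i + w - 1))
    spans.sort(key=lambda span: span[1])   # greedy by earliest end
    count, last_end = 0, -1
    for start, end in spans:
        if start > last_end:
            count, last_end = count + 1, end
    return count


def candidates(s, m, max_r):
    """All length-m patterns obtainable from a window of length m..m+max_r."""
    found = set()
    for r in range(max_r + 1):
        w = m + r
        for i in range(len(s) - w + 1):
            win = s[i:i + w]
            if m == 1:
                if w == 1:
                    found.add(win)
                continue
            for mid in combinations(win[1:-1], m - 2):
                found.add(win[0] + "".join(mid) + win[-1])
    return found


def mine(s, lengths, max_r, min_freq):
    return {m: sorted(p for p in candidates(s, m, max_r)
                      if frequency(p, s, max_r) >= min_freq)
            for m in lengths}


result = mine("ABFCDLBMABPFCFD", [1, 2, 3, 4], 1, 2)
print(result[4])  # prints ['ABFC']
```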
Approach
• Level-wise
• Find approximate repeating patterns
• Cut
• Pattern_join
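Cut
• Reduce the substring
• Cut_i = S[a…b], i = 1, 2, 3, …
• cw = max_pa_i + max_pa_r
• a = 1 + (cw*(i-1))
• b = min((2*cw-1) + (cw*(i-1)), strlen(S))
• i: cut_id; strlen(S): length of S
• Consecutive cuts overlap by cw-1 symbols, so every substring of length at most cw lies entirely inside some cut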
Example -- cut
• S = ABFCDLBMABPFCFD
• Consider max_pa_i = 4, max_pa_r = 1
• cw = 4 + 1 = 5
• Cut1 = "ABFCDLBMA": a = 1 + (5*(1-1)) = 1, b = min((2*5-1) + (5*(1-1)), 15) = 9
• Cut2 = "LBMABPFCF"
• Cut3 = "PFCFD"
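The cut formulas translate directly into code. The sketch below follows the slide's 1-based a and b and converts to Python's 0-based slicing only when taking the substring. The function name `make_cuts` and the stopping rule (stop once a runs past the end of S) are assumptions, since the slides do not say when to stop.

```python
def make_cuts(s, max_pa_i, max_pa_r):
    """Split s into overlapping cuts using the a/b formulas from the slide."""
    cw = max_pa_i + max_pa_r
    cuts = []
    i = 1                                   # cut_id, 1-based as on the slide
    while 1 + cw * (i - 1) <= len(s):
        a = 1 + cw * (i - 1)
        b = min((2 * cw - 1) + cw * (i - 1), len(s))
        cuts.append(s[a - 1:b])             # S[a..b] with 1-based, inclusive ends
        i += 1
    return cuts


print(make_cuts("ABFCDLBMABPFCFD", 4, 1))
# prints ['ABFCDLBMA', 'LBMABPFCF', 'PFCFD']
```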
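Pattern_join
• P_i = {<pat_i(1), plist_i(1)>, …, <pat_i(j), plist_i(j)>}
• P_i: set of patterns of length i
• pat_i(j): j-th pattern in P_i
• plist_i(j): position list of pat_i(j), with entries (cut_id: start, end)
• Ex. S = ABFCDLBMABPFCFD, Cut1 = "ABFCDLBMA", Cut2 = "LBMABPFCF", Cut3 = "PFCFD"
• P2 contains <"BF", (1:2,3), (2:5,7)>, freq = 2
• Note: if start > cw = 5, the plist_i(j) entry is a dummy and does not count toward freq
• P2 contains <"FC", (1:3,4), (2:7,8), (3:2,3)>, freq = 2; the entry (2:7,8) is a dummy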
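The position lists in the example can be rebuilt by scanning each cut. The slides do not say how plist entries are generated, so this sketch makes several assumptions: matches are anchored at both ends, the shortest matching window is taken at each start position, the scan resumes after a recorded match, and entries whose start exceeds cw are skipped when counting frequency. `build_plist` and `freq_from_plist` are hypothetical names.

```python
def longer_length_match(p, ll):
    """1 if p matches ll with anchored ends (assumption), else 0."""
    if not p or len(ll) < len(p) or p[0] != ll[0] or p[-1] != ll[-1]:
        return 0
    if len(p) == 1:
        return 1 if len(ll) == 1 else 0
    inner = iter(ll[1:-1])
    return 1 if all(sym in inner for sym in p[1:-1]) else 0


def build_plist(pattern, cuts, max_r):
    """Return [(cut_id, start, end), ...] with 1-based, inclusive positions."""
    entries = []
    for cut_id, cut in enumerate(cuts, start=1):
        i = 0
        while i < len(cut):
            hit = None
            for r in range(max_r + 1):      # try the shortest window first
                w = len(pattern) + r
                if i + w <= len(cut) and longer_length_match(pattern, cut[i:i + w]):
                    hit = (cut_id, i + 1, i + w)
                    break
            if hit:
                entries.append(hit)
                i = hit[2]                  # resume right after the match
            else:
                i += 1
    return entries


def freq_from_plist(plist, cw):
    """Count entries that are not dummies (start <= cw)."""
    return sum(1 for _, start, _ in plist if start <= cw)


cuts = ["ABFCDLBMA", "LBMABPFCF", "PFCFD"]
bf = build_plist("BF", cuts, 1)
fc = build_plist("FC", cuts, 1)
print(bf, freq_from_plist(bf, 5))  # [(1, 2, 3), (2, 5, 7)] 2
print(fc, freq_from_plist(fc, 5))  # [(1, 3, 4), (2, 7, 8), (3, 2, 3)] 2
```

The dummy rule works because a match that starts in the second half of a cut also lies inside the next cut, where it is counted once.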
Definition -- pattern_join
• PJ(<pat_i(a), plist_i(a)>, <pat_i(b), plist_i(b)>) =
    <pat_{i+1}(c), plist_{i+1}(c)>, if pat_i(a)[2..i] = pat_i(b)[1..(i-1)]
    Ø, otherwise
• Example: PJ(<"BF",(1:2,3),(2:5,7)>, <"FD",(1:3,5),(3:4,5)>) = <"BFD",(1:2,5)>
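A possible reading of pattern_join in code is shown below. The suffix/prefix test on the two patterns follows the definition. How the two position lists are combined is not spelled out on the slide, so this sketch pairs entries from the same cut and keeps a joined entry only if the joined pattern really matches that span of the cut; that verification step, the anchored match, and the max_r bound are assumptions. The empty result Ø is represented as `None`.

```python
def longer_length_match(p, ll):
    """1 if p matches ll with anchored ends (assumption), else 0."""
    if not p or len(ll) < len(p) or p[0] != ll[0] or p[-1] != ll[-1]:
        return 0
    if len(p) == 1:
        return 1 if len(ll) == 1 else 0
    inner = iter(ll[1:-1])
    return 1 if all(sym in inner for sym in p[1:-1]) else 0


def pattern_join(a, b, cuts, max_r):
    """Join two length-i patterns into one length-(i+1) pattern, or None."""
    (pat_a, plist_a), (pat_b, plist_b) = a, b
    # Definition: pat_a[2..i] must equal pat_b[1..(i-1)] (1-based).
    if pat_a[1:] != pat_b[:-1]:
        return None
    joined = pat_a + pat_b[-1]
    plist = []
    for cut_a, start_a, end_a in plist_a:
        for cut_b, start_b, end_b in plist_b:
            if cut_a != cut_b or not (start_a < start_b and end_a < end_b):
                continue
            window = cuts[cut_a - 1][start_a - 1:end_b]
            # Assumed check: the span must match and stay within degree max_r.
            if (len(window) <= len(joined) + max_r
                    and longer_length_match(joined, window)):
                plist.append((cut_a, start_a, end_b))
    return joined, plist


cuts = ["ABFCDLBMA", "LBMABPFCF", "PFCFD"]
bf = ("BF", [(1, 2, 3), (2, 5, 7)])
fd = ("FD", [(1, 3, 5), (3, 4, 5)])
print(pattern_join(bf, fd, cuts, 1))  # ('BFD', [(1, 2, 5)])
```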
Conclusion
• Complete
• Longer_length approximation type
• Level_wise approach
• A preliminary performance study indicates that the approach is efficient
• Future work
  • Effectiveness on real data
  • Polyphonic music objects
  • Study of the Shorter_length and Equal_length types