Similarity Measure Based on Partial Information of Time Series • Advisor: Dr. Hsu • Graduate: You-Cheng Chen • Authors: Xiaoming Jin, Yuchang Lu, Chunyi Shi
Outline • Motivation • Objective • Introduction • Retrieval and Representation of Partial Information • System Setup • Results and Discussion • Conclusions • Personal Opinion
Motivation • Whether a similarity measurement is “good” is determined by human judgment.
Objective • To propose a model for the retrieval and representation of the partial information in time series.
Introduction The model has three objectives: • Retrieve the partial information • Represent the partial information in a compressed form • Remain compatible with most existing similarity models
Retrieval and Representation of Partial Information • 3.1 General Description Definition 1: Use a rule F to decompose a time series X into a set of component time series
3.1 General Description Definition 2: (1) Segment X into a set of sub-series Xj (2) X’jk is the k-th F-based component of sub-series Xj (3) Use a mapping rule T to map each X’jk to a value Rk(j)
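Definition 2 can be illustrated with a minimal sketch. The slide does not fix F or T, so the choices below are assumptions made only for illustration: F splits each sub-series into its mean level (a "global" component) and the residual (a "local" component), and T maps a component to its mean or its mean absolute value. The series length 200 and sub-series length 8 are also illustrative, and all function names are hypothetical.

```python
def segment(x, w):
    """Split x into consecutive, non-overlapping sub-series of length w (any tail is dropped)."""
    return [x[i:i + w] for i in range(0, len(x) - w + 1, w)]

def representing_sequence(x, w, component, mapping):
    """Build R_k: extract one F-based component from every sub-series, then apply T to it."""
    return [mapping(component(sub)) for sub in segment(x, w)]

# --- Hypothetical F: two components per sub-series (an assumption, not the authors' rule) ---
def global_component(sub):
    m = sum(sub) / len(sub)
    return [m] * len(sub)           # constant level of the sub-series

def local_component(sub):
    m = sum(sub) / len(sub)
    return [v - m for v in sub]     # fluctuation around that level

# --- Hypothetical T: map a component to a single value ---
def mean(values):
    return sum(values) / len(values)

def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)

x = [float(i % 8) for i in range(200)]                       # toy series of length 200
r_global = representing_sequence(x, 8, global_component, mean)
r_local = representing_sequence(x, 8, local_component, mean_abs)
print(len(r_global), r_global[0], r_local[0])
```

Each representing sequence has one value per sub-series, so the representation is shorter than the original series by a factor equal to the sub-series length.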
3.1 General Description Definition 3: […] is the set of orders of all the representing sequences of interest, where An is the degree of the user’s interest in the n-th component and […] is the portion of partial information of interest
3.1 General Description Definition 4: […] is the full representing sequence (FRS) of the partial information
3.1 General Description Definition 5: Given two time series X,Y
3.1 General Description In sum, a representation model for partial information is specified by three choices: • Decomposition method F • Representation method T • Distance measurement D
3.1 General Description Example 1
3.1 General Description Use F to decompose the time series into two components: (1) Local fluctuating movement S’1 (2) Global movement S’2. Here FRS(X) = R1, and the length of FRS(X) is 200/8 = 25
3.2 Practical Method Let H be the transform matrix of a given orthonormal discrete transform, so that Tj = H*Xj. We denote the results of the discrete transform of time series Xj and Yj by DT(Xj) = XTj and DT(Yj) = YTj
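As a rough illustration of the matrix form H*Xj, the sketch below builds the orthonormal DCT-II matrix, which is one possible choice of orthonormal discrete transform (an assumption at this point), and applies it to a single sub-series. The names `dct_matrix` and `matvec` and the sample values are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix H (n x n): H[k][i] = c_k * cos(pi * (2i + 1) * k / (2n))."""
    h = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        h.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)])
    return h

def matvec(h, x):
    """Plain matrix-vector product H * x."""
    return [sum(a * b for a, b in zip(row, x)) for row in h]

H = dct_matrix(8)
x_j = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]   # one sub-series (illustrative values)
t_j = matvec(H, x_j)                              # transform coefficients of the sub-series

# Orthonormality means the rows of H are unit-length and mutually orthogonal,
# so the transform preserves the energy (sum of squares) of the sub-series.
energy_in = sum(v * v for v in x_j)
energy_out = sum(v * v for v in t_j)
print(energy_in, energy_out)
```

Because H is orthonormal, Euclidean distances between sub-series are the same before and after the transform; dropping coefficients can therefore only reduce the distance, never inflate it.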
3.2 Practical Method The k-th component of X is The k-th representing sequence is Then FRS(X) can be calculated as:
3.2 Practical Method We use the discrete cosine transform (DCT) in our experiments
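A DCT-based representation might look like the following sketch. It assumes that the k-th representing sequence collects the k-th DCT coefficient of every sub-series, that the FRS concatenates the sequences of interest scaled by their interest weights, and that two FRSs are compared with the Euclidean distance. None of these details are confirmed by the slides, and the names `frs`, `orders`, and `weights` are hypothetical.

```python
import math

def dct(x):
    """Orthonormal DCT-II of one sub-series."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i, v in enumerate(x)))
    return out

def frs(x, w, orders, weights):
    """Assumed FRS: for each order k of interest, the weighted k-th coefficient of every sub-series."""
    segments = [x[i:i + w] for i in range(0, len(x) - w + 1, w)]
    coeffs = [dct(s) for s in segments]
    out = []
    for k, a in zip(orders, weights):
        out.extend(a * c[k] for c in coeffs)
    return out

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

x = [math.sin(0.1 * i) for i in range(200)]
y = [v + 5.0 for v in x]                      # same shape, shifted level

# Keeping only orders 1 and 2 ignores the level (order 0), so x and y look identical.
d_shape = euclidean(frs(x, 8, [1, 2], [1.0, 1.0]), frs(y, 8, [1, 2], [1.0, 1.0]))
# Keeping only order 0 compares the levels alone.
d_level = euclidean(frs(x, 8, [0], [1.0]), frs(y, 8, [0], [1.0]))
print(d_shape, d_level)
```

The example shows why selecting a portion of the coefficients matters: the same pair of series is near-identical or far apart depending on which partial information the user declares to be of interest.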
4. System Setup • 4.1 Evaluation of Similarity Measurement Based on Partial Information We use hierarchical agglomerative clustering (HAC) to cluster the FRSs.
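For readers unfamiliar with HAC, here is a minimal sketch of the idea: start with one cluster per FRS and repeatedly merge the two closest clusters. The slides do not state the linkage criterion or the distance, so average linkage and Euclidean distance are assumptions here, and the toy vectors stand in for real FRSs.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def hac(points, num_clusters):
    """Naive agglomerative clustering with average linkage; returns clusters as lists of indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        return sum(euclidean(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > num_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]   # merge the closest pair
        del clusters[q]
    return clusters

# Toy stand-ins for FRSs: two obvious groups.
frs_list = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0], [0.05, 0.05]]
result = sorted(sorted(c) for c in hac(frs_list, 2))
print(result)
```

This naive version recomputes every linkage at each step and is only suitable for small inputs; it is meant to show the merging procedure, not to be efficient.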
5. Results and Discussion • We used historical stock data and considered only the time series of closing prices. Step 1: Use the DCT to decompose the time series and represent the partial information. Step 2: Use E = (E1, …, Er) to represent the chosen portion. Step 3: Use E to calculate K and, together with A, generate the FRS of each time series. Step 4: Calculate MD and cluster the FRSs.
Conclusions • The experimental results could help in designing a more effective and more efficient similarity measurement.
Personal Opinion The similarity measurement could be further improved by increasing the weight of the meaningful components.