
Time Series Sequence Matching

Time Series Sequence Matching. Jiaqin Wang, CMPS 565. Papers: “Fast Subsequence Matching in Time-Series Databases” by Christos Faloutsos, M. Ranganathan, and Yannis Manolopoulos; “Skyline Index for Time Series Data” by Quanzhong Li, Ines Fernando Vega Lopez, and Bongki Moon. Types of time series sequences.


Presentation Transcript


  1. Time Series Sequence Matching • Jiaqin Wang • CMPS 565

  2. Papers • “Fast Subsequence Matching in Time-Series Databases”, Christos Faloutsos, M. Ranganathan, Yannis Manolopoulos • “Skyline Index for Time Series Data”, Quanzhong Li, Ines Fernando Vega Lopez, Bongki Moon

  3. Types of time series sequences • Financial and marketing data • Stock prices • Sales numbers • Scientific databases • Weather data • Environmental data

  4. Categories of time series sequence matching • Whole matching • Data sequences and the query sequence have the same length • Subsequence matching • The query sequence is shorter than the data sequences, which may have different lengths

  5. Whole matching • Given N sequences of the same length l • Use a feature extraction function to convert each sequence into an n-dimensional value • DFT • n-dimensional value (Q1, Q2, …, Qn) • Most of the energy is in the first few coefficients • Keep only the first few coefficients • This reduces the dimensionality of each sequence

  6. Whole matching • Map each sequence to an n-dimensional point in the feature space • Take only the first 2 coefficients • Organize these points in an R-tree • Use the R-tree for indexing and search

  7. Whole matching • A new query sequence arrives • Use the DFT to convert it to a feature point • Map the query feature point into the feature space • Find the points whose distance to the query point is within the tolerance e • Consider the corresponding sequences similar
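
The whole-matching steps on slides 5 to 7 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`dft_features`, `whole_match`) are hypothetical, a linear scan stands in for the R-tree range query, and the final refinement step against the raw sequences is an added assumption about how the candidates would be post-processed. The DFT is scaled by 1/sqrt(n) so that, by Parseval's theorem, the distance between truncated feature vectors never exceeds the true Euclidean distance.

```python
import cmath
import math


def dft_features(seq, num_coeffs=2):
    """Return the first `num_coeffs` DFT coefficients as a flat tuple of reals."""
    n = len(seq)
    feats = []
    for f in range(num_coeffs):
        # Orthonormal DFT: the 1/sqrt(n) factor preserves Euclidean distance.
        c = sum(x * cmath.exp(-2j * math.pi * f * t / n) for t, x in enumerate(seq))
        c /= math.sqrt(n)
        feats.extend((c.real, c.imag))
    return tuple(feats)


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def whole_match(data, query, eps, num_coeffs=2):
    """Return indices of sequences in `data` within distance `eps` of `query`."""
    q_feat = dft_features(query, num_coeffs)
    # Filter step: a linear scan stands in for the R-tree range query.
    candidates = [
        i for i, s in enumerate(data)
        if euclidean(dft_features(s, num_coeffs), q_feat) <= eps
    ]
    # Refinement step (assumed): drop false alarms using the true distance.
    return [i for i in candidates if euclidean(data[i], query) <= eps]
```

Because the feature distance is a lower bound on the true distance, the filter step can return false alarms but should not dismiss a true match.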

  8. Some pictures of time series data and DFT • Discrete Fourier Transform (DFT) • Keep the first few (2-3) coefficients • The first few coefficients contain most of the signal's energy

  9. Feature space • TS1(0.05,3) • TS2(0.01,12) • ……

  10. Feature space • The distance e < minimum query distance

  11. Subsequence matching • A collection of N sequences, each of possibly different length • A query Q with tolerance e • Find all sequences Si (1 <= i <= N), along with the correct offsets k, such that the subsequence Si[k : k+Len(Q)-1] matches the query sequence: D(Q, Si[k : k+Len(Q)-1]) <= e
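
As a baseline for the definition above, a brute-force sequential scan can be sketched as follows. It assumes D is the Euclidean distance and uses 0-based indices; the function names are hypothetical. The index structures on the following slides aim to avoid this exhaustive scan.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def naive_subsequence_match(sequences, query, eps):
    """Return all (i, k) with D(query, S_i[k : k + len(query)]) <= eps."""
    m = len(query)
    matches = []
    for i, seq in enumerate(sequences):
        # Offsets run from 0 to len(seq) - m; shorter sequences yield nothing.
        for k in range(len(seq) - m + 1):
            if distance(query, seq[k:k + m]) <= eps:
                matches.append((i, k))
    return matches
```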

  12. ST-index • Assume a minimum query length w • Place a sliding window of size w on each data sequence at every possible offset • Extract the features of the window at each offset and map each feature vector to a point in the feature space
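
The sliding-window feature extraction can be sketched as follows, assuming the DFT is again the feature extraction function and only the first two coefficients are kept. The names `window_features` and `trail` are hypothetical. With 0-based offsets, a sequence of length Len(S) yields Len(S)-w+1 windows.

```python
import cmath
import math


def window_features(window, num_coeffs=2):
    """First `num_coeffs` orthonormal DFT coefficients of one window."""
    w = len(window)
    feats = []
    for f in range(num_coeffs):
        c = sum(x * cmath.exp(-2j * math.pi * f * t / w) for t, x in enumerate(window))
        c /= math.sqrt(w)
        feats.extend((c.real, c.imag))
    return tuple(feats)


def trail(seq, w, num_coeffs=2):
    """One feature point per window offset 0 .. len(seq) - w."""
    return [window_features(seq[k:k + w], num_coeffs) for k in range(len(seq) - w + 1)]
```

Consecutive windows overlap in all but one value, so consecutive feature points tend to lie close together in the feature space.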

  13-17. Figure • A sliding window moves over the sequence from offset 0 to Len(S)-w, giving Len(S)-w+1 windows • The length of the window is w

  18. Result • The series of points in the feature space forms a curve (a trail) • R-tree

  19. MBRs • Storing individual points in the R-tree is inefficient • Divide the trail into sub-trails, each represented by a minimum bounding rectangle (MBR)

  20. MBRs in R-tree • Combine small MBRs • Get the index information

  21. How to insert points into MBRs • Group a fixed number of points into each MBR • Group a variable number of points into each MBR

  22. I-adaptive method • A greedy algorithm • Number of disk accesses • Cost function • Average cost function

  23. Algorithm • Assign the first point of the trail to a sub-trail • For each successive point • If it increases the average cost of the current sub-trail • Then start another sub-trail • Else include this point in the current sub-trail
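
The greedy division can be sketched as follows. The slides do not give the cost function, so the one used here is an assumption: the expected number of disk accesses for an MBR with side lengths L1..Ln in a feature space normalized to the unit hypercube is estimated as the product of (Li + 0.5), and the average cost is that estimate divided by the number of points in the MBR. The names `mbr_cost` and `split_trail` are hypothetical.

```python
def mbr_cost(points):
    """Assumed average cost: prod(L_i + 0.5) over MBR sides, per point."""
    dims = range(len(points[0]))
    sides = [max(p[d] for p in points) - min(p[d] for p in points) for d in dims]
    accesses = 1.0
    for side in sides:
        accesses *= side + 0.5
    return accesses / len(points)


def split_trail(points):
    """Greedily divide a trail (list of feature points) into sub-trails."""
    if not points:
        return []
    sub_trails = [[points[0]]]
    for p in points[1:]:
        current = sub_trails[-1]
        if mbr_cost(current + [p]) > mbr_cost(current):
            # Adding p raises the average cost: start a new sub-trail.
            sub_trails.append([p])
        else:
            current.append(p)
    return sub_trails
```

With this cost, points that stay close together keep lowering the per-point cost, while a jump to a distant region enlarges the MBR enough to trigger a new sub-trail.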

  24. Skyline index for time series data • “Skyline Index for Time Series Data”, Quanzhong Li, Ines Fernando Vega Lopez, Bongki Moon

  25. Adaptive Piecewise Constant Approximation (APCA) • What is APCA?

  26. Adaptive Piecewise Constant Approximation (APCA) • Limitation of APCA • Internal overlap in MBRs

  27. Skyline Bounding Region (SBR) • SBR • N time series data objects of length l • Specify 2-dimensional regions by top and bottom skylines

  28. Approximate SBR • Many approaches • Equal-length constant-valued segments • Variable-length constant-valued segments • The ASBR covers the original SBR
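
A skyline bounding region and its equal-length approximation can be sketched as follows. This assumes the top and bottom skylines are the pointwise maximum and minimum over the grouped sequences, and that each segment of the approximation takes the maximum of the top skyline and the minimum of the bottom skyline over its span, which is what makes the ASBR cover the original SBR. The function names and the segment tuple layout are hypothetical.

```python
def skyline_bounding_region(series):
    """Top and bottom skylines of equal-length time series."""
    top = [max(vals) for vals in zip(*series)]
    bottom = [min(vals) for vals in zip(*series)]
    return top, bottom


def approximate_sbr(top, bottom, num_segments):
    """Equal-length constant-valued segments: (start, end, top, bottom)."""
    n = len(top)
    segments = []
    for s in range(num_segments):
        start = s * n // num_segments
        end = (s + 1) * n // num_segments
        if start == end:
            continue  # more segments than points: skip empty spans
        # Segment bounds enclose the skylines, so the ASBR covers the SBR.
        segments.append((start, end, max(top[start:end]), min(bottom[start:end])))
    return segments
```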

  29. Indexing approximate SBRs • R-tree-based Skyline index • Internal node • Approximate SBR (ASBR) • Pointer to child node • Leaf node • Pointer to time series data

  30. The End Thank You
