Time Series Sequence Matching

Time Series Sequence Matching Jiaqin Wang CMPS 565

Papers • “Fast Subsequence Matching in Time-Series Databases”, Christos Faloutsos, M. Ranganathan, Yannis Manolopoulos • “Skyline Index for Time Series Data”, Quanzhong Li, Ines Fernando Vega Lopez, Bongki Moon

Types of time series sequences • Financial and marketing data • Stock prices • Sales numbers • Scientific databases • Weather data • Environmental data

Categories for time series sequence matching • Whole matching • Data sequences and query sequence have the same length • Subsequence matching • Query sequence and data sequences have different lengths

Whole matching • Given N sequences with the same length l • Use a feature extraction function to convert each sequence into an n-dimensional value • DFT • n-dimensional value (Q1,Q2,…,Qn) • Most of the energy is in the first few coefficients • Keep the first few coefficients • Reduces the dimensionality of the sequence
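The feature extraction step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `dft_features`, the default `k=2`, and the 1/sqrt(n) scaling (chosen so that truncating coefficients can only shrink Euclidean distances) are assumptions made for the example.

```python
import cmath
import math

def dft_features(seq, k=2):
    """Return the first k DFT coefficients of seq as a flat list of reals.

    Assumption: the transform is scaled by 1/sqrt(n), so the full transform
    preserves Euclidean distance and a truncated one never overestimates it.
    """
    n = len(seq)
    feats = []
    for f in range(k):
        # Coefficient f: sum over t of x_t * exp(-2*pi*i*f*t/n), scaled.
        coeff = sum(x * cmath.exp(-2j * math.pi * f * t / n)
                    for t, x in enumerate(seq)) / math.sqrt(n)
        # Store real and imaginary parts as separate feature dimensions.
        feats.extend([coeff.real, coeff.imag])
    return feats
```

With k coefficients kept, each sequence becomes a point with 2k real coordinates, regardless of the original length l.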

Whole matching • Map each sequence as an n-dimensional point into the feature space • Only take the first 2 coefficients • Organize these points into an R-tree • For indexing and search

Whole matching • Newly arriving query sequence • Use DFT to convert it to a feature point • Map the query feature point into the feature space • Find the points whose distance to the query point is within tolerance e • Consider them similar
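The query procedure above can be sketched as a filter step followed by a verification step. This is a hedged illustration: the names `whole_match` and `euclidean` are hypothetical, a linear scan stands in for the R-tree range search, and `features` is assumed to be any feature function that never overestimates the true distance (for example, truncated DFT coefficients).

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def whole_match(query, sequences, eps, features):
    """Return the sequences within distance eps of query.

    Assumption: features() lower-bounds the true distance, so the
    filter step can produce false alarms but no false dismissals.
    """
    q_feat = features(query)
    # Filter: a linear scan here stands in for an R-tree range query.
    candidates = [s for s in sequences
                  if euclidean(features(s), q_feat) <= eps]
    # Refine: check the true distance to discard false alarms.
    return [s for s in candidates if euclidean(s, query) <= eps]
```

The refinement step matters because points that are close in the reduced feature space may still be far apart in the original space.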

Some pictures of time series data and DFT • Discrete Fourier Transform (DFT) • Keep the first few (2-3) coefficients • The first few coefficients contain most of the energy of the sequence

Feature space • TS1(0.05,3) • TS2(0.01,12) • ……

Feature space • The distance e < minimum query distance

Subsequence matching • A collection of N sequences, each of possibly different length • A query Q with tolerance e • Find all sequences Si (1 <= i <= N), along with the correct offsets k, such that the subsequence Si[k:k+Len(Q)-1] matches the query sequence: D(Q, Si[k:k+Len(Q)-1]) <= e
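To make the problem statement concrete, here is a brute-force sequential scan that applies the definition directly. It is a baseline sketch, not the indexed method from the paper; the function name `subsequence_matches`, 0-based offsets, and Euclidean distance for D are assumptions of the example.

```python
import math

def subsequence_matches(query, sequences, eps):
    """Return (i, k) pairs where sequences[i][k:k+len(query)] is within eps of query.

    Assumptions: offsets k are 0-based and D is Euclidean distance.
    """
    m = len(query)
    hits = []
    for i, s in enumerate(sequences):
        # Try every offset at which a window of length m fits inside s.
        for k in range(len(s) - m + 1):
            window = s[k:k + m]
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(query, window)))
            if d <= eps:
                hits.append((i, k))
    return hits
```

This scan touches every offset of every sequence, which is the cost an index such as the ST-index is meant to avoid.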

ST-index • Assume a minimum query length w • Slide a window of size w over each data sequence, placing it at every possible offset • Extract the features of the window at each offset and map them as a point into the feature space

Figure • Sliding window on sequence S from offset 0 to Len(S)-w, giving Len(S)-w+1 windows • The length of the window is w
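The sliding-window step can be sketched as below. This is a minimal illustration under stated assumptions: `trail` is a hypothetical name, offsets are 0-based, and `features` is any function that maps a window of length w to a feature point (for example, the first few DFT coefficients).

```python
def trail(seq, w, features):
    """Map every length-w window of seq to a feature point.

    Returns the points in offset order; a sequence of length L yields
    L - w + 1 points (none if the sequence is shorter than w).
    """
    return [features(seq[k:k + w]) for k in range(len(seq) - w + 1)]
```

Because consecutive windows share w-1 values, consecutive points tend to lie close together in the feature space.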

Result • The series of points in the feature space forms a curve (a trail) • Indexed with an R-tree

MBRs • Storing individual points in the R-tree is inefficient • Divide each trail into sub-trails using minimum bounding rectangles (MBRs)

MBRs in R-tree • Combine small MBRs • Get the index information

How to group points into MBRs • A fixed number of points per MBR • A variable number of points per MBR

I-adaptive method • A greedy algorithm • Number of disk accesses • Cost function • Average cost function

Algorithm • Assign the first point of the trail to a sub-trail • For each successive point • If it increases the average cost of the current sub-trail • Then start another sub-trail • Else include this point in the current sub-trail
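The greedy loop above can be sketched as follows. The cost function is a stand-in chosen for illustration, not taken from the slides: it estimates the cost of an MBR as the product of its side lengths, each padded by a constant `q`, divided by the number of points it holds. The names `mbr_cost`, `split_trail`, and the default `q=0.5` are all assumptions.

```python
def mbr_cost(points, q=0.5):
    """Average cost per point of the MBR enclosing points.

    Assumption: cost is modeled as the product of (side length + q)
    over all dimensions, divided by the number of points.
    """
    cost = 1.0
    for d in range(len(points[0])):
        vals = [p[d] for p in points]
        cost *= (max(vals) - min(vals)) + q
    return cost / len(points)

def split_trail(points, q=0.5):
    """Greedily divide a trail (ordered list of points) into sub-trails."""
    if not points:
        return []
    sub_trails = [[points[0]]]
    for p in points[1:]:
        current = sub_trails[-1]
        # Start a new sub-trail if adding p raises the average cost.
        if mbr_cost(current + [p], q) > mbr_cost(current, q):
            sub_trails.append([p])
        else:
            current.append(p)
    return sub_trails
```

For example, a trail with three nearby points followed by a jump to two distant points is split at the jump, because enclosing the distant point would enlarge the MBR far more than the extra point lowers the per-point cost.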

Skyline index for time series data • “Skyline index for time series data”Quanzhong Li, Ines Fernando Vega Lopez, Bongki Moon

Adaptive Piecewise Constant Approximation (APCA) • What is APCA?

Adaptive Piecewise Constant Approximation (APCA) • Limitation of APCA • Internal overlap in MBRs

Skyline Bounding Region (SBR) • SBR • N time series data objects of length l • Specify 2-dimensional regions by top and bottom skylines
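The top and bottom skylines can be sketched as pointwise maxima and minima over the grouped series. This is an illustrative reading of the slide, with the function name `skyline_bounding_region` being hypothetical and all series assumed to have the same length.

```python
def skyline_bounding_region(series):
    """Return (top, bottom) skylines for a group of equal-length series.

    top[t] is the largest value any series takes at time t and
    bottom[t] the smallest, so every series lies between the two.
    """
    columns = list(zip(*series))  # values of all series at each time step
    top = [max(col) for col in columns]
    bottom = [min(col) for col in columns]
    return top, bottom
```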

Approximate SBR (ASBR) • Many approaches • Equal-length constant-valued segments • Variable-length constant-valued segments • The ASBR covers the original SBR
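The equal-length variant can be sketched as below, assuming the top and bottom skylines are already available as lists. The function name `approximate_sbr` and the tuple layout `(start, end, high, low)` are hypothetical choices for the example.

```python
def approximate_sbr(top, bottom, seg_len):
    """Approximate an SBR with equal-length constant-valued segments.

    Each segment covers time steps start..end-1 and stores the maximum
    of the top skyline and the minimum of the bottom skyline over that
    range, so the approximation always covers the original region.
    """
    segments = []
    for start in range(0, len(top), seg_len):
        end = min(start + seg_len, len(top))  # last segment may be shorter
        segments.append((start, end,
                         max(top[start:end]),
                         min(bottom[start:end])))
    return segments
```

Taking the maximum for the upper bound and the minimum for the lower bound is what guarantees coverage; the price is some extra area around the original SBR.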

Indexing approximate SBRs • R-tree based Skyline index • Internal node • Approximate SBR • Pointer to child node • Leaf node • Pointer to time series data

The End Thank You
