Indexing of Time Series by Major Minima and Maxima. Eugene Fink, Kevin B. Pratt, Harith S. Gandhi
Time series A time series is a sequence of real values measured at equal intervals. Example: 0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0 (plotted on a vertical axis from 0 to 4).
Results • Compression of a time series by extracting its major minima and maxima • Indexing of compressed time series • Retrieval of series similar to a given pattern • Experiments with stock and weather series
Outline • Compression • Indexing • Retrieval • Experiments
Compression We select major minima and maxima, along with the start point and end point, and discard the other points. We use a positive parameter R to control the compression rate.
Major minima A point a[m] in a[1..n] is a major minimum if there are i and j, where i < m < j, such that: • a[m] is a minimum among a[i..j], and • a[i] – a[m] ≥ R and a[j] – a[m] ≥ R.
Major maxima A point a[m] in a[1..n] is a major maximum if there are i and j, where i < m < j, such that: • a[m] is a maximum among a[i..j], and • a[m] – a[i] ≥ R and a[m] – a[j] ≥ R.
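The two definitions can be checked directly on the example series. The sketch below is a brute-force test of the definitions, not the compression procedure from the talk; it assumes the comparison with R is "greater than or equal", uses 0-based indices rather than a[1..n], and the function names are illustrative.

```python
def is_major(a, m, R, sign):
    """Test a[m] against the definition; sign=+1 for a minimum, -1 for a maximum."""
    def reaches(step):
        # Walk away from m while a[m] remains the extreme value of the interval.
        k = m + step
        while 0 <= k < len(a) and sign * (a[k] - a[m]) >= 0:
            if sign * (a[k] - a[m]) >= R:
                return True  # found an endpoint that differs from a[m] by at least R
            k += step
        return False
    # Both a left endpoint a[i] and a right endpoint a[j] must exist.
    return reaches(-1) and reaches(+1)


def major_minima(a, R):
    return [m for m in range(len(a)) if is_major(a, m, R, +1)]


def major_maxima(a, R):
    return [m for m in range(len(a)) if is_major(a, m, R, -1)]


series = [0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0]
print(major_minima(series, 2))  # [4, 8]
print(major_maxima(series, 2))  # [1, 7, 11]
```

With R = 2, the small dips and bumps (for example the value 2 at index 3) are discarded, and only the swings of at least 2 remain.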
Compression procedure The procedure performs one pass through a given series. It takes linear time and constant memory. It can compress a live series without storing it in memory.
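A one-pass compressor with these properties could look roughly like the generator below. It is a sketch under assumptions, not the authors' procedure: it uses a "greater than or equal to R" test, keeps only the first point of a flat extremum, and the name `compress` is illustrative. Because it is a generator, it holds only a few points at a time and can consume a live stream.

```python
def compress(series, R):
    """Yield (index, value) for the start point, major extrema, and end point."""
    points = enumerate(series)
    try:
        first = next(points)
    except StopIteration:
        return  # empty series: nothing to keep
    yield first
    lo = hi = last = first  # running minimum, running maximum, latest point
    trend = 0               # 0: undecided, +1: rising (seeking a maximum), -1: falling
    for point in points:
        last = point
        value = point[1]
        if trend == 0:
            if value < lo[1]:
                lo = point
            if value > hi[1]:
                hi = point
            if value - lo[1] >= R:      # first rise of at least R
                trend, hi = +1, point
            elif hi[1] - value >= R:    # first fall of at least R
                trend, lo = -1, point
        elif trend > 0:
            if value > hi[1]:
                hi = point
            elif hi[1] - value >= R:    # fell by R from the peak: peak is major
                yield hi
                trend, lo = -1, point
        else:
            if value < lo[1]:
                lo = point
            elif value - lo[1] >= R:    # rose by R from the trough: trough is major
                yield lo
                trend, hi = +1, point
    if last[0] != first[0]:
        yield last  # the end point is always kept


series = [0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0]
print(list(compress(series, 2)))
# [(0, 0), (1, 3), (4, 0), (7, 3), (8, 0), (11, 4), (14, 0)]
```

On the example series with R = 2, the 15 points shrink to 7. A larger R keeps fewer points.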
Indexing of series We index series in a database by their major inclines, which are upward and downward segments of the series.
Major inclines A segment a[i..j] is a major upward incline if • a[i] is a major minimum; • a[j] is a major maximum; • for every m strictly between i and j, a[i] < a[m] < a[j]. The definition of a major downward incline is symmetric.
Identification of inclines The procedure performs two passes through a list of major minima and maxima. Its time is linear in the number of inclines.
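For illustration, inclines can be found directly from the definition by scanning the list of major extrema. The sketch below is a simple quadratic-time check, not the two-pass procedure from the talk; it assumes that testing only the major extrema between the endpoints is enough, and the tuple layout `(index, value, kind)` is a hypothetical choice.

```python
def major_inclines(extrema, upward=True):
    """Return (i, j) index pairs of major inclines.

    extrema: list of (index, value, kind) in time order, kind is "min" or "max".
    """
    sign = 1 if upward else -1          # negate values to reuse the logic downward
    start_kind = "min" if upward else "max"
    inclines = []
    for p, (i, vi, kind) in enumerate(extrema):
        if kind != start_kind:
            continue
        best = sign * vi                # highest (signed) value seen after i
        for j, vj, kind_j in extrema[p + 1:]:
            if sign * vj <= sign * vi:
                break                   # the series returned to the start level
            if kind_j != start_kind and sign * vj > best:
                inclines.append((i, j)) # a[j] exceeds every extremum in between
            best = max(best, sign * vj)
    return inclines


# Major extrema of the example series for R = 2 (0-based indices).
extrema = [(1, 3, "max"), (4, 0, "min"), (7, 3, "max"), (8, 0, "min"), (11, 4, "max")]
print(major_inclines(extrema, upward=True))   # [(4, 7), (8, 11)]
print(major_inclines(extrema, upward=False))  # [(1, 4), (7, 8)]
```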
Indexing of inclines We index major inclines of series in a database by their lengths and heights. We use a range tree, which supports indexing of points by two coordinates.
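To make the two-coordinate indexing concrete, the sketch below stores each incline as a (length, height) point and answers rectangular queries. It is a deliberately simple stand-in, a sorted list searched with `bisect` and then filtered by height, not a range tree; the class name, the `series_id` and `start` fields, and the sample entries are assumptions.

```python
from bisect import bisect_left, bisect_right, insort


class InclineIndex:
    """Stand-in for a range tree: entries sorted by length, then filtered by height."""

    def __init__(self):
        self._entries = []  # sorted list of (length, height, series_id, start)

    def add(self, series_id, start, length, height):
        insort(self._entries, (length, height, series_id, start))

    def query(self, min_len, max_len, min_height, max_height):
        # Binary search narrows the candidates to the requested length range.
        lo = bisect_left(self._entries, (min_len,))
        hi = bisect_right(self._entries, (max_len, float("inf")))
        # A linear filter on height replaces the second level of a range tree.
        return [e for e in self._entries[lo:hi] if min_height <= e[1] <= max_height]


index = InclineIndex()
index.add("s1", start=4, length=3, height=3)  # upward incline from index 4 to 7
index.add("s1", start=8, length=3, height=4)  # upward incline from index 8 to 11
print(index.query(2, 4, 2, 3.5))  # [(3, 3, 's1', 4)]
```

A real range tree would answer the same query without scanning every entry in the length range, which matters for large databases.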
Retrieval The procedure inputs a pattern series and searches for similar segments in a database. [Figure: an example database and a pattern.] Main steps: • Find the pattern’s inclines with the greatest height • Retrieve all segments that have similar inclines • Compare each of these segments with the pattern
Highest inclines First, the retrieval procedure identifies the major inclines in the pattern and selects the highest inclines.
Candidate segments Second, the procedure retrieves segments with similar inclines from the database. An incline is considered similar if • its height is between height / C and height · C; • its length is between length / D and length · D. We use the range tree to retrieve similar inclines.
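The similarity condition amounts to a rectangular range in the (length, height) plane. The sketch below computes that range and filters a plain list of inclines; it assumes C and D are greater than 1, treats "between" as inclusive, and uses illustrative function names and sample values.

```python
def similarity_bounds(height, length, C, D):
    """Return ((min_height, max_height), (min_length, max_length))."""
    return (height / C, height * C), (length / D, length * D)


def similar_inclines(inclines, height, length, C, D):
    """Keep the (length, height) pairs that fall inside the similarity range."""
    (h_lo, h_hi), (l_lo, l_hi) = similarity_bounds(height, length, C, D)
    return [(l, h) for l, h in inclines if h_lo <= h <= h_hi and l_lo <= l <= l_hi]


candidates = [(3, 4), (10, 3), (2, 1)]  # (length, height) pairs
print(similarity_bounds(height=3, length=3, C=2, D=2))  # ((1.5, 6), (1.5, 6))
print(similar_inclines(candidates, height=3, length=3, C=2, D=2))  # [(3, 4)]
```

Larger C and D widen the rectangle, so more candidates are retrieved and the search takes longer, which matches the trade-off reported in the experiments.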
Similarity test Third, the procedure compares the retrieved segments with the pattern, using a given similarity test.
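Since the similarity test is supplied from outside, the final step can be sketched as ranking candidates with a caller-provided function. The measure below (based on the mean absolute difference) is purely hypothetical and stands in for whatever test is actually given; the function names are illustrative too.

```python
def rank_candidates(pattern, candidates, similarity):
    """Score each candidate segment against the pattern, best match first."""
    scored = [(similarity(pattern, seg), seg) for seg in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def example_similarity(pattern, segment):
    # Hypothetical measure: 1 for identical series, approaching 0 as they diverge.
    diff = sum(abs(x - y) for x, y in zip(pattern, segment)) / len(pattern)
    return 1.0 / (1.0 + diff)


pattern = [0, 3, 0]
candidates = [[0, 1, 0], [0, 3, 0], [0, 3, 1]]
ranked = rank_candidates(pattern, candidates, example_similarity)
print([seg for _, seg in ranked])  # [[0, 3, 0], [0, 3, 1], [0, 1, 0]]
```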
Experiments We have tested a Visual Basic implementation on a 2.4-GHz Pentium computer. Data sets: • Stock prices: 98 series, 60,000 points • Air and sea temperatures: 136 series, 450,000 points
Stock prices (60,000 points) Search for 100-point patterns The x-axes show the ranks of matches retrieved by the developed procedure (fast ranking), and the y-axes are the ranks assigned by a slow exhaustive search (perfect ranking). [Plots: fast ranking vs. perfect ranking] • C = D = 5: time 0.05 sec • C = D = 2: time 0.02 sec • C = D = 1.5: time 0.01 sec
Stock prices (60,000 points) Search for 500-point patterns The x-axes show the ranks of matches retrieved by the developed procedure (fast ranking), and the y-axes are the ranks assigned by a slow exhaustive search (perfect ranking). [Plots: fast ranking vs. perfect ranking] • C = D = 5: time 0.31 sec • C = D = 2: time 0.12 sec • C = D = 1.5: time 0.09 sec
Temperatures (450,000 points) Search for 200-point patterns The x-axes show the ranks of matches retrieved by the developed procedure (fast ranking), and the y-axes are the ranks assigned by a slow exhaustive search (perfect ranking). [Plots: fast ranking vs. perfect ranking] • C = D = 5: time 1.18 sec • C = D = 2: time 0.27 sec • C = D = 1.5: time 0.14 sec
Conclusions Main results: Compression and indexing of time series by major minima and maxima. Current work: Hierarchical indexing by importance levels of minima and maxima.