Fast Time Series Classification Using Numerosity Reduction
Xiaopeng Xi, Eamonn Keogh, Christian Shelton, Li Wei, Chotirat Ann Ratanamahatana
xxi@cs.ucr.edu
Department of Computer Science and Engineering, University of California, Riverside
June 28, 2006. ICML 2006, CMU, Pittsburgh
Overview
• Background
• Motivation
• Naïve rank reduction
• Adaptive warping window in DTW
• Experimental results
• Conclusions
Time Series
[Figure: three example time series (A, B, C): an ECG heartbeat, a stock price series, and the outline of a flat-tailed horned lizard.]
Time Series Classification
• Applications: insect species, heartbeats, etc.
• 1-Nearest Neighbor classifier
• Distance measures: Euclidean, LCSS, DTW, ...
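For orientation, a minimal sketch of 1-NN classification with a pluggable distance measure; the function names are illustrative, not from the paper:

```python
import numpy as np

def euclidean(q, c):
    """Euclidean distance between two equal-length sequences."""
    return np.sqrt(np.sum((np.asarray(q) - np.asarray(c)) ** 2))

def nn_classify(query, train_X, train_y, dist=euclidean):
    """Label the query with the class of its nearest training instance."""
    best = min(range(len(train_X)), key=lambda i: dist(query, train_X[i]))
    return train_y[best]
```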
Dynamic Time Warping
[Figure: Euclidean alignment vs. DTW alignment of two sequences; the DTW warping path w is constrained to lie inside a Sakoe-Chiba band of width r, the warping window.]
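A minimal sketch of DTW under a Sakoe-Chiba band, assuming equal-length sequences and the squared point cost commonly paired with LB_Keogh; the signature is illustrative:

```python
import numpy as np

def dtw_distance(q, c, r):
    """DTW between equal-length 1-D sequences q and c, with the warping
    path constrained to a Sakoe-Chiba band of half-width r (in samples)."""
    n, m = len(q), len(c)
    D = np.full((n + 1, m + 1), np.inf)  # cumulative cost matrix
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        # only cells with |i - j| <= r are reachable inside the band
        for j in range(max(1, i - r), min(m, i + r) + 1):
            cost = (q[i - 1] - c[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])
```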
[Figure: alignment of a flat-tailed horned lizard series and a Texas horned lizard series by Dynamic Time Warping.]
Observation I
• 1-Nearest Neighbor with DTW distance is hard to beat in time series classification
[Figure: comparison of classification error rates (%) on Control Chart, 0 to 8%, for 1-NN DTW, a multiple-classifier system, first-order logic rules, multi-scale histograms, a perceptron neural network, and a super-kernel fusion scheme.]
Observation I
• 1-NN DTW achieves high accuracy but is slow
• 1-NN needs O(N^2) distance computations, where N is the dataset size
• DTW itself is computationally expensive
• Can we speed up 1-NN DTW?
Observation II
• As the dataset size decreases, a larger warping window achieves higher accuracy (motivating numerosity reduction)
• The accuracy peaks very early, at a small warping window (motivating accelerated DTW computation)
[Figure: accuracy (%) vs. warping window r (%) on Gun-Point, for 100, 50, 24, 12, and 6 instances; the smaller the dataset, the larger the window needed to peak.]
DTW gives better accuracy than Euclidean distance, and the accuracy peaks very early, at a small warping window.
Speed Up 1-NN DTW
• Numerosity reduction (data editing)
• Heuristic search order: prune the worst exemplar first (Naive Rank)
• Dynamically adjust the warping window
Naïve Rank Reduction
• Assign a rank to each instance
• Prune the instance with the lowest rank
• Reassign ranks to the remaining instances
• Repeat the above steps until the stopping criterion is met
Naïve Rank Reduction
[Figure: instance P (class +) is the nearest neighbor of three same-class (+) instances and one different-class (-) instance.]
• Rank(P) = 1 + 1 + 1 - 2 = 1: each same-class instance that has P as its nearest neighbor contributes +1, each different-class instance contributes -2
• Ties in rank are broken by a secondary criterion
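A sketch of the reduction loop under the scoring rule above (+1 per same-class instance that has x as nearest neighbor, -2 per different-class instance); the helper name is mine, and ties in rank simply fall to the lowest index rather than the slide's tie-breaker:

```python
def naive_rank_prune(X, y, dist, keep):
    """Naive Rank reduction: repeatedly drop the instance with the lowest
    rank until `keep` instances remain. Rank(x) sums +1 for each same-class
    instance whose nearest neighbor is x, and -2 for each different-class
    instance (as in the Rank(P) = 1+1+1-2 example above)."""
    idx = list(range(len(X)))
    while len(idx) > keep:
        rank = {i: 0 for i in idx}
        for i in idx:
            # nearest neighbor of instance i among the other survivors
            nn = min((j for j in idx if j != i), key=lambda j: dist(X[i], X[j]))
            rank[nn] += 1 if y[i] == y[nn] else -2
        idx.remove(min(idx, key=rank.get))  # prune the worst-ranked instance
    return idx  # indices of the retained instances
```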
Adaptive Warping Window
• Basic idea: adjust the warping window dynamically during numerosity reduction
• Prune instances one at a time, increasing the warping band by 1% whenever necessary (a code sketch follows the trace below)
[Figure: trace of the adaptive search as instances are pruned; the warping window searched grows (2%, 2%, 3%, 4%, 5%, ...) and the accuracy under window search (97%, 99%, 99%, 99%, 99%, 92%, 98%, 98%) is compared against the accuracy under data pruning alone (97%, 98%, 97%, 96%, 95%, 96%, 94%, 96%).]
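A sketch of the adaptive loop, reusing dtw_distance and naive_rank_prune from the earlier sketches; loo_accuracy and the 1%-step policy follow the slide, but everything here is recomputed from scratch, which is exactly the cost the caching scheme on the next slide removes:

```python
def loo_accuracy(X, y, r):
    """Leave-one-out 1-NN accuracy with DTW at window r (% of series length)."""
    w = max(1, r * len(X[0]) // 100)
    hits = 0
    for i in range(len(X)):
        nn = min((j for j in range(len(X)) if j != i),
                 key=lambda j: dtw_distance(X[i], X[j], w))
        hits += (y[i] == y[nn])
    return hits / len(X)

def adaptive_prune(X, y, keep, r=1):
    """Interleave pruning and window search: drop one instance at a time,
    then widen the band by 1% for as long as that improves LOO accuracy."""
    X, y = list(X), list(y)
    while len(X) > keep:
        w = max(1, r * len(X[0]) // 100)
        kept = naive_rank_prune(X, y, lambda a, b: dtw_distance(a, b, w),
                                len(X) - 1)        # drop the worst instance
        X, y = [X[i] for i in kept], [y[i] for i in kept]
        while r < 100 and loo_accuracy(X, y, r + 1) > loo_accuracy(X, y, r):
            r += 1                                 # increase the band by 1%
    return X, y, r
```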
Speeding Up DTW Classification
• LB_Keogh lower bounding, amortized cost O(n)
• Store the DTW distance matrix and nearest-neighbor matrix, updating them dynamically
• Compute accuracy by looking up the matrices
• 4 to 5 orders of magnitude faster
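A sketch of the LB_Keogh bound, assuming equal-length sequences: a candidate can be skipped without running full DTW whenever the bound already exceeds the best-so-far distance. The envelope here is recomputed per point, making this version O(nr); a sliding-window min/max brings it to the amortized O(n) cited above:

```python
def lb_keogh(q, c, r):
    """LB_Keogh lower bound on DTW(q, c) for band half-width r: points of q
    falling outside the min/max envelope of c contribute their squared
    distance to the envelope; points inside contribute nothing."""
    n = len(c)
    total = 0.0
    for i, qi in enumerate(q):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        u, l = max(c[lo:hi]), min(c[lo:hi])  # envelope of c around position i
        if qi > u:
            total += (qi - u) ** 2
        elif qi < l:
            total += (qi - l) ** 2
    return total ** 0.5
```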
Experimental Results
[Figure: accuracy (%) vs. number of training instances on Two Patterns (training set), comparing 1-NN Euclidean, random pruning with Euclidean, random pruning with fixed-window DTW, NaiveRank with fixed-window DTW, and NaiveRank with adaptive-window DTW; the adaptive window sizes found range from about 4% to 14%.]
Experimental Results
[Figure: accuracy (%) vs. number of training instances on the Two Patterns and Swedish Leaf test sets, comparing 1-NN Euclidean, RT1, RT2, RT3, and 1-NN DTW.]
RT algorithms are introduced in Wilson, D. R., & Martinez, T. R. (1997). Instance Pruning Techniques. ICML '97.
Conclusions
• 1-NN DTW is very competitive in time series classification
• We present novel observations on the relationship between warping window size and dataset size
• We produce an extremely fast, accurate classifier
[Figure: computational cost of brute force, LB_Keogh, and fast DTW on Two Patterns (1,000 training / 4,000 test instances), on scales of x10^8 and x10^5.]