Space Shuttle Engine Valve Anomaly Detection by Data Compression Matt Mahoney
Outline • Problem Statement • Related Work • Anomaly Detection by Data Compression • Future Work
Problem: How to Detect Anomalies in Space Shuttle Valves • [Figure: solenoid current traces, normal and abnormal]
Current Method • Identify features (zero crossings, peaks…) • Specify correct behavior using SCL rules
Goal • Reduce the human workload in specifying “normal” behavior of time-series data • Rule output should be in Space Command Language (SCL, an expert system language) to allow manual adjustments • Anomaly detection must be real time (1K-10K samples per second)
Related Work • Automated waveform segmentation (Gecko, Stan Salvador) • Segment characteristics (level, slope, curvature) identify states • Rules are specified as allowed state transitions • Problem: segmentation is slow
Proposal: Modeling using Data Compression • Train model on “normal” time series • Test by measuring goodness of fit to the trained model
Cross Entropy • Measures fitness of a model M relative to a true (but unknown) probability distribution, P • Minimized when M = P • Estimated by a data compressor that uses M • $H_M(P) = \sum_{x \in X} -P(x) \log M(x)$ • $H_M(P)$ = Cross entropy (compressed data size) • X = set of all possible inputs (waveforms) • P(x) = true probability of x • M(x) = estimated probability by model M
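A small numeric sketch may help make the "minimized when M = P" bullet concrete. The three-symbol distributions below are made-up illustrations, not valve data, and the base-2 logarithm (giving bits) is an assumption, since the slide does not state a base.

```python
import math

def cross_entropy(p, m):
    """Cross entropy H_M(P) = sum over x of -P(x) * log2 M(x), in bits."""
    return sum(-px * math.log2(m[x]) for x, px in p.items() if px > 0)

# Hypothetical "true" distribution P over three symbols.
p = {"a": 0.5, "b": 0.25, "c": 0.25}

# A model that matches P exactly, and one that misassigns probability.
m_matched = dict(p)
m_mismatched = {"a": 0.25, "b": 0.25, "c": 0.5}

h_matched = cross_entropy(p, m_matched)        # equals the entropy of P
h_mismatched = cross_entropy(p, m_mismatched)  # larger: the model fits worse
```

The mismatched model yields a larger value, which corresponds to a larger compressed size when a compressor codes with that model.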
Measuring Cross Entropy • [Figure: normal and abnormal traces, uncompressed and compressed; labels: Normal 1, Normal 2, Normal 1 or 2, Abnormal]
Anomaly Score Score(y) = (C(xy) – C(x)) / C(y) • x = Training (normal) waveform • y = Test (possibly abnormal) waveform • xy = Concatenation of x and y • C(.) = Size after compression • A higher score (worse compression after training) indicates an anomaly
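The score can be sketched in a few lines. This is a minimal illustration, assuming Python's `zlib` (a DEFLATE implementation, the same family as GZIP) as a stand-in compressor; the waveforms are synthetic placeholders, not the TEK traces.

```python
import math
import random
import zlib

def compressed_size(data: bytes) -> int:
    """C(.): size in bytes after compression (zlib level 9, an assumption)."""
    return len(zlib.compress(data, 9))

def anomaly_score(train: bytes, test: bytes) -> float:
    """Score(y) = (C(xy) - C(x)) / C(y), with x = train and y = test."""
    extra = compressed_size(train + test) - compressed_size(train)
    return extra / compressed_size(test)

# Synthetic "normal" trace: one 50-sample cycle repeated 20 times (1000 bytes).
cycle = bytes(int(127 + 100 * math.sin(2 * math.pi * i / 50)) for i in range(50))
normal = cycle * 20

# Synthetic "abnormal" trace: seeded pseudo-random bytes of the same length.
rng = random.Random(0)
abnormal = bytes(rng.randrange(256) for _ in range(1000))

score_normal = anomaly_score(normal, normal)
score_abnormal = anomaly_score(normal, abnormal)
```

With these placeholders the abnormal trace scores higher, because training on the normal trace does nothing to help the compressor predict it.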
Data Compressors • GZIP (Gailly): LZ77, duplicate strings are replaced by pointers to the previous occurrence • PAQ3 (Mahoney): weighted context mixing; arithmetic coding of next-bit probability • RK 1.04 (Taylor): PPMZ (models longest matching context); delta coding option for analog data
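To illustrate why delta coding can help with analog data, here is a generic first-order delta coder over 8-bit samples. It is only a sketch of the general idea; how RK implements its delta option is not described in the slides, and the function names are hypothetical.

```python
def delta_encode(data: bytes) -> bytes:
    """Replace each sample with its difference from the previous one (mod 256)."""
    prev = 0
    out = bytearray()
    for sample in data:
        out.append((sample - prev) % 256)
        prev = sample
    return bytes(out)

def delta_decode(deltas: bytes) -> bytes:
    """Invert delta_encode by accumulating the differences (mod 256)."""
    prev = 0
    out = bytearray()
    for d in deltas:
        prev = (prev + d) % 256
        out.append(prev)
    return bytes(out)

# A slowly rising ramp: every value is distinct, but every difference is 1.
ramp = bytes(range(100, 200))
deltas = delta_encode(ramp)
restored = delta_decode(deltas)
```

A smooth signal turns into a run of small, repetitive differences, which a general-purpose compressor can model more easily than the raw sample values.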
Data • TEK 0, TEK 1 = Normal on/off cycle of Marotta valve S/N 37898 • TEK {2, 3, 5, 10, 11, 15, 16, 17} = various forced failures • 1000 solenoid current samples at 1 ms intervals • Range: -3.1 to 7.06 A at 0.04 A resolution • Converted to 1000 8-bit values (1000 byte files)
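The stated range and resolution give (7.06 - (-3.1)) / 0.04 = 254 steps, which fits in one byte. The slides do not say how the conversion was done; the offset-and-round mapping below is an assumption for illustration, and `quantize` is a hypothetical helper.

```python
I_MIN = -3.1       # amperes, lower end of the range given on the slide
RESOLUTION = 0.04  # amperes per quantization step, from the slide

def quantize(samples_amps):
    """Map current samples in amperes to 8-bit codes (assumed linear mapping)."""
    codes = []
    for amps in samples_amps:
        code = round((amps - I_MIN) / RESOLUTION)
        codes.append(min(255, max(0, code)))  # clamp into one byte
    return bytes(codes)

# The two ends of the stated range map to the lowest and highest codes used.
encoded = quantize([-3.1, 7.06])
```

Under this mapping a 1000-sample trace becomes a 1000-byte file, matching the file size quoted on the slide.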
Experimental Procedure • Nor 0: Train on TEK 0, test on TEK 1 (normal) • Nor 1: Train on TEK 1, test on TEK 0 (normal) • Ab 0: Train on TEK 0, average of tests on 8 abnormal traces • Ab 1: Train on TEK 1, average of tests on 8 abnormal traces
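The four conditions can be organized as a small harness. This is a sketch under stated assumptions: `zlib` stands in for the compressors used in the study, `run_experiment` is a hypothetical helper, and the traces are synthetic placeholders rather than the TEK files.

```python
import math
import random
import zlib
from statistics import mean

def csize(data: bytes) -> int:
    """Compressed size in bytes (zlib level 9 as a stand-in compressor)."""
    return len(zlib.compress(data, 9))

def score(train: bytes, test: bytes) -> float:
    """Anomaly score (C(xy) - C(x)) / C(y)."""
    return (csize(train + test) - csize(train)) / csize(test)

def run_experiment(normals, abnormals):
    """Return the four conditions: Nor 0, Nor 1, Ab 0, Ab 1."""
    n0, n1 = normals
    return {
        "Nor 0": score(n0, n1),                         # train on 0, test on 1
        "Nor 1": score(n1, n0),                         # train on 1, test on 0
        "Ab 0": mean(score(n0, a) for a in abnormals),  # train on 0, average
        "Ab 1": mean(score(n1, a) for a in abnormals),  # train on 1, average
    }

# Synthetic placeholders: two phase-shifted periodic "normal" traces and
# eight seeded pseudo-random "abnormal" traces, 1000 bytes each.
cycle = bytes(int(127 + 100 * math.sin(2 * math.pi * i / 50)) for i in range(50))
normals = (cycle * 20, (cycle[7:] + cycle[:7]) * 20)
abnormals = [bytes(random.Random(seed).randrange(256) for _ in range(1000))
             for seed in range(8)]

results = run_experiment(normals, abnormals)
```

An anomaly would be flagged when a test score clearly exceeds the scores seen for normal traces under the same training set.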
Run Time Performance (750 MHz PC) • Real time = 1K samples/sec • GZIP: 3000K samples/sec • PAQ3: 40K samples/sec • RK -mx3 -fd1: 78K samples/sec
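Throughput of this kind can be estimated with a simple timing loop. The sketch below assumes one byte per sample (as in the 8-bit data described earlier) and uses `zlib` as a stand-in compressor; the numbers it produces depend on the machine and will not match the figures quoted for the 750 MHz PC.

```python
import time
import zlib

REAL_TIME_RATE = 1_000  # samples per second, the real-time rate on the slide

def samples_per_second(data: bytes, repeats: int = 20) -> float:
    """Estimate compression throughput, counting one byte as one sample."""
    start = time.perf_counter()
    for _ in range(repeats):
        zlib.compress(data, 9)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    return len(data) * repeats / elapsed

# Placeholder 1000-byte trace; real measurements would use recorded data.
trace = bytes(i % 256 for i in range(1000))
rate = samples_per_second(trace)
meets_real_time = rate >= REAL_TIME_RATE
```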
Summary • Data compression detects anomalies in the TEK valve data (2 normal, 8 abnormal traces) • GZIP and PAQ3 detect anomalies in 8 of 8 cases using either training set • RK detects 7 of 8 anomalies using either training set (TEK 15 appears more “normal” to all 3 compressors)
Future Work • Verify with more data sets (voltage, temperature, plunger blockage) • Identify anomalous points within the trace • Improve modeling of analog data • Translate models to SCL • Work is preliminary. Much needs to be done.
Thank You • For more information, http://cs.fit.edu/~mmahoney/nasa/