Bharath K Devaraju. TimeSeries Toolkit. Important Disclaimer. THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.
Bharath K Devaraju TimeSeries Toolkit
Important Disclaimer THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF: • CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR • ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE. The information on the new product is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information on the new product is for informational purposes only and may not be incorporated into any contract. The information on the new product is not a commitment, promise, or legal obligation to deliver any material, code or functionality. The development, release, and timing of any features or functionality described for our products remains at our sole discretion.
Agenda
• Introduction to the TimeSeries toolkit
• Time series terminology explained
• TimeSeries operators and their applications
• Demo applications (anomaly detection and forecasting)
TimeSeries toolkit
The TimeSeries toolkit consists of modeling and analytic operators that can be used to understand and analyze time series data.
• The toolkit has varied applications: it can be applied in areas such as signal processing, large systems monitoring, and predictive analytics.
Time series terminologies
• Univariate time series: represents the evolution of a single numerical value over time. Example: daily temperature in New York. Representation: tuple<float64>, tuple<timestamp, float64>
• Vector time series: represents the evolution over time of a collection of numerical values sharing the same timestamp. Example: daily temperature and humidity level in New York. Representation: tuple<list<float64>>, tuple<timestamp, list<float64>>
• Univariate operator: a univariate operator processes a vector time series as a set of independent time series. Each index in the list is treated as a single time series. The effect is similar to processing multiple single time series in parallel, each with its own operator. The values sharing the same index in the input list<float64> have their output located at the same index in the output list.
• Multivariate operator: a multivariate operator treats the values of a vector time series as a single entity. The values in the input list<float64> are processed as one object (a vector) and transformed into an output. The output can be a list or a single value, depending on the underlying algorithm. Examples of multivariate operators include the Kalman filter, VAR, DWT and FFT.
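The difference between the two operator kinds can be sketched in a few lines of Python. This is only an illustration of the terminology, not toolkit code: `apply_univariate`, `make_running_mean` and `vector_norm` are hypothetical names chosen for this example.

```python
import math

def make_running_mean():
    """Create a stateful per-series operator (hypothetical example operator)."""
    state = {"count": 0, "total": 0.0}

    def step(value):
        state["count"] += 1
        state["total"] += value
        return state["total"] / state["count"]

    return step

def apply_univariate(vectors, make_op):
    """Univariate style: one independent operator instance per list index."""
    ops = [make_op() for _ in vectors[0]]
    # Output index i only ever depends on input index i.
    return [[op(v) for op, v in zip(ops, vec)] for vec in vectors]

def vector_norm(vec):
    """Multivariate style: the whole vector is consumed as one object."""
    return math.sqrt(sum(v * v for v in vec))

# Two series (e.g. temperature and humidity) sharing the same timestamps.
vectors = [[1.0, 10.0], [3.0, 30.0]]
per_series = apply_univariate(vectors, make_running_mean)
per_vector = [vector_norm(vec) for vec in vectors]
```

Here the univariate result keeps one output per input index, while the multivariate function collapses each vector into a single value.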
TimeSeries operators and their applications
• Namespace: com.ibm.streams.timeseries.generators
A set of operators or functions to synthesize time series.
• Namespace: com.ibm.streams.timeseries.preprocessing
A set of primitive operators designed for time series preparation and conditioning.
• Namespace: com.ibm.streams.timeseries.analysis
A set of operators designed to provide underlying information on the time series. These include operators for statistics, correlations, decomposition, and transformation of time series.
• Namespace: com.ibm.streams.timeseries.modeling
A set of operators that internally create a model of the time series and use that model for prediction, regression, or tracking.
TSWindowing
The TSWindowing operator is typically used to isolate a portion of the signal of a specified duration for analysis, by weighting that portion with a weighting function that tapers down to zero at the edges. The windowing functions currently supported are: Hamming, Hann, Blackman, Cosine and Triangle.
References:
- Harris, Fredric J. (Jan 1978). "On the use of Windows for Harmonic Analysis with the Discrete Fourier Transform". Proceedings of the IEEE 66 (1): 51–83. doi:10.1109/PROC.1978.10837. Article on FFT windows which introduced many of the key metrics used to compare windows.
- http://en.wikipedia.org/wiki/Window_function
Illustration: the result (right) of applying the Hann window function to the input time series (left). (Figure: the Hann window function.)
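As a rough illustration of windowing, the sketch below builds a Hann window with the common symmetric definition w[n] = 0.5 (1 - cos(2πn / (N - 1))) and multiplies it into a block of samples. The function names are hypothetical, and the exact window definition used by the operator is not stated in the slides.

```python
import math

def hann_window(length):
    """Symmetric Hann window: zero at both edges, one in the middle."""
    if length < 1:
        raise ValueError("length must be positive")
    if length == 1:
        return [1.0]
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (length - 1)))
            for n in range(length)]

def apply_window(samples):
    """Weight each sample by the window value at the same position."""
    window = hann_window(len(samples))
    return [s * w for s, w in zip(samples, window)]

# A constant block is tapered towards zero at its edges.
tapered = apply_window([2.0, 2.0, 2.0, 2.0, 2.0])
```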
DSPFilter
The Digital Signal Processing (DSP) Filter operator performs digital filtering on the input time series. Digital filtering refers to the process of isolating a portion of the input time series frequencies while rejecting others: for example, retaining only the high frequency components of the time series (high pass), retaining only the low frequency components (low pass), or smoothing (moving average). The filter coefficients are supplied either as parameters or loaded from a file. The DSPFilter is a univariate operator.
Illustration: a moving average of the IBM stock price for the months January 2011 to June 2011 as output of the DSPFilter. The coefficients of the filter were selected to implement a simple moving average.
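How filter coefficients turn into a moving average can be sketched with a plain FIR filter, y[n] = Σ b[k] x[n - k]. This is a generic illustration with hypothetical names, assuming samples before the start of the series count as zero; it does not reproduce the operator's parameters.

```python
def fir_filter(samples, coefficients):
    """Apply an FIR filter; samples before the start are treated as zero."""
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, b in enumerate(coefficients):
            if n - k >= 0:
                acc += b * samples[n - k]
        output.append(acc)
    return output

# Three equal coefficients implement a 3-point simple moving average.
moving_average_coefficients = [1.0 / 3.0] * 3
smoothed = fir_filter([1.0, 2.0, 3.0, 4.0, 5.0], moving_average_coefficients)
```

The first two outputs are affected by the zero start-up state; from the third sample on, each output is the mean of the last three inputs.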
Transformations
• DWT (Discrete Wavelet Transform): The wavelet transform maps the original time series into a space where general trends and fine details of the time series are made more prominent. Various basis functions are available for time series projection. The wavelet transform can be used for time series approximation (trends) or to analyse small details in the signal. The first index of the transform contains the average (trend) of the signal; the higher indices contain detail information on the time series.
• FFT (Fast Fourier Transform): This operator computes the Fast Fourier Transform (FFT) of an incoming time series. The Fourier transform is a mathematical operation that decomposes a signal into its constituent frequencies. The term Fourier transform refers both to the frequency domain representation of the signal and to the process that transforms the signal to its frequency domain representation. The FFT is an efficient algorithm to compute the discrete Fourier transform (DFT) and its inverse.
Illustration: the DWT is used to detect a small glitch
(Figure: three consecutive windows of 32 samples each; panels "Time series glitch" and "Glitch amplified in DWT space".)
The red curves display the wavelet transform using the Daubechies 4 wavelet, where only the last 8 DWT values have been retained and the rest set to zero. The time series is processed as a sequence of windowed data, and the DWT is applied to each window. In this example there are 3 windows of 32 samples, and the glitch is found in the 2nd window.
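The idea of a glitch standing out in the detail coefficients can be sketched with the Haar wavelet, the simplest member of the Daubechies family (two filter taps). This is a swapped-in, simplified technique: the illustration above uses Daubechies 4, and `haar_dwt` is a hypothetical name, not the operator's API.

```python
import math

def haar_dwt(samples):
    """Full Haar decomposition of a power-of-two-length window.

    Index 0 holds the scaled average (trend); higher indices hold details,
    with the finest details in the upper half of the result.
    """
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("window length must be a power of two")
    out = [float(v) for v in samples]
    length = n
    root2 = math.sqrt(2.0)
    while length > 1:
        half = length // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / root2 for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / root2 for i in range(half)]
        out[:length] = approx + detail
        length = half
    return out

flat = haar_dwt([4.0] * 8)          # no glitch: all detail coefficients are zero
window = [4.0] * 8
window[5] = 10.0                    # a single-sample glitch
glitchy = haar_dwt(window)          # the glitch shows up in the fine details
```

Keeping only the last (finest) coefficients, as in the illustration, isolates such short-lived deviations from the slowly varying trend.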
Illustration: an input time series and its corresponding spectrogram
The top plot shows a time series made of two sine waves at 20 Hz and 50 Hz, sampled at 1000 Hz, with added noise. The bottom plot shows the sequence of FFT magnitudes derived by applying a sliding window of 128 samples to the time series. As expected, the FFTs show peaks at 20 Hz and 50 Hz.
Normalize: Time series normalization
Time series normalization is a useful pre-processing technique designed to align the ranges of values of time series. In particular, when applied to a vector time series with heterogeneous values of varying ranges, it generates more homogeneous values, which helps put further analysis on an equal footing.
Illustration (input): two time series are provided to the normalization operator as vector input, the IBM stock price and the stock trading volume. The range of values for the volume is much higher than for the IBM price.
Illustration (output): the normalized values for the IBM stock price and trading volume. The two time series are now within a comparable range.
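A minimal sketch of the effect, assuming zero-mean, unit-variance scaling applied to each series independently. The slides do not say which normalization method the operator uses, so both the method and the `normalize` name are assumptions, and the numbers are made up for illustration.

```python
import statistics

def normalize(series):
    """Scale one series to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0.0:
        return [0.0] * len(series)  # a constant series carries no variation
    return [(value - mean) / std for value in series]

# Made-up values with very different ranges.
price = [150.0, 152.0, 154.0]
volume = [4.0e6, 5.0e6, 6.0e6]
normalized_price = normalize(price)
normalized_volume = normalize(volume)
```

After scaling, both series occupy the same range even though the raw volume is several orders of magnitude larger than the price.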
FunctionEvaluator: FunctionEvaluator estimates the value of a function, given its values at certain points (control points). The control and knot points required to apply the function are specified as the value of the functionSpecification parameter, which is of type map. Here x0 and x1 are control points, and y0 and y1 are knot points. The input time series is x, and the function output y is evaluated. The key represents knot points present in the input time series, whereas the value is a list of control points.
STD: The Seasonal Trend Decomposition operator transforms an input time series into 3 time series representing the season, the trend and the residuals, using the Loess algorithm. Seasonal Trend Decomposition is useful for:
- decomposing a time series into simpler elements
- adjusting for seasonal variation prior to analysis
Illustration: STD extracts the trend and identifies the season from input time series data representing the number of airline passengers per month of an airline company.
Legend: X axis: months; Y axis: passenger counts
NativeFunctions:
• Cross Correlate: Cross correlation is a measure of the similarity of two time series as a function of a time lag applied to one of them. The output reveals the step (lag) at which the two input time series overlap the most.
• Convolution: A convolution is an integral that expresses the amount of overlap of one function as it is shifted over another function.
• Generators:
- Sine wave
- Square wave
- Triangular wave
- Sawtooth wave
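To make the "lag of maximum overlap" concrete, here is a small sketch of unnormalized cross correlation over a range of lags. `cross_correlation` and `best_lag` are hypothetical helper names for this example, not the toolkit's native functions.

```python
def cross_correlation(x, y, lag):
    """Sum of x[i] * y[i + lag] over the indices where both series exist."""
    total = 0.0
    for i in range(len(x)):
        j = i + lag
        if 0 <= j < len(y):
            total += x[i] * y[j]
    return total

def best_lag(x, y, max_lag):
    """Lag in [-max_lag, max_lag] at which the two series overlap the most."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: cross_correlation(x, y, lag))

pulse = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]  # same pulse, 3 steps later
lag = best_lag(pulse, delayed, max_lag=5)
```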
Regression
• GAMLearner, GAMScorer: GAMLearner applies a GAM to categorical and/or continuous time series data. If observed data is given, the operator adapts the model parameters and outputs the filtered data. GAMLearner can refine a given Generalized Additive Model (GAM), specified in a Predictive Model Markup Language (PMML) file, based on the observed data, and can also score the data using the PMML model.
• RLSFilter: This operator fits a linear regression model to a series of pairs of covariates and dependent variables. The fitting is done adaptively using the Recursive Least Squares (RLS) method. The fitted model can be used to estimate the dependent variable for given new covariates.
Illustration: Prediction of housing starts using construction volume (thousands) and interest rate as predictors Scale: x axis: months Y axis: time series
Trackers
• Kalman: The Kalman filter is an adaptive filter. It is an adaptive system which tracks the data movements and continuously re-estimates the model using the current time series input value after eliminating the noisy interference. Kalman is a multivariate operator.
• FMPFilter: The FMP algorithm is a tracking algorithm similar to the Kalman algorithm but less complex. It is useful for:
- real-time data smoothing
- real-time data prediction
- real-time anomaly detection (flagging of abnormal events)
Legend: X axis: milliseconds Y axis: Memory usage time series values Illustration: Anomaly detection on computer memory usage patterns using FMPFilter The time series simulates memory consumption from a computer. FMP is used for prediction and anomaly detection
VAR (prediction): The Multivariate Autoregressive Model (VAR) operator tracks data movement and predicts the next expected time series values. It uses the correlation between the components of the input time series to determine the movement of the data points and predict the next expected values. VAR is a multivariate operator and expects a vector time series represented as a list.
Illustration: Multivariate prediction of IBM stock price and trading volume
Forecasting
• HoltWinters: The HoltWinters algorithm is a widely used forecasting technique. The algorithm incrementally estimates the trend, the season, and the level of the time series, and uses these estimates for prediction.
• ARIMA: This operator implements the AutoRegressive Integrated Moving Average (ARIMA) modeling approach. It is made up of an AutoRegressive (AR) component, an Integrator (I) component and a Moving Average (MA) component. The ARIMA operator is designed to forecast future time series values. It initializes itself based on parameter values provided by the user.
Illustration: Forecasting of airline passenger count for the next month and 24 months ahead using Holt Winters
Legend: X axis: months; Y axis: airline passenger count
GMM : Illustration: detecting outlier on computer memory usage The input time series on top represents simulated computer memory usage data in MB for every 6 minutes. A one-mixture GMM is trained for 3400 samples, representing two weeks of data. The bottom plot shows a zoom on the area (third week) where an outlier is detected with high probability.
Illustration: Separating a sinewave and the noise out of a noisy sinewave Time [in samples] The simple example shows how the DSPFilter can be used to smooth, alter, modify, and extract signal components.
DWT
The DWT operator implements the Discrete Wavelet Transform (DWT) on a vector time series. In numerical and functional analysis, a DWT is any wavelet transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and time information. The DWT operator internally uses a Daubechies wavelet of order 2 or 4, depending on the parameter values specified. When order 2 is selected, the input time series list must have at least 2 values; when order 4 is selected, it must have at least 4 values. Otherwise an exception is thrown. If the input time series is a single float64 value, a window configuration must be provided; otherwise a compile time error is thrown. Similarly, if the input is a list<float64> and a window configuration is provided, a compile time error is thrown. DWT is a multivariate operator.
Wavelet Transform: The wavelet transform maps the original time series into a space where general trends and fine details of the time series are made more prominent. Various basis functions are available for time series projection. The wavelet transform can be used for time series approximation (trends) or to analyse small details in the signal. The first index of the transform contains the average (trend) of the signal; the higher indices contain detail information on the time series.
Ref: S. Mallat, A Wavelet Tour of Signal Processing, 2nd ed. San Diego, CA: Academic, 1999.
Illustration: DWT is used to compress a time series
Legend: X axis: duration; Y axis: time series
The red curves display the wavelet transform using Daubechies 4. We retain the lower 32 DWT coefficients of a processed time series window and set the higher indices to zero. The inverse DWT is then applied to generate an approximation of the original curve. This illustrates the compression properties of the DWT.
FFT
This operator computes the Fast Fourier Transform (FFT) of an incoming time series. The Fourier transform is a mathematical operation that decomposes a signal into its constituent frequencies. The term Fourier transform refers both to the frequency domain representation of the signal and to the process that transforms the signal to its frequency domain representation. The FFT is an efficient algorithm to compute the discrete Fourier transform (DFT) and its inverse. FFT is a multivariate operator that expects the time series as a list. The incoming time series is read as a sequence of list<float64> values. This operator can produce three types of output:
• the FFT as a list of complex64;
• the magnitude spectrum as a list of float64, if the type of transformation applied (algorithm parameter) is magnitude; or
• the power spectrum as a list of float64, if the type of transformation applied (algorithm parameter) is power.
The FFT is a classical spectral estimation technique and the most widely used one.
Ref: Alan V. Oppenheim, Ronald W. Schafer, Digital Signal Processing.
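The three kinds of output can be illustrated with a textbook recursive radix-2 FFT. This is a generic sketch, not the operator's implementation: the function names are hypothetical, and it assumes the window length is a power of two.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def magnitude_spectrum(samples):
    return [abs(c) for c in fft(samples)]

def power_spectrum(samples):
    return [abs(c) ** 2 for c in fft(samples)]

# 64 samples at 64 Hz, so bin k corresponds to k Hz: sines at 5 Hz and 12 Hz.
signal = [math.sin(2 * math.pi * 5 * t / 64) + 0.5 * math.sin(2 * math.pi * 12 * t / 64)
          for t in range(64)]
magnitudes = magnitude_spectrum(signal)
peaks = sorted(range(32), key=lambda k: magnitudes[k])[-2:]
```

Only the first half of the bins is inspected because the spectrum of a real signal is mirrored in the second half.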
Generalized Additive Model (GAM) Learner
Description: GAMLearner applies a GAM to categorical and/or continuous time series data. GAMLearner can refine a given Generalized Additive Model (GAM), specified in a Predictive Model Markup Language (PMML) file, based on the observed data, and can also score the data using the PMML model. If observed data is given, the operator updates the model parameters and outputs the filtered data.
Note: if the model is only applied to score input data (without updating the model parameters), the SPL operator GAMScorer should be used. It requires less memory and computation time. GAMLearner is a univariate operator.
Generalized Additive Model: The generalized additive model is a regression technique that estimates a function of the expectation of a target time series $Y$, given input time series $x_1, \dots, x_m$, using the following model:
$$ g(E[Y]) = \beta_0 + f_1(x_1) + f_2(x_2) + \dots + f_m(x_m) $$
where $g$ is the link function and the $f_i$ are arbitrary functions estimated from the data.
Ref: Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models. Chapman & Hall/CRC
Generalized Additive Model (GAM) Scorer
GAMScorer applies a GAM to score the input time series values. This operator applies a Generalized Additive Model (GAM), specified in a Predictive Model Markup Language (PMML) file, to a stream of time series data. The mapping between the attributes of the input tuples in Streams and the covariates in the GAM is specified by the user in an operator parameter. Each model has its own modelID (an unsigned 64-bit integer). The operator can manage several GAMs at the same time; input tuples are assigned to the correct model using the modelID.
Illustration: using GAM to predict a target time series from 4 input time series
The top graph shows a set of 4 time series that are used as input to predict the time series shown in blue in the bottom graph. The GAM is learned from the association between the input time series and the target time series, and is able to predict future values of the target time series. GAM could be useful to predict, for example, electricity consumption based on price and weather data.
FMPFilter: FMPFilter (Fading-Memory Polynomial Filter) is an adaptive filter. This operator tracks the data movements and simultaneously predicts the next expected time series value. It also flags anomalous samples. It models the data by a polynomial that is continuously re-estimated given the current data. As such, it is an adaptive system which can mimic data movement and predict expected behaviour. The FMPFilter operator can operate on univariate and vector time series. It is a univariate operator and as such processes a vector time series as parallel sequences of univariate data.
Fading-Memory Filter: The FMP algorithm is a tracking algorithm similar to the Kalman algorithm but less complex. It is useful for:
- real-time data smoothing
- real-time data prediction
- real-time anomaly detection (flagging of abnormal events)
Ref:
- N. Morrison, Introduction to Sequential Smoothing and Prediction
- N. Morrison, Smoothing and Extrapolation of Time Series by Means of Discrete Laguerre Polynomials, SIAM J. Appl. Math., 15(3), 516–538, May 1967.
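A degree-1 fading-memory polynomial filter can be written as a g-h tracker with gains g = 1 - θ² and h = (1 - θ)², where θ (between 0 and 1) is the fading factor. The sketch below uses that form to predict the next sample and flag large residuals. The polynomial degree, the fixed residual threshold, and all names are assumptions for illustration, not the operator's parameters.

```python
def fmp_track(samples, theta=0.5, threshold=1.0):
    """Degree-1 fading-memory tracker with a simple residual-based anomaly flag.

    Returns one (prediction, is_anomaly) pair for every sample after the first.
    """
    g = 1.0 - theta ** 2        # gain applied to the level
    h = (1.0 - theta) ** 2      # gain applied to the slope
    level, slope = float(samples[0]), 0.0
    results = []
    for value in samples[1:]:
        predicted = level + slope            # one-step-ahead prediction
        residual = value - predicted
        results.append((predicted, abs(residual) > threshold))
        level = predicted + g * residual     # re-estimate the polynomial
        slope = slope + h * residual
    return results

# A steady ramp followed by one abnormal jump of +10.
ramp = [2.0 * t + 1.0 for t in range(40)]
tracked = fmp_track(ramp + [2.0 * 40 + 1.0 + 10.0])
```

The first few samples may be flagged while the filter is still warming up; a smaller θ forgets old data faster, a larger θ smooths more.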
Illustration: Predicting the IBM stock price and detecting anomalies with FMPFilter, where an anomaly was manually introduced
Legend: X axis: timestamp (DD/MM/YY); Y axis: IBM stock time series values
Next-day prediction of the IBM stock price using the FMPFilter algorithm. The anomaly flag is 1 whenever an unpredictable data point occurs; otherwise it is false, meaning the input is not an anomaly.
HoltWinters: The HoltWinters operator is a forecasting operator that uses the Holt-Winters algorithm to do long-term forecasting. HoltWinters is a univariate operator and accepts time series in the following formats:
- a univariate time series as tuple<float64> or tuple<timestamp timestamp, float64 value>
- a vector time series as tuple<list<float64> > or tuple<list<timestamp> timestamps, list<float64> values>
The HoltWinters operator provides two ways of getting the forecasted time series values and timestamp values: it can forecast a value for a single point in the future, or it can provide a range of forecasts up to a point in time in the future. The HoltWinters algorithm is a widely used forecasting technique. The algorithm incrementally estimates the trend, the season, and the level of the time series, and uses these estimates for prediction.
Ref:
- Holt, Charles C. (1957). "Forecasting Trends and Seasonal by Exponentially Weighted Averages". Office of Naval Research Memorandum 52. Reprinted in Holt, Charles C. (January–March 2004). "Forecasting Trends and Seasonal by Exponentially Weighted Averages". International Journal of Forecasting 20 (1): 5–10. doi:10.1016/j.ijforecast.2003.09.015.
- Winters, P. R. (April 1960). "Forecasting Sales by Exponentially Weighted Moving Averages". Management Science 6 (3): 324–342.
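The incremental level, trend and season estimates can be sketched with the additive form of Holt-Winters. This is a minimal sketch under stated assumptions: the additive variant, the initialization from the first two seasons, the smoothing constants, and the function name are all choices made for this example and are not taken from the operator.

```python
def holt_winters_forecast(series, period, horizon, alpha=0.3, beta=0.1, gamma=0.2):
    """Additive Holt-Winters: returns forecasts for the next `horizon` steps."""
    m = period
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons to initialize")
    first_mean = sum(series[:m]) / m
    second_mean = sum(series[m:2 * m]) / m
    trend = (second_mean - first_mean) / m
    level = first_mean + trend * (m - 1) / 2          # level at end of first season
    season = [series[i] - (first_mean + trend * (i - (m - 1) / 2)) for i in range(m)]
    for t in range(m, len(series)):
        value = series[t]
        old_season = season[t % m]
        previous_level = level
        level = alpha * (value - old_season) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        season[t % m] = gamma * (value - level) + (1 - gamma) * old_season
    n = len(series)
    return [level + h * trend + season[(n + h - 1) % m] for h in range(1, horizon + 1)]

# Synthetic series: linear growth plus a repeating 4-step seasonal pattern.
pattern = [3.0, -1.0, -4.0, 2.0]
history = [10.0 + 0.5 * t + pattern[t % 4] for t in range(24)]
next_step = holt_winters_forecast(history, period=4, horizon=1)      # single point
next_range = holt_winters_forecast(history, period=4, horizon=8)     # range of forecasts
```

The two calls mirror the two usage modes described above: a single point in the future, or a range of forecasts up to a given horizon.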
Kalman: The Kalman filter is an adaptive filter. It is an adaptive system which tracks the data movements and continuously re-estimates the model using the current time series input value after eliminating the noisy interference. The Kalman operator can operate on both univariate and multivariate time series.
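The predict-then-correct cycle can be sketched with a scalar Kalman filter. This is a deliberately simplified sketch: it assumes a one-dimensional random-walk state model with hypothetical noise variances, whereas the operator itself handles vector time series.

```python
def kalman_1d(measurements, process_var=1e-3, measurement_var=1.0):
    """Scalar Kalman filter with a random-walk state model.

    Returns the next-step predictions and the final state estimate.
    """
    estimate = float(measurements[0])
    variance = measurement_var
    predictions = []
    for z in measurements[1:]:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        variance += process_var
        predictions.append(estimate)
        # Correct: blend prediction and measurement according to the gain.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= (1.0 - gain)
    return predictions, estimate

# Measurements that alternate around a true value of 5.
noisy = [4.0 if i % 2 == 0 else 6.0 for i in range(200)]
predictions, final_estimate = kalman_1d(noisy)
```

A larger `process_var` makes the filter follow the data more closely; a larger `measurement_var` makes it smooth more aggressively.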
Legend: X axis: timestamp (DD/MM/YY) Y axis: IBM stock time series values Input/Output Visualization: Next step prediction of IBM stock data using Kalman
ARIMA: This operator implements the AutoRegressive Integrated Moving Average (ARIMA) modeling approach. It is made up of an AutoRegressive (AR) component, an Integrator (I) component and a Moving Average (MA) component. The ARIMA operator is designed to forecast future time series values. It initializes itself based on parameter values provided by the user. The model can be provided as parameter values (AR, MA, differentiator value, historical data, mean and residuals), or ARIMA will estimate the model if the parameters initsamples and AROrder are provided. The historical data are the last data (in temporal order) that were used to train the model. Their minimum length is the order of the model, which is the larger of the number of AR coefficients and the number of MA coefficients. The residuals are the differences between the training data and the forecasted data. Only as many of the most recent residuals as the order of the model are required. The ARIMA operator can be used to estimate an AR model. However, it only acts as a scoring operator for non-AR models. ARIMA is also called the Box-Jenkins algorithm.
Ref: Box, George and Jenkins, Gwilym (1970) Time series analysis: Forecasting and control, San Francisco: Holden-Day.
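How AR coefficients, the differentiator and the historical data combine into a forecast can be sketched for the simplest case: an AR model applied to once-differenced data, with no MA terms. The function name and argument layout are hypothetical and do not reflect the operator's parameters.

```python
def arima_forecast(history, ar_coefficients, steps, mean=0.0):
    """Forecast with an ARIMA(p, 1, 0) model: AR on first differences, then integrate."""
    differences = [b - a for a, b in zip(history, history[1:])]
    if len(differences) < len(ar_coefficients):
        raise ValueError("history is shorter than the model order requires")
    last_value = history[-1]
    forecasts = []
    for _ in range(steps):
        recent = differences[-len(ar_coefficients):][::-1]   # most recent first
        next_difference = mean + sum(c * (d - mean)
                                     for c, d in zip(ar_coefficients, recent))
        differences.append(next_difference)
        last_value += next_difference                        # integration step
        forecasts.append(last_value)
    return forecasts

# AR(1) on the differences with coefficient 0.5: each step adds half the last change.
forecast = arima_forecast([1.0, 2.0, 4.0], ar_coefficients=[0.5], steps=3)
```

This also shows why the history must be at least as long as the model order: each forecast needs that many of the most recent values.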
Illustration: Electricity usage in US Legend: X axis: months Y axis:US electricity usage The above visual shows the electricity usage prediction using ARIMA operator
GMM: The Gaussian Mixture Model operator estimates the probability density function (smoothed histogram) of a time series. GMM is a univariate operator. This version of the operator accepts only univariate time series.
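With a single mixture component, a GMM reduces to fitting one Gaussian, and outliers are the samples that fall where the estimated density is very low. The sketch below shows that special case only; the threshold rule, the function names, and the numbers are assumptions made for illustration.

```python
import math
import statistics

def fit_gaussian(training):
    """Estimate the mean and standard deviation of a one-component model."""
    return statistics.fmean(training), statistics.pstdev(training)

def log_density(value, mean, std):
    """Log of the Gaussian probability density at `value`."""
    return -0.5 * math.log(2.0 * math.pi * std ** 2) - (value - mean) ** 2 / (2.0 * std ** 2)

def find_outliers(samples, mean, std, z_threshold=4.0):
    """Indices of samples whose density is below that of a point z_threshold sigmas out."""
    cutoff = log_density(mean + z_threshold * std, mean, std)
    return [i for i, value in enumerate(samples) if log_density(value, mean, std) < cutoff]

# Made-up memory usage values in MB: train on normal behaviour, then score new samples.
training = [100.0 + (i % 5) for i in range(50)]
mean, std = fit_gaussian(training)
outliers = find_outliers([101.0, 103.0, 150.0, 102.0], mean, std)
```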
Recursive Least Squares: RLSFilter
This operator fits a linear regression model to a series of pairs of covariates and dependent variables. The fitting is done adaptively using the Recursive Least Squares (RLS) method. The fitted model can be used to estimate the dependent variable for given new covariates. RLS is a univariate operator. The Recursive Least Squares algorithm iteratively minimizes the weighted least squares sum between the target and the estimate. It is a useful algorithm for:
- noise or echo cancellation
- estimating the DSPFilter's parameters
- prediction of an outcome based on input
Ref: Hayes, Monson H. (1996). "Recursive Least Squares". Statistical Digital Signal Processing and Modeling. Wiley.
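The adaptive fitting can be sketched with the standard RLS update equations. This is a generic textbook sketch, not the operator's code: the class name, the forgetting factor, and the initial covariance scale `delta` are all assumptions.

```python
class RecursiveLeastSquares:
    """Adaptive linear regression y ≈ w · x, updated one sample at a time."""

    def __init__(self, size, forgetting=1.0, delta=1000.0):
        self.forgetting = forgetting
        self.weights = [0.0] * size
        # Inverse correlation matrix estimate, started as a large diagonal.
        self.p = [[delta if i == j else 0.0 for j in range(size)] for i in range(size)]

    def predict(self, x):
        return sum(w * v for w, v in zip(self.weights, x))

    def update(self, x, y):
        n = len(x)
        px = [sum(self.p[i][j] * x[j] for j in range(n)) for i in range(n)]
        denominator = self.forgetting + sum(x[i] * px[i] for i in range(n))
        gain = [v / denominator for v in px]
        error = y - self.predict(x)              # error before the update
        self.weights = [w + k * error for w, k in zip(self.weights, gain)]
        # P stays symmetric, so x^T P equals (P x)^T.
        self.p = [[(self.p[i][j] - gain[i] * px[j]) / self.forgetting
                   for j in range(n)] for i in range(n)]
        return error

# Learn y = 2*a - 3*b + 1 from noise-free samples; the last covariate is a bias term.
model = RecursiveLeastSquares(size=3)
for i in range(60):
    a, b = float(i % 7), float((i * 3) % 5)
    model.update([a, b, 1.0], 2.0 * a - 3.0 * b + 1.0)
estimate = model.predict([4.0, 2.0, 1.0])
```

A forgetting factor below 1 weights recent samples more heavily, which lets the model follow slowly changing relationships.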