RACE: Time Series Compression with Rate Adaptivity and Error Bound for Sensor Networks Huamin Chen, Jian Li, and Prasant Mohapatra Presenter: Jian Li
Agenda • Motivation • Background • RACE Algorithm • Numerical Evaluation • Conclusion
Motivation • Sensor Networks • Limited energy source • Limited link bandwidth, may be time-varying • Monitoring process • Continuous data generation and dissemination • Data rate may be large, and time-varying • How to disseminate efficiently? • Compression and aggregation
Data Quality: Impact factors • Sampling frequency • Number of sampling nodes • Data dissemination • Compression • Aggregation
Why Compress? • How to get a “properly small” data rate? • Lower sampling frequency • Reduce the number of sensors • Lossy/lossless compression • Low sampling frequency is not equivalent to (lossy) compression of higher-precision raw data. • E.g., can detailed features along the timeline be retained? • Lossy compression is able to adapt to various link constraints.
But, how about Error Bound? • Volatile physical process • Data rate of time series could vary in a large range • Different compressibility at different time instances • Lossy compression cannot guarantee error bound, given a target output data rate • Consistency of data quality? • Multihop network transmission • Multiple time series compression
So, our goal is … • Adaptive compression • Compress time series into CBR/LBR flow • Trade-off: network capacity vs. data quality • Improve data quality • Exploit different compressibility along the timeline to achieve a certain error bound • Consistency of data quality among multiple time series compression
Data Quality: Error Norm • Normalized data element • Normalized data error e_i • Error norm of time series
Haar Wavelet Transformation • Compute neighboring elements’ average and difference • Average: trend of time series • Difference: details of time series • An example: original time series is [2, 6, 5, 11], we get transformation output [6, -2, -2, -3].
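The averaging and differencing step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `haar_transform` is hypothetical, and the halved difference (a - b) / 2 is an assumption inferred from the slide's example, which it reproduces.

```python
def haar_transform(series):
    """Unnormalized Haar transform (sketch).

    Assumption: average = (a + b) / 2 and difference = (a - b) / 2 for each
    neighboring pair, which matches the example [2, 6, 5, 11] on the slide.
    Output layout: [overall average, coarsest detail, ..., finest details].
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    current = [float(x) for x in series]
    details = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        diffs = [(current[i] - current[i + 1]) / 2 for i in pairs]
        details = diffs + details  # coarser levels are placed in front
        current = averages
    return current + details


print(haar_transform([2, 6, 5, 11]))  # [6.0, -2.0, -2.0, -3.0]
```

The averages carry the trend to the next level, while the differences are stored as the detail coefficients of the current level.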
Wavelet coefficient tree Time series: [3, 4, 3, 2, 6, 8, 9, 7, 2, 3, 1, 2, 10, 8, 7, 9] Output coefficients: [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
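To make the tree structure concrete, the flat coefficient list can be split into levels. The sketch below assumes the common layout in which index 0 holds the overall average and the detail coefficient at index j has children at indices 2j and 2j + 1; the helper name `tree_levels` is hypothetical.

```python
# Coefficients copied from the slide.
COEFFS = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]


def tree_levels(coeffs):
    """Group a flat Haar coefficient list into tree levels (sketch).

    Assumed layout: coeffs[0] is the overall average; the detail level that
    starts at index s occupies coeffs[s:2*s], for s = 1, 2, 4, ...
    """
    levels = [[coeffs[0]]]
    start = 1
    while start < len(coeffs):
        levels.append(coeffs[start:2 * start])
        start *= 2
    return levels


for depth, level in enumerate(tree_levels(COEFFS)):
    print(depth, level)
```

Under this layout the last level holds eight coefficients, each covering one pair of neighboring data elements.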
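Data Element Reconstruction • $d_i = \sum_{C_j \in \mathrm{path}(d_i)} \delta_{ij} C_j$ • $\mathrm{path}(d_i)$ is the set of coefficients on the path from the root of the wavelet coefficient tree to data element $d_i$ • $\delta_{ij} = +1$ if $d_i$ lies in the left subtree of $C_j$ or $C_j$ is the overall average, and $\delta_{ij} = -1$ otherwise • $C_j$ is an individual coefficient.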
Reconstruction: example • Calculation for the fifth element of the time series: +(5.25) +(0) -(-2.25) +(-0.5) +(-1) = 6
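The calculation in this example can be checked with a short sketch that walks from the root of the coefficient tree down to one data element, adding a coefficient when the element lies in its left half and subtracting it otherwise. The function name `reconstruct` and the index layout (children of node j at 2j and 2j + 1) are assumptions made for illustration.

```python
# Time series and coefficients copied from the slides.
SERIES = [3, 4, 3, 2, 6, 8, 9, 7, 2, 3, 1, 2, 10, 8, 7, 9]
COEFFS = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]


def reconstruct(coeffs, i):
    """Rebuild data element i from the coefficients on its root-to-leaf path."""
    n = len(coeffs)
    value = coeffs[0]      # the overall average always enters with a plus sign
    j, lo, hi = 1, 0, n    # node j covers data elements lo .. hi - 1
    while j < n:
        mid = (lo + hi) // 2
        if i < mid:        # element is in the left half: add the coefficient
            value += coeffs[j]
            hi, j = mid, 2 * j
        else:              # element is in the right half: subtract it
            value -= coeffs[j]
            lo, j = mid, 2 * j + 1
    return value


print(reconstruct(COEFFS, 4))  # 6.0, the fifth element of the time series
```

Only five coefficients are touched per element for a series of length 16, one per tree level plus the overall average.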
Magnitude-based zeroing • Given a threshold a • If the magnitude of a coefficient satisfies |Cj| < a, then this coefficient is cut off and does not participate in the reconstruction process.
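A minimal sketch of magnitude-based zeroing follows. Two points are assumptions for illustration: a cut-off coefficient is represented by 0.0, and the overall average at index 0 is always kept. The function name `magnitude_zeroing` is hypothetical.

```python
# Coefficients copied from the wavelet coefficient tree slide.
COEFFS = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]


def magnitude_zeroing(coeffs, a):
    """Zero every detail coefficient whose magnitude is below threshold a.

    Assumption: coeffs[0] (the overall average) is never cut off.
    """
    kept = [coeffs[0]]
    kept += [c if abs(c) >= a else 0.0 for c in coeffs[1:]]
    return kept


zeroed = magnitude_zeroing(COEFFS, 1.0)
print(sum(1 for c in zeroed if c != 0))  # 7 coefficients survive
```

The threshold looks only at each coefficient's own magnitude, so it says nothing directly about the error of the reconstructed data elements.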
RACE Algorithm • Generating gradient error tree • Error-based zeroing (i.e., compression process) • Smoothing error bound via patching process
Gradient Error Tree • Gradient Error G(V) • V is a coefficient in the wavelet coefficient tree • G(V) is defined as the maximum error incurred when the subtree rooted at node V is cut off • Gradient Error Tree • Computed from the corresponding wavelet coefficient tree
Gradient Error Tree: an example Time series: [3, 4, 3, 2, 6, 8, 9, 7, 2, 3, 1, 2, 10, 8, 7, 9] Coefficients: [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
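The gradient error tree for this example can be sketched as follows. The slide's own formula for G(V) is not available here, so the code is one reading of the verbal definition: G(V) is taken as the largest absolute error over all data elements under V when V and its descendants are set to zero. The function name `gradient_errors` and the index layout (children of node j at 2j and 2j + 1) are assumptions.

```python
# Coefficients copied from the slide.
COEFFS = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]


def gradient_errors(coeffs):
    """Return g, where g[j] is the max absolute error from cutting subtree j.

    Assumed interpretation: each data element under node j loses the signed
    sum of the cut coefficients on its path; g[j] is the largest magnitude.
    g[0] is None because the overall average is not a subtree root here.
    """
    n = len(coeffs)
    lo = [0.0] * (2 * n)   # most negative signed error below each node
    hi = [0.0] * (2 * n)   # most positive signed error below each node
    g = [None] * n
    for j in range(n - 1, 0, -1):          # bottom-up over detail coefficients
        c = coeffs[j]
        lo[j] = min(c + lo[2 * j], -c + lo[2 * j + 1])
        hi[j] = max(c + hi[2 * j], -c + hi[2 * j + 1])
        g[j] = max(-lo[j], hi[j])
    return g


print(gradient_errors(COEFFS)[1:8])  # [4.75, 3.75, 4.75, 1.0, 1.5, 1.0, 1.5]
```

For instance, cutting the subtree rooted at the coefficient -3.25 replaces the last eight elements by 5.25, and the largest deviation is |10 - 5.25| = 4.75.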
Error-based zeroing • Use the error bound as the threshold value and, according to the gradient error tree, apply magnitude-based zeroing to the wavelet coefficient tree • Use the symbol “t” to represent a zeroed subtree
Error-based zeroing • Example: threshold = 2 results in 8 symbols to encode
Error-based zeroing • Example: threshold = 4 results in 6 symbols to encode
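The two symbol counts can be reproduced with a small sketch. Several details are assumptions for illustration rather than the authors' encoder: a subtree is cut when its gradient error is at most the threshold, the gradient error is the maximum absolute error over the data elements under the node, and symbols are emitted in breadth-first order. The names `gradient_errors` and `encode` are hypothetical.

```python
from collections import deque

# Coefficients copied from the wavelet coefficient tree slide.
COEFFS = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]


def gradient_errors(coeffs):
    """g[j]: max absolute error when the subtree rooted at node j is cut."""
    n = len(coeffs)
    lo, hi, g = [0.0] * (2 * n), [0.0] * (2 * n), [None] * n
    for j in range(n - 1, 0, -1):
        c = coeffs[j]
        lo[j] = min(c + lo[2 * j], -c + lo[2 * j + 1])
        hi[j] = max(c + hi[2 * j], -c + hi[2 * j + 1])
        g[j] = max(-lo[j], hi[j])
    return g


def encode(coeffs, threshold):
    """Emit kept coefficients and "t" for each zeroed subtree (sketch)."""
    g = gradient_errors(coeffs)
    n = len(coeffs)
    symbols = [coeffs[0]]            # the overall average is always kept
    queue = deque([1])               # breadth-first walk from the root detail
    while queue:
        j = queue.popleft()
        if g[j] <= threshold:        # whole subtree stays within the bound
            symbols.append("t")
        else:
            symbols.append(coeffs[j])
            if 2 * j < n:            # descend only if node j has children
                queue.extend((2 * j, 2 * j + 1))
    return symbols


print(len(encode(COEFFS, 2)))  # 8
print(encode(COEFFS, 4))       # [5.25, 0, 't', -3.25, 't', 't']
```

A larger threshold lets the cut move closer to the root, so fewer symbols are needed at the cost of a looser error bound.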
Important Properties • Error bound additivity • Multihop network transmission • Multiple time series aggregation • Patch-ability • Exploiting varying compressibility of input stream along timeline • Smoothing error range of output stream
Numerical evaluation • Data set • Real world data from TAO project (http://www.pmel.noaa.gov/tao) • Including air temperature and subsurface temperature at different depths • Air temperature characteristics
Preservation of statistical interpretation • How well is multivariate correlation preserved? • The cross correlation between variables x and y is defined as a function of d, the temporal delay between x and y.
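As a point of reference, one common definition of the normalized cross correlation at delay d is sketched below. The slide's own formula is not available here, so this form, the pairing of x[i] with y[i - d], and the function name `cross_correlation` are assumptions rather than the authors' definition.

```python
import math


def cross_correlation(x, y, d):
    """Normalized cross correlation of x and y at delay d (sketch).

    Assumption: x[i] is paired with y[i - d]; pairs that fall outside the
    series are skipped, and the result is normalized by the full-series
    deviations so that identical series give 1.0 at d = 0.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((x[i] - mx) * (y[i - d] - my)
              for i in range(n) if 0 <= i - d < n)
    den = (math.sqrt(sum((a - mx) ** 2 for a in x))
           * math.sqrt(sum((b - my) ** 2 for b in y)))
    if den == 0:
        raise ValueError("cross correlation is undefined for a constant series")
    return num / den


x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(cross_correlation(x, x, 0))        # close to 1.0
print(cross_correlation(x, x[::-1], 0))  # close to -1.0
```

Comparing this quantity before and after compression, over a range of delays, is one way to judge how well the relationship between two variables survives.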
Data sets • Subsurface temperatures at depths 25m and 50m
Conclusion • Rate-adaptive compression scheme • Improved error bound, achieving a soft guarantee • Preservation of multivariate correlation