RACE: Time Series Compression with Rate Adaptivity and Error Bound for Sensor Networks

Huamin Chen, Jian Li, and Prasant Mohapatra • Presenter: Jian Li

Agenda • Motivation • Background • RACE Algorithm • Numerical Evaluation • Conclusion

Motivation • Sensor Networks • Limited energy source • Limited link bandwidth, may be time-varying • Monitoring process • Continuous data generation and dissemination • Data rate may be large, and time-varying • How to disseminate efficiently? • Compression and aggregation

Data Quality: Impact factors • Sampling frequency • Number of sampling nodes • Data dissemination • Compression • Aggregation

Why Compress? • How to get a “properly small” data rate? • Lower sampling frequency • Reduce the number of sensors • Lossy/lossless compression • A low sampling frequency is not equivalent to (lossy) compression of higher-precision raw data. • E.g., can detailed features along the timeline be retained? • Lossy compression is able to adapt to various link constraints.

But what about the error bound? • Volatile physical process • Data rate of time series could vary over a large range • Different compressibility at different time instances • Lossy compression cannot guarantee an error bound for a given target output data rate • Consistency of data quality? • Multihop network transmission • Multiple time series compression

So, our goal is … • Adaptive compression • Compress time series into CBR/LBR flow • Trade-off: network capacity vs. data quality • Improve data quality • Exploit different compressibility along the timeline to achieve a certain error bound • Consistency of data quality among multiple time series compression

Data Quality: Error Norm • Normalized data element • Normalized data error ei = • Error norm of time series

Haar Wavelet Transformation • Compute the average and the (halved) difference of each pair of neighboring elements, then repeat on the averages • Average: trend of time series • Difference: details of time series • Example: the time series [2, 6, 5, 11] transforms to [6, -2, -2, -3].
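The averaging and differencing steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `haar_transform` is hypothetical, and the output ordering (overall average first, then details from coarse to fine) is inferred from the example on the slide.

```python
def haar_transform(series):
    """Return [overall average, detail coefficients from coarse to fine]."""
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    averages = list(series)
    details = []
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        # Detail = half the difference of each neighboring pair.
        level_details = [(a - b) / 2 for a, b in pairs]
        # Trend = average of each neighboring pair.
        averages = [(a + b) / 2 for a, b in pairs]
        # Coarser levels are placed in front of finer ones.
        details = level_details + details
    return averages + details


print(haar_transform([2, 6, 5, 11]))  # [6.0, -2.0, -2.0, -3.0]
```

Applied to the 16-element series used on the following slides, the same function yields the coefficient list shown there.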

Wavelet coefficient tree Time series: [3, 4, 3, 2, 6, 8, 9, 7, 2, 3, 1, 2, 10, 8, 7, 9] Output coefficients: [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]

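Data Element Reconstruction • Each data element is reconstructed as a signed sum of the coefficients on its path from the root of the coefficient tree, where Cj is an individual coefficient.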

Reconstruction: example • Calculation: +(5.25) +(0) -(-2.25) +(-0.5) +(-1) = 6
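The calculation above corresponds to the fifth element (value 6) of the 16-element series. A minimal sketch of the path walk follows; the name `reconstruct_element` and the heap-style layout (node j has children 2j and 2j+1) are assumptions made for illustration. A coefficient is added when the element lies in the left half of the node's range and subtracted when it lies in the right half.

```python
COEFFS = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]


def reconstruct_element(coeffs, i):
    """Rebuild element i (0-based) from the coefficients on its tree path."""
    n = len(coeffs)
    value = coeffs[0]          # overall average
    lo, hi, node = 0, n, 1     # node covers elements lo..hi-1
    while node < n:
        mid = (lo + hi) // 2
        if i < mid:            # left half: add the detail coefficient
            value += coeffs[node]
            hi, node = mid, 2 * node
        else:                  # right half: subtract it
            value -= coeffs[node]
            lo, node = mid, 2 * node + 1
    return value


print(reconstruct_element(COEFFS, 4))  # 6.0
```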

Magnitude-based zeroing • Given a threshold a • If the magnitude of a coefficient satisfies |Cj| < a, then this coefficient leaf is cut off and does not participate in the reconstruction process.
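As a rough sketch, magnitude-based zeroing amounts to a single pass over the coefficient list. The function name `magnitude_zero` is hypothetical, and the comparison uses the absolute value of each coefficient, which is an assumption about how the threshold is meant to be applied.

```python
COEFFS = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]


def magnitude_zero(coeffs, threshold):
    """Zero every coefficient whose magnitude is below the threshold."""
    return [0.0 if abs(c) < threshold else c for c in coeffs]


zeroed = magnitude_zero(COEFFS, 1)
kept = sum(1 for c in zeroed if c != 0)
print(kept, "of", len(COEFFS), "coefficients kept")  # 7 of 16 coefficients kept
```

Note that the threshold here acts on coefficient magnitudes only; it says nothing directly about the error of any reconstructed data element.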

RACE Algorithm • Generating gradient error tree • Error-based zeroing (i.e., compression process) • Smoothing error bound via patching process

Gradient Error Tree • Gradient Error G(V) • V is a coefficient in the wavelet coefficient tree • G(V) is defined as the maximum error incurred when the subtree rooted at node V is cut off • Gradient Error Tree • Computed from the corresponding wavelet coefficient tree

Gradient Error Tree: an example Time series: [3, 4, 3, 2, 6, 8, 9, 7, 2, 3, 1, 2, 10, 8, 7, 9] Coefficients: [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
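One way to read the definition of G(V) is as the largest absolute reconstruction error over all data elements beneath V when V's subtree is dropped. The sketch below computes that quantity bottom-up for the example coefficients. The function name `gradient_errors`, the heap-style indexing, and the use of absolute rather than normalized error are all assumptions; the normalization used in the talk is not reproduced here.

```python
COEFFS = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]


def gradient_errors(coeffs):
    """Return grad where grad[j] = max |error| if subtree j is cut off.

    Index 0 (the overall average) is left at 0.0 and never cut.
    """
    n = len(coeffs)
    # hi[j] / lo[j]: largest / smallest signed error among elements under j.
    # Indices >= n stand for data elements and contribute zero.
    hi = [0.0] * (2 * n)
    lo = [0.0] * (2 * n)
    grad = [0.0] * n
    for j in range(n - 1, 0, -1):
        c = coeffs[j]
        # Left elements lose +c, right elements lose -c, plus whatever
        # the child subtrees would have contributed.
        hi[j] = max(c + hi[2 * j], -c + hi[2 * j + 1])
        lo[j] = min(c + lo[2 * j], -c + lo[2 * j + 1])
        grad[j] = max(abs(hi[j]), abs(lo[j]))
    return grad


print(gradient_errors(COEFFS)[1:8])
# [4.75, 3.75, 4.75, 1.0, 1.5, 1.0, 1.5]
```

For instance, cutting the subtree at node 5 replaces the elements 6, 8, 9, 7 with their average 7.5, so the largest error is 1.5.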

Error based zeroing • Use the error bound as the threshold value: wherever the gradient error tree gives an error within the threshold, cut off the corresponding subtree of the wavelet coefficient tree • Use the symbol “t” to represent a zeroed subtree

Error based zeroing • Example: threshold = 2 → results in 8 symbols to encode

Error based zeroing • Example: threshold = 4 → results in 6 symbols to encode
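The two examples can be reproduced with a short top-down pass, sketched here under stated assumptions: gradient errors are absolute (not normalized), a subtree is cut when its gradient error is at most the bound, and symbols are emitted level by level. The names `gradient_errors` and `error_based_zeroing` are hypothetical.

```python
COEFFS = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]


def gradient_errors(coeffs):
    """grad[j] = max |error| over elements under node j if j is cut off."""
    n = len(coeffs)
    hi, lo, grad = [0.0] * (2 * n), [0.0] * (2 * n), [0.0] * n
    for j in range(n - 1, 0, -1):
        c = coeffs[j]
        hi[j] = max(c + hi[2 * j], -c + hi[2 * j + 1])
        lo[j] = min(c + lo[2 * j], -c + lo[2 * j + 1])
        grad[j] = max(abs(hi[j]), abs(lo[j]))
    return grad


def error_based_zeroing(coeffs, bound):
    """Level-order symbol list; "t" marks a cut-off subtree."""
    n = len(coeffs)
    grad = gradient_errors(coeffs)
    symbols = [coeffs[0]]              # the overall average is always kept
    level = [1] if n > 1 else []
    while level:
        next_level = []
        for j in level:
            if grad[j] <= bound:       # whole subtree fits within the bound
                symbols.append("t")
            else:
                symbols.append(coeffs[j])
                next_level += [k for k in (2 * j, 2 * j + 1) if k < n]
        level = next_level
    return symbols


print(error_based_zeroing(COEFFS, 2))  # [5.25, 0, -2.25, -3.25, 't', 't', 't', 't']
print(error_based_zeroing(COEFFS, 4))  # [5.25, 0, 't', -3.25, 't', 't']
```

Under this reading the symbol counts are 8 and 6, matching the two examples. Because the cut subtrees never overlap, every data element is affected by at most one cut, so its error stays within the bound.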

Important Properties • Error bound additivity • Multihop network transmission • Multiple time series aggregation • Patch-ability • Exploiting varying compressibility of input stream along timeline • Smoothing error range of output stream

Numerical evaluation • Data set • Real world data from TAO project (http://www.pmel.noaa.gov/tao) • Including air temperature and subsurface temperature at different depths • Air temperature characteristics

Adaptive Compression: Max normalized error

Adaptive Compression: Smoothed max normalized error

Preservation of statistical interpretation • How well is multivariate correlation preserved? • Cross correlation between variables x and y is defined as a function of d, the temporal delay between x and y.
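For readers who want to experiment, the sketch below uses a common normalized form of cross correlation at delay d: the sum of products of mean-removed x(i) and y(i-d), divided by the product of the two series' root sum of squared deviations. This is an assumed definition chosen for illustration and may differ in detail from the one used in the talk. The function name `cross_correlation` is hypothetical.

```python
import math


def cross_correlation(x, y, d):
    """Normalized cross correlation of x and y at integer delay d."""
    n = len(x)
    if len(y) != n or n == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    mx = sum(x) / n
    my = sum(y) / n
    # Only index pairs that fall inside both series contribute.
    num = sum((x[i] - mx) * (y[i - d] - my)
              for i in range(n) if 0 <= i - d < n)
    den = math.sqrt(sum((v - mx) ** 2 for v in x) *
                    sum((v - my) ** 2 for v in y))
    return num / den if den else 0.0


x = [1, 2, 3, 4]
print(cross_correlation(x, x, 0))                 # 1.0
print(cross_correlation(x, [-v for v in x], 0))   # -1.0
```

Comparing this value for the raw series and for the decompressed series, across a range of delays, is one way to check how much of the correlation structure survives compression.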

Data sets • Subsurface temperatures at depths 25m and 50m

Cross correlation under different compression ratios

Conclusion • Rate-adaptive compression scheme • Improved error bound, achieving a soft guarantee • Preservation of multivariate correlation
