Model-Based Techniques for Data Reliability in Wireless Sensor Networks - Hambi
Problem Definition • Assuring the reliability of the data is going to be one of the major design challenges of future sensor networks. • How can this be achieved under stringent requirements such as tight energy budgets, low-cost components, limited processing resources, and small-footprint devices?
Outline • Brief Introduction • Motivation • Overall approach • Predictive error correction • Data Modeling • Performance Analysis
Introduction • The growth of wireless sensor networks is mainly driven by developments in semiconductor design technology, • e.g., the scaling of feature sizes and the lowering of operating voltages, which allow sensor nodes to become smaller and more power-efficient. • As the nodes get smaller and cheaper, ensuring the reliability of sensor data becomes harder.
Sources of Errors • Operating conditions. • Aging, which causes calibration drift. • Crosstalk and radiation effects. • Two types of errors: transient and permanent. • The shrinking of feature sizes to nanometer scales and the lowering of supply voltages to sub-volt ranges make the nodes vulnerable to various noise and interference effects. • Errors in the communication channel.
Continued… • Communication errors are handled by channel coding or by Automatic Repeat Request (ARQ) coupled with error detection. • Channel coding: Forward Error Correction (FEC) codes. • FEC adds extra bits to transmitted packets that allow correct decoding when some of the bits are corrupted, e.g., Reed-Solomon coding. • Data is processed in fixed-size blocks, with a fixed number of overhead symbols added during encoding. • The decoder can recover a block when the number of corrupted symbols in the block does not exceed half the number of overhead (parity) symbols.
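For Reed-Solomon codes, the recovery rule above is usually stated per symbol: an RS(n, k) code with n - k parity symbols corrects up to floor((n - k) / 2) symbol errors per block. A minimal sketch of that arithmetic follows; the function names and the RS(255, 223) parameters are illustrative assumptions, not taken from the slides.

```python
def rs_correctable_symbols(n: int, k: int) -> int:
    """Maximum symbol errors an RS(n, k) block code can correct."""
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    # n - k parity symbols allow correcting half that many symbol errors.
    return (n - k) // 2


def rs_overhead_ratio(n: int, k: int) -> float:
    """Parity symbols sent per data symbol (transmission overhead)."""
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    return (n - k) / k


# Illustrative parameters only: 223 data symbols plus 32 parity symbols.
t = rs_correctable_symbols(255, 223)
overhead = rs_overhead_ratio(255, 223)
```

The stronger the required correction capability, the larger n - k must be, which is the overhead the model-based approach tries to avoid.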
Approach • Model-based error correction techniques. • Use the sensor data itself for error correction. • Increased redundancy? Yes, but it can be handled in different ways as part of the system design. • Reducing redundancy may reduce the robustness of the data against errors. • Why not use this redundancy to ensure reliability? • Advantages: • The network is much more efficient. • Multiple types of errors are handled together. • Sensor nodes need not worry about error correction. • Follows the KISS principle: sensors are meant just for sensing.
Continued… • Sensing is done by clusters of dedicated sensor nodes that report the sensor data to more complex cluster-head nodes. • Cluster heads: • Have more processing and energy resources. • Are capable of multiple complex functions that involve data processing and storage. • Configure the scheduling of sensing, data-reporting, and sleeping cycles of the nodes around them. • Functionality is implemented in software. • The approach can be combined with the traditional approaches.
Model-based error correction • The idea is to use the properties of the data for reliability. • Analyze the sensor data. • Capture the relevant properties in a data model. • Which properties? • That depends on the type of application. • E.g., knowing that the correlation time is in hours rather than milliseconds. • Use this model for error detection and correction.
What exactly are we doing here? • Before each sample is received, a predicted value (Xp) is calculated using the data model. • The observed value (X) is then received. • Xp and the past observations (held with the data model) are used together. • The decision algorithm decides whether the sample is erroneous. • The corrected value Xc is reported; it depends on the output of error detection.
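The per-sample flow above can be sketched as a small loop. This is only an illustration under stated assumptions: `predict_last` (predict the previous corrected value) and `threshold_detector` (flag a sample when it deviates from the prediction by more than a fixed amount) are hypothetical stand-ins for the data model and for the decision algorithms described later, and the sample values are made up.

```python
from typing import Callable, List, Sequence


def correct_stream(
    observed: Sequence[float],
    predict: Callable[[List[float]], float],
    is_erroneous: Callable[[float, float], bool],
    seed: Sequence[float],
) -> List[float]:
    """Predict Xp, compare with X, and report the corrected value Xc."""
    history = list(seed)          # past corrected values used by the model
    corrected = []
    for x in observed:
        xp = predict(history)     # Xp is computed before X is examined
        xc = xp if is_erroneous(x, xp) else x
        corrected.append(xc)
        history.append(xc)        # later predictions build on Xc
    return corrected


# Hypothetical stand-ins, not the model or algorithms from the slides.
def predict_last(history: List[float]) -> float:
    return history[-1]


def threshold_detector(x: float, xp: float) -> bool:
    return abs(x - xp) > 20


result = correct_stream([101, 103, 231, 104], predict_last,
                        threshold_detector, seed=[100])
print(result)  # [101, 103, 103, 104]
```

A fixed threshold cannot tell a genuine sharp change from a corrupted sample, which is why the slides go on to delay the decision and compare against future samples.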
What else do we have? • Adapt the model online based on the collected data. • This improves the accuracy of prediction and correction. • Note: we are NOT dealing with sensors whose data varies slowly in time relative to the sampling rate; the targets are, e.g., event-detection or low-sampling-rate applications. • Data is normally expected to contain sharp variations, randomly spread.
Predictive Error Correction: Intro • The correlation characteristics of the data should differ from those of the errors. • Most errors are random bit errors, uniformly distributed across bit positions. • Dealing with transient errors: • Probabilistic models for process variations and random particle strikes. • Deterministic models for circuit layout and crosstalk. • The effect is visible only at the gate level or register level. • The uniform structure of the logic and layout makes cells equally susceptible to radiation effects.
Continued… • For communication channel errors, channel conditions are measured. • These are hard to model when packet arrival intervals are large. • In the absence of any estimates, use a model of uncorrelated random bit errors. • We use a Bernoulli process with uniform error probability for all the bits in the data. • If sampling happens every few seconds, error bursts of 50-200 ms have no influence.
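The Bernoulli error model above can be sketched as independent bit flips with the same probability for every bit position. The function name, the 12-bit sample width, and the error probability used here are illustrative assumptions.

```python
import random


def inject_bit_errors(sample: int, bit_width: int, p: float,
                      rng: random.Random) -> int:
    """Flip each of the bit_width bits independently with probability p."""
    for bit in range(bit_width):
        if rng.random() < p:      # one Bernoulli trial per bit position
            sample ^= 1 << bit
    return sample


rng = random.Random(0)            # fixed seed so runs are repeatable
noisy = [inject_bit_errors(100, bit_width=12, p=0.01, rng=rng)
         for _ in range(5)]
```

Because every bit position is equally likely to flip, an error in a high-order bit produces a large jump in the sample value while an error in a low-order bit may be indistinguishable from modeling noise.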
Correction Methodology • Calculate the difference between the predicted and the observed value. • There will always be some level of prediction error. • Model limitations, such as the size of the history, also contribute to it. • Main challenge: • Identify the cause of the prediction error: randomness of the data, or an error introduced after sensing. • Compare the prediction errors of the samples sequentially.
Continued… • Delay the reporting by a few sample periods. • Compare with the future samples. • Report depending on how the choice (Xp or X) affects the future samples. • How do we identify errors? • An erroneous observation leads to continuous degradation of the predicted values of future samples. • A modeling error is unlikely to introduce degrading predicted values. • Whenever the decision algorithm detects an error in an observed data sample, the observation is marked and treated as an erasure.
Continued… • Prediction and decision blocks are control blocks. • The prediction block implements the prediction model. • It produces a predicted value based on recent history. • Observation history and prediction history blocks are storage blocks. • The output of the prediction block is stored in the prediction history tree (PHT). • Both histories are inputs to the prediction block for predicting future samples. • Decision block: • The PHT is processed by the decision algorithm in the decision block.
Data Structure of PHT • Holds the few most recently observed samples at any time. • All possible sequences of Xp values and the corresponding error values (Xe) are stored. • It is a binary tree. • The root node contains the last corrected data sample. • The children of the root node contain the X and Xp values of the very next sample. • The leaf nodes hold the predicted values of the current sample.
Continued… • Each path from root to leaf holds a possible sequence of observed or predicted values. • Decision delay: the depth of the PHT (N). • The tree has N+2 levels. • Each node in a level holds a pair of values: • <observed/predicted value, prediction error> • Nodes are numbered sequentially starting from 0. • For node i, child 2i+1 holds the observed value (odd-numbered). • Child 2i+2 holds the predicted value (even-numbered).
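The numbering scheme above is the usual level-order layout of a complete binary tree, so the PHT can be kept in a flat array. A small sketch of the index arithmetic, with helper names that are assumptions for illustration:

```python
from typing import List, Tuple


def children(i: int) -> Tuple[int, int]:
    """(observed child, predicted child) of node i."""
    return 2 * i + 1, 2 * i + 2


def parent(i: int) -> int:
    return (i - 1) // 2


def holds_observed_value(i: int) -> bool:
    """Odd-numbered nodes hold observed values; the root (0) is excluded."""
    return i % 2 == 1


def leaf_indices(levels: int) -> List[int]:
    """Indices of the last level of a tree with the given number of levels."""
    return list(range(2 ** (levels - 1) - 1, 2 ** levels - 1))


def path_from_root(i: int) -> List[int]:
    """Node indices from the root down to node i, inclusive."""
    path = [i]
    while i != 0:
        i = parent(i)
        path.append(i)
    return path[::-1]


# A tree with N + 2 = 4 levels has 15 nodes, numbered 0 to 14.
print(leaf_indices(4))      # [7, 8, 9, 10, 11, 12, 13, 14]
print(path_from_root(14))   # [0, 2, 6, 14]
```

Every root-to-leaf path passes through either node 1 or node 2, which is what lets the decision algorithms reduce the choice to "keep the observation" or "use the prediction".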
Continued… • The root node contains the last corrected value, Xc[n-3] = 100. • Even-numbered leaf nodes contain the different values of Xp[n] that would be computed for different choices of previous values. • Once the new sample X[n] is observed, the prediction errors for all the values of Xp[n] are computed. • The decision algorithm decides between nodes 1 and 2. • Prediction errors are used for the comparison. • Finally, Xc[n-2] is chosen.
Continued… • The algorithm updates the PHT for the next sample. • Node 1: root of the observation sub-tree. • Node 2: root of the prediction sub-tree. • One sub-tree is chosen, and its nodes are moved up by one level. • The other sub-tree is discarded. • The decision and update process is then repeated for the next sample.
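The update step can be sketched on a flat, level-order array: the chosen sub-tree is copied one level up and the vacated leaf level is left empty for the next round of predictions. The function name and the use of `None` as a placeholder for the new leaves are assumptions made for this sketch.

```python
from typing import List, Optional


def promote_subtree(tree: List[object], chosen: int) -> List[Optional[object]]:
    """Move the sub-tree rooted at node 1 or 2 up by one level.

    tree is a complete binary tree in level order (length 2**levels - 1).
    The other sub-tree is dropped; the new leaf level is filled with None.
    """
    if chosen not in (1, 2):
        raise ValueError("chosen must be node 1 or node 2")
    levels = (len(tree) + 1).bit_length() - 1
    new_tree: List[Optional[object]] = [None] * len(tree)
    for level in range(levels - 1):
        width = 2 ** level                       # nodes per new level
        new_start = 2 ** level - 1
        # Same level inside the chosen sub-tree sits one level lower.
        old_start = 2 ** (level + 1) - 1 + (chosen - 1) * width
        new_tree[new_start:new_start + width] = tree[old_start:old_start + width]
    return new_tree


# Node payloads replaced by their own indices to make the move visible.
old = list(range(15))
print(promote_subtree(old, 2)[:7])   # [2, 5, 6, 11, 12, 13, 14]
```

After promotion, the old node 1 or 2 becomes the new root, matching the rule that the root holds the last corrected sample.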
Decision Algorithms • Four Algorithms: • Min-Err Algorithm • Min-Max Algorithm • Peer Algorithm • Hybrid Peer with CRC check
Min-Err Algorithm • The decision is based on how the choice affects the prediction accuracy of the next N samples. • Different choices of observed/predicted values yield different sequences of samples. • Select the sub-tree of the PHT that contains the root-leaf path with the minimum RMS correction error. • Ex: among the paths ending at nodes 8, 10, 12, and 14, • the RMS value is minimum for the path ending at node 14. • Hence node 2 is chosen.
Min-err drawbacks • Highly sensitive to the modeling performance. • Nonzero prediction errors can occur even for authentic observed values. • The effect of the modeling error is amplified for paths that have a small number of predicted samples. • A single sample might be used for the decision making. • Ex: Let’s say the observed value is 111 (go back to slide 23). The errors for the path would have been 19 and 21, which means the other node would have been chosen.
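A minimal sketch of the Min-Err selection on an array-indexed PHT follows. Several details are assumptions, since the slides do not spell them out: errors are taken from every non-root node on a path, only the even-numbered leaves 8, 10, 12, and 14 are candidate path ends (as in the slide's example), and the error values are invented so that the path ending at node 14 wins.

```python
import math
from typing import Dict, List, Sequence


def path_below_root(leaf: int) -> List[int]:
    """Node indices from node 1 or 2 down to the leaf (root excluded)."""
    path = []
    while leaf != 0:
        path.append(leaf)
        leaf = (leaf - 1) // 2
    return path[::-1]


def min_err_choice(errors: Dict[int, float], leaves: Sequence[int]) -> int:
    """Return 1 or 2: the sub-tree holding the path with minimum RMS error."""
    def rms(leaf: int) -> float:
        path = path_below_root(leaf)
        return math.sqrt(sum(errors[n] ** 2 for n in path) / len(path))

    best_leaf = min(leaves, key=rms)
    return path_below_root(best_leaf)[0]


# Hypothetical prediction errors per node, for illustration only.
example_errors = {1: 25, 2: 0, 3: 4, 4: 0, 5: 20, 6: 0,
                  8: 3, 10: 9, 12: 15, 14: 2}
choice = min_err_choice(example_errors, [8, 10, 12, 14])
print(choice)  # 2
```

Because a single low-error path decides the outcome, one lucky prediction is enough to tip the choice, which is the sensitivity listed among the drawbacks.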
Min-max algorithm • The sub-trees of nodes 1 and 2 are considered separately. • In each sub-tree, the path with the maximum average error is found. • The sub-tree with the smaller maximum average error is selected for the decision. • Pros: • More resilient to modeling errors. • In the example (slide 29), only the paths ending at nodes 10 and 12 are compared. • N should be as big as 145 to affect the final value.
Min-max continued… • Cons: • Doesn’t take into account certain cases that cause spurious errors for certain models. • Ex: the size of the history is smaller than the depth of the PHT. • Let’s say the example model on slide 29 uses just the previous sample for prediction. • Then the value predicted at node 1 has no effect on the predicted value at node 12.
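The Min-max rule can be sketched in the same array-indexed setting. As before, the choice of which nodes contribute to a path's average and the error values themselves are assumptions for illustration; the values are picked so that the worst paths end at nodes 10 and 12, mirroring the slide's example.

```python
from typing import Dict, List, Sequence


def path_below_root(leaf: int) -> List[int]:
    """Node indices from node 1 or 2 down to the leaf (root excluded)."""
    path = []
    while leaf != 0:
        path.append(leaf)
        leaf = (leaf - 1) // 2
    return path[::-1]


def min_max_choice(errors: Dict[int, float], leaves: Sequence[int]) -> int:
    """Return 1 or 2: the sub-tree whose worst path has the smaller average error."""
    worst: Dict[int, float] = {}
    for leaf in leaves:
        path = path_below_root(leaf)
        average = sum(abs(errors[n]) for n in path) / len(path)
        subtree = path[0]
        worst[subtree] = max(worst.get(subtree, 0.0), average)
    return min(worst, key=lambda s: worst[s])


# Hypothetical prediction errors per node, for illustration only.
example_errors = {1: 25, 2: 0, 3: 4, 4: 0, 5: 20, 6: 0,
                  8: 3, 10: 9, 12: 15, 14: 2}
choice = min_max_choice(example_errors, [8, 10, 12, 14])
print(choice)  # 1
```

With these made-up numbers the worst-case averages are about 11.3 (sub-tree 1, path to node 10) and 11.7 (sub-tree 2, path to node 12), so sub-tree 1 is selected even though the single best path lies in sub-tree 2.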
Peer Algorithm • Individual pairs of nodes in the two sub-trees are compared, instead of full paths. • Nodes in peer positions are compared. • Absolute prediction errors are compared. • The sub-tree that has more samples with lower prediction errors is selected. • Predictions that are independent of the choice are excluded from the decision-making process. • Nodes 4, 8, and 10 are compared against nodes 6, 12, and 14, respectively.
Continued… • The model parameter M is the number of samples from the PHT used for prediction. • Before each comparison, it is ensured that node 1 or 2, or a sample directly predicted from it, is among the previous M samples. • If the difference between the prediction errors in a pair is less than the average modeling error, that pair is discarded. • An error threshold (ETH) is used for this.
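The peer comparison can be sketched as a vote over node pairs. This sketch makes several simplifying assumptions that are not from the slides: only even-numbered (predicted) nodes are paired, as in the 4/6, 8/12, 10/14 example; a level is treated as dependent on the choice only if node 1 or 2 itself lies within the previous M samples of the path; ties default to node 1; and the error values are invented.

```python
from typing import Dict


def peer_choice(errors: Dict[int, float], levels: int, m: int, eth: float) -> int:
    """Return 1 or 2 by majority vote over peer pairs of predicted nodes."""
    votes = {1: 0, 2: 0}
    for level in range(2, levels):          # levels below nodes 1 and 2
        if level - 1 > m:                   # node 1/2 outside the last M samples
            continue
        start = 2 ** level - 1              # first node of this level
        half = 2 ** (level - 1)             # nodes per sub-tree on this level
        for i in range(start, start + half):
            if i % 2 == 1:                  # skip observed (odd) nodes
                continue
            peer = i + half                 # same position in sub-tree 2
            diff = abs(errors[i]) - abs(errors[peer])
            if abs(diff) < eth:             # within modeling error: discard
                continue
            votes[1 if diff < 0 else 2] += 1
    return 1 if votes[1] >= votes[2] else 2  # tie-break is an assumption


# Hypothetical prediction errors for the predicted nodes of a 4-level PHT.
example_errors = {4: 12, 6: 3, 8: 10, 12: 2, 10: 5, 14: 4}
choice = peer_choice(example_errors, levels=4, m=2, eth=2)
print(choice)  # 2
```

In this example the pair (10, 14) differs by less than ETH and is discarded, while (4, 6) and (8, 12) both vote for sub-tree 2.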
Hybrid with CRC • The earlier algorithms use no additional information apart from the sensor data. • A checksum function is built into the hardware. • It can be complemented with model-based error detection. • The result of the checksum is fed to the decision algorithm when it is available. • When an error is detected by the CRC, the sample is treated as missing and the value predicted by the model is used.
DATA-MODEL • How do we create the data model? • Properties of the data source are used for predictions. • What are the requirements? • Maximize the prediction accuracy. • Prediction needs to be fast. • Low computation and storage overheads. • Ex: when the data source is not strictly stationary.
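The CRC hand-off can be sketched in software. The slides describe a hardware checksum; here Python's `zlib.crc32` stands in for it, and the packet layout (payload followed by a 4-byte big-endian CRC) and function names are assumptions for illustration.

```python
import zlib
from typing import Tuple


def make_packet(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload (assumed layout: payload + 4 CRC bytes)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def crc_ok(packet: bytes) -> bool:
    payload, received = packet[:-4], int.from_bytes(packet[-4:], "big")
    return zlib.crc32(payload) == received


def accept_sample(packet: bytes, predicted: int) -> Tuple[int, bool]:
    """Return (value, crc_failed); a failed CRC is treated as a missing sample."""
    if not crc_ok(packet):
        return predicted, True            # fall back to the model's prediction
    return int.from_bytes(packet[:-4], "big"), False


good = make_packet((111).to_bytes(2, "big"))
corrupted = bytes([good[0] ^ 0x01]) + good[1:]    # flip one payload bit
print(accept_sample(good, predicted=100))         # (111, False)
print(accept_sample(corrupted, predicted=100))    # (100, True)
```

A passing CRC does not rule out errors introduced before the checksum was computed, so the model-based decision still runs on samples that pass.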
Auto-Regressive models • Used for our implementation. • Capture the effect of recent history through “aging” process. • Computationally very efficient since linear prediction functions are used. • Prediction is expressed as a linear combination of previous samples.
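A linear prediction of this kind can be sketched in a few lines: the predicted sample is a weighted sum of the most recent samples, with the first coefficient applied to the newest one. The coefficient values below are illustrative assumptions (a simple linear extrapolation), not a fitted model from the slides.

```python
from typing import Sequence


def ar_predict(history: Sequence[float], coeffs: Sequence[float]) -> float:
    """Predict the next sample as sum(coeffs[k] * history[-1 - k])."""
    order = len(coeffs)
    if len(history) < order:
        raise ValueError("history shorter than model order")
    recent_newest_first = list(history)[::-1][:order]
    return sum(a * x for a, x in zip(coeffs, recent_newest_first))


# Order-2 example: Xp[n] = 2 * X[n-1] - X[n-2] (illustrative coefficients).
xp = ar_predict([100.0, 102.0, 104.0], coeffs=[2.0, -1.0])
print(xp)  # 106.0
```

Each prediction costs one multiply-add per coefficient, which is why a low model order keeps the computation and storage overheads small.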
Implementation continued.. • Modeling is done in two parts. • Offline Analysis through Statistical properties. • Runtime updates to the model. • Offline model – compute the order of the AR model based on the correlation time & sampling rate. • Runtime Updates: • Track the prediction accuracy • Do the model update
Runtime model • Special mode of operation: estimation mode. • Sensors temporarily report the data with additional protection. • More reliable data is then available for computing updated models at the cluster heads. • Implemented by making redundant readings. • Trends in the prediction errors are continuously monitored. • A model-update request is triggered when necessary: • Stop the data gathering process. • Temporarily switch to estimation mode. • Update the model with the protected data. • Return to correction mode, i.e., the normal mode of operation.
Pros and cons… • Sensor data collected during the update mode is still transparent to the application. • Estimation mode implemented without additional hardware overhead. • Redundancy leads to increase in energy costs per bit. • Can be made efficient by sharing across multiple cluster-heads.
Model tracking and model updates • During correction mode, a running windowed average of the prediction error is maintained. • The threshold is scaled by the number of correct samples in the averaging window. • The threshold value and the size of the averaging window determine the frequency of updates. • The optimal choice depends on the characteristics of the data source and the system. • The update stage uses a minimal set of data points, as it runs in the resource-heavy estimation mode.
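One way to read the tracking rule above is sketched below: keep the last few prediction errors, ignore samples that were flagged as erroneous, and request an update when the summed error of the remaining samples exceeds the per-sample threshold times their count. The class name, the window size, the threshold, and this particular scaling are assumptions, not details given in the slides.

```python
from collections import deque
from typing import Deque, Tuple


class ModelTracker:
    """Windowed prediction-error monitor that signals when to update the model."""

    def __init__(self, window: int, threshold: float) -> None:
        self.samples: Deque[Tuple[float, bool]] = deque(maxlen=window)
        self.threshold = threshold      # allowed average error per correct sample

    def record(self, prediction_error: float, judged_correct: bool) -> None:
        self.samples.append((abs(prediction_error), judged_correct))

    def update_needed(self) -> bool:
        good = [err for err, ok in self.samples if ok]
        if not good:
            return False                # nothing reliable to judge the model by
        # Threshold scaled by the number of correct samples in the window.
        return sum(good) > self.threshold * len(good)


tracker = ModelTracker(window=4, threshold=5.0)
tracker.record(2.0, True)
tracker.record(3.0, True)
early = tracker.update_needed()         # small errors: no update requested
tracker.record(20.0, False)             # flagged as erroneous, so ignored
tracker.record(12.0, True)
tracker.record(14.0, True)
late = tracker.update_needed()          # errors have drifted upward
```

A larger window or threshold makes updates rarer, trading model freshness against the energy cost of switching to estimation mode.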
Model updates… • States can be associated with pre-computed models. • Increasing number of samples may increase accuracy, but, at the cost of additional resource overhead.
Performance evaluation • Offline mode: Peer > Min-max >> Min-err. • For dynamic model updates, the higher the error in the input, the higher the improvement attained with updates. • Correction performance is better with runtime updates than in offline mode. • Reed-Solomon coding leads to 86% overhead. • Using the CRC output can reduce errors by 50% under high-error conditions.