DIMENSIONS: Why do we need a new Data Handling architecture for sensor networks?
Deepak Ganesan, Deborah Estrin (UCLA), John Heidemann (USC/ISI)
Presenter: Vijay Sundaram
Deployment: Microclimate monitoring at James Reserve (UC Riverside)
Example queries a weather sensor network should support:
• How well does the data fit model <M> of the variation of temperature with altitude?
• Send a robotic agent to the edge between the low- and high-precipitation regions.
• "Hmm… I wonder why packet loss is so high." Get a connectivity map of the network for all transmit power settings.
• Get detailed data from the node with maximum precipitation from Sept to Dec 2003.
Goals
• Flexible spatio-temporal querying
  • Provide the ability to mine data for interesting patterns and features.
  • Drill down on details.
• Distributed, long-term networked data storage
  • Preserve the ability for long-term data mining while catering to node storage constraints.
• Performance
  • Reasonable accuracy for a wide range of queries.
  • Low communication (energy) overhead.
How can we achieve these goals?
• Exploit redundancy in data
  • Potentially huge gains from lossy compression that exploits spatio-temporal correlation.
• Exploit the rarity of interesting features
  • Preserve only the interesting features.
• Exploit the scale of the sensor network
  • Large aggregate distributed storage, despite limited local storage.
• Exploit the low cost of approximate query processing
  • Allow approximate queries that obtain sufficiently accurate responses.
Data Correlation vs. Decentralization
[Chart: systems plotted by degree of decentralization (centralized → hierarchical → fully distributed) against the data correlation exploited (none → temporal → spatial). Geo-spatial data mining and streaming media (MPEG-2) exploit spatio-temporal correlation but rely on centralized data collection; P2P systems (DHTs, Gnutella) and web caches are decentralized but exploit no data correlation. Wireless sensor networks need both.]
Can existing systems satisfy the design goals?
DIMENSIONS Design: Key Ideas
• Construct a hierarchy of lossy compressed summaries of the data using wavelet compression.
• Queries "drill down" from the root of the hierarchy to focus the search on small portions of the network.
• Progressively age lossy data along the spatio-temporal hierarchy to enable long-term storage.
[Figure: summaries become progressively lossier, and are progressively aged, from level 0 up to level 2 of the hierarchy.]
Roadmap
• Why wavelets?
• Example: precipitation hierarchy
• Spatial and temporal processing internals
• Initial results: precipitation dataset
Enabling Technique: Wavelets
• A very popular signal-processing approach that provides good time and frequency localization.
• Used in JPEG2000 and geo-spatial data mining.
• Preserves spatio-temporal features (edges, discontinuities) while providing a good approximation of long-term trends in the data.
• An efficient distributed implementation is possible.
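To make the idea concrete, here is a minimal sketch of one level of the Haar transform, the simplest wavelet filter mentioned later in the talk. The function names are illustrative, not from the DIMENSIONS implementation; the averages capture the long-term trend while the differences capture local features.

```python
def haar_decompose(signal):
    """One level of the Haar wavelet transform.

    Splits an even-length signal into coarse pairwise averages
    (approximation, the trend) and pairwise differences
    (detail coefficients, the local features/edges).
    """
    assert len(signal) % 2 == 0
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_reconstruct(approx, detail):
    """Invert one level of the Haar transform exactly."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([a + d, a - d])
    return signal
```

Dropping small detail coefficients before transmission is exactly the lossy step: the approximation still summarizes the trend, while large retained details preserve discontinuities.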
Sample Architecture: Precipitation Hierarchy
Example query: What is the maximum precipitation between Sept and Dec 2002?
• Local processing: construct a lossy time-series summary (zero communication cost).
• Spatial data processing: hierarchical lossy compression.
  • Organize the network into a hierarchy; at each higher level, reduce the number of participating nodes by a factor of 4.
  • At each step of the hierarchy, summarize the wavelet coefficients from the 4 quadrants and propagate them upward; spatial and temporal resolution decrease up the hierarchy.
• Query processing: direct the query to the quadrant whose summary best matches the query.
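The drill-down step can be sketched as follows. The tree layout (`summary`, `children`) and the scoring callback are assumptions for illustration; in DIMENSIONS the "score" would come from comparing the query against each quadrant's wavelet summary.

```python
def drill_down(node, score, max_depth):
    """Route a query down the summary hierarchy.

    `node` is a dict with a 'summary' and, for interior nodes, a
    list of four 'children' (one per quadrant). `score` ranks how
    well a summary matches the query (higher is better). At each
    level the query is forwarded only to the best-matching quadrant,
    so only a small fraction of the network is ever visited.
    """
    path = [node]
    while node.get("children") and len(path) <= max_depth:
        node = max(node["children"], key=lambda c: score(c["summary"]))
        path.append(node)
    return path  # visited nodes; the leaf holds the finest-grained data
```

For a max-precipitation query the score is simply the summarized maximum, so the query descends toward the wettest quadrant at every level.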
Spatial Decomposition
• Recursively split the network into non-overlapping square grids.
• At each level of the hierarchy:
  • Elect a clusterhead.
  • The clusterhead combines and summarizes data from the 4 quadrants.
  • The clusterhead propagates the compressed data to the next level of the hierarchy.
• Routing protocol: GPSR variant (DCS, Ratnasamy et al.)
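A minimal sketch of the recursive grid decomposition, assuming a square deployment area. The clusterhead election rule here (smallest node position per cell) is a deliberately simple stand-in for the GPSR/DCS-based election the slide refers to.

```python
def grid_cell(x, y, level, size):
    """Return the (row, col) of the non-overlapping square cell
    containing point (x, y) at a given hierarchy level.
    Level 0 is one cell covering the whole deployment of edge
    `size`; each deeper level splits every cell into a 2x2 grid."""
    cells = 2 ** level
    edge = size / cells
    return int(y // edge), int(x // edge)

def clusterheads(nodes, level, size):
    """Group nodes by cell and elect one clusterhead per cell.
    Election rule (lexicographically smallest position) is a
    placeholder for the routing-protocol-based election."""
    heads = {}
    for x, y in nodes:
        c = grid_cell(x, y, level, size)
        if c not in heads or (x, y) < heads[c]:
            heads[c] = (x, y)
    return heads
```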
Wavelet Compression Internals
Pipeline: input data (x, y, time) → wavelet subband decomposition (filter) → thresholding + quantization + subband dropping → lossless encoder → compressed output.
• Cost metric: communication budget, or error bound.
• Filters: Haar filter, or Daubechies 9/7 filter.
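The middle, lossy stage of the pipeline can be sketched as below. The parameter names (`threshold`, `step`) are illustrative; the slide notes they were hand-picked in the initial experiments.

```python
def compress_coeffs(coeffs, threshold, step):
    """Threshold, then uniformly quantize, wavelet coefficients.

    Coefficients below `threshold` in magnitude are dropped (set to
    zero); the rest are mapped to integer bins of width `step`. The
    resulting integer stream, dominated by zeros, is what a lossless
    encoder (e.g. Huffman) then compresses very effectively.
    """
    out = []
    for c in coeffs:
        out.append(0 if abs(c) < threshold else round(c / step))
    return out

def dequantize(qcoeffs, step):
    """Approximate reconstruction of the coefficients at the receiver."""
    return [q * step for q in qcoeffs]
```

Tightening the threshold or widening the quantization step trades accuracy for communication, which is how the cost metric (budget or error bound) steers the pipeline.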
Initial Results with Precipitation Dataset: Communication Overhead
• 15x12 grid (50 km edge) of precipitation data from 1949–1994 for the Pacific Northwest†; gridded before processing.
• Hand-picked choice of threshold, quantization intervals, and subbands to drop; Huffman encoder at the output.
• Very large compression ratios up the hierarchy.
†M. Widmann and C. Bretherton. 50 km resolution daily precipitation for the Pacific Northwest, 1949–94.
Query: Find the maximum annual precipitation for each year.
• Exact answer for 89% of queries; within 90% of the exact answer for >95% of queries.
• Queries require less than 3% of the network.
• Good performance on average with very low lookup overhead.
Query: Locate the boundary in annual precipitation between low- and high-precipitation areas.
• Error metric: number of nodes more than 1 pixel from the drill-down boundary.
• Accuracy: within 25% error for 93% of the queries (or within 13% error for 75% of the queries).
• Less than 5% of the network queried.
Open Issues
• Load balancing and robustness
  • Hierarchical model vs. peer model: much prior work in P2P systems.
• Irregular node placement
  • Use wavelet extensions for irregular node placement (computationally more expensive), or
  • Grid the dataset with interpolation.
• Providing query guarantees
  • Can we bound the error in the response obtained for a drill-down query at a particular level of the hierarchy?
• Implementation on an iPAQ/mote network
Summary
• DIMENSIONS provides a holistic data-handling architecture for sensor networks that can:
  • Support a wide range of sensor-network usage and query models (drill-down querying of wavelet summaries).
  • Provide a gracefully degrading lossy storage model (progressively ageing summaries).
  • Offer the ability to trade energy expended against query performance (tunable lossy compression).
Other Examples: Packet Loss
• A different example of a dataset that exhibits spatial correlation:
  • Throughput from one transmitter to proximate receivers is correlated.
  • Throughput from multiple proximate transmitters to one receiver is correlated.
• Typically, what we want to query is the deviation from normal and average throughput.
Packet-Loss Dataset: Get a Throughput vs. Distance Map
• An exact answer involves the expensive transfer of a 12x14 map from each node.
• Good approximate results can be obtained by querying the compressed data instead.
Long-term Storage: Concepts
• Data is progressively aged, both locally and along the hierarchy.
• Summaries that cover larger areas and longer time periods are retained much longer than the raw time series; wavelet coefficients age more slowly than raw data.
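One way to sketch such an ageing schedule: retain each summary for a period that grows with its level in the hierarchy, so coarse wide-area summaries outlive the raw time series. The base period and growth factor below are illustrative assumptions, not values from the paper.

```python
def retention_days(level, base=30, factor=4):
    """Retention period for a summary at a given hierarchy level.

    Raw time series (level 0) is aged out first; each higher level,
    covering 4x the area, is kept `factor` times longer. `base` and
    `factor` are hypothetical tuning knobs for this sketch.
    """
    return base * factor ** level

def evict(store, now):
    """Drop entries whose retention period has expired.
    `store` maps (level, creation_day) -> summary."""
    return {k: v for k, v in store.items()
            if now - k[1] < retention_days(k[0])}
```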
Load Balancing and Robustness: Concepts
• Hierarchical model
  • Naturally fits wavelet processing.
  • Strict hierarchies are vulnerable to node failures; failures near the root of the hierarchy can be expensive to repair.
• Decentralized peer model
  • Summaries are communicated to multiple nodes probabilistically.
  • Better robustness, but greater communication overhead.
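The peer model's probabilistic replication can be sketched as below. The uniform-random peer selection is an assumption for illustration; the slide leaves the actual selection policy as an open design question, and the trade-off is visible directly: robustness grows with `k`, and so does the number of transmissions.

```python
import random

def replicate(summary, peers, k, rng=None):
    """Send a summary to k peers chosen uniformly at random
    (a simple stand-in for the probabilistic peer model).
    Returns the map of peer -> stored copy; its size is also
    the communication cost in transmissions."""
    rng = rng or random.Random()
    return {p: summary for p in rng.sample(peers, k)}
```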