Chong Liu, Kui Wu and Jian Pei Computer Science Dept. University of Victoria (UVIC) Canada.

A Dynamic Clustering and Scheduling Approach to Energy Saving in Data Collection from Wireless Sensor Networks Chong Liu, Kui Wu and Jian Pei Computer Science Dept. University of Victoria (UVIC) Canada. From IEEE Sensor and Ad Hoc Communications and Networks (SECON 2005)

Outline • Introduction • Problem Formulation • The Energy Efficient Data Collection Framework (EEDC) • Performance Evaluation • Conclusion

Introduction • Coverage-based scheduling methods for energy saving in wireless sensor networks • Sensing coverage • Uniform scheduling • Readings of two proximate nodes may be dissimilar • Presented papers: [1], [2], [3]

Introduction • Scheduling from the data viewpoint • Suppose the time series of sensor nodes x, y, and z have been very similar in the past. • We may conjecture that the readings of x, y, and z are also likely to be similar in the future. • Therefore, we can let one of the three sensor nodes report the data and put the other two to sleep, instead of having all three report data simultaneously.

Problem Formulation • Problem • How can the sink node collect data from highly redundant, geographically distributed sensor nodes with high observation fidelity and low energy consumption? • Idea • Dynamically group sensor nodes into a set of disjoint clusters such that the sensor nodes within a single cluster currently have strong spatial correlation.

The Energy Efficient Data Collection Framework (EEDC)

The Energy Efficient Data Collection Framework (EEDC) • The data storage module • It stores all sampling data received from the sensor nodes. This module records a time series for each sensor. • The dissimilarity measure module • It calculates the pairwise dissimilarity measure of time series.

The Energy Efficient Data Collection Framework (EEDC) • The clustering module • This module divides the sensor nodes into clusters, such that the dissimilarity measure of any two sensor nodes within a cluster is less than max_dst. • The sensor node working schedule generator module • It generates a working schedule for each sensor node based on the clusters obtained from the clustering module.
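
The schedule generator can be illustrated with a minimal sketch. The slides do not spell out the scheduling policy, so the sketch assumes a simple round-robin policy in which the members of a cluster split one scheduling round evenly and exactly one member per cluster is awake at any time. The function name `round_robin_schedule` and the parameter `round_time` are hypothetical.

```python
from typing import Dict, Hashable, List, Tuple


def round_robin_schedule(
    clusters: List[List[Hashable]], round_time: float
) -> Dict[Hashable, Tuple[float, float]]:
    """Assign each node a (start, end) working shift within one round.

    Assumption (not from the slides): the nodes of a cluster share the
    round equally, in list order, so one node per cluster is working at
    any moment and the others can sleep.
    """
    schedule: Dict[Hashable, Tuple[float, float]] = {}
    for members in clusters:
        shift = round_time / len(members)  # equal shift length per member
        for k, node in enumerate(members):
            schedule[node] = (k * shift, (k + 1) * shift)
    return schedule


if __name__ == "__main__":
    # Two clusters: three mutually similar nodes, and one singleton.
    print(round_robin_schedule([["a", "b", "c"], ["d"]], round_time=60.0))
```

Under this assumption, a node in a cluster of size k works for only 1/k of each round, which is where the energy saving comes from.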

Data Collection Procedure • [Figure: the data collection procedure, with steps labeled (1) to (4)]

Clustering Sensor Nodes • Given the pairwise dissimilarity values, how do we partition the sensor nodes into exclusive clusters • such that within each cluster, the pairwise dissimilarity measure of the sensor nodes is below a given intra-cluster dissimilarity threshold max_dst? • This problem is equivalent to the clique-covering problem. • NP-complete

• (1) A cluster is a clique in the graph. • (2) The clustering problem is to use the minimum number of cliques to cover all vertices in the graph.
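
Because minimum clique cover is NP-complete, a heuristic is needed in practice. The sketch below is one generic greedy heuristic, not necessarily the algorithm used in the paper: it seeds a clique with the first unassigned node and adds every remaining node whose dissimilarity to all current members is below max_dst. The function name `greedy_clique_cover` and the matrix input format are assumptions.

```python
from typing import List, Sequence


def greedy_clique_cover(
    dissim: Sequence[Sequence[float]], max_dst: float
) -> List[List[int]]:
    """Partition nodes 0..n-1 into cliques of the 'similar enough' graph.

    Two nodes are adjacent when dissim[i][j] < max_dst. The result is a
    valid clique cover, but not guaranteed to use the minimum number of
    cliques (this is a heuristic sketch).
    """
    unassigned = list(range(len(dissim)))
    clusters: List[List[int]] = []
    while unassigned:
        clique = [unassigned.pop(0)]  # seed with the first free node
        for node in list(unassigned):  # iterate over a copy while removing
            if all(dissim[node][member] < max_dst for member in clique):
                clique.append(node)
                unassigned.remove(node)
        clusters.append(clique)
    return clusters


if __name__ == "__main__":
    d = [
        [0, 1, 5, 5],
        [1, 0, 5, 1],
        [5, 5, 0, 1],
        [5, 1, 1, 0],
    ]
    print(greedy_clique_cover(d, max_dst=2))
```

Every pair inside a returned cluster satisfies the threshold by construction, since a node joins only after being checked against all current members.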

Greedy Algorithm

Scheduling Sensor Nodes

Dynamic Adjustment • The environment being monitored by the sensor network might change, and thus the previous clusters might not be valid any more. • accommodate environment changes and dynamically adjust the clusters.

Dynamic Adjustment

Performance Evaluation • Environment Setting • 18 MICA2 sensor nodes • in a 3 × 6 grid layout on a big table • Measured quantity: light intensity • Sampling rate: 2 samples per second

Dissimilarity Measurement

Clustering Criteria • Separate two time series into different groups if any of the following constraints is violated: 1) They have a small average difference in magnitude (magnitude) 2) They have the same trend most of the time (trend) 3) They are geographically close (gmax_dst)
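
The three criteria can be sketched as a single pairwise check. The exact formulas are not given in the slides, so the definitions here are assumptions: magnitude is the mean absolute difference of the readings, trend is the fraction of consecutive steps in which both series move in the same direction, and geographic closeness is Euclidean distance. The defaults m = 30, t = 0.95, and gmax_dst = 3 mirror the values reported in the evaluation; the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def _sign(v: float) -> int:
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def can_share_cluster(
    x: Sequence[float],
    y: Sequence[float],
    pos_x: Tuple[float, float],
    pos_y: Tuple[float, float],
    m: float = 30.0,
    t: float = 0.95,
    gmax_dst: float = 3.0,
) -> bool:
    """True only if all three (assumed) criteria hold for the pair."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    # 1) Magnitude: mean absolute difference must not exceed m.
    mean_diff = sum(abs(a - b) for a, b in zip(x, y)) / len(x)
    # 2) Trend: share of steps where both series move the same way.
    steps = len(x) - 1
    same = sum(
        _sign(x[i + 1] - x[i]) == _sign(y[i + 1] - y[i]) for i in range(steps)
    )
    # 3) Geography: Euclidean distance between the node positions.
    distance = math.hypot(pos_x[0] - pos_y[0], pos_x[1] - pos_y[1])
    return mean_diff <= m and same / steps >= t and distance <= gmax_dst


if __name__ == "__main__":
    a = [100, 110, 120, 115]
    b = [105, 118, 125, 121]
    print(can_share_cluster(a, b, (0.0, 0.0), (1.0, 1.0)))
```

A boolean check like this can be turned into the graph used for clustering by connecting every pair of nodes for which it returns true.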

The Correctness of Clustering with EEDC • m = 30, t = 95%, gmax_dst = 3 feet.

The Correctness of Clustering with EEDC

The Observation Fidelity with EEDC • The joint entropy of the total 18 sensor nodes’ observations H(S1, S2, ..., S18)

The Observation Fidelity with EEDC • If H(Ci) denotes the joint entropy of all sensor nodes belonging to cluster Ci, then

The Observation Fidelity with EEDC • Let X_i denote the current working node of cluster C_i with EEDC. • The expectation of the joint entropy of the working nodes, H(X_1, X_2, ..., X_7), is the information gathered with EEDC on average.
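
As a rough illustration of how a joint entropy could be estimated from collected samples, the sketch below computes the empirical joint entropy, in bits, of several time series. It assumes the readings have already been discretized into symbols; how the paper quantizes the light readings is not stated in the slides, and the function name `joint_entropy` is hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def joint_entropy(series: Sequence[Sequence[Hashable]]) -> float:
    """Empirical joint entropy (bits) of equal-length discrete series.

    Each time step contributes one joint observation: the tuple of all
    nodes' symbols at that step. Probabilities are relative frequencies.
    """
    joint = list(zip(*series))  # one tuple per time step
    n = len(joint)
    counts = Counter(joint)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    node_a = [0, 1, 0, 1]
    node_b = [0, 1, 0, 1]  # perfectly redundant copy of node_a
    node_c = [0, 0, 1, 1]  # independent of node_a in this sample
    print(joint_entropy([node_a]))
    print(joint_entropy([node_a, node_b]))
    print(joint_entropy([node_a, node_c]))
```

In the usage example, adding the redundant node leaves the joint entropy unchanged, which is the intuition behind letting a single working node stand in for a cluster of strongly correlated nodes.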

The Observation Fidelity with EEDC

Energy Saving • Since the extra working time of each sensor after its working shift, Δt, is far less than one round of scheduling time, T, the energy cost during Δt can be safely ignored. • Energy saving: • Without EEDC, each sensor spends on average three times more energy on sampling and data transmission.

Large-scale Synthetic Data Generation • We utilized the software toolkit provided by [11] to extract the model parameters from small-scale real datasets and generate large-scale synthetic datasets based on the model parameters. • [11] A. Jindal and K. Psounis, “Modeling Spatially-correlated Sensor Network Data,” Proceedings of Sensor and Ad Hoc Communications and Networks 2004 (SECON2004), Santa Clara, CA, October 2004.

The Field with Nine Distinguished Subregions

The Observation Fidelity with EEDC • We use another performance metric, the difference distortion measure

The Observation Fidelity with EEDC • Normalized difference distortion measure

Energy Saving

Conclusion • This paper designs an Energy Efficient Data Collection (EEDC) framework that utilizes the spatial correlation to group sensor nodes into disjoint clusters. • Since the clusters are based on the features of sampling data, scheduling based on the clusters is much more accurate than scheduling based purely on the sensing range of sensor nodes.
