A Statistics-Based Sensor Selection Scheme for Continuous Probabilistic Queries in Sensor Networks • Song Han (1), Edward Chan (1), Reynold Cheng (2), and Kam-Yiu Lam (1) • (1) Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong • (2) Department of Computing, Hong Kong Polytechnic University, PQ706, Mong Man Wai Building, Hung Hom, Kowloon, Hong Kong
Agenda • Introduction • Objective • System Model • Methodology • Performance Analysis • Conclusion
Introduction • Constantly-evolving Environment • Uncertainty of Sensor Data • Sensor Data are erroneous, unreliable and noisy • Database may store inaccurate values • Query results can be incorrect
Introduction • Statistical Model of Sensor Uncertainty • A sensor value can be described more accurately as a Gaussian distribution N(µ, σ²) • Mean µ • Variance σ²
Introduction • Probabilistic Queries [SIGMOD03] • Represent the imprecision in the value of the data as a probability density function. e.g., Gaussian • Augment query answers with probabilities • Give us a correct (possibly less precise) answer, instead of a potentially incorrect answer
Introduction • Query Quality and Variance • Query quality can be improved with lower variance • To obtain a smaller σ², a simple idea is to use more sensors and average their readings • N(µ, σ²) becomes N(µ, σ²/n_s), where n_s is the number of “redundant” sensors
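The variance reduction from averaging can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes the n_s readings are independent and share the same variance σ², and the function names `averaged_variance` and `simulate_averaged_variance` are hypothetical.

```python
import random
import statistics


def averaged_variance(sigma2: float, n_s: int) -> float:
    """Variance of the mean of n_s independent readings, each N(mu, sigma2)."""
    if n_s < 1:
        raise ValueError("n_s must be at least 1")
    return sigma2 / n_s


def simulate_averaged_variance(mu, sigma2, n_s, trials=5000, seed=0):
    """Empirical variance of the averaged reading over many simulated rounds."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sigma = sigma2 ** 0.5
    means = [
        sum(rng.gauss(mu, sigma) for _ in range(n_s)) / n_s
        for _ in range(trials)
    ]
    return statistics.pvariance(means)
```

Calling `simulate_averaged_variance(25.0, 4.0, 10)` should give a value close to `averaged_variance(4.0, 10)`, which illustrates why adding redundant sensors tightens the distribution of the averaged reading.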
Introduction • Deploying Redundant Sensors • Exploit the fact that sensors are cheap • Example: 1000 sensors in the room to obtain average temperature • Variance decreased by a factor of 1000 • Resource Limitation Problem • Wireless network has limited bandwidth • Sensors have limited battery power • Can’t afford too many sensors!
Introduction • The Sensor Selection Problem • How should the sensors’ sampling period be decided? • How many sensors are needed to guarantee the required level of query quality? • Which sensors should be selected?
Objective • Adaptive Sampling Period Decision Scheme • Find the maximum variance each monitored entity can have while still meeting the probabilistic query quality requirement • Select the minimum number of “good” sensors to achieve the required variance • Decide which sensors should be selected
System Model • [Figure: a User, a Base Station, a Wireless Network, and four sensor regions]
System Model • [Figure: a User, a Base Station, and a coordinator]
Methodology • Adaptive Sampling Period Decision • Sensor Selection Process 1. Obtain (µ, max σ²) from the sensors in each region 2. Derive max σ² for each item to satisfy the quality requirement 3. Determine the sensor nodes to be used
Adaptive Sampling Period Decision • The region’s value changes continuously • Periodic sampling consumes excessive system resources • Adaptive Sampling Scheme for MAX/MIN queries • ESSENCE: increase the sampling period for the regions whose values have little effect on the query result.
Adaptive Sampling Period Decision • Adaptive Sampling Scheme for MAX/MIN queries • Predicted Sampling Time (PST)
Sensor Selection Process • Types of Probabilistic Queries • Factors Affecting Query Quality • Probabilistic Query Quality • An Example: MAX Query • Reselection of Sensors for Continuous Queries
Types of Probabilistic Queries • MAX/MIN: Which region has max or min temperature? (A, 60%), (B, 30%), (C, 10%) • AVG/SUM: What is the average temperature of regions A, B and C? • Range Count: How many objects are within 50m from me?
Factors Affecting Query Quality • Error distribution of each sensor reading • Variance of the Gaussian distribution • Each query type has its own correctness requirement 1. MAX/MIN 2. AVG/SUM 3. Range Count
Probabilistic Query Quality • Probabilistic queries allow specification of answer quality 1. MIN/MAX: highest probability ≥ P 2. AVG/SUM: variance of answer ≤ T 3. Range count: Top K counts contribute total probability ≥ P
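The three quality requirements can be expressed as simple predicates. This is an illustrative sketch only: the function names are hypothetical, the AVG/SUM variance formulas assume the per-region estimates are independent, and the range-count check assumes "Top K" means the K most probable count values.

```python
def max_min_quality_met(probs, P):
    """MIN/MAX: the highest answer probability must reach the threshold P."""
    return max(probs) >= P


def sum_quality_met(variances, T):
    """SUM: variance of a sum of independent estimates is the sum of variances."""
    return sum(variances) <= T


def avg_quality_met(variances, T):
    """AVG: variance of the mean of n independent estimates is sum / n**2."""
    n = len(variances)
    return sum(variances) / (n * n) <= T


def range_count_quality_met(count_probs, K, P):
    """Range count: the K most probable counts must cover probability >= P.

    count_probs maps a candidate count (e.g. 3 objects) to its probability.
    """
    top_k = sorted(count_probs.values(), reverse=True)[:K]
    return sum(top_k) >= P
```

For the MAX/MIN answer (A, 60%), (B, 30%), (C, 10%), `max_min_quality_met([0.6, 0.3, 0.1], 0.95)` is false, so the variances would have to be reduced before the answer is acceptable.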
Example: MAX Query • Let p_i be the probability that the i-th region holds the maximum value, where f_i(s) is the pdf of N(µ_i, σ_i²) • Quality requirement: the maximum of the p_i must be at least P
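One way to compute these probabilities, assuming the regions' values are independent Gaussians, is the standard expression p_i = ∫ f_i(s) ∏_{j≠i} F_j(s) ds, where F_j is the cdf of region j. The sketch below evaluates it with a midpoint rule using only the Python standard library; the function name `max_probabilities`, the integration range of 8 standard deviations, and the step count are assumptions made for illustration, not details from the slides.

```python
from statistics import NormalDist


def max_probabilities(mus, sigmas, steps=4000):
    """Return p_i = P(region i holds the maximum) for independent Gaussians.

    mus and sigmas are per-region means and standard deviations (sigma > 0).
    """
    dists = [NormalDist(m, s) for m, s in zip(mus, sigmas)]
    # Integrate over a range that covers essentially all probability mass.
    lo = min(m - 8 * s for m, s in zip(mus, sigmas))
    hi = max(m + 8 * s for m, s in zip(mus, sigmas))
    h = (hi - lo) / steps
    probs = [0.0] * len(dists)
    for k in range(steps):
        s = lo + (k + 0.5) * h  # midpoint of the k-th sub-interval
        cdfs = [d.cdf(s) for d in dists]
        for i, d in enumerate(dists):
            others = 1.0
            for j, c in enumerate(cdfs):
                if j != i:
                    others *= c  # all other regions are below s
            probs[i] += d.pdf(s) * others * h
    return probs
```

The quality requirement then amounts to checking `max(max_probabilities(mus, sigmas)) >= P`.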
Finding variance for MAX 1. Set the variance of each region (σ_1, σ_2, …, σ_n) to its maximum possible value 2. Find p_imax, the maximum of the p_i 3. Find j_max, the index of the sensor whose variance has the greatest impact on p_imax
Finding variance for MAX (Cont.) 4. Adjust the variance of the j_max-th sensor: σ_jmax = σ_jmax − ∆σ 5. Keep reducing variances until p_imax(σ_1, σ_2, …, σ_n) ≥ P 6. Return σ_1, σ_2, …, σ_n as the variances for the n regions
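Steps 1 to 6 can be sketched as a greedy loop. This is a hedged illustration rather than the authors' implementation: the slides do not give the exact impact measure, so the code assumes "greatest impact" means the largest gain in p_imax when σ_j is reduced by ∆σ. The names `max_probabilities`, `find_variances_for_max`, and the lower bound `sigma_floor` are hypothetical, and regions are assumed independent.

```python
from statistics import NormalDist


def max_probabilities(mus, sigmas, steps=1000):
    """p_i = P(region i is the maximum), by midpoint-rule integration."""
    dists = [NormalDist(m, s) for m, s in zip(mus, sigmas)]
    lo = min(m - 8 * s for m, s in zip(mus, sigmas))
    hi = max(m + 8 * s for m, s in zip(mus, sigmas))
    h = (hi - lo) / steps
    probs = [0.0] * len(dists)
    for k in range(steps):
        s = lo + (k + 0.5) * h
        cdfs = [d.cdf(s) for d in dists]
        for i, d in enumerate(dists):
            others = 1.0
            for j, c in enumerate(cdfs):
                if j != i:
                    others *= c
            probs[i] += d.pdf(s) * others * h
    return probs


def find_variances_for_max(mus, sigma_max, P, delta=0.3, sigma_floor=0.05):
    """Greedily shrink sigmas from their maxima until max p_i >= P."""
    sigmas = list(sigma_max)  # step 1: start from the maximum values
    while True:
        probs = max_probabilities(mus, sigmas)
        i_max = max(range(len(probs)), key=probs.__getitem__)  # step 2
        if probs[i_max] >= P:  # step 5: stop once quality is met
            return sigmas      # step 6
        # Step 3 (assumed impact measure): gain in p_imax per delta reduction.
        best_j, best_gain = None, float("-inf")
        for j in range(len(sigmas)):
            if sigmas[j] - delta < sigma_floor:
                continue  # this sigma cannot be reduced any further
            trial = sigmas[:]
            trial[j] -= delta
            gain = max_probabilities(mus, trial)[i_max] - probs[i_max]
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:
            raise ValueError("quality requirement cannot be met")
        sigmas[best_j] -= delta  # step 4
```

The default `delta=0.3` mirrors the Variance Change Step used in the simulation model; a smaller step yields variances closer to the true maximum at the cost of more iterations.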
Deciding Set of Sensors • The mean of n_s samples follows the normal distribution N(µ, σ²/n_s) • Compute n_s satisfying σ²/n_s ≤ max variance • Compute the expected value E(s) • Select the n_s sensors whose readings differ least from E(s) • Only these sensors send their sampled values to the coordinator for computing N(µ, σ²/n_s)
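The selection step can be sketched as follows. This is an illustration under stated assumptions: E(s) is estimated as the plain average of all readings in the region, and the function names `sensors_needed` and `select_sensors` as well as the dictionary layout are hypothetical.

```python
import math


def sensors_needed(sigma2, max_variance):
    """Smallest n_s with sigma2 / n_s <= max_variance."""
    return max(1, math.ceil(sigma2 / max_variance))


def select_sensors(readings, sigma2, max_variance):
    """Pick the n_s sensors whose readings are closest to E(s).

    readings maps a sensor id to its latest sampled value.
    """
    n_s = sensors_needed(sigma2, max_variance)
    if n_s > len(readings):
        raise ValueError("not enough sensors to reach the required variance")
    # Assumption: E(s) is estimated by the average of all readings.
    expected = sum(readings.values()) / len(readings)
    ranked = sorted(readings, key=lambda sid: abs(readings[sid] - expected))
    return ranked[:n_s]
```

For example, with σ² = 4.0 and a maximum allowed variance of 0.5, `sensors_needed` asks for 8 sensors, since 4.0 / 8 = 0.5.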
Reselection of Sensors for CQ • Sensor selection runs again when: 1. Probabilistic query quality cannot be met (e.g., due to change of mean) 2. Coordinator detects some sensor is faulty (e.g., its value deviates significantly from the majority) or gives no response after some timeout period
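The two reselection triggers can be sketched as a small check run by the coordinator. The slides do not specify how "deviates significantly from the majority" is measured, so this sketch assumes a fixed threshold `max_deviation` around the median reading, and it represents a timed-out sensor as `None`. All names here are hypothetical.

```python
from statistics import median


def faulty_sensors(readings, max_deviation):
    """Return ids of sensors that timed out or deviate from the median.

    readings maps a sensor id to its value, or to None if the sensor gave
    no response before the timeout.
    """
    values = [v for v in readings.values() if v is not None]
    if not values:
        return set(readings)  # nobody answered: every sensor is suspect
    centre = median(values)   # assumed stand-in for "the majority"
    return {
        sid
        for sid, v in readings.items()
        if v is None or abs(v - centre) > max_deviation
    }


def needs_reselection(quality_met, readings, max_deviation):
    """Rerun sensor selection if quality is lost or any sensor looks faulty."""
    return (not quality_met) or bool(faulty_sensors(readings, max_deviation))
```

A fixed threshold cannot tell a faulty sensor from one reporting a genuine surprising event, which is the distinction listed under Future Work.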
Simulation Model • Continuous query length: 1000 sec • Sensor sampling interval: 5 sec • Number of regions: 4 • Number of sensors per region: U [100,150] • Sensor error variance range: 5-25% • Difference in the values of different regions: 2-10% • Quality requirement for MIN/MAX Query : 95% • Variance Change Step (∆σ): 0.3
Performance Analysis • [Figure: Accuracy vs. Difference in Regions’ Values] • [Figure: Percentage of Sensors Selected vs. Difference in Regions’ Values]
Performance Analysis • [Figure: Accuracy vs. Sensor Error Variance Percentage] • [Figure: Percentage of Sensors Selected vs. Sensor Error Variance]
Performance Analysis • [Figure: Percentage of Sensors Selected over Time for Continuous Changes in Values of Regions] • [Figure: Changes in Value of Regions over Time]
Conclusion • Accuracy is improved by using multiple sensors • Adaptive Sampling Period Decision Scheme • Limited network bandwidth allows only a limited number of redundant sensors • The sensor selection algorithm selects good sensors for reliable readings
Future Work • Region Selection • Reducing the computational complexity of the sensor selection process • Differentiating bad sensors from “good ones” that report true surprising events • Hierarchical organization of coordinators • How to assign coordinators?
References 1. [VSSN04] K. Y. Lam, R. Cheng, B. Y. Liang, and J. Chau. Sensor Node Selection for Execution of Continuous Probabilistic Queries in Wireless Sensor Networks. In Proc. of ACM 2nd Intl. Workshop on Video Surveillance and Sensor Networks, Oct. 2004. 2. [SIGMOD03] R. Cheng, D. Kalashnikov, and S. Prabhakar. Evaluating Probabilistic Queries over Imprecise Data. In Proc. of ACM SIGMOD, June 2003. 3. [Mobihoc04] D. Niculescu and B. Nath. Error characteristics of adhoc positioning systems. In Proc. of ACM Mobihoc 2004, Tokyo, Japan, May 2004. 4. [WSNA03] E. Elnahrawy and B. Nath. Cleaning and Querying Noisy Sensors. In Proc. of ACM WSNA’03, San Diego, California, Sept. 2003.
Thank you! HAN Song han_song@cs.cityu.edu.hk