The Design of an Acquisitional Query Processor for Sensor Networks CS851 Presentation 2005 Presented by: Gang Zhou University of Virginia
Outline • Application Structure & Design Goals • Acquisitional Query Language • Power-Aware Optimization • Power Sensitive Dissemination and Routing • Processing Queries • Conclusions and Future Work • Discussion
Application Structure • Queries submitted in PC • Parsed, optimized in PC • Disseminated and processed in network • Results flow back through the routing tree
Design Goals • Provide a query processor-like interface to sensor networks • Use acquisitional techniques to reduce power consumption compared to traditional passive systems
How? • What is meant by acquisitional techniques? • Where, when, and how often data is acquired and delivered to query processing operators • Four related questions • When should samples be taken? • Which sensors have relevant data? • In what order should samples be taken? • Is it worth processing and relaying samples?
What’s the big deal? • Radio is expensive • Sensing takes significant energy • Four Energy Levels: • Snoozing • Processing • Processing and receiving • Transmitting
Roadmap • Application Structure & Design Goals • Acquisitional Query Language • Power-Aware Optimization • Power Sensitive Dissemination and Routing • Processing Queries • Conclusions and Future Work • Discussion
An Acquisitional Query Language • SQL-like queries in the form of SELECT-FROM-WHERE SELECT nodeid, light, temp FROM sensors SAMPLE INTERVAL 1s FOR 10s • Sensors viewed as a single table • Unbounded, continuous data stream of values • Columns are sensor data • Rows are individual sensors
Why Windows? • Sensors table is an unbounded, continuous data stream • Operations such as sort and symmetric join are not allowed on streams • They are allowed on bounded subsets of the stream (windows)
Windows • Windows in TinyDB are fixed-size materialization points over sensor streams. • Materialization points can be used in queries • Example:
CREATE STORAGE POINT recentlight SIZE 8 AS (SELECT nodeid, light FROM sensors SAMPLE INTERVAL 10s)
SELECT COUNT(*) FROM sensors AS s, recentlight AS r1 WHERE r1.nodeid = s.nodeid AND s.light < r1.light SAMPLE INTERVAL 10s
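A storage point behaves roughly like a fixed-size buffer that the second query joins against. The following is a minimal sketch of that idea, not TinyDB's implementation: the names `StoragePoint` and `count_matches` are hypothetical, and the readings are made up.

```python
from collections import deque

class StoragePoint:
    """Hypothetical fixed-size window over a stream of (nodeid, light) rows."""

    def __init__(self, size):
        self.rows = deque(maxlen=size)  # the oldest row is dropped once full

    def insert(self, nodeid, light):
        self.rows.append((nodeid, light))

def count_matches(storage_point, nodeid, light):
    """Count buffered rows from the same node whose light exceeds the new reading.

    Mirrors the join predicate r1.nodeid = s.nodeid AND s.light < r1.light.
    """
    return sum(1 for r_id, r_light in storage_point.rows
               if r_id == nodeid and light < r_light)

recentlight = StoragePoint(size=8)
for reading in [(1, 50), (1, 70), (2, 90), (1, 30)]:  # made-up readings
    recentlight.insert(*reading)

matches = count_matches(recentlight, nodeid=1, light=40)
```

Because the buffer is bounded, the join always runs over at most `SIZE` rows, which is what makes it feasible on a stream.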
Temporal Aggregation • Why Aggregation? • Reduce the quantity of data that must be transmitted through the network • Example: SELECT WINAVG (volume, 30s, 5s) FROM sensors SAMPLE INTERVAL 1s • Report the average volume over the last 30 seconds once every 5 seconds, sampling once per second • How about spatial aggregation or spatial-temporal aggregation? • It’s hard; needs communication; depending on routing tree…
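The WINAVG semantics can be sketched as a sliding-window average. This is an illustrative sketch only: the function name `winavg` and its parameters are assumptions, windows are counted in samples rather than seconds (one sample per interval), and reports issued before the window has filled average over whatever is available.

```python
from collections import deque

def winavg(samples, window, slide):
    """Yield the mean of the last `window` samples once every `slide` samples.

    With one sample per second, window=30 and slide=5 would correspond to
    WINAVG(volume, 30s, 5s) at SAMPLE INTERVAL 1s.
    """
    buf = deque(maxlen=window)
    for i, value in enumerate(samples, start=1):
        buf.append(value)
        if i % slide == 0:
            yield sum(buf) / len(buf)

readings = list(range(1, 11))  # ten made-up volume readings
averages = list(winavg(readings, window=4, slide=2))
```

Only one value per slide leaves the node, instead of one per sample, which is where the transmission saving comes from.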
Event-Based Queries • An alternative to continuous polling for data • Example: ON EVENT bird-detector(loc): SELECT AVG(light), AVG(temp), event.loc FROM sensors AS s WHERE dist(s.loc, event.loc) < 10m SAMPLE INTERVAL 2s FOR 30s • Currently, events are only signaled on the local node. • How about a fully distributed event propagation system? • What is the gain? • What is the cost?
Lifetime-Based Queries • Example: SELECT nodeid, accel FROM sensors LIFETIME 30 days • The query specifies that the network should • Run for at least 30 days • Sample the light and acceleration sensors as quickly as possible while still meeting the lifetime goal
Lifetime-Based Queries • Nodes perform cost-based analysis in order to determine data rate for each node ???
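The slide gives no detail on the cost-based analysis, so the following is only a back-of-the-envelope sketch of how a node might turn a lifetime goal into a sample rate. The function `max_sample_rate`, its parameters, and all energy figures are hypothetical and are not taken from the paper.

```python
def max_sample_rate(battery_joules, lifetime_hours,
                    idle_joules_per_hour, joules_per_sample):
    """Return the highest samples-per-hour rate that still meets the lifetime goal.

    Assumes energy use is linear: a fixed idle cost per hour plus a fixed
    cost for each sample taken and delivered.
    """
    budget_per_hour = battery_joules / lifetime_hours
    spare = budget_per_hour - idle_joules_per_hour
    if spare <= 0:
        return 0.0  # the goal cannot be met even without sampling
    return spare / joules_per_sample

# Hypothetical numbers for a 30-day lifetime goal.
rate = max_sample_rate(battery_joules=20000.0, lifetime_hours=30 * 24,
                       idle_joules_per_hour=10.0, joules_per_sample=0.05)
```

A real node would also have to re-estimate the remaining battery energy over time, which is what the voltage readings on the next slide are used for.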
Lifetime-Based Queries • Tested a mote with a 24-week query • Sample rate was 15.2 seconds per sample • Took 9 voltage readings over 12 days • Is it reasonable to drop the first two data points? • Is it reasonable to use data from the first 12 days to fit a line that covers 168 days?
Roadmap • Application Structure & Design Goals • Acquisitional Query Language • Power-Aware Optimization • Power Sensitive Dissemination and Routing • Processing Queries • Conclusions and Future Work • Discussion
Power-Aware Optimization • Where? • Queries optimized by base station before dissemination • Why? • Cost-based optimization to yield lowest overall power consumption • Cost dominated by sampling and transmitting • How? • Optimizer focuses on ordering joins, selections, and sampling on individual nodes
Reordering Sampling and Predicates • Consider the query: SELECT accel, mag FROM sensors WHERE accel > c1 AND mag > c2 SAMPLE INTERVAL 1s • Three options • Measure accel and mag; then apply the selection • Measure mag; filter; then measure accel • Measure accel; filter; then measure mag • The first option is always more expensive. • The second option is more expensive than the third when S_accel is more selective than S_mag. • The second option can be cheaper if S_mag is highly selective.
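The comparison between the three options comes down to expected energy per sampling epoch: the second sensor is only sampled when the first predicate passes. The sketch below works through that arithmetic. The function names are hypothetical, and the cost and pass-probability values are made-up units chosen so that mag is much more expensive to sample than accel.

```python
def expected_cost(first_cost, first_pass_prob, second_cost):
    """Expected energy per epoch when the second sensor is sampled only if
    the first predicate passes."""
    return first_cost + first_pass_prob * second_cost

def best_order(cost_accel, pass_accel, cost_mag, pass_mag):
    """Return the cheapest plan name and the expected cost of every plan."""
    plans = {
        "sample both, then filter": cost_accel + cost_mag,
        "mag first": expected_cost(cost_mag, pass_mag, cost_accel),
        "accel first": expected_cost(cost_accel, pass_accel, cost_mag),
    }
    return min(plans, key=plans.get), plans

# Hypothetical costs: accel = 1 unit, mag = 10 units.
name_a, plans_a = best_order(1.0, 0.5, 10.0, 0.5)    # equal selectivities
name_b, plans_b = best_order(1.0, 0.95, 10.0, 0.01)  # mag predicate very selective
```

With equal selectivities the cheap sensor goes first; sampling mag first only wins when its predicate rejects almost every reading.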
Example 2 • Another example: SELECT MAX (light) FROM sensors WHERE mag > x SAMPLE INTERVAL 8s • Unless mag > x is very selective, it is cheaper to first check whether the current light reading exceeds the previous maximum and only then apply the predicate over mag, rather than sampling mag first. • This reordering is called exemplary aggregate pushdown
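A minimal sketch of this reordering, assuming each epoch is given as a (light, mag) pair and that reading the mag value stands in for taking the expensive sample. The function `max_light` and the readings are hypothetical.

```python
def max_light(epochs, x):
    """Return (max light among epochs with mag > x, number of mag samples taken).

    mag is only "sampled" when the light reading could raise the running
    maximum; otherwise the epoch cannot affect MAX(light) and is skipped.
    """
    best = None
    mag_samples = 0
    for light, mag in epochs:
        if best is not None and light <= best:
            continue              # cannot change the maximum, skip mag
        mag_samples += 1          # only now pay for the mag sample
        if mag > x:
            best = light
    return best, mag_samples

result = max_light([(10, 5), (8, 9), (12, 1), (11, 7), (15, 6)], x=3)
```

In this made-up run the answer matches the naive plan while one of the five mag samples is avoided; the saving grows as the running maximum stabilizes.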
Event Query Batching • Consider the query: ON EVENT e (nodeid) SELECT a1 FROM sensors AS s WHERE s.nodeid = e.nodeid SAMPLE INTERVAL d FOR k • Every time e occurs, an instance of the internal query is started. • Problem: multiple independent instances can run at the same time, each sampling and delivering data independently
• Solution: • Convert event e into an event stream • Rewrite the internal query as a sliding window join between the event stream and sensors • Original query: ON EVENT e (nodeid) SELECT a1 FROM sensors AS s WHERE s.nodeid = e.nodeid SAMPLE INTERVAL d FOR k • Rewritten query: SELECT s.a1 FROM sensors AS s, events AS e WHERE s.nodeid = e.nodeid AND e.type = e AND s.time - e.time <= k AND s.time > e.time SAMPLE INTERVAL d
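The rewritten query can be read as a join between a single stream of samples and a stream of recent events. The sketch below applies the same three predicates over in-memory lists; the function `window_join` and the data are hypothetical, and the `e.type` filter is omitted because only one event type is modeled.

```python
def window_join(samples, events, k):
    """Join samples with events from the same node fired within the last k time units.

    samples: list of (nodeid, time, a1); events: list of (nodeid, time).
    Mirrors s.nodeid = e.nodeid AND s.time > e.time AND s.time - e.time <= k.
    """
    rows = []
    for s_node, s_time, a1 in samples:      # one shared pass over the samples
        for e_node, e_time in events:
            if s_node == e_node and 0 < s_time - e_time <= k:
                rows.append((e_time, s_time, a1))
    return rows

samples = [(1, t, t * 10) for t in range(1, 7)]  # node 1 sampled at t = 1..6
events = [(1, 0), (1, 2)]                        # two overlapping events on node 1
rows = window_join(samples, events, k=3)
```

Each sample is taken once and matched against every event whose window is still open, rather than each event instance sampling on its own.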
Roadmap • Application Structure & Design Goals • Acquisitional Query Language • Power-Aware Optimization • Power Sensitive Dissemination and Routing • Processing Queries • Conclusions and Future Work • Discussion
Semantic Routing Trees • Why SRT? • It is a routing tree designed to allow each node to efficiently determine if any of the nodes below it will need to participate in a given query over some constant attributes. • Used to prune the routing tree. • What is SRT? • An SRT is an index over constant attribute A that can be used to locate nodes that have data relevant to the query. • It is an overlay on the network.
How to use SRT? • When a query q with a predicate over A arrives at node n, n checks whether any child’s value of A overlaps the query range of A in q: • If yes, forward the query and prepare to receive results • If no, do not forward q • Does query q apply locally? • If yes, execute the query • If not, ignore it
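The forwarding decision above amounts to an interval-overlap test per child plus a local range check. A minimal sketch, assuming each node stores a (min, max) range of A per child; the names `overlaps` and `route_query` and the node ids are hypothetical.

```python
def overlaps(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

def route_query(query_range, own_value, child_ranges):
    """Decide which children receive the query and whether it runs locally.

    child_ranges maps a child id to the (min, max) of A over that child's subtree.
    """
    forward_to = [child for child, rng in child_ranges.items()
                  if overlaps(query_range, rng)]
    run_locally = query_range[0] <= own_value <= query_range[1]
    return forward_to, run_locally

forward_to, run_locally = route_query(
    (5, 10), own_value=3,
    child_ranges={"n2": (1, 4), "n3": (8, 12), "n4": (10, 20)})
```

Here the subtree under `n2` is pruned entirely, so none of its nodes spend energy receiving or answering the query.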
How to build SRT? • Flood the SRT build request down the network • Re-transmitted by every mote until every mote hears it • If a node has no children • Choose a parent p; report the value of A to p • Should it be a range? • If a node has children • Forward the request, and wait for reply • Upon reply from children, choose a parent p; report to p the range of values of A which it and its descendants cover • Since each constant attribute A may have a separate SRT, is the scheme scalable?
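The reporting step of the build can be sketched as a bottom-up range aggregation over an already chosen tree. This sketch leaves out flooding and parent selection; the function `subtree_range`, the tree, and the attribute values are hypothetical. A leaf simply reports a degenerate range (min equal to max).

```python
def subtree_range(node, value, children):
    """Return the (min, max) of attribute A over `node` and all its descendants.

    value: dict mapping node -> its own value of A.
    children: dict mapping node -> list of child nodes (leaves may be absent).
    """
    lo = hi = value[node]
    for child in children.get(node, []):
        c_lo, c_hi = subtree_range(child, value, children)
        lo, hi = min(lo, c_lo), max(hi, c_hi)
    return lo, hi

children = {"root": ["a", "b"], "a": ["c"]}
value = {"root": 5, "a": 3, "b": 9, "c": 1}
root_range = subtree_range("root", value, children)
a_range = subtree_range("a", value, children)
```

Each parent only needs to store one range per child, so the per-node state grows with the number of children rather than the size of the subtree.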
Evaluation of SRT • SRTs are limited to constant attributes • Even so, maintenance is required • Possible to use for non-constant attributes but cost can be prohibitive
Evaluation of SRT • Compared three different strategies for building the tree: random, closest, and cluster • Random: pick a random parent from the nodes with reliable communication • Closest: pick the parent whose attribute value (index attribute) is closest • Cluster: by snooping on siblings’ parent selection, each node tries to pick the right parent, to minimize the spread of attribute values underneath all of its available parents • Report results for two different sensor value distributions, random and geographic • Random: each attribute value is randomly selected from the interval [0,1000] • Geographic: values among neighbors are highly correlated
SRT Results • The cluster scheme is superior to the random scheme and the closest scheme. • With the geographic distribution, the performance of the cluster scheme is close to optimal. • Where is the data on the SRT’s overhead?
Roadmap • Application Structure & Design Goals • Acquisitional Query Language • Power-Aware Optimization • Power Sensitive Dissemination and Routing • Processing Queries • Conclusions and Future Work • Discussion
Processing Queries • Queries have been optimized and distributed; what more can we do? • Aggregate data that is sent back to the root • Prioritize data that needs to be sent (why??) • Naïve - FIFO • Winavg – average the two results at the queue’s head to make room for the new data • Delta – send the result with the largest change • Adapt data rates and power consumption
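The three prioritization schemes can be sketched over a bounded result queue. This is an interpretation of the bullet points, not TinyDB's code: the functions `enqueue` and `pick_delta` are hypothetical, the naive scheme is assumed to drop a new result when the queue is full, winavg assumes a capacity of at least 2, and delta is assumed to score a result by its difference from the last value sent.

```python
def enqueue(queue, value, capacity, policy):
    """Add `value` to a bounded result queue (head at index 0)."""
    if len(queue) < capacity:
        queue.append(value)
    elif policy == "naive":
        pass                                   # queue full: the new result is dropped
    elif policy == "winavg":
        merged = (queue.pop(0) + queue.pop(0)) / 2
        queue.insert(0, merged)                # two head results averaged into one
        queue.append(value)                    # freed slot takes the new result
    else:
        raise ValueError(f"unknown policy: {policy}")
    return queue

def pick_delta(queue, last_sent):
    """Delta: choose the queued result that differs most from the last one sent."""
    return max(queue, key=lambda v: abs(v - last_sent))

winavg_queue = enqueue([1.0, 3.0, 5.0], 7.0, capacity=3, policy="winavg")
naive_queue = enqueue([1.0, 3.0, 5.0], 7.0, capacity=3, policy="naive")
next_to_send = pick_delta([10.0, 12.0, 30.0], last_sent=11.0)
```

Winavg trades resolution for coverage of the whole time span, while delta favors the readings that would change the receiver's picture the most.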
Prioritization Comparison • Sample rate was 4 times faster than delivery rate. • Readings generated by shaking the sensor • Delta seems to be better
Adaptation • Not safe to assume that network channel is uncontested • TinyDB reduces packets sent as channel contention rises • How much? No detail!
Roadmap • Application Structure & Design Goals • Acquisitional Query Language • Power-Aware Optimization • Power Sensitive Dissemination and Routing • Processing Queries • Conclusions and Future Work • Discussion
Conclusions & Future Work • Conclusions: • Design of an acquisitional query processor for data collection in sensor networks • Evaluation in the context of TinyDB • Future Work: • Selectivity of operators based upon range of sensor • Exemplary aggregate pushdown • More sophisticated prioritization schemes • Better re-optimization of sample rate based upon acquired data
Discussion • Is this the best way (right way?) to look at a sensor network? • Is their approximation of battery lifetime sufficient? • Was their evaluation of SRT good enough?