This paper presents the design of a query processor for sensor networks using acquisitional techniques to reduce power consumption. It introduces an Acquisitional Query Language and discusses optimizations and future work in this field.
The Design of an Acquisitional Query Processor for Sensor Networks • Samuel Madden, Michael J. Franklin, Joseph M. Hellerstein, and Wei Hong
Motes • Mica Mote • 4 MHz, 8-bit Atmel RISC microprocessor • 40 kbit/s radio • 4 KB RAM, 128 KB program flash, 512 KB data flash • AA battery pack • Based on TinyOS* • *Hill, Szewczyk, Woo, Culler, & Pister. “Systems Architecture Directions for Networked Sensors.” ASPLOS 2000. http://webs.cs.berkeley.edu/tos
Sensor Net Sample Apps • Habitat monitoring: storm petrels on Great Duck Island, microclimates on James Reserve • Earthquake monitoring in shake-test sites • Vehicle detection: sensors along a road collect data about passing vehicles • Traditional monitoring apparatus
Programming Sensor Nets Is Hard • Months of lifetime required from small batteries • 3-5 days naively; can’t recharge often • Interleave sleep with processing • Lossy, low-bandwidth, short-range communication • Nodes coming and going • ~20% loss @ 5m • Multi-hop • 200-800 instructions per bit transmitted! • Remote, zero-administration deployments • Highly distributed environment • Limited development tools • Embedded, LEDs for debugging! • High-level abstraction is needed!
A Solution: Declarative Queries • Users specify the data they want • Simple, SQL-like queries • Using predicates, not specific addresses • Same spirit as Cougar – Our system: TinyDB • Challenge is to provide: • Expressive & easy-to-use interface • High-level operators • Well-defined interactions • “Transparent Optimizations” that many programmers would miss • Sensor-net specific techniques • Power efficient execution framework • Question: do sensor networks change query processing? Yes!
Overview • Goals • Acquisitional Query Language • Optimizations • Future Work • Conclusions
Goals • Provide a query processor-like interface to sensor networks • Use acquisitional techniques to reduce power consumption compared to traditional passive systems
How? • What is meant by acquisitional techniques? • Where, when, and how often. • Four related questions • When should samples be taken? • What sensors have relevant data? • In what order should samples be taken? • Is it worth it?
What’s the big deal? • Radio consumes as much power as the CPU • Transmitting one bit of data consumes as much energy as 1000 CPU instructions! • Message sizes in TinyDB are by default 48 bytes • Sensing takes significant energy
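To make the radio cost concrete, the two figures on this slide can be combined in a back-of-envelope calculation. The sketch below only multiplies the slide's numbers (1000 instructions per bit, 48-byte messages); the function name and parameters are illustrative and are not part of TinyDB.

```python
# Figures quoted on the slide; treat them as rough, order-of-magnitude values.
INSTRUCTIONS_PER_BIT = 1000   # energy of one transmitted bit, in CPU instructions
MESSAGE_BYTES = 48            # default TinyDB message size


def message_cost_in_instructions(message_bytes=MESSAGE_BYTES,
                                 instr_per_bit=INSTRUCTIONS_PER_BIT):
    """Energy of sending one message, expressed as equivalent CPU instructions."""
    return message_bytes * 8 * instr_per_bit


print(message_cost_in_instructions())  # 384000
```

Under these figures, one default-size message costs about as much energy as 384,000 instructions, which is why avoiding a transmission usually matters more than saving computation.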
An Acquisitional Query Language • SQL-like queries in the form of SELECT-FROM-WHERE • Support for selection, join, projection, and aggregation • Also support for sampling, windowing, and sub-queries • Not mentioned is the ability to log data and actuate physical hardware
An Acquisitional Query Language • Example: SELECT nodeid, light, temp FROM sensors SAMPLE INTERVAL 1s FOR 10s • Sensors viewed as a single table • Columns are sensor data • Rows are individual sensors
Queries as a Stream • Sensors table is an unbounded, continuous data stream • Operations such as sort and symmetric join are not allowed on streams • They are allowed on bounded subsets of the stream (windows)
Windows • Windows in TinyDB are fixed-size materialization points • Materialization points can be used in queries • Example: CREATE STORAGE POINT recentlight SIZE 8 AS (SELECT nodeid, light FROM sensors SAMPLE INTERVAL 10s) • SELECT COUNT(*) FROM sensors AS s, recentlight AS r1 WHERE r1.nodeid = s.nodeid AND s.light < r1.light SAMPLE INTERVAL 10s
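A fixed-size materialization point behaves like a bounded buffer that discards its oldest row when a new one arrives. The following is a minimal sketch of that idea and of the COUNT(*) join from the example; `StoragePoint` and `count_brighter_past_readings` are hypothetical names, and the eviction rule (drop oldest first) is an assumption, not TinyDB's documented behavior.

```python
from collections import deque


class StoragePoint:
    """Bounded buffer of (nodeid, light) rows; assumed to evict the oldest row when full."""

    def __init__(self, size):
        self.rows = deque(maxlen=size)

    def insert(self, nodeid, light):
        self.rows.append((nodeid, light))


def count_brighter_past_readings(storage, nodeid, light):
    """COUNT(*) of buffered rows from the same node whose light exceeds the current reading."""
    return sum(1 for n, past in storage.rows if n == nodeid and light < past)


recentlight = StoragePoint(size=8)
for value in (120, 300, 250):
    recentlight.insert(nodeid=1, light=value)
print(count_brighter_past_readings(recentlight, nodeid=1, light=200))  # 2
```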
Temporal Aggregation • Example: SELECT WINAVG(volume, 30s, 5s) FROM sensors SAMPLE INTERVAL 1s • Receive only 6 results from each sensor instead of 30
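The "6 instead of 30" figure follows from the window parameters: with one sample per second, a 30-sample window that slides by 5 samples reports once every 5 samples. A rough sketch of that behavior is below; `winavg` here is an illustrative Python function, and the handling of the first, partially filled windows is an assumption.

```python
def winavg(samples, window, slide):
    """Every `slide` samples, emit the mean of the most recent `window` samples.

    Early windows that are not yet full are averaged over the samples seen so far
    (an assumption made for this sketch).
    """
    results = []
    for end in range(slide, len(samples) + 1, slide):
        recent = samples[max(0, end - window):end]
        results.append(sum(recent) / len(recent))
    return results


readings = list(range(30))  # 30 one-second samples
print(len(winavg(readings, window=30, slide=5)))  # 6
```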
Event-Based Queries • An alternative to continuous polling for data • Example: ON EVENT bird-detector(loc): SELECT AVG(light), AVG(temp), event.loc FROM sensors AS s WHERE dist(s.loc, event.loc) < 10m SAMPLE INTERVAL 2s FOR 30s
Lifetime-Based Queries • Example: SELECT nodeid, accel FROM sensors LIFETIME 30 days • Nodes perform cost-based analysis in order to determine data rate • Nodes must transmit at the root’s rate or at an integral divisor of it
Lifetime-Based Queries • Tested a mote with a 24-week query • Resulting sample period was 15.2 seconds per sample • Took 9 voltage readings over 12 days
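The cost-based analysis behind a LIFETIME clause can be pictured as dividing an energy budget across the requested lifetime. The sketch below is a deliberately simplified model, not the estimator TinyDB uses: the function name, the parameters, and the idea of a single per-sample energy figure are all assumptions made for illustration.

```python
def sample_period_seconds(battery_joules, lifetime_days, joules_per_sample, idle_watts=0.0):
    """Longest-lived sampling schedule under a simplified energy model.

    battery_joules    -- usable energy in the battery (hypothetical input)
    joules_per_sample -- energy to sample and transmit one reading (hypothetical input)
    idle_watts        -- constant background draw while sleeping (hypothetical input)
    Returns the number of seconds between samples that exhausts the budget
    exactly at the end of the requested lifetime.
    """
    lifetime_s = lifetime_days * 86400
    budget = battery_joules - idle_watts * lifetime_s
    if budget <= 0:
        raise ValueError("idle draw alone exceeds the battery capacity")
    max_samples = budget / joules_per_sample
    return lifetime_s / max_samples
```

For example, with made-up inputs, `sample_period_seconds(2_592_000, 30, 2.0)` gives a period of 2 seconds; doubling the requested lifetime doubles the period.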
Optimization • Three phases to queries • Creation of query • Dissemination of query • Execution of query • TinyDB makes optimizations at each step
Power-Based Optimization • Queries optimized by base station before dissemination • Cost-based optimization to yield lowest overall power consumption • Cost dominated by sampling and transmitting • Optimizer focuses on ordering joins, selections, and sampling on individual nodes
Metadata • Each node contains metadata about its attributes • Nodes periodically send metadata to root • Metadata also contains information about aggregate functions • Information about cost, time to fetch, and range is used in query optimization
Using Metadata • Consider the query: SELECT accel, mag FROM sensors WHERE accel > c1 AND mag > c2 SAMPLE INTERVAL 1s • Sampling mag costs an order of magnitude more than sampling accel • Three options • Measure accel and mag, then process select • Measure mag, filter, then measure accel • Measure accel, filter, then measure mag • First option is always more expensive than the other two. Second option can be an order of magnitude more expensive than the third • Second option can be cheaper only if the mag predicate is highly selective
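The comparison between the three options reduces to a small expected-cost calculation: the first attribute is always sampled, and the second is sampled only when the first predicate passes. The sketch below uses made-up relative costs (accel = 1, mag = 10) and treats selectivity as the fraction of readings that pass a predicate; the function names are illustrative only.

```python
def expected_cost(first_cost, first_pass_fraction, second_cost):
    """Always sample the first attribute; sample the second only if the first predicate passes."""
    return first_cost + first_pass_fraction * second_cost


def cheapest_plan(accel_cost, mag_cost, accel_pass, mag_pass):
    """Return (name of cheapest plan, dict of all plan costs) per sampled tuple."""
    plans = {
        "sample both, then filter": accel_cost + mag_cost,
        "mag first": expected_cost(mag_cost, mag_pass, accel_cost),
        "accel first": expected_cost(accel_cost, accel_pass, mag_cost),
    }
    return min(plans, key=plans.get), plans


# Hypothetical costs: mag is 10x as expensive to sample as accel.
print(cheapest_plan(1, 10, accel_pass=0.5, mag_pass=0.5)[0])  # accel first
print(cheapest_plan(1, 10, accel_pass=1.0, mag_pass=0.0)[0])  # mag first
```

The second call shows the exception noted on the slide: sampling mag first only wins when the mag predicate rejects almost everything and the accel predicate rejects almost nothing.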
Using Metadata • Another example: SELECT WINMAX(light, 8s, 8s) FROM sensors WHERE mag > x SAMPLE INTERVAL 1s • Unless mag > x is very selective, it is cheaper to first check whether the current light reading exceeds the window's maximum so far, and to sample mag only if it does • This reordering is called exemplary aggregate pushdown
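The reordering can be sketched as a loop that skips the expensive magnetometer sample whenever the light reading cannot change the running maximum. This is an illustration of the idea for a single window, not TinyDB code; `read_mag` is a hypothetical callable standing in for the costly sensor read.

```python
def winmax_with_pushdown(readings, mag_threshold):
    """Compute MAX(light) over one window subject to mag > mag_threshold.

    readings -- iterable of (light, read_mag) pairs, where read_mag() performs
                the expensive magnetometer sample (hypothetical stand-in)
    Returns (maximum qualifying light or None, number of mag samples taken).
    """
    current_max = None
    mag_samples = 0
    for light, read_mag in readings:
        if current_max is not None and light <= current_max:
            continue  # cannot raise the maximum, so the mag sample is skipped
        mag_samples += 1
        if read_mag() > mag_threshold:
            current_max = light
    return current_max, mag_samples


window = [(5, lambda: 10), (3, lambda: 10), (7, lambda: 0), (6, lambda: 10)]
print(winmax_with_pushdown(window, mag_threshold=1))  # (6, 3)
```

In this made-up window, four readings arrive but only three magnetometer samples are taken.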
Event Query Batching • Multiple instances of an event-based query can be running at the same time • Optimization based on rewriting as a sliding window join between events and sensors
Dissemination Optimization • Build semantic routing tree (SRT) • SRT nodes choose parents based on semantic properties as well as link quality • Parent nodes keep track of the ranges of values for children
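The value of tracking child ranges is that a parent can decline to forward a query into a subtree that cannot contain matching nodes. A minimal sketch of that pruning follows, assuming each node summarizes its subtree as a single [lo, hi] interval over one constant attribute; the class and function names are hypothetical.

```python
class SRTNode:
    """Tree node that records the value range covered by itself and its descendants."""

    def __init__(self, nodeid, value, children=()):
        self.nodeid = nodeid
        self.value = value
        self.children = list(children)
        self.lo = min([value] + [c.lo for c in self.children])
        self.hi = max([value] + [c.hi for c in self.children])


def disseminate(node, q_lo, q_hi):
    """Return the ids of nodes that receive a query over the range [q_lo, q_hi]."""
    if node.hi < q_lo or node.lo > q_hi:
        return []  # no overlap: the whole subtree is pruned
    reached = [node.nodeid]
    for child in node.children:
        reached += disseminate(child, q_lo, q_hi)
    return reached


root = SRTNode(1, 30, [SRTNode(2, 10), SRTNode(3, 50, [SRTNode(4, 60)])])
print(disseminate(root, 45, 70))  # [1, 3, 4]
```

Node 2 never hears the query because its range (10 to 10) does not overlap 45 to 70. Interior nodes still receive the query when only a descendant matches, since they must relay it.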
Evaluation of SRT • SRTs are limited to constant attributes • Even so, maintenance is required • Possible to use for non-constant attributes, but the cost can be prohibitive
Evaluation of SRT • Compared three strategies for building the tree: random, closest, and cluster • Reported results for two sensor value distributions: random and geographic
Query Execution • Queries have been optimized and distributed; what more can we do? • Aggregate data that is sent back to the root • Prioritize data that needs to be sent • Naïve: FIFO • Winavg: average the top queue entries • Delta: send the result with the most change • Adapt data rates and power consumption
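The three prioritization schemes differ in what happens when results are produced faster than the radio queue can drain. The sketch below is one possible reading of the one-line descriptions on the slide; the overflow rule for the naive scheme (drop the newest reading) and the scoring rule for delta (distance from the last value sent) are assumptions, and all names are illustrative.

```python
def enqueue_naive(queue, value, capacity):
    """FIFO queue; a new reading is dropped when the queue is full (assumed rule)."""
    if len(queue) < capacity:
        queue.append(value)


def enqueue_winavg(queue, value, capacity):
    """When full, average the two entries at the head to make room (capacity >= 2)."""
    if len(queue) >= capacity and len(queue) >= 2:
        merged = (queue[0] + queue[1]) / 2
        queue[0:2] = [merged]
    queue.append(value)


def dequeue_delta(queue, last_sent):
    """Remove and return the queued reading that differs most from the last value sent."""
    best = max(range(len(queue)), key=lambda i: abs(queue[i] - last_sent))
    return queue.pop(best)


q = [1, 3, 5]
enqueue_winavg(q, 7, capacity=3)
print(q)  # [2.0, 5, 7]
```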
Prioritization Comparison • Sample rate was K times faster than delivery rate. • Readings generated by shaking the sensor • In this example, K = 4
Adaptation • Not safe to assume that network channel is uncontested • TinyDB reduces packets sent as channel contention rises
Future Work • More sophisticated prioritization schemes • Better re-optimization of sample rate based upon acquired data
Contributions & Summary • Declarative Queries via TinyDB • Simple, data-centric programming abstraction • Known to work for monitoring, tracking, mapping • Sensor network contributions • Network as a single queryable entity • Power-aware, in-network query processing • Taxonomy: Extensible aggregate optimizations • Query processing contributions • Acquisitional Query Processing • Framework for new issues in acquisitional systems, e.g.: • Sampling as an operator • Languages, indices, approximations to control when, where, and what data is acquired + processed by the system • Consideration of database, network, and device issues