TAG: A Tiny Aggregation Service for Ad-Hoc Sensor Networks. Samuel Madden, UC Berkeley, with Michael Franklin, Joseph Hellerstein, and Wei Hong. December 9th, 2002 @ OSDI.
TAG Introduction • What is a sensor network? • Programming Sensor Networks Is Hard • Declarative Queries Are Easy • Tiny Aggregation (TAG): In-network processing via declarative queries! • Example: • Vehicle tracking application: 2 weeks for 2 students • Vehicle tracking query: took 2 minutes to write, worked just as well! SELECT MAX(mag) FROM sensors WHERE mag > thresh EPOCH DURATION 64ms
Overview • Sensor Networks • Queries in Sensor Nets • Tiny Aggregation • Overview • Simulation & Results
Device Capabilities • “Mica Motes” • 8-bit, 4 MHz processor • Roughly a PC AT • 40 kbit/s CSMA radio • 4 KB RAM, 128 KB flash, 512 KB EEPROM • TinyOS based • Variety of other, similar platforms exist • UCLA WINS, Medusa, Princeton ZebraNet, MIT Cricket
Sensor Net Sample Apps • Habitat monitoring: storm petrels on Great Duck Island, microclimates on James Reserve. • Earthquake monitoring in shake-test sites. • Vehicle detection: sensors along a road collect data about passing vehicles. • [Image: traditional monitoring apparatus.]
Metric: Communication • Lifetime from one pair of AA batteries • 2-3 days at full power • 6 months at 2% duty cycle • Communication dominates cost • 100s of µs to compute • 30 ms to send a message • Our metric: communication!
Communication In Sensor Nets • Radio communication has high link-level losses • typically about 20% @ 5m • Ad-hoc neighbor discovery • Tree-based routing • [Figure: routing tree over nodes A-F]
Overview • Sensor Networks • Queries in Sensor Nets • Tiny Aggregation • Overview • Optimizations & Results
Declarative Queries for Sensor Networks • Examples: • (1) SELECT nodeid, light FROM sensors WHERE light > 400 EPOCH DURATION 1s • (2) SELECT AVG(sound) FROM sensors EPOCH DURATION 10s • (2, extended to find rooms w/ sound > 200) SELECT roomNo, AVG(sound) FROM sensors GROUP BY roomNo HAVING AVG(sound) > 200 EPOCH DURATION 10s
TAG • In-network processing of aggregates • Common data analysis operation • Aka gather operation or reduction in parallel programming • Communication reducing • Benefit is operation dependent • Across nodes during same epoch • Exploit semantics to improve efficiency!
Query Propagation • SELECT COUNT(*)… • [Figure: routing tree over nodes 1-5]
Pipelined Aggregates • In each epoch: • Each node samples local sensors once • Generates partial state record (PSR) from local readings and readings from children • Outputs PSR from previous epoch • After (depth-1) epochs, PSR for the whole tree output at root • To avoid combining PSRs from different epochs, sensors must cache values from children • Example: value from 2 produced at time t arrives at 1 at time (t+1); value from 5 produced at time t arrives at 1 at time (t+3) • [Figure: routing tree over nodes 1-5]
Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Depth = d • [Figure: routing tree over nodes 1-5, animated over epochs 1-5] • Count held at each sensor, by epoch:
Epoch # | Sensor 1 | Sensor 2 | Sensor 3 | Sensor 4 | Sensor 5
1 | 1 | 1 | 1 | 1 | 1
2 | 3 | 1 | 2 | 2 | 1
3 | 4 | 1 | 3 | 2 | 1
4 | 5 | 1 | 3 | 2 | 1
5 | 5 | 1 | 3 | 2 | 1
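The pipelined COUNT in the illustration can be replayed with a short simulation. This is only a sketch: the tree shape (1 is the root, 2 and 3 are its children, 4 hangs under 3, and 5 under 4) is an assumption inferred from the arrival times quoted on the Pipelined Aggregates slide, and the names `CHILDREN` and `pipelined_count` are hypothetical. Because every node contributes the constant 1 to COUNT, the sketch ignores the caching needed to keep epochs separate.

```python
# Assumed topology (not stated in the slides): parent -> list of children.
CHILDREN = {1: [2, 3], 2: [], 3: [4], 4: [5], 5: []}


def pipelined_count(children, epochs):
    """Return one row per epoch: the COUNT each node holds in that epoch."""
    nodes = sorted(children)
    previous = {n: 0 for n in nodes}  # what each node output last epoch
    rows = []
    for _ in range(epochs):
        current = {}
        for n in nodes:
            # A node counts itself and adds what its children sent last epoch.
            current[n] = 1 + sum(previous[c] for c in children[n])
        rows.append([current[n] for n in nodes])
        previous = current
    return rows


ROWS = pipelined_count(CHILDREN, 5)
for epoch, row in enumerate(ROWS, start=1):
    print(epoch, row)
```

Under these assumptions the root holds the full count of 5 from epoch 4 onward, one epoch after the value from node 5 has travelled its three hops.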
Aggregation Framework • As in extensible databases, we support any aggregation function conforming to: Agg_n = {f_init, f_merge, f_evaluate} • f_init{a0} → <a0> • f_merge{<a1>, <a2>} → <a12> • f_evaluate{<a1>} → aggregate value • (Merge must be associative and commutative!) • <a0>, <a1>, <a12> are Partial State Records (PSRs) • Example: Average • AVG_init{v} → <v, 1> • AVG_merge{<S1, C1>, <S2, C2>} → <S1 + S2, C1 + C2> • AVG_evaluate{<S, C>} → S/C
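The AVERAGE example translates almost directly into code. The sketch below uses plain Python tuples for the PSR; the function names `avg_init`, `avg_merge`, and `avg_evaluate` are illustrative and not taken from any TAG or TinyDB API.

```python
def avg_init(v):
    """Turn one sensor reading into a PSR of the form (sum, count)."""
    return (v, 1)


def avg_merge(a, b):
    """Combine two PSRs; addition makes this associative and commutative."""
    return (a[0] + b[0], a[1] + b[1])


def avg_evaluate(psr):
    """Compute the final aggregate value from a PSR."""
    total, count = psr
    return total / count


READINGS = [10, 20, 30, 40]
PSRS = [avg_init(v) for v in READINGS]

# Merge in two different groupings, as two different routing trees might.
LEFT_FIRST = avg_merge(avg_merge(PSRS[0], PSRS[1]), avg_merge(PSRS[2], PSRS[3]))
CHAIN = avg_merge(PSRS[3], avg_merge(PSRS[2], avg_merge(PSRS[1], PSRS[0])))

RESULT = avg_evaluate(LEFT_FIRST)
print(RESULT)
```

Both merge orders yield the same PSR, which is what lets the network combine partial results in whatever order the routing tree happens to deliver them.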
Types of Aggregates • SQL supports MIN, MAX, SUM, COUNT, AVERAGE • Any function can be computed via TAG • In-network benefit for many operations • E.g. standard deviation, top/bottom N, spatial union/intersection, histograms, etc. • Compactness of PSR
Taxonomy of Aggregates • TAG insight: classify aggregates according to various functional properties • Yields a general set of optimizations that can automatically be applied
TAG Advantages • Communication Reduction • Important for power and contention • Continuous stream of results • In the absence of faults, will converge to right answer • Lots of optimizations • Based on shared radio channel • Semantics of operators
Simulation Environment • Evaluated via simulation • Coarse-grained, event-based simulator • Sensors arranged on a grid • Two communication models • Lossless: All neighbors hear all messages • Lossy: Messages lost with probability that increases with distance
Simulation Results • 2500 nodes • 50x50 grid • Depth = ~10 • Neighbors = ~20 • Some aggregates require dramatically more state!
Optimization: Channel Sharing (“Snooping”) • Insight: Shared channel enables optimizations • Suppress messages that won’t affect aggregate • E.g., MAX • Applies to all exemplary, monotonic aggregates
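A minimal sketch of snooping for MAX follows. It assumes a single lossless neighborhood in which every node overhears every earlier transmission and in which nodes transmit in list order; the function name `snooping_max` is hypothetical.

```python
def snooping_max(readings):
    """Return (max, messages actually sent) with snooping-based suppression."""
    best_heard = None
    sent = []
    for value in readings:
        # A node stays silent if an overheard value already covers its own,
        # because its reading cannot change the MAX.
        if best_heard is not None and value <= best_heard:
            continue
        sent.append(value)
        best_heard = value
    return best_heard, sent


MAX_VALUE, SENT = snooping_max([12, 7, 30, 30, 5, 41])
print(MAX_VALUE, SENT)
```

In this toy run only three of the six nodes transmit. Real savings depend on transmission order and on which neighbors can actually hear each other.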
Optimization: Hypothesis Testing • Insight: Guess from root can be used for suppression • E.g. ‘MIN < 50’ • Works for monotonic & exemplary aggregates • Also summary, if imprecision allowed • How is hypothesis computed? • Blind or statistically informed guess • Observation over network subset
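The 'MIN < 50' case can be sketched as follows. The helper name `min_with_guess` is hypothetical, and returning `None` when no node reports is an assumption about how a wrong guess might be surfaced; the slides do not say what the root does in that case.

```python
def min_with_guess(readings, guess):
    """Only nodes whose reading beats the root's guess send a message."""
    reports = [v for v in readings if v < guess]
    if not reports:
        # Assumed behavior: the root learns only that MIN >= guess
        # and would need a new guess to find the actual minimum.
        return None, 0
    return min(reports), len(reports)


MIN_VALUE, MESSAGES = min_with_guess([72, 55, 43, 90, 47, 61], guess=50)
print(MIN_VALUE, MESSAGES)
```

Here two messages replace six, and the answer is still exact because MIN is both monotonic and exemplary.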
Experiment: Hypothesis Testing Uniform Value Distribution, Dense Packing, Ideal Communication
Optimization: Use Multiple Parents • For duplicate insensitive aggregates • Or aggregates that can be expressed as a linear combination of parts • Send (part of) aggregate to all parents • In just one message, via broadcast • Decreases variance • [Figure: node A sending its whole aggregate (1) to a single parent vs. half (1/2) to each of parents B and C]
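The variance claim can be checked with a small exact calculation. The sketch assumes that each link drops its message independently with the same probability, which is a simplifying model and not a measured property of the radio; the names `mean_and_variance`, `SINGLE`, and `SPLIT` are illustrative.

```python
from itertools import product


def mean_and_variance(parts, loss):
    """Exact mean and variance of the total delivered over independent links.

    parts: the share of the aggregate sent over each link.
    loss:  probability that any one link drops its message (assumed i.i.d.).
    """
    mean = 0.0
    second_moment = 0.0
    for delivered in product([True, False], repeat=len(parts)):
        prob = 1.0
        total = 0.0
        for ok, part in zip(delivered, parts):
            prob *= (1.0 - loss) if ok else loss
            if ok:
                total += part
        mean += prob * total
        second_moment += prob * total * total
    return mean, second_moment - mean * mean


COUNT = 10.0
LOSS = 0.2
SINGLE = mean_and_variance([COUNT], LOSS)                # one parent
SPLIT = mean_and_variance([COUNT / 2, COUNT / 2], LOSS)  # two parents
print(SINGLE, SPLIT)
```

Under this model both strategies deliver the same expected value, but splitting across two parents halves the variance.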
Multiple Parents Results • Better than previous analysis expected! • Losses aren’t independent! • Insight: spreads data over many links • [Figure: With Splitting vs. No Splitting, with a “Critical Link!” callout]
Summary • TAG enables in-network declarative query processing • State dependent communication benefit • Transparent optimization via taxonomy • Hypothesis Testing • Parent Sharing • Declarative queries are the right interface for data collection in sensor nets! • Easier to program and more efficient for vast majority of users • TinyDB Release Available - http://telegraph.cs.berkeley.edu/tinydb
Questions? TinyDB Demo After The Session…
TinyOS • Operating system from David Culler’s group at Berkeley • C-like programming environment • Provides messaging layer, abstractions for major hardware components • Split phase highly asynchronous, interrupt-driven programming model Hill, Szewczyk, Woo, Culler, & Pister. “Systems Architecture Directions for Networked Sensors.” ASPLOS 2000. See http://webs.cs.berkeley.edu/tos
In-Network Processing in TinyDB SELECT AVG(light) EPOCH DURATION 4s • Cost metric = #msgs • 16 nodes • 150 Epochs • In-net loss rates: 5% • External loss: 15% • Network depth: 4
Grouping • Recall: GROUP BY expression partitions sensors into distinct logical groups • E.g. “partition sensors by room number” • If query is grouped, sensors apply expression on each epoch • PSRs tagged with group • When a PSR (with group) is received: • If it belongs to a stored group, merge with existing PSR • If not, just store it • At the end of each epoch, transmit one PSR per group • Need to evict if storage overflows.
Group Eviction • Problem: Number of groups in any one iteration may exceed available storage on sensor • Solution: Evict! (Partial Preaggregation*) • Choose one or more groups to forward up tree • Rely on nodes further up tree, or root, to recombine groups properly • What policy to choose? • Intuitively: least popular group, since don’t want to evict a group that will receive more values this epoch. • Experiments suggest: • Policy matters very little • Evicting as many groups as will fit into a single message is good * Per-Åke Larson. Data Reduction by Partial Preaggregation. ICDE 2002.
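Grouped merging and eviction can be sketched together for an AVG query. Everything named here (`GroupStore`, `recombine`, the capacity of 2) is hypothetical, and "least popular" is interpreted as the stored group with the fewest merged values so far, which is one reading of the slide and not a confirmed policy.

```python
class GroupStore:
    """Per-node store of (sum, count) PSRs keyed by group, with eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.psrs = {}       # group -> (sum, count)
        self.evicted = []    # (group, psr) pairs forwarded early up the tree

    def add(self, group, psr):
        if group in self.psrs:
            s, c = self.psrs[group]
            self.psrs[group] = (s + psr[0], c + psr[1])  # merge into group
            return
        if len(self.psrs) >= self.capacity:
            # Assumed policy: evict the group with the fewest values so far.
            victim = min(self.psrs, key=lambda g: self.psrs[g][1])
            self.evicted.append((victim, self.psrs.pop(victim)))
        self.psrs[group] = psr

    def end_of_epoch(self):
        """Everything this node sends upward during the epoch."""
        out = self.evicted + sorted(self.psrs.items())
        self.psrs, self.evicted = {}, []
        return out


def recombine(messages):
    """What a node further up the tree does with possibly split groups."""
    totals = {}
    for group, (s, c) in messages:
        ts, tc = totals.get(group, (0, 0))
        totals[group] = (ts + s, tc + c)
    return totals


store = GroupStore(capacity=2)
for group, value in [("room1", 200), ("room2", 180), ("room1", 220),
                     ("room3", 90), ("room2", 160)]:
    store.add(group, (value, 1))

MESSAGES = store.end_of_epoch()
TOTALS = recombine(MESSAGES)
print(MESSAGES)
print(TOTALS)
```

In this run room2 is evicted and later reappears, so it is sent twice; the recombination step restores the correct (sum, count) pair for every group at the cost of an extra PSR.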
Declarative Benefits In Sensor Networks • Vastly simplifies execution for large networks • Since locations are described by predicates • Operations are over groups • Enables tolerance to faults • Since system is free to choose where and when operations happen • Data independence • System is free to choose where data lives, how it is represented
Hypothesis Testing For Average • AVERAGE: each node suppresses readings within some ∆ of an approximate average µ*. • Parents assume children who don’t report have value µ* • Computed average cannot be off by more than ∆.
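The ∆ bound can be illustrated with a flat, single-hop sketch. It assumes the parent knows how many children it has, so it can substitute µ* for each silent child; the function name `approximate_average` and the sample readings are made up for illustration.

```python
def approximate_average(readings, guess, delta):
    """Return (estimated average, number of messages sent)."""
    # Only readings farther than delta from the guess are transmitted.
    reported = [v for v in readings if abs(v - guess) > delta]
    silent = len(readings) - len(reported)
    # Assumption: the parent knows its child count and fills in the guess
    # for every child it did not hear from.
    estimate = (sum(reported) + silent * guess) / len(readings)
    return estimate, len(reported)


READINGS = [48, 52, 50, 61, 39, 49]
GUESS = 50
DELTA = 3
TRUE_AVG = sum(READINGS) / len(READINGS)
ESTIMATE, MESSAGES = approximate_average(READINGS, GUESS, DELTA)
print(TRUE_AVG, ESTIMATE, MESSAGES)
```

Each silent child contributes an error of at most ∆ and each reporting child contributes none, so the mean error is bounded by ∆; here only two of six nodes transmit.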
TinyAlloc • Handle Based Compacting Memory Allocator • For Catalog, Queries • Example: Handle h; call MemAlloc.alloc(&h,10); … (*h)[0] = “Sam”; call MemAlloc.lock(h); tweakString(*h); call MemAlloc.unlock(h); call MemAlloc.free(h); • [Figure: User Program memory layout (Free Bitmap, Master Pointer Table, Heap), shown through Compaction]
Schema • Attribute & Command IF • At INIT(), components register attributes and commands they support • Commands implemented via wiring • Attributes fetched via accessor command • Catalog API allows local and remote queries over known attributes / commands. • Demo of adding an attribute, executing a command.
Q1: Expressiveness • Simple data collection satisfies most users • How much of what people want to do is just simple aggregates? • Anecdotally, most of it • EE people want filters + simple statistics (unless they can have signal processing) • However, we’d like to satisfy everyone!
Query Language • New Features: • Joins • Event-based triggers • Via extensible catalog • In network & nested queries • Split-phase (offline) delivery • Via buffers
Sample Query 1 Bird counter: CREATE BUFFER birds(uint16 cnt) SIZE 1 ON EVENT bird-enter(…) SELECT b.cnt+1 FROM birds AS b OUTPUT INTO b ONCE
Sample Query 2 Birds that entered and left within time t of each other: ON EVENT bird-leave AND bird-enter WITHIN t SELECT bird-leave.time, bird-leave.nest WHERE bird-leave.nest = bird-enter.nest ONCE
Sample Query 3 Delta compression: SELECT light FROM buf, sensors WHERE |s.light – buf.light| > t OUTPUT INTO buf SAMPLE PERIOD 1s
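The effect of the delta compression query can be mimicked on a list of samples. This sketch is plain Python, not TinyDB: `delta_compress` is a hypothetical name, and emitting the very first sample while the buffer is still empty is an assumption the query itself does not spell out.

```python
def delta_compress(samples, threshold):
    """Keep a sample only if it differs from the buffered one by > threshold."""
    buffered = None
    output = []
    for light in samples:
        # Assumption: with an empty buffer, the first sample is always kept.
        if buffered is None or abs(light - buffered) > threshold:
            output.append(light)
            buffered = light  # corresponds to OUTPUT INTO buf
    return output


KEPT = delta_compress([100, 101, 103, 110, 111, 95, 96], threshold=5)
print(KEPT)
```

Seven samples shrink to three, and every dropped sample is within the threshold of the value last written to the buffer.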
Sample Query 4 Offline Delivery + Event Chaining CREATE BUFFER equake_data( uint16 loc, uint16 xAccel, uint16 yAccel) SIZE 1000 PARTITION BY NODE SELECT xAccel, yAccel FROM SENSORS WHERE xAccel > t OR yAccel > t SIGNAL shake_start(…) SAMPLE PERIOD 1s ON EVENT shake_start(…) SELECT loc, xAccel, yAccel FROM sensors OUTPUT INTO BUFFER equake_data(loc, xAccel, yAccel) SAMPLE PERIOD 10ms
Event Based Processing • Enables internal and chained actions • Language Semantics • Events are inter-node • Buffers can be global • Implementation plan • Events and buffers must be local • Since n-to-n communication not (well) supported • Next: operator expressiveness
Attribute Driven Topology Selection • Observation: internal queries often over local area* • Or some other subset of the network • E.g. regions with light value in [10,20] • Idea: build topology for those queries based on values of range-selected attributes • Requires range attributes, connectivity to be relatively static * Heidemann et al. Building Efficient Wireless Sensor Networks with Low-Level Naming. SOSP, 2001.