REED Algorithm for complex data filtering in sensor networks, optimizing event detection and query performance. Includes efficient storage and transmission strategies.
REED: Robust, Efficient Filtering and Event Detection in Sensor Networks Daniel Abadi, Samuel Madden, Wolfgang Lindner MIT United States VLDB 2005
What Problem Are We Trying To Solve? • Complex data filtering in sensor networks
Example Filter Query
Sensor Data (joined with the Predicate Table):
Timestamp  Temp
3:05PM     74
Predicate Table:
MinTS    MaxTS    MinTemp  MaxTemp
2:00PM   2:30PM   70       75
2:30PM   3:00PM   73       78
3:00PM   3:30PM   75       80
3:30PM   4:00PM   83       88
4:00PM   4:30PM   85       90
4:30PM   5:00PM   70       75
5:00PM   5:30PM   72       77
5:30PM   6:00PM   75       80
Join Predicate: TS > MinTS && TS < MaxTS && (Temp < MinTemp || Temp > MaxTemp)
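As a minimal sketch of how the join predicate on this slide evaluates, the snippet below checks the example sensor tuple (3:05PM, 74) against the predicate table. The function name `violated_rows` and the tuple layout are illustrative assumptions, not part of REED or TinyDB.

```python
from datetime import time

# Predicate table rows from the slide: (MinTS, MaxTS, MinTemp, MaxTemp).
PREDICATES = [
    (time(14, 0), time(14, 30), 70, 75),
    (time(14, 30), time(15, 0), 73, 78),
    (time(15, 0), time(15, 30), 75, 80),
    (time(15, 30), time(16, 0), 83, 88),
    (time(16, 0), time(16, 30), 85, 90),
    (time(16, 30), time(17, 0), 70, 75),
    (time(17, 0), time(17, 30), 72, 77),
    (time(17, 30), time(18, 0), 75, 80),
]


def violated_rows(ts, temp, table=PREDICATES):
    """Return the predicate rows that join with the sensor tuple (ts, temp).

    Hypothetical helper: a row joins when the timestamp falls strictly inside
    the row's time window and the temperature lies outside its allowed range.
    """
    return [
        row
        for row in table
        if ts > row[0] and ts < row[1] and (temp < row[2] or temp > row[3])
    ]


# The slide's example reading: 3:05PM, 74 degrees.
matches = violated_rows(time(15, 5), 74)
in_range = violated_rows(time(15, 5), 77)
```

Here `matches` holds the single 3:00PM to 3:30PM row, because 74 is below that window's MinTemp of 75, while a reading of 77 at the same time joins with nothing and would be filtered out.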
Constraints: Sensor Networks • Sensor nodes are small, battery-powered devices • Power conservation is important • Sensing and transmitting data typically dominate power usage • Berkeley Mote: 4 MHz uProc; 900 MHz radio (50-100 ft. range); 4 KB RAM, 128 KB program flash, 512 KB data flash
Sensor Database Motivation • Programming Apps is Hard • Limited power budget • Lossy, low bandwidth communication • Require long-lived, zero admin deployments • Distributed algorithms • Limited tools, debugging interfaces • Solution: database style interface (e.g. TinyDB [Madden 2002], Cougar [Yao 2003])
TinyDB • How TinyDB Works: • Form a routing tree • Distribute query to nodes • Every time a node produces a data tuple, filter by expression and pass the result up the tree, aggregating if necessary [Figure: routing tree with root node 0 attached to the main PC controller and sensor nodes 1-7 below it]
Naïve Join Algorithm • Send all tuples from data table to root; perform join at root [Figure: routing tree of nodes 0-7; the predicate table (parts A, B, C, D) is held at root node 0, which is attached to the main PC controller]
Ideal Join Algorithm • Send join table to each node • At node, perform join • Problem: Severe Node Memory Constraints [Figure: routing tree of nodes 0-7 with a copy of the join table (parts A, B, C, D) stored at every node]
REED Algorithm 1 • Cluster nodes into groups • Store portion of predicate table in each group member • Send sensor data tuples to every member of group [Figure: routing tree of nodes 0-8 clustered into groups, with the predicate table parts A, B, C, D spread across the members of each group]
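The idea of storing a portion of the predicate table in each group member can be sketched as follows. This is an illustration under stated assumptions, not the REED implementation: the round-robin split in `partition`, the member ids 3 and 4, and the use of minutes since midnight for timestamps are all hypothetical choices, and the in-process loop stands in for a radio broadcast.

```python
# Four predicate rows standing in for table parts A-D:
# (min_ts, max_ts, min_temp, max_temp), timestamps in minutes since midnight.
TABLE = [
    (840, 870, 70, 75),  # 2:00PM-2:30PM
    (870, 900, 73, 78),  # 2:30PM-3:00PM
    (900, 930, 75, 80),  # 3:00PM-3:30PM
    (930, 960, 83, 88),  # 3:30PM-4:00PM
]


def join(ts, temp, rows):
    """Join one sensor tuple against a set of predicate rows."""
    return [r for r in rows if r[0] < ts < r[1] and (temp < r[2] or temp > r[3])]


def partition(table, members):
    """Split the table across group members (round-robin is an assumption)."""
    parts = {m: [] for m in members}
    for i, row in enumerate(table):
        parts[members[i % len(members)]].append(row)
    return parts


def group_join(ts, temp, parts):
    """Deliver the tuple to every member; each joins with its own portion."""
    results = []
    for rows in parts.values():
        results.extend(join(ts, temp, rows))
    return results


parts = partition(TABLE, [3, 4])          # hypothetical two-member group
group_result = group_join(905, 74, parts)  # reading at 3:05PM, 74 degrees
full_result = join(905, 74, TABLE)         # what a single full-table join gives
```

Because the portions are disjoint and together cover the table, the union of the members' results equals the result of joining against the whole table, while no member stores more than its share.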
Group Formation (example) • Node 1 (Neighbor list: {1, 2, 3, 4, 6}) broadcasts: Want to make group. State: Space: 4, CurrList: {1}, Potential: {1, 2, 3, 4, 6} • Reply: Choose Me! {1, 3, 4, 6}, Space: 4. State: Space: 8, CurrList: {1, 4}, Potential: {1, 3, 4, 6} • Reply: Choose Me! {1, 3, 4}, Space: 2. State: Space: 10, CurrList: {1, 3, 4}, Potential: {1, 3, 4} • Group Accepted: {1, 3, 4} [Figure: nodes 1, 3, 4, and 6 with neighbor lists {1, 2, 3, 4, 6}, {1, 4, 6}, {1, 3, 4, 6}, and {1, 3, 4}]
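The Space, CurrList, and Potential values on this slide are consistent with a simple rule: the initiating node adds a replying node when it is still in the potential set, intersects the potential set with the reply's neighbor list, and adds the reply's free space to the running total. The sketch below reproduces the slide's numbers under that reading. The acceptance rule, the function `form_group`, and the attribution of the two replies to nodes 4 and 3 are inferences from the slide, not details it states.

```python
def form_group(initiator, own_neighbors, own_space, replies):
    """Accumulate a group from "Choose Me!" replies.

    replies: list of (node, neighbor_list, free_space) in arrival order.
    Returns the final member set, total space, and a trace of
    (space, curr_list, potential) after each step.
    """
    space = own_space
    curr = {initiator}
    potential = set(own_neighbors)
    trace = [(space, sorted(curr), sorted(potential))]
    for node, neighbors, free in replies:
        neighbors = set(neighbors)
        # Assumed rule: the candidate must be hearable by the group so far,
        # and must itself hear every current member.
        if node in potential and curr <= neighbors:
            curr.add(node)
            potential &= neighbors
            space += free
            trace.append((space, sorted(curr), sorted(potential)))
    return curr, space, trace


members, total_space, trace = form_group(
    initiator=1,
    own_neighbors=[1, 2, 3, 4, 6],
    own_space=4,
    replies=[
        (4, [1, 3, 4, 6], 4),  # "Choose Me! {1, 3, 4, 6} Space: 4"
        (3, [1, 3, 4], 2),     # "Choose Me! {1, 3, 4} Space: 2"
    ],
)
```

The trace steps through Space 4, 8, and 10 with CurrList {1}, {1, 4}, and {1, 3, 4}, ending with the accepted group {1, 3, 4}.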
Table Distribution • Group members decide among themselves how the table will be divided across the group • Table is flooded to the network
Bloom Filter Optimization • Step 1: Hash domain of sensor values onto Bloom Filter • Step 2: Send Bloom Filter to Each Sensor Node • Might produce false positives but never false negatives • Can be used in conjunction with previous REED algorithm [Figure: example values Temp: 20 and Temp: 90 are hashed to bit positions of the Bloom filter 01000010, and the root sends a copy of the filter to every node in the routing tree]
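A minimal Bloom filter sketch shows why this optimization can yield false positives but never false negatives. It assumes the root inserts the sensor values that have at least one joining row and that each node tests its reading before transmitting. The class, the filter size, the use of SHA-256, and the `joining_temps` range are illustrative assumptions; the slide's example filter is only 8 bits wide.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter over sensor values (illustrative sizes)."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array packed into one integer

    def _positions(self, value):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # True for every added value; may also be True for others.
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


# Root side: insert every value that joins with some predicate row.
joining_temps = range(80, 101)  # hypothetical set of joining values
bloom = BloomFilter()
for temp in joining_temps:
    bloom.add(temp)


def should_transmit(temp):
    """Node side: drop a reading only when the filter rules it out."""
    return bloom.might_contain(temp)
```

A reading that is dropped is guaranteed not to join, since every joining value set all of its bits. A reading that passes may still fail the join at the group or the root, which is the false-positive case.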
Cache Diffusion • Cache non-joining ranges on a per node basis • Also will produce false positives but no false negatives [Figure: routing tree of nodes 0-7 with cached non-joining ranges such as 11-20, 23-50, 60-70, and 81-90 stored at individual nodes; example readings 24, 20, 21]
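A per-node cache of non-joining ranges could look like the sketch below, which uses the ranges and readings shown in the slide's figure. The class name, the capacity limit, and the oldest-first eviction policy are assumptions for illustration; the slide does not say how a node chooses which ranges to keep.

```python
class NonJoinCache:
    """Per-node cache of value ranges known not to join."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.ranges = []  # list of inclusive (low, high) pairs

    def add(self, low, high):
        if len(self.ranges) >= self.capacity:
            self.ranges.pop(0)  # evict the oldest range (assumed policy)
        self.ranges.append((low, high))

    def should_forward(self, value):
        """Forward unless the value falls in a cached non-joining range."""
        return not any(low <= value <= high for low, high in self.ranges)


cache = NonJoinCache()
for low, high in [(11, 20), (23, 50), (60, 70), (81, 90)]:
    cache.add(low, high)

# Example readings from the slide's figure.
decisions = {reading: cache.should_forward(reading) for reading in (24, 20, 21)}
```

Readings 24 and 20 fall inside cached ranges and are dropped locally, while 21 is forwarded. A forwarded reading may still turn out not to join (a false positive), but a reading is only dropped when a cached range proves it cannot join.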
Results: Experimental Setup • Ran experiments both in simulation and on real motes • For simulation, 40 sensor nodes arranged in a grid • Use TinyOS Packet Level Simulation • Models CSMA backoff • Carrier sense packet delivery model • Overlap between 2 receptions leads to both being corrupted • Use TinyOS MintRoute for MultiHop Routing Layer
Simulated Results Match Real Results from Motes • Ran REED algorithm on a simple 5 node sensor network
Conclusion • Contributions: • Complex filters -> table of expressions -> join • REED algorithms are capable of • Running with limited amounts of RAM • Remaining robust in the face of message loss and node failure • Experiments show benefits of doing complex join-based filters in the sensor network
Backup Slides [Plot: Number of Transmissions (1000s) vs. Selectivity]
Distributed Join: Group Formation • A group is a set of nodes where every node is in broadcast range of every other node. • Process: • Every node maintains a list of nodes it can hear by listening in on packets • After a random interval, a node P which is not in a group broadcasts a form group request • Every node N which hears that request and is not currently in a group replies to P with a list of neighbors and amount of free space • Node P collects the replies and determines who should be in the group. For every node N which replied, P sends either a group reject or a group accept message. • Group accept message contains a list of nodes in the group [Figure: routing tree of nodes 0-7 annotated with neighbor lists {1,2,5}, {1,2,3,4}, {4,1,3,6}, {5,2,6,7}, {3,1,4}, {7,5,6}, {6,5,7,4}]
Distributed Join: Join Table Distribution • Process: • When a node enters a group, it sends a request to the root for join table data • Per group, the root gives out non-overlapping segments of the join table to every member • Once all the nodes in a group have received join tuples, they begin processing data tuples as a group [Figure: group members in the routing tree send "Get me some tuples! (3)", "Get me some tuples! (2)", and "Get me some tuples! (4)" requests to root 0]
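The root's handout of non-overlapping segments can be sketched as a simple slicing step. The sketch assumes each request tells the root how many tuples the member can hold; the slide does not specify the request format, so the `capacities` mapping, the member ids, and the function `assign_segments` are all hypothetical.

```python
def assign_segments(table, capacities):
    """Give each group member a contiguous, non-overlapping slice of the table.

    capacities: dict mapping member id -> number of tuples it can store
    (an assumed request format). Raises if the group cannot hold the table.
    """
    if sum(capacities.values()) < len(table):
        raise ValueError("group lacks space for the whole join table")
    segments, start = {}, 0
    for member, capacity in capacities.items():
        end = min(start + capacity, len(table))
        segments[member] = table[start:end]
        start = end
    return segments


join_table = [f"row{i}" for i in range(8)]   # placeholder join tuples
capacities = {1: 3, 3: 2, 4: 4}              # hypothetical group and sizes
segments = assign_segments(join_table, capacities)
```

Concatenating the members' segments in order gives back the full table with no row stored twice, which is the property the group needs before it starts processing data tuples.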
Distributed Join: Operation • For nodes not in group: • When generating a data tuple or receiving a data tuple from a child, pass on to parent • When receiving a result from a child, pass on to parent • For nodes in group: • When generating a data tuple or receiving a data tuple from a child, broadcast to group (including self) • Upon receiving a data tuple broadcast from the group, join with stored subset of join table and pass result up to parent • When receiving a result from a child, pass on to parent [Figure: routing tree of nodes 0-7 showing data tuple a being passed up the tree and broadcast within a group]
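The per-node rules on this slide amount to a small dispatch on message type and group membership. The sketch below encodes them as a pure function that returns the actions a node would take. The message names, the action labels, and the `handle` signature are assumptions for illustration rather than TinyDB or REED interfaces.

```python
from enum import Enum


class Msg(Enum):
    DATA = "data"                        # generated locally or from a child
    RESULT = "result"                    # join result from a child
    GROUP_BROADCAST = "group_broadcast"  # data tuple broadcast within group


def handle(in_group, msg, payload, stored_rows=(), predicate=None):
    """Return a list of (action, payload) pairs for one incoming message."""
    if msg is Msg.RESULT:
        # Results always travel up the routing tree unchanged.
        return [("send_to_parent", payload)]
    if msg is Msg.DATA:
        if in_group:
            return [("broadcast_to_group", payload)]
        return [("send_to_parent", payload)]
    if msg is Msg.GROUP_BROADCAST and in_group:
        # Join against this member's stored subset of the join table.
        return [
            ("send_to_parent", (payload, row))
            for row in stored_rows
            if predicate(payload, row)
        ]
    return []


def outside(reading, row):
    """Toy predicate: reading lies outside the row's (low, high) range."""
    return reading < row[0] or reading > row[1]


relay = handle(False, Msg.DATA, 74)
share = handle(True, Msg.DATA, 74)
joined = handle(True, Msg.GROUP_BROADCAST, 74, [(75, 80), (70, 75)], outside)
```

A node outside any group simply relays, a group member rebroadcasts fresh data to its group, and only the group broadcast triggers a local join against the stored rows.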
Related Work • The Gamma [8] and R* [15] systems both looked at ways to horizontally partition a table to perform a distributed join • Different optimization goals • TinyDB [19,20,21] and Cougar [31] both present a range of distributed query processing techniques • No joins • Bonfils and Bonnet [6] propose a scheme for join-operator placement within sensor networks • Look at joins of sensor data, not an external table
Motivating Applications • Industrial Process control • Distributed sensors measure environmental variables • Want to know if exceptional condition is reached • Failure and Outlier Detection • Look for de-correlated sensor readings • Power scheduling • Minimize power consumption by distributing work across sensors
Results: Experimental Setup • Sensor nodes in a 2 x 20 grid • Use TinyOS Packet Level Simulation • Models CSMA backoff • Carrier sense packet delivery model • Overlap between 2 receptions leads to both being corrupted • Use TinyOS MintRoute for MultiHop Routing Layer [Figure: 2 x 20 grid of sensor nodes with the root marked and a distance label of 5 feet]