Ian Yap, Josh Hyman UCLA ianyap@ucla.edu, josh@cs.ucla.edu Student Presentation #2: Storage In Sensor Networks
What are we going to cover today? • Introduction! • Ultra-low power storage option for sensors! • PRESTO: A feedback-driven sensor network data management system! • TSAR: A storage architecture for sensor networks! • Multi-resolution storage and search!
Scaling high-bandwidth sensor network deployments • We have made a good start at building scalable, long-term wireless sensor network deployments that deal with low-bandwidth, low-duty rate applications. • Micro-climate monitoring system at James Reserve (CENS-UCLA), Bird monitoring at Great Duck Island – (Intel-U.C.Berkeley) • low-data rate (few samples/minute), medium-scale (100s of nodes) deployments. • Duty-cycling + low-power listen/transmit, simple aggregation schemes (TinyDiffusion/TinyDB). • We have very little understanding of how to scale high-bandwidth sensor network applications (involving vibration/acoustic/image sensors) where significant data rates can be expected.
Transitioning from centralized to distributed storage and search • Current data acquisition systems • Multi-hop wireless data acquisition using motes • Distributed in-network storage and search
Data Correlation vs. Decentralization • Can existing storage and search systems satisfy design goals? • [Figure: exploited data correlation (none, temporal, spatial) plotted against degree of decentralization (centralized, hierarchical, fully distributed). Systems shown: centralized data collection; web caches; peer-to-peer (Gnutella, CAN, CHORD); geo-spatial data mining; progressive transmission; long-term wireless sensor network deployments]
Ultra-Low Power Data Storage for Sensor Networks • Each generation of sensor platforms reduces computation and communication costs, but storage costs have not tracked this trend. • Applications working with high data volumes: camera sensor networks; acoustic tracking networks for animals, vehicles, etc.; seismic and biological sensor networks • [Figure: storage efficiency versus computation and communication efficiency for the Mica, Mica2, MicaZ, and Telos platforms]
Storage Media • Toshiba parallel NAND flash adapter • Mica2: Atmel serial NOR flash • TelosB: STM serial NOR flash • MMC adapter
Comparison of the per-byte cost of operations • * Without ECC. The cost of performing ECC in software is approximately 0.026 µJ/byte
Computation and Communication Costs • Parallel NAND flash is the most energy-efficient storage option • Its per-byte energy cost is more than 200 times lower than the cost of communication • Its per-byte energy cost is comparable to the cost of computation • Enables ultra-low power, almost infinite storage (~1 GB) for sensors
PRESTO: PREdictive STOrage Architecture for Sensor Networks • Tagline: “Suddenly as if by magic!” • Two-tier sensor data management architecture from the University of Massachusetts, Amherst. • Many sensors report to higher-tier proxies, which respond to user queries
Motivations for the Presto Sensor Data Management System • System needs to detect unusual data trends! • System should support archival queries! • System should be adaptive!
The three fundamental goals of Presto • Model-Driven Push • Support for archival queries • Adaptation to Query and Data Dynamics
Four components of the Proxy! • Modeling and Prediction Engine • Query Processing at the Proxy • Proxy Cache • Failure Detection
Proxy Component 1: Modeling and Prediction Engine • The main idea is to use a set of past sensor data to predict future values at a specific time t • The engine is based on the seasonal ARIMA (Auto-Regressive Integrated Moving Average) time-series model, also known as SARIMA • The specific SARIMA model used here is the Box-Jenkins seasonal model • The proxy initially constructs this model by estimating its parameters
Proxy Component 1: Modeling and Prediction Engine • The Box-Jenkins model has order (p,d,q) × (P,D,Q)_S, where S is the seasonal period • For a temperature model, it reduces to a single equation in the previous value, values from the previous season, and past prediction errors • The proxy needs to collect a data set from each sensor to estimate the parameters (θ and Θ) of the equation • Once the model is built, predicting a sensor value at time t is cheap • X_{t-1} is the previous value and X_{t-S} is the value at this time instant in the previous season; e_{t-S} is the prediction error at time t-S
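As a sketch of what the temperature-model equation looks like, assume order (0,1,1) × (0,1,1)_S, which is the order quoted for the temperature model on the query-processing slide, and the standard Box-Jenkins form with backshift operator B (B X_t = X_{t-1}). The sign convention for θ and Θ is an assumption and may differ from the one on the original slide. Expanding both sides gives the model, and dropping the unknown current error e_t gives the one-step prediction:

```latex
\begin{align*}
(1 - B)(1 - B^{S})\,X_t &= (1 - \theta B)(1 - \Theta B^{S})\,e_t \\
X_t &= X_{t-1} + X_{t-S} - X_{t-S-1}
       + e_t - \theta\,e_{t-1} - \Theta\,e_{t-S} + \theta\Theta\,e_{t-S-1} \\
\hat{X}_t &= X_{t-1} + X_{t-S} - X_{t-S-1}
       - \theta\,e_{t-1} - \Theta\,e_{t-S} + \theta\Theta\,e_{t-S-1}
\end{align*}
```

Only two parameters (θ and Θ) and a handful of stored values are needed per prediction, which is why evaluating the model is cheap once it has been fitted.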
Proxy Component 2: Query Processing at the Proxy • The proxy calculates a confidence interval for the predictions of its model (in this case a temperature (0,1,1) × (0,1,1) model) • If the confidence interval falls within the query's precision (error tolerance), the proxy can answer with the predicted value. This allows it to handle many queries locally without disturbing the sensors
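The decision rule above can be sketched in a few lines. This is a minimal illustration, not Presto's implementation: the function names, the Gaussian 95% interval (z = 1.96 times the residual standard deviation), and the `pull_from_sensor` callback are all assumptions made for the example.

```python
from typing import Callable, Tuple


def confidence_half_width(residual_std: float, z: float = 1.96) -> float:
    """Half-width of the interval around a prediction.

    Assumption: prediction errors are roughly Gaussian, so a 95% interval
    is z * sigma with z = 1.96.
    """
    return z * residual_std


def answer_query(
    predicted: float,
    residual_std: float,
    tolerance: float,
    pull_from_sensor: Callable[[], float],
) -> Tuple[float, str]:
    """Answer from the model when it is precise enough, else pull from the sensor."""
    if confidence_half_width(residual_std) <= tolerance:
        return predicted, "model"       # handled locally at the proxy
    return pull_from_sensor(), "sensor"  # costs a round trip to the sensor


if __name__ == "__main__":
    # Tight model (sigma = 0.2) satisfies a 0.5-degree tolerance locally.
    print(answer_query(20.0, 0.2, 0.5, lambda: 21.3))
    # Looser model (sigma = 0.5) forces a pull.
    print(answer_query(20.0, 0.5, 0.5, lambda: 21.3))
```

The looser the query tolerance, the more queries stay at the proxy and the less energy the sensors spend answering pulls.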
Proxy Component 3: Proxy Cache • Assumed to be big enough to store all sensor data previously predicted or fetched by the proxy • When the proxy can handle a query locally, it takes the data from here • If the proxy is forced to pull data from a sensor, or a sensor has pushed new data to the proxy, the cache uses the new value to improve its predicted entries (interpolation) • There are two interpolation heuristics: forward and backward
Proxy Component 4: Failure Detection • Sensors are notoriously fragile in both hardware and software • Presto's predictive techniques seek to reduce communication between the sensor and the proxy, so a silent sensor is not necessarily a failed one • Detection is therefore proxy-driven • When the proxy needs to pull data from a sensor and the sensor does not respond, it flags a sensor error • The latency in detecting the error depends on how often the proxy has to fetch sensor data • The user can explicitly force the proxy to pull data, which also checks the sensors' status
Three simple tasks of the Sensor! • Use the model to decide which observed (sensed) data to push up • Maintain a local archive of ALL observations (using NAND flash) • Respond efficiently to pull requests from the proxy
Sensor Operation • Only samples that deviate significantly from the prediction are pushed out! • Local archive uses the energy-efficient NAND flash mentioned earlier, which allows more data to be stored
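The sensor-side behaviour described above can be sketched as follows. This is an illustration under assumptions, not Presto's code: the function names are hypothetical, "deviates significantly" is modelled as an absolute error above a threshold `delta`, and a Python list stands in for the NAND flash archive.

```python
from typing import List, Sequence, Tuple


def should_push(observed: float, predicted: float, delta: float) -> bool:
    """Push only when the model's prediction misses by more than delta."""
    return abs(observed - predicted) > delta


def run_sensor(
    samples: Sequence[float],
    predictions: Sequence[float],
    delta: float,
) -> Tuple[List[Tuple[int, float]], List[Tuple[int, float]]]:
    """Archive every sample locally; return (archive, pushed)."""
    archive: List[Tuple[int, float]] = []  # stand-in for the flash archive
    pushed: List[Tuple[int, float]] = []   # what is sent to the proxy
    for t, (observed, predicted) in enumerate(zip(samples, predictions)):
        archive.append((t, observed))      # ALL observations are kept
        if should_push(observed, predicted, delta):
            pushed.append((t, observed))
    return archive, pushed


if __name__ == "__main__":
    archive, pushed = run_sensor([20.0, 20.4, 23.0], [20.1, 20.2, 20.3], delta=1.0)
    print(len(archive), pushed)
```

Because the full archive stays on the sensor, the proxy can still pull any sample that was never pushed.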
How Presto adapts • Dynamic nature of sensor data! • Sensor data in the cache can go stale • The model might be outdated, so its parameters need to be recalculated! • The model is retrained with incremental data • Dynamic nature of queries! • Query behavior is also dynamic • The threshold parameter δ affects how often the sensor pushes data to the proxy • The proxy uses a moving-average window to track the error tolerances of incoming queries. When this average changes by more than a pre-defined threshold, a new δ is calculated and sent to the sensor(s)
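The query-driven adaptation can be sketched as a small moving-average tracker. The class name, the window size, and in particular the rule that the new δ equals the mean tolerance are assumptions for illustration; the slides do not say how Presto computes the new δ.

```python
from collections import deque
from typing import Deque, Optional


class DeltaTuner:
    """Tracks query error tolerances and proposes a new push threshold."""

    def __init__(self, window: int, change_threshold: float, initial_delta: float):
        self.tolerances: Deque[float] = deque(maxlen=window)  # moving window
        self.change_threshold = change_threshold
        self.delta = initial_delta
        self.last_average = initial_delta

    def observe(self, tolerance: float) -> Optional[float]:
        """Record one query's tolerance.

        Returns the new delta when it should be sent to the sensors,
        otherwise None.
        """
        self.tolerances.append(tolerance)
        average = sum(self.tolerances) / len(self.tolerances)
        if abs(average - self.last_average) > self.change_threshold:
            self.last_average = average
            self.delta = average  # assumption: delta follows the mean tolerance
            return self.delta
        return None


if __name__ == "__main__":
    tuner = DeltaTuner(window=3, change_threshold=0.5, initial_delta=1.0)
    for tolerance in (1.0, 1.2, 3.0):
        print(tolerance, tuner.observe(tolerance))
```

Sending δ only when the average moves past the threshold keeps the adaptation itself from generating steady proxy-to-sensor traffic.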
Actual Presto Implementation • Temperature Sensing Application for James Reserve • Proxy • Stargate, with 802.11b and Emstar transceiver • Emstar running on the Stargate also simulates additional multiple virtual sensor nodes • Sensor • Telos, with TinyOS
Experimental Evaluation • Three micro-benchmarks to evaluate individual components • Performance of Model-Driven Push • Presto Scalability • How well Presto adapts • Failure Detection
Micro-benchmarks • Energy Consumption • Communication Latency • Asymmetric Resource Usage
Benchmark 1: Energy Consumption • Telos measured against a Mica2 with an attached 1MB NAND flash • The only area where the Telos loses out is storage
Benchmark 2: Communication Latency • Latency of querying a sensor node • Based on varying duty cycles on the Mica2, since there is no TinyOS implementation of duty-cycling for the Telos • The Mica2 uses B-MAC • High latency indicates that the proxy, not the sensor, should handle most of the incoming queries
Benchmark 3: Asymmetric Resource Usage • Verifies the claim that it is efficient to do model estimation on the proxy and not on the sensor • Also shows that using the model on the sensor is energy-efficient • Most importantly, shows that the latency and energy consumption on the Telos mote are reasonably small
Performance of Model-Driven Push (Simulated in Matlab) • Prediction Error
Presto Scalability • Impact of Network Size • Impact of Query Rate
TSAR*: A Two Tier Sensor Storage Architecture Using Interval Skip Graphs (*Tiered Storage ARchitecture) Slides adapted from Presto Web site, University of Massachusetts at Amherst
Key Ideas in TSAR • Works as a component of PRESTO • Exploit storage trends for archival. • Use cheap, low-power, high capacity flash memory in preference to communication. • Index at proxies and store at sensors. • Exploit proxy resources to conserve sensor resources and improve system performance. • Extract key searchable attributes. • Distill sensor data into concise attributes such as ranges of time or value that may be used for location and retrieval but require less energy to transmit.
TSAR Architecture • 1. Interval Skip Graph-based index between proxies (the distributed index). • Exploit proxy resources to locate data stored on sensors in response to queries. • 2. Summarization process (the summarization function) • Extracts identifying information: e.g. time period during which events were detected, range of event values, etc. • 3. Local sensor data archive (the sensor node archive) • Stores detailed sensor information: e.g. images, events.
Example - Camera Sensing • Sensor archives information and transmits summary to proxy. • [Figure: a Cyclops camera image is written to sensor node storage under the handle <id>; the summarize step produces the summary Birds(t1,t2)=1 <id>]
Example - Indexing • Summary and location information are stored and indexed at proxy. • [Figure: the summary Birds(t1,t2)=1 <id> is inserted as the entry (Birds, t1,t2, 1, <id>) into the index held by the network of proxies]
Example - Querying and Retrieval • Query is sent to any proxy. • [Figure: the query "Birds in interval (t1,t2)?" arrives at a proxy holding the index entry (Birds, t1,t2, 1, <id>)]
Example - Querying and Retrieval • Index is used to locate sensors holding matching records. • [Figure: the <id> in the matching index entry points to the Cyclops camera sensor that holds the record]
Example - Querying and Retrieval • Record is retrieved from storage and returned to application.
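The camera example can be walked through end to end with a toy model. Everything here is hypothetical naming for illustration (`Sensor`, `Proxy`, `store`, `insert`, `query`), a single proxy stands in for the network of proxies, and a linear scan of a list stands in for the distributed index.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (attribute, t1, t2, sensor name, record id): the summary sent to the proxy
Summary = Tuple[str, int, int, str, int]


@dataclass
class Sensor:
    name: str
    archive: Dict[int, bytes] = field(default_factory=dict)  # local flash stand-in

    def store(self, record: bytes, attribute: str, t1: int, t2: int) -> Summary:
        """Archive the full record locally and return only a small summary."""
        record_id = len(self.archive)
        self.archive[record_id] = record
        return (attribute, t1, t2, self.name, record_id)


class Proxy:
    def __init__(self) -> None:
        self.index: List[Summary] = []
        self.sensors: Dict[str, Sensor] = {}

    def register(self, sensor: Sensor) -> None:
        self.sensors[sensor.name] = sensor

    def insert(self, summary: Summary) -> None:
        self.index.append(summary)

    def query(self, attribute: str, t1: int, t2: int) -> List[bytes]:
        """Find summaries overlapping [t1, t2], then fetch records from sensors."""
        results = []
        for attr, s1, s2, name, record_id in self.index:
            if attr == attribute and s1 <= t2 and t1 <= s2:
                results.append(self.sensors[name].archive[record_id])
        return results


if __name__ == "__main__":
    camera = Sensor("cyclops-1")
    proxy = Proxy()
    proxy.register(camera)
    proxy.insert(camera.store(b"image-bytes", "Birds", 10, 20))
    print(proxy.query("Birds", 15, 25))
```

Only the short summary tuple crosses the radio at insert time; the image itself moves only when a query actually matches.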
Outline of Talk • Introduction and Motivation • Architecture • Example • Design • Skip Graph • Interval Search • Interval and Sparse Interval Skip Graph • Experimental Results • Conclusion and Future Directions
Goals of Index Structure • The index should: • support range queries over time or value, • be fully distributed among proxies, and • support interval keys indicating a range in time or value.
What is a Skip Graph? (Aspnes & Shah, 2003; Harvey et al., 2003) • Distributed extension of Skip Lists (Pugh ‘90): • Probabilistically balanced - no global rebalancing needed. • Ordered by key - provides efficient range queries. • Fully distributed - data is indexed in place. • Properties: • O(log N) search and insert • No single root - load balancing, robustness • [Figure: example skip graph over the keys 2, 3, 5, 6, 9, 12, 18, 19; each node holds a single key and its associated pointers]
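To make the structure concrete, here is a small sketch of how a skip graph's levels are formed. Each node draws a random bit string (its membership vector), and at level i the nodes that share the same i-bit prefix are linked into one sorted list. This sketch builds the lists centrally for illustration only; a real skip graph is distributed and grows by inserts. The function name, the number of levels, and the seed are assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def build_skip_graph_levels(
    keys: Sequence[int], levels: int = 3, seed: int = 0
) -> Tuple[Dict[int, str], List[Dict[str, List[int]]]]:
    """Return (membership vectors, per-level sorted lists keyed by prefix)."""
    rng = random.Random(seed)
    vectors = {k: "".join(rng.choice("01") for _ in range(levels)) for k in keys}
    per_level: List[Dict[str, List[int]]] = []
    for level in range(levels + 1):
        lists: Dict[str, List[int]] = defaultdict(list)
        for key in sorted(keys):                 # every list stays ordered by key
            lists[vectors[key][:level]].append(key)
        per_level.append(dict(lists))
    return vectors, per_level


if __name__ == "__main__":
    _, per_level = build_skip_graph_levels([2, 3, 5, 6, 9, 12, 18, 19])
    for level, lists in enumerate(per_level):
        print(level, lists)
```

Level 0 is a single list of all keys, which is what makes range queries cheap; higher levels are progressively sparser lists that act as shortcuts, and because every node sits in some list at every level there is no single root.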
Interval search • Given intervals [low,high] and query X: 1 - order by low; 2 - find first interval with high >= X; 3 - search until low > X • Example query: x=4 over the intervals [0,3], [1,5], [2,3], [2,4], [5,8], [6,10], [8,9], drawn on an axis from 0 to 10
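The three steps can be sketched directly, reading step 2 as "the first interval whose high is at least X" (an interval with high below X cannot contain X). The function name is hypothetical and a plain sorted list stands in for the distributed index.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]


def interval_search(intervals: Sequence[Interval], x: int) -> List[Interval]:
    """Return every [low, high] interval that contains x."""
    # Step 1: order by low.
    ordered = sorted(intervals, key=lambda interval: interval[0])
    # Step 2: find the first interval whose high reaches x.
    start = next(
        (i for i, (_, high) in enumerate(ordered) if high >= x), len(ordered)
    )
    # Step 3: scan forward until low passes x; later intervals start too late.
    matches = []
    for low, high in ordered[start:]:
        if low > x:
            break
        if high >= x:
            matches.append((low, high))
    return matches


if __name__ == "__main__":
    slide_intervals = [(8, 9), (6, 10), (5, 8), (2, 4), (2, 3), (1, 5), (0, 3)]
    print(interval_search(slide_intervals, 4))
```

For the slide's query x=4, the scan starts at [1,5] and stops at [5,8], so only a short run of the ordered list is visited.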
Simple Interval Skip Graph • Method: index two increasing values, low and max, where max is the largest high seen so far in low order; search on either as needed. • Example (interval: max): 0-3: 3, 0-1: 3, 1-5: 5, 2-4: 5, 5-8: 8, 6-10: 10, 8-9: 10, 9-12: 12 • Derived from the Interval Tree (Cormen et al., 1990) • Interval keys: YES • log N search: YES • log N update: NO (worst case O(N))
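A small sketch shows why search is logarithmic but update is not. Sorted Python lists and `bisect` stand in for the skip graph here, and the class and method names are hypothetical. Because max is non-decreasing it can be binary-searched, but inserting one wide interval can change the max key of every entry after it.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]


class SimpleIntervalIndex:
    """Intervals ordered by low, each paired with a running-maximum key."""

    def __init__(self, intervals: Sequence[Interval]):
        self.intervals = sorted(intervals, key=lambda interval: interval[0])
        self.max_keys = self._running_max()

    def _running_max(self) -> List[int]:
        return list(accumulate((high for _, high in self.intervals), max))

    def search(self, x: int) -> List[Interval]:
        start = bisect_left(self.max_keys, x)  # first entry with max >= x
        matches = []
        for low, high in self.intervals[start:]:
            if low > x:
                break
            if high >= x:
                matches.append((low, high))
        return matches

    def insert(self, low: int, high: int) -> int:
        """Insert an interval; return how many existing max keys had to change."""
        position = bisect_right([l for l, _ in self.intervals], low)
        old_tail = self.max_keys[position:]
        self.intervals.insert(position, (low, high))
        self.max_keys = self._running_max()
        new_tail = self.max_keys[position + 1:]
        return sum(1 for old, new in zip(old_tail, new_tail) if old != new)


if __name__ == "__main__":
    index = SimpleIntervalIndex(
        [(0, 3), (0, 1), (1, 5), (2, 4), (5, 8), (6, 10), (8, 9), (9, 12)]
    )
    print(index.max_keys)
    print(index.search(4))
    print(index.insert(0, 20))  # a wide interval near the front
```

With the slide's eight intervals, inserting [0,20] forces the max key of all six later entries to be rewritten, which is the worst-case O(N) update the slide refers to.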