
Towards a Model-Based Data Collection Framework for Environmental Monitoring Networks


Presentation Transcript


  1. Towards a Model-Based Data Collection Framework for Environmental Monitoring Networks. Research Proposal. Jayant Gupchup, Department of Computer Science, Johns Hopkins University

  2. 75 m

  3. Background – II (motes)
     • Communication (radio)
     • Computing, storage
     • Sensors
     • Power: 3.6 V, 19.0 Ah
     “Sending one packet costs same energy as thousands of CPU cycles” – Matt Welsh, Harvard

  4. All data are not equal

  5. Task list
     • Define “Informative Periods”
     • Algorithm: find informative (or interesting) periods
     • Algorithm: sampling planner based on the interesting periods
     • Evaluation

  6. Initial Direction & Main Results
     • Principal Component Analysis (PCA) based approach
     • Classification-based approach to detecting events

  7. PCA-based approach: Motivation
     • Observations:
       • Well-behaved days show a typical signature (bell-shaped pattern)
       • Rainy days (or periods) deviate from this signature
       • Strong trend component from one day to the next
       • Diurnal and trend features are seen in most environmental modalities
     • PCA is good at capturing variation in a collection of similar curves

  8. PCA – Toy Example
     [Figure: Variable #1 vs. Variable #2, with the first principal component marked]
     • Finds directions of maximum variance
     • Reduces dimensionality (truncate to the first “p” directions)
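A minimal sketch of the toy example, assuming NumPy and made-up two-variable data (the helper name `pca`, the data, and the choice p = 1 are illustrative and not from the slides). It finds the directions of maximum variance from the sample covariance and truncates to the first p of them.

```python
import numpy as np

def pca(X, p):
    """Return the mean, all eigenvalues (descending), and the first p principal directions."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance of the columns
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    return mean, eigvals[order], eigvecs[:, order[:p]]

# Hypothetical data: Variable #2 is roughly twice Variable #1 plus a little noise.
rng = np.random.default_rng(0)
t = rng.normal(size=500)
X = np.column_stack([t, 2.0 * t + 0.1 * rng.normal(size=500)])

mean, eigvals, V = pca(X, p=1)
first_pc = V[:, 0]          # direction of maximum variance, close to +/-(1, 2)/sqrt(5)
scores = (X - mean) @ V     # each 2-D point reduced to a single coordinate
print("first principal component:", first_pc)
print("fraction of variance kept:", eigvals[0] / eigvals.sum())
```

Because the two variables are almost perfectly correlated here, keeping only the first direction loses very little variance, which is the point of truncating to the first “p” directions.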

  9. Eigenmodes for Air Temperature: Directions of Maximum Variance

  10. Discriminating event days from well-behaved days [5]
      • Well-behaved days: “Fits model well”
      • Event day: “Large residuals”
      [5] J. Gupchup, R. Burns, A. Terzis, and A. Szalay. Model-Based Event Detection in Wireless Sensor Networks. Proceedings of the Workshop on Data Sharing and Interoperability on the World-Wide Sensor Web (DSI), ACM/IEEE, 2007.
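The idea of separating days by how well they fit the model can be sketched as follows. This is an illustration under stated assumptions, not the method of [5]: the synthetic temperature curves, the bell shape, the size of the event drop, and the helper names `make_day`, `fit_basis`, and `residual_norm` are all hypothetical. A truncated PCA basis is fitted to well-behaved days, and the norm of what the basis cannot reconstruct is compared for a well-behaved day and an event day.

```python
import numpy as np

HOURS = np.arange(24)
# Hypothetical bell-shaped diurnal signature peaking mid-afternoon.
BELL = np.exp(-0.5 * ((HOURS - 14) / 4.0) ** 2)

def make_day(rng, event=False):
    """Synthetic hourly air temperature: daily offset (trend) + scaled bell + noise."""
    day = 10 + rng.normal(0, 3) + (8 + rng.normal(0, 1)) * BELL + rng.normal(0, 0.2, 24)
    if event:
        day[12:18] -= 6.0  # made-up afternoon temperature drop, e.g. during rain
    return day

def fit_basis(days, p):
    """Mean day and the first p principal directions (as columns) of the training days."""
    mean = days.mean(axis=0)
    _, _, vt = np.linalg.svd(days - mean, full_matrices=False)
    return mean, vt[:p].T

def residual_norm(day, mean, basis):
    """Norm of the part of a day that the truncated basis cannot reconstruct."""
    centered = day - mean
    return float(np.linalg.norm(centered - basis @ (basis.T @ centered)))

rng = np.random.default_rng(1)
train = np.array([make_day(rng) for _ in range(60)])  # well-behaved days only
mean, basis = fit_basis(train, p=2)

normal_res = residual_norm(make_day(rng), mean, basis)
event_res = residual_norm(make_day(rng, event=True), mean, basis)
print(f"well-behaved day residual: {normal_res:.2f}, event day residual: {event_res:.2f}")
```

In this sketch the event day leaves a much larger residual because the sudden drop lies outside the span of the offset and bell-shaped directions learned from well-behaved days. A detection threshold is deliberately left out, since the slides do not specify one.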

  11. Offline to Online
      • Offline
        • Basis locked from midnight to midnight
        • Access to the complete 24-hour signal
      • Online
        • Access to the signal up to the current hour “d”
        • Basis locked to a 24-hour window from hour “d” to hour “d”
        • Vectors cyclically shifted by “d”
        • Eigenvalues remain the same
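The claim that cyclic shifting leaves the eigenvalues unchanged can be checked numerically. The sketch below assumes NumPy, synthetic hourly data, and the simplification that each 24-hour row is treated as periodic, so a window starting at hour d is a cyclic rotation of the midnight-to-midnight vector; the data, the value d = 9, and the helper `eig_desc` are hypothetical.

```python
import numpy as np

def eig_desc(cov):
    """Eigenvalues (descending) and matching eigenvectors of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Hypothetical training set: 80 days of 24 hourly samples, midnight to midnight.
rng = np.random.default_rng(2)
hours = np.arange(24)
bell = np.exp(-0.5 * ((hours - 14) / 4.0) ** 2)
days = (10 + rng.normal(0, 3, (80, 1))
        + (8 + rng.normal(0, 1, (80, 1))) * bell
        + rng.normal(0, 0.2, (80, 24)))

d = 9  # current hour; column j of the shifted data holds original hour (j + d) % 24
shifted_days = np.roll(days, -d, axis=1)

vals, vecs = eig_desc(np.cov(days, rowvar=False))
vals_d, vecs_d = eig_desc(np.cov(shifted_days, rowvar=False))

# Shifting permutes the rows and columns of the covariance matrix, so the
# eigenvalues are unchanged and each eigenvector is rotated by the same shift.
same_eigenvalues = np.allclose(vals, vals_d)
alignment = abs(np.roll(vecs[:, 0], -d) @ vecs_d[:, 0])  # 1.0 means identical up to sign
print("eigenvalues unchanged:", same_eigenvalues)
print("alignment of shifted first eigenvector:", alignment)
```

Under this assumption the offline basis never has to be recomputed on the mote: rotating the stored vectors by d gives the basis for the “d” to “d” window.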

  12. Online Prediction Residuals

  13. Summary
      • PCA model is effective in finding informative periods
      • Need to know
        • Shift value, “d”
        • “sundial” [6]
      • But … why not use barometric pressure too?
      [6] Jayant Gupchup, Razvan Musăloiu-E, Alex Szalay, Andreas Terzis. Sundial: Using Sunlight to Reconstruct Global Timestamps. To appear in the Proceedings of the 6th European Conference on Wireless Sensor Networks (EWSN 2009).

  14. Classification-Based Approach
      • 2-class problem: {Rainy, Sunny}
      • Most classifiers provide probabilities
      • Sample based on those probabilities
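One way to read “sample based on those probabilities” is to let the classifier's event probability set the sampling interval. The sketch below is only an illustration: the logistic score, its weight and bias, the pressure-drop feature, and the 60 s and 1200 s interval bounds are all made-up values, not taken from the slides.

```python
import math

def rain_probability(pressure_drop_hpa, weight=1.5, bias=-3.0):
    """Hypothetical logistic classifier: P(Rainy) from the recent barometric pressure drop."""
    return 1.0 / (1.0 + math.exp(-(weight * pressure_drop_hpa + bias)))

def sampling_interval_s(p_event, slow_s=1200, fast_s=60):
    """Interpolate linearly between a slow interval (p_event = 0) and a fast one (p_event = 1)."""
    return round(slow_s - p_event * (slow_s - fast_s))

for drop in (0.0, 2.0, 4.0):
    p = rain_probability(drop)
    print(f"pressure drop {drop:.1f} hPa -> P(Rainy) = {p:.2f}, "
          f"sample every {sampling_interval_s(p)} s")
```

A linear mapping is the simplest choice; a planner that also weighs energy cost or neighbor opinions, as listed under future work, would replace `sampling_interval_s`.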

  15. Future Work - I
      • Task 1: Model Improvement
        • Study effect (or correlation) of
          • Event magnitude
          • Inter-arrival time
        • Explore Incremental and Robust PCA [7], [8]
        • Explore label-based classifiers
        • Combine Air Temperature, Barometric Pressure and Light modalities (joint work with Zhiliang Ma, Dept. of Applied Mathematics and Statistics)
      • Task 2: Sampling Planner
        • Prediction error and/or Probability of Event (PoE)
        • Neighbor opinion(s)
        • Acquisition cost of each sensor
      [7] Tamas Budavari, Vivienne Wild, Alexander S. Szalay, Laszlo Dobos, Ching-Wa Yip. Reliable Eigenspectra for New Generation Surveys. MNRAS, accepted for publication.
      [8] A.J. Connolly, A.S. Szalay. A Robust Classification of Galaxy Spectra: Dealing with Noisy and Incomplete Data. Astronomical Journal.

  16. Future Work - II
      • Task 3: Evaluation
        • Define cost and benefit functions
        • Compare the proposed approach with existing systems
      • Task 4: Application and Extensions
        • Identify the class of applications where the framework can be used

  17. Questions?

  18. Overview: Proposed Framework
      [Diagram labels: Mote Storage; <X1, X2, ..., Xt>; Model <θ1, θ2, ..., θn>; <Xt+1, Xt+2, ..., Xt+h>; Prediction Error; Prob(Event); Sampling Scheduler; Update Model]

  19. Properties of our PCA model
      • Transformation: Y = X*V
        • Projected variables are uncorrelated
      • Compression/Multi-resolution
        • Achieves a massive compression
        • From previous slide, compression ratio = 96/4 = 24X
      • Online basis
        • Basis for any “d” to “d” hour window using cyclic shifting
      • Re-projection error bounds
        • Sum of “left out” eigenvalues
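These properties can be verified on synthetic data. The sketch below assumes NumPy, made-up smooth daily curves, and the reading that the slide's ratio refers to 96 samples per day reduced to 4 coefficients; none of the data comes from the slides. It checks that the projected variables Y = X*V are uncorrelated and that the average squared re-projection error after truncation equals the sum of the left-out eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(3)
n_days, n_samples, p = 200, 96, 4   # 96 samples per day, keep 4 components
t = np.linspace(0, 2 * np.pi, n_samples, endpoint=False)

# Hypothetical smooth daily curves: random offset, two harmonics, small noise.
X = (rng.normal(0, 3, (n_days, 1))
     + rng.normal(0, 2, (n_days, 1)) * np.sin(t)
     + rng.normal(0, 1, (n_days, 1)) * np.cos(t)
     + rng.normal(0, 0.1, (n_days, n_samples)))

Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (n_days - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], eigvecs[:, order]

# Transformation Y = X*V: the covariance of Y is diagonal (uncorrelated variables).
Y = Xc @ V
cov_Y = Y.T @ Y / (n_days - 1)

# Keep p coefficients per day, reconstruct, and measure the re-projection error.
X_hat = Y[:, :p] @ V[:, :p].T
mean_sq_error = np.sum((Xc - X_hat) ** 2) / (n_days - 1)
left_out = eigvals[p:].sum()
compression = n_samples / p

print("compression ratio:", compression)
print("re-projection error:", mean_sq_error, "sum of left-out eigenvalues:", left_out)
```

The error is normalized by n_days - 1 to match the covariance estimate; with a different normalization the two quantities agree only up to a constant factor.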

  20. Preliminary Results
      • Rain prediction
        • Use barometric pressure
        • Simple linear classifiers perform well
        • Classification accuracy approaching 76%

  21. Eigenvector 5

  22. Online Prediction

  23. Literature Survey
      • Barbie-Q (BBQ) [1]
        • Approximate query answering (range, value queries)
        • Sensing cost differential … energy-saving opportunities!
        • When predictions fall outside the confidence interval, collect samples
        • Shortcomings
          • Not collecting long-term environmental data
          • Does not consider the role played by events
      • PRESTO [2]
        • Reduce storage costs => reduce communication costs
        • Seasonal AutoRegressive Integrated Moving Average (S-ARIMA) [3] model for predictions
        • Model known to node and base station
        • When predictions are within confidence bounds, do not store collected samples
        • Base station can reconstruct missing samples
        • Shortcomings
          • No adaptive sampling on interesting events
      [1] Amol Deshpande, et al. Model-Driven Data Acquisition in Sensor Networks. VLDB 2004.
      [2] Ming Li, Deepak Ganesan, and Prashant Shenoy. PRESTO: Feedback-driven Data Management in Sensor Networks. USENIX 2006.
      [3] P.J. Brockwell, R.A. Davis. Introduction to Time Series and Forecasting. 2002.

  24. Related Work
      • Near-Optimal Sensor Placement [4]
        • Find the most informative locations to place sensors
        • At the same time, keep the network connected
        • Solution: information-theoretic (entropy) & Steiner tree approximation
      • Differences
        • Focus is on finding informative locations in an offline fashion
        • Solution addresses spatial variability
        • Sampling rate does not change once locations are fixed
      [4] A. Krause, C. Guestrin, A. Gupta, J. Kleinberg. Near-optimal Sensor Placements: Maximizing Information while Minimizing Communication Cost. In Proc. of Information Processing in Sensor Networks (IPSN), 2006.
