An adaptive modular approach to the mining of sensor network data
G. Bontempi, Y. Le Borgne (1)
{gbonte,yleborgn}@ulb.ac.be
Machine Learning Group, Université Libre de Bruxelles – Belgium
(1) Supported by the COMP2SYS project, sponsored by the HRM program of the European Community (MEST-CT-2004-505079)
Outline
• Wireless sensor networks: overview
• Machine learning in WSN
• An adaptive two-layer architecture
• Simulation and results
• Conclusion and perspectives
Sensor networks: overview
• Goal: enable sensing tasks over an environment
• Desiderata for the nodes:
  • Autonomous power
  • Wireless communication
  • Computing capabilities
Smart dust project
• Smart dust: get the mote size down to 1 mm³
• Berkeley "Deputy dust" (2001):
  • 6 mm³
  • Solar powered
  • Acceleration and light sensors
  • Optical communication
  • Low cost in large quantities
Currently available sensors
• Crossbow: Mica / Mica dot
  • uProc: 4 MHz, 8-bit Atmel RISC
  • Radio: 40 kbit at 900/450/300 MHz, or 250 kbit at 2.4 GHz (MicaZ, 802.15.4)
  • Memory: 4 K RAM / 128 K program flash / 512 K data flash
  • Power: 2 x AA or coin cell
• Intel: iMote
  • uProc: 12 MHz, 16-bit ARM
  • Radio: Bluetooth
  • Memory: 64 K SRAM / 512 K data flash
  • Power: 2 x AA
• MoteIV: Telos
  • uProc: 8 MHz, 16-bit TI RISC
  • Radio: 250 kbit at 2.4 GHz (802.15.4)
  • Memory: 2 K RAM / 60 K program flash / 512 K data flash
  • Power: 2 x AA
Applications
• Wildfire monitoring
• Ecosystem monitoring
• Earthquake monitoring
• Precision agriculture
• Object tracking
• Intrusion detection
• …
Challenges for…
• Electronics
• Networking
• Systems
• Databases
• Statistics
• Signal processing
• …
Machine learning and WSN
• Local scale:
  • Spatio-temporal correlations
  • Local predictive model identification
  • Can be used to:
    • Reduce sensor communication activity
    • Predict values for malfunctioning sensors
Machine learning and WSN
• Global scale:
  • The network as a whole can achieve high-level tasks
  • Sensor network <-> Image
Supervised learning and WSN
• Classification (traffic type)
• Prediction (pollution forecast)
• Regression (wave intensity, population density)
A supervised learning scenario
• S: network of S sensors
• x(t) = {s1(t), s2(t), …, sS(t)}: snapshot at time t
• y(t) = f(x(t)) + ε(t): the value associated with S at time t (ε standing for noise)
• DN: a set of N observations (x(t), y(t))
• Goal: find a model that predicts y for any new x
Centralized approach
• High transmission overhead
Two-layer approach
• Use of compression:
  • Reduces the transmission overhead
  • Spatial correlation keeps the compression loss low
  • Reduces the dimensionality of the learning problem
Two-layer adaptive approach
• PAST: online compression
• Lazy learning: online learning
Compression: PCA
• PCA:
  • Transforms the set of n input variables x = (x1, …, xn) into a set of m variables z = (z1, …, zm), with m < n
  • Linear transformation: z = W^T x, with W an n x m matrix
  • Maximizes the preserved variance
• Solution: W given by
  • the m first eigenvectors of the correlation matrix of x, or
  • the minimization of the reconstruction error E[ ||x − W W^T x||² ]
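A minimal NumPy sketch of this batch PCA compression; the function name and interface are illustrative, not the authors' implementation:

```python
import numpy as np

def pca_compress(X, m):
    """Project N snapshots of n sensor readings (rows of X) onto the
    m leading eigenvectors of their correlation matrix."""
    C = (X.T @ X) / X.shape[0]            # n x n sample correlation matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
    W = eigvecs[:, ::-1][:, :m]           # m leading principal directions
    Z = X @ W                             # compressed snapshots, N x m
    return Z, W                           # reconstruction: X_hat = Z @ W.T
```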
PAST – Recursive PCA
• Projection Approximation Subspace Tracking [YAN95]
• Online formulation: minimize J(W(t)) = Σ_{i=1..t} β^(t−i) ||x(i) − W(t) y(i)||², with y(i) = W^T(i−1) x(i) and forgetting factor 0 < β ≤ 1
• Low memory requirement and computational complexity: O(nm) + O(m²)
PAST algorithm
Recursive formulation [HYV01], for each new snapshot x(t):
• y(t) = W^T(t−1) x(t)
• h(t) = P(t−1) y(t)
• g(t) = h(t) / (β + y^T(t) h(t))
• P(t) = (P(t−1) − g(t) h^T(t)) / β
• e(t) = x(t) − W(t−1) y(t)
• W(t) = W(t−1) + e(t) g^T(t)
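A minimal NumPy sketch of this recursion (PAST in its recursive least-squares form, following [YAN95]/[HYV01]); the initialisation and the forgetting-factor value are illustrative choices:

```python
import numpy as np

class PAST:
    """Recursive subspace tracking: updates an n x m basis W with
    O(nm) + O(m²) work per snapshot."""
    def __init__(self, n, m, beta=0.99):
        self.W = np.eye(n, m)   # initial subspace estimate
        self.P = np.eye(m)      # inverse correlation of the projections
        self.beta = beta        # forgetting factor, 0 < beta <= 1

    def update(self, x):
        y = self.W.T @ x                    # project the new snapshot
        h = self.P @ y
        g = h / (self.beta + y @ h)         # RLS gain
        self.P = (self.P - np.outer(g, h)) / self.beta
        e = x - self.W @ y                  # approximation error
        self.W = self.W + np.outer(e, g)    # rank-one basis update
        return y                            # m-dimensional compressed value
```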
Learning algorithm
• Lazy learning: a k-NN approach
  • Stores the observation set DN = {(x(t), y(t))}
  • When a query q is asked, takes the k nearest neighbours of q in DN
  • Builds a local linear model ŷ = β^T x, such that β minimizes the sum of squared residuals over the k neighbours
  • Computes the output at q by applying ŷ(q) = β^T q
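A minimal sketch of this prediction step (k nearest neighbours plus a local least-squares fit); function and variable names are illustrative:

```python
import numpy as np

def lazy_predict(X, y, q, k):
    """Predict the output at query q from a local linear model
    fitted on the k nearest neighbours of q in the stored set."""
    dist = np.linalg.norm(X - q, axis=1)        # distances to the query
    idx = np.argsort(dist)[:k]                  # k nearest neighbours
    Xk = np.c_[np.ones(k), X[idx]]              # neighbours plus intercept
    beta, *_ = np.linalg.lstsq(Xk, y[idx], rcond=None)
    return np.r_[1.0, q] @ beta                 # evaluate the model at q
```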
How many neighbours?
• y = sin(x) + e
• e: Gaussian noise with σ = 0.1
• What is the y value at x = 1.5?
How many neighbours?
• K = 2: overfitting
• K = 3: overfitting
• K = 4: overfitting
• K = 5: good fit
• K = 6: underfitting
Automatic model selection ([BIR99], [BON99], [BON00])
• Starting with a low k, local models are identified
• Their quality is assessed by a leave-one-out procedure
• The best model(s) are kept for computing the prediction
• Low computational cost thanks to:
  • the PRESS statistic [ALL74]
  • recursive least squares [GOO84]
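A minimal sketch of leave-one-out selection of k via the PRESS statistic, which gives each held-out residual as e_i / (1 − h_ii) without refitting; the hat-matrix formulation and names below are illustrative (the cited papers obtain the same quantities through recursive least squares):

```python
import numpy as np

def select_k(X, y, q, k_values):
    """Choose the number of neighbours minimising the leave-one-out
    error of the local linear model at query q (PRESS shortcut)."""
    order = np.argsort(np.linalg.norm(X - q, axis=1))
    best_k, best_loo = None, np.inf
    for k in k_values:                       # k must exceed dim(X) + 1
        idx = order[:k]
        Xk = np.c_[np.ones(k), X[idx]]
        H = Xk @ np.linalg.pinv(Xk)          # hat matrix of the local fit
        e = y[idx] - H @ y[idx]              # in-sample residuals
        e_loo = e / (1.0 - np.diag(H))       # PRESS residuals [ALL74]
        loo = np.mean(e_loo ** 2)
        if loo < best_loo:
            best_k, best_loo = k, loo
    return best_k
```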
Advantages of PAST and lazy learning
• No assumption on the process underlying the data
• Online learning capability
• Adaptive to non-stationarity
• Low computational and memory costs
Simulation
• Modeling of a wave propagation phenomenon
• Helmholtz equation: ∇²u + k²u = 0, where k is the wave number
• 2372 sensors
• 30 values of k between 1 and 146; 50 time instants
• 1500 observations
• The output k is noisy
Test procedure
• Prediction error measurement: Normalized Mean Square Error (NMSE)
• 10-fold cross-validation (1350 training / 150 test)
(figure: example of a learning curve)
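For reference, a standard definition of the NMSE, assuming it is the one used here (squared error normalised by the output variance):

```latex
\mathrm{NMSE} = \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}
                     {\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}
```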
Experiment 1
• Centralized configuration
• Comparison of PCA and PAST for the 1 to 16 first principal components
Results
• Prediction accuracy is similar when the number of principal components is sufficient
Clustering
• The number of clusters involves a trade-off between:
  • the routing costs between clusters and the gateway
  • the final prediction accuracy
  • the robustness of the architecture
Experiment 2
• Partitioning into geographical clusters, P varying from P(2) to P(7)
• 2 principal components for each cluster
• Ten-fold cross-validation on the 1500 observations
(figure: example of a P(2) partitioning)
Results
• Comparison of the P(2) (top) and P(5) (bottom) error curves
• As the number of clusters increases:
  • Better accuracy
  • Faster convergence
Experiment 3
• Simulation: at each time instant,
  • probability of 10% for a sensor failure
  • probability of 1% for a supernode failure
• Recursive PCA and lazy learning deal efficiently with variations of the input space dimension
• Robust to random sensor malfunctioning
Results
• Comparison of the P(2) (top) and P(5) (bottom) error curves
• Increasing the number of clusters increases the robustness
Experiment 4
• Time-varying changes in the sensor measures:
  • 2700 time instants
  • The sensor response decreases linearly from a factor 1 to a factor 0.4
• A temporal window: only the last 1500 measures are kept
Results
• Due to the concept drift, the fixed model (in black) becomes outdated
• The lazy character of the proposed architecture deals with this drift very easily
Conclusion
• The architecture:
  • yields good results compared to its batch equivalent
  • is computationally efficient
  • adapts to appearing and disappearing units
  • handles non-stationarity easily
Future work
• Extension of the tests to real-world data
• Improvement of the clustering strategy:
  • taking costs (routing/accuracy) into consideration
  • making use of the ad-hoc nature of the network
• Tests of other compression procedures:
  • robust PCA
  • ICA
References
Smart Dust project: http://www-bsac.eecs.berkeley.edu/archive/users/warneke-brett/SmartDust/
Crossbow: http://www.xbow.com/
[BON99] G. Bontempi. Local Techniques for Modeling, Prediction and Control. PhD thesis, IRIDIA, Université Libre de Bruxelles, 1999.
[YAN95] B. Yang. Projection approximation subspace tracking. IEEE Transactions on Signal Processing, 43(1):95-107, 1995.
[ALL74] D. M. Allen. The relationship between variable selection and data augmentation and a method for prediction. Technometrics, 16(1):125-127, 1974.
[GOO84] G. C. Goodwin and K. S. Sin. Adaptive Filtering, Prediction and Control. Prentice-Hall, 1984.
[HYV01] A. Hyvärinen, J. Karhunen, and E. Oja. Independent Component Analysis. Wiley, 2001.
References on lazy learning
[BIR99] M. Birattari, G. Bontempi, and H. Bersini. Lazy learning meets the recursive least squares algorithm. In M. S. Kearns, S. A. Solla, and D. A. Cohn, editors, NIPS 11, pages 375-381, Cambridge, MA, 1999. MIT Press.
[BON99] G. Bontempi, M. Birattari, and H. Bersini. Local learning for iterated time-series prediction. In I. Bratko and S. Dzeroski, editors, Machine Learning: Proceedings of the 16th International Conference, pages 32-38, San Francisco, CA, 1999. Morgan Kaufmann.
[BON00] G. Bontempi, M. Birattari, and H. Bersini. A model selection approach for local learning. Artificial Intelligence Communications, 13(1), 2000.
Thanks for your attention!