Sensor Data • Korea University of Technology and Education (한국기술교육대학교) • Jun-Ki Min (민준기)
Sensor Data Management • Wireless Sensor Network • Limited energy • Limited computing power • Sensor Data Management • Naive Approach • Each sensor sends its data to the base station • All data processing is done at the base station • Problem • Each sensor drains its energy quickly because it sends its readings continuously • Goal: minimize energy consumption • In-Network Processing
Major Research Topics • Data Aggregation • Data Gathering • Query Processing
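Aggregation(1/5) • TAG (Tiny Aggregation) [1] • In-network aggregation • Based on tree routing • Simple approach • The cost of computing a median is very high • [Figure: aggregation tree. The root (reading 2) computes Sum(2, 12, 7); its two children (readings 4 and 3) compute Sum(4, 5, 3) and Sum(3, 2, 2) from the leaf readings 5, 3 and 2, 2.]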
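
The tree-based SUM on this slide can be sketched in a few lines. This is only an illustration of in-network aggregation, not TAG's actual implementation: the node names and the `TREE` layout are hypothetical and were chosen to reproduce the numbers in the figure.

```python
# Hypothetical routing tree matching the slide's figure:
# node name -> (own reading, list of child names)
TREE = {
    "root": (2, ["a", "b"]),
    "a": (4, ["a1", "a2"]),
    "b": (3, ["b1", "b2"]),
    "a1": (5, []), "a2": (3, []),
    "b1": (2, []), "b2": (2, []),
}


def tag_sum(node, tree, messages):
    """Return the partial SUM of the subtree rooted at `node`.

    Every node forwards a single partial sum upward instead of
    forwarding each raw reading of its subtree.
    """
    reading, children = tree[node]
    partial = reading + sum(tag_sum(child, tree, messages) for child in children)
    messages.append((node, partial))  # one value sent per node
    return partial


messages = []
total = tag_sum("root", TREE, messages)
print(total)           # total for the whole network
print(dict(messages))  # partial sum reported by each node
```

Each node sends one value regardless of the size of its subtree. A median cannot be combined from partial results in this way, which is why the slide notes its high cost.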
Aggregation(2/5) • Q-Digest [2] • Approximately captures the distribution of the sensor data • Digest property • count(v) <= floor(n/k) (except for leaf nodes) • count(v) + count(vp) + count(vs) >= floor(n/k) (except for the root node), where v is a node, vp is the parent of v, vs is the sibling of v, n is the number of data values, k is the compression parameter, and σ is the range of the data • Size of a q-digest <= 3k • Each sensor builds its own q-digest • Parent node • Merges the q-digests of its children • Compresses the merged q-digest
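
To make the digest property concrete, the following sketch checks both conditions, exactly as stated on this slide, for a digest stored as a dictionary from dyadic ranges to counts. The function names are hypothetical. The sample digest is the five-node one from the quantile example (n = 15); the values k = 5 and σ = 8 are assumptions, with σ inferred from the root range [1,8].

```python
def parent_and_sibling(lo, hi):
    """Parent and sibling ranges of the dyadic range [lo, hi] (values start at 1)."""
    size = hi - lo + 1
    p_lo = ((lo - 1) // (2 * size)) * (2 * size) + 1
    p_hi = p_lo + 2 * size - 1
    if lo == p_lo:                      # [lo, hi] is the left half of its parent
        sibling = (lo + size, p_hi)
    else:                               # [lo, hi] is the right half of its parent
        sibling = (p_lo, lo - 1)
    return (p_lo, p_hi), sibling


def satisfies_digest_property(digest, k, sigma):
    """Check the two digest conditions for {(lo, hi): count}."""
    n = sum(digest.values())
    limit = n // k                      # floor(n / k)
    for (lo, hi), count in digest.items():
        is_leaf = lo == hi
        is_root = (lo, hi) == (1, sigma)
        # Condition 1: a non-leaf node may not hold more than floor(n/k).
        if not is_leaf and count > limit:
            return False
        # Condition 2: node + parent + sibling must reach floor(n/k).
        if not is_root:
            parent, sibling = parent_and_sibling(lo, hi)
            family = count + digest.get(parent, 0) + digest.get(sibling, 0)
            if family < limit:
                return False
    return True


# Digest from the quantile example; k = 5 and sigma = 8 are assumed values.
EXAMPLE = {(1, 8): 1, (5, 6): 2, (7, 8): 2, (3, 3): 4, (4, 4): 6}
print(satisfies_digest_property(EXAMPLE, k=5, sigma=8))
```

Nodes that are absent from the dictionary are treated as having count 0, which is how a compressed digest omits empty ranges.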
Aggregation(3/5) • Quantile Query • Find the value whose rank among the n values is qn, where q ∈ (0,1); if q = 0.5, find the median • Example q-digest (n = 15): <[1,8],1> <[5,6],2> <[7,8],2> <[3,3],4> <[4,4],6> • Sort the nodes by increasing right endpoint • <[3,3],4> <[4,4],6> <[5,6],2> <[7,8],2> <[1,8],1> • The running count first exceeds 0.5 * 15 = 7.5 at <[4,4],6> (4 + 6 = 10) • Thus, 4 is the estimated median
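
The quantile lookup above can be sketched as follows. The dictionary layout and the function name are hypothetical; ties on the right endpoint are broken by placing smaller ranges first, which matches the ordering shown on the slide.

```python
DIGEST = {(1, 8): 1, (5, 6): 2, (7, 8): 2, (3, 3): 4, (4, 4): 6}


def quantile(digest, q):
    """Estimate the q-quantile from a q-digest given as {(lo, hi): count}."""
    n = sum(digest.values())
    # Sort by right endpoint; for equal right endpoints, smaller ranges first.
    ordered = sorted(digest.items(),
                     key=lambda item: (item[0][1], item[0][1] - item[0][0]))
    seen = 0
    for (lo, hi), count in ordered:
        seen += count
        if seen > q * n:        # running count exceeds the target rank
            return hi           # report the right endpoint of this node
    return ordered[-1][0][1]


print(quantile(DIGEST, 0.5))    # estimated median
```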
Aggregation(4/5) • Multiple Aggregation • Equivalence Class Reduction [3] • Q = {q1 = {1+2+3}, q2 = {1+2}, q3 = {3}} • Equivalence class = set of sensors that support the same set of queries • EC1 = {1,2}, EC2 = {3} • Bit vectors: EC1 = [1,1,0]T, EC2 = [1,0,1]T • Query matrix (columns EC1, EC2): Q1 = [1 1], Q2 = [1 0], Q3 = [0 1] • Basis: v1 = {1+2} = [1 0], v2 = {3} = [0 1], so Q1 = v1 + v2, Q2 = v1, Q3 = v2
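
As a rough illustration of how the equivalence classes on this slide arise, the sketch below groups sensors by the set of queries they contribute to. The function name and data layout are hypothetical; only the query set Q comes from the slide.

```python
from collections import defaultdict

# Query set from the slide: each query sums the readings of the listed sensors.
QUERIES = {"q1": {1, 2, 3}, "q2": {1, 2}, "q3": {3}}


def equivalence_classes(queries):
    """Group sensors whose query-membership bit vectors are identical."""
    names = sorted(queries)
    sensors = sorted(set().union(*queries.values()))
    classes = defaultdict(list)
    for sensor in sensors:
        # One bit per query: 1 if the sensor contributes to that query.
        signature = tuple(int(sensor in queries[name]) for name in names)
        classes[signature].append(sensor)
    return dict(classes)


print(equivalence_classes(QUERIES))
```

Sensors 1 and 2 share the bit vector (1, 1, 0) and sensor 3 has (1, 0, 1), so only one partial sum per class has to travel up the tree instead of one per query.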
Aggregation(5/5) • Multiple Aggregation • Segmentation Based Method [4] • Dynamic routing, not tree routing • Segment == equivalence class • Where possible, a sensor sends its data to a node in the same segment • STG vs STS • Node 6 can send data to node 5 or node 7; in this case, node 6 sends its data to node 7 • STG: node 4 sends data for q2 (=4, 7, 8) and q1+q2 (=4,5); node 1 receives 3 messages (1 message from node 2, 2 messages from node 4) • STS: multiple routing; node 4 sends data for q2 (=4,5,6,7) to node 1 and for q1 (=4,5) to node 2; node 1 receives 2 messages
Gathering(1/8) • In-network aggregation provides a great opportunity for reducing the communication overhead • Since a single aggregated value represents the whole sensing field, it may be insufficient for analyzing the correlations among subregions of the sensor field • Sensor Data Gathering • Exact data gathering wastes energy • Solution: reduce the number of transmissions
Gathering(2/8) • Basic Approach • Temporal Suppression • A node does not transmit a value if it has not changed since it was last reported • Spatial Suppression • A node suppresses its value if it is identical to those of its neighboring nodes • Approximate Gathering • Sensor readings intrinsically contain errors • Sensor readings are strongly correlated
Gathering(3/8) • Approximate Data Gathering • Each sensor has a tool to estimate future values • The base station keeps the same tools • If a sensor does not send data: the estimate is correct • If a sensor sends data: the estimate is incorrect • Update the tools of both the sensor and the base station • Model Based • BBQ [5] • KEN [6] • PAQ [7] • Filter Based • Dual Kalman [9] • Compression Based • Wavelet, DFT, SBR [8], etc. • A collection of a sensor's readings is transmitted periodically
Gathering(4/8) • Model Based Approach • Linear Regression • X_{t+1} = a X_t + b • BBQ, KEN • Multivariate Gaussian model • Probability density function: P(X_1, X_2, X_3, …, X_n) • X_i: random variable for a sensor reading
Gathering(5/8) • Approximate Gathering • PAQ • Linear regression and Gaussian models need a lot of data and time to build an accurate model • AutoRegressive AR(3) model • A reading V_t = m_t + X(t), i.e. V_t - m_t = X(t) • X(t) = a X(t-1) + b X(t-2) + c X(t-3) + b(w) N(0,1) • m_t is the mean of V up to time t; a, b, c are real constants; b(w) N(0,1) is zero-mean Gaussian white noise with standard deviation b(w) • Predictor: P(t) = m_t + a(v_{t-1} - m_{t-1}) + b(v_{t-2} - m_{t-2}) + c(v_{t-3} - m_{t-3})
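
The predictor P(t) on this slide is a direct weighted sum and can be written out as below. This is a minimal sketch: the function name and argument layout are hypothetical, and the coefficients in the usage line are made-up values, not ones fitted by PAQ.

```python
def paq_predict(m_t, recent, recent_means, coeffs):
    """AR(3) predictor: P(t) = m_t + a(v1 - m1) + b(v2 - m2) + c(v3 - m3).

    recent       -- (v_{t-1}, v_{t-2}, v_{t-3}), the last three readings
    recent_means -- (m_{t-1}, m_{t-2}, m_{t-3}), the means at those times
    coeffs       -- (a, b, c), the model coefficients
    """
    return m_t + sum(k * (v - m)
                     for k, v, m in zip(coeffs, recent, recent_means))


# Illustrative values only (not from the slides).
print(paq_predict(11.0, (11.0, 12.0, 10.0), (11.0, 11.0, 10.0), (0.5, 0.25, 0.125)))
```

Because the sensor and the base station hold the same coefficients, both can evaluate this function and obtain the same prediction without any communication.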
Gathering(6/8) • PAQ • Lemma) Let e = v b(w), where v > 1. Then the actual value at time t lies in [P(t) - e, P(t) + e] with probability at least 1 - 1/v^2. Proof) By the Chebyshev inequality, P(|v_t - P(t)| > e) <= b(w)^2 / e^2 = b(w)^2 / (v^2 b(w)^2) = 1/v^2 • Generally v is 6 or 7 • Using the lemma above, PAQ decides when to update its model • [Figure: prediction-error axis with marks at -e, -d, d, e and regions labeled Well fit, Partial fit, Outlier]
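
The Chebyshev bound in the proof, P(|v_t - P(t)| > e) <= 1/v^2 with e = v b(w), translates into an interval around the prediction. The helper below is a hypothetical illustration of that arithmetic; `noise_std` stands for b(w).

```python
def paq_interval(prediction, noise_std, v):
    """Return (low, high, coverage) for the interval [P(t) - e, P(t) + e].

    e = v * noise_std, and by the Chebyshev inequality the actual value
    falls outside the interval with probability at most 1 / v**2.
    """
    if v <= 1:
        raise ValueError("v must be greater than 1")
    e = v * noise_std
    return prediction - e, prediction + e, 1.0 - 1.0 / v**2


low, high, coverage = paq_interval(prediction=20.0, noise_std=0.5, v=6)
print(low, high, coverage)
```

With v = 6 the interval misses the actual value with probability at most 1/36, so a reading outside it is a strong hint that the model no longer fits.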
Gathering(7/8) • Filter Based • Model-based approaches require a lot of data to construct their models • Each node keeps a filter based on the last reported sensor reading • If |V_new - V_old| > e, the reading is sent to the base station
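
The filter rule on this slide can be sketched as follows. The function names are hypothetical, and the base-station side simply assumes that a silent sensor still holds its last reported value.

```python
def filter_readings(readings, e):
    """Sensor side: report a reading only if it differs from the last
    reported one by more than e."""
    sent, last_sent = [], None
    for t, value in enumerate(readings):
        if last_sent is None or abs(value - last_sent) > e:
            sent.append((t, value))
            last_sent = value
    return sent


def reconstruct(sent, length):
    """Base-station side: repeat the last received value until a new one arrives."""
    updates, current, view = dict(sent), None, []
    for t in range(length):
        current = updates.get(t, current)
        view.append(current)
    return view


readings = [20.0, 20.5, 21.5, 21.4, 19.0]
sent = filter_readings(readings, e=1.0)
view = reconstruct(sent, len(readings))
print(sent)   # transmitted (time, value) pairs
print(view)   # what the base station believes at each time step
```

Every value in the base station's view is within e of the true reading. Setting e = 0 gives the temporal suppression described earlier.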
Gathering(8/8) • Dual Kalman Filter • The base station has as many filters as there are sensors • Discrete Kalman Filter • [Figure: discrete Kalman filter cycle. Boxes: initial state, estimate next state, project current state, compute Kalman gain, update system state, update error covariance; grouped into a prediction step and a correction step.] • Ex) moving object • State model: x_t = v_{t-1} * dt + x_{t-1}, v_t = v_{t-1} • Measurement model: z_t (measured position) • z_t = [1 0] s_t + n_t, where s_t = [x_t, v_t]T is the state and n_t is white Gaussian measurement noise
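
The moving-object example can be turned into a small runnable sketch. It is an illustration under stated assumptions, not the method of [9]: the class and function names, the noise parameters `q` and `r`, the initial covariance `p0`, and the suppression threshold are all hypothetical. The sensor runs a mirror of the base station's filter and transmits only when the prediction misses the measurement by more than the threshold.

```python
class ConstantVelocityKalman:
    """Discrete Kalman filter for state [position, velocity], measuring position only."""

    def __init__(self, dt=1.0, q=1e-4, r=0.1, p0=100.0):
        self.dt, self.q, self.r = dt, q, r      # q, r: assumed noise variances
        self.x, self.v = 0.0, 0.0               # initial state
        self.P = [[p0, 0.0], [0.0, p0]]         # initial error covariance

    def predict(self):
        """Prediction step: project the state and the error covariance ahead."""
        dt, P = self.dt, self.P
        self.x += self.v * dt                   # x_t = v_{t-1} * dt + x_{t-1}
        # P = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        self.P = [[p00, p01], [p10, p11]]
        return self.x

    def correct(self, z):
        """Correction step: Kalman gain, state update, covariance update (H = [1, 0])."""
        P = self.P
        s = P[0][0] + self.r                    # innovation variance
        k0, k1 = P[0][0] / s, P[1][0] / s       # Kalman gain
        y = z - self.x                          # innovation
        self.x += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                  [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]


def dual_kalman_run(measurements, threshold):
    """Return the time steps at which the sensor transmits, and the base filter."""
    mirror = ConstantVelocityKalman()           # sensor-side copy of the base filter
    base = ConstantVelocityKalman()
    sent = []
    for t, z in enumerate(measurements, start=1):
        predicted = mirror.predict()
        base.predict()
        if abs(z - predicted) > threshold:      # prediction too far off: transmit
            mirror.correct(z)
            base.correct(z)
            sent.append(t)
    return sent, base


# An object moving at a constant 2 units per step, measured without noise.
measurements = [2.0 * t for t in range(1, 21)]
sent, base = dual_kalman_run(measurements, threshold=0.5)
print(sent, base.x, base.v)
```

Once both filters have learned the velocity, the prediction stays within the threshold and the sensor falls silent while the base station keeps extrapolating the position.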
Query Processing(1/6) • Join Operation • An important operator • It relates measurements taken at different nodes • [Figure: two regions, L and R]
Query Processing(2/6) • General Join Plans [12,13] • Sequential • Naive • Centroid • [Figure: three diagrams of regions L and R, one per join plan]
Query Processing(3/6) • Optimal Join Location [14] • Weighted Fermat Problem • Find the point that minimizes the weighted sum of its distances to the three vertices of a triangle
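
One common way to approximate the weighted Fermat point numerically is Weiszfeld's iteration, sketched below. This technique is swapped in for illustration and is not taken from the slides or claimed to be the method of [14]. The reading of the triangle as the two join regions plus the base station, weighted by data volume, is likewise an assumption; the coordinates and weights are made up.

```python
import math


def weighted_fermat_point(points, weights, iterations=1000, tol=1e-12):
    """Approximate the point minimizing sum(w_i * distance(p, points[i]))
    with Weiszfeld's iteration, starting from the weighted centroid."""
    total = sum(weights)
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for (px, py), w in zip(points, weights):
            d = math.hypot(x - px, y - py)
            if d < tol:                 # the iterate landed on a vertex: stop there
                return px, py
            num_x += w * px / d
            num_y += w * py / d
            denom += w / d
        new_x, new_y = num_x / denom, num_y / denom
        if math.hypot(new_x - x, new_y - y) < tol:
            return new_x, new_y
        x, y = new_x, new_y
    return x, y


# Hypothetical layout: three vertices with equal weights.
px, py = weighted_fermat_point([(-1.0, 0.0), (1.0, 0.0), (0.0, 3.0)], [1.0, 1.0, 1.0])
print(px, py)
```

For this symmetric triangle the result lies on the axis x = 0 at the height where the two base vertices are seen under an angle of 120 degrees, which is y = 1/sqrt(3). Increasing one weight pulls the join location toward that vertex.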
Query Processing(4/6) • Synopsis Join[13] • Prunes non-candidate tuples and only joins candidate tuples • Preliminary Join • Eliminate non-candidate tuples • Final Join
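
The two-phase idea (preliminary join on synopses, then final join on the surviving tuples) can be sketched as below. The synopsis here is simply the set of join-attribute values on each side, which is an assumption made for illustration; the function name and tuple layout are hypothetical as well.

```python
def synopsis_join(left, right):
    """Equi-join of (key, value) tuples that prunes non-candidates first.

    Returns the join result and the number of full tuples that had to be
    shipped to the final join.
    """
    # Preliminary join: exchange only the join keys (the synopses).
    left_synopsis = {key for key, _ in left}
    right_synopsis = {key for key, _ in right}
    candidates = left_synopsis & right_synopsis

    # Eliminate non-candidate tuples before any full tuple is transmitted.
    left_kept = [t for t in left if t[0] in candidates]
    right_kept = [t for t in right if t[0] in candidates]

    # Final join on the candidate tuples only.
    result = [(key, lv, rv)
              for key, lv in left_kept
              for rkey, rv in right_kept if rkey == key]
    return result, len(left_kept) + len(right_kept)


left = [(1, "a"), (2, "b"), (3, "c")]
right = [(2, "x"), (3, "y"), (4, "z"), (3, "w")]
result, shipped = synopsis_join(left, right)
print(result, shipped)
```

Here 5 of the 7 tuples reach the final join; the saving grows as the join becomes more selective.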
Query Processing(5/6) • TPSJ [10] • Preprocessing: Query Decomposition • Query Q • Decomposed queries: Q1, Q2
Query Processing(6/6) • TPSJ • First phase • Query Q1 is executed • Second phase • Query Q2 is executed after R1 is injected into the network
Conclusion • Sensor • Lightweight • Wireless • Sensor Data Management • Reduce energy consumption • In-network processing • Aggregation • Gathering • Query Processing
References • [1] S. Madden et al., "TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks," OSDI, 2002. • [2] N. Shrivastava et al., "Medians and Beyond: New Aggregation Techniques for Sensor Networks," ACM SenSys, 2004. • [3] N. Trigoni et al., "Multi-Query Optimization for Sensor Networks," DCOSS, 2005. • [4] N. Trigoni et al., "Routing and Processing Multiple Aggregate Queries in Sensor Networks," ACM SenSys, 2006. • [5] A. Deshpande et al., "Model-Driven Data Acquisition in Sensor Networks," VLDB, 2004. • [6] D. Chu et al., "Approximate Data Collection in Sensor Networks using Probabilistic Models," ICDE, 2006. • [7] D. Tulone et al., "PAQ: Time Series Forecasting For Approximate Query Answering In Sensor Networks," European Conf. Wireless Sensor Networks, 2006. • [8] A. Deligiannakis et al., "Compressing Historical Information in Sensor Networks," ACM SIGMOD, 2004. • [9] A. Jain et al., "Adaptive Stream Resource Management Using Kalman Filters," ACM SIGMOD, 2004. • [10] X. Yang et al., "In-Network Execution of Monitoring Queries in Sensor Networks," ACM SIGMOD, 2007. • [11] M. Stern et al., "Towards Efficient Processing of General-Purpose Joins in Sensor Networks," ICDE, 2009. • [12] A. Pandit et al., "Communication-Efficient Implementation of Range-Joins in Sensor Networks," International Conference on Database Systems for Advanced Applications (DASFAA), 2006. • [13] H. Yu et al., "In-Network Join Processing for Sensor Networks," APWeb, 2006. • [14] A. Coman et al., "On Join Location in Sensor Networks," MDM, 2007.