
Time-Decaying Sketches for Sensor Data Aggregation

Graham Cormode (AT&T Labs, Research); Srikanta Tirthapura and Bojian Xu (Dept. of Electrical and Computer Engineering, Iowa State University).


Presentation Transcript


  1. Time-Decaying Sketches for Sensor Data Aggregation Graham Cormode AT&T Labs, Research Srikanta Tirthapura Dept. of Electrical and Computer Engineering Iowa State University Bojian Xu Dept. of Electrical and Computer Engineering Iowa State University

  2. Mean of the Temperatures in the Last 30 Minutes [Figure: sensors reporting timestamped temperature readings, such as 75F at 11:39, 76F at 11:34 and 72F at 11:29; some readings appear more than once]

  3. Sketch [Figure: the same timestamped temperature readings, summarized by a sketch]

  4. [Figure: Sketch, Merging, Answer]

  5. General Time Decay • General decay function: $f(\text{age})$, for age $\ge 0$ • Time-decayed value of an element $(v, t, id)$ at time $c$ is: $v \cdot f(c - t)$

  6. Formal Model of the Data (on One Sensor) • Data stream: $e_0 = (v_0, t_0, id_0)$, $e_1 = (v_1, t_1, id_1)$, … • $v$: value • $t$: timestamp of creation • $id$: a unique id of the observation • User-defined time decay function • Asynchronous arrival: $t_i > t_j$ is possible while $i < j$ • Duplicates: $id_i = id_j$ is possible • Assume: if $id_i = id_j$, then $v_i = v_j$ and $t_i = t_j$

  7. Contribution • The first mergeable sketch that combines the following:

  8. Related Work • S. Nath, P. B. Gibbons, S. Seshan and Z. R. Anderson, “Synopsis diffusion for robust aggregation in sensor networks”, SenSys 2004 • J. Considine, F. Li, G. Kollios and J. Byers, “Approximate Aggregation Techniques for Sensor Databases”, ICDE 2004 • E. Cohen and M. Strauss, “Maintaining time-decaying stream aggregates”, PODS 2003; Journal of Algorithms, 2006 • S. Tirthapura, B. Xu and C. Busch, “Sketching Asynchronous Streams Over Sliding Windows”, PODC 2006

  9. Outline • Problem: Time-decayed sum of distinct elements over an asynchronous stream. • Focus on the integral decay model, in which the decayed weight is always an integer

  10. Estimate of the Sum (on One Sensor) • Given: • Stream: $R = (v_0, t_0, id_0), \ldots, (v_n, t_n, id_n), \ldots$ • User-defined decay function: $f()$ • Maintain: $\sum_{(v, t, id) \in D} v \cdot f(c - t)$ • $c$: current time • $D$: set of distinct elements in $R$
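As a reference point for what the sketch approximates, here is a minimal exact computation in Python. It assumes the decayed value of an element (v, t, id) at time c is v · f(c − t), that timestamps are integers, and that a sliding window is used as the example decay function; the names `sliding_window` and `exact_decayed_sum` are hypothetical. It stores every distinct element, which is exactly the linear-space cost the sketch is meant to avoid.

```python
from typing import Callable, Dict, Iterable, Tuple

# (value v, creation timestamp t, unique id); this layout is an assumption
Observation = Tuple[int, int, str]

def sliding_window(width: int) -> Callable[[int], int]:
    """Example integral decay: weight 1 while age < width, 0 afterwards."""
    def f(age: int) -> int:
        return 1 if 0 <= age < width else 0
    return f

def exact_decayed_sum(stream: Iterable[Observation],
                      f: Callable[[int], int], c: int) -> int:
    """Decayed sum over distinct ids at current time c (linear space)."""
    distinct: Dict[str, Tuple[int, int]] = {}
    for v, t, obs_id in stream:
        # Duplicates carry the same (v, t), so overwriting is harmless.
        distinct[obs_id] = (v, t)
    # Arrival order is irrelevant: only the creation timestamp t matters.
    return sum(v * f(c - t) for v, t in distinct.values() if t <= c)
```

For example, with a window of width 5 and current time 13, a stream containing (4, 10, "a"), (8, 12, "b"), a duplicate of "a", and a late-arriving (5, 3, "c") contributes 4 + 8 and nothing for the expired "c".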

  11. Estimate of the Sum (cont’d) • Linear space lower bound on duplicate-insensitive sum (Alon, Matias and Szegedy, STOC 1996) for: • Deterministic approximate algorithms • Randomized algorithms giving accurate results • Goal: Continuously maintain an $(\epsilon, \delta)$-estimate of $\sum_{(v, t, id) \in D} v \cdot f(c - t)$ • User inputs: $\epsilon$, $\delta$ • $D$: set of distinct elements in $R$ • An $(\epsilon, \delta)$-estimate for $X$ is a random variable $Y$ such that $\Pr[|Y - X| > \epsilon X] < \delta$.

  12. Algorithm for Sum (High Level Picture) [Figure: example values v1 = 4 and v2 = 8; Sum and Count] • Random sampling with SampleRate = p • Count the number of selected integers • Multiply by 1/p
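The high-level picture can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: each value v is treated as v distinct integers keyed by (id, k), and SHA-256 stands in for the hash function that decides selection. The names `unit_hash` and `estimate_sum` are hypothetical.

```python
import hashlib
from typing import Iterable, Set, Tuple

def unit_hash(key: str) -> float:
    """Deterministically map a key to a float in [0, 1) using 53 hash bits."""
    digest = hashlib.sha256(key.encode()).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53

def estimate_sum(stream: Iterable[Tuple[int, str]], p: float) -> float:
    """Estimate the sum of values over distinct ids by sampling at rate p."""
    selected: Set[str] = set()
    for v, obs_id in stream:
        # Treat the value v as v separate integers (id, 1), ..., (id, v).
        for k in range(1, v + 1):
            key = f"{obs_id}:{k}"
            if unit_hash(key) < p:   # same key always gets the same decision
                selected.add(key)
    # Count the selected integers and scale up by 1/p.
    return len(selected) / p
```

Because selection depends only on the hash of (id, k), a repeated observation selects exactly the same integers, so the estimate does not change when duplicates arrive. With p = 1 the estimate equals the exact sum.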

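  13. Duplicate Detection • Random sampling is done with a hash function [Figure: two copies of x (Copy 1 and Copy 2) pass through the same hash function and receive the same selection decision]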

  14. Intuition - I • Sample the elements (v,t,id) at some sample rate • By Chebyshev's inequality, for an ε-approximation of the count with constant probability:
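A standard way to carry out this step is shown below. It is offered as a sketch, not necessarily the slide's exact statement. The symbols are assumptions: N is the true count, each of the N integers is selected independently with probability p, X is the number selected, and Y = X/p is the estimate.

```latex
\begin{align*}
\mathbb{E}[Y] &= N, \qquad
\operatorname{Var}[Y] = \frac{N(1-p)}{p} \le \frac{N}{p}, \\
\Pr\bigl[\,|Y - N| > \epsilon N\,\bigr]
  &\le \frac{\operatorname{Var}[Y]}{\epsilon^{2} N^{2}}
  \le \frac{1}{\epsilon^{2}\, p N}.
\end{align*}
```

The failure probability is therefore at most 1/3 once pN ≥ 3/ε², so an expected sample size on the order of 1/ε² is enough for an ε-approximation with constant probability.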

  15. Intuition - II [Figure: the sample at time t and at a later time] • Sample rate?

  16. Maintain Multiple Samples • Sample $j$ uses SampleRate $p_j$: $p_0 = 1$, $p_1 = 1/2$, $p_2 = 1/4$ • Size of each sample?
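One common way to keep samples at rates 1, 1/2, 1/4, … consistent with each other is to derive all of them from a single hash value, so that the level-(j+1) sample is a subset of the level-j sample. The Python sketch below assumes that construction; the slides do not confirm it, and `unit_hash`, `deepest_level` and `levels_containing` are hypothetical names.

```python
import hashlib
from typing import List

def unit_hash(key: str) -> float:
    """Deterministically map a key to a float in [0, 1) using 53 hash bits."""
    digest = hashlib.sha256(key.encode()).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53

def deepest_level(key: str, max_level: int) -> int:
    """Largest j <= max_level such that hash(key) < 2**-j."""
    h = unit_hash(key)
    j = 0
    while j < max_level and h < 2.0 ** -(j + 1):
        j += 1
    return j

def levels_containing(key: str, max_level: int) -> List[int]:
    """All levels whose sample selects this key (always includes level 0)."""
    return list(range(deepest_level(key, max_level) + 1))
```

With this choice, level 0 (rate 1) selects every key, roughly half of the keys also reach level 1, roughly a quarter reach level 2, and so on.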

  17. Faster Sampling • RangeSample (Pavan & Tirthapura, SICOMP 2007) • Efficiently computes the number of selected integers [Figure: samples at rates $p_0 = 1$, $p_1 = 1/2$, $p_2 = 1/4$]

  18. Expiry Time • For an element e = (v, t, id), the expiry time is found by binary search over [t, tmax] using RangeSample [Figure: the element's selected integers at time t and at a later time, the expiry time]
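The binary search can be sketched in Python as follows, under several assumptions: the decay function f is non-increasing and integer-valued, the element's weight at time c is v · f(c − t), and the expiry time is defined as the first time at which none of the element's integers is selected. A naive linear scan (`selected_count`, a hypothetical name) stands in for RangeSample, which computes the same count far more efficiently.

```python
import hashlib
from typing import Callable

def unit_hash(key: str) -> float:
    """Deterministically map a key to a float in [0, 1) using 53 hash bits."""
    digest = hashlib.sha256(key.encode()).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53

def selected_count(obs_id: str, w: int, p: float) -> int:
    """Number of integers (id, 1..w) selected at rate p (naive stand-in for RangeSample)."""
    return sum(1 for k in range(1, w + 1) if unit_hash(f"{obs_id}:{k}") < p)

def expiry_time(v: int, t: int, obs_id: str,
                f: Callable[[int], int], p: float, t_max: int) -> int:
    """First time c >= t at which the element has no selected integer.

    Returns t if the element is never selected, and t_max + 1 if it is
    still selected at t_max.
    """
    def alive(c: int) -> bool:
        return selected_count(obs_id, v * f(c - t), p) > 0

    if not alive(t):
        return t
    # Invariant: alive(lo) holds; hi is either not alive or the sentinel t_max + 1.
    lo, hi = t, t_max + 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if alive(mid):
            lo = mid
        else:
            hi = mid
    return hi
```

The search is valid because the weight v · f(c − t) never grows as c advances, so once the element has dropped out of the sample it cannot re-enter.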

  19. Sketch Structure • Levels 0, 1, 2, … each hold a sample, with sample rates $p = 1, 1/2, 1/4, 1/8, \ldots$ • For each level $j$, $t_j$ is the largest expiry time of all the elements discarded from that level's sample

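  20. Level 0 ($p = 1$): (e1,22) • Level 1 ($p = 1/2$): (e1,19) • Level 2 ($p = 1/4$): none

  21. Level 0 ($p = 1$): (e1,22) (e2,23) (e3,21) • Level 1 ($p = 1/2$): (e1,19) (e2,21) • Level 2 ($p = 1/4$): none

  22. Level 0 ($p = 1$): (e3,21) (e1,22) (e2,23) (e4,23); discard the element with the smallest expiry time, (e3,21) • Level 1 ($p = 1/2$): (e4,21) (e1,19) (e2,21) • Level 2 ($p = 1/4$): none

  23. Level 0 ($p = 1$, $t_0 = 21$): (e1,22) (e2,23) (e4,23) • Level 1 ($p = 1/2$): (e4,21) (e1,19) (e2,21) • Level 2 ($p = 1/4$): none

  24. Duplicate: a repeated element leaves the sketch unchanged • Level 0 ($p = 1$, $t_0 = 21$): (e1,22) (e2,23) (e4,23) • Level 1 ($p = 1/2$): (e4,21) (e1,19) (e2,21) • Level 2 ($p = 1/4$): none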

  25. Answer a Query for the Decayed Sum • Current time = 20 • Level 0 ($p = 1$, $t_0 = 21$): (e1,22) (e2,23) (e4,23) • Level 1 ($p = 1/2$), the level used to answer the query: (e4,21) (e1,19) (e2,21); e2 and e4 are counted (√) • Level 2 ($p = 1/4$): none
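The walk-through in slides 20 to 25 can be replayed with a small Python sketch. It rests on one reading of the flattened figures: each level holds at most 3 (element, expiry) pairs, level 0 ends up with e1, e2, e4 after discarding e3, and level 1 holds e1, e2, e4 with expiry times 19, 21, 21. It also simplifies the query to a decayed count in which each element has weight 0 or 1, and it assumes an element is alive at time c exactly when its expiry time is greater than c. The names `Level` and `query_count` are hypothetical.

```python
from typing import Dict, List, Optional

class Level:
    """One sample of the sketch: at most `capacity` (id -> expiry) pairs."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: Dict[str, int] = {}
        self.t_discard: Optional[int] = None  # largest expiry discarded so far

    def insert(self, obs_id: str, expiry: int) -> None:
        if obs_id in self.items:          # duplicate: nothing changes
            return
        self.items[obs_id] = expiry
        if len(self.items) > self.capacity:
            # Discard the element with the smallest expiry time.
            victim = min(self.items, key=self.items.get)
            gone = self.items.pop(victim)
            self.t_discard = gone if self.t_discard is None else max(self.t_discard, gone)

def query_count(levels: List[Level], c: int) -> Optional[int]:
    """Estimate the number of unexpired distinct elements at time c."""
    for j, level in enumerate(levels):
        # A level is usable only if everything it discarded has expired by c.
        if level.t_discard is None or level.t_discard <= c:
            alive = sum(1 for expiry in level.items.values() if expiry > c)
            return alive * 2**j           # scale by 1 / p_j with p_j = 2**-j
    return None                           # no level can answer at time c

# Replay of the example; per-level expiry times are taken from the slides.
levels = [Level(3), Level(3), Level(3)]
for obs_id, expiries in [("e1", [22, 19]), ("e2", [23, 21]),
                         ("e3", [21]), ("e4", [23, 21]), ("e2", [23, 21])]:
    for j, expiry in enumerate(expiries):
        levels[j].insert(obs_id, expiry)
```

At current time 20, level 0 cannot be used because its discard threshold is 21, so the query falls through to level 1, counts e2 and e4, and scales by 2.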

  26. Over the Whole Sensor Network • Sketch 1: (e1,6) (e2,9) (e3,13) • Sketch 2: (e4,6) (e5,10) (e3,13) • Result of merging sketches 1 & 2 (union): (e2,9) (e5,10) (e3,13) • Each sample keeps the 3 distinct items with the largest expiry times.
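Merging one level of two sketches can be sketched in Python as a union followed by keeping the items with the largest expiry times. The assumptions are that both sensors use the same hash function, so a shared id has the same expiry time in both sketches, and that the merged discard threshold is the largest expiry time discarded by either input or by the merge itself. The name `merge_level` is hypothetical.

```python
from typing import Dict, Optional, Tuple

LevelState = Tuple[Dict[str, int], Optional[int]]  # (id -> expiry, discard threshold)

def merge_level(a: LevelState, b: LevelState, capacity: int) -> LevelState:
    """Merge the samples of one level from two sketches."""
    items_a, t_a = a
    items_b, t_b = b
    # Union of the two samples; a shared id is stored once.
    union = {**items_a, **items_b}
    ranked = sorted(union.items(), key=lambda kv: kv[1], reverse=True)
    kept = dict(ranked[:capacity])
    dropped = [expiry for _, expiry in ranked[capacity:]]
    # Track the largest expiry time ever discarded at this level.
    candidates = [t for t in (t_a, t_b) if t is not None] + dropped
    return kept, (max(candidates) if candidates else None)

# The example from the slide, with capacity 3.
sketch1 = ({"e1": 6, "e2": 9, "e3": 13}, None)
sketch2 = ({"e4": 6, "e5": 10, "e3": 13}, None)
merged_items, merged_threshold = merge_level(sketch1, sketch2, 3)
```

Here `merged_items` holds e2, e5 and e3, matching the merged result on the slide, and the threshold records that elements with expiry time 6 were dropped.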

  27. Algorithm Complexity • Space complexity: • Time complexity • expected time for processing one item • Time for answering a query • Time for merging two sketches

  28. Conclusion • The first sketch that combines the following:

  29. Ongoing and Future Work • Implementation • Observed results better than theoretical predictions • Better duplicate-insensitive sketches for specific decay models? • Other aggregates, such as variance and clustering?

  30. THANKS
