Approximating Sensor Network Queries Using In-Network Summaries Alexandra Meliou Carlos Guestrin Joseph Hellerstein
Approximate Answer Queries
• Approximate representation of the world:
  • Discrete locations
  • Lossy communication
  • Noisy measurements
• Applications do not expect accurate values (tolerance to noise)
• Example:
  • Return the temperature at all locations ±1°C, with 95% confidence
• Query satisfaction:
  • In expectation, the requested portion of sensor values lies within the error range
In-network Decisions
• Use in-network models to make routing decisions for each query
• No centralized planning
In-network Summaries
• Spanning tree T(V,E’) + models Mv for all nodes v
• Mv represents the whole subtree rooted at v
Model Complexity
• Gaussian distributions at the leaves:
  • Good for modeling individual node measurements
• Models higher in the tree summarize whole subtrees, so their mixtures grow: need for compression
Talk “outline” Compression In-network summaries Construction Traversal
Collapsing Gaussian Mixtures
• Compress an m-size mixture to a k-size mixture
• Look at the simple case (k=1)
• Minimize KL-divergence? The single Gaussian can assign “fake” mass to regions where the mixture has almost none
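The k=1 case can be sketched in a few lines. Among single Gaussians q, the one that minimizes KL(p‖q) for a mixture p is the moment-matched Gaussian, so the sketch below simply matches mean and variance. The function name `collapse_mixture` and the example numbers are illustrative assumptions, not taken from the talk.

```python
def collapse_mixture(weights, means, variances):
    """Collapse a 1-D Gaussian mixture to one Gaussian by moment matching.

    Hypothetical helper for illustration; returns (mean, variance).
    """
    total = sum(weights)
    w = [x / total for x in weights]  # normalize the mixture weights
    mu = sum(wi * mi for wi, mi in zip(w, means))
    # Law of total variance: within-component variance plus spread of the means.
    var = sum(wi * (vi + (mi - mu) ** 2)
              for wi, mi, vi in zip(w, means, variances))
    return mu, var


# Two tight components far apart: one around 10, one around 30.
mu, var = collapse_mixture([0.5, 0.5], [10.0, 30.0], [1.0, 1.0])
print(mu, var)
```

In this example the collapsed Gaussian is centered at 20, between the two components, where the original mixture has almost no mass. That is one way to read the “fake” mass remark.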
Quality of Compression
• Depends on the query workload
• Compare a query with acceptable error window W against a query with a narrower window W’ < W
Compression
• Accurate mass inside the interval
• No guarantee on the tails
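One way to picture a compression that keeps the mass inside a window accurate is to pick the single Gaussian's standard deviation so that its probability inside [center − W, center + W] equals the mixture's. The talk does not spell out its fitting procedure, so the sketch below is an assumption for illustration only; `mixture_mass` and `fit_sigma_for_window` are hypothetical names.

```python
from statistics import NormalDist


def mixture_mass(weights, means, sigmas, lo, hi):
    """Probability that a 1-D Gaussian mixture assigns to [lo, hi]."""
    total = sum(weights)
    return sum(w / total * (NormalDist(m, s).cdf(hi) - NormalDist(m, s).cdf(lo))
               for w, m, s in zip(weights, means, sigmas))


def fit_sigma_for_window(weights, means, sigmas, center, window):
    """Sigma of N(center, sigma) whose mass in the window matches the mixture.

    Assumed illustration, not the authors' method. Nothing is matched
    outside the window, so the tails carry no guarantee.
    """
    mass = mixture_mass(weights, means, sigmas, center - window, center + window)
    if not 0.0 < mass < 1.0:
        raise ValueError("window mass must be strictly between 0 and 1")
    # For N(center, sigma): P(|X - center| <= window) = 2 * Phi(window / sigma) - 1.
    z = NormalDist().inv_cdf((1.0 + mass) / 2.0)
    return window / z, mass


sigma, mass = fit_sigma_for_window([0.5, 0.5], [10.0, 30.0], [1.0, 1.0],
                                   center=20.0, window=12.0)
fitted = NormalDist(20.0, sigma)
print(sigma, mass, fitted.cdf(32.0) - fitted.cdf(8.0))
```

The fit depends on the window, which matches the observation that compression quality depends on the query workload.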
Talk “outline” Compression In-network summaries Construction Traversal
Query Satisfaction
• A response R={r1…rn} satisfies query Q(w,δ) if:
  • In expectation, the values of at least δn nodes lie within [ri-w,ri+w]
(Figure: a query Q is posed to the in-network summary, which returns a response R = [r1, r2, r3, r4, r5, r6, r7, r8, r9, r10] within the error bounds.)
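The satisfaction condition can be checked directly if each node's value is described by a Gaussian marginal. The sketch below assumes per-node models given as (mean, standard deviation) pairs, which is an assumption made for illustration; by linearity of expectation, the expected number of nodes inside their windows is the sum of the per-node probabilities.

```python
from statistics import NormalDist


def expected_hits(response, models, w):
    """Expected number of nodes whose value lies in [r_i - w, r_i + w].

    models is a list of (mu, sigma) pairs, one per node (assumed format).
    """
    total = 0.0
    for r, (mu, sigma) in zip(response, models):
        dist = NormalDist(mu, sigma)
        total += dist.cdf(r + w) - dist.cdf(r - w)
    return total


def satisfies(response, models, w, delta):
    """True if, in expectation, at least delta * n nodes are within the window."""
    return expected_hits(response, models, w) >= delta * len(models)


models = [(20.0, 0.5), (21.0, 0.5), (25.0, 2.0)]
response = [20.0, 21.0, 25.0]  # answer with each node's mean
print(expected_hits(response, models, w=1.0))
print(satisfies(response, models, w=1.0, delta=0.75))
print(satisfies(response, models, w=1.0, delta=0.8))
```

Here the noisy third node pulls the expected count below 0.8·n, so the same response satisfies the looser query but not the stricter one.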
Optimal Traversal
• Given: tree and models
• Find: a subtree such that the response [μleaves], formed by the means at its leaves, satisfies the query
• Can be computed with dynamic programming
Greedy Traversal
• If the local model satisfies the query:
  • Return μ
• Else descend to the child nodes
• More conservative solution: enforces query satisfiability on every subtree instead of the whole tree
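The greedy rule can be sketched as a short recursion. The `Node` layout, the local test (the node's single Gaussian must place at least δ of its mass within ±w of its mean), and the visit counter are assumptions made for this illustration rather than details given in the talk.

```python
from dataclasses import dataclass, field
from statistics import NormalDist


@dataclass
class Node:
    mu: float                 # mean of the model summarizing this subtree
    sigma: float              # standard deviation of that model (must be > 0)
    size: int = 1             # number of sensors the model stands for
    children: list = field(default_factory=list)


def locally_satisfies(node, w, delta):
    """Assumed local test: mass of the node's Gaussian within mu +/- w >= delta."""
    d = NormalDist(node.mu, node.sigma)
    return d.cdf(node.mu + w) - d.cdf(node.mu - w) >= delta


def greedy_traverse(node, w, delta):
    """Return (answers, visited): (mu, size) pairs and the number of nodes visited."""
    if locally_satisfies(node, w, delta) or not node.children:
        return [(node.mu, node.size)], 1
    answers, visited = [], 1
    for child in node.children:
        sub_answers, sub_visited = greedy_traverse(child, w, delta)
        answers += sub_answers
        visited += sub_visited
    return answers, visited


left = Node(20.25, 0.4, 2, [Node(20.0, 0.3), Node(20.5, 0.3)])
right = Node(30.2, 0.4, 2, [Node(30.0, 0.3), Node(30.4, 0.3)])
root = Node(25.2, 5.0, 4, [left, right])
answers, visited = greedy_traverse(root, w=1.0, delta=0.95)
print(answers, visited)
```

In this toy tree the root model is too wide, so the traversal descends one level and stops at the two internal nodes, visiting three nodes instead of all seven.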
Talk “outline” Compression In-network summaries Construction Traversal
Optimal Tree Construction • Given a structure, we know how to build the models • But how do we pick the structure?
Traversal = cut
• Theorem: in a fixed-fanout tree, the cost of the traversal is a function of |C|, the size of the cut, and F, the fanout
• Intuition: minimize the cut size
• Group nodes into a minimum number of groups that satisfy the query constraints
• This is a clustering problem
Optimal Clustering
• Given a query Q(w,δ), optimal clustering is NP-hard
  • Related to the Group Steiner Tree Problem
• Greedy algorithm with a log(n)-factor approximation
  • Greedily pick the maximum-size cluster
• Issue: does not enforce connectivity of clusters
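The greedy step can be illustrated on a simplified version of the problem. The sketch below assumes scalar readings, treats a cluster as valid when all of its readings fit in one window of width 2w (that is, δ = 1), and ignores network connectivity, which is exactly the issue noted above. The function name `greedy_clusters` and the data are hypothetical.

```python
def greedy_clusters(readings, w):
    """Greedily cover nodes with windows of width 2*w, largest cluster first.

    readings maps node id -> scalar value. Simplified illustration:
    no confidence parameter and no connectivity constraint.
    """
    remaining = dict(readings)
    clusters = []
    while remaining:
        best = []
        # Candidate windows start at each remaining reading.
        for start in remaining.values():
            members = [n for n, v in remaining.items()
                       if start <= v <= start + 2 * w]
            if len(members) > len(best):
                best = members
        clusters.append(sorted(best))
        for n in best:
            del remaining[n]
    return clusters


readings = {"a": 20.0, "b": 20.5, "c": 21.5, "d": 30.0, "e": 30.4}
clusters = greedy_clusters(readings, w=1.0)
print(clusters)
```

Each cluster can then be answered with a single value (for example the window midpoint), so fewer clusters correspond to a smaller cut.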
Greedy Clustering
• Include extra nodes to enforce connectivity
• Augment clusters only with accessible nodes (losing the log(n) guarantee)
Clustering comparison
• Two distributed clustering algorithms are compared to the centralized greedy clustering
Talk “outline” Compression Enriched models In-network summaries Construction Traversal
Enriched models
• Support more complex models
• k-mixtures
  • Compress to a k-size mixture instead of a single Gaussian model (SGM)
• Virtual nodes
  • Every component of the k-size mixture is stored as a separate “virtual node”
• SGMs on multiple windows
  • Maintain additional SGMs for different window sizes
• Cost: more space, more expensive model updates
Evaluation of enriched models
• The SGM is surprisingly effective in representing the underlying data
Talk “outline” Sensitivity analysis Compression In-network summaries Construction Traversal
Tree Construction Parameters and Effect on Performance
• Confidence
  • Performance for workloads whose confidence differs from the one the hierarchy was designed for
• Error window
  • Broader vs. narrower ranges of window sizes
  • Assignment of windows across tree levels
• Temporal changes
  • How often should the models be updated?
Confidence
• Workload of 0.95 confidence
• Design confidence does not have a big impact on performance
Error windows
• A wide range of window sizes is not always better, because it forces the traversal of more levels
Conclusions
• Analyzed compression schemes for in-network summaries (compression)
• Evaluated summary traversal (traversal)
• Studied optimal hierarchy construction (construction)
• Studied increased-complexity models (enriched models)
  • Showed that simple SGMs are sufficient
• Analyzed the effect of various parameters on efficiency (sensitivity analysis)