Explore efficient aggregation methods like tree-based and multi-path approaches in sensor networks. Discover the Tributary-Delta model combining both for robust data collection. Adapt aggregation based on loss rates and node communication.
Tributaries and Deltas: Efficient and Robust Aggregation in Sensor Networks • A. Manjhi, S. Nath, P. Gibbons (CMU) • ICS280sensors, Winter 2005
Introduction • Existing approaches to in-network aggregation: • Tree-based approach • The answer is generated by performing in-network aggregation along the tree • Proceeds level by level, starting from the leaves • Exact computation • Suffers under high communication failure rates • “Not uncommon to lose 80% of readings”
Introduction • Multi-path approach • Exploits the wireless broadcast medium • Nodes broadcast partial results to multiple neighbors • Uses a topology called rings • Nodes are divided into levels according to hop count from the base station (BS) • Aggregation is performed level by level up to the BS • Each reading is accounted for multiple times • Robust • Drawbacks: approximate answers and larger messages
Approach Comparison
Tributary-Delta overview • Combine the two approaches • Adapt the aggregation scheme to the current loss rate • Low loss: trees are used • For low or zero approximation error and small message size • High loss: multi-path is used • For robustness
Challenges • How do nodes decide whether to use tree or multi-path aggregation? • How do nodes using different approaches communicate? • How do nodes convert partial results when transitioning between approaches? • New algorithm for finding frequent items
More on multi-path • To construct a rings topology: • The BS transmits, and any node hearing the transmission is in ring 1 • Nodes in ring i transmit, and any node that hears the transmission but is not already in a ring is in ring i+1 • Each level-i node that hears a level-(i+1) partial result incorporates it into its own result • Low communication error
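The ring construction above amounts to a breadth-first search from the BS. A minimal sketch, assuming a hypothetical `hears` dictionary that maps each node to the nodes able to hear its transmissions (the name and the encoding are illustrative, not from the slides); the BS is treated as ring 0:

```python
from collections import deque

def build_rings(hears, base_station="BS"):
    """Assign every reachable node a ring number (hop count from the BS)."""
    ring = {base_station: 0}
    queue = deque([base_station])
    while queue:
        sender = queue.popleft()
        for listener in hears.get(sender, ()):
            if listener not in ring:              # not already in a ring
                ring[listener] = ring[sender] + 1
                queue.append(listener)
    return ring

# Hypothetical connectivity: who hears whom.
hears = {"BS": ["a", "b"], "a": ["b", "c"], "b": ["c", "d"], "c": ["e"]}
rings = build_rings(hears)
```

Nodes that never hear a transmission simply do not appear in `rings`, which matches the idea that only nodes connected to the BS take part in aggregation.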
More on multi-path • Special technique to avoid double-counting: synopsis (sketch) diffusion • Synopsis generation (SG): takes a stream of local sensor readings at a node and produces a partial result, called a synopsis • Synopsis fusion (SF): takes two synopses and generates a new one • Synopsis evaluation (SE): translates a synopsis into a query answer
More on multi-path • Example: count distinct items • Let n be an upper bound on the count • Let h() be a hash function from sensor ids to [1, …, lg(n)] • The SG function produces a bit vector of all 0’s and then sets the h(i)’th bit to 1 when it sees id i • The SF function is bitwise OR • The SE function takes a bit vector and outputs 2^(j-1)/0.77351, where j is the index of the lowest-order unset bit
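The three functions above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the hash `h` is a hypothetical one built on SHA-256 that returns bit index k with probability about 2^-k (the usual Flajolet-Martin choice, which the slide does not spell out), and `NUM_BITS` is an assumed vector length.

```python
import hashlib

NUM_BITS = 32  # assumed bit-vector length, about lg(n) for the upper bound n

def h(sensor_id):
    """Hypothetical hash: bit index k in [1, NUM_BITS], with P(k) about 2^-k."""
    digest = hashlib.sha256(str(sensor_id).encode()).digest()
    value = int.from_bytes(digest[:8], "big")
    k = 1
    while k < NUM_BITS and value & 1 == 0:
        value >>= 1
        k += 1
    return k

def sg(sensor_id):
    """Synopsis generation: all-zero vector with bit h(id) set (1-based)."""
    return 1 << (h(sensor_id) - 1)

def sf(s1, s2):
    """Synopsis fusion: bitwise OR, so duplicates and ordering do not matter."""
    return s1 | s2

def se(synopsis):
    """Synopsis evaluation: 2^(j-1)/0.77351, j = lowest-order unset bit."""
    j = 1
    while synopsis & (1 << (j - 1)):
        j += 1
    return 2 ** (j - 1) / 0.77351

# The same ids fused along two paths, one of them with duplicates.
path_a = 0
for i in range(1, 101):
    path_a = sf(path_a, sg(i))
path_b = 0
for i in list(range(100, 0, -1)) + [3, 3, 50]:
    path_b = sf(path_b, sg(i))
estimate = se(path_a)
```

Because fusion is an OR, `path_a` and `path_b` are the same synopsis, which is what lets a reading travel over several paths without being double-counted. The estimate itself is approximate.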
Tributary-Delta • View aggregation as a directed graph G • Sensor nodes and the BS are vertices • A directed edge represents a successful transmission • Each vertex is labeled either M or T, for multi-path or tree • Each edge takes the label of its source vertex • The labels may change over time
Tributary-Delta • Correctness criteria for topology construction • No two M vertices with partial results representing an overlapping set of sensors are connected to T vertices • Restriction adopted: a node receiving from an M node uses the M scheme • Edge correctness: an M edge can never be incident on a T vertex • Path correctness: in any directed path in G, a T edge can never appear after an M edge
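Both criteria are easy to check on a small labeled graph. A minimal sketch, assuming a hypothetical encoding in which `labels` maps each vertex to "M" or "T", `edges` is a list of (source, destination) pairs, and the BS is labeled M:

```python
def edge_correct(labels, edges):
    """An M edge (source labeled M) must never end at a T vertex."""
    return all(labels[dst] == "M" for src, dst in edges if labels[src] == "M")

def path_correct(labels, edges):
    """No directed path may contain a T edge after an M edge."""
    out = {}
    for src, dst in edges:
        out.setdefault(src, []).append(dst)
    for src, dst in edges:
        if labels[src] != "M":
            continue
        # Visit everything reachable from the head of this M edge.
        stack, seen = [dst], set()
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            if labels[v] == "T" and out.get(v):
                return False          # a T edge follows an M edge
            stack.extend(out.get(v, ()))
    return True

good_labels = {"t1": "T", "t2": "T", "t3": "T", "m1": "M", "BS": "M"}
good_edges = [("t1", "t2"), ("t2", "m1"), ("t3", "m1"), ("m1", "BS")]
bad_labels = {"m1": "M", "t2": "T", "BS": "M"}
bad_edges = [("m1", "t2"), ("t2", "BS")]
```

If every edge satisfies edge correctness, an M edge always leads to an M vertex, whose outgoing edges are again M edges, so path correctness follows. That is why the per-node restriction on the slide is sufficient.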
Tributary-Delta • Dynamic adaptation: • An M vertex is switchable if all its incoming edges are T edges, or it has no incoming edges (M1, M2) • A T vertex is switchable if its parent is an M vertex or it has no parent (T3, T4, T5) • Let G’ be the connected component of G that includes the BS • “If the set of T vertices in G’ is not empty, at least one of them is switchable. If the set of M vertices in G’ is not empty, at least one of them is switchable”
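The switchability rules can be written as one predicate. A minimal sketch, assuming the same hypothetical `labels` and `edges` encoding (vertex to "M"/"T", and (source, destination) pairs) and leaving the BS itself out of consideration:

```python
def is_switchable(vertex, labels, edges):
    """Apply the switchability rule for an M or a T vertex."""
    incoming = [src for src, dst in edges if dst == vertex]
    parents = [dst for src, dst in edges if src == vertex]
    if labels[vertex] == "M":
        # All incoming edges are T edges (vacuously true with no incoming edges).
        return all(labels[src] == "T" for src in incoming)
    # T vertex: its parent is an M vertex, or it has no parent.
    return all(labels[p] == "M" for p in parents)

labels = {"BS": "M", "m1": "M", "m2": "M", "t1": "T", "t2": "T", "t3": "T"}
edges = [("t1", "t2"), ("t2", "m1"), ("t3", "m1"), ("m1", "m2"), ("m2", "BS")]
switchable = sorted(v for v in labels
                    if v != "BS" and is_switchable(v, labels, edges))
```

In this toy graph the switchable vertices sit exactly on the boundary between the tree region and the multi-path region, so switching any one of them keeps the edge-correctness rule intact.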
Adaptation design • The user specifies a threshold on the minimum percentage of nodes that should contribute to the aggregate answer • Depending on the percentage of nodes contributing to the current result, the BS decides whether to shrink or expand the delta region for future results • Expanding the delta region increases the percentage contributing • Key concern when switching nodes between tree and multi-path aggregation: synchronization of transmitting and receiving • Design choice (to ensure switched nodes can retain the current epoch): • From M to T: a level-i node must choose its parent from among its neighbors in level i-1 • From T to M: the node transmits to all its neighbors in level i-1
Adaptation strategies • TD-coarse: if the percentage contributing is below the user-specified threshold, all currently switchable T nodes are switched to M • TD: • Each switchable M node includes in its outgoing messages an additional field: the number of nodes in its subtree that are not contributing • The max and min of these numbers are maintained • If the percentage is below the threshold: the BS expands the delta region by switching from T to M all children of the switchable M nodes whose subtree has the max number of nodes not contributing • When shrinking: switch each switchable M node whose subtree has only the min number of nodes not contributing (?) • Trade-off: higher convergence time (will it converge?)
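One expansion step of TD-coarse can be sketched as follows. This covers only the below-threshold case stated on the slide; the `parent` dictionary (T node to its tree parent) and the function name are hypothetical.

```python
def td_coarse_expand(labels, parent, pct_contributing, threshold):
    """Return new labels after one TD-coarse expansion step."""
    new_labels = dict(labels)
    if pct_contributing >= threshold:
        return new_labels                      # nothing to expand
    for v, lab in labels.items():
        if lab != "T":
            continue
        p = parent.get(v)
        # A T node is switchable if it has no parent or its parent is M.
        if p is None or labels[p] == "M":
            new_labels[v] = "M"
    return new_labels

labels = {"BS": "M", "m1": "M", "t1": "T", "t2": "T", "t3": "T"}
parent = {"m1": "BS", "t1": "m1", "t2": "t1", "t3": "BS"}
step1 = td_coarse_expand(labels, parent, pct_contributing=70, threshold=90)
step2 = td_coarse_expand(step1, parent, pct_contributing=80, threshold=90)
```

Switchability is evaluated against the labels at the start of the step, so the delta region grows by one tree level per step: `t2` only becomes switchable after its parent `t1` has switched.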
Identify frequent items • The problem: • Each of m sensor nodes generates a collection of items • Given a user-supplied error tolerance e, the goal is to obtain, for each item u, an e-deficient count c’(u) at the BS: • max{0, c(u) - e*N} <= c’(u) <= c(u) • Where c(u) is the true count of u and N = sum over all u of c(u)
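The e-deficient guarantee can be checked directly against the definition. A minimal sketch; the function name and the dictionary encoding of counts are illustrative choices:

```python
def is_e_deficient(true_counts, est_counts, e):
    """Check max{0, c(u) - e*N} <= c'(u) <= c(u) for every item u."""
    n = sum(true_counts.values())
    items = set(true_counts) | set(est_counts)
    return all(
        max(0, true_counts.get(u, 0) - e * n)
        <= est_counts.get(u, 0)
        <= true_counts.get(u, 0)
        for u in items
    )

true_counts = {"a": 50, "b": 30, "c": 20}     # N = 100
ok = is_e_deficient(true_counts, {"a": 45, "b": 30, "c": 12}, e=0.1)
too_low = is_e_deficient(true_counts, {"a": 39, "b": 30, "c": 20}, e=0.1)
```

With N = 100 and e = 0.1, each estimate may undercount by at most 10 and may never overcount, so an estimate of 39 for item "a" violates the bound.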
Identify frequent items – tree algorithm • The partial result sent by a node X to its parent is a summary: • S = <N, e, {(u, c’(u))}> • Each c’(u) satisfies max{0, c(u) - e*N} <= c’(u) <= c(u) • The approach is to distribute the error tolerance e among the intermediate nodes of the tree • Make e(i) a function of the height i of a node (the height of a leaf node is 1) • For correctness: e(1) <= e(2) <= … <= e(h) • As long as e(h) <= e, the user guarantee is met • This is called a precision gradient • At each node: items whose count is at most e*N are dropped from the summary
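The idea of spending the error budget gradually up the tree can be sketched as below. This is a simplified variant, not the algorithm from the slides: it assumes each node of height i drops items whose combined count is at most (e(i) - e(i-1))*N for its subtree, with e(0) = 0, and it uses a hypothetical linear gradient e(i) = e*i/h. The `children` and `local_items` dictionaries are illustrative encodings.

```python
from collections import Counter

def summarize(node, children, local_items, gradient):
    """Return (height, N, counts) for the subtree rooted at `node`."""
    counts = Counter(local_items.get(node, ()))
    n = sum(counts.values())
    height = 1
    for child in children.get(node, ()):
        c_height, c_n, c_counts = summarize(child, children, local_items, gradient)
        counts.update(c_counts)               # add the child's partial counts
        n += c_n
        height = max(height, c_height + 1)
    # Error budget newly available at this height (assumed pruning rule).
    slack = gradient(height) - gradient(height - 1)
    kept = {u: c for u, c in counts.items() if c > slack * n}
    return height, n, kept

E, H = 0.375, 3                                # tolerance and tree height
def linear_gradient(i):
    return E * i / H                           # hypothetical e(i), e(H) = E

children = {"root": ["a", "b"], "a": ["c", "d"]}
local_items = {
    "c": ["x"] * 18 + ["y"],
    "d": ["x"] * 10 + ["z"] * 5 + ["y"],
    "a": ["y"] * 2,
    "b": ["x"] * 4 + ["w"] * 3,
}
height, total, summary = summarize("root", children, local_items, linear_gradient)
```

Under this rule the undercount accumulated in a subtree of height i stays within e(i)*N, so the answer at the root is e-deficient as long as e(h) <= e. Infrequent items such as "y" are pruned close to the leaves instead of being carried all the way to the BS.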
Identify frequent items – tree algorithm
Min Total-Load algorithm • d-dominating tree: for any d >= 1, we say that a tree is d-dominating if for any i >= 1, H(i) >= (d-1)/d * (1 + 1/d + … + 1/d^(i-1)) • Where H(i) = 1/m * SUM over j = 1..i of h(j), with h(j) being the number of nodes at height j, and m the total number of nodes • If a tree is d-dominating but not (d+delta)-dominating, refer to d as the domination factor
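The definition can be tested on a list of per-height node counts. A minimal sketch, reading H(i) as the fraction of nodes at height at most i and using the closed form (d-1)/d * (1 + 1/d + … + 1/d^(i-1)) = 1 - 1/d^i for the right-hand side; the function name and the small tolerance are illustrative:

```python
def is_d_dominating(height_counts, d):
    """height_counts[j-1] is h(j), the number of nodes at height j."""
    m = sum(height_counts)
    cumulative = 0
    for i, h_i in enumerate(height_counts, start=1):
        cumulative += h_i
        bound = 1 - d ** (-i)                 # closed form of the geometric sum
        if cumulative / m < bound - 1e-12:    # tolerance for float rounding
            return False
    # For i above the tree height, H(i) = 1, which exceeds every bound.
    return True

binary_tree = [4, 2, 1]    # 4 leaves, 2 nodes of height 2, 1 root
chain = [1, 1, 1]          # a path of three nodes
```

For the perfect binary tree, H(1) = 4/7, H(2) = 6/7 and H(3) = 1 clear the bounds 1/2, 3/4 and 7/8, so it is 2-dominating; it fails for d = 3 because 4/7 < 2/3. The chain is only 1-dominating.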
Min Total-Load algorithm • Lemma: for any d-dominating tree of m nodes, where d > 1, a precision gradient setting of e(i) = e*(1-t)*(1 + t + … + t^(i-1)) with t = 1/sqrt(d) limits total communication to (1 + 2/(sqrt(d)-1))*m/e • Follows from step 3 of Alg. 1: at most 1/(e(i) - e(i-1)) items are sent by a node at height i to its parent
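To get a feel for the lemma, the gradient and the bound can be evaluated numerically. A minimal sketch that only evaluates the two formulas from the slide; the function names and the sample values of e, d and m are illustrative:

```python
import math

def precision_gradient(e, d, i):
    """e(i) = e*(1-t)*(1 + t + ... + t^(i-1)) with t = 1/sqrt(d)."""
    t = 1 / math.sqrt(d)
    return e * (1 - t) * sum(t ** k for k in range(i))

def total_load_bound(m, e, d):
    """Upper bound (1 + 2/(sqrt(d)-1)) * m / e on total communication."""
    return (1 + 2 / (math.sqrt(d) - 1)) * m / e

e, d, m = 0.125, 4, 100
gradient = [precision_gradient(e, d, i) for i in range(1, 6)]
bound = total_load_bound(m, e, d)
```

Since the geometric sum equals (1 - t^i)/(1 - t), the gradient simplifies to e(i) = e*(1 - t^i). It increases with height and stays below e, which is what the correctness condition e(1) <= … <= e(h) <= e requires. For d = 4 the bound works out to 3*m/e.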
Min Total-Load algorithm • Lemma: a tree in which each internal node of height i has at least d children of height i-1 is d-dominating • Construction of a topology with a large domination factor: • Each node of height i+1 that has two or more children of height i pins down any two of them so that they cannot switch parents, and flags itself • Non-pinned nodes in each level j switch parents randomly to any other reachable non-flagged node in level j-1 • As soon as a non-flagged node has at least two flagged children of the same height, it pins both of them and then flags itself • This makes the tree 2-dominating
Identify frequent items – multi-path algorithm • Replace the + operator with a duplicate-insensitive addition operator • Synopsis generation, fusion, and evaluation all depend on which duplicate-insensitive addition algorithm is used
Results