Maximizing a Submodular Utility for Deadline Constrained Data Collection in Sensor Networks

Zizhan Zheng and Ness B. Shroff
Presenter: Wenzhuo Ouyang
Department of Electrical and Computer Engineering, The Ohio State University

Outline
• Motivation
• System Model and Problem Formulation
• Approximation Algorithms
• Simulations
• Conclusion and Future Work

Motivation
• Data collection in a sensor network
  • Each node holds some sensing data for an event
  • A sink collects data through a routing tree
• Utility maximization
  • Collecting data from all the nodes is often infeasible
    • delay: large network, real-time data request
    • energy, …
  • Trading off data quality and communication cost
  • Redundancy in the data: spatial, temporal
• Data collection under a deadline constraint

System Model
• A routing tree $T = (V, E)$ rooted at the sink $s_0$
• Each node has at most one data packet ready to deliver
• Time-slotted system, 1-hop interference model
• One packet can be forwarded in each time slot
• Links are reliable
• Two data collection schemes
  • Raw data forwarding: complicated post-processing of data needed
  • In-network data aggregation: MAX, MIN, SUM, etc.
    • aggregation is perfect
• A utility function defined on subsets of nodes
  • $f: 2^V \to \mathbb{R}_+$, where $f(S)$ gives the utility of nodes in a subset $S \subseteq V$
  • e.g., $f(S) = \sum_{i \in S} w_i$ for some $w_i \in \mathbb{R}_+$ (additive utility)

Problem Formulation
• Deadline constrained utility maximization
  $\max f(S)$ s.t. $S \subseteq V$, $L_F(S) \le D$ (resp. $L_A(S) \le D$)   (1)
• $D$: deadline constraint (number of time slots allowed)
• $L_F(S)$ (resp. $L_A(S)$): minimum number of time slots needed for forwarding (resp. aggregating) all the packets in $S \subseteq V$ to the sink
• Examples (figure not reproduced): data forwarding, $L_F(S) = 5$; data aggregation, $L_A(S) = 3$
• $L_F(S) \gg L_A(S)$ in a larger setting

Beyond Additivity: Monotone Submodular Utility
• For additive utility and for data aggregation only, an efficient polynomial-time solution to Problem (1) is known using dynamic programming (Hariharan and Shroff’09).
• An additive utility largely ignores the spatial correlation of sensor nodes.
• Our contribution: efficient algorithms for maximizing a more general form of utility that captures a large class of spatial correlation, for both data forwarding and data aggregation.
• Assumptions about the utility function $f: 2^V \to \mathbb{R}_+$
  • Normalized: $f(\emptyset) = 0$
  • Monotone: $f(S) \le f(T)$ $\forall S \subseteq T \subseteq V$
  • Submodular: $f(S \cup \{a\}) - f(S) \ge f(T \cup \{a\}) - f(T)$ $\forall S \subseteq T \subseteq V$ and $a \in V \setminus T$
    • A discrete counterpart of concavity (‘diminishing returns’)
    • Includes additive utility as a special case
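The three assumptions can be checked by brute force on a small ground set. The sketch below is only an illustration: the helper names (`subsets`, `is_monotone_submodular`), the weights, and the counterexample `squared_size` are hypothetical and not taken from the slides. The check enumerates all pairs of nested subsets, so it is only practical for a handful of nodes.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_monotone_submodular(f, ground):
    """Brute-force check: normalized, monotone, and submodular."""
    ground = frozenset(ground)
    if f(frozenset()) != 0:                       # normalized: f(empty) = 0
        return False
    for T in subsets(ground):
        for S in subsets(T):                      # every S that is a subset of T
            if f(S) > f(T):                       # monotone: f(S) <= f(T)
                return False
            for a in ground - T:                  # diminishing returns for a outside T
                if f(S | {a}) - f(S) < f(T | {a}) - f(T):
                    return False
    return True

# Hypothetical integer weights w_i for an additive utility.
weights = {"a": 2, "b": 1, "c": 3}

def additive(S):
    return sum(weights[i] for i in S)

def squared_size(S):
    # Normalized and monotone, but marginal gains grow, so not submodular.
    return len(S) ** 2

print(is_monotone_submodular(additive, weights))      # additive utilities pass
print(is_monotone_submodular(squared_size, weights))  # increasing returns fail
```

The additive utility passes with equality in the submodularity condition, which is why it is a special case.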

Submodular Utility for Sensor Selection
• Area and point coverage in a disk sensing model
  • $f(S)$ = number of points covered by nodes in $S$
  • Ex. (figure with nodes a, b, c not reproduced): $S = \{b\}$, $T = \{b, c\}$, $f(S \cup \{a\}) - f(S) = 2$, $f(T \cup \{a\}) - f(T) = 1$
• Mutual information (Krause et al.’07)
  • Given random variables defined on nodes, $X_1, \dots, X_{|V|}$
  • $f(S) = I(X_S; X_{V \setminus S}) = H(X_{V \setminus S}) - H(X_{V \setminus S} \mid X_S)$
• Variance reduction for modeling sensing uncertainty (under a mild condition) (Das and Kempe’08, Krause et al.’08)
• Maximum a posteriori (MAP) estimate and a variant of maximum likelihood (ML) estimate for parameter estimation (Shamaiah et al.’10)
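The coverage example can be reproduced numerically. The point sets below are hypothetical: the original figure is not available, so they were chosen only so that the marginal gains match the numbers on the slide (2 for S = {b} and 1 for T = {b, c}).

```python
# Hypothetical coverage sets: which target points each node covers.
covers = {"a": {1, 2}, "b": {3}, "c": {2, 4}}

def coverage(S):
    """f(S) = number of points covered by at least one node in S."""
    covered = set()
    for node in S:
        covered |= covers[node]
    return len(covered)

def marginal_gain(f, S, a):
    """f(S ∪ {a}) - f(S): the extra utility from adding node a to S."""
    S = set(S)
    return f(S | {a}) - f(S)

print(marginal_gain(coverage, {"b"}, "a"))        # gain of a on top of S = {b}
print(marginal_gain(coverage, {"b", "c"}, "a"))   # gain of a on top of T = {b, c}
```

Node a is worth less once c is selected because the two share a covered point, which is the diminishing-returns effect.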

Challenges
• Most previous work on sensor selection focuses on maximizing a submodular function subject to a cardinality constraint: $\max f(S)$ s.t. $S \subseteq V$, $|S| \le D$
  • Problem (1) reduces to this special case for a tree of height 1.
  • There is no $(1 - 1/e + \epsilon)$-approximation for any $\epsilon > 0$ unless P = NP for a general monotone submodular function (Feige’98).
• Problem (1) in its general form is more challenging due to the multi-hop data forwarding nature and 1-hop interference.
• For additive utility, an efficient solution to Problem (1) is known for data aggregation (Hariharan and Shroff’09), but remains open for data forwarding.

Main Results
• For data forwarding, a simple greedy algorithm achieves a 1/2-approximation when the sink has a single child, and a 1/3-approximation in general. For additive utility, the greedy algorithm is optimal in the first case, and achieves a factor of 1/2 in general.
• For data aggregation, a bi-criteria approximation can be achieved, which finds a solution with a utility at least a fraction of the optimal utility and a delay at most $\rho_T D$.
  • $D$: deadline constraint
  • $h_T$: height of the tree
  • $\rho_T$: a parameter determined by the tree structure and bounded by the maximum node degree. We expect it to be typically small (< 2).

Deadline Constrained Data Forwarding
• Problem: $\max f(S)$ s.t. $S \subseteq V$, $L_F(S) \le D$
• A simple greedy algorithm (Algorithm 1)
    $S \leftarrow \emptyset$.
    while true do
      $A \leftarrow \{a : a \in V \setminus S \text{ and } L_F(S \cup \{a\}) \le D\}$.
      if $A = \emptyset$ then break.
      $a \leftarrow \arg\max_{a \in A} f(S \cup \{a\}) - f(S)$.
      $S \leftarrow S \cup \{a\}$.
    return $S$.
• Need to know $L_F(S)$ for $S \subseteq V$
  • For a tree network subject to 1-hop interference, the minimum delay schedule can be determined (Florens et al.’04)
  • But a key construction in the above approach needs to be fixed, which is critical both for ensuring the correctness of the algorithm and for its analysis
• Focus on the membership oracle (is $L_F(S) \le D$?); assume a value oracle for $f$ is given
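Algorithm 1 can be sketched with the two oracles passed in as functions. This is a minimal sketch, not the authors' implementation: `greedy`, `f`, and `is_feasible` are hypothetical names, and the membership oracle used in the example is the cardinality test |S| ≤ D, which corresponds to the special case of a tree of height 1 rather than to a general computation of L_F(S). The coverage sets are made up for illustration.

```python
def greedy(ground, f, is_feasible):
    """Greedy selection with a value oracle f and a membership oracle is_feasible."""
    S = set()
    while True:
        # Candidates that keep the selection feasible (deadline respected).
        candidates = [a for a in sorted(ground)
                      if a not in S and is_feasible(S | {a})]
        if not candidates:
            break
        # Pick the candidate with the largest marginal gain (first one on ties).
        best = max(candidates, key=lambda a: f(S | {a}) - f(S))
        S.add(best)
    return S

# Hypothetical coverage utility on four nodes.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}

def coverage(S):
    covered = set()
    for node in S:
        covered |= covers[node]
    return len(covered)

D = 2  # deadline; in a height-1 tree each selected node needs one slot

def within_deadline(S):
    return len(S) <= D

chosen = greedy(covers.keys(), coverage, within_deadline)
print(sorted(chosen), coverage(chosen))
```

Swapping `within_deadline` for a routine that evaluates L_F(S) ≤ D on the routing tree would give the forwarding version; the greedy loop itself does not change.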

Analysis of the Greedy Algorithm
• Submodular maximization over p-systems
• Our problem: $\max f(S)$ s.t. $S \in \mathcal{I}_F$, where $\mathcal{I}_F := \{S \subseteq V : L_F(S) \le D\}$
• Proposition: $(V, \mathcal{I}_F)$ is a 1-system when the sink has only one child, and a 2-system in general.
• The approximation factors of the greedy algorithm follow directly from the proposition and the lemma below.
• p-system: a pair $(A, \mathcal{I})$, $\mathcal{I} \subseteq 2^A$, such that
  (i) $\emptyset \in \mathcal{I}$,
  (ii) $\forall S \subseteq A$, if $S \in \mathcal{I}$ and $S' \subseteq S$, then $S' \in \mathcal{I}$ (sets in $\mathcal{I}$ are called independent sets),
  (iii) $\forall S \subseteq A$, the size of the maximum independent set in $S$ is at most $p$ times the size of any maximal independent set in $S$.
  • Ex: for the edge set $E$ and the matchings $M$ in a graph, $(E, M)$ is a 2-system.
• Lemma: For a p-system $(A, \mathcal{I})$ and $f: 2^A \to \mathbb{R}_+$, the problem $\max f(S)$ s.t. $S \in \mathcal{I}$ can be approximated by the greedy algorithm within a factor of $1/(p+1)$ if $f$ is monotone submodular and $f(\emptyset) = 0$, and within a factor of $1/p$ if $f$ is additive. (Fisher et al.’78)
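Condition (iii) can be explored by brute force on a tiny instance. The sketch below computes, for the matchings of a 3-edge path graph, the largest ratio between a maximum and a maximal independent set over all subsets of the ground set. The graph and the helper names (`powerset`, `is_matching`, `smallest_p`) are hypothetical, and the maximality test assumes the family is downward closed, as in condition (ii).

```python
from fractions import Fraction
from itertools import combinations

def powerset(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_matching(edges):
    """Independent = no two edges share an endpoint."""
    endpoints = [v for e in edges for v in e]
    return len(endpoints) == len(set(endpoints))

def smallest_p(ground, independent):
    """Smallest p for which condition (iii) holds, by exhaustive search."""
    worst = Fraction(1)
    for S in powerset(ground):
        indep = [I for I in powerset(S) if independent(I)]
        # Maximal in S: no element of S can be added while staying independent.
        maximal = [I for I in indep
                   if not any(independent(I | {x}) for x in S - I)]
        largest = max(len(I) for I in indep)
        smallest_maximal = min(len(I) for I in maximal)
        if smallest_maximal > 0:
            worst = max(worst, Fraction(largest, smallest_maximal))
    return worst

# Path graph 0 - 1 - 2 - 3: the middle edge alone is a maximal matching of
# size 1, while the two outer edges form a matching of size 2.
path_edges = [(0, 1), (1, 2), (2, 3)]
print(smallest_p(path_edges, is_matching))
```

On this instance the ratio 2 is attained, so the bound p = 2 for matchings cannot be improved in general.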

Simulations
• 1000×1000 2D area (5×5 grid), 1000 target points, 200 sensor nodes (sensing range 100, communication range 200).
• $f(S)$: number of target points covered by nodes in $S$.
• A randomly selected node as the sink, a routing tree built by breadth-first search.
• Compared with a random node selection algorithm
  • Greedy algorithm performs 70% better
  • Bi-criteria algorithm performs up to 50% better, and 25% better on average.
    • Minimum delay / deadline < 2.5, and ≈ 1.5 on average.
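A setup of this kind might be generated as follows. This is a rough sketch under several assumptions not stated on the slide: nodes and targets are placed uniformly at random (the role of the 5×5 grid is not modeled), nodes within communication range are neighbors, and nodes unreachable from the sink are simply left out of the tree. The function name `build_instance` and the seed are hypothetical.

```python
import math
import random
from collections import deque

def build_instance(n_nodes=200, n_targets=1000, side=1000.0,
                   sense_range=100.0, comm_range=200.0, seed=0):
    """Random nodes/targets, a BFS routing tree, and per-node coverage sets."""
    rng = random.Random(seed)
    nodes = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n_nodes)]
    targets = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n_targets)]
    sink = rng.randrange(n_nodes)          # randomly selected sink

    # Breadth-first search from the sink; parent[v] is v's next hop.
    parent = {}
    seen = {sink}
    queue = deque([sink])
    while queue:
        u = queue.popleft()
        for v in range(n_nodes):
            if v not in seen and math.dist(nodes[u], nodes[v]) <= comm_range:
                seen.add(v)
                parent[v] = u
                queue.append(v)

    # covers[i] = indices of the target points within sensing range of node i.
    covers = [{t for t, p in enumerate(targets)
               if math.dist(nodes[i], p) <= sense_range}
              for i in range(n_nodes)]
    return nodes, sink, parent, covers

nodes, sink, parent, covers = build_instance()

def coverage(S):
    """f(S): number of target points covered by the nodes in S."""
    return len(set().union(*(covers[i] for i in S)))

print(len(parent) + 1, "nodes reachable from the sink")
print(coverage(range(len(nodes))), "targets covered by all nodes")
```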

Conclusion and Future Work
• We have proposed efficient approximation algorithms for two data collection schemes over a tree network subject to 1-hop interference, for maximizing a submodular utility subject to a deadline constraint.
• We plan to extend our work to more general settings, e.g.,
  • Unreliable wireless links
  • Imperfect aggregation
  • Other interference models
  • Other types of constraints, e.g., an energy constraint on each node
  • Joint optimization of tree construction and sensing set selection

Thank you!

Differences with Network Utility Maximization
• Traditional NUM model for a multi-hop wireless network
  • A set of users (flows), each with a source, a destination, a real data rate $x_s$, and a utility $U_s(x_s)$, typically non-decreasing and strictly concave
  • Objective: $\max \sum_s U_s(x_s)$ s.t. the system is stable subject to some interference constraint
• Major differences in our setting
  • No exogenous arrivals (packets ready at time 0)
  • A deadline constraint: bounding the delay for data collection
  • Binary decision variables: for each node, whether to deliver data or not
  • A separable utility $\Rightarrow$ an additive utility
    • $\sum_s U_s(x_s) = \sum_s w_s x_s$ for some $w_s \in \mathbb{R}_+$
    • largely ignores the spatial correlation of sensed data
  • A set function is more natural: $f(S)$ gives the utility of nodes in set $S$.

Challenges
• Most previous work on sensor selection focuses on maximizing a submodular function subject to a cardinality constraint: $\max f(S)$ s.t. $S \subseteq V$, $|S| \le D$
  • Problem (1) reduces to this special case for a tree of height 1.
  • There is no $(1 - 1/e + \epsilon)$-approximation for any $\epsilon > 0$ unless P = NP for a general monotone submodular function (Feige’98).
• For additive utility, an efficient solution to Problem (1) is known for data aggregation (Hariharan and Shroff’09), but remains open for data forwarding.
• For a general monotone submodular utility, and when the 1-hop interference model is replaced by the ‘clique’ model, Problem (1) for data aggregation is closely related to the group Steiner problem, and the latter is hard to approximate within a logarithmic factor (Halperin and Krauthgamer’03).

Minimum Delay Data Forwarding
• Main ideas (Florens et al.’04)
  • Consider data dissemination instead
  • Map a tree network to a multi-line network
• Our construction (figure not reproduced; labels: 5 time slots, 3 time slots, 3 hops, 5 hops)

Deadline Constrained Data Aggregation
• Problem: $\max f(S)$ s.t. $S \subseteq V$, $L_A(S) \le D$
• Observations
  • Without loss of optimality, a node should wait until it receives all the packets from its children, and then forward one aggregated packet.
  • $L_A(S) = L_A(T(S))$, where $T(S)$ is the minimum subtree spanning $S \cup \{s_0\}$
• The simple greedy algorithm is still applicable.
  • $L_A(S)$ for any $S \subseteq V$ can be determined recursively
• However, proving a performance bound for it eludes us.
  • Obscure structure of the feasible set under the 1-hop interference model
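One way such a recursion might look is sketched below. It rests on an assumption the slide does not spell out: 1-hop interference is taken to mean that a node can take part in at most one transmission per slot, so a parent receives from its children one at a time while disjoint subtrees work in parallel. Under that assumption, serving the children in order of readiness (earliest first) minimizes the parent's completion time. The function and variable names and the example tree are hypothetical.

```python
def aggregation_delay(children, selected, root):
    """Slots needed to aggregate the packets of `selected` at `root`.

    children: dict mapping a node to the list of its children in the tree.
    selected: set of nodes whose packets must be collected.
    """
    def ready_time(v):
        # Number of slots after which v holds the aggregate of its subtree,
        # or None if the subtree of v contains no selected node.
        times = sorted(t for t in (ready_time(c) for c in children.get(v, []))
                       if t is not None)
        if not times:
            return 0 if v in selected else None
        finish = 0
        for t in times:
            # A child ready after t slots needs one more slot, and v can
            # receive from only one child per slot.
            finish = max(finish, t) + 1
        return finish

    result = ready_time(root)
    return 0 if result is None else result

# Hypothetical tree: sink s0 with children 1 and 2; node 1 has children 3 and 4.
children = {"s0": [1, 2], 1: [3, 4]}
print(aggregation_delay(children, {1, 2, 3, 4}, "s0"))
print(aggregation_delay(children, {3}, "s0"))
```

In the first call, node 2 can send to the sink while node 3 sends to node 1, so three slots suffice; in the second, only the path from node 3 to the sink is used.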

Deadline Constrained Data Aggregation (cont.)
• Main idea: approximation by the ‘clique’ interference model.
  • At most one node can transmit at any time
  • The minimum delay for aggregating packets on $S \subseteq V$ equals $|T(S)|$.
• Consider the new problem: $\max f(S)$ s.t. $S \subseteq V$, $|T(S)| \le D'$.
• Identify a proper $D'$ to connect the two problems.
• Proposition: Let $\mathcal{I}_1 := \{S \subseteq V : |T(S)| \le D'\}$. Then $(V, \mathcal{I}_1)$ is an $h_T$-system, where $h_T$ is the height of the tree.
• Corollary: The greedy algorithm is a factor $1/(h_T + 1)$ approximation to the new problem.
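The subtree T(S) can be computed by walking from each selected node toward the sink until an already visited node is reached. The sketch below is illustrative: the names and the example tree are hypothetical, and because the slide does not say whether |T(S)| counts the sink, the helper `clique_delay` assumes it counts transmissions, that is, one slot per non-sink node of T(S).

```python
def spanning_subtree(parent, selected, root):
    """Nodes of the minimum subtree spanning `selected` and `root`.

    parent: dict mapping every non-root node to its parent in the routing tree.
    """
    nodes = {root}
    for v in selected:
        # Climb toward the root until we hit a node already in the subtree.
        while v not in nodes:
            nodes.add(v)
            v = parent[v]
    return nodes

def clique_delay(parent, selected, root):
    """Slots needed when only one node may transmit at a time (assumed:
    every non-root node of T(S) transmits exactly once)."""
    return len(spanning_subtree(parent, selected, root)) - 1

# Hypothetical tree: s0 <- 1, s0 <- 2, 1 <- 3, 1 <- 4, 3 <- 5.
parent = {1: "s0", 2: "s0", 3: 1, 4: 1, 5: 3}
print(sorted(spanning_subtree(parent, {5, 2}, "s0"), key=str))
print(clique_delay(parent, {5, 2}, "s0"))
```

Selecting node 5 pulls in the relay nodes 3 and 1 even though they are not in S, which is why the constraint is on |T(S)| rather than |S|.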

A Bi-criteria Approximation (Algorithm 2)
  1. $h \leftarrow \min(h_T, D)$, and remove nodes in $T$ at level larger than $h$.
  2. $D' \leftarrow$ the maximum cardinality of any subtree $T_1 \subseteq T$ with root $s_0$ and minimum delay bounded by $D$ (Hariharan and Shroff’09).
  3. Find a maximum utility subtree $T_2 \subseteq T$ with root $s_0$ and size bounded by $D'$ using the greedy algorithm.
  4. Expand $T_2$ greedily without further increasing the minimum delay.
• Proposition: The algorithm finds a subtree with a utility at least a fraction of the optimal utility and a minimum delay at most $\rho_T D$, with $\rho_T \le \Delta_T$ (the maximum node degree).
