Simultaneous Placement and Scheduling of Sensors Andreas Krause, Ram Rajagopal, Anupam Gupta, Carlos Guestrin rsrg@caltech ..where theory and practice collide
Traffic monitoring • CalTrans wants to deploy wireless sensors under highways and arterial roads • Deploying sensors is expensive (need to close and open up roads etc.) Where should we place the sensors? • Battery lifetime ≈ 3 years • Need 10+ years lifetime for a feasible deployment • Solution: sensor scheduling (e.g., activate every 4 days) When should we activate each sensor?
Monitoring water networks • Contamination of drinking water could affect millions of people • Place sensors to detect contaminations (water flow simulator from EPA) • "Battle of the Water Sensor Networks" competition • Sensor hardware: YSI 6600 Sonde (~$7K, 75 days) Where and when should we sense to detect contamination?
Traditional approach 1.) Sensor Placement: Find most informative locations 2.) Sensor Scheduling: Find most informative activation times (e.g., assign to groups + round robin)
Our approach If we know that we need to schedule, why not take that into account during placement? Simultaneously optimize over placement and schedule: 1.) Sensor Placement: Find most informative locations 2.) Sensor Scheduling: Find most informative activation times (e.g., assign to groups + round robin)
Model-based sensing • Utility of sensing based on a model of the world • For traffic monitoring: learn probabilistic models from data (later) • For water networks: water flow simulator from EPA • For each subset A ⊆ V of all network junctions, compute a "sensing quality" F(A) • The model predicts the impact (high, medium, low) of a contamination at each location; a sensor reduces impact through early detection, so a poor placement has low sensing quality (e.g., F(A) = 0.01) and a good one has high sensing quality (e.g., F(A) = 0.9)
Problem formulation Sensor Placement: Given: finite set V of locations, sensing quality F. Want: A* ⊆ V maximizing F(A*) subject to a budget on the number of sensors. Sensor Scheduling: Given: sensor placement A* ⊆ V. Want: partition A* = A1* ∪ A2* ∪ … ∪ Ak*, where At* = sensors activated at time t, maximizing the average performance over time, (1/k) Σt F(At*).
The SPASS Problem Simultaneous placement and scheduling (SPASS): Given: finite set V of locations, sensing quality F. Want: disjoint sets A1*, …, Ak* with |A1* ∪ … ∪ Ak*| ≤ m, maximizing the average performance (1/k) Σt F(At*), where At = sensors activated at time t. Typically NP-hard!
Greedy average-case placement and scheduling (GAPS) Greedily choose a sensor location s and a time step t at which to activate s: Start with A1, …, Ak = ∅. For i = 1 to m: (s*, t*) := argmax(s,t) F(At ∪ {s}) − F(At); At* := At* ∪ {s*}. (The contribution of s to F(At) is its marginal gain F(At ∪ {s}) − F(At).) How well can this simple heuristic do?
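The following minimal Python sketch illustrates the GAPS greedy rule above; it is not the authors' implementation, and the function names and the plain set-based representation of F are illustrative assumptions.

```python
# Minimal sketch of the GAPS greedy rule above (not the authors' code).
# Assumes: F is a monotone set function taking a Python set and returning a
# float; V is the ground set of candidate locations; k time steps; budget m.

def gaps(F, V, k, m):
    """Greedily assign up to m sensors to k time steps, maximizing sum_t F(A_t)."""
    buckets = [set() for _ in range(k)]          # A_1, ..., A_k, initially empty
    placed = set()
    for _ in range(m):
        best_pair, best_gain = None, 0.0
        for s in set(V) - placed:
            for t in range(k):
                # marginal gain of adding sensor s at time step t
                gain = F(buckets[t] | {s}) - F(buckets[t])
                if gain > best_gain:
                    best_pair, best_gain = (s, t), gain
        if best_pair is None:                    # no positive marginal gain left
            break
        s_star, t_star = best_pair
        buckets[t_star].add(s_star)
        placed.add(s_star)
    return buckets
```

Each greedy step above costs on the order of |V|·k evaluations of F; a practical implementation would cache marginal gains (lazy evaluation) to speed this up.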
Key property: Diminishing returns Compare placement A = {S1, S2} with placement B = {S1, S2, S3, S4}, so A ⊆ B, and consider a new sensor S'. Adding S' to the small placement A will help a lot (large improvement), while adding S' to the larger placement B doesn't help much (small improvement). Submodularity: for A ⊆ B, F(A ∪ {S'}) − F(A) ≥ F(B ∪ {S'}) − F(B). Theorem [Krause et al., J Wat Res Mgt '08]: Sensing quality F(A) in water networks is submodular!
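To make the diminishing-returns property concrete, here is a toy coverage-style objective in Python; the coverage sets and event counts are invented for illustration and are not the water-network objective from the theorem.

```python
# Toy coverage-style objective: F(A) = fraction of (hypothetical) contamination
# events detected by the sensors in A. Coverage functions like this are a
# standard example of monotone submodular functions.

events_covered = {                # which events each candidate sensor detects
    "S1": {1, 2, 3},
    "S2": {3, 4},
    "S3": {4, 5, 6},
    "S4": {6, 7},
    "S_new": {5, 8},              # the new sensor S'
}
NUM_EVENTS = 8

def F(A):
    covered = set().union(*(events_covered[s] for s in A)) if A else set()
    return len(covered) / NUM_EVENTS

A = {"S1", "S2"}                  # small placement
B = {"S1", "S2", "S3", "S4"}      # superset placement
gain_A = F(A | {"S_new"}) - F(A)  # 2/8: S' adds events 5 and 8
gain_B = F(B | {"S_new"}) - F(B)  # 1/8: only event 8 is new
assert gain_A >= gain_B           # diminishing returns
```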
Performance guarantee Theorem: GAPS provides a constant factor approximation: Σt F(AGAPS,t) ≥ 1/2 Σt F(A*t). Proof sketch: • SPASS requires maximization of a monotonic submodular function over a truncated partition matroid • The theorem then follows from a result by Fisher et al. '78 • Generalizes the analysis of the k-cover problem (Abrams et al., IPSN '04) • Can also get a slightly better guarantee (≈ 0.63) using the more involved algorithm of Vondrák et al. '08
Average-case scheduling can be unfair Consider V = {s1, …, sn}, k = 4, m = 10. An average-case solution can have Σt F(At) high while mint F(At) is low, e.g., poor coverage at t = 4. A balanced solution keeps both Σt F(At) and mint F(At) high. Want to ensure balanced coverage!
Balanced SPASS Want: disjoint sets A1*, …, Ak* with |A1* ∪ … ∪ Ak*| ≤ m, maximizing the balanced performance mint F(At*). The greedy algorithm performs arbitrarily badly on this objective. We now develop an approximation algorithm for this balanced SPASS problem!
Key idea: Reduce worst-case to average-case Suppose we learn the value attained by the optimal solution: c* = mint F(A*t) = OPT. Then we need to find a feasible solution A1, …, Ak such that F(At) ≥ c* for all t. If we can check feasibility for any c, we can find the optimal c* using binary search! How can we find such a feasible solution?
Trick: Truncation Need to find a feasible solution such that F(At) ≥ c for all t. Define the truncated objective Fc(A) = min{F(A), c}. Then: F(At) ≥ c for all t ⟺ Σt Fc(At) = k·c. Truncation preserves submodularity! Hence, to check whether OPT = mint F(A*t) ≥ c, we only need to solve an average-case problem.
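A minimal sketch of the truncation trick, reusing the gaps() sketch from the GAPS slide; the helper names are illustrative.

```python
# Truncation F_c(A) = min(F(A), c) preserves monotonicity and submodularity, so
# the worst-case feasibility question "is there a schedule with F(A_t) >= c for
# all t?" reduces to an average-case problem: sum_t F_c(A_t) = k * c.

def truncate(F, c):
    """Return the truncated sensing quality F_c."""
    return lambda A: min(F(A), c)

def try_level(F, V, k, m, c):
    """Run GAPS on F_c and report the per-time-step scores for level c."""
    Fc = truncate(F, c)
    buckets = gaps(Fc, V, k, m)          # maximizes sum_t F_c(A_t)
    return buckets, [F(A) for A in buckets]
```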
Challenge: Use of approximation We only have a 1/2-approximation algorithm (GAPS) for the average-case problem, which can lead to an unbalanced solution: the optimal solution may have value 4c (every time step reaches score c), while an approximate solution is only guaranteed 2c in total and may leave some time step with no coverage at all, so mint F(At) = 0.
Remedy: Can rebalance solution We can attempt to rebalance the solution, reassigning sensors across time steps to obtain a uniformly high score for all buckets.
Is rebalancing always possible? If there are elements s where F({s}) is large, rebalancing may not be possible: the rebalanced solution still has mint F(At) = 0.
Distinguishing big and small elements • An element s ∈ V is big if F({s}) ≥ αc, for some fixed 0 < α < 1 (we will find out how to choose α later) • Big elements can be removed from the problem instance! • If we can ensure that F(At) ≥ αc for all t, then we get an α-approximation guarantee!
How large should α be? (Figure: GAPS solution on the small elements under Fc, and the rebalanced solution in which all time steps A1, …, Ak' are "satisfied".) Lemma: If α = 1/6, we can always successfully rebalance (i.e., ensure all time steps are satisfied).
eSPASS Algorithm eSPASS: Efficient Simultaneous Placement and Scheduling of Sensors • Initialize cmin = 0, cmax = F(V) • Do binary search: c = (cmin + cmax)/2 • Allocate big elements to separate time steps (and remove them) • Run GAPS with Fc to find A1, …, Ak', where k' = k − #big elements • Reallocate small elements to obtain a balanced solution • If mint F(At) ≥ c/6: increase cmin • If mint F(At) < c/6: decrease cmax • Repeat until convergence
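Below is a simplified end-to-end sketch of this loop in Python, reusing the gaps() and truncate() sketches from earlier. It is illustrative only: in particular, the rebalancing step is a naive greedy reallocation rather than the exact rebalancing procedure that yields the 1/6 guarantee, and the handling of big elements only roughly follows the description on the previous slides.

```python
# Simplified eSPASS-style loop (illustrative only, not the authors' algorithm
# verbatim). alpha = 1/6 follows the lemma on the earlier slide; the rebalancing
# below just greedily refills the currently worst bucket with small elements.

ALPHA = 1.0 / 6.0

def espass_sketch(F, V, k, m, iters=20):
    V = set(V)
    c_min, c_max = 0.0, F(V)
    best = [set() for _ in range(k)]
    for _ in range(iters):                         # binary search on the level c
        c = (c_min + c_max) / 2.0
        big = [s for s in V if F({s}) >= ALPHA * c][: k - 1]
        small = V - set(big)
        k_small = k - len(big)
        small_buckets = gaps(truncate(F, c), small, k_small, m - len(big))
        # each big element gets its own time step; small elements are rebalanced
        buckets = [{s} for s in big] + [set() for _ in range(k_small)]
        for s in (x for bucket in small_buckets for x in bucket):
            worst = min(range(len(big), k), key=lambda t: F(buckets[t]))
            buckets[worst].add(s)
        if min(F(A) for A in buckets) >= c / 6.0:  # level c (approximately) feasible
            c_min, best = c, buckets
        else:
            c_max = c
    return best
```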
Performance guarantee Theorem: eSPASS provides a constant factor 6 approximation: mint F(AeSPASS,t) ≥ 1/6 mint F(A*t). Can also obtain data-dependent bounds which are often much tighter.
Experimental studies Questions we ask: • How much does simultaneous optimization help? • Is optimizing the balanced performance a good idea? • How does eSPASS compare to existing algorithms (for the special case of sensor scheduling)? Case studies: • Contamination detection in water networks • Traffic monitoring • Community sensing • Selecting informative blogs on the web
Traffic monitoring Goal: Predict normalized road speeds on unobserved road segments from sensor data Approach: • Learn probabilistic model (Gaussian process) from data • Use eSPASS to optimize sensing quality F(A) = Expected reduction in MSE when sensing at locations A Data: • from 357 sensors deployed on highway I-880 South (PeMS) • Sampled between 6am and 11am during work days
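As an illustration of this sensing quality, the sketch below computes a Gaussian process variance-reduction objective from a precomputed covariance matrix K over all road segments; the kernel, noise level, and indexing are assumptions made for the sketch and not necessarily the exact objective used in the experiments.

```python
import numpy as np

def mse_reduction(K, A, noise_var=0.1):
    """F(A): total prior variance minus total GP posterior variance, given
    noisy observations at the locations in A (indices into K)."""
    if not A:
        return 0.0
    A = sorted(A)
    K_AA = K[np.ix_(A, A)] + noise_var * np.eye(len(A))
    K_VA = K[:, A]
    # variance reduction at location i: K_iA (K_AA)^{-1} K_Ai
    reduction = np.einsum("ij,jk,ik->i", K_VA, np.linalg.inv(K_AA), K_VA)
    return float(reduction.sum())

# Example use with the gaps() sketch above (random kernel, purely illustrative):
# rng = np.random.default_rng(0)
# X = rng.standard_normal((357, 2))
# K = np.exp(-np.square(X[:, None] - X[None, :]).sum(-1))
# schedule = gaps(lambda A: mse_reduction(K, A), set(range(357)), k=4, m=20)
```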
Benefit of simultaneous optimization (traffic data; lifetime improvement vs. number of groups k; higher is better) OP: Optimized Placement, OS: Optimized Schedule, RP: Random Placement, RS: Random Schedule. Simultaneous optimization gives ≈ 30% lifetime improvement for the same accuracy! For large k, random scheduling hurts more than random placement.
Average-case vs. balanced score (traffic data; higher is better) Optimizing for the balanced score leads to good average-case performance, but not vice versa.
Data-dependent bounds (traffic data; higher is better) Our data-dependent bounds show that eSPASS solutions are typically much closer to optimal than the worst-case factor of 1/6 suggests.
Water network monitoring • Real metropolitan area network (12,527 nodes) • Water flow simulator provided by EPA • 3.6 million contamination events • Multiple objectives: Detection time, affected population, … • Place sensors that detect well “on average”
Benefit of simultaneous optimization (water networks; balanced score vs. number of sensors; higher is better) OP: Optimized Placement, OS: Optimized Schedule, RP: Random Placement, RS: Random Schedule. Simultaneous optimization significantly outperforms the traditional approaches, e.g., ~3x reduction in affected population when m = 24, k = 3.
Comparison with existing techniques • Comparison of eSPASS with existing algorithms for scheduling (m = |V|): • MIP: Mixed integer program for domatic partitioning with accuracy requirements (Koushanfar et al. '06) • SDP: Approximation algorithm for domatic partitioning (Deshpande et al. '08) • Results on the temperature monitoring (Intel Berkeley) data set with 46 sensors • Goal: Minimize expected MSE
Comparison with existing techniques (temperature data; average-case and worst-case error; lower MSE is better) eSPASS outperforms existing approaches for sensor scheduling.
Trading off power and accuracy • Suppose that we sometimes activate all sensors (e.g., to determine the boundary of a traffic jam, or to localize the source of a contamination) • Want to simultaneously optimize the "balanced performance" mint F(At) and the "high-density performance" F(A1 ∪ … ∪ Ak) • Scalarization: for some 0 < λ < 1, we optimize λ mint F(At) + (1−λ) F(A1 ∪ … ∪ Ak) • Theorem: Our algorithm mcSPASS (multicriterion SPASS) guarantees a factor 8 approximation!
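To make the scalarization concrete, the small helper below evaluates the combined objective for a candidate schedule; it only scores a solution and does not implement the mcSPASS algorithm itself.

```python
def scalarized_score(F, buckets, lam):
    """lam * min_t F(A_t) + (1 - lam) * F(A_1 ∪ ... ∪ A_k).
    lam = 1 recovers the balanced (eSPASS) objective, lam = 0 the
    pure high-density placement objective."""
    balanced = min(F(A) for A in buckets)            # scheduled performance
    high_density = F(set().union(*buckets))          # all sensors active at once
    return lam * balanced + (1.0 - lam) * high_density
```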
Tradeoff results max λ mint F(At) + (1−λ) F(A1 ∪ … ∪ Ak): eSPASS (λ = 1), mcSPASS (λ = 0.25), stage-wise (λ = 0)
Tradeoff results (water networks; high-density performance vs. scheduled performance) Can simultaneously obtain high performance in scheduled and in high-density mode.
Conclusions • Introduced simultaneous placement and scheduling (SPASS) problem • Developed efficient algorithms with strong guarantees: • GAPS: 1/2 approximation for average performance • eSPASS: 1/6 approximation for balanced performance • mcSPASS: 1/8 approximation for trading off high-density and balanced performance • Data-dependent bounds show solutions close to optimal • Presented results on several real-world sensing tasks