Explore phase transitions in geometric random graphs and statistics gathering over grids for sensor networks. Study monotone properties and sharp thresholds. Discuss the concept of bottleneck matchings.
Phase Transitions and Statistics Gathering for Sensor Networks. Ashish Goel, Stanford University. Joint work with Sanatan Rai and Bhaskar Krishnamachari; Enachescu, Govindan, and Motwani. http://www.stanford.edu/~ashishg ashishg@stanford.edu
Sensor Networks • Sensors often reside on the Euclidean plane • Whether two sensors can communicate is determined largely by their Euclidean distance • Often, laid out regularly • Grids and random geometric graphs are reasonable models • Big advantage over general networks in algorithm design and analysis • This talk: • Phase transitions in geometric random graphs • Statistics gathering over grids
Geometric Random Graphs • G(n;r) in d dimensions: • n points uniformly distributed in [0,1]^d • Two points are connected if their Euclidean distance is less than r • Sensor networks can often be modeled as G(n;r) with d = 2 • E.g., sensors “sprinkled” from a helicopter over a corn field • The wireless radius corresponds to r • Question: How should n and r be chosen to ensure that G(n;r) has a desirable property P (e.g., connectivity, 2-connectivity, large cliques) with high probability?
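To make the model concrete, here is a minimal sketch of sampling G(n;r) for d = 2. The use of NumPy/SciPy and the function name are my choices, not part of the talk.

```python
# Minimal sketch of sampling G(n;r) for d = 2 (library choice is illustrative).
import numpy as np
from scipy.spatial import cKDTree

def geometric_random_graph(n, r, d=2, seed=None):
    """Drop n uniform points in [0,1]^d; connect pairs within distance r."""
    rng = np.random.default_rng(seed)
    points = rng.random((n, d))
    # query_pairs returns all index pairs at Euclidean distance <= r
    edges = cKDTree(points).query_pairs(r)
    return points, edges

points, edges = geometric_random_graph(n=1000, r=0.06, seed=0)
print(f"{len(edges)} edges, mean degree {2 * len(edges) / len(points):.2f}")
```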
[Figure: a point X in the unit square with a disk of radius r around it. Any other point Y is a neighbor of X with probability ≈ πr²; the expected degree of X is ≈ πr²(n−1), ignoring boundary effects.]
Thresholds for monotone properties? • A graph property P is monotone if, for all graphs G1 = (V,E1) and G2 = (V,E2) such that E1 ⊆ E2, G1 satisfies P ⇒ G2 satisfies P • Informally, addition of edges preserves P • Examples: connectivity, Hamiltonicity, bounded diameter, expansion, degree ≥ k, existence of minors, k-connectivity, … • Folklore Conjecture: All monotone properties have “sharp” thresholds for geometric random graphs • Also, simulation evidence for several properties [Krishnamachari, Bejar, Wicker, Pearlman ’02] [McColm ‘03]
To quote Bollobás (via Krishnamachari et al): “One of the main aims of the theory of random graphs is to determine when a given property is likely to appear.”
Example: Connectivity Define c(n) such that π·c(n)² = log n/n • Asymptotically, when d = 2: • G(n;c(n)) is disconnected with high probability • For any ε > 0, G(n;(1+ε)c(n)) is connected whp • So, c(n) = √(log n/(πn)) is a “sharp” threshold for connectivity at d = 2 [Gupta and Kumar ‘98; Penrose ‘97] • Similar thresholds exist for all dimensions • c_d(n) ≈ (log n/(n·V_d))^(1/d), where V_d is the volume of the unit ball in d dimensions • Average degree ≈ log n at the threshold
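As a sanity check, the transition around c(n) = √(log n/(πn)) is easy to see by simulation. This Monte Carlo sketch is mine, not from the paper; it reuses the construction above and tests connectivity with SciPy's connected components.

```python
# Monte Carlo check of the d = 2 connectivity threshold c(n) = sqrt(log n / (pi n)).
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def is_connected(n, r, rng):
    pts = rng.random((n, 2))
    pairs = np.array(list(cKDTree(pts).query_pairs(r)), dtype=int)
    if len(pairs) == 0:
        return False
    adj = csr_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])), shape=(n, n))
    ncomp, _ = connected_components(adj, directed=False)
    return ncomp == 1

n, trials = 2000, 50
c = np.sqrt(np.log(n) / (np.pi * n))
rng = np.random.default_rng(1)
for scale in (0.8, 1.0, 1.2):
    hits = sum(is_connected(n, scale * c, rng) for _ in range(trials))
    print(f"r = {scale} * c(n): connected in {hits}/{trials} trials")
```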
Width and sharp thresholds • For property P and 0 < ε < 1, if there exist two functions L(n) and U(n) such that Pr[G(n;L(n)) satisfies P] = ε, and Pr[G(n;U(n)) satisfies P] = 1 − ε, then the ε-width w_ε(n) of P is defined as U(n) − L(n) • If w_ε(n) = o(1) for all ε, then P is said to have a sharp threshold
[Figure: Pr[G(n;r) satisfies P] plotted as a function of r, rising from ε at r = L(n) to 1 − ε at r = U(n); the ε-width is the gap U(n) − L(n).]
Connections (?) to Bernoulli Random Graphs • Famous graph family G(n;p) • Popular model for general random graphs • Edges are iid; each edge present with probability p • Connectivity threshold is p(n) = log n/n • Average degree exactly the same as that of geometric random graphs at their connectivity threshold! • All monotone properties have ε-width O(1/log n) for any fixed ε in the Bernoulli graph model [Friedgut and Kalai ’96] • Cannot be improved beyond O(1/log² n) • Almost matched [Bourgain and Kalai ’97] • Proof relies heavily on independence of edges • There is no edge independence in geometric random graphs ⇒ we need new techniques
Our results [Goel, Rai, Krishnamachari, Annals of Applied Probability 2005] • The ε-width of any monotone property is O(c_d(n)) for d ≥ 3 and O(c_2(n)·log^(1/4) n) for d = 2 • Sharp thresholds in the geometric random graph model • Sharper transition (inverse polynomial width) than Bernoulli random graphs (inverse logarithmic width) • There exist monotone properties with width Ω(√(log(1/ε))/√n) for d = 1 and Ω(1/n^(1/d)) for d ≥ 2 • Tight for d = 1, sub-logarithmic gap for d > 1
Why c_d(n)? • Why express results in terms of c_d(n)? • Width gives a sharp “additive” threshold • We are typically interested in properties that subsume connectivity • For such properties, an additive threshold in terms of c_d(n) also corresponds to a “multiplicative” threshold • The exact sharpness of the multiplicative threshold depends on L(n) and on the exact additive bounds
Bottleneck Matchings • Draw n “blue” points and n “red” points uniformly and independently from [0,1]^d • B, R denote the sets of blue and red points, resp. • A minimum bottleneck matching between R and B is a one-one mapping f: B → R which minimizes max_{u∈B} ||f(u) − u||₂ • The corresponding distance (max_{u∈B} ||f(u) − u||₂) is called the minimum bottleneck distance • Let X_n denote this minimum bottleneck distance
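X_n can be computed exactly by binary-searching the bottleneck value over the n² pairwise distances and testing each candidate threshold with a maximum bipartite matching. The sketch below is my implementation of this standard approach, not the paper's; it is the kind of routine one would use for the simulation-based evidence mentioned later.

```python
# Minimum bottleneck distance X_n via binary search + bipartite matching.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import maximum_bipartite_matching

def bottleneck_distance(blue, red):
    """Minimum over one-one matchings f of max ||f(u) - u||_2."""
    n = len(blue)
    dist = cdist(blue, red)              # n x n matrix of Euclidean distances
    candidates = np.unique(dist)         # the optimum is one of these values
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        # Keep only edges of length <= candidates[mid]; test for a perfect matching.
        graph = csr_matrix(dist <= candidates[mid])
        matched = maximum_bipartite_matching(graph, perm_type='column')
        if (matched != -1).sum() == n:
            hi = mid                     # feasible: try a smaller bottleneck
        else:
            lo = mid + 1                 # infeasible: need a larger bottleneck
    return candidates[lo]

rng = np.random.default_rng(2)
X_n = bottleneck_distance(rng.random((200, 2)), rng.random((200, 2)))
print(X_n)   # compare with c_2(n) = sqrt(log n / (pi n))
```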
[Figure: an example matching between red and blue points; bottleneck distance = γ.]
Bottleneck Matchings and Width Theorem: If Pr[X_n > γ] ≤ p, then the √p-width of any monotone property is at most 2γ Implication: Can analyze just one quantity, X_n, as opposed to all monotone properties (in particular, can provide simulation-based evidence) Proof: Let P be any monotone property • Let ε = √p • Choose L(n) such that Pr[G(n;L(n)) satisfies P] = ε • Define U(n) = L(n) + 2γ • Draw two random graphs G_L and G_U (independently) from G(n;L(n)) and G(n;U(n)), resp. • Let B, R denote the sets of points in G_L, G_U resp.
Bottleneck Matchings and Width (proof contd.) Assume X_n ≤ γ. Let f be the corresponding minimum bottleneck matching between R and B. For any u, v ∈ B: ||f(u) − f(v)||₂ ≤ ||f(u) − u||₂ + ||u − v||₂ + ||f(v) − v||₂ ≤ 2γ + ||u − v||₂ Hence, (u,v) is an edge in G_L ⇒ (f(u), f(v)) is an edge in G_U, i.e., G_L is a subgraph of G_U (identifying each u with f(u)) By definition, Pr[X_n > γ] ≤ p ⇒ Pr[G_L is not a subgraph of G_U] ≤ p = ε²  (1)
[Illustration I: Triangle Inequality. If the bottleneck matching has length ≤ γ and ||u − v||₂ ≤ L(n), then ||f(u) − f(v)||₂ ≤ 2γ + L(n).]
Bottleneck Matchings and Width (proof contd.) Pr[G_L is not a subgraph of G_U] ≤ p = ε²  (1) Let q = Pr[G_U does not satisfy P] P is monotone, Pr[G_L satisfies P] = ε, and G_L, G_U are independent ⇒ Pr[G_L is not a subgraph of G_U] ≥ ε·q  (2) Combining (1) and (2), we get ε·q ≤ ε², i.e., q ≤ ε Therefore, Pr[G_U satisfies P] ≥ 1 − ε, i.e., the ε-width of P is at most U(n) − L(n) = 2γ. Done!
[Illustration II: Probability Amplification. G_L ~ G(n;L(n)) satisfies P with probability ε; G_U ~ G(n;L(n)+2γ) does not satisfy P with probability q; since the two draws are independent and P is monotone, G_L is not a subgraph of G_U with probability at least ε·q.]
Our Goal now Analyze the bottleneck matching distance X_n Specifically, we are done if X_n = O(g(n)) with high probability, for some “small” g(n)
Comparison with Bernoulli Random Graphs? • We are attempting to show something quite strong • G(n;r) is a subgraph of G(n;r+γ) whp, for small γ • “Laminar” structure • The corresponding result is NOT true for Bernoulli random graphs, even for p = ½ • If small bottleneck matchings exist whp, we will get stronger thresholds than for Bernoulli random graphs
Existence of Small Bottleneck Matchings • The bottleneck matching length is O(c_d(n)) whp for d ≥ 3 [Shor and Yukich 1991; we present a simpler proof] O(c_2(n)·log^(1/4) n) whp for d = 2 [Leighton and Shor 1989] O(√(log(1/ε))/√n) with probability 1 − ε for d = 1 [Our paper (folklore?)] • This gives us the desired widths • Will omit the proof, focus on implications • There is a small bottleneck matching between a random geometric graph and the grid
Lower bound examples • For d = 1, the property “min-degree ≥ n/4” has width Ω(√(log(1/ε))/√n) • Basic idea: just the two endpoints of the line segment are interesting for the purpose of finding the minimum degree • For d ≥ 2, the property “G is a clique” has width Ω(1/n^(1/d)) • Open problems: • Plug the gap in the upper/lower bounds on the width for d ≥ 2 • Also, all our lower bound examples undergo phase transitions at r = Θ(1). Is there something interesting and different in the region where r is of the order of the connectivity threshold?
Implications – Mixing Time The fastest mixing Markov chain defined on G(n;r) has mixing time Θ(r⁻² log n) for large enough r [Boyd, Ghosh, Prabhakar, Shah ‘05] Alternate proof: • GRID(n;r): n points are laid out on a grid in [0,1]² and two points are connected if they are within distance r • Fastest mixing time of GRID(n;r) = Θ(r⁻² log n) [Trivial] • G(n;r) is a super-graph of GRID(n; r−δ) and a sub-graph of GRID(n; r+δ) whp for small enough δ [Our result] ⇒ Fastest mixing time of G(n;r) is Θ(r⁻² log n) whp
Implications – Spectra Our techniques can be extended to show that the spectrum of random geometric graphs converges to the spectrum of the grid graph. [Rai ’05]
Implications – Coverage • Coverage: any point in the unit square must be within distance r of one of the n sensors • Known: there is a sharp threshold in r [Shakkottai, Srikant, Shroff ’04] • Coverage is NOT a graph property, so it does not fall within our framework • But the laminar structure in our proof implies a sharp threshold for coverage as well (weaker than the sharpest known) • Fixed density, increasing area: [Muthukrishnan, Pandurangan]
Conclusions • Monotone properties in G(n;r) have sharp thresholds • Much sharper than for Bernoulli random graphs • Much stronger too: random geometric graphs exhibit a laminar structure • Useful for recovering several known results and proving new ones • Randomness is often a red herring, since the grid often yields tight upper/lower bounds • Rule of thumb: grids are nearly as good a model for sensor networks as random geometric graphs • No useful analogue in general networks (deterministic expanders are no easier to analyze than Bernoulli random graphs) • Open problem: does laminarity imply anything about throughput (via separators)?
Statistics Gathering Assume sensors are laid out on an n × n grid • Each sensor has 4 neighbors • Each sensor knows its (x,y)-coordinates • A processing agent (sink) sits at (0,0) Vast literature: model based, gossip based, ODI, throughput maximization, coding This talk: • Answering box queries [Goel, unpublished] • Aggregating spatially correlated data [Enachescu, Goel, Govindan, Motwani, 2004]
Answering Box Queries The sink can issue queries for the aggregate statistics in any box Will focus on statistics that can be estimated from a random sample of size K Examples: mean, median, quantiles, standard deviations Some naive strategies: • All sensors send their value to the sink every time unit: no query-time cost, very high pre-processing cost (n per node) • The sink samples K sensors from the box at query time: no pre-processing cost, cost Kn at query time
Notation: K-uniform Samples A K-uniform sample of S is a subset of S obtained by choosing each element with probability K/|S|, independently of all other elements For estimation purposes, K-uniform samples are essentially as good as uniform samples of size exactly K
Our Solution: Hyperbolic Gossip Pre-processing: Every time unit, sensor (i,j) chooses z(i,j) uniformly at random from [0,1) • Sensor (i,j) sends its value to all sensors (x,y) satisfying z(i,j) ≤ K/(|x−i|·|y−j|) [hyperbolic region] • Probability that (x,y) hears from (i,j) = K/(|x−i|·|y−j|), capped at 1 H(i,j): set of sensors contacted by (i,j) • H(i,j) is connected, and has expected size O(K log² n) • Pre-processing cost = O(K log² n) per sensor (sharply concentrated) Claim: Every sensor can now derive K-uniform samples from every box that contains this sensor • Querying becomes trivial: the sink can query any sensor in this box • Side benefit: each sensor has a rough map of the entire sensor field
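A minimal sketch of one pre-processing round, computing H(i,j) by brute force over the grid; the function name is mine, and I read the rule as "always send" along (i,j)'s own row and column, where the product |x−i|·|y−j| is zero.

```python
import random

def gossip_targets(i, j, n, K):
    """One round: H(i,j) = all sensors (x,y) with z(i,j) <= K / (|x-i| * |y-j|)."""
    z = random.random()                 # z(i,j), uniform in [0, 1)
    targets = set()
    for x in range(n):                  # O(n^2) scan, for illustration only
        for y in range(n):
            prod = abs(x - i) * abs(y - j)
            # prod == 0 on (i,j)'s own row/column: the bound is infinite,
            # so the value is always sent there (my reading of the rule).
            if prod == 0 or z <= K / prod:
                targets.add((x, y))
    return targets

H = gossip_targets(i=10, j=10, n=64, K=4)
print(len(H))   # expected size O(K log^2 n)
```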
Deriving K-uniform Samples Suppose sensor (x,y) needs to derive a K-uniform sample from box B • Sub-sample: for every node (i,j) in B which has sent its value to (x,y), add the value to the sample with probability |x−i|·|y−j|/|B| • Probability that (x,y) hears from (i,j) = K/(|x−i|·|y−j|) • Probability that the value from (i,j) is in the sample = (|x−i|·|y−j|/|B|)·(K/(|x−i|·|y−j|)) = K/|B|
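The sub-sampling step is equally short. This sketch uses my own data layout (a dict of heard values) and handles the capped, probability-1 cases explicitly, which the slide's generic formula glosses over; in the generic case it reduces to the slide's keep probability |x−i|·|y−j|/|B|. It assumes K ≤ |B|.

```python
import random

def k_uniform_sample(heard, x, y, box, K):
    """heard: {(i, j): value} for sensors whose gossip reached (x, y).
    box: (i_lo, i_hi, j_lo, j_hi), inclusive, and must contain (x, y)."""
    i_lo, i_hi, j_lo, j_hi = box
    area = (i_hi - i_lo + 1) * (j_hi - j_lo + 1)          # |B|
    sample = []
    for (i, j), value in heard.items():
        if not (i_lo <= i <= i_hi and j_lo <= j <= j_hi):
            continue                                       # outside the box B
        prod = abs(x - i) * abs(y - j)
        heard_prob = min(1.0, K / prod) if prod > 0 else 1.0
        # keep_prob * heard_prob = K/|B| for every (i, j), so each sensor in B
        # lands in the sample with probability exactly K/|B|.
        keep_prob = (K / area) / heard_prob
        if random.random() < keep_prob:
            sample.append(value)
    return sample
```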
Aggregation in Sensor Networks • Sensors are highly constrained in power (and hence communication) • Data gathering: each sensor (source) sends its data to a central agent (the sink) • In-network aggregation (coding) can result in great power savings, due to correlations in data or the nature of the query • Example: spatially correlated data • Aggregation gains are unpredictable • Basic question: what is the best routing structure to aggregate data on? • Does it depend on the correlation?
The Spatial Correlation Model • Imagine each sensor is a camera with range k • It can take a picture of any point within a 2k × 2k square centered at the sensor • Let A_k(s) denote the area imaged by sensor s • Total amount of information obtained from a set of sensors S = |∪_{s∈S} A_k(s)| • What is the right value of k? • Would like a single aggregation tree that is good for all k
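A tiny discretized sketch (mine, not from the paper) of the information measure: count the grid cells covered by the union of the sensors' squares, approximating each 2k × 2k square by the (2k+1) × (2k+1) block of cells around the sensor.

```python
def information(sensors, k, n):
    """|union of A_k(s) over s in sensors|, measured in grid cells on an n x n field."""
    covered = set()
    for (sx, sy) in sensors:
        for x in range(max(0, sx - k), min(n, sx + k + 1)):
            for y in range(max(0, sy - k), min(n, sy + k + 1)):
                covered.add((x, y))     # cell seen by at least one sensor
    return len(covered)

# Two adjacent sensors overlap almost entirely, so the second adds little
# new information; a far-away third sensor adds a full square.
print(information([(5, 5), (6, 5)], k=2, n=64))             # 30, not 2 * 25
print(information([(5, 5), (6, 5), (40, 40)], k=2, n=64))   # 55
```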
Rationale • If an algorithm does well for all camera ranges k, it also does well when the aggregation function is a linear superposition of different values of k • For example, consider events of different intensities (fireflies, camp-fires, forest fires, volcanic eruptions) • Different events can be sensed within different ranges • For any set of intensities and relative frequencies, there is a superposition of different values of k which measures exactly the information in a set of sensors S • Is this the right model for spatial correlation, given that we want simultaneous optimization?
A Useful Transformation • Instead of focusing on the information captured by a sensor, focus instead on the area, one 1 × 1 square at a time • We call each such square a “value” • A value is sensed by all sensors within a 2k × 2k square centered at the “value” [Figure: a value and the sensors that can sense it, for k = 2]
Cost Model • Find a routing tree (by choosing a “parent” for each node) to send the information from all nodes to the processing node • Cost of an edge: number of values transmitted across the edge • Cost of the tree: total cost of all the edges • We want to minimize this transmission cost simultaneously for all k, assuming perfect aggregation • Assume the nodes must send all data values
Collision Time • Consider two adjacent paths in a routing tree • Collision time = the number of steps before the lower path meets the upper path [Figure: two adjacent paths with collision time = 3]
Collision Times and Simultaneous Optimization Theorem: An aggregation tree with average expected collision time O(√N) gives a constant factor approximation to the optimal aggregation tree simultaneously for all k
A Randomized Algorithm • Each node chooses one of its two downstream (i.e., closer to the origin) neighbors as its parent • This results in a shortest-path tree rooted at the sink • All data flows on this shortest-path tree • Choice of parent: the node at position (x,y) chooses the node at position (x−1,y) with probability x/(x+y) and the node at position (x,y−1) with probability y/(x+y) • Different nodes choose independently • Interesting property: the path from a sensor to the sink is a shortest path chosen uniformly at random from all such shortest paths
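A sketch of the parent rule in Python; the probabilities are from the slide, while the helper names and boundary handling are mine. Fixing every node's choice once yields the aggregation tree; the second function just walks one random path to the sink, illustrating the uniform-shortest-path property.

```python
import random

def choose_parent(x, y):
    """Downstream neighbor of (x, y); the sink is at (0, 0)."""
    if x == 0:
        return (0, y - 1)              # on the axes there is only one choice
    if y == 0:
        return (x - 1, 0)
    if random.random() < x / (x + y):  # step left with probability x/(x+y)
        return (x - 1, y)
    return (x, y - 1)                  # step down with probability y/(x+y)

def path_to_sink(x, y):
    """One uniformly random monotone shortest path from (x, y) to (0, 0)."""
    path = [(x, y)]
    while path[-1] != (0, 0):
        path.append(choose_parent(*path[-1]))
    return path

print(path_to_sink(3, 2))   # e.g. [(3,2), (2,2), (2,1), (1,1), (1,0), (0,0)]
```

The left-step probability x/(x+y) is exactly the fraction of monotone shortest paths from (x,y) that pass through (x−1,y), which is why the sampled path is uniform over all such paths.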
The Randomized Algorithm [contd.] • Intuition: consider A = (j−1, j+1) and B = (j+1, j−1) • A is above the diagonal, so it has a slight preference for going down • B is below the diagonal, so it has a slight preference for going left • But if A goes down and B goes left, they meet! • This “centrist” tendency introduces a small but sufficient bias so that the average expected collision time is O(√N) as opposed to O(N) [Enachescu, Goel, Govindan, Motwani; Theoretical Comput. Sci. 2006] • Proof omitted • Implies that the randomized algorithm is an O(1) approximation for all correlation parameters k • Much stronger than what is known for general graphs [Goel, Estrin]
Conclusions (Aggregating Correlated Data) • Simple routing techniques work well for a wide range of aggregation functions • Rule of thumb: do not couple coding and routing in a sensor network • Supported by [Pattem et al; Beferull-Lozano et al; Goel, Estrin] • Open problems: • Extend our results to other correlation models • Example: Gaussian sources, spatial covariance matrix [Scaglione, Servetto] • Conjecture: our model of spatial correlations captures Gaussian sources • Making spatio-temporal maps of sensor readings with varying and unknown spatio-temporal correlation • Sampling of correlated data? • Gossip-based protocols for computing statistics of correlated data? (along the lines of [Kempe et al., Boyd et al])