On the Capacity of Information Networks. Nick Harvey. Collaborators: Micah Adler (UMass), Kamal Jain (Microsoft), Bobby Kleinberg (MIT/Berkeley/Cornell), and April Lehman (MIT/Google).
What is the capacity of a network?
• Send items from s1→t1 and s2→t2
• Problem: no disjoint paths
[Figure: network with sources s1, s2 and sinks t1, t2; the shared middle edge is labeled "bottleneck edge"]
An Information Network
• If sending information, we can do better
• Send the XOR b1⊕b2 on the bottleneck edge
[Figure: same network carrying bits b1 and b2, with b1⊕b2 on the bottleneck edge]
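The XOR trick can be checked exhaustively. The sketch below is a minimal illustration, assuming the standard butterfly layout in which t1 also hears b2 on a side edge from s2 and t2 also hears b1 on a side edge from s1; the function name `butterfly_decode` is hypothetical.

```python
from itertools import product


def butterfly_decode(b1, b2):
    """Return the bits recovered at (t1, t2) when the bottleneck carries b1 XOR b2.

    Assumption (not stated on the slide): t1 receives b2 directly from s2,
    and t2 receives b1 directly from s1, as in the usual butterfly network.
    """
    coded = b1 ^ b2        # the single bit sent on the bottleneck edge
    at_t1 = coded ^ b2     # t1 cancels b2 and is left with b1
    at_t2 = coded ^ b1     # t2 cancels b1 and is left with b2
    return at_t1, at_t2


# Every input pair is decoded correctly with one use of the bottleneck edge.
for b1, b2 in product((0, 1), repeat=2):
    assert butterfly_decode(b1, b2) == (b1, b2)
```

Both sinks recover their bit from a single bottleneck transmission, which plain routing cannot do.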
Moral of Butterfly
Network Flow Capacity ≠ Information Flow Capacity
Network Coding
• New approach for information flow problems
• Blend of combinatorial optimization, information theory
• Multicast, k-Pairs
• k-Pairs problems: Network coding when each commodity has one sink
  • Analogous to multicommodity flow
• Definitions for cyclic networks are subtle
Multicommodity Flow
• Efficient algorithms for computing maximum concurrent (fractional) flow.
• Connected with metric embeddings via LP duality.
• Approximate max-flow min-cut theorems.
Network Coding
• Computing the max concurrent coding rate may be anything from undecidable to decidable in poly-time.
• No adequate duality theory.
• No cut-based parameter is known to give sublinear approximation in digraphs.
Directed and undirected problems behave quite differently.
Directed k-pairs
• Coding rate can be much larger than flow rate!
• Butterfly: coding rate = 1, flow rate = ½
• Thm [HKL’04, LL’04]: ∃ graphs G(V,E) where coding rate = Ω(flow rate ∙ |V|)
• Thm: ∃ graphs G(V,E) where coding rate = Ω(flow rate ∙ |E|)
  • And this is optimal
  • Recurse on butterfly construction
[Figure: butterfly network with sources s1, s2 and sinks t1, t2]
Directed k-pairs
• Coding rate can be much larger than flow rate!
• …and much larger than the sparsity (same example)
• Flow rate ≤ Sparsity < Coding rate in some graphs
Undirected k-pairs
• No known undirected instance where coding rate ≠ max flow rate! (The undirected k-pairs conjecture)
• Flow rate ≤ Coding rate ≤ Sparsity (coding rate ≤ sparsity by a pigeonhole principle argument)
• Gap between flow rate and sparsity can be Ω(log n) when G is an expander
Undirected k-Pairs Conjecture
[Figure: Flow rate, Coding rate, and Sparsity compared with "<", "=", and "?" marks; one relation is labeled "Undirected k-pairs conjecture" and another "Unknown until this work"]
Okamura-Seymour Graph
• Every cut has enough capacity to carry all commodities separated by the cut
[Figure: Okamura-Seymour graph with terminals s1/t3, s2/t1, s3/t2, s4, t4 and a highlighted cut]
Okamura-Seymour Max-Flow
• Flow rate = 3/4
• si is 2 hops from ti.
• At flow rate r, each commodity consumes at least 2r units of bandwidth in a graph with only 6 units of capacity, so 4 ∙ 2r ≤ 6 and r ≤ 3/4.
[Figure: Okamura-Seymour graph with terminals s1/t3, s2/t1, s3/t2, s4, t4]
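The bandwidth count can be reproduced mechanically. The sketch below assumes the Okamura-Seymour instance is the complete bipartite graph K2,3 with unit-capacity edges, with s4 and t4 on the two-node side and the shared terminals s1/t3, s2/t1, s3/t2 on the three-node side; the node names and helper functions are illustrative, not taken from the talk.

```python
from collections import deque
from fractions import Fraction


def hop_distance(adj, src, dst):
    """Breadth-first search distance (in edges) from src to dst."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    raise ValueError("destination is unreachable")


def distance_bound(edges, commodities):
    """Upper bound |E| / sum_i d(s_i, t_i) on the flow rate (unit capacities)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = sum(hop_distance(adj, s, t) for s, t in commodities)
    return Fraction(len(edges), total)


# Assumed layout: K_{2,3} with hubs s4, t4 and three shared terminals.
hubs = ["s4", "t4"]
shared = ["s1t3", "s2t1", "s3t2"]
edges = [(h, x) for h in hubs for x in shared]
commodities = [
    ("s1t3", "s2t1"),  # s1 -> t1
    ("s2t1", "s3t2"),  # s2 -> t2
    ("s3t2", "s1t3"),  # s3 -> t3
    ("s4", "t4"),      # s4 -> t4
]
bound = distance_bound(edges, commodities)
print(bound)
```

Under these assumptions every commodity is 2 hops long, so the bound is 6 units of capacity divided by 8 units of demand per unit rate.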
The trouble with information flow…
• At flow rate r, each commodity consumes at least 2r units of bandwidth in a graph with only 6 units of capacity.
• If an edge codes multiple commodities, how to charge for “consuming bandwidth”?
• We work around this obstacle and bound coding rate by 3/4.
[Figure: Okamura-Seymour graph with terminals s1/t3, s2/t1, s3/t2, s4, t4]
Informational Dominance
• Definition: A dominates e if, for every coding solution, the messages sent on edges of A uniquely determine the message sent on e.
• Given A and e, how hard is it to determine whether A dominates e? Is it even decidable?
• Theorem: There is an algorithm to compute whether A dominates e in time O(k²m).
• Based on a combinatorial characterization of informational dominance
What can we prove?
• Combine Informational Dominance with Shannon inequalities for Entropy
• Flow rate = coding rate for “Special Bipartite Graphs”:
  • Bipartite
  • Every source is 2 hops away from its sink
  • Dual of flow LP is optimized by assigning length 1 to all edges
• Next: show that proving conjecture for all graphs is quite hard
[Figure: bipartite graph with terminals s1/t3, s2/t1, s3/t2, s4, t4]
k-pairs conjecture & I/O complexity
• I/O complexity model [AV’88]:
  • A large, slow external memory consisting of pages each containing p records
  • A fast internal memory that holds 2 pages
  • Basic I/O operation: read in two pages from external memory, write out one page
I/O Complexity of Matrix Transposition
• Matrix transposition: Given a p×p matrix of records in row-major order, write it out in column-major order.
• Obvious algorithm requires O(p²) ops.
• A better algorithm uses O(p log p) ops.
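One way to reach O(p log p) operations is the classic recursive block-swap transposition. The sketch below simulates it in the page model described earlier (each op reads two pages and writes one); it is an illustration under the assumption that p is a power of two, and the function name `transpose_io` is hypothetical rather than the algorithm named in the talk.

```python
def transpose_io(pages):
    """Transpose a p x p matrix stored as p pages (rows) of p records each.

    Returns (transposed_pages, op_count). Assumes p is a power of two.
    Each counted op reads two pages and writes one page.
    """
    p = len(pages)
    assert p > 0 and p & (p - 1) == 0, "p must be a power of two"
    assert all(len(page) == p for page in pages)
    ops = 0
    h = 1  # current block size; doubles every round
    while h < p:
        new_pages = [None] * p
        for i in range(p):
            if i & h:
                continue  # page i is the upper partner of page i - h
            lo, hi = pages[i], pages[i + h]
            # I/O op 1: read pages i and i+h, write the new page i.
            new_pages[i] = [hi[j - h] if j & h else lo[j] for j in range(p)]
            # I/O op 2: read the same two pages, write the new page i+h.
            new_pages[i + h] = [hi[j] if j & h else lo[j + h] for j in range(p)]
            ops += 2
        pages = new_pages
        h *= 2
    return pages, ops


matrix = [[4 * r + c for c in range(4)] for r in range(4)]
result, ops = transpose_io(matrix)
```

Each round swaps one bit of the row index with the same bit of the column index, so log₂ p rounds of p ops each complete the transposition.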
Matching Lower Bound
Theorem: (Floyd ’72, AV’88) A matrix transposition algorithm using only read and write operations (no arithmetic on values) must perform Ω(p log p) I/O operations.
[Figure: transposition diagram with sources s1…s4 and sinks t1…t4]
Ω(p log p) Lower Bound
Proof: Let Nij denote the number of ops in which record (i,j) is written. For all j, Σi Nij ≥ p log p. Hence Σij Nij ≥ p² log p. Each I/O writes only p records, so at least (p² log p)/p = p log p ops are needed. QED.
The k-pairs conjecture and I/O complexity
Definition: An oblivious algorithm is one whose pattern of read/write operations does not depend on the input.
Theorem: If there is an oblivious algorithm for matrix transposition using o(p log p) I/O ops, the undirected k-pairs conjecture is false.
The k-pairs conjecture and I/O complexity
Proof: Represent the algorithm with a diagram as before. Assume WLOG that each node has only two outgoing edges. Make all edges undirected, capacity p. Create a commodity for each matrix entry.
The k-pairs conjecture and I/O complexity
Proof (continued): The algorithm itself is a network code of rate 1. Assuming the k-pairs conjecture, there is a flow of rate 1. Hence Σi,j d(si,tj) ≤ p |E(G)|. Arguing as before, the LHS is Ω(p² log p). Hence |E(G)| = Ω(p log p).
Other consequences for complexity
• The undirected k-pairs conjecture implies:
  • An Ω(p log p) lower bound for matrix transposition in the cell-probe model. [Same proof.]
  • An Ω(p² log p) lower bound for the running time of oblivious matrix transposition algorithms on a multi-tape Turing machine. [I/O model can emulate multi-tape Turing machines with a factor p speedup.]
Distance arguments
• Rate-1 flow solution implies Σi d(si,ti) ≤ |E|
  • LP duality; directed or undirected
• Does rate-1 coding solution imply Σi d(si,ti) ≤ |E|?
  • Undirected graphs: this is essentially the k-pairs conjecture!
  • Directed graphs: this is completely false
Recursive construction
• k commodities (si,ti)
• Distance d(si,ti) = O(log k) for all i
• O(k) edges!
[Figure: network with sources s(1)…s(8) and sinks t(1)…t(8)]
Recursive Construction
• G(1): 2 commodities, 7 edges, distance = 3
• Edge capacity = 1
[Figure: G(1) with sources s1, s2 and sinks t1, t2, shown next to an equivalent drawing of the same network]
Recursive Construction
• G(2): start with two copies of G(1)
• Replace middle edges with copy of G(1)
• Result: 4 commodities, 19 edges, distance = 5
[Figure: G(2) with sources s1…s4 and sinks t1…t4; a copy of G(1) sits in the middle]
Recursive Construction
• G(n): # commodities = 2^n, |V| = O(2^n), |E| = O(2^n)
• Distance = 2n+1
[Figure: G(n) with sources s1…s(2^n) and sinks t1…t(2^n), built around a copy of G(n-1)]
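The growth of the construction can be tabulated with a short recurrence. The sketch below assumes that G(n) is formed from 2^(n-1) copies of G(1) whose middle edges are replaced by a single copy of G(n-1); this reading matches the stated counts for G(1) and G(2) but is otherwise an assumption, and the function name is hypothetical.

```python
def recursive_construction_stats(n):
    """Return (commodities, edges, distance) for G(n), n >= 1.

    Assumption: G(n) uses 2**(n-1) copies of G(1), each contributing its
    6 non-middle edges, plus one copy of G(n-1) in place of the middle edges.
    """
    commodities, edges, distance = 2, 7, 3  # G(1) as stated on the slide
    for level in range(2, n + 1):
        copies = 2 ** (level - 1)
        edges = 6 * copies + edges   # outer edges plus the inner G(level-1)
        commodities = 2 * copies     # two commodities per outer copy
        distance += 2                # one extra hop in, one extra hop out
    return commodities, edges, distance


for n in range(1, 6):
    k, m, d = recursive_construction_stats(n)
    # Compare total source-sink distance k*d with the number of edges m.
    print(n, k, m, d, k * d)
```

Under this recurrence the edge count stays linear in the number of commodities while the total source-sink distance grows like k log k, so Σi d(si,ti) exceeds |E| from G(2) onward.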
Summary
• Directed instances:
  • Coding rate >> flow rate
• Undirected instances:
  • Conjecture: Flow rate = Coding rate
  • Proof for special bipartite graphs
  • Tool: Informational Dominance
  • Proving conjecture solves Matrix Transposition Problem
Open Problems
• Computing the network coding rate in DAGs:
  • Recursively decidable?
  • How do you compute an o(n)-factor approximation?
• Undirected k-pairs conjecture:
  • Stronger complexity consequences?
  • Prove an Ω(log n) gap between sparsest cut and coding rate for some graphs
  • …or, find a fast matrix transposition algorithm.
Optimality (NCR = network coding rate)
• The graph G(n) proves: Thm [HKL’05]: ∃ graphs G(V,E) where NCR = Ω(flow rate ∙ |E|)
• G(n) is optimal: Thm [HKL’05]: ∀ graphs G(V,E), NCR/flow rate = O(min{|V|, |E|, k})
Informational Dominance
Def: A dominates B if information in A determines information in B in every network coding solution.
• Example: A does not dominate B
[Figure: network with sources s1, s2 and sinks t1, t2, with edge sets A and B marked]
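Informational Dominance
Def: A dominates B if information in A determines information in B in every network coding solution.
• Example: A dominates B
• Sufficient Condition: If there is no path from any source to B once A is removed, then A dominates B
[Figure: network with sources s1, s2 and sinks t1, t2, with edge sets A and B marked]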
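The sufficient condition is a plain reachability test. The sketch below checks it on a butterfly network, assuming the condition is read as "no source can reach any edge of B once the edges of A are deleted"; the edge list, node names, and function name are illustrative. A result of False only means this test is inconclusive, not that A fails to dominate B.

```python
from collections import deque


def sufficient_condition_holds(edges, sources, A, B):
    """True if no edge of B is reachable from a source after deleting A.

    Edges are directed (tail, head) pairs. An edge counts as reachable when
    its tail is reachable; edges of B that also lie in A are trivially covered.
    """
    adj = {}
    for u, v in edges:
        if (u, v) not in A:
            adj.setdefault(u, []).append(v)
    reachable = set(sources)
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in reachable:
                reachable.add(v)
                queue.append(v)
    return all(e in A or e[0] not in reachable for e in B)


# Assumed butterfly: u -> v is the bottleneck edge.
butterfly = [
    ("s1", "u"), ("s2", "u"), ("u", "v"),
    ("v", "t1"), ("v", "t2"),
    ("s1", "t2"), ("s2", "t1"),
]
sources = ["s1", "s2"]
bottleneck = {("u", "v")}

cut_ok = sufficient_condition_holds(butterfly, sources, bottleneck, {("v", "t1")})
cut_bad = sufficient_condition_holds(butterfly, sources, bottleneck, {("s2", "t1")})
```

Here the bottleneck edge passes the test for the edge v→t1 (nothing reaches v without it) but not for the side edge s2→t1, whose tail is itself a source.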
Informational Dominance Example
• “Obviously” flow rate = NCR = 1
• How to prove it? Markovicity?
• No two edges disconnect t1 and t2 from both sources!
[Figure: network with sources s1, s2 and sinks t1, t2]
Informational Dominance Example
• Sufficient Condition: If there is no path from any source to B once A is removed, then A dominates B
[Figure: same network with Cut A marked]
Informational Dominance Example
• Our characterization implies that A dominates {t1,t2}, so H(A) ≥ H(t1,t2)
[Figure: same network with Cut A marked]
Rate ¾ for Okamura-Seymour
[Figure sequence: a chain of "≥" inequalities between sums of entropy terms, drawn on the Okamura-Seymour graph with terminals s1/t3, s2/t1, s3/t2, s4, t4; the individual terms are not recoverable from the extracted text]