Network Coding: A New Direction in Combinatorial Optimization. Nick Harvey. Collaborators. David Karger Robert Kleinberg April Rasala Lehman Kazuo Murota. Kamal Jain Micah Adler. UMass. Transportation Problems. Max Flow. Min Cut. Transportation Problems. Communication Problems.
Network Coding: A New Direction in Combinatorial Optimization. Nick Harvey
Collaborators • David Karger • Robert Kleinberg • April Rasala Lehman • Kazuo Murota • Kamal Jain • Micah Adler UMass
Transportation Problems Max Flow
Min Cut Transportation Problems
Communication Problems “A problem of inherent interest in the planning of large-scale communication, distribution and transportation networks also arises with the current rate structure for Bell System leased-line services.” - Robert Prim, 1957 • Motivation for Network Design largely from communication networks • Spanning Tree • Steiner Tree • Steiner Forest • Steiner Network • Facility Location • Multicommodity Buy-at-Bulk
What is the capacity of a network? • Send items from s1 → t1 and s2 → t2 • Problem: no disjoint paths [Figure: network with sources s1, s2, sinks t1, t2, and a marked bottleneck edge]
An Information Network • If sending information, we can do better • Send xor b1⊕b2 on bottleneck edge [Figure: the same network carrying bits b1 and b2, with b1⊕b2 on the bottleneck edge]
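The XOR trick can be checked with a few lines of Python. This is a minimal sketch, not taken from the slides: the figure is lost, so the wiring is an assumption (the usual butterfly, where s1 has a side edge to t2, s2 has a side edge to t1, and both sinks hear the bottleneck edge). The function name `butterfly` is hypothetical.

```python
from typing import Tuple

def butterfly(b1: int, b2: int) -> Tuple[int, int]:
    """Simulate one use of the butterfly network with unit-capacity edges.

    Assumed wiring (the slide figure is not available): s1 has a side edge
    to t2, s2 has a side edge to t1, and both sinks hear the bottleneck edge.
    """
    side_to_t2 = b1            # side edge from s1 carries b1 unchanged
    side_to_t1 = b2            # side edge from s2 carries b2 unchanged
    bottleneck = b1 ^ b2       # the single shared edge carries the XOR
    decoded_at_t1 = bottleneck ^ side_to_t1   # (b1 ^ b2) ^ b2 == b1
    decoded_at_t2 = bottleneck ^ side_to_t2   # (b1 ^ b2) ^ b1 == b2
    return decoded_at_t1, decoded_at_t2

# Try all four input pairs; each sink should recover the bit it wants.
results = {(b1, b2): butterfly(b1, b2) for b1 in (0, 1) for b2 in (0, 1)}
```

Every edge carries exactly one bit, yet both pairs communicate at once, which is what routing alone cannot do here.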
Moral of Butterfly Transportation Network Capacity ≠ Information Network Capacity
Understanding Network Capacity • Information Theory • Deep analysis of simple channels (noise, interference, etc.) • Little understanding of network structures • Combinatorial Optimization • Deep understanding of transportation problems on complex structures • Does not address information flow • Network Coding • Combine ideas from both fields
Definition: Instance • Graph G (directed or undirected) • Capacity c_e on edge e • k commodities, with • A source s_i • Set of sinks T_i • Demand d_i • Typically: • all capacities c_e = 1 • all demands d_i = 1 • Technicality: • Always assume G is directed. Replace with [Figure: edge replacement gadget, not recoverable from the slide text] [Figure: example instance with s1, s2, t1, t2]
Definition: Solution • Alphabet Σ(e) for messages on edge e • A function f_e for each edge s.t. • Causality: Edge (u,v) sends information previously received at u. • Correctness: Each sink t_i can decode data from source s_i. [Figure: network carrying b1, b2, and b1⊕b2]
Multicast • Graph is DAG • 1 source, k sinks • Source has r messages m1, m2, …, mr in alphabet Σ • Each sink wants all msgs Thm [ACLY00]: Network coding solution exists iff connectivity r from source to each sink
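The [ACLY00] condition is a max-flow test: the source must have edge connectivity at least r to every sink. A minimal sketch of that check follows, assuming unit capacities and a hypothetical edge list for the butterfly multicast network (node names `a`, `b`, `c`, `d` are illustrative, not from the slides).

```python
from collections import defaultdict, deque

def max_flow(edges, source, sink):
    """Edmonds-Karp on unit-capacity directed edges; returns the flow value."""
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        cap[(u, v)] += 1
        adj[u].add(v)
        adj[v].add(u)          # allow traversal of residual (reverse) edges
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Push one unit of flow back along the path found.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Hypothetical butterfly multicast instance: source s, sinks t1 and t2.
edges = [("s", "a"), ("s", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "t1"), ("d", "t2"), ("a", "t1"), ("b", "t2")]
connectivity = {t: max_flow(edges, "s", t) for t in ("t1", "t2")}
r = 2
multicast_feasible = all(value >= r for value in connectivity.values())
```

Both sinks have connectivity 2 here, so the theorem promises a coding solution for r = 2 messages even though the two flows share the edge c → d.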
Multicast Example [Figure: source s with messages m1, m2 and sinks t1, t2]
Linear Network Codes • Treat alphabet as finite field • Node outputs linear combinations of inputs • Thm [LYC03]: Linear codes sufficient for multicast [Figure: node with inputs A and B sending A+B on its outgoing edges]
Multicast Code Construction • Thm [HKMK03]: Random linear codes work (over large enough field) • Thm [JS…03]: Deterministic algorithm to construct codes • Thm [HKM05]: Deterministic algorithm to construct codes (general algebraic approach)
Random Coding Solution • Randomly choose coding coefficients • Sink receives linear combs of source msgs • If connectivity ≥ r, linear combs have full rank ⇒ can decode! • Without coding, problem is Steiner Tree Packing (hard!)
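The decoding step can be illustrated with a small simulation. This is a simplified sketch rather than the [HKMK03] construction: it skips the network entirely and assumes the sink simply receives r uniformly random linear combinations over GF(p), with p = 257 chosen arbitrarily. In a real network the sink's coefficients are products and sums of the random choices made along the paths. The names `solve_mod_p` and `random_code_trial` are hypothetical.

```python
import random

P = 257  # assumed prime field size; larger fields make failure rarer

def solve_mod_p(matrix, rhs, p=P):
    """Solve matrix * x = rhs over GF(p) by Gauss-Jordan; None if singular."""
    n = len(matrix)
    aug = [[x % p for x in row] + [b % p] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            return None                       # not full rank: cannot decode
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], p - 2, p)    # inverse via Fermat's little theorem
        aug[col] = [x * inv % p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % p for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def random_code_trial(messages, rng, p=P):
    """One trial: random coefficients, then the sink tries to decode."""
    r = len(messages)
    coeffs = [[rng.randrange(p) for _ in range(r)] for _ in range(r)]
    received = [sum(c * m for c, m in zip(row, messages)) % p for row in coeffs]
    return solve_mod_p(coeffs, received, p)

rng = random.Random(0)
messages = [12, 200, 7]
trials = [random_code_trial(messages, rng) for _ in range(200)]
successes = sum(t == messages for t in trials)
```

A trial fails only when the random coefficient matrix is singular, which becomes less likely as the field grows; this is the sense of "over large enough field".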
Our Algorithm • Derandomization of [HKMK] algorithm • Technique: Max-Rank Completion of Mixed Matrices • Mixed Matrix: contains numbers and variables • Completion = choice of values for variables that maximizes the rank.
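To make the definition of a completion concrete, here is a brute-force sketch over a tiny field. It is explicitly not the deterministic algorithm the slide refers to: it tries every assignment, which takes exponential time, and exists only to show what is being maximized. Variables are represented as strings, and the function names are hypothetical.

```python
from itertools import product

def rank_mod_p(matrix, p):
    """Rank of an integer matrix over GF(p), by Gaussian elimination."""
    rows = [[x % p for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def max_rank_completion(mixed, p):
    """Try every assignment of field values to the variables (exponential)."""
    variables = sorted({x for row in mixed for x in row if isinstance(x, str)})
    best_rank, best_assignment = -1, None
    for values in product(range(p), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        concrete = [[assignment[x] if isinstance(x, str) else x for x in row]
                    for row in mixed]
        rank = rank_mod_p(concrete, p)
        if rank > best_rank:
            best_rank, best_assignment = rank, assignment
    return best_rank, best_assignment

# Mixed matrix with numbers and two variables; its determinant is 1 - x*y.
mixed = [[1, "x"], ["y", 1]]
best_rank, best_assignment = max_rank_completion(mixed, p=3)
```

In the coding setting the variables play the role of the coding coefficients, and a max-rank completion is a choice of coefficients under which the sinks can decode.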
k-pairs problem • Network coding when each commodity has one sink • Analogous to multicommodity flow • Goal: compute max concurrent rate • This is an open question [Figure: network with s1, s2, t1, t2]
Rate • Each edge e has its own alphabet Σ(e) of messages • Rate = min over edges e and commodities i of log|Σ(S(i))| / log|Σ(e)|, where S(i) is the source of commodity i • NCR = sup { rate of coding solutions } • Observation: If there is a fractional flow with rational coefficients achieving rate r, there is a network coding solution achieving rate r.
Directed k-pairs • Network coding rate can be much larger than flow rate! • Butterfly graph • Network coding rate (NCR) = 1 • Flow rate = ½ Thm [HKL’04, LL’04]: ∃ graphs G(V,E) where NCR = Ω( flow rate ∙ |V| ) Thm [HKL’05]: ∃ graphs G(V,E) where NCR = Ω( flow rate ∙ |E| ) [Figure: butterfly graph with s1, s2, t1, t2]
NCR / Flow Gap • G(1): NCR = 1, Flow rate = ½ • Equivalent to: [Figure: two panels on s1, s2, t1, t2; "Network Coding" panel with edge capacity = 1, "Flow" panel with edge capacity = ½]
NCR / Flow Gap • G(2): Start with two copies of G(1) [Figure: G(2) with sources s1, s2, s3, s4 and sinks t1, t2, t3, t4]
NCR / Flow Gap • G(2): Replace middle edges with copy of G(1) [Figure: G(2) with sources s1, s2, s3, s4 and sinks t1, t2, t3, t4]
NCR / Flow Gap • G(2): NCR = 1, Flow rate = ¼ [Figure: G(2) with sources s1, s2, s3, s4, sinks t1, t2, t3, t4, and an embedded copy of G(1)]
NCR / Flow Gap • G(n): # commodities = 2^n, |V| = O(2^n), |E| = O(2^n) • NCR = 1, Flow rate = 2^-n [Figure: G(n) with sources s1, s2, s3, s4, …, s_(2^n - 1), s_(2^n), matching sinks t1, …, t_(2^n), and an embedded copy of G(n-1)]
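The arithmetic behind the gap can be written out in one line. This is a sketch using only the counts stated for G(n); the constants hidden in the O(·) bounds are not given on the slides.

```latex
\[
\frac{\mathrm{NCR}(G(n))}{\text{flow rate}(G(n))}
  \;=\; \frac{1}{2^{-n}} \;=\; 2^{n},
\qquad
|V| = O(2^{n}),\quad |E| = O(2^{n}),\quad k = 2^{n}
\;\;\Longrightarrow\;\;
\frac{\mathrm{NCR}}{\text{flow rate}} \;=\; \Omega(|V|) \;=\; \Omega(|E|) \;=\; \Omega(k).
\]
```

So the ratio grows linearly in every size parameter of the instance, doubling with each level of the recursion.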
Optimality • The graph G(n) proves: Thm [HKL’05]: ∃ graphs G(V,E) where NCR = Ω( flow rate ∙ |E| ) • G(n) is optimal: Thm [HKL’05]: ∀ graphs G(V,E), NCR / flow rate = O( min {|V|, |E|, k} )
Network flow vs. information flow
Multicommodity Flow • Efficient algorithms for computing maximum concurrent (fractional) flow. • Connected with metric embeddings via LP duality. • Approximate max-flow min-cut theorems.
Network Coding • Computing the max concurrent network coding rate may be anywhere from undecidable to decidable in poly-time. • No adequate duality theory. • No cut-based parameter is known to give sublinear approximation in digraphs.
No known undirected instance where network coding rate ≠ max flow! (The undirected k-pairs conjecture.)
Why not obviously decidable? • How large should alphabet size be? • Thm [LL05]: There exist networks where max-rate solution requires alphabet size [bound not recoverable from the slide text] • Moreover, rate does not increase monotonically with alphabet size! • No such thing as a “large enough” alphabet
Approximate max-flow / min-cut? The value of the sparsest cut is • an O(log n)-approximation to max-flow in undirected graphs. [AR’98, LLR’95, LR’99] • an O(√n)-approximation to max-flow in directed graphs. [CKR’01, G’03, HR’05] • not even a valid upper bound on network coding rate in directed graphs! Example: {e} has capacity 1 and separates 2 commodities, i.e. sparsity is ½. Yet network coding rate is 1. [Figure: network with s1, s2, t1, t2 and marked edge e]
Approximate max-flow / min-cut? • The value of the sparsest cut induced by a vertex partition is a valid upper bound, but can exceed network coding rate by a factor of Ω(n). • We next present a cut parameter which may be a better approximation… [Figure: network with si, ti, sj, tj]
Informational Dominance • Definition: A dominates e if for every network coding solution, the messages sent on edges of A uniquely determine the message sent on e. • Given A and e, how hard is it to determine whether A dominates e? Is it even decidable? • Theorem [HKL’05]: There is a combinatorial characterization of informational dominance. Also, there is an algorithm to compute whether A dominates e in time O(k²m).
Informational Dominance Def: A dominates B if information in A determines information in B in every network coding solution. Example: A does not dominate B [Figure: network with s1, s2, t1, t2 and edge sets A, B]
Informational Dominance Def: A dominates B if information in A determines information in B in every network coding solution. Example: A dominates B Sufficient Condition: If there is no path from any source to B once the edges of A are removed, then A dominates B (not a necessary condition) [Figure: network with s1, s2, t1, t2 and edge sets A, B]
Informational Dominance Example • “Obviously” flow rate = NCR = 1 • How to prove it? Markovicity? • No two edges disconnect t1 and t2 from both sources! [Figure: network with s1, s2, t1, t2]
Informational Dominance Example • Our characterization implies that A dominates {t1,t2} ⇒ H(A) ≥ H(t1,t2) [Figure: network with s1, s2, t1, t2 and cut A]
Informational Meagerness • Def: Edge set A informationally isolates commodity set P if A ∪ P̄ dominates P, where P̄ is the set of commodities not in P. • iM(G) = min over A, P of (Capacity of edges in A) / (Demand of commodities in P), for P informationally isolated by A • Claim: network coding rate ≤ iM(G).
Approximate max-flow / min-cut? • Informational meagerness is no better than a Ω(log n)-approximation to the network coding rate, due to a family of instances called the iterated split butterfly. • On the other hand, we don’t even know if it is a o(n)-approximation in general. • And we don’t know if there is a polynomial-time algorithm to compute a o(n)-approximation to the network coding rate in directed graphs.
Sparsity Summary • Directed Graphs: Flow Rate ≤ Sparsity < NCR ≤ iM(G) (the strict inequality holds in some graphs) • Undirected Graphs: Flow Rate ≤ NCR ≤ Sparsity (NCR ≤ Sparsity is an easy consequence of info. dom.; gap can be Ω(log n) when G is an expander)
Undirected k-Pairs Conjecture [Figure: Flow Rate ? NCR ? Sparsity, with the candidate relations "<" and "=" between each pair; one case is labelled "Unknown until this work" and the other "Undirected k-pairs conjecture"]
The Okamura-Seymour Graph • Every edge cut has enough capacity to carry the combined demand of all commodities separated by the cut. [Figure: Okamura-Seymour graph with terminals s1, s2, s3, s4, t1, t2, t3, t4 and a marked cut]
Okamura-Seymour Max-Flow • Flow Rate = 3/4 • s_i is 2 hops from t_i. • At flow rate r, each commodity consumes at least 2r units of bandwidth in a graph with only 6 units of capacity. [Figure: Okamura-Seymour graph with terminals s1, s2, s3, s4, t1, t2, t3, t4]
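The counting argument behind the 3/4 bound, written out as a short worked inequality. It uses only the figures on the slide: four commodities, every s_i at least 2 hops from its t_i, and 6 units of total edge capacity.

```latex
\[
\underbrace{4}_{\text{commodities}} \cdot \underbrace{2r}_{\text{bandwidth each}}
\;\le\; \underbrace{6}_{\text{total capacity}}
\quad\Longrightarrow\quad
r \;\le\; \frac{6}{8} \;=\; \frac{3}{4}.
\]
```

This argument charges each unit of bandwidth to exactly one commodity, which is the step that has no direct analogue once edges may carry combinations of messages.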
The trouble with information flow… • At flow rate r, each commodity consumes at least 2r units of bandwidth in a graph with only 6 units of capacity. • If an edge combines messages from multiple sources, which commodities get charged for “consuming bandwidth”? • We present a way around this obstacle and bound NCR by 3/4. [Figure: Okamura-Seymour graph with terminals s1, s2, s3, s4, t1, t2, t3, t4]
Okamura-Seymour Proof • Thm [AHJKL’05]: flow rate = NCR = 3/4. • We will prove: Thm [HKL’05]: NCR ≤ 6/7 < Sparsity. • Proof uses properties of entropy. • A ⊆ B ⇒ H(A) ≤ H(B) • Submodularity: H(A) + H(B) ≥ H(A∪B) + H(A∩B) • Lemma (Cut Bound): For a cut A ⊆ E, H(A) ≥ H(A, sources separated by A).
H(A) ≥ H(A,s1,s2,s4) (Cut Bound) [Figure: Okamura-Seymour graph with terminals s1, s2, s3, s4, t1, t2, t3, t4 and cut A]
H(B) ≥ H(B,s1,s2,s4) (Cut Bound) [Figure: Okamura-Seymour graph with terminals s1, s2, s3, s4, t1, t2, t3, t4 and cut B]
• Add inequalities: H(A) + H(B) ≥ H(A,s1,s2,s4) + H(B,s1,s2,s4) • Apply submodularity: H(A) + H(B) ≥ H(A∪B,s1,s2,s4) + H(s1,s2,s4) • Note: A∪B separates s3 (Cut Bound) ⇒ H(A∪B,s1,s2,s4) ≥ H(s1,s2,s3,s4) • Conclude: H(A) + H(B) ≥ H(s1,s2,s3,s4) + H(s1,s2,s4) ⇒ 6 edges ≥ rate of 7 sources ⇒ rate ≤ 6/7. [Figure: cuts A and B]
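The same chain as a single aligned derivation. This is a sketch under two assumptions that the slides leave implicit: the sources are independent with entropy r each (in units where a unit-capacity edge carries entropy at most 1), and cuts A and B contain 6 edges in total.

```latex
\begin{align*}
6 \;\ge\; H(A) + H(B)
  &\ge H(A, s_1, s_2, s_4) + H(B, s_1, s_2, s_4)
     && \text{(Cut Bound on $A$ and on $B$)} \\
  &\ge H(A \cup B,\, s_1, s_2, s_4) + H(s_1, s_2, s_4)
     && \text{(submodularity, then monotonicity)} \\
  &\ge H(s_1, s_2, s_3, s_4) + H(s_1, s_2, s_4)
     && \text{(Cut Bound: $A \cup B$ separates $s_3$)} \\
  &= 4r + 3r \;=\; 7r
     && \text{(independent sources)}
\end{align*}
\[
\Longrightarrow\quad r \;\le\; \tfrac{6}{7}.
\]
```

In the second line, submodularity is applied to the sets A ∪ {s1, s2, s4} and B ∪ {s1, s2, s4}; their intersection contains {s1, s2, s4}, so monotonicity gives the H(s1, s2, s4) term.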
Rate ¾ for Okamura-Seymour [Figure: Okamura-Seymour graph with terminals s1, s3, s4, t3, t4 and labelled edges; labels not recoverable from the slide text]