Some interesting directions in network coding Muriel Médard Electrical Engineering and Computer Science Department Massachusetts Institute of Technology
Collaborators • MIT LIDS: Minji Kim, Minkyu Kim, Anna Lee, Devavrat Shah, Jay-Kumar Sundararajan • MIT CSAIL: Varun Aggarwal, Wenjun Hu, David Karger, Dina Katabi, Sachin Katti, Ben Leong, Una-May O’Reilly, Hariharan Rahul • MIT Broad Institute: Desmond Lun (previously LIDS) • Technical University of Munich: Ralf Koetter (previously UIUC) • University of Illinois Urbana-Champaign: Danail Traskov • California Institute of Technology: Michelle Effros, Tracey Ho (previously MIT LIDS, UIUC, Lucent) • Ecole Polytechnique Federale Lausanne (Switzerland): Christina Fragouli • Digital Fountain: Payam Pakzad (previously EPFL) • Samsung Advanced Institute of Technology: Chang Wook Ahn • BBN: Karen Haigh, Paul Rubel • Qualcomm: Niranjan Ratnakar (previously UIUC)
Overview • Basic overview of network coding • Network coding for erasures • Limited network coding • Network coding in multi-source multicast • Network coding beyond multicast Increasing functionality of network coding
Network coding • Canonical example [Ahlswede et al. 00] • What choices can we make? • No longer distinct flows, but information [Figure: butterfly network with nodes s, t, u, w, x, y, z carrying bits b1 and b2]
Network coding • Picking a single bit does not work • Time sharing does not work • No longer distinct flows, but information [Figure: butterfly network in which only b1 is forwarded through w and x to y and z]
Network coding • Need to use algebraic nature of data • No longer distinct flows, but information • Must we consider the optimization of codes and network usage jointly? [Figure: butterfly network in which b1 + b2 is forwarded through w and x to y and z]
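The canonical example can be traced in a few lines. This is a minimal sketch, assuming the usual butterfly topology suggested by the node names on the slide (s feeds t and u, both feed w, w feeds x, and x feeds the sinks y and z); the function name `butterfly` is hypothetical.

```python
def butterfly(b1: int, b2: int) -> dict:
    """Send bits b1 and b2 from source s to sinks y and z (assumed butterfly topology)."""
    # s -> t carries b1 and s -> u carries b2.
    at_t, at_u = b1, b2
    # w hears one bit from t and one from u, but link w -> x carries a single bit,
    # so w forwards the sum over GF(2), which is XOR.
    coded = at_t ^ at_u
    # x relays the coded bit to both sinks.
    # y also hears b1 directly from t, so it recovers b2 = b1 + (b1 + b2).
    y = (at_t, at_t ^ coded)
    # z also hears b2 directly from u, so it recovers b1 = b2 + (b1 + b2).
    z = (at_u ^ coded, at_u)
    return {"y": y, "z": z}


for b1 in (0, 1):
    for b2 in (0, 1):
        print((b1, b2), butterfly(b1, b2))
```

Both sinks obtain both bits in one use of every link, which neither forwarding a single bit through w nor time sharing achieves.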
Randomized network coding: multicast • To recover symbols at the receivers, we require sufficient degrees of freedom – an invertible matrix in the coefficients of all nodes • The realization of the determinant of the matrix will be non-zero with high probability if the coefficients are chosen independently and randomly • Probability of success over field F ≈ • Randomized network coding can use any multicast subgraph which satisfies the min-cut max-flow bound [Ho et al. 03], with any number of sources, even when correlated [Ho et al. 04] [Figure: node j with endogenous inputs and an exogenous input]
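The effect of field size on invertibility can be illustrated with a simplified model. The sketch below is an assumption-laden stand-in: it draws a uniformly random n x n matrix over a prime field GF(p) rather than the actual transfer matrix of a network, and compares the empirical invertibility rate with the exact value ∏_{i=1..n} (1 − p^(−i)) for such matrices. All function names are hypothetical.

```python
import random


def rank_mod_p(rows, p):
    """Rank of a matrix over GF(p), p prime, by Gauss-Jordan elimination."""
    m = [list(r) for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col] % p:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def invertible_probability(n, p):
    """Exact probability that a uniform n x n matrix over GF(p) is invertible."""
    prob = 1.0
    for i in range(1, n + 1):
        prob *= 1.0 - p ** (-i)
    return prob


def estimate(n, p, trials, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        mat = [[rng.randrange(p) for _ in range(n)] for _ in range(n)]
        ok += rank_mod_p(mat, p) == n
    return ok / trials


for p in (2, 17, 257):
    print(p, round(invertible_probability(3, p), 4), estimate(3, p, 2000))
```

In this toy model the failure probability shrinks roughly like 1/p, which is why larger fields make random coefficient choices safe.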
Erasure reliability – single flow • End-to-end erasure coding: Capacity is packets per unit time. • As two separate channels: Capacity is packets per unit time. Can use block erasure coding on each channel, but delay is a problem. • Network coding: minimum cut is capacity • For erasures, correlated or not, we can in the multicast case deal with average flows only [Lun et al. 04, 05], [Dana et al. 04]: • Nodes store received packets in memory • Random linear combinations of memory contents are sent out • Delay expressions generalize Jackson networks to the innovative packets • Can be used in a rateless fashion
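The comparison can be made concrete under stated assumptions. The sketch below assumes a two-link tandem A → B → C with independent erasure probabilities `eps_ab` and `eps_bc`; these symbols and the function names are assumptions, since the slide's own expressions are not reproduced here. Under that model, end-to-end coding is limited by the product of the delivery probabilities, while per-link coding and network coding are both limited by the minimum cut. The simulation further assumes a field large enough that every combination B sends is innovative whenever B holds more degrees of freedom than C.

```python
import random


def end_to_end_rate(eps_ab, eps_bc):
    """B only forwards: a packet must survive both links."""
    return (1 - eps_ab) * (1 - eps_bc)


def min_cut_rate(eps_ab, eps_bc):
    """Rate of the weaker link, the minimum cut of the tandem."""
    return min(1 - eps_ab, 1 - eps_bc)


def simulate_recoding(eps_ab, eps_bc, slots, seed=0):
    """B stores what it receives and sends random combinations in every slot."""
    rng = random.Random(seed)
    rank_b = rank_c = 0  # degrees of freedom held at B and at C
    for _ in range(slots):
        # B's packet is useful to C only if B knows something C does not.
        if rank_b > rank_c and rng.random() > eps_bc:
            rank_c += 1
        # A sends a fresh combination every slot (rateless operation).
        if rng.random() > eps_ab:
            rank_b += 1
    return rank_c / slots


print(end_to_end_rate(0.2, 0.1))        # about 0.72
print(min_cut_rate(0.2, 0.1))           # about 0.8
print(simulate_recoding(0.2, 0.1, 20000))
```

The simulated rate approaches the min-cut value without waiting for whole blocks at B, which is the delay advantage over block erasure coding on each link.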
Feedback for reliability • Parameters we consider: • delay incurred at B: excess time, relative to the theoretical minimum, that it takes for k packets to be communicated, disregarding any delay due to the use of the feedback channel • block size • feedback: number of feedback packets used (feedback rate Rf = number of feedback messages / number of received packets) • memory requirement at B • achievable rate from A to C
Feedback for reliability • Follow the approach of Pakzad et al. 05, Lun et al. 06 • Scheme V allows us to achieve the min-cut rate while keeping the average memory requirements at node B finite • Note that the feedback delay for Scheme V is smaller than that of the usual ARQ (with Rf = 1) by a factor of Rf • Feedback is required only on link BC [Fragouli et al. 07]
Interesting directions • Practical code design: • Using small generation sizes may reduce the throughput and erasure-correcting benefits of mixing information packets • Large generation sizes may incur unacceptable decoding delay at the receivers • Can we consider issues of delay, memory and feedback overhead for interesting code designs? • How do we take these issues into account when we use multicast rather than single flow approaches? • Parameter adaptation for delay-sensitive applications: • Feedback from the receivers to the source can be used to adjust adaptively the generation size and maximize the number of packets successfully decoded within the delay specifications. • The source response to this type of feedback is similar to TCP windows • Can we build an entire TCP-style suite for single network coded flows? • Errors – see Ralf’s talk!
Limited network coding with multicast • Difficulty of not allowing coding everywhere: • Finding a minimal set of coding nodes or links is NP-hard • Finding multicast codes when some nodes are not able to code is difficult • We associate a binary variable with each coefficient at a merging node: 0 means the coefficient is zeroed, 1 means it remains indeterminate • For each assignment of binary values to the variables, we can verify the achievability of the target rate R and determine whether coding is required [Figure: merging node with incoming links labelled by binary values; each value indicates the state of the associated coefficient]
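The enumeration over binary assignments can be sketched on the butterfly example. This is an illustrative toy, not the verification procedure used in the talk: it assumes the only merging node is w, works over GF(2) so that an indeterminate coefficient can only take the value 1, and tests achievability of rate 2 by checking that each sink's 2 x 2 transfer matrix is invertible. All names are hypothetical.

```python
from itertools import product


def det2_gf2(m):
    """Determinant of a 2 x 2 matrix over GF(2)."""
    return (m[0][0] * m[1][1] + m[0][1] * m[1][0]) % 2


def rate2_achievable(mask):
    """mask = (m1, m2): 0 zeroes the coefficient at w, 1 leaves it free."""
    c1, c2 = mask                # over GF(2) a free coefficient equals 1
    coded = (c1, c2)             # global coding vector on link w -> x
    y_matrix = [(1, 0), coded]   # y hears b1 directly, plus the coded symbol
    z_matrix = [(0, 1), coded]   # z hears b2 directly, plus the coded symbol
    return det2_gf2(y_matrix) == 1 and det2_gf2(z_matrix) == 1


results = {mask: rate2_achievable(mask) for mask in product((0, 1), repeat=2)}

# Coding is required at w if no feasible assignment keeps at most one coefficient.
coding_required = not any(ok and sum(mask) <= 1 for mask, ok in results.items())

print(results)
print(coding_required)
```

Only the assignment that keeps both coefficients is feasible here, so w must code; in a larger network the same check would run once per candidate assignment.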
Performance of genetic algorithms
              LATA-X        ISP 1755      (20,12,4)     (40,12,3)
              Best  Avg     Best  Avg     Best  Avg     Best  Avg
Proposed GA   0     0.35    0     0.25    0     1.20    0     1.05
Minimal 1     0     0.90    0     1.05    0     1.35    1     1.85
Minimal 2     0     1.10    0     0.80    0     1.85    0     1.90
Ratio         1     0.39    1     0.31    1     0.89    0     0.57
• LATA-X and ISP 1755 (Ebone) from Rocketfuel Project • Randomly generated connected acyclic directed graphs with (20 nodes, 12 sinks, rate 4) and (40 nodes, 12 sinks, rate 3) • Minimal 1: greedy approach [Fragouli Soljanin 06] • Minimal 2: greedy approach [Langberg et al. 05]
Decentralized operation • Populations can be managed locally • Crossovers and mutations can also be managed locally • Some coordination is required for: • Fitness value calculation: feedback can be done efficiently • Selection and pairing of chromosomes: can be calculated at the source and transmitted with the data on renewal of each generation [Figure: population of binary chromosomes]
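The split between local and coordinated work can be sketched as follows. This is a hedged illustration, not the algorithm from the talk: it assumes each node holds only its own slice of every chromosome, that the source draws the pairing once per generation, and that uniform crossover and bit-flip mutation are the operators. The names `pairing`, `local_crossover` and `local_mutation` are hypothetical.

```python
import random


def pairing(pop_size, rng):
    """Computed once at the source and sent along with the data."""
    order = list(range(pop_size))
    rng.shuffle(order)
    return [(order[i], order[i + 1]) for i in range(0, pop_size - 1, 2)]


def local_crossover(bits_a, bits_b, rng):
    """Uniform crossover on one node's slice of two paired chromosomes."""
    child_a, child_b = [], []
    for x, y in zip(bits_a, bits_b):
        if rng.random() < 0.5:
            x, y = y, x  # swap this gene between the two children
        child_a.append(x)
        child_b.append(y)
    return child_a, child_b


def local_mutation(bits, rate, rng):
    """Flip each locally held bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


rng = random.Random(0)
# One node's slice of a population of four chromosomes (made-up values).
local_slices = [[0, 1, 1], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
for i, j in pairing(len(local_slices), rng):
    a, b = local_crossover(local_slices[i], local_slices[j], rng)
    local_slices[i] = local_mutation(a, 0.05, rng)
    local_slices[j] = local_mutation(b, 0.05, rng)
print(local_slices)
```

Because the operators act gene by gene, a node never needs the parts of a chromosome held elsewhere; only the pairing and the fitness values have to travel.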
Interesting problems • Interaction of coding and non-coding nodes: • Should they just co-exist or cooperate? • What algorithms can solve a joint routing/coding problem (in effect a constrained multicast network coding problem)? • Coding as a resource: • Can we determine how to place our coding resources in the network? • Should we turn coding on as needed?
Network coding – source coding – cooperation confluence • Network coding and distributed compression are intimately linked [Ho et al. 04]; we may envisage: • Network coding for correlated sources that makes use of naturally occurring correlation • Designing sources with correlation rather than straightforward replication, as is currently done in mirrors • Coding and decoding that meld erasure coding, multicast coding and compression • Rather than consider only shedding redundancy in networks, network coding points to using it and designing it intelligently
Optimization for multicast network coding • Index on receivers rather than on processes [Lun et al. 04] • The Steiner-tree problem can be seen to be this problem with extra integrality constraints [Figure: example network with source and sink nodes, links labelled by per-receiver flow vectors such as (1,1,0), (1,1,1) and (1,0,1)]
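For orientation, one standard way to write a min-cost multicast program indexed on receivers is sketched below. The symbols are assumptions and are not taken from the slide: A is the set of links, N the set of nodes, T the set of sinks, s the source, R the rate, a_ij the cost per unit rate on link (i,j), z_ij the coded rate on that link, and x_ij^(t) the flow toward sink t.

```latex
\begin{aligned}
\text{minimize}\quad   & \sum_{(i,j)\in A} a_{ij}\, z_{ij} \\
\text{subject to}\quad & z_{ij} \ge x_{ij}^{(t)} \ge 0,
                         && \forall (i,j)\in A,\ t\in T, \\
                       & \sum_{j:(i,j)\in A} x_{ij}^{(t)} - \sum_{j:(j,i)\in A} x_{ji}^{(t)} = \sigma_i^{(t)},
                         && \forall i\in N,\ t\in T,
\end{aligned}
\qquad
\sigma_i^{(t)} =
\begin{cases}
R  & i = s, \\
-R & i = t, \\
0  & \text{otherwise.}
\end{cases}
```

With coding, the rate on a link only has to cover the largest per-sink flow rather than the sum of the flows, which is what makes this a linear program; requiring the flows to be integral recovers the Steiner-tree problem mentioned above.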
Joint versus separate coding • Joint (cost 9) • Separate (cost 10.5) • For each link (R = 3) [Lee et al. 07]
Interesting directions • Making use of the joint coding: • Complexity goes up with the number of sources • How much better does this perform than doing Slepian-Wolf first, followed by routing or network coding? • How dependent is the design on knowing actual correlation parameters? • Practical code design for such schemes: • Achievability comes from random code construction, uses minimum-entropy decoding • Can we use the practical techniques that have yielded good results in Slepian-Wolf in this type of network coding? • Generalize mirror site design: • Do not copy a whole site, but just certain portions • How does this affect the storage in and operation of networks?
Going beyond multicast • Can create algebraic setting for linear non-multicast connections [Koetter Medard 02,03] • In the non-multicast case, linear codes do not suffice [Dougherty et al. 05] • Limited code approaches: ability to use XOR • Opportunistic XORs that are undone immediately (COPE) [Katabi et al. 05, 06] • End-to-end XOR codes on 2 flows [Traskov et al. 06] using cycle approaches • These approaches outperform routing by trivially subsuming it • Generalizations to codes including more flows, intermediate decoding points or codes beyond XORs can be envisaged • A plethora of elaborations can be developed, leading to increased complexity with further benefits; the trade-off is unclear [Figure: net throughput (KB/s) versus number of flows for "Our Scheme" and "No Coding"; inset shows two nodes exchanging packets a and b through a relay that broadcasts a+b]
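An opportunistic XOR that is undone at the next hop can be sketched with a two-way relay. This is a toy illustration of the idea only, not the COPE protocol: the node names Alice and Bob, the helper `xor_bytes`, and the equal-length restriction are all assumptions made for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length in this sketch")
    return bytes(x ^ y for x, y in zip(a, b))


packet_a = b"hello"  # Alice's packet, destined for Bob
packet_b = b"world"  # Bob's packet, destined for Alice

# Without coding: Alice -> relay, relay -> Bob, Bob -> relay, relay -> Alice.
transmissions_routing = 4

# With coding: Alice -> relay, Bob -> relay, then one broadcast of a XOR b.
broadcast = xor_bytes(packet_a, packet_b)
transmissions_coding = 3

# Each endpoint undoes the XOR immediately with the packet it already holds.
received_by_alice = xor_bytes(broadcast, packet_a)
received_by_bob = xor_bytes(broadcast, packet_b)

print(received_by_alice, received_by_bob)
print(transmissions_routing, transmissions_coding)
```

The relay can always fall back to plain forwarding when no coding opportunity exists, which is why such schemes cannot do worse than routing.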
A principled optimization approach to match or outperform routing • An optimization that yields a solution that is no worse than multicommodity flow • The optimization is in effect a relaxation of multicommodity flow – akin to Steiner tree relaxation for the multicast case • A solution of the problem implies the existence of a network code to accommodate the arbitrary demands – the types of codes subsume routing • All decoding is performed at the receivers • We can provide an optimization, with a linear code construction, that is guaranteed to perform as well as routing [Lun et al. 04]
Optimization for arbitrary demands with decoding at receivers • For links going into node j, a set partition of {1, . . . , M} represents the sources that can be mixed (combined linearly) • Each sink t has demands drawn from {1, . . . , M} [Figure: optimization problem statement]
Coding and optimization • Sinks that receive a source process in C by way of link (j, i) either receive all the source processes in C or none at all • Hence source processes in C can be mixed on link (j, i), as the sinks that receive the mixture will also receive the source processes (or mixtures thereof) necessary for decoding • We step through the nodes in topological order, examining the outgoing links and defining global coding vectors on them (akin to [Jaggi et al. 03]) • We can build the code over an ever-expanding front • We can go to coding over time by considering several flows for the different times; we let the coding delay be arbitrarily large • The optimization and the coding are done separately, as for the multicast case, but the coding is not distributed
Fix the code approach – conflict hypergraph • There may be occasions when we are not willing to go to infinite code lengths, or the types of codes may be pre-determined in our network, with different codes at different nodes • In that case, we can adopt a conflict hypergraph representation of the effects of coding and allowable rate regions together • Recent development for considering intrinsic multicast in switches [Sundararajan et al. 04] and special fabrics [Caramanis et al. 04] • Provides a systematic approach to representing the capacity region of a coded system for arbitrary codes • Vertices: • Define one vertex for each possible “composition of information” on every link • The composition of information on a link is the net transfer function from the source messages to the symbol sent on the link • Edges: • In a valid code, at most one vertex can be chosen for each link • If the composition on an outgoing link at a node is incompatible with a set of incoming input compositions, then the corresponding vertices are connected by a hyperedge • Natural extension of switching approaches in networks
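The vertex and hyperedge rules above can be sketched as a validity check. This is a minimal illustration under assumptions: vertices are modelled as (link, composition) pairs, hyperedges as sets of such vertices, and the example links `tw`, `uw`, `wx` with their compositions are made up around a single merging node. The function name `is_valid_selection` is hypothetical.

```python
def is_valid_selection(selection, hyperedges):
    """Check a chosen set of (link, composition) vertices against the conflict rules."""
    # Rule 1: at most one composition may be chosen per link.
    links = [link for link, _ in selection]
    if len(links) != len(set(links)):
        return False
    # Rule 2: no hyperedge may lie entirely inside the selection.
    return not any(edge <= selection for edge in hyperedges)


# Toy merging node w with incoming links tw, uw and outgoing link wx.
hyperedges = [
    # b1 + b2 cannot be formed on wx when both inputs carry only b1.
    frozenset({("tw", "b1"), ("uw", "b1"), ("wx", "b1+b2")}),
    # b2 cannot be formed on wx when both inputs carry only b1.
    frozenset({("tw", "b1"), ("uw", "b1"), ("wx", "b2")}),
]

good = {("tw", "b1"), ("uw", "b2"), ("wx", "b1+b2")}
conflicting = {("tw", "b1"), ("uw", "b1"), ("wx", "b1+b2")}
two_on_one_link = {("tw", "b1"), ("wx", "b1"), ("wx", "b1+b2")}

print(is_valid_selection(good, hyperedges))
print(is_valid_selection(conflicting, hyperedges))
print(is_valid_selection(two_on_one_link, hyperedges))
```

Valid codes correspond to selections that avoid every hyperedge, so rate questions about a fixed family of codes become questions about stable sets of this hypergraph.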
Interesting directions • Design of codes: • How far should we go? • What are the advantages and disadvantages of fixing the lengths and fields ahead of time? • Should we be looking at non-linear codes? • Can we find some distributed approaches? • Performance evaluation: • Can we use properties of certain conflict graphs to obtain capacity regions? • Can we generalize the optimization approach, for instance when certain nodes can do intermediate decoding?