Algebraic Gossip: Network Coding for Rapid Data Dissemination Srinivas Shakkottai

University of Illinois at Urbana-Champaign
ECE 559, April 4th 2006

Based on the Papers
• Algebraic Gossip: A Network Coding Approach to Optimal Rumor Mongering.
  - Supratim Deb and Muriel Médard.
• Network Coding for Large Scale Content Distribution.
  - Christos Gkantsidis and Pablo Rodriguez.

Outline
• Peer to Peer Networks: Decentralization
• BitTorrent: Chopping files into pieces
• Network Coding: Linear Combinations
• Algebraic Gossip
• Algorithms: Push and Pull
• Simulation Results

Peer to Peer: Fully Connected

Questions
• Why should the peers cooperate?
• How fast can all the peers get all the pieces?
  - With full information
  - Without full information
• What happens when peers depart?
• What happens when peers both arrive and depart?

Model
• Fully connected static network.
• Identical link capacities.
• No information: peers are unaware of what pieces other peers possess.
• Altruistic peers: a peer will give another peer a piece regardless of whether it receives anything in exchange.
• Time is discretized into rounds.
• Explore the time aspect.

Network Coding
[Figure: multicast example with pieces b1 and b2; some links carry the coded piece b1 + b2.]

Peer to Peer
• The example considered was that of multicast.
• Can we do something similar in peer to peer networks?
• Transmit linear combinations of pieces instead of uncoded pieces.
• The coefficients used in the linear combinations could be transmitted in the header of the packet.
• If one obtains enough linearly independent combinations, one can decode all the pieces to get the file.
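The idea of sending a combination plus its coefficients can be sketched in a few lines. This is a minimal illustration over GF(2), where "adding" two pieces is a bytewise XOR; the piece contents, the packet layout (a dict with `coeffs` and `payload`), and the helper name `xor_bytes` are assumptions made for the example, not taken from the slides.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Add two equal-length pieces over GF(2), i.e. bytewise XOR."""
    return bytes(x ^ y for x, y in zip(a, b))

# Two hypothetical file pieces of equal length.
b1 = b"piece-01"
b2 = b"piece-02"

# A coded packet: the header carries the coefficients (1, 1),
# meaning payload = 1*b1 + 1*b2 over GF(2).
coded = {"coeffs": (1, 1), "payload": xor_bytes(b1, b2)}

# A peer that already holds b1 recovers b2 by subtracting (XORing) b1.
recovered_b2 = xor_bytes(coded["payload"], b1)
```

A peer holding b2 instead would recover b1 from the same packet, which is why a coded piece can be useful to more receivers than either uncoded piece.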

Algorithms
• Random Message Selection (RMS)
  - Nodes have no information regarding what pieces the other nodes have.
  - The transmitting node picks a message at random (i.e., each message that it has is equally likely to be picked) to send to the receiving node.
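The RMS selection rule can be sketched as follows. The function name `rms_transmit` and the example message sets are hypothetical; the only thing taken from the slide is that the sender picks uniformly among the messages it holds, without knowing what the receiver has.

```python
import random

def rms_transmit(sender_msgs, rng):
    """Return one message id chosen uniformly from the sender's set, or None if empty."""
    if not sender_msgs:
        return None
    # sorted() gives a stable sequence so rng.choice can index into it.
    return rng.choice(sorted(sender_msgs))

rng = random.Random(0)
sender = {0, 2, 5}      # message ids held by the transmitting node
receiver = {0, 2}       # message ids held by the receiving node

picked = rms_transmit(sender, rng)
# The transmission only helps if the receiver did not already have the message.
useful = picked is not None and picked not in receiver
receiver.add(picked)
```

In this example only message 5 is new to the receiver, so a blind uniform pick is useful with probability 1/3. This wasted-transmission effect is what coding is meant to reduce.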

Algorithms
• Random Linear Coding (RLC)
  - Instead of viewing messages of size m bits as an m-dimensional vector over a binary field, we may look at them as an r-dimensional vector over the finite field $\mathbb{F}_q$. Also, $r = \lceil m / \log_2 q \rceil$.
• Push and Pull Algorithms.
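A minimal sketch of the RLC encoding step, under the assumption that q is prime (here 257) so that field arithmetic is ordinary arithmetic mod q; the slides allow a general finite field. The names `rlc_encode`, `stored`, and the sample messages are hypothetical. Each stored item is a pair (code vector, payload), and the same random weights are applied to both so the header remains a valid description of the payload.

```python
import random

Q = 257  # assumed prime field size, so arithmetic is plain mod-Q

def rlc_encode(stored, rng, q=Q):
    """Combine stored (coeffs, payload) pairs with random weights drawn from F_q."""
    betas = [rng.randrange(q) for _ in stored]
    k = len(stored[0][0])   # number of original messages
    r = len(stored[0][1])   # symbols per message
    coeffs = [sum(b * c[j] for b, (c, _) in zip(betas, stored)) % q for j in range(k)]
    payload = [sum(b * p[j] for b, (_, p) in zip(betas, stored)) % q for j in range(r)]
    return coeffs, payload

# A node holding two uncoded messages: their code vectors are unit vectors.
m0, m1 = [1, 2, 3], [4, 5, 6]
stored = [([1, 0], m0), ([0, 1], m1)]

coeffs, payload = rlc_encode(stored, random.Random(1))
# The header describes the payload: payload = coeffs[0]*m0 + coeffs[1]*m1 (mod Q).
expected = [(coeffs[0] * a + coeffs[1] * b) % Q for a, b in zip(m0, m1)]
```

Because the node re-mixes whatever combinations it already holds, it never needs to decode before forwarding.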

Decoding
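Decoding amounts to solving a linear system once k linearly independent code vectors have been collected. The sketch below uses Gauss-Jordan elimination mod a prime q; the prime field, the function name `decode`, and the sample packets are assumptions for illustration, and the modular inverse via `pow(a, -1, q)` needs Python 3.8 or later.

```python
def decode(packets, k, q=257):
    """Recover the k original messages from (coeffs, payload) packets over F_q.

    Returns a list of k payloads, or None if the code vectors have rank < k.
    """
    # Augmented matrix: each row is [code vector | payload].
    rows = [list(c) + list(p) for c, p in packets]
    rank = 0
    for col in range(k):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % q), None)
        if pivot is None:
            return None  # not enough independent combinations yet
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, q)          # modular inverse of the pivot
        rows[rank] = [(x * inv) % q for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % q:
                f = rows[i][col]
                rows[i] = [(x - f * y) % q for x, y in zip(rows[i], rows[rank])]
        rank += 1
    # The first k rows now read [identity | original messages].
    return [rows[i][k:] for i in range(k)]

# Combinations of m0 = [1, 2, 3] and m1 = [4, 5, 6]; the third is redundant.
packets = [([1, 1], [5, 7, 9]), ([1, 2], [9, 12, 15]), ([2, 2], [10, 14, 18])]
messages = decode(packets, k=2)
```

With only the first and third packets the code vectors are parallel, so `decode` returns None until another independent combination arrives.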

Useful result
Node v transmits to node u using RLC in a round.
• $S_u^-$ = subspace spanned by vectors with u before.
• $S_v^-$ = subspace spanned by vectors with v before.
• $S_u^+$ = subspace spanned by vectors with u after.
• $S_v^+$ = subspace spanned by vectors with v after.
If $S_v^- \not\subseteq S_u^-$, then
$$\Pr\left(\dim S_u^+ > \dim S_u^-\right) \ge 1 - \frac{1}{q},$$
where q is the field size.

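Proof
• Let there be l vectors in $S_v^-$ (the subspace held by v before the transmission) which have a component orthogonal to $S_u^-$ (the subspace held by u before the transmission). Call them $\{g_1, \ldots, g_l\}$.
• $S_u^+$ is not larger than $S_u^-$ if these vectors contribute nothing to the linear combination, i.e., $\sum_{i=1}^{l} \beta_i g_i = 0$.
• Represent $[g_1, \ldots, g_l]^T = A$ (an l x k matrix).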

Continued…
• The combination vanishes if $[\beta_1, \ldots, \beta_l] A = 0$.
• This has k equations in l variables with field size q, so the maximum number of solutions is $q^{l-1}$.
• Also, since the β's are chosen uniformly at random from the $q^l$ possibilities, the probability that the dimension fails to increase is at most $q^{l-1} / q^l = 1/q$.
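The counting step can be checked by brute force on a tiny instance. The sketch below enumerates every coefficient vector β over a small prime field and counts those with βA = 0; the matrix A, the field size q = 3, and the function name are made up for the example.

```python
from itertools import product

def count_zero_combinations(A, q):
    """Count beta in F_q^l with beta * A = 0 (mod q), by exhaustive search."""
    l, k = len(A), len(A[0])
    count = 0
    for beta in product(range(q), repeat=l):
        if all(sum(b * A[i][j] for i, b in enumerate(beta)) % q == 0 for j in range(k)):
            count += 1
    return count

q = 3
# Rows play the role of g_1, g_2, g_3; row 3 = row 1 + row 2 (mod 3), so rank(A) = 2.
A = [[1, 0, 2],
     [0, 1, 1],
     [1, 1, 0]]

solutions = count_zero_combinations(A, q)
# Fraction of uniformly random coefficient choices that give a zero combination.
failure_prob = solutions / q ** len(A)
```

Here the solutions are the multiples of (1, 1, 2), so there are 3 of them out of 27 choices, which is below the worst-case bound of $q^{l-1} = 9$.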

Results for RLC
• Time required for all nodes to get all messages using RLC with Pull:
• Time required for all nodes to get all messages using RLC with Push:

Proof Outline: RLC Pull Step I For i <= k/2 where, Let Can show

Continued… Then we have,

Step II: Constant rounds per piece
For i <= k/2
Further,
• First part is obvious.
• For the second part use

Continued… This gives: (Chernoff) Choose and

Continued… So finally: and so,

RLC Pull k>2i

Continued…
• Recall ,
• Probability that the dimension of u remains i after a round:

Continued… :event that node u fails to increase dimension in t Obviously This means that => For

Continued… Finally

Transition Probabilities where,

Time to absorption Also,

Which Finally Means… Thus, Constant rounds per piece

Results for RMS
• Time required for all nodes to get all messages using RMS with Pull:
• Time required for all nodes to get all messages using RMS with Push:

So far…
• Showed that RLC (O(n) rounds) does better than RMS in the settings given.
• These are asymptotic results.
• Now study more realistic scenarios by simulation.
• Compare BitTorrent, FEC and RLC.
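A small pull-gossip experiment along these lines can be sketched as below. It is a toy under several stated assumptions, not the simulator behind the slides: the field is GF(2) (q = 2) so code vectors fit in Python integers, only code vectors are tracked (payloads are omitted), message j starts at node j mod n, and all function names are hypothetical.

```python
import random

def insert_vector(basis, vec):
    """Add vec to an XOR basis (leading bit -> vector). Return True if the rank grew."""
    while vec:
        lead = vec.bit_length() - 1
        if lead not in basis:
            basis[lead] = vec
            return True
        vec ^= basis[lead]   # cancel the leading bit and keep reducing
    return False

def random_combination(basis, rng):
    """XOR a uniformly random subset of the basis: a random element of its span."""
    out = 0
    for v in basis.values():
        if rng.random() < 0.5:
            out ^= v
    return out

def rounds_to_finish(n, k, coded, seed=0):
    """Synchronous pull rounds on a fully connected network until every node has rank k."""
    rng = random.Random(seed)
    state = [dict() for _ in range(n)]
    for j in range(k):
        insert_vector(state[j % n], 1 << j)   # message j is the unit vector e_j
    rounds = 0
    while any(len(b) < k for b in state):
        snapshot = [dict(b) for b in state]   # senders use start-of-round contents
        for u in range(n):
            v = rng.choice([w for w in range(n) if w != u])
            if not snapshot[v]:
                continue
            if coded:   # RLC: random combination of everything v holds
                vec = random_combination(snapshot[v], rng)
            else:       # RMS: one uncoded message picked uniformly
                vec = rng.choice(list(snapshot[v].values()))
            insert_vector(state[u], vec)
        rounds += 1
    return rounds

rlc_rounds = rounds_to_finish(n=8, k=8, coded=True)
rms_rounds = rounds_to_finish(n=8, k=8, coded=False)
```

Averaging `rounds_to_finish` over many seeds and larger n and k gives a rough feel for the gap between the two schemes. A single run is noisy, and q = 2 is the least favourable field size for the 1 - 1/q bound.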

Well Connected Network

Clustered Network
• Two clusters of 100 nodes each.
• Good connectivity within clusters.
• Low connectivity between clusters.
• Similar to BitTorrent operation.

Clustered Network

Differential Capacities
• Some nodes with high capacity and some with low.
• Many slow nodes and some fast ones.

Differential Capacities

Dynamic Arrivals
• Nodes arrive and depart.
• 40 empty nodes arrive every 20 rounds.
• File size of 100 blocks.
• Nodes stay 10 rounds longer than they take to finish.

Dynamic Arrivals

Conclusion
• Assumption: peers are unaware of information possessed by other peers.
• Showed through theory and simulation that in such a setting random linear coding outperforms an uncoded scheme.
