Topological Properties Affect the Power of Network Coding in Decentralized Broadcast
Di Niu, Baochun Li • Presented by Binh Tran • 01/07/2010
Outline
• Motivation
  • What is the power of network coding?
  • Why study the impact of network topology on the power of network coding?
• Study the impact of topological dynamics on the power of network coding in gossip-based overlay broadcast
  • Theory: a theoretical lower bound on the delay of any gossip algorithm in complete graphs
  • Simulation: clustered and time-varying topologies
• Conclusion
Network coding for cost
• Cost of trees = 26
• Cost of network coding = 23
Motivation: why study the impact of network topology on the power of network coding?
• Some pioneering works have proved that NC
  • can achieve multicast capacity in directed networks
  • can speed up downloads by 2-3 times over random block selection (BitTorrent-like P2P content distribution)
• However, others have shown that the rarest-first algorithm of BitTorrent guarantees close-to-ideal diversity of blocks among peers, so using NC in such systems cannot be justified.
• CONFUSION arises from the lack of theoretical understanding of NC's benefit in P2P networks, which are better modeled by gossip-based overlay broadcast.
• Previous works show, in the time-synchronized model, that NC
  • achieves the optimal delay performance for any transmission schedule in P2P networks
  • achieves a shorter broadcast delay of k blocks in complete graphs
• On the other hand, some works propose a decentralized block exchange algorithm based on push and pull that has close-to-optimal performance.
Motivation (cont'd)
• QUESTIONS
  • Does randomized network coding achieve the optimal broadcast delay as a block selection protocol?
  • Even if network coding achieves the optimal delay, how much benefit can it bring over reasonably good non-coding protocols?
  • Are there any factors that critically affect the marginal benefit of network coding, so much so that the benefit is substantial only under certain circumstances?
• This paper proves
  • the optimality of NC in the continuous-time gossiping model
  • the marginal benefits of NC over reasonably good non-coding block selection policies
• It also claims that topological dynamics are a critical factor in the marginal benefits of NC in P2P networks:
  • clustering (traffic locality) topology
  • time-varying topology
Gossip-like algorithms
• Gossip algorithms conform to the following rules
• Each node, at its own rate,
  • randomly chooses one of its neighbors to serve, and
  • transmits one block, or a linear combination (over a Galois field) of the blocks it has obtained
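The gossip rules above can be sketched as a small event-driven simulation. This is a minimal sketch, not the authors' simulator: it assumes exponentially distributed gaps between uploads, instantaneous delivery, a static connected graph, and uncoded transmission of one random useful block. The names `simulate_gossip`, `neighbors`, `rate`, and `max_events` are hypothetical.

```python
import heapq
import random

def simulate_gossip(neighbors, rate, k, source, seed=0, max_events=1_000_000):
    """Return {node: time at which it first held all k blocks}.

    neighbors: dict node -> list of neighbors (static and connected, by assumption)
    rate:      dict node -> upload rate; gaps between uploads are exponential (assumption)
    """
    rng = random.Random(seed)
    blocks = {u: set() for u in neighbors}
    blocks[source] = set(range(k))
    finished = {source: 0.0}
    # One pending upload event per node, ordered by time.
    events = [(rng.expovariate(rate[u]), u) for u in neighbors]
    heapq.heapify(events)
    for _ in range(max_events):
        if len(finished) == len(neighbors):
            break
        now, u = heapq.heappop(events)
        heapq.heappush(events, (now + rng.expovariate(rate[u]), u))
        if not neighbors[u]:
            continue
        v = rng.choice(neighbors[u])           # random neighbor to serve
        useful = sorted(blocks[u] - blocks[v])
        if useful:
            blocks[v].add(rng.choice(useful))  # send one uncoded useful block
            if len(blocks[v]) == k and v not in finished:
                finished[v] = now
    return finished
```

The largest value in the returned dictionary is the broadcast delay of that run; replacing the block choice with a coded combination would turn the same loop into a network-coding variant.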
Problem formulation
• P2P network: a graph over the set of peer nodes, with an edge set that may change over time
• Each node i has an average upload bandwidth
• To accommodate random transmission delays, the time node i takes to transmit a block follows a certain distribution with a given mean
• An edge between two peers = a data connection between them
• A node maintains connections with a subset of all other peers = its neighborhood
• Inspired by gossip-based overlay broadcast systems, the task is to deliver k data blocks
• The broadcast delay: the time needed to disseminate all k blocks to all the nodes in the network
• The ε-broadcast delay: the time needed for a fraction 1 − ε of all the peers to finish downloading
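The two delay definitions can be made concrete with a short sketch. It assumes each peer's finish time is already known (for example from a simulation) and that "1 − ε of all the peers" is rounded up to a whole number of peers; the function names are hypothetical.

```python
import math

def broadcast_delay(finish_times):
    """Time at which the last peer holds all k blocks."""
    return max(finish_times)

def epsilon_broadcast_delay(finish_times, eps):
    """Time at which a fraction 1 - eps of the peers have finished.

    The required peer count is rounded up (an assumption); the inner
    round() only guards against floating-point noise.
    """
    ordered = sorted(finish_times)
    need = math.ceil(round((1 - eps) * len(ordered), 9))
    need = max(need, 1)
    return ordered[need - 1]
```

With eps = 0 the second function reduces to the plain broadcast delay.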
Continuous-time model
• A continuous-time trellis
• If a block is sent from node u at one time and received by node v at a later time, we introduce two vertices: one for u at the sending time and one for v at the receiving time
• A directed edge of capacity 1 runs from the sending vertex to the receiving vertex
• "Transmission edges" are determined by transmission schedules
• According to the well-known theorem on multicast in acyclic graphs, those nodes whose max-flow from the source is at least k can receive all k blocks
• Given a transmission schedule, the minimum possible time for a node v to receive all k blocks is the earliest time at which this condition holds for v
On the optimality of network coding
• Proposition 1: Randomized network coding achieves the minimum possible broadcast delay for any topology and any transmission schedule with high probability.
• Proposition 2: The authors derive a theoretical lower bound on the broadcast delay of any “gossip algorithm” in complete graphs.
Claims
• Show that both coding and non-coding protocols can achieve performance close to the theoretical limits in complete and random graphs.
• Performance of different algorithms
  • Random Useful Block (RUB): among the blocks needed by the target peer, the sender transmits a random block
  • Local Rarest First (LRF): among the blocks needed by the target peer, the sender transmits a random block with the smallest number of copies in the neighborhood
  • Global Rarest First (GRF): among the blocks needed by the target peer, the sender transmits a random block with the smallest number of copies in the network
  • Randomized Network Coding (NC): the sender linearly encodes all the coded blocks it has obtained using random coefficients in a Galois field GF and uploads the encoded block to the target peer
• These block selection and encoding algorithms are implemented with SSE2 SIMD vector instructions
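The three non-coding policies can be sketched as follows. This is an illustrative sketch, not the SSE2 implementation mentioned above: blocks are plain integer indices, and LRF and GRF share one function that differs only in which peers' holdings are passed in. All names are hypothetical.

```python
import random
from collections import Counter

def needed_blocks(sender_blocks, target_blocks):
    """Blocks the sender holds that the target still lacks."""
    return sorted(sender_blocks - target_blocks)

def random_useful_block(sender_blocks, target_blocks, rng):
    """RUB: a uniformly random block among those the target needs."""
    needed = needed_blocks(sender_blocks, target_blocks)
    return rng.choice(needed) if needed else None

def rarest_first(sender_blocks, target_blocks, holdings, rng):
    """LRF or GRF: a random needed block with the fewest copies in `holdings`.

    holdings: iterable of block sets. Pass the sender's neighborhood for
    LRF, or every peer in the network for GRF.
    """
    needed = needed_blocks(sender_blocks, target_blocks)
    if not needed:
        return None
    counts = Counter(b for held in holdings for b in held)
    fewest = min(counts[b] for b in needed)
    return rng.choice([b for b in needed if counts[b] == fewest])
```

For example, `rarest_first({0, 1, 2}, {0}, [{0, 1}, {0, 1}, {0}], random.Random(0))` picks block 2, the only needed block with no copies in the observed holdings.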
Performance of different algorithms (cont'd)
• Network coding is not necessarily needed to achieve close-to-optimal broadcast delay in complete and random graphs.
• This means the marginal benefit of NC is small in these graphs.
Clustered and time-varying topologies
• Network model:
  • A graph of size N = mn, made up of m clusters of peers
  • Each peer p in a cluster also maintains global links
  • The global links from peer p change periodically with a given cycle length
  • Each peer uploads to a random global neighbor at the points of a Poisson process
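The clustered model can be sketched under explicit assumptions that the slide does not state: clusters have equal size n, peers are numbered so that cluster c holds peers c·n to (c+1)·n − 1, every global link points outside the peer's own cluster, and each peer keeps g global links that are redrawn from scratch once per cycle. The parameter `g` and both function names are hypothetical.

```python
import random

def cluster_of(peer, n):
    """Index of the cluster a peer belongs to (equal-size clusters assumed)."""
    return peer // n

def sample_global_links(peer, m, n, g, rng):
    """Draw g global neighbors for `peer` from the other clusters.

    Calling this again at the start of each cycle models the periodically
    changing global links (an assumption about how links are refreshed).
    """
    own = cluster_of(peer, n)
    outside = [q for q in range(m * n) if cluster_of(q, n) != own]
    return rng.sample(outside, min(g, len(outside)))
```

A simulation would call `sample_global_links` for every peer once per cycle and let each peer upload to a random entry of the result at its Poisson upload instants.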
Clustered and time-varying topologies
• Proposition 3
• Implication: NC automatically makes better choices of blocks when transmitting across clusters.
Experimental studies: broadcast delay
• The performance of NC, GRF, and RUB is not affected by the sparsity.
• Varying the topology parameter hardly affects the performance of RUB, NC, and GRF, with RUB being constantly inferior.
Experimental studies: broadcast delay (cont'd)
• The benefit of NC increases dramatically only when the varied parameter is <= 1.
• The benefit of NC starts to drop again if that parameter is too small.
Guidelines for P2P topology when network coding is used
• The work studies the problem of broadcasting multiple data blocks in networks of certain topologies using gossip-like algorithms, focusing on the benefit of randomized network coding.
• Network coding achieves the optimal delay in any topology, but some non-coding protocols can achieve performance very close to the theoretical limits in complete and random graphs.
• Clustering and time-varying topologies are two key factors that boost the benefit of network coding.
• In clustered graphs, randomized NC behaves as if it had the global knowledge to make optimal decisions.
• Under topological dynamics, NC is resilient to the traffic locality mechanisms that are common in P2P applications, and can take the best advantage of path diversity, etc.
• Still needed: a theoretical understanding of the complex behavior of network coding, as compared to other gossiping algorithms, in different kinds of random graphs.
Outline
• Network coding: the random network coding method
• The effects of randomness and network sparsity
• How redundancy is introduced by network coding
• Two examples and empirical studies
• Discussion of the types of topologies that optimize system performance should network coding be applied
Claim
• Previous work
  • In a P2P topology using randomized network coding, peers receive linearly independent coded blocks (innovative/new blocks) with very high probability.
  • Note: this holds provided that all coding is performed at source/intermediate nodes after complete decoding to recover the original blocks.
• However, this paper shows
  • When peers code outgoing blocks before they fully decode and recover the original blocks in a realistic P2P topology, peers receive linearly dependent, non-innovative blocks (redundant/old blocks), which lowers their efficiency because these redundant blocks consume bandwidth.
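Whether a received coded block is innovative comes down to a rank test on its coefficient vector. The sketch below illustrates that test over GF(2) for brevity, which is an assumption: the slides do not state the field size. Coefficient vectors are stored as integer bitmasks, and the names `reduce_vector` and `receive` are hypothetical.

```python
def reduce_vector(vec, basis):
    """Reduce a GF(2) coefficient vector against an echelon basis.

    vec:   int bitmask, bit i set means original block i is included
    basis: dict mapping leading-bit position -> basis vector
    """
    while vec:
        lead = vec.bit_length() - 1
        if lead not in basis:
            return vec
        vec ^= basis[lead]  # XOR is addition in GF(2); clears the leading bit
    return 0

def receive(vec, basis):
    """Return True and extend the basis if the block is innovative."""
    rest = reduce_vector(vec, basis)
    if rest == 0:
        return False        # linearly dependent: a redundant block
    basis[rest.bit_length() - 1] = rest
    return True
```

A peer can decode once its basis holds n vectors; every call that returns False counts as a redundant block that still consumed bandwidth.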
Study model
• A collection of N peers, self-organized into a P2P topology with application-layer links
• One peer = server = source of content distribution
• The original content on the source is segmented into n original blocks [b1, b2, …, bn], where each bi has a fixed size of k bytes
• Assume that a fraction ps of the peers serve as direct downstream peers of the server
• The server sends coded blocks to these direct downstream peers with a period ts
• Upon receiving new coded blocks, a peer produces new coded blocks for its downstream peers in the topology
Randomized Network Coding
• Source: n original blocks [B1, B2, …, Bn], where each Bi has a fixed size of k bits
• At the time of encoding for downstream peer p, a peer independently and randomly chooses m blocks and m coding coefficients in a Galois field
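The encoding step can be sketched as below. The field size is not given on the slide, so this sketch assumes GF(2^8) with the reduction polynomial 0x11B; it also assumes the peer holds at least one block and that each held block carries its coefficient vector over the n original blocks. The names `gf_mul` and `encode` are hypothetical.

```python
import random

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (assumed field)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def encode(held, m, rng):
    """Combine up to m randomly chosen held blocks with random nonzero coefficients.

    held: non-empty list of (coeffs, payload) pairs, where coeffs is a list
          of n field elements and payload is a bytes object
    """
    chosen = rng.sample(held, min(m, len(held)))
    out_coeffs = [0] * len(chosen[0][0])
    out_payload = bytearray(len(chosen[0][1]))
    for coeffs, payload in chosen:
        c = rng.randrange(1, 256)
        for i, x in enumerate(coeffs):
            out_coeffs[i] ^= gf_mul(c, x)   # addition in GF(2^8) is XOR
        for i, x in enumerate(payload):
            out_payload[i] ^= gf_mul(c, x)
    return out_coeffs, bytes(out_payload)
```

Because the same coefficients are applied to the coefficient vectors and to the payloads, the output stays a valid linear combination of the original blocks even when the inputs are themselves coded.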
Example 1: Smaller topology
• Claim: peers may easily receive linearly dependent (non-innovative) blocks when aggressiveness a < 1
• Aggressiveness a: a peer produces a new coded block upon receiving a × n coded blocks. It is used to adjust delay.
• A smaller a leads to a shorter waiting time and a shorter delay in the process of content distribution.
• a = 1: peers wait for n coded blocks to arrive before producing coded blocks.
• Assume a = 1/n, where n is the number of original blocks.
• Network coding then leads to linearly dependent (redundant) blocks.
Example 2: Larger topology
• A random topology of a kind often used in P2P networks today
• Each peer is labeled with:
  • a unique identifier
  • a pair {independent, dependent}: the numbers of independent and dependent blocks received once every peer has received the n = 3 coded blocks needed to decode the desired data
Observations
• Redundancy in network coding
  • is introduced by the stochastic nature of overlay link delays
  • depends heavily on the topology itself
• Question: is a sparse topology any better?
• Answer: MAY NOT BE THE CASE. Why?
  • A dense topology leads to additional redundancy, but also helps disseminate innovative blocks rapidly across the topology.
  • In a topology that is too sparse, coded blocks may not be able to travel effectively through the topology, which causes redundancy within small clusters of peers.
• Question: what constitutes a “GOOD” topology that minimizes the redundancy introduced by network coding?
• Redundancy critically depends on sparsity and randomness.
Topology effects on the efficiency of network coding: sparsity and randomness
• Sparsity: quantitatively represented by the average number of neighbors that peers have (and also by the number of peers in the network).
• Randomness: quantitatively characterized by the rewiring probability of a small-world topology.
• We study the topology effects at various levels of sparsity and randomness via empirical evaluation of redundancy, block distribution time, and server cost.
• Block redundancy = # of coded blocks a peer receives / # needed to successfully decode the segment
• Distribution time = the time interval from the initial forwarding of a block from the server to any of its downstream peers until all peers in the network have successfully received n independent coded blocks
• Server cost = # of blocks forwarded from the server to any of its downstream peers. Note that the server stops sending when all of its downstream peers have received n independent blocks.
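The three metrics can be computed from simple per-peer records, as in this sketch. The record layout (`PeerLog`), the function names, and the idea that the server's sends are kept as a list of timestamps are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class PeerLog:
    received: int       # coded blocks received, innovative or not
    finished_at: float  # time the peer held n independent coded blocks

def block_redundancy(log, n):
    """Blocks received divided by the n blocks needed to decode (1.0 = no waste)."""
    return log.received / n

def distribution_time(logs, first_send_time):
    """Interval from the server's first forwarded block until the last peer finishes."""
    return max(log.finished_at for log in logs) - first_send_time

def server_cost(server_send_times):
    """Number of blocks the server forwarded to its downstream peers."""
    return len(server_send_times)
```

A redundancy of 1.25, for instance, means a peer received 25% more coded blocks than it needed to decode the segment.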
Small-world topology with some rewired links
• We vary the randomness of the network topology by adjusting the rewiring probability in small-world topologies.
• Small-world topology: organize the peers into a ring, connect each peer to d local neighbors, then rewire each link to a random peer in the network with probability p.
• p = 0: regular graph, where each peer has the same number of upstream and downstream neighbors
• p = 1: totally random graph, where each peer chooses each of its d downstream links uniformly at random
• Regular graph: long paths, significant clustering
• Random graph: lower clustering
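The construction can be sketched as follows. It is a simplified reading of the slide and makes several assumptions: links are directed downstream links, the d local neighbors are the next d peers on the ring, a rewired link never points to the peer itself or duplicates one of its existing links, and d is smaller than the number of peers. The function name `small_world` is hypothetical.

```python
import random

def small_world(num_peers, d, p, rng):
    """Return {peer: list of d downstream neighbors} for a rewired ring."""
    links = {}
    for i in range(num_peers):
        # Regular ring: the next d peers clockwise (assumed meaning of "local").
        targets = [(i + j) % num_peers for j in range(1, d + 1)]
        for idx in range(d):
            if rng.random() < p:
                choices = [v for v in range(num_peers)
                           if v != i and v not in targets]
                if choices:
                    targets[idx] = rng.choice(choices)
        links[i] = targets
    return links
```

With p = 0 the result is the regular ring, and with p = 1 every link is redrawn at random, matching the two extremes listed above.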
Evaluations
• 100 peers using segments of 100 data blocks
• Each peer forwards a coded block constructed from m = 6 blocks
• Aggressiveness a = 0.004
• Server connectivity ps = 0.15
• Link delays follow a uniform random distribution in [0.75t, 1.25t]
Effect of randomness: Performance experienced in a network with different levels of randomness and sparsity.
The choice of Ps affects redundancy and server cost
• The fraction of peers connected to the server, Ps, has a direct impact on the server cost.
Impact of sparsity (network size + rewiring probability)
• Regardless of the number of peers, 6 neighbors show the best performance.
Server cost increases approximately linearly with N, and much better performance is observed for small p as N scales up.