On the Optimal Scheduling for Media Streaming in Data-driven Overlay Networks
Meng ZHANG with Yongqiang XIONG, Qian ZHANG, Shiqiang YANG
Globecom 2006
Outline
• Background
• Related Work
• Problem Statement and Formulation
• Global Optimal Solution
• Distributed Algorithm
• Performance Evaluation
• Conclusion & Future Work
Background
• The Internet has seen rapid growth in the deployment of data-driven (swarming-based) overlay/peer-to-peer IPTV systems in recent years.
• These products are based on a data-driven protocol.
• Reported concurrent online users:
  • GridMedia (developed by our lab): over 230,000 at 310 kbps, achieved with one server
  • PPLive: 500,000 at 300-500 kbps
  • QQLive: 1,460,000 at 300-500 kbps (not one server)
Background - Data-Driven Protocol Review
• Aims to enable large-scale live broadcasting in the Internet environment
• Very simple, and very similar to the BitTorrent protocol
• Two steps in the data-driven protocol:
  • Overlay construction
  • Block scheduling
Background - Data-Driven Protocol Review
• The first step: overlay construction
  • All the nodes self-organize into a random graph
• The second step: block scheduling
  • The stream is divided into blocks
  • Each node has a sliding window containing all the blocks it is currently interested in
[Figure: neighboring nodes announce the blocks they hold (e.g. "I have block 1,2,3"), request blocks they are missing (e.g. "Request block 3"), and receive them (e.g. "Send block 3").]
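The exchange in the block scheduling step can be sketched roughly as follows. This is a minimal illustration, not the protocol of any of the systems named in these slides; the function names, the neighbor labels "A" and "B", and the representation of announcements as sets of block numbers are all assumptions made for the example.

```python
from typing import Dict, List, Set


def missing_blocks(window: range, have: Set[int]) -> Set[int]:
    """Blocks inside the sliding window that this node does not hold yet."""
    return {b for b in window if b not in have}


def candidate_suppliers(wanted: Set[int],
                        announcements: Dict[str, Set[int]]) -> Dict[int, List[str]]:
    """For each wanted block, list the neighbors that announced holding it."""
    return {
        b: sorted(n for n, blocks in announcements.items() if b in blocks)
        for b in sorted(wanted)
    }


# Hypothetical state: this node holds blocks 1 and 2 of a window covering 1..4.
window = range(1, 5)
have = {1, 2}
announcements = {"A": {1, 2, 3}, "B": {1, 2, 4}}  # "I have block ..." messages

wanted = missing_blocks(window, have)
suppliers = candidate_suppliers(wanted, announcements)
print(suppliers)  # {3: ['A'], 4: ['B']}
```

Choosing which supplier to ask for which block, when several neighbors hold the same block, is the scheduling question the rest of the talk addresses.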
Related Work
• To improve the data-driven protocol, most recent efforts focus on optimizing overlay construction (i.e. the first step):
  • Vishnumurthy & Francis (INFOCOM 2006): random graph building under a heterogeneous overlay
  • Liang & Nahrstedt (INFOCOM 2006): RandPeer, a peer-to-peer QoS-sensitive membership management protocol
Related Work
• A problem not well addressed is how to optimize the second step, that is, how to do optimal block scheduling and maximize the throughput of the data-driven protocol under a constructed overlay
• Most existing methods are straightforward and ad hoc:
  • Chainsaw: purely random
  • DONet: greedy local rarest-first
  • PALS: round-robin
Problem Statement and Formulation
[Figure: example overlay comparing two schedules. Labels: "Local Rarest First (LRF) strategy, throughput is 4"; "Optimal scheduling, throughput gain is 25%"; "Some requests congest at node 1".]
• How can scheduling be optimized to maximize the throughput of the whole overlay?
• The real situation is more complicated because different blocks may have different importance and the bottlenecks are not only at the last mile.
• Our basic approach:
  • Assign each block a priority according to its importance
  • Maximize the sum of priorities of all requested blocks
Problem Statement and Formulation - Priority Definition
• We use two factors to represent the significance of a block:
  • rarity factor
  • emergency factor
• We define the priority of block $j \in A_i$ for node $i \in R$ as follows:
  • $P_{ji} = \beta\, P_R\!\left(\sum_{k \in Nbr(i)} h_{kj}\right) + (1-\beta)\, P_E\!\left(C_i + W_T - d_{ji}\right)$
  • where $0 \le \beta \le 1$, and the functions $P_R(\cdot)$ (rarity factor) and $P_E(\cdot)$ (emergency factor) are both monotonically non-increasing
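As a rough illustration of the priority definition, the sketch below evaluates the weighted sum for one block. The slides only require the rarity and emergency functions to be monotonically non-increasing; the two reciprocal forms used here are hypothetical stand-ins chosen for the example, as are the function and argument names.

```python
def p_rarity(holder_count: float) -> float:
    """Hypothetical rarity function: non-increasing in the number of holders."""
    return 1.0 / max(holder_count, 1.0)


def p_emergency(e_arg: float) -> float:
    """Hypothetical emergency function: non-increasing in its argument."""
    return 1.0 / max(e_arg, 1.0)


def block_priority(beta: float, holder_count: float, e_arg: float) -> float:
    """Weighted sum of the rarity and emergency factors.

    holder_count stands for the sum of h_kj over the neighbors k of node i.
    e_arg stands for the quantity C_i + W_T - d_ji from the slide.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * p_rarity(holder_count) + (1.0 - beta) * p_emergency(e_arg)


# Example: two neighbors hold the block, emergency argument is 4, beta = 0.5.
print(block_priority(0.5, 2, 4))  # 0.375
```

Setting beta to 1 ranks blocks by rarity alone, and setting it to 0 ranks them by the emergency factor alone.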
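Problem Statement and Formulation - Formulation
• Decision variable: $x_{kji} \in \{0, 1\}$, indicating whether node i requests block j from neighbor k
• Global block scheduling problem: maximize the sum of priorities of all requested blocks
• s.t. [constraint equations not preserved in the slide text]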
Global Optimal Solution
• Convert the global block scheduling formulation into an equivalent min-cost flow problem
Global Optimal Algorithm
• Proposition:
  • The optimal objective value of the global block scheduling problem has the same absolute value as the minimum cost of its corresponding min-cost network flow problem. The flow on arc $(v^{in}_{k}, v^{b}_{ij})$, which is in $\{0, 1\}$, is exactly the value of $x_{kji}$, the solution to the optimal block scheduling.
• Algorithm complexity:
  • $O(nm \log\log U \, \log(nC))$, where n and m are the numbers of vertices and arcs, and U and C are the largest magnitudes of arc capacity and arc cost
Distributed Algorithm
• We first use a simple method to estimate, from historical information, the bandwidth available from each neighbor.
  • $q_{ki}(m)$: the total number of blocks that arrived at node i from neighbor k in the m-th period
  • $W_{ki}(m+1)$: the estimated bandwidth from node k to node i
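The estimation formula itself is not preserved in this text, so the sketch below is only one plausible reading: it assumes the estimate for the next period is the plain average of the block counts observed over a fixed number of recent periods. The class name, the window length parameter, and the averaging rule are all assumptions, not the authors' equation.

```python
from collections import deque


class BandwidthEstimator:
    """Moving-average estimate of blocks per period from one neighbor.

    Assumption: W_ki(m+1) is taken as the mean of the most recent
    q_ki values; the original slides may use a different rule.
    """

    def __init__(self, periods: int) -> None:
        if periods < 1:
            raise ValueError("periods must be at least 1")
        # deque with maxlen drops the oldest sample automatically.
        self.history = deque(maxlen=periods)

    def record(self, blocks_received: int) -> None:
        """Store q_ki(m), the blocks received from the neighbor this period."""
        self.history.append(blocks_received)

    def estimate(self) -> float:
        """Return the estimate for the next period (0.0 with no history)."""
        if not self.history:
            return 0.0
        return sum(self.history) / len(self.history)


estimator = BandwidthEstimator(periods=3)
for q in (4, 6, 8, 10):
    estimator.record(q)
print(estimator.estimate())  # 8.0, the mean of the last three samples 6, 8, 10
```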
Distributed Algorithm
• With the estimated available bandwidth, local block scheduling is performed on each node
• It can also be transformed into an equivalent min-cost network flow problem to obtain the locally optimal requests
Distributed Algorithm
• Heuristic distributed algorithm:
  • Node i estimates the bandwidth $W_{ki}(m+1)$ that its neighbor k can allocate to it in the (m+1)-th period from the traffic received from that neighbor in the previous M periods, as shown in equation (3);
  • Based on $W_{ki}(m+1)$, node i performs the local block scheduling (2) using the min-cost network flow model. The results $x_{kji} \in \{0, 1\}$ indicate whether node i should request block j from neighbor k;
  • Node i sends the resulting requests to its neighbors.
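The local scheduling step can be sketched as a small min-cost flow computation. The graph used here is an assumption made for illustration, since the authors' construction is not preserved in this text: a source feeds each neighbor with capacity equal to its estimated bandwidth in blocks per period, each neighbor connects to the blocks it holds, and each block connects to a sink with cost equal to minus its priority. The solver is a plain successive-shortest-path method with Bellman-Ford, not the faster algorithm whose complexity is quoted earlier.

```python
from typing import Dict, List, Set, Tuple


def schedule_requests(bandwidth: Dict[str, int],
                      holders: Dict[int, Set[str]],
                      priority: Dict[int, float]) -> Dict[int, str]:
    """Pick {block: neighbor} requests that maximize total priority."""
    SRC, SINK = 0, 1
    nbr_id = {k: 2 + i for i, k in enumerate(sorted(bandwidth))}
    blk_id = {j: 2 + len(nbr_id) + i for i, j in enumerate(sorted(holders))}
    n = 2 + len(nbr_id) + len(blk_id)
    arcs: List[list] = []  # each arc is [head, residual capacity, cost]
    adj: List[List[int]] = [[] for _ in range(n)]

    def add_arc(u: int, v: int, cap: int, cost: float) -> int:
        # Arc e and arc e ^ 1 form a forward/reverse residual pair.
        adj[u].append(len(arcs))
        arcs.append([v, cap, cost])
        adj[v].append(len(arcs))
        arcs.append([u, 0, -cost])
        return len(arcs) - 2

    for k, w in bandwidth.items():
        add_arc(SRC, nbr_id[k], w, 0.0)
    choice_arcs: List[Tuple[int, str, int]] = []
    for j in sorted(holders):
        add_arc(blk_id[j], SINK, 1, -priority[j])
        for k in sorted(holders[j]):
            if k in nbr_id:
                e = add_arc(nbr_id[k], blk_id[j], 1, 0.0)
                choice_arcs.append((e, k, j))

    while True:
        # Bellman-Ford: cheapest residual path from source to sink.
        dist = [float("inf")] * n
        prev = [-1] * n
        dist[SRC] = 0.0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == float("inf"):
                    continue
                for e in adj[u]:
                    v, cap, cost = arcs[e]
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v], prev[v], changed = dist[u] + cost, e, True
            if not changed:
                break
        if prev[SINK] == -1 or dist[SINK] >= 0:
            break  # no remaining path increases the total priority
        v = SINK
        while v != SRC:  # push one unit of flow back along the path
            e = prev[v]
            arcs[e][1] -= 1
            arcs[e ^ 1][1] += 1
            v = arcs[e ^ 1][0]

    # A saturated neighbor -> block arc means "request block j from neighbor k".
    return {j: k for e, k, j in choice_arcs if arcs[e][1] == 0}


# Hypothetical instance: each neighbor can deliver one block this period.
requests = schedule_requests(
    bandwidth={"A": 1, "B": 1},
    holders={1: {"A", "B"}, 2: {"A"}, 3: {"A"}},
    priority={1: 0.9, 2: 0.5, 3: 0.2},
)
print(requests)  # {1: 'B', 2: 'A'}
```

In this instance a greedy rule that asks neighbor A for block 1 would leave block 2 unserved, whereas the flow solution routes block 1 through B and keeps A free for block 2. Capacities are integers here, so a fractional bandwidth estimate would need to be rounded down before use.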
Performance Evaluation - Compared Scheduling Methods
• Random strategy: each node assigns each desired block randomly to a neighbor that holds that block. Chainsaw uses this simple strategy.
• Local Rarest First (LRF) strategy: the block with the fewest owners among the neighbors is requested first. DONet adopts this strategy.
• Round Robin (RR) strategy: all the desired blocks are assigned to neighbors in a prescribed order, in a round-robin way. If there are multiple available senders for a block, it is assigned to the sender with the maximum surplus available bandwidth.
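The first two baselines can be sketched in a few lines. This is an illustrative reading of the descriptions above, not code from Chainsaw or DONet; the function names, the tie-break by block number in the LRF ordering, and the example holder map are assumptions.

```python
import random
from typing import Dict, List, Set


def random_strategy(holders: Dict[int, Set[str]],
                    rng: random.Random) -> Dict[int, str]:
    """Assign each desired block to a random neighbor that holds it."""
    return {j: rng.choice(sorted(ks)) for j, ks in holders.items() if ks}


def lrf_order(holders: Dict[int, Set[str]]) -> List[int]:
    """Order blocks by how few neighbors hold them (rarest first).

    Ties are broken by block number, which is an assumption of this sketch.
    """
    return sorted(holders, key=lambda j: (len(holders[j]), j))


# Hypothetical view of one node: which neighbors hold each desired block.
holders = {1: {"A", "B"}, 2: {"A"}, 3: {"A", "B", "C"}}

assignment = random_strategy(holders, random.Random(0))
order = lrf_order(holders)
print(order)  # [2, 1, 3]
```

Neither baseline looks at neighbor bandwidth when choosing a sender, which is the gap the min-cost flow scheduling is meant to close.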
Simulation Configuration
• For a fair comparison, all the experiments use the same simple algorithm for overlay construction
• Metric: delivery ratio, the number of blocks that arrive at each node before the playback deadline divided by the total number of blocks encoded
• DSL nodes:
  • Download bandwidth: 40% at 512 Kbps, 30% at 1 Mbps, 30% at 2 Mbps
  • Upload bandwidth: half of the download bandwidth
• 500 nodes
• Each node has 15 neighbors
• Request period: 2 seconds
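The delivery ratio for a single node can be computed as in the sketch below. The dictionaries of arrival times and deadlines, and the choice to count a block that arrives exactly at its deadline as on time, are assumptions of this example.

```python
from typing import Dict


def delivery_ratio(arrival_time: Dict[int, float],
                   deadline: Dict[int, float]) -> float:
    """Fraction of encoded blocks that arrived by their playback deadline.

    deadline has one entry per encoded block; blocks that never arrived
    are simply absent from arrival_time.
    """
    if not deadline:
        raise ValueError("no blocks were encoded")
    on_time = sum(
        1 for j, d in deadline.items()
        if j in arrival_time and arrival_time[j] <= d
    )
    return on_time / len(deadline)


# Four encoded blocks: block 2 arrives late and block 4 never arrives.
deadline = {1: 10.0, 2: 12.0, 3: 14.0, 4: 16.0}
arrival_time = {1: 9.5, 2: 12.5, 3: 13.0}
ratio = delivery_ratio(arrival_time, deadline)
print(ratio)  # 0.5
```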
Simulation Results
• All nodes are DSL nodes, with an exchanging window of 10 seconds and bottlenecks only at the last mile. Group size is 500.
Simulation Results
• All nodes are DSL users, with an exchanging window of 10 seconds and end-to-end available bandwidth of 10-150 Kbps. Group size is 500.
Conclusion & Future Work
• The contributions of this paper are twofold.
  • First, to the best of our knowledge, we are the first to theoretically address the streaming scheduling problem in data-driven (swarming-based) streaming protocols.
  • Second, we give the optimal scheduling algorithm under different bandwidth constraints, as well as a distributed asynchronous algorithm that can be practically applied in real systems and outperforms existing methods by about 10% to 80%.
• Future work
  • How to optimize over a horizon of several periods, taking into account the inter-dependence between the periods.
  • How to do optimal scheduling with scalable video coding (such as layered video coding) or multiple description coding.
Thanks Q&A