P2P streaming with LT codes: a prototype experimentation
Andrea Magnetto, Rossano Gaeta, Marco Grangetto, Matteo Sereno
Computer Science Department, University of Torino, Italy
AVSTP2P (ACM Workshop on Advanced Video Streaming Techniques for Peer-to-peer Networks and Social Networking), October 29, 2010
Outline • Motivations • Introduction to rateless codes • LT codes • ToroVerde architecture • Overlay layer • Content distribution layer • Video buffer layer • Experimental results • Conclusions and future work
Motivations • P2P + video streaming + rateless codes • Real-time multicast over a P2P network is a promising application • Open issues: startup delay, latency, security, ISP friendliness • Coding techniques can improve the efficiency of content distribution • M. Wang, B. Li, “R2: Random push with random network coding in live P2P streaming”, IEEE JSAC, Dec. 2007. • C. Wu, B. Li, “rStream: resilient and optimal P2P streaming with rateless codes”, IEEE Trans. on Parallel and Distributed Systems, Jan. 2008.
Motivations • Rateless codes • LT codes, Raptor codes • Packet level coding by simple XOR • Random coded packets are generated on the fly (rateless) • Optimal (asymptotically zero overhead) • Low complexity encoder/decoder • M. Luby, “LT codes”, IEEE Symposium on Foundations of Computer Science, 2002.
Outline • Motivations • Introduction to rateless codes • LT codes • ToroVerde architecture • Overlay layer • Content distribution layer • Video buffer layer • Experimental results • Conclusions and future work
LT Codes • Luby Transform (LT) codes • Encoding process: • For the i-th encoded packet, select a degree di from a carefully chosen degree distribution (the Robust Soliton distribution). • Choose di source packets uniformly at random. • XOR the chosen packets together. • Decoding process: • Decode degree-one encoded packets. • Remove degree-one edges iteratively. [Figure: bipartite encoding graph with source packets x1…x6 and encoded packets y1 = x2, y2 = x4, y3 = x1⊕x3, y4 = x2⊕x5, y5 = x3⊕x5⊕x6]
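To make the encoding and peeling decoding concrete, here is a minimal, illustrative Python sketch; it is not the prototype's implementation, and it takes a generic degree distribution from the caller rather than the Robust Soliton distribution.

```python
import random

def lt_encode(source_packets, degree_dist, rng=random):
    """Build one LT-coded packet: draw a degree d from the degree
    distribution, pick d source packets at random, XOR them together."""
    d = rng.choices(range(1, len(degree_dist) + 1), weights=degree_dist)[0]
    idx = rng.sample(range(len(source_packets)), d)
    payload = bytes(source_packets[idx[0]])
    for i in idx[1:]:
        payload = bytes(a ^ b for a, b in zip(payload, source_packets[i]))
    return set(idx), payload

def lt_decode(coded_packets, k):
    """Peeling decoder: resolve degree-one packets, XOR the recovered
    symbols out of the remaining packets, and repeat."""
    coded = [(set(idx), bytearray(p)) for idx, p in coded_packets]
    recovered = {}
    progress = True
    while progress and len(recovered) < k:
        progress = False
        # 1) every degree-one packet directly reveals a source packet
        for idx, payload in coded:
            if len(idx) == 1:
                i = next(iter(idx))
                if i not in recovered:
                    recovered[i] = bytes(payload)
                    progress = True
        # 2) remove recovered source packets from higher-degree packets
        for idx, payload in coded:
            for i in list(idx):
                if i in recovered and len(idx) > 1:
                    for j, b in enumerate(recovered[i]):
                        payload[j] ^= b
                    idx.discard(i)
    return recovered if len(recovered) == k else None
```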
LT codes applications • Data dissemination with rateless codes is well suited to broadcasting (3GPP MBMS, DVB-H) • Application to overlay networks • is not straightforward • has connections with network coding
LT codes & overlay networks • LT codes for overlay networks • Simple push approach for data distribution • Does not require content reconciliation • Robustness to peer churning and losses • Additional complexity & memory requirements • Coding may introduce additional delay • Starting from our previous work [4], this paper presents the development and evaluation of a live P2P streaming application based on LT codes • [4] M. Grangetto, R. Gaeta, and M. Sereno. Rateless codes network coding for simple and efficient P2P video streaming. In IEEE ICME, 2009.
Outline • Motivations • Introduction to rateless codes • LT codes • ToroVerde architecture • Overlay layer • Content distribution layer • Video buffer layer • Experimental results • Conclusions and future work
ToroVerde architecture • Video is organized in a sequence of chunks • 1 chunk = k packets • k = 120, packet size = 1328 bytes • independently decodable video units • composed of the video packets corresponding to a GOP encoded by the AVC/H.264 encoder • each chunk is coded using LT codes
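As an illustration of the chunk layout above, a hypothetical helper that packs one GOP into a chunk of k fixed-size source packets; the function name and the zero-padding policy are assumptions, not taken from the prototype.

```python
K = 120              # source packets per chunk (from the slide)
PACKET_SIZE = 1328   # bytes per packet (from the slide)

def gop_to_chunk(gop_bytes: bytes) -> list:
    """Split one AVC/H.264 GOP into K fixed-size source packets.
    Zero-padding the tail is an assumption made for illustration;
    the real packetization in the prototype may differ."""
    packets = []
    for offset in range(0, K * PACKET_SIZE, PACKET_SIZE):
        piece = gop_bytes[offset:offset + PACKET_SIZE]
        packets.append(piece.ljust(PACKET_SIZE, b"\0"))
    return packets
```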
ToroVerde architecture • LT codes increase tolerance to packet loss • UDP based • TFRC [3] used to control sending rates • LT codes introduce delay (a chunk can be re-encoded only after it has been completely decoded) • Early coded packet relaying (lesson learned from our previous work [4]) • To avoid loops of duplicated coded packets • coded packets are relayed only once • a Bloom filter [2] (32 bits, 3 hash functions, carried in the header of each coded packet) stores the path already followed [2] A. Broder and M. Mitzenmacher. Network applications of bloom filters: A survey. Internet Mathematics, 1(4):485–509, 2005. [3] S. Floyd, M. Handley, J. Padhye, and J. Widmer. RFC 5348, TCP Friendly Rate Control (TFRC): Protocol Specification, Sep 2008.
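A minimal sketch of how the 32-bit, 3-hash Bloom filter in the packet header can be used to avoid relaying a coded packet back along its path; the hash construction and the peer-ID format are assumptions, only the filter size and hash count come from the slide.

```python
import hashlib

FILTER_BITS = 32   # Bloom filter size carried in each coded packet header
NUM_HASHES = 3     # number of hash functions (both values from the slide)

def _bit_positions(peer_id: bytes):
    """Derive NUM_HASHES bit positions from a peer identifier
    (SHA-1 is an arbitrary choice for this sketch)."""
    digest = hashlib.sha1(peer_id).digest()
    return [digest[i] % FILTER_BITS for i in range(NUM_HASHES)]

def add_peer(bloom: int, peer_id: bytes) -> int:
    """Stamp this peer into the packet's path filter."""
    for pos in _bit_positions(peer_id):
        bloom |= 1 << pos
    return bloom

def maybe_seen(bloom: int, peer_id: bytes) -> bool:
    """True if the packet may already have traversed peer_id
    (false positives possible, false negatives impossible)."""
    return all(bloom & (1 << pos) for pos in _bit_positions(peer_id))

# Before relaying a coded packet, a peer skips neighbors for which
# maybe_seen() is True and records itself with add_peer(), so the
# packet is not pushed back along the path it already followed.
```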
Outline • Motivations • Introduction to rateless codes • LT codes • ToroVerde architecture • Overlay layer • Content distribution layer • Video buffer layer • Experimental results • Conclusions and future work
ToroVerde architecture : overlay layer • goal: to form the logical network • the cast: tracker, video server, peer • the part played by the tracker • store the addresses of the peers • respond to peers' requests • the part played by the video server • accept connection requests up to a maximum number • deliver packets of LT coded chunks to its neighbors • the part played by the peer • contact the tracker • get a random subset of peers • send a contact message to all peers returned by the tracker • send keep-alive messages to neighbors and tracker
ToroVerde architecture : overlay layer • a peer is removed from the neighbor list when • it does not send keep-alive messages for some time, or • it sends an explicit quit message • above the maximum number of neighbors • refuse additional connection requests • below the minimum number of neighbors • contact the tracker for more neighbors
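A sketch of the neighbor bookkeeping described on the last two slides; the timeout and neighbor limits are illustrative values (the slides only state that such limits exist), and the tracker interface is invented for the example.

```python
import time

MAX_NEIGHBORS = 30        # illustrative; the slides only say a maximum exists
MIN_NEIGHBORS = 5         # illustrative
KEEPALIVE_TIMEOUT = 10.0  # seconds without keep-alives, illustrative

class OverlayLayer:
    def __init__(self, tracker):
        self.tracker = tracker   # assumed to expose random_peers()
        self.neighbors = {}      # peer_id -> time of last keep-alive

    def on_keepalive(self, peer_id):
        self.neighbors[peer_id] = time.time()

    def on_quit(self, peer_id):
        # explicit quit message: drop the peer immediately
        self.neighbors.pop(peer_id, None)

    def on_contact(self, peer_id):
        # refuse additional connection requests above the maximum
        if len(self.neighbors) >= MAX_NEIGHBORS:
            return False
        self.neighbors[peer_id] = time.time()
        return True

    def periodic_maintenance(self):
        now = time.time()
        # drop peers that stopped sending keep-alives
        for p in [p for p, t in self.neighbors.items() if now - t > KEEPALIVE_TIMEOUT]:
            del self.neighbors[p]
        # ask the tracker for more peers when below the minimum
        if len(self.neighbors) < MIN_NEIGHBORS:
            for p in self.tracker.random_peers():
                self.on_contact(p)
```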
Outline • Motivations • Introduction to rateless codes • LT codes • ToroVerde architecture • Overlay layer • Content distribution layer • Video buffer layer • Experimental results • Conclusions and future work
ToroVerde architecture : content distribution layer • goal: maximize upload bandwidth utilization and minimize duplicated packets using Bloom filters • dynamic buffer, B, to store (video) chunks • chunks are ordered according to their sequence number (Bfirst and Blast) • state of a chunk (peer internal) • empty, partially decoded, decoded • when a peer decodes a chunk, a STOP_CHUNK message is sent to all its neighbors, which in turn stop sending coded packets of that chunk
ToroVerde architecture : content distribution layer • every peer keeps a buffer map BMp for each neighbor p • binary representation of the chunks useful to that neighbor • algorithm to keep track of useful chunks • a peer removes a chunk from B when • it is no longer useful to any of its neighbors (information from the buffer maps) • the player has already consumed it • if it is the eldest chunk, Bfirst is incremented
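A sketch combining the chunk buffer B and the buffer maps described above; the state constants and helper names are invented, while the removal rule follows the slide.

```python
EMPTY, PARTIAL, DECODED = "empty", "partial", "decoded"

class ChunkBuffer:
    def __init__(self):
        self.b_first = 0          # sequence number of the eldest chunk in B
        self.chunks = {}          # seq -> chunk state
        self.buffer_maps = {}     # neighbor_id -> set of seqs still useful to it

    def on_buffer_map(self, neighbor_id, useful_seqs):
        # binary buffer map received from a neighbor
        self.buffer_maps[neighbor_id] = set(useful_seqs)

    def useful_to_somebody(self, seq):
        return any(seq in bm for bm in self.buffer_maps.values())

    def try_remove(self, seq, consumed_by_player):
        """Remove a chunk once no neighbor needs it and the player has
        already consumed it; advance Bfirst if it was the eldest."""
        if consumed_by_player and not self.useful_to_somebody(seq):
            self.chunks.pop(seq, None)
            if seq == self.b_first:
                self.b_first += 1
```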
ToroVerde architecture : content distribution layer • coded packets are pushed to a subset of neighbors (the enabled set) • the enabled set is refreshed every 1 s • the cardinality of the enabled set is dynamically adapted to the upload bandwidth • the strategy is borrowed from BitTorrent (the worst enabled peer is replaced by a random peer)
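A sketch of the periodic enabled-set refresh; the BitTorrent-like replacement rule comes from the slide, while the ranking function is supplied by the leader/follower logic on the next slides and the bandwidth-based size adaptation is reduced to a max_size parameter.

```python
import random

REFRESH_PERIOD = 1.0  # seconds (from the slide)

def refresh_enabled_set(neighbors, enabled, rank, max_size):
    """BitTorrent-like refresh: drop the worst-ranked enabled peer and
    replace it with a randomly chosen neighbor that is not yet enabled.
    max_size stands in for the adaptation to the upload bandwidth."""
    enabled = [p for p in enabled if p in neighbors][:max_size]
    if enabled:
        enabled.remove(min(enabled, key=rank))   # evict the worst peer
    candidates = [p for p in neighbors if p not in enabled]
    if candidates and len(enabled) < max_size:
        enabled.append(random.choice(candidates))
    return enabled
```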
ToroVerde architecture : content distribution layer • upload bandwidth is fully utilized when pushing fresh coded packets obtained from decoded chunks • a coded packet of a partially decoded chunk can be relayed only once, which limits the outgoing throughput • decoding time of a peer (DT): the highest sequence number among the decoded chunks in B minus the number of partially decoded chunks in B • a peer has the highest DT when it has already decoded the most recent chunk and has no partially decoded chunks in between
ToroVerde architecture : content distribution layer • Lead Index (LI): the difference between the peer's DT and the average DT in its neighborhood • every peer estimates LI from the buffer maps (BMs) of its neighbors • leaders: peers whose upload bandwidth is at least twice the video bitrate and whose LI is larger than a predefined threshold ∆l (=6) • the remaining peers are termed followers • leaders rank neighbors by their LI, thus preferring to enable other leaders, i.e., giving greater priority to the peers that need rarer chunks • followers rank neighbors by the number of coded packets uploaded in the last 20 s (tit-for-tat mechanism)
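A sketch of the DT, LI, and leader test as defined on the last two slides; the chunk-state bookkeeping is simplified and the names are invented.

```python
PARTIAL, DECODED = "partial", "decoded"

DELTA_L = 6  # leader threshold ∆l from the slide

def decoding_time(chunk_states):
    """DT = highest sequence number among the decoded chunks in B minus
    the number of partially decoded chunks in B (definition from the slide)."""
    decoded = [seq for seq, st in chunk_states.items() if st == DECODED]
    partial = [seq for seq, st in chunk_states.items() if st == PARTIAL]
    return (max(decoded) - len(partial)) if decoded else 0

def lead_index(my_dt, neighbor_dts):
    """LI = own DT minus the average DT estimated from the neighbors'
    buffer maps."""
    if not neighbor_dts:
        return 0
    return my_dt - sum(neighbor_dts) / len(neighbor_dts)

def is_leader(upload_bw, video_bitrate, li):
    """Leader: upload capacity at least twice the video bitrate and LI > ∆l;
    every other peer is a follower."""
    return upload_bw >= 2 * video_bitrate and li > DELTA_L
```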
Outline • Motivations • Introduction to rateless codes • LT codes • ToroVerde architecture • Overlay layer • Content distribution layer • Video buffer layer • Experimental results • Conclusions and future work
ToroVerde architecture : video buffer layer • goal: to select the decoded chunks in B to be forwarded to the player • the player is started for the first time when the number of consecutive decoded chunks reaches the threshold startup_chunks (=12) • decoded chunks are periodically sent to the player • buffering • if the number of decoded chunks in B falls below the threshold start_buffering (=2), playback is paused • until the number of decoded chunks exceeds the threshold end_buffering (=8) • dropping • LI is used to detect whether a peer is experiencing a high decoding delay compared to its neighbors • if LI falls below a negative threshold ∆d (=-6), the eldest partially decoded chunk in B is dropped (STOP_CHUNK messages are sent to the neighbors)
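A sketch of the startup, buffering, and dropping rules above; the thresholds are the values given on the slide, while the surrounding control loop is illustrative.

```python
STARTUP_CHUNKS = 12   # consecutive decoded chunks needed before first playback
START_BUFFERING = 2   # pause playback below this many decoded chunks in B
END_BUFFERING = 8     # resume playback above this many decoded chunks in B
DELTA_D = -6          # ∆d: drop the eldest partial chunk when LI falls below this

class VideoBufferLayer:
    def __init__(self):
        self.started = False
        self.playing = False

    def update(self, decoded_in_row, decoded_count, li, drop_eldest_partial):
        # first start of the player
        if not self.started and decoded_in_row >= STARTUP_CHUNKS:
            self.started = self.playing = True
        # buffering: pause and resume around the two thresholds
        if self.playing and decoded_count < START_BUFFERING:
            self.playing = False
        elif self.started and not self.playing and decoded_count > END_BUFFERING:
            self.playing = True
        # dropping: the peer is lagging too far behind its neighborhood
        if li < DELTA_D:
            drop_eldest_partial()  # also triggers STOP_CHUNK messages
        return self.playing
```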
ToroVerde architecture : video buffer layer • with SVC, different video layers are conveyed by separate chunks • buffering works as usual by monitoring only the chunks of the BL (base layer) • dropping is applied to EL (enhancement layer) chunks with higher priority w.r.t. the BL
Outline • Motivations • Introduction to rateless codes • LT codes • ToroVerde architecture • Overlay layer • Content distribution layer • Video buffer layer • Experimental results • Conclusions and future work
Experimental results • How to evaluate performance of ToroVerde • Continuity Index (CI) • the fraction of video packets retrieved by a peer before the corresponding playback deadline • Preview delay • Startup delay • System parameters • Video bitrate (in Kbps) • Video server upload bandwidth (in Kbps) • Upload bandwidth distribution (yielding the Resource Index (RI), i.e., the ratio between the average upload bandwidth and the video bitrate) • Arrival and departure patterns • Number of peers
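The two main metrics can be written down compactly; a small sketch of CI and RI computed directly from the definitions above, with a worked RI check against the first scenario reported later.

```python
def continuity_index(packets_on_time, packets_total):
    """CI = fraction of video packets received before their playback deadline."""
    return packets_on_time / packets_total

def resource_index(share_by_bandwidth_kbps, video_bitrate_kbps):
    """RI = average peer upload bandwidth divided by the video bitrate."""
    avg_upload = sum(bw * share for bw, share in share_by_bandwidth_kbps.items())
    return avg_upload / video_bitrate_kbps

# Worked check against the first scenario below:
# 65% at 256 Kbps, 25% at 384 Kbps, 10% at 1000 Kbps, video at 300 Kbps
# average upload ≈ 166.4 + 96.0 + 100.0 = 362.4 Kbps  ->  RI ≈ 1.2
print(resource_index({256: 0.65, 384: 0.25, 1000: 0.10}, 300))  # ~1.21
```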
Experimental results • How to compare ToroVerde against other proposals? • BIG problem if prototypes are not freely available! • Often only simulation studies are reported • When PlanetLab experiments are conducted, full information on system parameters is unknown or not accessible • How to choose PlanetLab hosts? • BIG problem, as well! • Many unusable hosts (unreachable, bad connection quality, DNS problems, high CPU load, varying SSH keys, scarce free disk space)
Experimental results • Video bitrate: 300 Kbps • Video server upload bandwidth: 2 Mbps • Upload bandwidth distributions (four scenarios):
  256 Kbps:  65%   65%   78%   85%
  384 Kbps:  25%   30%   17%   10%
  1 Mbps:    10%    5%    5%    5%
  RI:        1.2   1.1   1.05  1.02
• Arrival/departure patterns • stable • churning, alternating arrivals and departures • mixed (half stable, half churning) • Average number of peers: 400
Experimental results
• stable
  RI:             1.2     1.1     1.05    1.02
  CI:             0.99    0.97    0.94    0.92
  preview delay:  15.4 s  16.1 s  16.8 s  22.1 s
  startup delay:  42.2 s  44.1 s  47.8 s  55.9 s
• mixed (200 s)
  RI:             1.2     1.1     1.05    1.02
  CI:             0.96    0.96    0.94    0.92
  preview delay:  22.0 s  23.5 s  26.7 s  31.6 s
  startup delay:  55.5 s  56.1 s  57.7 s  62.6 s
• churn (200 s)
  RI:             1.2     1.1     1.05    1.02
  CI:             0.93    0.93    0.92    0.92
  preview delay:  29.8 s  30.6 s  31.2 s  31.7 s
  startup delay:  59.8 s  61.6 s  63.0 s  63.6 s
Experimental results Figure 3: Continuity index in the presence of churn and limited upload. • TVS is robust to peer churning and is able to provide good performance when the upload contributions are limited.
Experimental results • Video bitrate: 400 Kbps • Video server upload bandwidth: 800 Kbps • Upload bandwidth distribution:
  345 Kbps:  8%
  460 Kbps:  64%
  585 Kbps:  12%
  690 Kbps:  12%
  920 Kbps:  4%
  RI: 1.28
• Arrival/departure pattern: churning (100 s) • Average number of peers: 400 • SVC: ¼ of the bitrate is reserved for the BL
Experimental results Figure 4: Continuity index with SVC. Fig. 4 shows that peers can dynamically adapt their behavior to the overlay state, with almost no need to stop the playout for buffering. Indeed, discontinuities are kept below 1% for all the tested video bitrates.
Outline • Motivations • Introduction to rateless codes • LT codes • ToroVerde architecture • Overlay layer • Content distribution layer • Video buffer layer • Experimental results • Conclusions and future work
Conclusion and future work • Contributions • Development of a complete P2P live streaming application exploiting LT codes to cope with data loss and peer churning • Support for SVC • Experimental evaluation in resource-limited scenarios • Ongoing work • Deeper analysis of TVS performance w.r.t. delays and overhead, with and without SVC • Extension of the TVS architecture to support multi-view • A. Magnetto, R. Gaeta, M. Grangetto, M. Sereno, “TURINstream: a Totally pUsh Robust and effIcieNt P2P video streaming architecture”, to appear in IEEE Transactions on Multimedia
References • [1] V. Bioglio, R. Gaeta, M. Grangetto, and M. Sereno. “On the fly Gaussian elimination for LT codes.” IEEE Communications Letters, 13(12):953–955, Dec. 2009. • [2] A. Broder and M. Mitzenmacher. “Network applications of Bloom filters: A survey.” Internet Mathematics, 1(4):485–509, 2005. • [3] S. Floyd, M. Handley, J. Padhye, and J. Widmer. “TCP Friendly Rate Control (TFRC): Protocol Specification.” RFC 5348, Sep. 2008. • [4] M. Grangetto, R. Gaeta, and M. Sereno. “Rateless codes network coding for simple and efficient P2P video streaming.” In IEEE ICME, 2009. • [6] X. Liao, H. Jin, Y. Liu, L. M. Ni, and D. Deng. “AnySee: Peer-to-peer live streaming.” In IEEE INFOCOM, 2006. • [7] M. Luby. “LT codes.” In IEEE FOCS, pages 271–280, Nov. 2002. • [9] C. Wu and B. Li. “rStream: resilient and optimal peer-to-peer streaming with rateless codes.” IEEE Transactions on Parallel and Distributed Systems, 19(1):77–92, Jan. 2008. • A. Magnetto, R. Gaeta, M. Grangetto, and M. Sereno. “TURINstream: a Totally pUsh Robust and effIcieNt P2P video streaming architecture,” to appear in IEEE Transactions on Multimedia. • A. Magnetto, S. Spoto, R. Gaeta, M. Grangetto, and M. Sereno. “Fountains vs torrents: the P2P ToroVerde protocol.” In Proceedings of the 18th Annual Meeting of the IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2010).