Prediction-based Prefetching to Support VCR-like Operations in Gossip-based P2P VoD Systems

Tianyin Xu, Weiwei Wang, Baoliu Ye, Wenzhong Li, Sanglu Lu, Yang Gao. Nanjing University. Dislab, NJU CS

Outline • Background: P2P VoD streaming; gossip-based systems; VCR-like interactive behavior • Motivation • Solutions: system architecture; prefetching model; data scheduling; VCR-like operation support • Performance Evaluation • Conclusions

Background (1) • P2P media streaming • Everyone can be a content producer/provider. • Cache-and-relay mechanism: peers actively cache media contents and further relay them to other peers that are expecting them. • P2P live streaming is very successful: CoolStreaming (INFOCOM’05), PPLive, Joost

Background (2) • P2P VoD streaming is challenging! • Provide free access to any segment in the video at any time through VCR-like operations. • VCR-like (Video Cassette Recorder) operations • random seek, pause, fast forward/backward (FF/FB) • For VCR-like operations, the “jump” process is the most important. • Most VCR-like operations can be implemented by “jump”. • random seek & pause: 1 jump; FF/FB: a series of jumps

Motivation (1) • How to support the “jump”? • Optimizing the index overlay to realize fast segment relocation • Jump => locate-and-download process; • Necessary, but far from sufficient. • Prediction-based prefetching • Expect a zero jump delay; • Proactively prefetch segments that are likely to be requested by future VCR-like operations; • Relies on prediction accuracy. • Question: Is the prediction feasible?

User Access Patterns (1) • Users rarely view a movie from beginning to end. • The total playing time of a user is quite limited and tends to be short, because some popular segments (called highlights) attract more user requests than non-popular segments. (Brampton et al., NOSSDAV-2007; Zheng et al., P2PMMS-2005)

User Access Patterns (2) • Probability distribution of object and segment popularity • Log-normal distribution • Zipf distribution (Brampton et al., NOSSDAV-2007; Yu et al., EUROSYS-2006)

User Access Patterns (3) • Fast forward is more frequent than fast backward. • Short jumps are more frequent than long jumps. (Cheng et al., IPTPS-2007; Brampton et al., NOSSDAV-2007)

Motivation (2) • Our objective: an effective prediction-based prefetching scheme • Effective prediction model • Based on user access patterns • Easy to integrate into current P2P VoD systems • Practical data scheduling

System Architecture (1) • Solution 1: Let the server do prediction for each user [1] • Pro: the server has large volumes of user viewing logs • Con: poor scalability • Solution 2: Let the clients exchange user logs and do prediction [2] • Pro: scalable • Cons: 1. lack of large volumes of user logs 2. high computing cost & training time • Our solution: • Server side: offline pattern mining => prediction model • Peer side: lightweight online prediction [1] Huang et al., “A User-Aware Prefetching Mechanism for Video Streaming”, WWW-2003 [2] He et al., “VOVO: VCR-Oriented Video-On-Demand in Large-Scale Peer-to-Peer Networks”, TPDS-2009

System Architecture (2) • Take full advantage of the tracker • The tracker has a large volume of user viewing logs; • Every node has to contact the tracker to join the system • and to initialize its neighbor & partner lists

Prediction Approach: Overview • Frequent Sequential Pattern Mining • PrefixSpan [1]: mining sequential patterns efficiently by prefix-projected pattern growth. • Splitting Video Segments into Abstract States • Mapping User Logs to Abstract States • Construct Contingency Table (CCT) • Model Utilization [1] Pei et al., “Mining Sequential Patterns by Pattern Growth: The PrefixSpan Approach”, TKDE-2004.
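The mining step named above can be illustrated with a minimal PrefixSpan-style sketch. This is a simplified illustration, not the authors' implementation: it assumes each user log is a plain list of segment IDs (one item per element), and the names `prefixspan` and `min_support` are hypothetical.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern_tuple: support} for frequent sequential patterns.

    Simplified PrefixSpan: every sequence element is a single item
    (assumption: a user log is a list of segment IDs).
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs: the suffix of
        # each sequence that follows the first match of the current prefix.
        supporters = defaultdict(set)
        for sid, start in projected:
            for item in set(sequences[sid][start:]):
                supporters[item].add(sid)

        for item, sids in sorted(supporters.items()):
            if len(sids) < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = len(sids)

            # Project each supporting sequence past the first occurrence of item.
            next_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        next_projected.append((sid, pos + 1))
                        break
            mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results
```

Calling `prefixspan(logs, min_support=2)` on a list of viewing logs returns every segment sequence that at least two logs contain as a (possibly gapped) subsequence, together with its support count.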

Prediction Approach (1) • [Figure: Frequent Sequential Patterns]

Prediction Approach (2) • The sequential patterns found may overlap. • e.g. <1,2,3,4,5,6,7> and <5,6,7,8,9,10,11,12> • Splitting approach • Filter out the sub-patterns • e.g. <1,2,3,4>, <1,2,3,4,5>, <1,2,3,4,5,6> are sub-patterns of <1,2,3,4,5,6,7> • Scan over the remaining sequential patterns • Cut them into non-overlapping intervals - e.g. <1,2,3,4,5,6,7> and <5,6,7,8,9,10,11,12> => [1,7], [8,12] • Treat intervals that do not appear in the mined sequential patterns as separate intervals • Split the contiguous intervals into intervals of appropriate granularity (states) • granularity parameters: MIN, MAX
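The splitting steps above can be sketched as follows. This is one possible reading, under stated assumptions: patterns are tuples of consecutive segment IDs, a "sub-pattern" is a contiguous run inside a longer pattern, and only an upper bound (`max_len`, standing in for MAX) is applied when splitting; MIN handling is left out. All function names are hypothetical.

```python
def is_contiguous_sub(small, big):
    """True if small is a strictly shorter contiguous run inside big."""
    n, m = len(small), len(big)
    return n < m and any(big[i:i + n] == small for i in range(m - n + 1))


def filter_subpatterns(patterns):
    """Drop every pattern that is contained in a longer pattern."""
    return [p for p in patterns
            if not any(is_contiguous_sub(p, q) for q in patterns)]


def cut_intervals(patterns):
    """Turn patterns into non-overlapping [lo, hi] segment intervals."""
    intervals, covered_up_to = [], None
    for p in sorted(patterns, key=min):
        lo, hi = min(p), max(p)
        if covered_up_to is not None:
            lo = max(lo, covered_up_to + 1)  # trim the overlapping head
        if lo <= hi:
            intervals.append((lo, hi))
            covered_up_to = hi
    return intervals


def fill_gaps(intervals, first_seg, last_seg):
    """Add segments not covered by any pattern as separate intervals."""
    out, nxt = [], first_seg
    for lo, hi in intervals:
        if lo > nxt:
            out.append((nxt, lo - 1))
        out.append((lo, hi))
        nxt = hi + 1
    if nxt <= last_seg:
        out.append((nxt, last_seg))
    return out


def split_states(intervals, max_len):
    """Split each interval into states of at most max_len segments."""
    states = []
    for lo, hi in intervals:
        while lo <= hi:
            end = min(lo + max_len - 1, hi)
            states.append((lo, end))
            lo = end + 1
    return states
```

With the patterns from the slide, `cut_intervals(filter_subpatterns(patterns))` yields `[(1, 7), (8, 12)]`, which matches the example intervals [1,7] and [8,12].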

Prediction Approach (3) • Map raw user logs into state transitions • <s,s’> • e.g. <1,2,3,4,5,6,7,8,9,10> maps to <[1,6],[7,13]> • Transition table construction • Simple frequency counting
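A minimal sketch of the mapping and counting steps, assuming states are sorted, non-overlapping `(lo, hi)` segment intervals and that a transition is recorded whenever consecutive log entries fall into different states. The function names and the `predict_next` helper are illustrative assumptions, not part of the original system.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def state_of(segment, states):
    """Index of the state interval (lo, hi) that contains segment."""
    starts = [lo for lo, _ in states]
    idx = bisect_right(starts, segment) - 1
    if idx < 0 or segment > states[idx][1]:
        raise ValueError(f"segment {segment} is not covered by any state")
    return idx


def to_transitions(log, states):
    """Map a raw log of segment IDs to a list of <s, s'> state transitions."""
    visited = []
    for seg in log:
        s = state_of(seg, states)
        if not visited or visited[-1] != s:
            visited.append(s)  # collapse repeated visits to the same state
    return list(zip(visited, visited[1:]))


def build_transition_table(logs, states):
    """Simple frequency counting: table[s][s_next] = number of transitions."""
    table = defaultdict(Counter)
    for log in logs:
        for s, s_next in to_transitions(log, states):
            table[s][s_next] += 1
    return table


def predict_next(table, state, k=1):
    """The k most frequently observed successor states of state."""
    return [s for s, _ in table[state].most_common(k)]
```

For the slide's example, a log of segments 1 through 10 over states `[(1, 6), (7, 13)]` produces the single transition `(0, 1)`, that is, from state [1,6] to state [7,13].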

Data Scheduling • Two-stage scheduling strategy: • Stage 1: fetch urgent segments into the playback buffer • Guarantee the continuity of normal playback • Urgent line mechanism [1] • Stage 2: prefetch based on prediction • Reduce jump latency • Utilize residual bandwidth [1] Li et al., “ContinuStreaming: Achieving High Playback Continuity of Gossip-based Peer-to-Peer Streaming”, IPDPS-2008.
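The two stages can be sketched as a single per-round scheduling function. This is a simplified stand-in for the urgent line mechanism, not the ContinuStreaming algorithm itself: it assumes a fixed urgent window ahead of the playback position and a per-round budget counted in segments. The names `schedule_round`, `urgent_window`, and `budget` are hypothetical.

```python
def schedule_round(play_pos, have, urgent_window, budget, predicted):
    """Pick the segments to request in one scheduling round.

    play_pos      -- segment currently being played
    have          -- set of segments already buffered locally
    urgent_window -- number of segments ahead of play_pos treated as urgent
    budget        -- how many segments the download bandwidth allows this round
    predicted     -- segments the prediction model expects, most likely first
    """
    # Stage 1: missing segments close to the playback point come first.
    urgent = [s for s in range(play_pos, play_pos + urgent_window)
              if s not in have]
    stage1 = urgent[:budget]

    # Stage 2: spend only the residual bandwidth on predicted segments.
    residual = budget - len(stage1)
    stage2 = [s for s in predicted
              if s not in have and s not in stage1][:residual]
    return stage1, stage2
```

Giving stage 1 strict priority means prefetching never competes with normal playback: when the budget is exhausted by urgent segments, stage 2 returns an empty list.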

VCR-like Operation Support • The jump process caused by VCR-like operations: • Case 1. The target segment is already prefetched on the local peer => just play back. • Case 2. The target segment is cached in a partner’s buffer => download and play back. • Case 3. The target segment is cached neither on the local peer nor by the partners => relocate, connect and download
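The three cases map naturally onto a small dispatch function. The sketch below assumes the local prefetch buffer is a set of segment IDs and that partner buffers are known as a dictionary from partner ID to a set of segment IDs; `JumpSource` and `resolve_jump` are hypothetical names.

```python
from enum import Enum


class JumpSource(Enum):
    LOCAL = "play back from the local prefetch buffer"
    PARTNER = "download from a partner, then play back"
    RELOCATE = "relocate, connect to new suppliers and download"


def resolve_jump(target, prefetched, partner_buffers):
    """Decide how a jump to segment `target` is served.

    prefetched      -- set of segment IDs held locally
    partner_buffers -- dict: partner ID -> set of segment IDs it caches
    Returns (JumpSource, partner ID or None).
    """
    if target in prefetched:                      # Case 1
        return JumpSource.LOCAL, None
    for partner_id, buffered in partner_buffers.items():
        if target in buffered:                    # Case 2
            return JumpSource.PARTNER, partner_id
    return JumpSource.RELOCATE, None              # Case 3
```

Only the third outcome pays the full locate-and-download cost, which is why a higher prediction accuracy translates directly into lower jump latency.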

Simulation Settings • User log generation • Modified GISMO [1] • A log-normal distribution is used so that users tend to jump around hot scenes. • The simulation is built on top of a topology of 5000 peer nodes based on the transit-stub model generated by GT-ITM. • The streaming rate is S = 256 Kbps; the download bandwidth is randomly distributed in [1.5S, 5S]. • The default size of the playback buffer is 30 Mbytes, i.e., each peer can cache the most recent 120 seconds of the stream (100 for playback, 20 for prefetching). • The arrival of peers follows a Poisson process with λ = 5. [1] GISMO: A Generator of Internet Streaming Media Objects and Workloads
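The peer arrival and bandwidth settings can be reproduced with a few lines of standard-library code. This is a sketch under assumptions the slide does not state: "randomly distributed" is read as uniform, and λ = 5 is taken as arrivals per unit of simulated time. `generate_peers` is a hypothetical helper and is unrelated to GISMO or GT-ITM.

```python
import random

STREAM_RATE_KBPS = 256  # streaming rate S from the simulation settings


def generate_peers(n, arrival_rate=5.0, seed=0):
    """Return a list of (peer_id, arrival_time, download_bandwidth_kbps).

    Inter-arrival times of a Poisson process are exponentially distributed,
    so arrival times are built by summing expovariate(arrival_rate) samples.
    Bandwidth is drawn uniformly from [1.5S, 5S] (assumed distribution).
    """
    rng = random.Random(seed)
    now = 0.0
    peers = []
    for peer_id in range(n):
        now += rng.expovariate(arrival_rate)
        bandwidth = rng.uniform(1.5 * STREAM_RATE_KBPS, 5 * STREAM_RATE_KBPS)
        peers.append((peer_id, now, bandwidth))
    return peers
```

Fixing the seed keeps a run repeatable, which helps when comparing scheduling strategies on the same arrival trace.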

Performance Evaluation (1)

Performance Evaluation (2)

Performance Evaluation (3)

Performance Evaluation (4)

Conclusions • A practical architecture that can be used in almost all existing P2P VoD systems • A novel and simple prediction approach • State abstraction plays an important role • A two-stage data scheduling strategy

The End
