Modeling YouTube QoE based on Crowdsourcing and Laboratory User Studies. Tobias Hoßfeld, Raimund Schatz. STSM 15.8.-30.9.2011. http://www3.informatik.uni-wuerzburg.de/research/fia http://www3.informatik.uni-wuerzburg.de/staff/hossfeld
QoE Issue: Waiting, Waiting, Waiting…
[Figure labels: waiting time perception; stalling]
Research Activities Related to STSM
Application-Level Measurements
• bottleneck scenario with constant bandwidth
• video characteristics
• realistic stalling patterns
Monitoring and Stalling Detector
• heuristics fit QoS
• information extraction approach leads to exact QoE results
Video player parameter: initial buffer of 2 sec
Variable video bit rate V; high stalling frequency for V = B
Stalling lengths used in tests: 1-6 sec
QoE management
Stalling as key influence factor
QoE Modeling
• only stalling relevant, not content, demographics, etc.
• users “accept” almost no or only short stalling
• crowdsourcing supports i:lab
Optimization and Dimensioning
• initial delay (GI/GI/1): T0/D < 5%
• bandwidth provisioning: 120% of V
• TCP better than UDP in bottleneck
Mapping between QoS (e.g. bandwidth B) and QoE
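The link between the bottleneck bandwidth B, the video bit rate V, and stalling can be illustrated with a small playout-buffer simulation. This is only a minimal sketch under stated assumptions, not the measurement setup used in the study: the function name `simulate_stalling`, the one-second chunks, the made-up bit-rate pattern, and the rule that playback resumes once two seconds of video are buffered again are all hypothetical. Only the 2 sec initial buffer and the idea of a constant-bandwidth bottleneck come from the slide.

```python
from itertools import accumulate


def simulate_stalling(chunk_kbits, bandwidth_kbps, buffer_chunks=2):
    """Simulate playback of one-second video chunks over a constant-rate link.

    chunk_kbits    -- size of each one-second chunk in kbit (variable bit rate)
    bandwidth_kbps -- constant bottleneck bandwidth B in kbit/s
    buffer_chunks  -- chunks that must be buffered before playback (re)starts
                      (assumption: same threshold after every stall)

    Returns (initial_delay_seconds, list_of_stall_durations_seconds).
    """
    n = len(chunk_kbits)
    # Time at which each chunk has been completely downloaded.
    done = [total / bandwidth_kbps for total in accumulate(chunk_kbits)]

    def ready_time(i):
        # Playback from chunk i may start once `buffer_chunks` chunks
        # (or the rest of the video) are available.
        return done[min(i + buffer_chunks, n) - 1]

    clock = ready_time(0)
    initial_delay = clock
    stalls = []
    for i in range(n):
        if done[i] > clock + 1e-9:      # chunk i is not there yet: stall
            resume = ready_time(i)
            stalls.append(resume - clock)
            clock = resume
        clock += 1.0                    # play one second of video
    return initial_delay, stalls


# Made-up 60 s video with mean bit rate V = 500 kbit/s:
# five seconds at 750 kbit/s followed by five seconds at 250 kbit/s, repeated.
video = ([750] * 5 + [250] * 5) * 6
for bandwidth in (400, 500, 600):       # B = 80%, 100%, 120% of V
    delay, stalls = simulate_stalling(video, bandwidth)
    print(f"B/V={bandwidth / 500:.1f}  initial delay={delay:.2f}s  "
          f"stalls={len(stalls)}  total stalling={sum(stalls):.2f}s")
```

In this toy pattern, stalling occurs for B = V and disappears at B = 120% of V, which is in line with the provisioning rule on the slide; a different bit-rate pattern would of course give different numbers.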
Executive Summary of STSM
Developed Test Design
• Remote users
• ‘Reliability’ questions
• App./user monitoring
• Preloading of data
Application Measurements
Realistic parameters for temporal stimuli
Conducted Crowdsourcing Tests
• Data analysis
• Identification of reliable users
• Key influence factors via machine learning
• Fitting with fundamental relationships
Laboratory Study
• Reliable users
• Different demographics
• Different test setting, e.g. longer user tests
Derived QoE Model
• Mapping function: stalling and QoE
• Acceptance vs. perception
• Comparison of crowdsourcing with laboratory results
Crowdsourcing Workflow
[Figure: workflow diagram with steps 1 to 5]
• Challenge: identify unreliable QoE results
Countermeasures:
• proper test design (gold standard data, consistency questions, content questions, application monitoring)
• filtering data and analyzing QoE results
Methods also applicable to e.g. field trials!
Crowdsourcing: Unreliable workers
LEVEL 1: ‘reliability’ questions
- wrong answers to content questions
- different answers to the same questions
- always selected same option
- consistency questions: specified the wrong country/continent
LEVEL 2: ‘QoE’ question
- did not notice stalling
- perceived non-existent stalling
LEVEL 3: ‘application/user’ monitoring
- did not watch all videos completely
• SOS hypothesis indicates unreliable test
• Many user ratings rejected
• Further improvements required
• User warnings (“Test not done carefully”): rejection rate decreased by about 50%
• Filtering may be too strict; application layer monitoring not reliable
[Figure: chart with categories C1 to C7 and Facebook]
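The three filtering levels above can be sketched as a simple rule cascade. This is an illustrative sketch only: the `Session` record, its field names, and the function `rejection_level` are hypothetical and are not taken from the study's tooling; the checks merely mirror the criteria listed on the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Session:
    """One worker's test session (all field names are hypothetical)."""
    content_ok: bool               # content questions answered correctly
    repeats_consistent: bool       # same answer when a question was repeated
    location_consistent: bool      # stated country matches stated continent
    ratings: List[int]             # one quality rating per video
    stalls_shown: List[int]        # stalling events actually inserted per video
    stalling_noticed: List[bool]   # worker's "did the video stall?" answer per video
    watched_fraction: List[float]  # share of each video played, from monitoring


def rejection_level(s: Session) -> int:
    """Return 0 for a session that passes, else the first failing level (1-3)."""
    # Level 1: 'reliability' questions.
    same_option = len(s.ratings) > 1 and len(set(s.ratings)) == 1
    if not (s.content_ok and s.repeats_consistent and s.location_consistent):
        return 1
    if same_option:
        return 1
    # Level 2: 'QoE' question. Missed stalling or reported non-existent stalling.
    for shown, noticed in zip(s.stalls_shown, s.stalling_noticed):
        if (shown > 0) != noticed:
            return 2
    # Level 3: application/user monitoring. Every video must be watched fully.
    if any(fraction < 1.0 for fraction in s.watched_fraction):
        return 3
    return 0


careful = Session(True, True, True, [5, 3, 2], [0, 1, 3],
                  [False, True, True], [1.0, 1.0, 1.0])
missed_stall = Session(True, True, True, [5, 4, 2], [0, 2, 3],
                       [False, False, True], [1.0, 1.0, 1.0])
skipped = Session(True, True, True, [5, 3, 2], [0, 1, 3],
                  [False, True, True], [1.0, 0.4, 1.0])

for name, session in [("careful", careful), ("missed_stall", missed_stall),
                      ("skipped", skipped)]:
    print(name, rejection_level(session))
```

Returning the first failing level, rather than a plain accept/reject flag, makes it easy to count how many ratings each level removes, which matters given the slide's remark that the filtering may be too strict.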
Crowdsourcing vs. Laboratory Studies
• Key influence factors on YouTube QoE: stalling frequency and stalling duration determine the user-perceived quality
• Lab studies within ACE 2.0 at FTW’s i:Lab
• Similar shapes of curves in laboratory and crowdsourcing study
[Figure label: 4 seconds of stalling]
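One way to turn the dependence of perceived quality on stalling frequency into a mapping function is to fit a decaying curve to the mean opinion scores (MOS). The sketch below assumes an exponential form, MOS(N) = floor + a·exp(-b·N), with N the number of stalling events and a rating scale whose lowest value is 1; both the functional form and all numbers are assumptions for illustration. The data points are synthetic, generated from known parameters so the fit can be checked, and are not results of the study. The names `fit_exponential_mos` and `predict_mos` are hypothetical.

```python
import math


def fit_exponential_mos(stall_counts, mos_values, floor=1.0):
    """Fit MOS(N) = floor + a * exp(-b * N) and return (a, b).

    Uses ordinary least squares on log(MOS - floor), which is linear in N.
    Points with MOS <= floor carry no information in log space and are skipped.
    """
    xs, ys = [], []
    for n, mos in zip(stall_counts, mos_values):
        if mos > floor:
            xs.append(float(n))
            ys.append(math.log(mos - floor))
    if len(xs) < 2:
        raise ValueError("need at least two points above the floor")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct stalling counts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict_mos(n_stalls, a, b, floor=1.0):
    """Evaluate the fitted mapping function for a given number of stalls."""
    return floor + a * math.exp(-b * n_stalls)


# Synthetic example: MOS values generated from a = 3.5, b = 0.4 (made up).
counts = [0, 1, 2, 3, 4, 5, 6]
synthetic_mos = [1.0 + 3.5 * math.exp(-0.4 * n) for n in counts]

a, b = fit_exponential_mos(counts, synthetic_mos)
print(f"a={a:.3f}, b={b:.3f}, predicted MOS at 2 stalls={predict_mos(2, a, b):.2f}")
```

Fitting the same form separately to laboratory and crowdsourcing ratings would give one (a, b) pair per study, which is a compact way to compare the shapes of the two curves.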
Conclusions
• Most of the relevant stimuli of Internet applications are of a temporal nature
• QoE models have to be extended in the temporal dimension: stalling, waiting times, service interruptions
• Gap between user perception and user acceptance, differences between lab and crowdsourcing (WG3)
• ‘Failed’ subjective studies for analysis of reliability (WG4)
• Standards to detect unreliable subjects (WG5)
• Crowdsourcing appears promising
- Tests are conducted fast and at low cost
- Possibility to access different user groups (in terms of expectations/social background)
- But new challenges are imposed
WG1: “Web and cloud apps”
WG2: “Crowd-sourcing”
Outcome of STSM
• “Quantification of YouTube QoE via Crowdsourcing” by Tobias Hoßfeld, Raimund Schatz, Michael Seufert, Matthias Hirth, Thomas Zinner, Phuoc Tran-Gia, IEEE International Workshop on Multimedia Quality of Experience - Modeling, Evaluation, and Directions (MQoE 2011), Dana Point, CA, USA, December 2011.
• “FoG and Clouds: On Optimizing QoE for YouTube” by Tobias Hoßfeld, Florian Liers, Thomas Volkert, Raimund Schatz, accepted at 5th KuVS GI/ITG Workshop “NG Service Delivery Platforms”, at DOCOMO Euro-Labs, Munich, Germany.
• “Quality of Experience of YouTube Video Streaming for Current Internet Transport Protocols” by Tobias Hoßfeld and Raimund Schatz, currently under submission at ACM Computer Communications Review; a technical report of University of Würzburg containing the numerical results is available: Technical Report No. 482, “Transport Protocol Influences on YouTube QoE”, July 2011.
• “‘Time is Bandwidth’? Narrowing the Gap between Subjective Time Perception and Quality of Experience” by Sebastian Egger, Peter Reichl, Tobias Hoßfeld, Raimund Schatz, submitted to IEEE ICC 2012 - Communication QoS, Reliability and Modeling Symposium.
• “Challenges of QoE Management for Cloud Applications” by Tobias Hoßfeld, Raimund Schatz, Martin Varela, Christian Timmerer, submitted to IEEE Communications Magazine, Special Issue on QoE management in emerging multimedia services.
• “Recommendations and Comparison of Subjective User Tests via Crowdsourcing and Laboratories for online video streaming”, intended for submission.
• “Impact of Fake User Ratings on QoE”, intended for journal submission.