The Role of Sensory Psychology to VoIP Rate Adaptation: A Study on Skype Calls. Skype Group, NSLAB. INFOCOM 2012 (Hopefully).
Quality of Service (QoS) • Tx/Rx Content • Bitrate • Jitter • Packet Loss Rate
Quality of Experience (QoE) • Mean Opinion Score (MOS) • Reaction Time • Reactivity/Responsiveness
QoE: MOS • Reaction Time • Reactivity. QoS: Tx/Rx Content • Bitrate • Jitter • Packet Loss
Related Works: On the TCP-Friendliness of VoIP Traffic, Tian Bu et al., INFOCOM 2006 • Disproves the conjecture that VoIP is not TCP-friendly, once the user back-off mechanism is taken into account. • User back-off: users of real-time apps drop out completely if they perceive unacceptable quality due to network congestion.
Related Works: Quantifying Skype User Satisfaction, K. T. Chen et al., SIGCOMM 2006 • The User Satisfaction Index (USI). • Uses traditional metrics (RTT, jitter, bitrate) to infer user-centric metrics (reactivity, duration, MOS). • Allows real-time, user-centric adaptation.
Related Works: Could Skype be More Satisfying?, T. Y. Huang et al., IEEE Network 2010 • Skype’s adaptation does not take the individual codec and packet loss patterns into consideration. • The inconsistency in voice quality results in over-utilization of bandwidth.
Related Works: An Experimental Investigation of the Congestion Control Used by Skype, L. De Cicco et al., WWIC 2007 • Skype’s slow adaptation to bandwidth drops causes coexisting TCP flows to be suppressed. • Skype’s over-utilization of bandwidth causes massive fluctuation in bitrate, which may result in user frustration.
Motivation • Clearly, there is much to improve in Skype’s rate adaptation algorithm. • Skype’s over-utilization of bandwidth 1) wastes network resources and 2) threatens other applications, at the risk of 3) producing massive fluctuation in quality. • Our major assumption: this selfish behavior of Skype is actually NOT helping user satisfaction. Users dislike changes in audio quality, even if the changes increase the average rate.
Goal • Confirm our assumption about users’ impression of audio quality fluctuation. • Get a ballpark idea of the possible relationships between the parameters and MOS (formulation).
Method • Use an audio encoder/decoder to create audio tracks with fluctuating quality (bitrate). • We focus on Silk in all following experiments because of 1) its potential to become the dominant VoIP codec and 2) its flexibility in fine-tuning the bitrate.
Test Tracks [Figure: bitrate versus time; the track alternates between the high rate and the low rate, holding each for ∆T.]
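Test Tracks Setup [Diagram (labels: Encoder, Decoder, Header, High rate, Low rate, ∆T, Combine, PCM): the source PCM track is passed through the encoder and decoder once at the high rate and once at the low rate; the two decoded PCM tracks are then combined in alternating segments of length ∆T into a single PCM test track.]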
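The "Combine" step of the setup can be sketched roughly as follows. This is a minimal illustration, not the authors' tool: it assumes the two decoded versions are already available as equal-length lists of PCM samples, and the function name `combine_alternating`, its parameters, and the choice to start with the high-rate segment are all hypothetical.

```python
from typing import List, Sequence


def combine_alternating(
    high_pcm: Sequence[int],
    low_pcm: Sequence[int],
    sample_rate_hz: int,
    delta_t_sec: float,
    start_high: bool = True,  # assumption: the slides do not say which rate comes first
) -> List[int]:
    """Build one track that switches between two decoded versions every delta_t_sec."""
    if len(high_pcm) != len(low_pcm):
        raise ValueError("both decoded tracks must have the same number of samples")
    segment = int(round(sample_rate_hz * delta_t_sec))
    if segment <= 0:
        raise ValueError("delta_t_sec is too short for this sample rate")

    combined: List[int] = []
    use_high = start_high
    for start in range(0, len(high_pcm), segment):
        source = high_pcm if use_high else low_pcm
        combined.extend(source[start:start + segment])
        use_high = not use_high  # switch quality at every segment boundary
    return combined
```

Because both inputs come from the same source track, taking each segment from the same sample offset keeps the speech continuous across quality switches.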
Formulation: Goal • We target three variables that affect the user’s perception: High Rate, Low Rate, and ∆T. • We also examine the interactions between the three variables. • Exp1: Find the relation between fixed bitrate and MOS. • Exp2: Find the formula that relates the three variables to MOS.
Formulation: Test Tracks Setup • The maximum and minimum bitrates of Silk are 40.6 and 5.6 kbps. • We chose 10 rates (q1 to q10) uniformly from the interval.
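As a rough sketch of the uniform selection, the ten levels can be computed as below. It assumes both endpoints (5.6 and 40.6 kbps) are included as q1 and q10; the helper name `uniform_rates` is hypothetical. Under that assumption q4 and q7 come out near 17.27 and 28.93 kbps, which is consistent with the 17.2 and 28.9 kbps quoted on the following slide.

```python
from typing import List


def uniform_rates(lo_kbps: float, hi_kbps: float, count: int) -> List[float]:
    """Return `count` evenly spaced bitrates from lo_kbps to hi_kbps, endpoints included."""
    if count < 2:
        raise ValueError("need at least two levels to span an interval")
    step = (hi_kbps - lo_kbps) / (count - 1)
    return [lo_kbps + i * step for i in range(count)]


# Silk's minimum and maximum bitrates as stated on the slide.
rates = uniform_rates(5.6, 40.6, 10)
for index, rate in enumerate(rates, start=1):
    print(f"q{index}: {rate:.2f} kbps")
```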
Formulation: Test Tracks Setup • The source track is 30 seconds long. We set ∆T to its factors: 10, 5, 3, 2 and 1 sec. • We picked 4 rates (q1, q4, q7, q10), i.e. {5.6, 17.2, 28.9, 40.6} kbps, as the candidates for the high rate (HR) and low rate (LR) in Experiment 2.
Formulation: Test Tracks Setup • Follows the ITU recommendations. • Four voices: 2 male and 2 female. • Sentences with no coherent plot. • 30 seconds, 44.1 kbps. • Reference tracks (original 44.1 kbps) are inserted among the test cases to provide a standard for rating. • The tracks of Exp1 and Exp2 are mixed together, and the rating order for each subject is randomized.
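To get a feel for the size of Experiment 2, the candidate conditions can be enumerated as below. This is only a sketch under an assumption the slides do not confirm: that every pair of distinct candidate rates (higher one as HR, lower one as LR) is combined with every ∆T. The function name `variable_rate_conditions` is hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

RATES_KBPS = [40.6, 28.9, 17.2, 5.6]   # q10, q7, q4, q1 from the slide
DELTA_T_SEC = [10, 5, 3, 2, 1]         # factors of the 30-second source track


def variable_rate_conditions(
    rates: List[float], delta_ts: List[int]
) -> List[Tuple[float, float, int]]:
    """List (high_rate, low_rate, delta_t) for every unordered pair of distinct rates."""
    pairs = [(max(a, b), min(a, b)) for a, b in combinations(rates, 2)]
    return [(hr, lr, dt) for hr, lr in pairs for dt in delta_ts]


conditions = variable_rate_conditions(RATES_KBPS, DELTA_T_SEC)
print(len(conditions))  # 6 rate pairs x 5 intervals = 30
```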
Formulation: Analysis (Exp1) • The plot can be fitted by a shifted logarithm function. • The shift is due to the lower boundary of human audio perception. • MOS drops rapidly at lower bitrates.
Why Logarithms? • Weber–Fechner law: the smallest noticeable difference in a stimulus (the least difference that the test person can still perceive as a difference) is proportional to the starting value. • The law has been shown to be plausible for a wide range of human perceptions, including hearing, vision, taste, the senses of touch and heat, and even temporal and spatial cognition.
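For reference, the standard textbook derivation of the logarithm from this law is sketched below. The symbols are not from the slides: S is the stimulus intensity, ΔS the just-noticeable difference, p the perceived magnitude, k a proportionality constant, and S_0 the threshold stimulus below which nothing is perceived.

```latex
% Weber's law: the just-noticeable difference is a fixed fraction of the stimulus.
\frac{\Delta S}{S} = k
% Fechner's step: treat each just-noticeable difference as one unit of perception.
dp = k \, \frac{dS}{S}
% Integrate, with p = 0 at the threshold stimulus S_0.
p = k \ln \frac{S}{S_0}
```

The threshold S_0 plays the same role as the shift in the fitted curve of Exp1: perception only starts to register above a lower bound.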
Formulation: Results (Exp2) • Adapting to an “optimal rate” and ignoring how users feel about changes might be over-optimistic.
Formulation: Analysis (Exp2) • The R² values of the logarithmic regression for each track are generally higher than 0.9. • One outlier was found: 28.9+17.2. We attribute this to 1) the similarity of the two bitrates and 2) the fact that both lie in the middle- or low-quality range. • This phenomenon is also supported by the ANOVA test on the similarity of the 28.9 and 17.2 kbps data sets (p = 0.2155).
Formulation: Analysis (Exp2) • In short, the relationship between MOS and the frequency of rate change, although logarithmic in general, depends on the magnitude of the rate changes.
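A logarithmic regression and its R² can be computed along the following lines. This is a minimal sketch that assumes the regression is MOS against ln(∆T) for one (HR, LR) pair, fitted by ordinary least squares; the slides do not spell out the exact procedure, and the function name `fit_log` is hypothetical. The sample values are synthetic, generated from an exact logarithm purely to exercise the code; they are not measurements from the study.

```python
import math
from typing import Sequence, Tuple


def fit_log(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of y = a + b * ln(x); returns (a, b, r_squared)."""
    log_x = [math.log(x) for x in xs]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(ys) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((v - (a + b * u)) ** 2 for u, v in zip(log_x, ys))
    ss_tot = sum((v - mean_y) ** 2 for v in ys)
    return a, b, 1.0 - ss_res / ss_tot


# Synthetic, noise-free example (NOT study data): y = 2.0 + 0.5 * ln(delta_t).
delta_ts = [1, 2, 3, 5, 10]
scores = [2.0 + 0.5 * math.log(dt) for dt in delta_ts]
a, b, r2 = fit_log(delta_ts, scores)
print(f"a={a:.3f}, b={b:.3f}, R^2={r2:.3f}")
```

With real ratings the points scatter around the curve, and R² reports the share of the MOS variance that the logarithmic fit explains.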
Some Guessing About the Subroutines: SCALE() • Directly associated with the difference between hr and lr. The results in Fig. 7 provide evidence for this inference: same average bitrate, different magnitudes. • Positive correlation between the scale of the regression function and the rate change magnitude. • Another purpose of SCALE() is to deal with small-magnitude tracks that do not fit well.
Some Guessing About the Subroutines: SHIFT() • Copes with human expectation. • As ∆T grows, the effect of fluctuation decreases and the variable-rate case becomes indiscernible from a fixed-rate version. • We call this imaginary fixed-rate equivalent the dominant quality of the fluctuation (dominant quality ≠ average quality). • The dominant quality is the exact quality a user expects to observe when the negative impact of fluctuation diminishes.
Large-Scale Experiments: Goal • We need massive data to construct the details of our formulas: - verify the structures of our formulas; - factors in the fixed-rate formula; - subroutines in the variable-rate formula: SCALE(hr,lr) and SHIFT(hr,lr).
Method • Same source track. • Nine levels of quality, chosen exponentially. • Five levels of rate-changing frequency: {1, 2, 3, 5, 10}. • 127 participants. • Score calibration with a hidden reference track. • ITU recommendations.
Results: Formula Structure • Graphical support: non-parallel plots. • Statistical support: ANOVA of interaction (p = 8e-14).
Results: Fixed-rate Formula • α=4.091 • β=1.515 • γ=1.000 • Another interesting discovery: lower bound of Silk.
Results: SCALE • Not surprisingly, SCALE subroutine is positively correlated with magnitude.
Results: SHIFT • This is trickier, due to its relationship with user expectation. • Based on our definition of dominant quality: • Where D(hr,lr) is the MOS of the dominant quality of the rate-changing pair (hr,lr).
SHIFT (Cont.) • First we plot the estimated MOS of fixed hr, fixed lr, and D. • There is an apparent difference when hr < 14.1. • This is not surprising: we have already seen this reaction of MOS when a track pairs two similar, inferior rates.
SHIFT (Cont.) • We plot them again in percentages: hr = 100%, lr = 0%. • We can then see a clear pattern when we group the tracks by their MOS magnitudes.
SHIFT (Cont.) • Finally…
Evaluation • It is not surprising that the formula outcomes of the preliminary and large-scale experiments fit their ground truth. • We need a third dataset for verification.
The Verifying Experiments • Different source track: a conversation between two males. • Different length: 60 seconds. • Different rates: {44.1, 11.8, 6.4} kbps. • Different frequencies: {1, 5, 10} seconds.
Conclusion • Verified that the relationship between user experience and the magnitude of rate change exhibits log-like behavior, echoing Weber’s theory. • Discovered that the relationship between user experience and the frequency of rate change also exhibits log-like behavior. • Derived a closed-form model of user experience under rate changes with 97%+ goodness of fit.