Enhancing Conversational Speech Quality of VoIP in a Wired/Wireless Environment
Batu Sat and Benjamin W. Wah
Illinois Center for Wireless Systems

[Figure: network diagram with wireless, public Internet, and private IP network segments.]

[Figure: conversation timelines between A and B. Face-to-face setting: A and B's common perspective. VoIP setting: A's perspective and B's perspective, offset by the mouth-to-ear delays MED(AB) and MED(BA). Legend: A speaks, A thinks, B speaks, B thinks.]

Goals
• Design of VoIP End Clients
  • Achieving high and consistent perceptual conversational quality
  • Enabling natural and efficient conversation among users
  • Real-time adaptation to changing network delay and loss conditions
  • Suitable for any communication device using any IP network

Background
• VoIP
  • Providing interactive speech among multiple users
  • Utilizing public and private wired/wireless IP networks
  • Independent of locations of users and devices used
• IP Networks
  • Long-haul, WAN, LAN wired/wireless networks
  • Non-stationary real-time packet arrivals and losses
  • Large disparity in delay and loss behavior among clients
  • Complex QoS with multiple IP providers and without cost model
  • Quality measured and maintained at end-points
  • Better scalability with end-to-end strategies

Challenges
• Quality Metrics
  • No objective metrics for quantifying conversational speech quality
  • Costly non-repeatable subjective tests with full implementation
• Design of Play-out Scheduling and Loss Concealment
  • Under dynamic packet delays and losses

Conversational Dynamics & Quality
• Conversational Dynamics
  • Different network delays among clients
  • Multiple realities in VoIP in contrast to face-to-face conversation
  • Perception of delays and efficiency affected by conversational switching (turn-taking) frequency
• Conversational Speech Quality
  • Multiple dimensions in user perception of quality
  • Quality of one-way speech segments
  • Naturalness and rhythm of conversation, mutual-silence durations

Proposed Solutions
• Collection of Traces on Delays and Losses
  • Using Planet-Lab nodes for collecting end-to-end traces
  • With packet periods and payloads typical of VoIP applications
• Modeling of Two-Party and Multi-Party Conversations
  • Utilizing human psychological models when possible
  • Subjective tests to obtain parameters for simulating dynamics
• Evaluation of Conversational Speech Quality (CSQ)
  • Identification of human-observable and system-measurable metrics
  • Modeling CSQ as function of these metrics
  • Designing human subjective tests
• Designing Play-out Scheduling (POS) / Loss Concealment (LCS) Schemes
  • Trade-offs on system-measurable and human-observable metrics
  • Schemes for real-time collection and relay of network statistics
  • Schemes for real-time adaptive POS and LCS

Trade-Offs
• Trade-offs among mouth-to-ear delay (MED), redundancy, and amount of packets not received in time for play-out (UCFLR)
• With longer MED
  • Improved one-way speech quality
  • Degraded symmetry and efficiency of interactive conversation
• Trade-off between minimizing pair-wise MED and maintaining a balance among MEDs perceived by users in a conversation

Results
• None of the previous algorithms provides consistent balance between one-way speech quality and conversational interactivity
• Difficulty under dynamic delay spikes and bursty losses
• Our scheme
  • Hugging delay curve closely
  • Minimizing delay degradations
  • Providing good one-way quality
  • Maximizing human quality perception
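
The trade-off between MED and the packets not received in time for play-out can be illustrated with a minimal Python sketch. It assumes a fixed play-out delay standing in for the network-dependent part of MED, and it ignores redundancy and loss concealment, so the computed rate is only a simplified stand-in for UCFLR. The function name `late_loss_rate` and the delay trace are hypothetical and not taken from the poster.

```python
def late_loss_rate(delays_ms, playout_delay_ms):
    """Fraction of packets that miss a fixed play-out deadline.

    delays_ms: one-way network delay of each packet in milliseconds;
               None marks a packet lost in the network.
    playout_delay_ms: fixed play-out delay (assumed stand-in for MED).
    """
    if not delays_ms:
        raise ValueError("need at least one packet")
    # A packet is unusable if it was lost or arrived after its deadline.
    missed = sum(1 for d in delays_ms if d is None or d > playout_delay_ms)
    return missed / len(delays_ms)


# Hypothetical trace with one delay spike (240 ms) and one network loss.
trace = [60, 62, 58, 65, 240, 61, None, 59, 63, 90]

for playout in (70, 100, 250):
    print(playout, late_loss_rate(trace, playout))
```

On this made-up trace, raising the play-out delay from 70 ms to 250 ms lowers the missed fraction from 0.3 to 0.1, at the cost of a longer delay in every turn of the conversation. Absorbing the single spike alone needs a deadline of at least 240 ms, which suggests why a schedule that tracks the delay curve can be preferable to a fixed one.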
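
The "multiple realities" of a VoIP conversation can be made concrete with a short sketch of the silence each party perceives at a turn switch, read off the timeline figure: after A stops speaking, A's speech reaches B after MED(AB), B thinks, and B's reply reaches A after MED(BA). The function name and the numeric values below are hypothetical, and the model assumes a single clean turn switch with no overlapping speech.

```python
def silence_heard_by_first_speaker(think_ms, med_out_ms=0, med_back_ms=0):
    """Silence between the end of one's own speech and hearing the reply.

    think_ms: time the other party takes before replying.
    med_out_ms: mouth-to-ear delay toward the other party, e.g. MED(AB).
    med_back_ms: mouth-to-ear delay of the reply, e.g. MED(BA).
    With both delays at zero this reduces to the face-to-face setting.
    """
    return med_out_ms + think_ms + med_back_ms


think = 500              # hypothetical think time of B, in ms
med_ab, med_ba = 150, 250  # hypothetical mouth-to-ear delays, in ms

face_to_face = silence_heard_by_first_speaker(think)
as_seen_by_a = silence_heard_by_first_speaker(think, med_ab, med_ba)
as_seen_by_b = think     # B hears A's end, thinks, then starts speaking

print(face_to_face, as_seen_by_a, as_seen_by_b)
```

Under these assumed values, A waits 900 ms for a reply that B believes took only 500 ms, so the two parties experience different conversational rhythms even though the underlying exchange is the same.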