
New Models for Perceived Voice Quality Prediction and their Applications in Playout Buffer Optimization for VoIP Networks

New Models for Perceived Voice Quality Prediction and their Applications in Playout Buffer Optimization for VoIP Networks. Dr. Lingfen Sun, Prof Emmanuel Ifeachor. University of Plymouth, United Kingdom. {L.Sun; E.Ifeachor}@plymouth.ac.uk.


Presentation Transcript


  1. New Models for Perceived Voice Quality Prediction and their Applications in Playout Buffer Optimization for VoIP Networks Dr. Lingfen Sun Prof Emmanuel Ifeachor University of Plymouth United Kingdom {L.Sun; E.Ifeachor}@plymouth.ac.uk

  2. Outline • Background • Speech quality for VoIP networks • Current status • Aims of the project • Main Contributions • Novel non-intrusive voice quality prediction models • Novel perceptual-based speech quality optimization (e.g. jitter buffer optimization) mechanism • Conclusions and Future Work

  3. Background – Speech Quality for VoIP Networks • VoIP speech quality: end-user perceived quality (MOS) is an important metric. • It is affected by IP network impairments and other impairments. • Voice quality measurement: subjective (MOS) or objective (intrusive or non-intrusive). [Figure: an IP network between two gateways and two SCNs; intrusive measurement uses reference speech and degraded speech, non-intrusive measurement does not; both yield MOS, the end-to-end perceived speech quality.] SCN: Switched Comm. Networks (PSTN, ISDN, GSM …)

  4. Current Status and Problems • Lack of an efficient non-intrusive speech quality measurement method • E-model (a complicated computational model) • Based on subjective tests to derive models/parameters, time-consuming and expensive. Only limited models exist • Lack of perceptual optimization control methods • only based on individual network parameters for buffer optimization and QoS control purposes • not perceptual-based optimization control

  5. Aims of the Project • To develop novel and efficient methods/models for non-intrusive quality prediction. • To apply the models to perceptual-based optimization control (e.g. buffer optimization or adaptive sender-bit-rate QoS control). [Figure: sender (voice source, encoder, packetizer), IP network, receiver (jitter buffer, de-packetizer, decoder, voice receiver); non-intrusive measurement of end-to-end perceived voice quality (MOS).]

  6. Novel Non-intrusive Voice Quality Prediction • Based on intrusive quality measurement (e.g. PESQ) to predict voice quality non-intrusively, which avoids subjective tests. • A generic method which can be applied to audio, image and video. [Figure: intrusive method: reference speech and degraded speech (packet loss, delay, codec …) from the VoIP network feed PESQ, giving MOS (PESQ), which is combined with delay through the E-model to give the measured MOSc; non-intrusive method: a new model (regression or ANN models) gives the predicted MOSc.]

  7. New Structure to Obtain MOSc • PESQ can only predict one-way listening speech quality (expressed as MOS). • With a new combined PESQ/E-model structure, a conversational speech quality (MOSc) can be obtained as the measured MOSc. [Figure: reference speech and degraded speech → PESQ → MOS (PESQ) → R → Ie; end-to-end delay → delay model → Id; Ie and Id → E-model → MOSc.]
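The combined PESQ/E-model structure on this slide can be sketched as follows. The R-to-MOS mapping is the standard one from ITU-T G.107; everything else is an assumption for illustration and not taken from the slides: the default rating `R0 = 93.2`, the simplified delay impairment `Id = 0.024d + 0.11(d - 177.3)H(d - 177.3)` (a commonly used approximation), and the function names themselves.

```python
R0 = 93.2  # assumed default E-model rating with no loss/delay impairment


def r_to_mos(r):
    """ITU-T G.107 mapping from rating factor R to MOS."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def mos_to_r(mos, lo=6.5, hi=100.0, tol=1e-6):
    """Invert r_to_mos by bisection (the mapping is increasing on [lo, hi])."""
    mos = min(max(mos, 1.0), 4.5)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if r_to_mos(mid) < mos:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def delay_impairment(delay_ms):
    """Assumed simplified Id model; delay is one-way end-to-end, in ms."""
    extra = 0.11 * (delay_ms - 177.3) if delay_ms > 177.3 else 0.0
    return 0.024 * delay_ms + extra


def measured_mosc(mos_pesq, delay_ms):
    """MOS (PESQ) -> R -> Ie, then combine Ie with Id to get MOSc."""
    ie = max(0.0, R0 - mos_to_r(mos_pesq))  # loss/codec impairment
    r = R0 - ie - delay_impairment(delay_ms)
    return r_to_mos(r)
```

For example, `measured_mosc(3.5, 150.0)` returns a conversational score somewhat below the listening-only PESQ score of 3.5, because the delay term lowers R.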

  8. Regression based Models (1) • Nonlinear regression models are derived for Ie based on PESQ/PESQ-LQ. • Ie is further combined with Id to obtain MOSc. [Figure (a): codec and packet loss → Ie model → Ie; delay (d) → Id model → Id; Ie and Id → E-model → MOSc. Figure (b): speech database → reference speech → encoder, loss model, decoder → degraded speech; PESQ/PESQ-LQ → MOS (PESQ) → R → measured Ie; nonlinear regression model (Ie model) → predicted Ie.]

  9. Regression based Models (2) • Ie can be modelled by a logarithmic fitting function [equation not preserved in this transcript]. • Parameters for different codecs (PESQ) [table not preserved in this transcript].
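As a rough illustration of a logarithmic fit of Ie against packet loss, the sketch below assumes a three-parameter form `Ie = a*ln(1 + b*rho) + c`, with `rho` the packet loss rate in percent. The functional form, the parameter names, the grid search over `b` and the synthetic data points are all assumptions made for illustration; they are not the slide's equation or its per-codec parameters.

```python
import math


def ie_model(loss_pct, a, b, c):
    """Assumed logarithmic form: Ie = a * ln(1 + b * loss) + c."""
    return a * math.log(1.0 + b * loss_pct) + c


def fit_ie(loss_pct, ie_values, b_grid):
    """Grid-search b; for each b, solve a and c by linear least squares."""
    n = len(loss_pct)
    mean_y = sum(ie_values) / n
    best = None
    for b in b_grid:
        x = [math.log(1.0 + b * p) for p in loss_pct]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ie_values))
        a = sxy / sxx
        c = mean_y - a * mean_x
        sse = sum((ie_model(p, a, b, c) - y) ** 2
                  for p, y in zip(loss_pct, ie_values))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    sse, a, b, c = best
    sst = sum((y - mean_y) ** 2 for y in ie_values)
    return {"a": a, "b": b, "c": c, "sse": sse, "r2": 1.0 - sse / sst}


# Synthetic, noise-free data generated from hypothetical parameters.
loss = [0.0, 1.0, 2.0, 3.0, 5.0, 8.0, 10.0, 15.0, 20.0]
ie = [ie_model(p, 30.0, 0.2, 10.0) for p in loss]
fit = fit_ie(loss, ie, [0.05 * k for k in range(1, 21)])
```

The returned `sse` and `r2` are the same kind of goodness-of-fit figures (SSE and R²) that the deck quotes for its fitted models. With real PESQ-derived Ie measurements, a proper nonlinear least-squares routine would replace the coarse grid over `b`.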

  10. Regression Models for AMR (12.2 kb/s) • For example, for AMR (12.2 kb/s), the goodness of fit is SSE = 2.83 and R² = 0.998. [Figure: MOS vs. packet loss and delay]

  11. Perceptual-based Buffer Optimization • Motivation: • Existing buffer algorithms are based only on individual network parameters (e.g. delay or loss), • targeting only minimum average delay or minimum late arrival loss, not maximum MOS. • There is a need to design a buffer algorithm that achieves optimum perceived speech quality. • Contribution • A perceptual-based optimization jitter buffer algorithm • Use regression based models for buffer optimization • Use a minimum impairment criterion instead of the traditional maximum MOS score • A Weibull delay distribution based on trace analysis • A perceptual-based optimization of the playout buffer algorithm

  12. Impairment Function Im • Define: impairment function Im [equation not preserved in this transcript; surviving labels: Weibull distribution, buffer loss ρb, playout delay d]

  13. Minimum Impairment Criterion • Define: minimum impairment criterion. Given: network delay dn, network loss ρn and codec type. Estimate: an optimized playout delay dopt. Such that: the minimum Im is reached. [Figure labels: d1, d2, d3, d4, minimum Im]
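A minimal sketch of the minimum impairment criterion, under stated assumptions: the impairment is taken as `Im(d) = Id(d) + Ie(rho)`, the total loss combines network loss and late-arrival (buffer) loss as `rho = rho_n + (1 - rho_n) * rho_b`, and `rho_b` is the tail probability of a three-parameter Weibull delay distribution at playout delay `d`. The exact definition of Im is not preserved in this transcript, so this form, the Id and Ie expressions, and all numeric parameters below are hypothetical.

```python
import math


def weibull_tail(d, loc, scale, shape):
    """P(network delay > d) for a three-parameter Weibull distribution."""
    if d <= loc:
        return 1.0
    return math.exp(-(((d - loc) / scale) ** shape))


def delay_impairment(d_ms):
    """Assumed simplified Id model (delay in ms)."""
    extra = 0.11 * (d_ms - 177.3) if d_ms > 177.3 else 0.0
    return 0.024 * d_ms + extra


def loss_impairment(loss_pct, a=30.0, b=0.2, c=10.0):
    """Assumed logarithmic Ie model with hypothetical parameters."""
    return a * math.log(1.0 + b * loss_pct) + c


def impairment(d_ms, net_loss, weibull):
    """Im(d): delay impairment plus impairment from network + buffer loss."""
    buf_loss = weibull_tail(d_ms, *weibull)
    total_loss = net_loss + (1.0 - net_loss) * buf_loss
    return delay_impairment(d_ms) + loss_impairment(100.0 * total_loss)


def optimum_playout_delay(net_loss, weibull, d_max=400):
    """Search playout delays in 1 ms steps for the minimum Im."""
    candidates = range(int(weibull[0]), d_max + 1)
    return min(candidates, key=lambda d: impairment(d, net_loss, weibull))


# Hypothetical inputs: 1% network loss; Weibull (location, scale, shape).
params = (50.0, 20.0, 1.5)
d_opt = optimum_playout_delay(0.01, params)
```

A short playout delay drops many late packets and a long one adds delay impairment, so the search settles on an interior value where the two effects balance.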

  14. Perceptual-based Optimization Buffer Algorithm • For every packet i received, calculate network delay ni • if mode == SPIKE then • if ni ≤ tail*old_d then • mode = NORMAL • elseif ni > head*di then • mode = SPIKE; old_d = di • else • update delay records for the past W packets • endif • At the beginning of a talkspurt • if mode == SPIKE then • di = ni • else • obtain the three Weibull distribution parameters for the past W packets • search for the playout delay d which meets the minimum Im criterion • endif
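The control flow above can be sketched as follows. This is one reading of the flattened pseudocode, in which the `tail` test is nested inside the SPIKE branch. To keep the example short, the Weibull fit is swapped for the empirical distribution of the last W delays when estimating late-arrival loss; that substitution, the `HEAD`, `TAIL` and `W` values, and the Id/Ie expressions are assumptions for illustration, not values from the slides.

```python
import math
from collections import deque

HEAD, TAIL, W = 4.0, 2.0, 200  # hypothetical thresholds and window size


def impairment(d, window, net_loss=0.0):
    """Im(d) with late loss estimated from the empirical delay window."""
    late = sum(1 for n in window if n > d) / len(window)
    loss_pct = 100.0 * (net_loss + (1.0 - net_loss) * late)
    i_d = 0.024 * d + (0.11 * (d - 177.3) if d > 177.3 else 0.0)
    i_e = 30.0 * math.log(1.0 + 0.2 * loss_pct) + 10.0  # hypothetical Ie
    return i_d + i_e


class PlayoutBuffer:
    def __init__(self, initial_delay):
        self.mode = "NORMAL"
        self.d = initial_delay       # current playout delay d_i
        self.old_d = initial_delay   # playout delay when the spike began
        self.last = initial_delay    # most recent network delay n_i
        self.window = deque(maxlen=W)

    def on_packet(self, n_i):
        """Per-packet spike detection and delay record update."""
        self.last = n_i
        if self.mode == "SPIKE":
            if n_i <= TAIL * self.old_d:
                self.mode = "NORMAL"
        elif n_i > HEAD * self.d:
            self.mode = "SPIKE"
            self.old_d = self.d
        else:
            self.window.append(n_i)

    def on_talkspurt_start(self):
        """Pick the playout delay for the new talkspurt."""
        if self.mode == "SPIKE":
            self.d = self.last       # follow the spike: d_i = n_i
        elif self.window:
            # With an empirical distribution, the minimum lies at an
            # observed delay value, so only those need to be searched.
            candidates = sorted(set(self.window))
            self.d = min(candidates, key=lambda d: impairment(d, self.window))
        return self.d
```

Feeding the buffer steady delays and then one very large delay switches it into SPIKE mode, where the playout delay simply tracks the latest network delay until delays fall back below `TAIL * old_d`.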

  15. Performance Analysis and Comparison (1) • Selected five traces from UoP to CU (USA), DUT (Germany), BUPT (China), and NC (China). • Traces 1 and 3 with high delay variation and traces 2, 4, 5 with low delay variation

  16. Performance Analysis and Comparison (2) • “p-optimum” algorithm achieves the optimum voice quality for all traces. • “adaptive” algorithm achieves sub-optimum quality with low complexity.

  17. Conclusions and Future Work • Conclusions • The development of a new methodology and regression models to predict voice quality non-intrusively. • Demonstrated the application of new non-intrusive voice quality prediction models to perceptual-based optimization of playout buffer algorithms. • Future Work • To consider buffer adaptation during a talkspurt in order to achieve the best trade-off between delay, loss and end-to-end jitter. • To extend the work to improve the performance of multimedia services (e.g. audio/image/video) over IP networks

  18. Contact Details • http://www.tech.plymouth.ac.uk/spmc • Dr. Lingfen Sun L.Sun@plymouth.ac.uk • Prof Emmanuel Ifeachor E.Ifeachor@plymouth.ac.uk • Any questions? Thank you!
