Tools and Processes for Testing VoIP Chris Bajorek, Director, CT Labs www.ct-labs.com
About the Speaker Chris Bajorek, Director and Founder, CT Labs Chris Bajorek is a 25-year veteran of computer telephony and converged communications. Bajorek has led the company to its industry-leading position in testing services which include real-world performance testing, interoperability verification, and usability and quality analysis. Customers include first-tier enterprise and carrier-grade next-generation network product manufacturers. Prior to founding CT Labs, Bajorek founded Telephone Response Technologies, Inc. (TRT), which developed and sold turnkey voice response and unified messaging products as well as award-winning toolkits for rapid development of voice-based applications. Prior to TRT he worked for Integrated Office Systems and Time and Space Processing where he performed pioneering work on voicemail and digital voice communications products. Bajorek holds a B.S.E.E. from Cal Poly, San Luis Obispo.
For Today’s Talk: Taking a Developer’s Perspective to VoIP Test • Much of CT Labs’ business is with R&D and QA groups of VoIP product manufacturers • Would like to provide a window into some of our VoIP test experiences, including • Common VoIP test myths • Testing tips and suggestions • Focus on voice quality testing—hot area for VoIP
Myths around VoIP Deployment • Voice quality is a given • VoIP is easy to deploy • VoIP is inexpensive to deploy • All VoIP-enabled phones are created equal • Once you have your VoIP network set up, you can leave it alone
VoIP Requires a Lifecycle Approach • Lack of a proper lifecycle will: • Drive Costs Up • Reduce VoIP Reliability / Availability • Risk Complete Failure of Deployment • New VoIP products should be designed with this in mind
VoIP Troubleshooting Areas – The Big Picture • Call Processing (i.e. call connectivity, service availability) • Voice Quality • Interoperability / Feature Interaction • Configuration / Registration • Routing • Security • Applications (conferencing, IVR, voicemail, …)
Troubleshooting example • Symptom: sporadic call “failures” • Common causes: • Gateway and switch misconfiguration • Interoperability issues between equipment • Capacity limitations • Performance issues and delays triggering timeouts • “Feature interaction” issues such as conflicting call-forwarding settings
VoIP Deployment Segments • Residential (Voice over Broadband) • Enterprise • Next-Gen Network Carriers and Service Providers All three areas are quite active now…
VoIP Products, by Segment (products that “touch” the media stream) • Residential • Analog terminal adapters, VoIP softphones, residential routers • Enterprise • IP PBXs, IP Contact Centers, VoIP phones & softphones, firewalls/ALGs, media servers (conferencing, voice mail) • Next-Gen Carriers and Service Providers • Session border controllers, media servers, media gateways, transcoding/VQ enhancement processors
VoIP Testing Areas of Focus • Service reliability • i.e. availability of service, call connectivity • Voice quality • Includes measurement of VQ, latency, levels, echo cancellation, etc. • “Phone” features • CLASS features, such as call park, transfer, etc. • VoIP access to enhanced services • Voice mail, conferencing, IVR, etc. • Each of these areas has its own set of testing challenges, but one thing is clear: all relate to the end-user Quality Experience and must be validated
Active versus Passive VoIP Testing • Active tests • Involve driving real 2-way calls through the VoIP network • Benefits: more accurate, use mature standards (PESQ, etc.) for automated quality assessment • Negatives: consume network resources • Passive tests • Involve passive evaluation of call-based packet flows • Ignore (or model) how specific VoIP endpoints respond to network conditions
Post-Deployment, Passive Testing is Key • Deployed VoIP networks should: • Continuously monitor passive VQ, call completion rates, network packet loss, jitter, & latency • Set alarm thresholds for VoIP call performance that degrades below adaptive-corrective levels • Assumption: Pre-deployment tests resulted in… • Clean bill of network health • Baseline characterization of network during peak, off-peak times
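A minimal sketch of the alarm-threshold idea above, assuming per-call statistics are already available from a passive monitor. The `CallStats` fields and every limit in `THRESHOLDS` are hypothetical placeholders; real limits would come from the pre-deployment baseline.

```python
from dataclasses import dataclass


@dataclass
class CallStats:
    """Per-call figures reported by a passive monitor (hypothetical shape)."""
    est_mos: float      # estimated MOS for the call
    loss_pct: float     # packet loss, percent
    jitter_ms: float    # interarrival jitter, milliseconds
    latency_ms: float   # one-way latency, milliseconds


# Hypothetical limits for illustration only; derive real ones from the
# baseline characterization of the network at peak and off-peak times.
THRESHOLDS = {"est_mos": 3.6, "loss_pct": 1.0, "jitter_ms": 30.0, "latency_ms": 150.0}


def check_alarms(stats, thresholds=THRESHOLDS):
    """Return the names of the metrics that crossed their threshold."""
    alarms = []
    # Estimated MOS alarms when it falls below its limit.
    if stats.est_mos < thresholds["est_mos"]:
        alarms.append("est_mos")
    # The network metrics alarm when they rise above their limits.
    for name in ("loss_pct", "jitter_ms", "latency_ms"):
        if getattr(stats, name) > thresholds[name]:
            alarms.append(name)
    return alarms
```

For example, `check_alarms(CallStats(3.2, 2.5, 10, 80))` flags the estimated MOS and the packet loss while leaving jitter and latency alone.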
Passive Monitoring “Embedded Components” for Product Developers • Products incorporating these can quickly adapt to changing IP network conditions • Real-time access to estimated MOS, round-trip latency • Access to level and echo information for estimate of MOS-Conversational Quality • VQMon – from Telchemy (www.telchemy.com) • PsyVoIP -- from Psytechnics (www.psytechnics.com)
A few things about Codecs • Waveform codecs • Produce a waveform as close as possible to the original (G.711 PCM, G.726 ADPCM) • Source codecs • Use a model of how speech is generated • Can significantly alter the time-domain waveform while sounding very similar to the input (G.729a/729, G.723.1)
A few things about Codecs • Hybrid codecs • Combine techniques from waveform and source codecs • Use different modes and bit rates depending on network conditions • AMR • Bit rate: 4.75-12.2 kbps MIPS complexity: 15-20 • AMR-WB / G.722.2 (wideband, 7 kHz signal bandwidth) • Bit rate: 6.6-23.85 kbps MIPS complexity: 38 (incl. VAD and CNG) • Why knowledge of codec method(s) is useful for VQ analysis
Devices that can affect a User’s “VoIP Experience” • IP PBXs • IP Phones & VoIP endpoints • Media Gateways • IVR / Voice portals • SBCs (Border Controllers) • Media Servers • Firewalls/ALGs • Messaging Servers • Conference Bridges
Voice Quality versus Intelligibility • Voice quality: the “acceptability” of speech • Intelligibility: the “clarity” of speech • Subjective tests: Diagnostic Rhyme Test, Modified Rhyme Test • Higher frequencies more important for intelligibility, a good benefit of wideband codecs • Lower quality affects intelligibility but not necessarily vice versa
Voice Quality Measurement – A Hot Topic • What is considered the “gold standard” way to measure voice quality? • Answer: with humans, and the more of them in a listening session the better the resolution of the resulting quality scores • However, conducting a live-listener test is not as easy or cheap as you may think…
MOS Subjective Testing • It’s a Standard: ITU-T P.800 (1996) • The technique rates quality using “absolute category rating” method (ACR) 5-grade scale: 5=excellent 4=good 3=fair 2=poor 1=bad
MOS Subjective Testing • How it’s done • Requires use of a group of 32-64 “naive” listeners • Standardized male, female, and child phrases are used • Calibrating “reference” degraded conditions are intermixed with actual samples • The identical speech sample sets are played to all listeners • Listeners judge the quality of each phrase using ACR scale
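As a rough illustration of how listener votes become a MOS, the sketch below averages ACR scores for one test condition and attaches a 95% confidence interval. The normal approximation (`z=1.96`) and the function name are assumptions made for this example, not part of P.800.

```python
import math
import statistics


def mos_with_ci(scores, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    `scores` holds one ACR vote (an integer from 1 to 5) per listener for a
    single test condition. The interval uses a normal approximation, which
    is a simplification chosen for this sketch.
    """
    if len(scores) < 2:
        raise ValueError("need at least two votes to estimate spread")
    if any(s not in (1, 2, 3, 4, 5) for s in scores):
        raise ValueError("ACR votes must be integers from 1 to 5")
    mos = statistics.mean(scores)
    half_width = z * statistics.stdev(scores) / math.sqrt(len(scores))
    return mos, half_width
```

With the same spread of opinions, four times as many listeners roughly halves the interval, which is why larger panels resolve smaller quality differences.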
MOS Subjective Testing • Strengths • Provides the definitive answer to “which sounds best?” • Weaknesses • High cost, especially when many different test conditions or sample sets must be evaluated • Takes time to schedule test and get results
Objective VQ Standards • All automated VQ measurement techniques are designed to estimate the way humans perceive voice quality • PSQM P.861 (1996) • PSQM+ handled higher distortion levels than PSQM • PESQ P.862 (2001) • Solved variable delay (“alignment”) problem of PSQM
What PESQ VQ Testing is designed for • PESQ is a way to quickly and cost-effectively estimate the effects of one-way speech distortion and noise on speech quality • PESQ is “endpoint-agnostic” – can be used for VoIP-to-VoIP, VoIP-to-PSTN calls, etc. • PESQ can be used for VQ assessment of wideband codecs if your test platform supports it (if not, 3.1kHz signal bandwidth applies)
What PESQ VQ Testing is not designed for • PESQ does not evaluate the effects of loudness loss, fixed latency, sidetone, or echo as related to two-way caller interactions • PESQ cannot safely be used to declare a VQ “winner” when the PESQ score differential is small (e.g. < 0.25) • “Opposite conclusion” errors are very possible, so the bigger the score differential the better • Especially true when comparing samples with more than a single changed “variable”
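One way to build the small-differential caution into an automated comparison is to refuse to name a winner below a margin. This is only a sketch: the function name is hypothetical, and the 0.25 default simply mirrors the figure quoted on the slide.

```python
def compare_pesq(score_a, score_b, min_margin=0.25):
    """Name a winner only when the PESQ scores differ by at least min_margin.

    The 0.25 default mirrors the differential quoted above; treat it as a
    rule of thumb, not a statistically derived limit.
    """
    diff = score_a - score_b
    if abs(diff) < min_margin:
        return "inconclusive"
    return "A" if diff > 0 else "B"
```

A result of `"inconclusive"` is a cue to gather more samples or to fall back on a live-listener test rather than to pick a side.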
Objective VQ Testing • Strengths • Provides excellent estimate of voice quality • Tests can be performed quickly • Tests are very repeatable • Weaknesses • Not good for reliably resolving small differences in quality scores
Troubleshooting VQ Issues • Must look at all the metrics of VoIP calls exactly as transmitted on the network • Packet Loss ? • Jitter ? • Delay ? • Voice Quality ? • Measurement is critical for problem resolution [Figure: jitter distribution graph]
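Two of these metrics can be computed directly from a captured RTP stream. The sketch below uses the running interarrival jitter estimator from RFC 3550 and a simple sequence-number loss count. The input shapes and function names are assumptions for illustration, and sequence-number wraparound is deliberately ignored.

```python
def interarrival_jitter_ms(packets, clock_rate=8000):
    """RFC 3550 style running jitter estimate, returned in milliseconds.

    `packets` is a list of (arrival_time_seconds, rtp_timestamp) tuples in
    arrival order. `clock_rate` is the RTP clock in Hz (8000 for G.711).
    """
    jitter = 0.0
    prev = None
    for arrival_s, rtp_ts in packets:
        arrival_units = arrival_s * clock_rate  # arrival time in RTP clock units
        if prev is not None:
            # Change in transit time between this packet and the previous one.
            d = (arrival_units - prev[0]) - (rtp_ts - prev[1])
            # Smoothed estimate with gain 1/16, as in RFC 3550.
            jitter += (abs(d) - jitter) / 16.0
        prev = (arrival_units, rtp_ts)
    return jitter / clock_rate * 1000.0


def loss_percent(seq_numbers):
    """Percent of packets missing between the lowest and highest sequence
    number seen. Assumes no 16-bit wraparound, a simplification."""
    expected = max(seq_numbers) - min(seq_numbers) + 1
    received = len(set(seq_numbers))
    return 100.0 * (expected - received) / expected
```

A perfectly paced stream of 20 ms packets yields a jitter near zero; delaying a single packet makes the estimate jump and then decay over the following packets.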
Tip: How to test end-to-end VQ of VoIP phones • #1: It’s usually not enough to evaluate VQ by just looking at the packet streams (e.g. with the E-model) • #2: Must evaluate quality all the way to the phone’s earpiece and microphone wires • This lets you evaluate the proper operation of the phone’s internal “VoIP gateway”, including automatic gain control (AGC), voice activity detection (VAD), comfort noise generation (CNG), echo cancellation, codecs, jitter buffer management, and packet loss concealment algorithms. • In other words, there is much that can go wrong.
Tip: How to test end-to-end VQ of VoIP phones • #3: Must evaluate under expected LAN/WAN impairment conditions • Packet loss, Jitter, Latency • Effective bandwidth of IP connection • e.g. Broadband versus Dialup • #4: Don’t forget interoperability testing against other VoIP devices • Verify VQ against the other manufacturers’ devices the phone is expected to work with
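To see why effective bandwidth matters, it helps to work out what one call direction costs at the IP layer. The sketch below assumes IPv4 + UDP + RTP headers (40 bytes), a 20 ms packetization interval, and no link-layer overhead or header compression; the function name is hypothetical.

```python
# IPv4 (20) + UDP (8) + RTP (12) header bytes; link-layer overhead excluded.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def ip_bandwidth_kbps(codec_kbps, packet_ms=20):
    """IP-layer bandwidth of one direction of a call, in kbps."""
    payload_bytes = codec_kbps * packet_ms / 8   # kbps * ms = bits per packet
    packets_per_second = 1000 / packet_ms
    total_bits = (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8
    return total_bits * packets_per_second / 1000
```

Under these assumptions G.711 at 64 kbps needs 80 kbps per direction, more than a dialup link can carry, while G.729 at 8 kbps needs 24 kbps.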
Testing end-to-end VQ of VoIP phones • The automated VQ test • Important for verifying VQ under many conditions • Vary one dimension at a time during subsequent test runs • The manual VQ “real user” test • Conduct 2-way calls with real users who are familiar with potential echo cancellation and other 2-way effects • Include handset and speakerphone test calls
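The "vary one dimension at a time" rule can be turned into a test plan mechanically. A minimal sketch, assuming impairment conditions are described as plain dictionaries; the parameter names used in the usage note are hypothetical.

```python
def one_at_a_time(baseline, variations):
    """Build test runs that each change exactly one setting from the baseline.

    `baseline` maps setting name to its default value; `variations` maps a
    setting name to the alternative values to try. The first run returned is
    the unmodified baseline.
    """
    runs = [dict(baseline)]
    for name, values in variations.items():
        for value in values:
            if value == baseline[name]:
                continue  # already covered by the baseline run
            run = dict(baseline)
            run[name] = value
            runs.append(run)
    return runs
```

For a baseline of `{"loss_pct": 0, "jitter_ms": 0, "latency_ms": 20}` and variations `{"loss_pct": [0, 1, 3], "jitter_ms": [20, 40]}`, this produces five runs: the baseline, two loss runs, and two jitter runs. Any change in score between a run and the baseline can then be attributed to a single variable.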
Testing end-to-end VQ of VoIP phones • Test setup examples • Softphone to softphone test • VoIP Phone to VoIP Phone test (in lab) • VoIP Phone to PSTN call test • Variations on these themes easily set up • Wideband codecs used? If so, be sure to verify that all test equipment in the audio/media signal path can support 8 kHz.
Testing Softphone-to-Softphone • Media may flow peer-to-peer or through the VoIP network component • PESQ is evaluated off-line via a batch process
Testing VoIP Phone-to-VoIP Phone • Good setup when an isolated device performance test is needed. • Phone calls are manually placed with this setup.
Example: WAN Impairment Conditions for VQ Test • Conditions suitable for emulation of overseas Internet dialup conditions
Watch out for… • Do not try to compare “MOS” scores derived from different sources or evaluation engines • Even the numeric range from “worst” to “best” can vary (e.g. “best” = 4.5, not 5.0) • Especially, don’t compare passive with active VQ results
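One place a 4.5 ceiling shows up is the E-model (ITU-T G.107), whose rating factor R is converted to an estimated MOS with a formula that tops out at 4.5. The sketch below implements only that R-to-MOS conversion, not the computation of R itself; the function name is an assumption.

```python
def r_to_mos(r):
    """Convert an E-model rating factor R to an estimated MOS.

    Uses the G.107 conversion: the result is clamped to 1.0 for R <= 0 and
    to 4.5 for R >= 100, so it can never reach 5.0.
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
```

An R of 93.2 maps to roughly 4.41 on this scale, while a listening panel can in principle score up to 5.0, which is one reason scores from different engines should not be placed side by side.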
Real-World Next-Gen Network Product Testing • www.ct-labs.com • 916-577-2100 • Chris Bajorek • chris@ct-labs.com • 916-577-2110 direct line