Transforming Contact Centers with Speech and IP. Jack Chase, Director of Product Management, NMS. Rob Kassel, Senior Manager, Network Speech Products, Nuance. Agenda: The Evolution of Contact Centers (business trends, architectures); Speech Technology Update (Rob Kassel, Nuance).
Transforming Contact Centers with Speech and IP
Jack Chase, Director of Product Management, NMS
Rob Kassel, Senior Manager, Network Speech Products, Nuance
Agenda
• The Evolution of Contact Centers
• Business trends
• Architectures
• Speech Technology Update (Rob Kassel, Nuance)
• MRCP-enabled speech
www.nmscommunications.com
Contact Center Evolution
Evolution of Contact Centers: Business Trends
First Generation
• Stand-alone sites
• Limited PBX routing
• Customer talks into phone
• Agent types into computer
• Hardware-based
Second Generation
• Single and distributed sites
• Some use of IVRU and ACD
• Screen pops
• Some call routing via ACD
Third Generation
• Virtual Call Center
• IVRU & ACD integration
• Multi-media access: email, fax, web
• Integrated ERP/CRM
• Skills-based routing
Cost Center to Profit Center: Integration and Technology Solving Business Problems
The Obvious Cost Savings Target
Source: Benchmark Portal, 2002
The Cost of Customer Interaction is Reduced with Self Service
Source: Gartner Group, 2002
Evolution of Contact Centers: Technology Trends
• Self-service using web, ASR, and TTS is reducing the dependency on live agents and lowering costs
• Web, email, and messaging are freely mixed with phone calls in a single queue
• Network-based contact centers are becoming a significant phenomenon
• VoIP is lowering system costs at the agent and between system components
• By 2007, 30% of contact center agents will be on VoIP
Circuit-Based Contact Center
[Diagram components: PSTN, CRM, CTI, IVR, ACD. Legend: Circuit, Data]
VoIP in an IP Contact Center
[Diagram components: PSTN, CRM, Operations Center, Contact Center (ACD + CTI + IVR + Speech), Self-Service, IP-PBX, Site A, Site B. Legend: Circuit, Data, VoIP]
Upgrading with MRCP and VXML
[Diagram components: PSTN, CRM, Operations Center, VXML Server, Application Server, Speech Server, Media Server, IP-PBX, Site A, Site B. Link labels: VXML, MRCP, SIP, CCXML, RTP. Legend: Circuit, Data, VoIP]
Speech Technology Update
Rob Kassel, Senior Manager, Network Speech Products, Nuance
www.nuance.com
The Need For Speech Recognition
• DTMF often is used for customer self-service
  • Numeric entry is easy… unless you are reading
  • Spelling entry is more difficult
  • Menus need to be enumerated, can’t be too long
  • Deep menu structure becomes tiresome
  • Assignment inconsistent between vendors (e.g., voicemail)
  • How do you enter “5 ½%” or “Albuquerque”?
• With speech, questions are answered naturally
  • Caller satisfaction is higher
  • Fewer zero-outs lead to additional cost savings
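The difficulty of spelling over DTMF can be illustrated with a minimal sketch. It assumes the common letter layout found on most phone keypads (the slide does not specify one), and the function name `word_to_dtmf` is made up for illustration. Different words collapse to the same key sequence, so the application has to guess or ask again.

```python
# Letter groups on a common telephone keypad (an assumption; the slide
# does not say which layout is meant).
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def word_to_dtmf(word: str) -> str:
    """Return the key sequence a caller would press to spell `word`.

    Characters that are not on the keypad (spaces, punctuation) are skipped.
    """
    return "".join(
        LETTER_TO_DIGIT[ch] for ch in word.upper() if ch in LETTER_TO_DIGIT
    )


albuquerque_keys = word_to_dtmf("Albuquerque")  # eleven key presses
home_keys = word_to_dtmf("home")
good_keys = word_to_dtmf("good")
ambiguous = home_keys == good_keys  # two different words, one key sequence
```

A caller who keys in a word this way gives the application only the digit string, which is why long or open-ended entries are better served by speech.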
Speech Recognition Process
[Diagram components: Speech, Speech Detector, Feature Extraction, Acoustic Models, Phoneme Classifier, Grammar, Grammar Compiler, System Dictionary, Pronunciation Rules, Search, Confidence Scoring, Results]
Speech Recognition Challenges
• Processor and memory demands
• Speech can be difficult to decode, even for humans
• Fixed, confusable vocabularies: “B-C-D-E-G-P-T-V-Z”
• Ambiguous boundaries: “It’s hard to wreck a nice beach!”
• Speaker variability: dialect, volume, gender, etc.
• Noise rejection: hands-free, mobile, telematics
• Out-of-vocabulary rejection & confidence measures
• Callers don’t always say what you might expect… Yes or no?
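One common way applications use confidence measures is to pick the next dialog step from the score. The sketch below is illustrative only: the `Hypothesis` type, the 0-to-1 score range, and both threshold values are assumptions, not taken from any particular recognizer.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    confidence: float  # assumed range 0.0 to 1.0 (hypothetical recognizer)


# Hypothetical thresholds; real deployments tune these per application.
ACCEPT_AT = 0.80
CONFIRM_AT = 0.45


def next_dialog_step(hyp: Hypothesis) -> str:
    """Map a recognition result to accept, confirm, or reprompt."""
    if hyp.confidence >= ACCEPT_AT:
        return "accept"    # trust the result and move on
    if hyp.confidence >= CONFIRM_AT:
        return "confirm"   # e.g. "Did you say B?"
    return "reprompt"      # likely noise or out-of-vocabulary speech


steps = [
    next_dialog_step(Hypothesis("yes", 0.93)),
    next_dialog_step(Hypothesis("B", 0.60)),
    next_dialog_step(Hypothesis("umm", 0.20)),
]
```

Raising `ACCEPT_AT` trades fewer wrong answers for more confirmation prompts, which matters most for confusable vocabularies such as the letter set above.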
Speech Recognition: State of the Art
• Callers speak naturally in directed dialogs
• High accuracy, infrequent confirmation
• Million-word vocabularies: stocks, proper names, street addresses
• Scripting to control values returned to application: “half past three” can return “1530” or “afternoon”
• Open-ended responses, especially for call routing
• Allows for questions like “How may I help you?”
• Based on statistical methods trained from examples
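The scripting idea can be sketched as a small interpretation function that turns the recognized words into whatever value the application wants. This is a toy that only handles "half past <hour>"; the function name, the returned keys, and the default of reading the hour as PM (which is what makes "half past three" come out as 1530) are assumptions for illustration.

```python
HOUR_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11, "twelve": 12,
}


def interpret_half_past(utterance: str, assume_pm: bool = True) -> dict:
    """Return application-ready values for an utterance like 'half past three'."""
    words = utterance.lower().split()
    if len(words) != 3 or words[:2] != ["half", "past"] or words[2] not in HOUR_WORDS:
        raise ValueError(f"unsupported utterance: {utterance!r}")

    hour = HOUR_WORDS[words[2]] % 12      # twelve becomes 0 before the AM/PM shift
    if assume_pm:
        hour += 12

    if hour < 12:
        part_of_day = "morning"
    elif hour < 18:
        part_of_day = "afternoon"
    else:
        part_of_day = "evening"

    return {"time": f"{hour:02d}30", "part_of_day": part_of_day}


result = interpret_half_past("half past three")
```

The application can then read `result["time"]` or `result["part_of_day"]`, depending on whether it needs an exact time or a coarse slot.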
The Need For Text-To-Speech
• Professional recordings best for fixed content
• Word concatenation is difficult to do well
• Often used for numeric output
• Can sound mechanical; irritating when frequent
• Large output vocabularies fairly common (e.g. city names)
• Some applications defy recordings (e.g. messaging)
TTS Text Analysis
Source Text
Text Normalization
• “Are you there?” → are + you + there + <question>
• $31 → thirty one dollars
• ATM → eh tee em
• NATO → nay-toh
• A.M. → eh em
• CUL8R → see you later
Homograph Disambiguation
• minute = 60 seconds; minute = tiny
• Dr. Jones → doctor jones; Jones Dr. → jones drive
• 11210 → eleven thousand two hundred ten (number); 11210 → one one two one oh (ZIP code)
Pronunciation Generation (System Dictionary, Pronunciation Rules)
Prosody Generation
• Determine which words require emphasis
• Insert pauses based on phrase boundaries, lung capacity
• Assign duration, pitch, and volume to each phoneme
Annotated Text
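The numeric examples on this slide can be reproduced with a minimal sketch. It covers only whole numbers below one million, whole-dollar amounts, and digit-by-digit readings; the function names are invented for illustration, and a real TTS front end would decide from context which reading applies.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
# Digit-by-digit reading says "oh" for 0, as in the ZIP code example.
DIGITS = ["oh", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def number_to_words(n: int) -> str:
    """Read n as a cardinal number (0 <= n < 1,000,000)."""
    if not 0 <= n < 1_000_000:
        raise ValueError("sketch only supports 0..999999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + number_to_words(rest)
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = "" if rest == 0 else " " + number_to_words(rest)
    return number_to_words(thousands) + " thousand" + tail


def digits_to_words(digits: str) -> str:
    """Read a digit string one digit at a time (e.g. a ZIP code)."""
    return " ".join(DIGITS[int(ch)] for ch in digits)


def dollars_to_words(token: str) -> str:
    """Read a whole-dollar amount such as '$31'."""
    amount = int(token.lstrip("$"))
    unit = "dollar" if amount == 1 else "dollars"
    return f"{number_to_words(amount)} {unit}"


as_number = number_to_words(11210)
as_zip = digits_to_words("11210")
as_money = dollars_to_words("$31")
```

The same token, 11210, yields two different spoken forms, which is why normalization has to know whether it is looking at a quantity or a ZIP code.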
TTS Waveform Generation
Parametric
[Diagram: Annotated Text, Parameter Generation, Vocal Tract Model, Speech. Voice labels: FEMALE, FEMALE, CHILD]
• Can mimic natural speech if parameters are set by hand
• In practice sounds somewhat robotic, the “drunken Swede”
• Can produce a variety of voices
• Extremely compact
Concatenative
[Diagram: Annotated Text, Voice Database, Unit Selection, Concatenate and Smooth, Speech]
• Units can be smaller or larger than a phoneme
• Database tends to be very large
• Preserves speaker characteristics and speaking style of voice talent
Text-to-Speech: State of the Art
• Naturalness of concatenative TTS is generally preferred for call center applications
• …but voice talent takes direction, more expressive
• Custom voices to maintain brand identity
• Use one voice talent for both recordings and TTS
• Seamlessly mix dynamic data with static prompts
• Apply prompt “patches” rapidly until cost of recording session can be justified
Designing Speech Applications
• Observe & interview call center agents
• Listen to calls, develop caller profiles
  • Who are they?
  • What do they know?
  • Where are they calling from?
  • What are their goals?
  • What are their priorities?
• Determine business objectives & rules
• Define speech user interface
  • Call flows
  • Prompt wording
  • Error recovery; help and instructions
  • Anthropomorphism and persona
MRCP and Natural Access
What is MRCP v1?
• Speech servers are connected by VoIP to IVR servers
• Standard API for ASR and TTS
• Easy to reconfigure system as needs change
• Easy to implement redundancy
[Diagram components: PSTN, IVR Servers, IP, Speech Servers (MRCP Server). Control: MRCP/RTSP/TCP/IP. Speech: G.711/RTP/UDP/IP]
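To make the control path (MRCP over RTSP over TCP) more concrete, here is a rough sketch of how a text-to-speech request might be framed. The layout follows my understanding of the MRCP v1 specification (RFC 4463), where the MRCP message travels as the body of an RTSP ANNOUNCE; the server URL, session ID, and request ID are made up, and the output has not been checked against a real speech server.

```python
def build_mrcp_v1_speak(rtsp_url: str, cseq: int, session_id: str,
                        request_id: int, ssml: str) -> bytes:
    """Frame an MRCP v1 SPEAK request inside an RTSP ANNOUNCE (sketch)."""
    body = ssml.encode("utf-8")

    # Inner MRCP message: request line, headers, blank line, SSML body.
    mrcp = (
        f"SPEAK {request_id} MRCP/1.0\r\n"
        "Content-Type: application/synthesis+ssml\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii") + body

    # Outer RTSP message: the whole MRCP message is the RTSP body.
    rtsp = (
        f"ANNOUNCE {rtsp_url} RTSP/1.0\r\n"
        f"CSeq: {cseq}\r\n"
        f"Session: {session_id}\r\n"
        "Content-Type: application/mrcp\r\n"
        f"Content-Length: {len(mrcp)}\r\n"
        "\r\n"
    ).encode("ascii")
    return rtsp + mrcp


message = build_mrcp_v1_speak(
    rtsp_url="rtsp://speech.example.com/media/synthesizer",  # hypothetical server
    cseq=4,
    session_id="12345678",                                    # hypothetical session
    request_id=543257,
    ssml='<?xml version="1.0"?><speak>Your balance is thirty one dollars.</speak>',
)
```

An IVR server would write these bytes to the TCP control connection, while the synthesized audio comes back separately as G.711 over RTP.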
Natural Access and MRCP
[Diagram components. Services: Call Control, PSTN Trunking, USAI (MRCP), Conferencing, IVR Services, VoIP (Fusion), Fax Services, Video Access, OAM. Infrastructure: Service Managers, Libraries, SNMP, IPC, Drivers. Transport and hardware: PCI, IP, HMP, PacketMedia HMP, CG Boards, CX Boards, AG Boards]
Universal Speech Access Makes Speech Integration Easy
Current Support for Universal Speech Access
What’s Next for MRCP?
• MRCP v2
  • draft-ietf-speechsc-mrcpv2-06, Feb 20, 2005
  • Adds SIP/SDP for session setup
  • Replaces RTSP
  • Adds support for speaker verification
  • Little deployment yet
• NMS will update USAI when deployments occur
Questions?
Contact Info:
jack_chase@nmss.com
rob.kassel@nuance.com