Speech Technology Opportunities and Challenges David Nahamoo Speech CTO, IBM Research Dec 12, 2006

Presentation Transcript


  1. Speech Technology Opportunities and Challenges, David Nahamoo, Speech CTO, IBM Research, Dec 12, 2006

  2. Needs for Speech Technology [Diagram] Themes: USABILITY, AUTOMATION, GLOBAL ACCESS • Areas: Devices, Commerce, Information • Applications shown: Command & control, Transactional, Information Access, Multimodal Interaction, Multichannel Self-Service, Problem solving, Dictation, Voice Web, Multilingual communication, Accessibility, Multimedia Analytics, Transcription • Value: Ease of use; Speed, efficiency; Extended reach • Value: Lower cost; Increased customer satisfaction; From cost to revenue • Value: Integration of voice/video with enterprise data; Indexing of large amounts of multimedia information; Breaking language barriers; Accessibility

  3. Major Speech Application Opportunities • Commerce: Contact Centers, Unified Communication • Global Access: Speech To Speech Translation, Translingual MultiMedia Mining, Accessibility • Devices: Automotive, Set Top Box, Mobile Phones

  4. Speech Technology Innovation that Matters • Conversational Interaction: Dealing with Complexity • Speech Analytics: Extracting Insight / Knowledge • Multilingual Dimension: Globalization

  5. Contact Centers of the Future

  6. Contact Centers face a number of challenges as they attempt to balance costs, customer experience and revenue growth • 1. Cost Reduction/Containment, BUT… too much focus on Cost Reduction can actually lead to… Limits on Revenue Growth and Poor Customer Experience • 2. Customer Experience Improvement, BUT… too much focus on the Customer Experience can actually lead to… Limits on Revenue Growth and Rising Costs • 3. Revenue Growth, BUT… too much focus on Revenue Growth can actually lead to… Poor Customer Experience and Rising Costs • Differing emphasis can be placed on each one, but unless managed carefully and balanced effectively for the business, the effects can be disastrous…

  7. Contact Centers – Logical Components and Focus Areas [Diagram] Channels: Web, Voice, Chat, Email, Voicemail, Video, Mail, Fax • Network: Managed IP Network, VoIP Gateway, Public Internet • Components: Network Services; Channel Services – Assisted; Channel Services – Self-service; Data Services; Contact Services; Outbound Services; Agent Services; Agent Desktop; Back-end business processes, applications and information services (internal and external) • Other labels: Systems, Information, Analytics, UIM, Search, MDM, ODS, EDW, ECM, CRM, ERP, SCM, QAM, KPIs, KM, RTA, Portal, Alerts, Dashboards, Presence, Skills, WFM, Dialer, Universal Queue, Routing, Voice Callback

  8. Contact Centers – Logical Components and Focus Areas [Diagram: same component labels as slide 7, with the focus areas overlaid] Focus areas: Information Integration, Analytics, Self Service, Revenue Growth, Agent Performance, Multi-channel Access

  9. Self-Service

  10. Increased Self-Service • Self-service to 80% levels and higher is possible in at least some centers • Today’s contact centers are typically 10 to 20% self-service in most industries, but at least some companies claim 80% self-service now where Web-based interaction predominates; when voice predominates, numbers are much lower • Live-agent costs are an order of magnitude higher than the costs of self-service • Self-service adoption has been slow to take off (8% growth 2003-2005) • Self-service is more challenging technically than agent performance because of the difficulty of achieving high customer satisfaction • Self-service is often run by a different group from the one that runs the contact center • Self-service will be the end-game as labor arbitrage becomes increasingly difficult • Whichever vendor develops ways to drive self-service fastest (while maintaining customer satisfaction) will have a commanding position in the marketplace • Self-service is clearly a huge cost-savings opportunity

  11. Customers prefer the convenience and control aspect of self-service, and have high expectations “How important was the ability to serve yourself (as a customer) in your decision to use the service provider in the first place?” • Customers prefer self-service • Self-service preferred for many types of customer contact • Viewing Bill (42%); • Checking Minutes (44%); • Checking/Changing a Talkplan/package (37%); • Subscribing to Services (38%) • Web preferred to the phone (50%) • Provided one can obtain answers in the same amount of time • And their expectations are very high • Ease of use • 86% indicate they would stop using an organization if their IVR was difficult to use • High level of service • 82% indicate lower level of service via the web unacceptable • Majority indicate they would abandon a web transaction or go to a competitor due to usability issues. Source: Fujitsu Consulting and Netonomy, Modalis Research Technologies, Genesys, Inc., Harris Interactive

  12. IVRs are still the dominant self-service channel and they are increasingly becoming speech-enabled • IVRs are still the dominant contact channel • 45% of contacts start in the IVR channel • IVRs are becoming speech-enabled • Speech-enabled IVRs support more complex functionality and higher completion rates • Well-designed voice user interfaces (VUI) can reduce call time by as much as 30% compared to traditional IVR systems and cut opt-out rates by 50% (Forrester - 2003) • Increased IVR retention rate. Companies are up to 60% more likely to retain a caller within the IVR using speech vs. touch tone (Giga)

  13. Conversational Interaction • Should bridge the gap between the user's mental model and the application model • Task Complexity • User Familiarity • User Patience • Should minimize the user effort and task completion time • Consistent • Rapid • Efficient

  14. Conversational Solutions [Diagram] Solution types: INFORMATIONAL, TRANSACTIONAL, PROBLEM SOLVING • Axis: COMPLEXITY (LOW, MEDIUM, HIGH) • Examples shown: PACKAGE TRACKING, BANKING, CALL ROUTING, STOCK TRADING, STOCK QUOTE, CUSTOMER CARE, FLIGHT RESERVATION, FLIGHT STATUS, TECHNICAL SUPPORT, TRAVEL

  15. PACKAGE TRACKING: String of numbers & characters + checksum; ASR • STOCK QUOTE: Large list of names and symbols; ASR • User model is close to application, Some decision making, Time not a factor • BANKING: Directed Dialog and limited syntax; NLU • STOCK TRADING • User and application models match, Time not a factor, No decision making
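A minimal sketch of why the checksum on slide 15 matters for recognition: when the recognizer returns several candidate strings, a check digit lets the application discard candidates that cannot be valid. The slides do not say which checksum is used, so the Luhn mod-10 scheme, the digits-only restriction, and the names `luhn_valid` and `pick_hypothesis` are all assumptions for illustration.

```python
from typing import List, Optional


def luhn_valid(digits: str) -> bool:
    """Return True if a digit string passes the Luhn mod-10 check.

    Assumption: Luhn is used here only as a familiar example of a checksum.
    """
    if not digits.isdigit():
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def pick_hypothesis(nbest: List[str]) -> Optional[str]:
    """Return the highest-ranked recognition hypothesis with a valid checksum.

    `nbest` is a hypothetical list of candidate strings, best first.
    Returns None when no candidate passes, so the dialog can re-prompt.
    """
    for hyp in nbest:
        if luhn_valid(hyp):
            return hyp
    return None
```

In this sketch a misrecognized final digit is caught without asking the caller to confirm, which is one way a checksum can make a long spoken string practical.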

  16. TRAVEL: Substantial Dialog (& Language Understanding); User and application models match, No decision making, Long list of concepts • CALL ROUTING: Substantial Language Understanding; User’s model might not match application’s, Involved decision making, Time a factor

  17. Conversational Help Desk Challenges • Help Desk is the most complex of all three types of conversational speech applications • Complexity is based on Nature of the Call • User domain model is limited at best • User is usually upset • Complex dialog and language understanding • Current Market Solution • No Industry “best practices” have been established

  18. Introducing Audrey: Overview of IBM Help Desk [Diagram] Incoming Calls • Main Menu: Workstation, host, password, business app, telephone • Troubleshooting • Create Trouble Ticket • Password Reset • Self Help (FAQ/HOWTO) • Not-entitled • Self service (telephone): 0.5% to 3% • Agent handles 97-99.5% of calls • 80% Service is HOWTO

  19. Speech Analytics

  20. Customer Contact Center Analytics [Diagram] Contact Points: Enterprise, Branch office, Web, Agent, Self Service, Call Center, IVR • Products & Services • Unstructured: Call logs & transcripts, Emails, Surveys • Structured: Customer/Product Transaction Data • Structured Agent Data • Integrate & Analyze Structured & Unstructured Data • Analytics enhances value for: Self-service, Agent performance, Cross-sell/up-sell, Transformational Diagnostics, Business Intelligence, Marketing, … • Instant Market Intelligence: Customer preferences, Dissatisfaction Drivers, Lifetime Value Management • Analyze Agent Performance: Improve C-Sat, Upsell Rate • Analyze Contact Drivers: Improve FAQs, Web pages

  21. Call Center Operation Quality: Millions of Calls Every Day • Want general information: • Are callers happy? • Are processes followed? • What are people asking for? • What is the trend of occurrence of known problems? • Are there new problems? • Need to know where to take action: • Save a customer from defecting • Apologize for mishandled calls • Show call to agent for coaching • Follow up on a missed sales opportunity • Currently: • Human monitoring is necessary for these things • Only a small fraction of calls can be checked • Most checking is wasted • There is no permanent record of the calls

  22. Speech Analytics for Automated Quality Monitoring [Diagram: UIMA Processing Pipeline. Extract audio from CM/DB2 (Collection Reader, from CM); Turn Audio into Text (Speech-to-Text Annotator); Evaluate Calls (Analytics Annotators); Store Analysis and Transcripts back into CM/DB2 (CAS Consumer). Other labels: IBM Call Centers, CallRank, CM, WebSphere, audio, Transcribed & Analyzed audio, Calls & Stored Analysis] • Background: • IBM NA call center team listens to and evaluates ~1% of all calls • 35 questions answered • “did the agent use courteous words and phrases?” • “did the agent speak in an appropriate tone?” • “did the agent follow the closing procedure?” • “did the agent solve the problem?” • Mostly random calls, rarely interesting • Typical of all call centers • CallRank Quality Monitoring Application: • Monitor 100% of calls • Answer questions and assign default ratings • Provide a ranked list to human monitors to focus attention on bad calls
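The flow on slide 22 (read audio, transcribe, evaluate, store, then rank for human monitors) can be sketched in a few lines. This is only an illustration in plain Python: it does not use the UIMA API, and the `Call` class, `run_pipeline`, `rank_worst_first`, and the scoring rule (count of questions answered "no") are assumptions, not the CallRank implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Call:
    """One recorded call moving through the pipeline (hypothetical structure)."""
    call_id: str
    audio: bytes
    transcript: str = ""
    answers: Dict[str, bool] = field(default_factory=dict)


def run_pipeline(
    calls: Iterable[Call],
    transcribe: Callable[[bytes], str],
    annotators: Dict[str, Callable[[str], bool]],
) -> List[Call]:
    """Transcribe each call, then let every annotator answer its question."""
    processed = []
    for call in calls:
        call.transcript = transcribe(call.audio)      # speech-to-text stage
        for question, annotator in annotators.items():
            call.answers[question] = annotator(call.transcript)  # analytics stage
        processed.append(call)                        # stand-in for the storage stage
    return processed


def rank_worst_first(calls: List[Call]) -> List[Call]:
    """Order calls so those failing the most questions come first.

    Assumption: every question has equal weight.
    """
    def failures(call: Call) -> int:
        return sum(1 for ok in call.answers.values() if not ok)

    return sorted(calls, key=failures, reverse=True)
```

A human monitor would then listen from the top of the ranked list, which is how automatic scoring of every call can concentrate limited listening time on the calls most likely to be bad.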

  23. Example of a good call

  24. Example of a bad call

  25. Automated Quality Monitoring • Status: • Three times as many bad calls found for same listening effort • Processing ~ 3000 calls/day now from all North American centers • Technology: • Answer many questions with pattern matching on decoded text • Did the agent follow the appropriate closing script? • Search for “THANK YOU FOR CALLING”, “ANYTHING ELSE”, “SERVICE REQUEST” • Use other linguistic cues to improve the accuracy of the system • Number of hesitations (UH, UM, HUM, etc), total silence, longest silence, …
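A minimal sketch of the two text-based techniques named on slide 25: answering a question by searching the decoded text for key phrases, and counting hesitations as a linguistic cue. The phrase list and hesitation tokens come from the slide; the function names and the `min_hits` threshold are assumptions. Silence features are not shown because they need word timings, which a plain transcript does not carry.

```python
import re

# Phrases and hesitation tokens as listed on the slide.
CLOSING_PHRASES = ["THANK YOU FOR CALLING", "ANYTHING ELSE", "SERVICE REQUEST"]
HESITATIONS = {"UH", "UM", "HUM"}


def followed_closing_script(transcript: str, min_hits: int = 2) -> bool:
    """Guess whether the agent followed the closing script.

    Assumption: finding at least `min_hits` of the key phrases counts as "yes".
    """
    # Upper-case and collapse whitespace so phrases match across line breaks.
    text = " ".join(transcript.upper().split())
    hits = sum(1 for phrase in CLOSING_PHRASES if phrase in text)
    return hits >= min_hits


def count_hesitations(transcript: str) -> int:
    """Count whole-word hesitation tokens such as UH and UM."""
    tokens = re.findall(r"[A-Z']+", transcript.upper())
    return sum(1 for tok in tokens if tok in HESITATIONS)
```

Matching whole tokens rather than substrings keeps a word like "human" from being counted as the hesitation "HUM".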

  26. Agent Performance

  27. Agent Performance • Personnel costs are by far the largest component of existing contact center costs • Move to off-shore operations has resulted in significant (up to 75%) labor cost reductions • Large contact centers have very large numbers of personnel • Estimated 6M agents in U.S. in 2004 and continuing to grow • Even with the rise of self-service, a percentage of calls will still be handled by live agents • Numerous opportunities exist to improve performance by automation: • Integration of systems across the business for use in the contact center • On-boarding process (e.g., accent monitoring) • Training (on-boarding, continuing education, real-time training) • Agent quality monitoring • Call logging (30% of agent time in some contact centers) • Helping the agent find the answer to the customer’s question • Workforce management • Intelligent call routing globally • Expert “multi-channel” agents • Activity-centric computing and other collaborative projects

  28. Agent Performance: Voice Assessment/Training Increased number of off-shore centers • e.g., India (>50% growth) Key focus in off-shore contact centers • Hiring • Shrinking candidate pool and high agent attrition rates • Training • Train agents to have neutral accents to improve customer experience Voice Assessment/Training System • Candidate screening for • Grammar • Pronunciation • Spoken language comprehension • Accent training • Correctness of pronunciation, intonation, speaking rate and syllable stress

  29. Contact Centers Summary • Contact centers are focal points in an enterprise from which all customer contacts are managed • Contact Centers face a number of challenges as they attempt to balance costs, customer experience and revenue growth • Customers increasingly prefer self-service and speech self-service is now ready for prime time • Enterprises can achieve improved agent performance with agent productivity tools and agent hiring/training tools • Enterprises should focus on revenue growth, transforming their contact centers from cost centers to profit centers • Customer demand for choice, convenience and consistency is driving the adoption of multi-channel enablement in contact centers • Actionable intelligence from real-time and offline analytics of structured and unstructured customer interaction data will lead to new opportunities for cost reduction, revenue growth and improved customer experience

  30. Increasing Global Reach

  31. Global Language Barriers Different languages spoken by people living in different regions or even by different ethnic groups living in the same region • Language barriers cause… • High cost for agents – need both subject matter expertise and language skills • Call centers, insurance agents, etc. • Unreachable to broad international business or tourism travel market • Life threatening in • medical emergency • natural disaster situations • military • Multilingual on demand media and entertainment

  32. Data Point: Online Population Language Mismatch* • Mismatch: Diversity of languages spoken online is increasing, yet the languages of web pages are consolidating • *Source: Global Internet Statistics (http://www.glreach.com/globstats/index.php3)

  33. [Diagram] Media: Text, Image, Audio, Video • Technologies: Speech Recognition, Machine Translation, OCR • Access: Multimodal Access, Translingual Access, Multimodal Translingual Access, Multimodal Multimedia Translingual Access • Analytics: Information Analytics, Multimedia Analytics, Translingual Analytics, Multimedia Translingual Analytics • Other labels: Content; Context; Informational & Transactional; Transcription, biometrics, …; Text mining, Categorization, Taxonomies, Entity extraction, Entity relation, Ontology, …

  34. S2S Translation call for innovation • Speech Recognition Challenges • Needs to work in noisy environments, with spontaneous, conversational speech in multiple languages, could be emotional speech when under stress. • Translation has to handle output of ASR system • Recognition errors • Spoken language: different from written language • Non-grammatical disfluencies • Imperfect syntax • Lack of formal characteristics of text: no punctuation or paragraphing • Translated text must be "speakable" for oral communication • not enough to translate content adequately; output must be fluent • Need to carefully consider and tune interactions between ASR, MT and NLG – need access to all components • Cost-effective development of new languages and domains • Intonation translation remains a grand challenge
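One small piece of the ASR-to-MT hand-off on slide 34 can be sketched: removing spoken-language disfluencies from recognizer output before it reaches translation. This is an illustration under stated assumptions, not a description of any IBM system; the filler list and the function name `clean_asr_output` are hypothetical, and collapsing repeated words is a crude rule that also removes legitimate repeats.

```python
from typing import List

# Assumed filler tokens; a real system would tune this per language.
FILLERS = {"uh", "um", "hum"}


def clean_asr_output(tokens: List[str]) -> List[str]:
    """Drop filled pauses and immediate word repetitions from ASR tokens.

    ASR output has no punctuation or casing to rely on, so this works on
    lower-cased tokens only.
    """
    cleaned: List[str] = []
    for tok in tokens:
        word = tok.lower()
        if word in FILLERS:
            continue                      # filled pause: skip
        if cleaned and cleaned[-1] == word:
            continue                      # stutter-style repetition: skip
        cleaned.append(word)
    return cleaned
```

A rule like this only covers the simplest disfluencies; recognition errors and imperfect syntax still reach the translation step, which is why the slide calls for tuning the components together.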

  35. Speech Technology Driving New Business Opportunities • Increasing Self Service: More natural interaction with more difficult tasks is made possible • Increasing Agent Productivity, Monitoring Quality, and Increasing Sales Opportunity: Extracting insight from the content of conversation • Increasing the Global Reach: Breaking the language barrier
