A Few of Speech Recognition's Greatest Blunders
David Thomson
CTO, SpeechPhone (VoiceXML Tools Committee chair)
david@speechphone.com
Over 22 years in the field: some breakthroughs, some disasters.
Field Problem Examples
• Germs and money
• User training
• Echo cancellation
• Inexperienced management
• Last-minute "improvements"
• User interface testing
• Half-duplex speakerphones
• Ventilation
• Fire safety
• Leading the market
• Offering too much
• Component "upgrade"
• Tuning
Chapter: Germs and Money
ATM Speaker Verification
Prompt: "Pick up the phone and say the following digit string: 3594."
User: "3594"
• Two levels of security: PIN and voiceprint.
• Random digit strings protect from recordings.
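A minimal sketch of the random-digit challenge idea, assuming a recognizer that returns the digits it heard and a verifier that returns a similarity score. The names `make_challenge`, `accept`, `voice_score`, and the 0.8 threshold are hypothetical illustrations, not details of the deployed ATM system.

```python
import secrets

DIGITS = "0123456789"


def make_challenge(length: int = 4) -> str:
    """Return a fresh random digit string for the user to speak."""
    return "".join(secrets.choice(DIGITS) for _ in range(length))


def accept(challenge: str, recognized_digits: str, pin_ok: bool,
           voice_score: float, threshold: float = 0.8) -> bool:
    """Accept only if the PIN, the spoken digits, and the voiceprint all pass.

    recognized_digits: what the (hypothetical) recognizer heard.
    voice_score: similarity from a (hypothetical) speaker verifier, 0.0 to 1.0.
    """
    spoke_this_challenge = recognized_digits == challenge
    return pin_ok and spoke_this_challenge and voice_score >= threshold
```

Because each session gets a new string, a recording of an earlier session will usually contain the wrong digits and fail the `spoke_this_challenge` check even if the voice itself matches.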
Chapter: User Training
MovieFone (777-FILM)
"Hello and welcome to MovieFone..."
• MovieFone w/ASR
• MovieFone was the dominant U.S. movie information service, taking over 80,000,000 calls/year.
• ASR overwhelmingly preferred over touch-tone in caller survey.
• Users favored menu-based over spontaneous input.
Example MovieLocator Transaction
Caller: "What science fiction movies are playing?"
System: "Near what city?"
Caller: "Wheaton."
System: "Near Wheaton, Pirates of the Caribbean is playing at the Ogden 6 theater."
Caller: "What time is it showing?"
System: "At the Ogden 6 theater, Pirates of the Caribbean shows at 7:30."
Movie information conversation. The recognizer is designed to understand any reasonable movie information request from the caller.
Would You Use This To Find Movies?

                    never  sometimes  often  always
Newspaper             0        8        6       7
Phone the Theater    11        5        4       1
MovieFone            10       10        1       0
MovieLocator          8        5        6       2
Menu-based            3        6        8       4

Total = 22 subjects
ASR vs. Human Attendants
ASR: 96.2% of calls routed correctly
Receptionists: 87% of calls routed correctly
Conditions: Callers were greeted with “How may I direct your call?” and were routed to one of over 30 departments. Accuracy was scored by the customer.
Chapter: Echo Cancellation
Echo in an Analog System
[Figure: signal-path diagram. Labels: -11 dBm signal; Prompt Generator; Telephone Network; Tip/Ring Card; -15 dB; Hybrid; -6 dB; Echo Canceller; -7 dB; Line: -9 dB; -25 dBm signal; Speech Recognizer]
Speech: -40 dBm
Echo: -33 dBm
SNR: -7 dB
Low speech signal strength and strong echoes generated by the local network card conspire to make speech recognition difficult. Speech is up to 9 dB quieter and echoes are about 31 dB louder than in a digital system, for a total signal-to-noise ratio loss of 40 dB.
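The level arithmetic on this slide can be checked with a short sketch. It assumes, based only on the numbers adding up, that the speech path is the -25 dBm signal attenuated by the line (-9 dB) and the hybrid (-6 dB), and that the echo path is the -11 dBm prompt attenuated by the tip/ring card (-15 dB) and the echo canceller (-7 dB). The original diagram may group the stages differently, and the function name `level_dbm` is hypothetical.

```python
def level_dbm(source_dbm: int, *gains_db: int) -> int:
    """Level after a chain of stages: dB gains and losses simply add."""
    return source_dbm + sum(gains_db)


# Assumed speech path: caller signal -> line loss -> hybrid loss
speech_dbm = level_dbm(-25, -9, -6)

# Assumed echo path: prompt -> tip/ring card return -> echo canceller
echo_dbm = level_dbm(-11, -15, -7)

# With echo as the dominant noise, SNR is the difference of the two levels
snr_db = speech_dbm - echo_dbm

# Slide's comparison with a digital system: quieter speech plus louder echo
snr_loss_vs_digital_db = 9 + 31

print(speech_dbm, echo_dbm, snr_db, snr_loss_vs_digital_db)  # -40 -33 -7 40
```

A negative SNR means the residual prompt echo reaching the recognizer is stronger than the caller's speech.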
Chapter: Inexperienced Management
Voice Verification and Dialing
• Panic response to competitor.
• No initial business case.
• Used unproven SV platform.
• Heavy use of inexperienced contractors.
• Poor budgeting.
• Distributed development organization.
• Turf battles, technical disagreements, egos.
• Changing feature requirements.
• Staff of 60, 4 years, $70M.
Chapter: Last-Minute “Improvements”
Heat Sink Failure
[Photo label: Epoxy Beads]
Chapter: User Interface Testing
Multilingual Digit Dialer
"Vier drei fünf vier zwei null sechs drei sieben." (German: "four three five four two zero six three seven")
• Complex user interface
• Language dependencies ignored
• No testing on naïve users
• User errors exceeded ASR errors
• System was deployed, then removed
Chapter: Half-Duplex Speakerphones
Name Dialing - Placing a Call
[Diagram labels: Telephone Network; Voice Dialer]
(Dial tone)
User: "Call home"
Voice Dialer: Calling “home”
Half-Duplex Speakerphones
Prompt: "What can I do for you now?"
Response: "Call messages."
[Diagram labels: Half-Duplex Speakerphone; Speaker; Microphone; Speech Recognition System]
Unless user speech can force the handsfree phone to switch off the prompt, the recognition system hears nothing.
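The failure can be illustrated with a toy model of a half-duplex speakerphone. This is a sketch under stated assumptions, not a description of any real phone: levels are per-frame values in dB, `None` means silence, and the function name `recognizer_input` and the 6 dB switching margin are hypothetical.

```python
from typing import List, Optional

Level = Optional[float]  # per-frame level in dB; None means silence


def recognizer_input(prompt_db: List[Level], user_db: List[Level],
                     switch_margin_db: float = 6.0) -> List[Level]:
    """Return what the recognizer hears, frame by frame.

    Toy rule: while the prompt is playing, the microphone path stays closed
    unless the user's speech exceeds the prompt by `switch_margin_db`.
    """
    heard: List[Level] = []
    for prompt, user in zip(prompt_db, user_db):
        if user is None:
            heard.append(None)   # nobody is talking
        elif prompt is None:
            heard.append(user)   # speaker idle: microphone path is open
        elif user >= prompt + switch_margin_db:
            heard.append(user)   # loud speech seizes the channel
        else:
            heard.append(None)   # prompt holds the channel; speech is lost
    return heard
```

For example, `recognizer_input([-20, -20, None], [-25, -10, -25])` passes only the loud second frame and the third frame, where the prompt has stopped; the quiet barge-in attempt in the first frame never reaches the recognizer.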
Unmasking Half-Duplex Equipment
[Diagram labels: Speakerphone user; Handset user]
"Ready?"
"OK"
"Go."
1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10.
1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10.
Chapter: Ventilation
Extreme Temperature Environment
[Figure: room layout. Labels: 120 degrees; Frame 1; Frame 2; Airflow; Fan; Door; Vent; Hall; Window (20 yards)]
A/C Frame cooling example - side view
[Figure: three panels, each showing a Monitor, an A/C Unit, and a Master PC. Panel titles: A. Ideal airflow; B. Air leaks; C. Ducted frame]
Chapter: Fire Safety - 1
Example of Flammability Failure
[Image: IR View]
Chapter: Fire Safety - 2
Central Office Grade Speech Server
[Photo of CDSUs in a frame. Labels: LAN Card; 48V Power]
Backplane Current Sense Resistors
[Photo label: Sense Resistors]
Chapter: Leading the Market
Wi-Fi Voice Dialing
[Diagram labels: Telco Data Network; Wi-Fi Network; Mobile Device; SoftPhone; VoiceDial; VoIP Gateway; SDK; ASR; TTS]
User: "Call David Thomson"
Chapter: Offering Too Much
A service that does everything
[Figure: telephone keypad (1 2 3 4 5 6 7 8 9 * 0 #); display reads "Connecting 630-555-1212"]
Services shown: Business Directory, Movie Locator, Messages, Shopping, Weather Line, VoiceXML, Voice E-mail, Voice Dialing
"Business Directory."
"Welcome to Lucent Technologies Automated Business Call Dialer. Please say the name of the business to call. For information, say ‘help.’"
"United Airlines."
"Calling United. To cancel, say ‘cancel.’"
Businesses may subscribe to be listed in this service.
Talking Call Waiting
http://www.ameritech.com/navigation/site/1,1935,150,00.html
• Now, you can HEAR who's behind the call waiting beep.
• First, you hear the Call Waiting "beep" and then you hear the name of the second caller.
• Once you've heard the name, you decide if you want to "click over" and take the call. It's that simple!
• Talking Call Waiting is only $2.50 a month if you currently have Call Waiting on your phone line.
• Talking Call Waiting is currently available in our Major Market areas of: Chicago, IL; Indianapolis, IN; Detroit, MI; Akron, OH; Cleveland, OH; Columbus, OH; Dayton, OH; Milwaukee, WI
Call to Order Today: 1-888-635-5050
[Page labels: $2.50/mo.; Talking Call Waiting Instructions]
Chapter: Component “Upgrade”
Chapter: Tuning
Field Accuracy Improves Over Time
[Chart: Error Rate across stages Lab, 1st Iteration, 2nd Iteration, Final. Annotations: Land-Line Models; Wireless Digit Dialing Trial; New Models from Field Data; Final Tuning]
Other Assorted Field Problems
• ASR works, forces touch-tone failures
• Late beep causes people to speak early
• Voice enhancement wrecked spectrum
• Failure to record left developers blind
• Speech takes the heat for unrelated bugs
For Slides or More Information
David Thomson
david@speechphone.com
Phone 949-655-1693