Manual vs. Speech Input - Voice Interface with UAVs Presented to: Georgetown Linguistics 362 December 6, 2006 Dale Schalow AMN Corp Software Consulting U.S. National Library of Medicine (NIH)
Paper Introduction • Official paper “MANUAL VERSUS SPEECH INPUT FOR UNMANNED AERIAL VEHICLE CONTROL STATION OPERATIONS” • Publishers & Authors • U.S. Air Force Research Laboratory (Dayton, OH) • Sytronics, Inc. (Dayton, OH) • Paper Presentation Background • Proceedings of the Human Factors & Ergonomics Society’s 47th Annual Meeting, October 2003 • SpeechTEK “Bridging the Gap” Conference, Jan/Feb 2006, San Francisco Hyatt Regency
Topics to Discuss • Overview • My interest in the paper • Experiment Methods and Results • Summary of Experiment • Bridge to NLP • For More Info…
Overview • The Big Picture: Command Unmanned Aerial Vehicles with a Voice Recognition Interface • Paper attempts to prove that applied Speech Technology is a preferred input mode compared with Manual interfaces in UAV environments • Some of the Parts… NLP, HUD, ASR, UAV, USAF, Speech, Manual, AVO, IFR
Overview • Parts Vocabulary • ASR = Automatic Speech Recognition • UAV = Unmanned Aerial Vehicle (aircraft) • AVO = Air Vehicle Operator (the UAV pilot) • Voice Macro = Spoken Word or Phrased Command • CRT = Cathode-Ray Tube (the "Camera Display" monitor) • HUD = Heads-Up Display • IFR = Instrument Flight Rules pilot rating • Applied NLP Parts = Zipf's Law, Automata, the Viterbi algorithm, and maybe others… • Additional Aviation and lesser NLP terms are used in the paper
My Interest in the Paper • A SpeechTEK talk given by Dr. Sally Ride of UCSD and NASA, noting strong interest among astronauts in using speech technology for space experiments • Emerging commercialization issues (e.g. FAA-supervised "automated flight control", privatized space ventures)
Experiment Methods • Ten male Instrument Flight Rules (IFR)-rated pilots are the users • Speech Input Modality • “Push-to-talk” microphone • Speech Recognition engine (Nuance v8) • Grammar structures used to support greater accuracy of speech requests • Manual Input Modality • Control stick, keyboard, trackball • “Visual Cues” output on a camera display CRT
Experiment Methods • Ground Control Station Simulator
Experiment Methods • The Pilot's Tasks • "Voice Macros" control the aircraft with speech (see the grammar sketch below) • Motivation: 2,000 to 15,000+ grammars in advanced real-time ASR with naturally spoken language (research and available STT solutions) • 160 Phrases used for the experiment • A construed implementation of Zipf's Law (which I'll connect to the Experiment Results) • Data entry tasks included corridor navigation (e.g. banks/turns) and Info Retrieval • Visual cues of Alerts shown for 10 seconds, awaiting a Confirmation Response from the Pilot
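A minimal sketch, assuming a handful of hypothetical command phrases, of what a constrained "voice macro" grammar can look like (Python; the real study used Nuance v8 grammars and a 160-phrase vocabulary that isn't reproduced here):

```python
# Hypothetical constrained "voice macro" grammar (illustration only; the actual
# experimental phrase set is not available here). Restricting recognition to a
# fixed phrase set is what keeps accuracy high and rejects off-grammar speech.
import re
from typing import Optional, Tuple

VOICE_MACROS = {
    r"^set heading (\d{3})$":    lambda h: ("HEADING", int(h)),
    r"^set airspeed (\d{2,3})$": lambda s: ("AIRSPEED", int(s)),
    r"^set altitude (\d{3,5})$": lambda a: ("ALTITUDE", int(a)),
    r"^confirm$":                lambda: ("CONFIRM",),
}

def parse_macro(utterance: str) -> Optional[Tuple]:
    """Map a recognized utterance to a control-station command, or None."""
    text = utterance.strip().lower()
    for pattern, action in VOICE_MACROS.items():
        match = re.match(pattern, text)
        if match:
            return action(*match.groups())
    return None  # out-of-grammar speech is rejected rather than guessed at

print(parse_macro("Set heading 270"))         # ('HEADING', 270)
print(parse_macro("Confirm"))                 # ('CONFIRM',)
print(parse_macro("open the pod bay doors"))  # None
```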
Experiment Results • Premise • Performance across all measures was worse in High Difficulty missions than in Low Difficulty missions • No further details provided, aside from noting no significant interactions between Mission Difficulty (low/high) and Input Modes (manual/speech) • Overall Task Completion Time • Tasks were completed faster when pilots used Speech Input compared to Manual Input • Measured across Normal Ops Tasks, Warnings, and Information Queries
Experiment Results • No. of tasks completed incorrectly with speech was < 1/3 of the number associated with manual input • "Voice Macros" - applied Zipf's Law? (sketched below) • Voice macros involved fewer steps than manual input (Task Frequency) • Operators performed longer maneuvers more favorably with their voice (Voice Rank)
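A minimal illustration of the Zipf-style rank/frequency relationship the slides allude to; the macro names and usage counts below are invented, not data from the study:

```python
# Invented usage counts for a few hypothetical voice macros (NOT study data),
# just to show the Zipf-style pattern: frequency falls off roughly as f(1)/rank.
macro_counts = {
    "confirm": 400,
    "set heading": 210,
    "set airspeed": 130,
    "set altitude": 95,
    "query fuel": 80,
}

ranked = sorted(macro_counts.items(), key=lambda kv: kv[1], reverse=True)
top_frequency = ranked[0][1]

for rank, (macro, freq) in enumerate(ranked, start=1):
    zipf_prediction = top_frequency / rank  # ideal Zipf curve: f(r) = f(1) / r
    print(f"rank {rank}: {macro:<13} observed={freq:4d}  zipf~{zipf_prediction:6.1f}")
```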
Experiment Results • Performance of speech recognition was excellent: average correct recognition across users was 95.054% (range: 86.93% to 98.29%)
Experiment Results • Time was measured between alert onset (a CRT "Visual Cue") and the user's confirmation response, the voice command "Confirm" • Results did show response time was significantly longer with Speech Input than with Manual Input, for these confirmations and for Information Queries • Although statistically significant, the average difference was very short, < 1 second
Experiment Results • Over ALL Flight/Navigation Tasks, airspeed, path, and altitude errors tended to be smaller with Speech Input than with Manual Input • Pilots (AVOs) personally favored Speech Input over Manual Input in the final debriefing questionnaire
Experiment Summary • Definitively stated that Speech Input was superior to Manual Input for operators in this environment • "Processing time" negligible for voice response compared with the pros of Speech Input (i.e. a heads-up, hands-free environment)
Experiment Summary • The only notable difference when using speech was saying the word 'Confirm' • Some users said it like "Cfirm" or "Firm" • An automaton illustrates these 3 paths (sketched below) • Word: Confirm • Phonemes: k `Ah n f `OR: m • Based on the IPA American English alphabet
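A minimal phone-level sketch of a three-path acceptor like the one described above, covering "Confirm", "Cfirm", and "Firm"; the phone labels are approximate and the state layout is my own illustration, not the slide's figure:

```python
# Hypothetical finite-state acceptor for the three pronunciations heard in the
# experiment: "confirm" (k ah n f er m), "cfirm" (k f er m), and "firm" (f er m).
TRANSITIONS = {
    ("start", "k"):    "after_k",
    ("after_k", "ah"): "after_ah",  # full form: k ah n f er m
    ("after_ah", "n"): "after_n",
    ("after_n", "f"):  "tail_f",
    ("after_k", "f"):  "tail_f",    # clipped vowel: "Cfirm" = k f er m
    ("start", "f"):    "tail_f",    # clipped onset: "Firm"  = f er m
    ("tail_f", "er"):  "tail_er",
    ("tail_er", "m"):  "accept",
}

def accepts(phones):
    """Walk the transition table; True only if the input ends in the accepting state."""
    state = "start"
    for phone in phones:
        state = TRANSITIONS.get((state, phone))
        if state is None:
            return False
    return state == "accept"

print(accepts(["k", "ah", "n", "f", "er", "m"]))  # "Confirm" -> True
print(accepts(["k", "f", "er", "m"]))             # "Cfirm"   -> True
print(accepts(["f", "er", "m"]))                  # "Firm"    -> True
print(accepts(["ah", "f", "er", "m"]))            # rejected  -> False
```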
Bridge to NLP • The main point, from my view, is not just that speech works 'better' than manual interfaces. It shows that unifying Natural Language and Speech can be applied successfully to subject-matter rules - Pilot-rated IFR Vocabularies in this case • The final result of the implemented methods is tightly controlled output via a no-alternatives approach: the recognizer commits to a single best interpretation. This may be quite similar to a Best Path model such as the Viterbi algorithm (sketched below)
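A minimal Viterbi sketch of that "best path" idea, using two toy states and invented probabilities; it is meant only to show the mechanism of keeping a single highest-scoring path, not the paper's recognizer:

```python
# Toy Viterbi decoder: hypothetical states and made-up probabilities. At each
# step we keep the single best-scoring path into each state, then trace back
# one overall best path; no alternative outputs survive.
import math

STATES = ["confirm", "other"]
START_P = {"confirm": 0.5, "other": 0.5}
TRANS_P = {
    "confirm": {"confirm": 0.8, "other": 0.2},
    "other":   {"confirm": 0.3, "other": 0.7},
}
# per-frame emission scores for three observed acoustic frames (invented)
EMIT_P = [
    {"confirm": 0.7, "other": 0.3},
    {"confirm": 0.6, "other": 0.4},
    {"confirm": 0.8, "other": 0.2},
]

def viterbi():
    # work in log space so longer observation sequences don't underflow
    best = {s: math.log(START_P[s]) + math.log(EMIT_P[0][s]) for s in STATES}
    backpointers = []
    for frame in EMIT_P[1:]:
        prev, best, pointers = best, {}, {}
        for s in STATES:
            candidates = {p: prev[p] + math.log(TRANS_P[p][s]) for p in STATES}
            winner = max(candidates, key=candidates.get)
            best[s] = candidates[winner] + math.log(frame[s])
            pointers[s] = winner
        backpointers.append(pointers)
    # trace the single best path backwards from the best final state
    last = max(best, key=best.get)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return list(reversed(path))

print(viterbi())  # e.g. ['confirm', 'confirm', 'confirm']
```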
For More Information • Paper link • http://www.hec.afrl.af.mil/Publications/HFES03VoiceFinal%20version2.pdf • USAF "Global Hawk" in the news • CNN Video (Nov 2006) • "Doonesbury", The Washington Post (12/03/06) • Northrop Grumman and DoD Web sites
For More Information • SpeechTEK (East/West) conference • Next one in February (Hilton San Francisco)
Manual vs. Speech Input - Voice Interface with UAVs Thank you!