
Manual vs. Speech Input - Voice Interface with UAVs


Presentation Transcript


  1. Manual vs. Speech Input - Voice Interface with UAVs Presented to: Georgetown Linguistics 362 December 6, 2006 Dale Schalow AMN Corp Software Consulting U.S. National Library of Medicine (NIH)

  2. Paper Introduction • Official paper “MANUAL VERSUS SPEECH INPUT FOR UNMANNED AERIAL VEHICLE CONTROL STATION OPERATIONS” • Publishers & Authors • U.S. Air Force Research Laboratory (Dayton, OH) • Sytronics, Inc. (Dayton, OH) • Paper Presentation Background • Proceedings of the Human Factors & Ergonomics Society’s 47th Annual Meeting, October 2003 • SpeechTEK “Bridging the Gap” Conference, Jan/Feb 2006, San Francisco Hyatt Regency

  3. Topics to Discuss • Overview • My interest in the paper • Experiment Methods and Results • Summary of Experiment • Bridge to NLP • For More Info…

  4. Overview • The Big Picture: Command Unmanned Aerial Vehicles with a Voice Recognition Interface • Paper attempts to prove applied Speech Technology is a preferred input mode compared with Manual interfaces in UAV environments • Some of the Parts… NLP HUD ASR UAV USAF Speech Manual AVO IFR

  5. Overview • Parts Vocabulary • ASR = Automatic Speech Recognition • UAV = Unmanned Aerial Vehicle (aircraft) • Voice Macro = Spoken Word or Phrased Command • CRT = “Camera Display” • HUD = Heads-Up Display • IFR = Instrument Flight Rules pilot rating • Applied NLP Parts = Zipf’s Law, Automata, Viterbi algorithm and maybe others… • Additional Aviation or lesser NLP terms are used in the paper

  6. My Interest in the Paper • A SpeechTEK talk given by Dr. Sally Ride of UCSD and NASA noting strong interest among astronauts in speech technology for space experiments • Emerging commercialization issues (e.g. FAA-supervised “automated flight control”, privatized space ventures)

  7. Experiment Methods • Ten male Instrument Flight Rules (IFR)-rated pilots are the users • Speech Input Modality • “Push-to-talk” microphone • Speech Recognition engine (Nuance v8) • Grammar structures used to support greater accuracy of speech requests • Manual Input Modality • Control stick, keyboard, trackball • “Visual Cues” output on a camera display CRT
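The slide's point about grammar structures (the recognizer only has to choose among a fixed set of command templates, which raises accuracy) can be sketched as a toy constrained-command matcher. The verbs, headings, and slot formats below are invented for illustration; they are not the paper's or Nuance's actual grammar format.

```python
import re

# Illustrative command grammar: each utterance must match one of a small
# set of templates, shrinking the recognizer's search space. The phrase
# set here is a hypothetical sketch, not the experiment's grammar.
PATTERNS = [
    re.compile(r"^turn (left|right) heading (\d{3})$"),
    re.compile(r"^climb to (\d+) feet$"),
    re.compile(r"^confirm$"),
]

def parse_command(utterance: str):
    """Return (template index, captured slots), or None if out of grammar."""
    text = utterance.strip().lower()
    for i, pat in enumerate(PATTERNS):
        m = pat.match(text)
        if m:
            return i, m.groups()
    return None  # out-of-grammar utterances are rejected, not guessed

print(parse_command("Turn left heading 270"))  # → (0, ('left', '270'))
print(parse_command("do a barrel roll"))       # → None
```

Rejecting anything outside the templates is the design choice that trades coverage for accuracy in a safety-critical setting.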

  8. Experiment Methods • Ground Control Station Simulator

  9. Experiment Methods • The Pilot’s Tasks • “Voice Macros” control the aircraft with speech • Motivation: 2,000 to 15,000+ grammars in advanced real-time ASR with naturally spoken language (research and available STT solutions) • 160 phrases used for the experiment • A construed implementation of Zipf’s Law (which I’ll link to in the Experiment’s Results) • Data entry tasks included corridor navigation (e.g. banks/turns) and Info Retrieval • Visual cues of Alerts shown for 10 seconds for a Confirmation Response from the Pilot
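The Zipf's Law angle (a few short voice macros carrying most of the usage) can be illustrated with a toy frequency ranking. The command log and counts below are invented for the sketch and are not drawn from the experiment's 160-phrase set.

```python
from collections import Counter

# Hypothetical log of voice-macro usage; commands and counts are made up
# to illustrate Zipf's Law: frequency falls off roughly as 1/rank.
log = (["confirm"] * 60 + ["turn left"] * 30 + ["climb"] * 20 +
       ["descend"] * 15 + ["zoom camera"] * 12 + ["hold position"] * 10)

ranked = Counter(log).most_common()     # commands sorted by frequency
top_freq = ranked[0][1]
for rank, (cmd, freq) in enumerate(ranked, start=1):
    zipf_pred = top_freq / rank         # Zipf prediction: f(rank) ≈ f(1)/rank
    print(f"{rank}. {cmd:15s} observed={freq:3d} zipf≈{zipf_pred:.0f}")
```

If the most common macros are also the shortest to say, the bulk of operator workload rides on the cheapest utterances, which is one way to read the slide's "applied Zipf's Law" claim.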

  10. Experiment Results • Premise • Performance over all measures was worse in High Difficulty missions compared to Low Difficulty • No further details provided aside from no significant interactions between Mission Difficulty (low/high) and Input Modes (manual/speech) • Overall Task Completion Time • Tasks were completed faster when pilots used Speech Input compared to Manual Input • Measures included Normal Ops Tasks, Warnings, and Information Queries

  11. Experiment Results • No. of tasks completed incorrectly with speech was < 1/3 of the number associated with manual input • “Voice Macros” - applied Zipf’s Law? • Voice macros involved fewer steps than manual input (Task Frequency) • Operators performed longer maneuvers more favorably with their voice (Voice Rank)

  12. Experiment Results • Performance of speech recognition was excellent - average correct recognition across users was 95.054% (range of 86.93% to 98.29%)

  13. Experiment Results • Time was measured between alert onset, a CRT “Visual Cue”, and the user’s confirmation response as the voice command “Confirm” • Results did show response time significantly longer for Speech Input than for Manual Input on alert confirmations and Information Queries • Although statistically significant, the average difference was very short, < 1 second

  14. Experiment Results • Over ALL Flight/Navigation Tasks, airspeed, path, and altitude errors tended to be less with Speech Input compared with Manual Input • Pilots (AVOs) personally favored Speech Input over Manual Input in final debriefing questionnaire

  15. Experiment Summary • Definitively stated Speech Input was superior to Manual Input for operators in this environment • “Processing time” negligible for voice response compared with the pros of Speech Input (i.e. heads-up, hands-free environment)

  16. Experiment Summary • The only primary difference in using speech was saying the word ‘Confirm’ • Some users said it like “Cfirm” or “Firm” • An automaton illustrates these 3 paths • Word: Confirm • Phonemes: kh `Ah n f `OR: m • Based on the IPA American English alphabet
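The slide's automaton can be sketched as a small character-level machine that accepts all three surface forms and maps them to the same confirmation action. The trie-style construction below is a hand-built illustration, not the paper's phoneme-level model.

```python
# Character-level automaton accepting the three variants the pilots
# produced: "confirm", "cfirm", "firm". Any accepted variant would
# trigger the same confirmation action. Hand-built for illustration.
VARIANTS = {"confirm", "cfirm", "firm"}

def build_dfa(words):
    """Build a trie-shaped DFA: state 0 is the start state."""
    dfa, accepting, next_state = {}, set(), 1
    for word in words:
        state = 0
        for ch in word:
            if (state, ch) not in dfa:
                dfa[(state, ch)] = next_state
                next_state += 1
            state = dfa[(state, ch)]
        accepting.add(state)
    return dfa, accepting

def accepts(dfa, accepting, word):
    """Run the automaton; missing transitions mean rejection."""
    state = 0
    for ch in word.lower():
        state = dfa.get((state, ch))
        if state is None:
            return False
    return state in accepting

dfa, accepting = build_dfa(VARIANTS)
for w in ["Confirm", "Cfirm", "Firm", "Affirm"]:
    print(w, accepts(dfa, accepting, w))  # Affirm → False, the rest True
```

The three accepted strings are exactly the "3 paths" the slide describes; everything else falls off the automaton and is rejected.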

  17. Bridge to NLP • The main point from my view is not just that speech works ‘better’ than manual interfaces - it shows that unifying Natural Language and Speech can be applied successfully to subject-matter rules, IFR pilot-rated vocabularies in this case • The final result of the implemented methods improves controlled output by allowing no out-of-grammar options. This may be quite similar to a best-path model such as the Viterbi algorithm
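As a hedged illustration of the best-path idea, here is a minimal Viterbi decoder over a tiny hand-made HMM. The states, probabilities, and observation symbols are invented for the sketch and have no connection to the experiment's data.

```python
# Minimal Viterbi best-path sketch over a toy two-state HMM.
# All states, probabilities, and observation symbols are invented.
states = ["confirm", "other"]
start_p = {"confirm": 0.5, "other": 0.5}
trans_p = {"confirm": {"confirm": 0.7, "other": 0.3},
           "other":   {"confirm": 0.4, "other": 0.6}}
emit_p = {"confirm": {"k": 0.5, "f": 0.4, "z": 0.1},
          "other":   {"k": 0.2, "f": 0.2, "z": 0.6}}

def viterbi(obs):
    """Return the most probable state sequence for the observations."""
    # Each layer maps state -> (best probability so far, best path so far).
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for o in obs[1:]:
        layer = {}
        for s in states:
            prob, path = max(
                (V[-1][prev][0] * trans_p[prev][s] * emit_p[s][o],
                 V[-1][prev][1])
                for prev in states)
            layer[s] = (prob, path + [s])
        V.append(layer)
    return max(V[-1].values())[1]

print(viterbi(["k", "f", "z"]))  # → ['confirm', 'confirm', 'other']
```

At each step only the single best path into each state survives, which is the "best path, no alternatives" behavior the slide likens the constrained speech interface to.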

  18. For More Information • Paper link • http://www.hec.afrl.af.mil/Publications/HFES03VoiceFinal%20version2.pdf • USAF “Global Hawk” in the news • CNN Video (Nov 2006) • “Doonesbury”, The Washington Post (12/03/06) • Northrop Grumman and DoD Web sites

  19. For More Information • SpeechTEK (East/West) conference • Next one in February (Hilton San Francisco)

  20. Manual vs. Speech Input - Voice Interface with UAVs Thank you!
