
Speech Recognition and its clinical applications

Speech Recognition and its clinical applications. Thankam Thyvalikakath, MDS Center for Biomedical Informatics University of Pittsburgh. Outline. In-class assignment Background SpeechActs paper Clinical application of speech recognition Speech recognition in dentistry.


Presentation Transcript


  1. Speech Recognition and its clinical applications Thankam Thyvalikakath, MDS Center for Biomedical Informatics University of Pittsburgh

  2. Outline • In-class assignment • Background • SpeechActs paper • Clinical application of speech recognition • Speech recognition in dentistry

  3. Speech recognition? Speech recognition technologies are of particular interest for their support of direct communication between humans and computers through a communication mode that humans commonly use among themselves and at which they are highly skilled. Rudnicky, Hauptmann, and Lee http://starbase.cs.trincoll.edu/~ram/cpsc352/

  4. What was the first success story of speech recognition? “Radio Rex”, in the 1920s, was the first success story in the field of speech recognition. www.stanford.edu/class/linguist236/lec1.pdf

  5. Timeline of speech recognition • 1936 – AT&T's Bell Labs started the study of speech recognition (DARPA, founded in 1958, funded later speech research) • 1974 – optical character recognition • 1975 – text-to-speech synthesis (Kurzweil Reading Machine) • 1978 – Speak & Spell toy released by Texas Instruments • 1980 – Xerox started producing the reading machine TextBridge • 1997 – Dragon Systems produced the first continuous speech recognition product http://starbase.cs.trincoll.edu

  6. How did speech recognition evolve? • Acoustic approach (pre-1960s) • Pattern recognition approach (1960s) • Linguistic approach (1970s) • Pragmatic approach (1980s)

  7. Types of speech recognition • Isolated words • Connected words • Continuous speech • Spontaneous speech (automatic speech recognition) • Voice verification and identification L. Rabiner & B. Juang. "Fundamentals of Speech Recognition". 1993

  8. Speech recognition – uses and applications • Dictation • Command and control • Telephony • Medical/disabilities L. Rabiner & B. Juang. "Fundamentals of Speech Recognition". 1993

  9. Challenges of speech recognition • Ease of use • Robust performance • Automatic learning of new words and sounds • Grammar for spoken language • Control of synthesized voice quality • Integrated learning for speech recognition and synthesis B. S. Atal. Speech recognition in 2001: New research directions. Proc. Natl. Acad. Sci. USA, Vol. 92, pp. 10046-10051, Oct 1995

  10. SpeechActs SpeechActs is a prototype testbed for developing spoken natural language applications Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

  11. Why develop SpeechActs? • Integrated conversational applications • No specialized language expertise • Technology independence Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

  12. Information flow in SpeechActs Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

  13. SpeechActs - Framework • Audio server presents raw digitized audio to speech recognizer • Swiftus parses the word list to produce a set of feature-value pairs • Discourse manager maintains a stack of information about the current conversation • Discourse manager and application respond to the user by sending a text string to ‘text to speech manager’ Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
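The information flow on this slide can be sketched as a small pipeline. This is only an illustrative toy under stated assumptions: the function and class names (`recognize`, `parse`, `DiscourseManager`, `speak`) are hypothetical, the "audio" is plain text standing in for digitized speech, and nothing here reproduces the actual SpeechActs or Swiftus code.

```python
from dataclasses import dataclass, field

# All names below are hypothetical stand-ins, not the SpeechActs API.


def recognize(audio: bytes) -> list[str]:
    """Pretend recognizer: the 'audio' is UTF-8 text, returned as a word list."""
    return audio.decode("utf-8").lower().split()


def parse(words: list[str]) -> dict[str, str]:
    """Toy parser: turn a word list into feature-value pairs."""
    features: dict[str, str] = {}
    if "read" in words:
        features["action"] = "read"
    if "mail" in words or "message" in words:
        features["object"] = "mail"
    return features


@dataclass
class DiscourseManager:
    """Keeps a stack of information about the conversation so far."""
    stack: list[dict[str, str]] = field(default_factory=list)

    def respond(self, features: dict[str, str]) -> str:
        self.stack.append(features)
        if features.get("action") == "read" and features.get("object") == "mail":
            return "Reading your first message."
        return "Sorry, I did not understand."


def speak(text: str) -> str:
    """Stand-in for the text-to-speech manager: tag the reply string."""
    return f"[TTS] {text}"


def handle_utterance(audio: bytes, manager: DiscourseManager) -> str:
    """Audio -> word list -> feature-value pairs -> reply text -> speech."""
    return speak(manager.respond(parse(recognize(audio))))
```

For example, `handle_utterance(b"Please read my mail", DiscourseManager())` passes through all four stages and returns the tagged reply string.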

  14. SpeechActs: A Spoken Language Framework • Continuous-speech recognizers require grammars that specify every possible utterance a user could say to the application • The recognizer grammar should closely synchronize with the Swiftus semantic grammar • Solved by inventing Unified Grammar Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

  15. Unified grammar • Collection of rules • Each rule is made of a pattern, in a notation such as Backus-Naur Form, followed by augmentations, which are statements written in a Pascal-like form • A compiler produces a grammar specific to the speech recognizer and a corresponding Swiftus grammar Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
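The idea of compiling one rule into two synchronized grammars might look roughly like the sketch below. The rule format (a dict with `pattern` and `features`) and both compiler functions are assumptions made for illustration; the real Unified Grammar syntax and compiler are not shown in these slides.

```python
from itertools import product

# Hypothetical rule format: a pattern (a sequence of alternatives) plus an
# augmentation (the feature-value pairs the rule should yield).
RULE = {
    "name": "read_mail",
    "pattern": [["read", "play"], ["my"], ["mail", "messages"]],
    "features": {"action": "read", "object": "mail"},
}


def compile_recognizer_grammar(rule: dict) -> list[str]:
    """Expand the pattern into every utterance the recognizer should accept."""
    return [" ".join(words) for words in product(*rule["pattern"])]


def compile_semantic_grammar(rule: dict) -> dict[str, dict[str, str]]:
    """Map each accepted utterance to the rule's feature-value pairs."""
    return {
        utterance: dict(rule["features"])
        for utterance in compile_recognizer_grammar(rule)
    }
```

Because both outputs are derived from the same rule, every utterance the recognizer accepts has a semantic entry, which is the synchronization problem the unified grammar is meant to solve.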

  16. Swiftus – the natural language processor • Semantic representation generated in real time to facilitate conversation • Accurate understanding • Tolerance of misrecognized words • Wide variation among applications • Ease of use Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

  17. Swiftus performance – solved • Swiftus was designed by using coarse keyword matching and full, in-depth semantic analysis Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

  18. Discourse management • To support more natural speech, we need at least rudimentary discourse management • Should support discourse-segment pushing and popping • Prompt design • Error-correcting mechanism Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

  19. Discourse manager • Discourse is represented as a data structure consisting of functions for handling user output • Maintains a stack of these structures, and the top one handles the default discourse for the current application or current dialogue • The current application or dialogue is popped off the stack when the user cancels the activity or the problem is resolved • Keeps a simple stack of referenced items to avoid entering into a subdialogue Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
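The push-and-pop behaviour described on this slide can be illustrated with a minimal stack. The classes and the keyword-to-handler mapping below are hypothetical simplifications, assumed for the example; they are not the data structures used in SpeechActs.

```python
from typing import Callable, Optional


class Discourse:
    """A discourse segment: a name plus functions that handle utterances."""

    def __init__(self, name: str, handlers: dict[str, Callable[[], str]]):
        self.name = name
        self.handlers = handlers  # assumed mapping: keyword -> handler function

    def handle(self, utterance: str) -> Optional[str]:
        handler = self.handlers.get(utterance)
        return handler() if handler else None


class DiscourseManager:
    """Keeps a stack of discourses; the top one handles the current dialogue."""

    def __init__(self, default: Discourse):
        self.stack = [default]

    @property
    def current(self) -> Discourse:
        return self.stack[-1]

    def push(self, discourse: Discourse) -> None:
        self.stack.append(discourse)

    def pop(self) -> Optional[Discourse]:
        # The default discourse at the bottom of the stack is never removed.
        return self.stack.pop() if len(self.stack) > 1 else None

    def handle(self, utterance: str) -> str:
        if utterance == "cancel":
            popped = self.pop()
            return f"Cancelled {popped.name}." if popped else "Nothing to cancel."
        reply = self.current.handle(utterance)
        return reply if reply is not None else "Sorry, I did not understand."
```

A subdialogue such as a delete confirmation would be pushed on top of the mail discourse, and saying "cancel" pops it so that the mail discourse handles the next utterance again.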

  20. To simulate human conversation…. • conversational pacing • explicit error corrections • define the functional boundaries of an application Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

  21. Clinical applications • Medical transcription, mainly in radiology and pathology • First use of speech recognition in the field of radiology in 1981 • Mean accuracy rate of reading pathology reports using IBM ViaVoice Pro software: 93.6%, compared to human transcription at 99.6% M. Al-Aynati, K. Chorneyko. Comparison of Voice-automated Transcription and Human Transcription in Generating Pathology Reports. Arch Pathol Lab Med. 2003;127:721–725
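One common way to score a transcript is word-level accuracy, computed from the edit distance between a reference text and the transcribed text. The sketch below assumes that measure for illustration only; the slide does not say how the cited study computed its accuracy rates, and the function names are hypothetical.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance (substitutions, insertions, deletions)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances for an empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Accuracy as 1 minus (word errors / number of reference words)."""
    n_words = len(reference.split())
    if n_words == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / n_words
```

With the reference "the margin is free of tumor" and the transcript "the margin is free of humor", one of six words is wrong, so the accuracy is 5/6.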

  22. Speech recognition in clinical dentistry? • 13% used voice recognition • 16% discontinued using voice recognition • 21% believed chairside computer use could be improved with better voice recognition • Using automatic speech recognition will be the way to go! T. Schleyer et al. (unpublished data), Chairside Computer Use in Clinical Dentistry

  23. Thank you Questions or comments?
