
Twenty-First Century Automatic Speech Recognition: Meeting Rooms and Beyond

Twenty-First Century Automatic Speech Recognition: Meeting Rooms and Beyond. ASR 2000, September 20, 2000. John Garofolo, John.Garofolo@NIST.gov. Challenges. Target for the new millennium in ASR technology: Meeting Room Transcription and Annotation Task, multiple sensors



Presentation Transcript


  1. Twenty-First Century Automatic Speech Recognition: Meeting Rooms and Beyond
     ASR 2000, September 20, 2000
     John Garofolo, John.Garofolo@NIST.gov

  2. Challenges
     • Target for the new millennium in ASR technology: Meeting Room Transcription and Annotation Task
       • multiple sensors: stationary, mobile, and arrays of mics in conjunction with video input devices
       • noise and microphone robustness
       • speaker-independent recognition
       • speaker identification
       • automatic production of usable transcriptions with speakers identified and with properly formatted, capitalized, and punctuated text
     • Perfect research task to move the state of the art forward
     • Development infrastructure will require
       • new metrics, evaluation tools
       • new I/O specifications
       • research corpora, new methods of collecting, compiling, and annotating data

  3. NIST Proposed Initiative
     • Collaborate with ASR research community to create evaluation infrastructure
     • Develop corpus design and transcription and ASR system output specifications
     • Revise and update NIST SCLITE ASR scoring software to extend beyond classical word error rate measurements
     • Collaborate with NIST Smart Space Lab to collect, transcribe, and annotate a pilot meeting room transcription corpus
     • Sponsor evaluations and workshops
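For readers unfamiliar with the "classical word error rate" that slide 3 proposes to extend, the following is a minimal sketch of the measurement: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a system hypothesis, divided by the number of reference words. It is not SCLITE's implementation; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch only: tokens are split on whitespace and compared
    as-is, so case and punctuation differences count as errors.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution in five reference words.
    print(word_error_rate("the meeting starts at noon",
                          "the meeting start at noon"))  # 0.2
```

A single number of this kind says nothing about who spoke, or whether capitalization and punctuation are usable, which is presumably why the slides call for metrics beyond it.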

  4. Meeting Room Pilot Corpus
     • Meeting type: Possible focus group discussions requiring information lookup and real consensus building
     • Participants: At least 4 per meeting plus moderator. Native speakers?
     • Multi-microphones:
       • Head-mounted ‘control’
       • Microphone array
       • Lapel mikes worn by, or desk-top mikes for, each participant
       • Table/wall-mounted stationary mikes
     • Video: Wide-angle view positioned so that it can be correlated with mike array for source location. Possibly other views to capture faces head-on.
     • Annotation:
       • Transcription (words with capitalization/punctuation)
       • Speaker ID
       • Background noise conditions
       • Some initial exploration of annotating dialogue, people movement, gestures, lip movement, interaction with devices

  5. NIST Smart Space Test Bed Laboratory
     [Figure: laboratory layout. Labels: Large Screen Display, Camera Elements, Equipment Room, Array Beams, Microphone Array]
     • 59-mic array, assorted conventional mics
     • Cameras/video capture
     • Large screen display
     • Pervasive devices
     • Palm tops
     • Tablets
     • Wireless LAN
     • Data collection servers
     • Gigabit Ethernet
     • High-bandwidth data flow system
     • Well-suited for creating pilot meeting corpus

  6. Approach for 2000 - 2001
     • NIST will collaborate closely with a few research sites who will be the early users of the data to create the project specifications.
       • Via e-mail list and Web site
     • NIST will create a pilot meeting room data collection
       • Data storage will be a significant issue
     • NIST will create evaluation software for the new domain
       • Update SCLITE + detection-based scoring software
     • If feasible, NIST will coordinate an experimental evaluation
       • Late summer/early fall 2001
     • NIST will host a workshop (~October 2001)
       • to discuss research issues
       • to introduce the pilot corpus to the wider research community
       • to discuss evaluation metrics and the dry-run evaluation
       • to plan for future efforts (kickoff for larger DARPA program?)
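To give a rough sense of why slide 6 flags data storage as a significant issue, the sketch below estimates the uncompressed audio volume produced by the 59-microphone array alone. The slides do not state the recording format, so the 16 kHz sample rate and 16-bit samples are assumptions chosen purely for illustration, as is the helper name `raw_audio_bytes`; video and the conventional microphones would add to the total.

```python
def raw_audio_bytes(channels: int, sample_rate_hz: int,
                    bytes_per_sample: int, seconds: int) -> int:
    """Size of uncompressed PCM audio for the given recording parameters."""
    return channels * sample_rate_hz * bytes_per_sample * seconds


if __name__ == "__main__":
    # 59 channels comes from the slides; the sample rate and sample width
    # are ASSUMED values, not taken from the presentation.
    per_hour = raw_audio_bytes(channels=59, sample_rate_hz=16_000,
                               bytes_per_sample=2, seconds=3600)
    print(f"{per_hour / 1e9:.1f} GB per meeting hour")  # 6.8 GB per meeting hour
```

Under these assumed parameters, one hour of array audio comes to roughly 6.8 GB before any video is stored.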

  7. 21st Century Automatic Speech Recognition: Meeting Rooms and Beyond
     John Garofolo, John.Garofolo@nist.gov
     NIST Speech Group: http://www.nist.gov/speech
     NIST Smart Space Lab: http://www.nist.gov/smartspace/
     ASR 2000, September 20, 2000
