1 / 8

Simple Speech Recognition

Speechbuilder is a useful tool with some limitations in grammar, but a complex compilation of domain pieces. Streamlining audio processing through a server connection can speed up the recognition process significantly.

alcalam
Download Presentation

Simple Speech Recognition

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Simple Speech Recognition • Larry Rudolph

  2. Speechbuilder +/- • Speechbuilder is a great tool, but • Grammar is limited • Forces use of “action” and “attribute” • “Domain” has lots of components • complicated compile of domain pieces -- vocabulary, natural language, scoring • Must start server with domain framework

  3. Simpler than S-Builder • Simple: • send audio to server • receive string in the domain • but what about domain? • open a connection with server w/ domain • stream audio over connection; get string when done

  4. Streaming Audio Speak (Capture) Send to server Recognize • Long process: • each step can be seconds. • Streaming can speed things up (a lot) return Process

  5. Streaming Audio Speak (Capture) • Start sending audio before user is done speaking • Start recognition before audio done arriving • Still must wait for return string Send to server Recognize return Process

  6. Streaming recognition • Recall, recognition proceedings in several stages: • waveform to phonemes • phonemes to words • words to sentences • Natural language filtering • Grammar Parse Much of this can be pipelined NL requires whole sentence

  7. Speed up process • Do not do natural language filtering • Do not do very limited vocabulary • (what should one do with extraneous words -- fastest is to recognize them since ignoring them slows down parsing)

  8. more speed • pipeline whole process. • what about grammar parse? • do a Virturbi search • get back confidence levels

More Related