
A Unified Language Model Architecture for Web-based Speech Recognition Grammars

Wesley Holland, Daniel May, Julie Baca, Georgios Lazarou, Joseph Picone. Center for Advanced Vehicular Systems, Mississippi State University.


Presentation Transcript


  1. A Unified Language Model Architecture for Web-based Speech Recognition Grammars
     Wesley Holland, Daniel May, Julie Baca, Georgios Lazarou, Joseph Picone
     Center for Advanced Vehicular Systems, Mississippi State University
     [Diagram: XML, ABNF, JSGF, BNF, IHD]

  2. Speech Recognition
     • Acoustic Model
       • Maps audio data to words or phonemes
     • Language Model
       • Specifies the order in which a sequence of words or phonemes is likely to occur
       • Described using a grammar

  3. Grammar Specifications
     • Backus-Naur Form (BNF)
     • Augmented BNF (ABNF)
     • JSpeech Grammar Format (JSGF)
     • Speech Recognition Grammar Specification (SRGS)
     • ISIP Hierarchical Digraph (IHD)
     Example in each format:
     • BNF: <A>::=aB   <B>::=bB   <B>::=ε
     • ABNF: <A>::=ab*
     • JSGF: <A>=a(b)*;
     • XML-SRGS: a <item repeat="0-"> b </item>
     • IHD: [digraph figure]

  4. Conversion Design
     • Goals
       • JSGF ↔ IHD
       • XML-SRGS ↔ IHD
       • Determination of equivalence
       • Grammar minimization
     • Final Architecture
       • [Diagram: XML, ABNF, BNF, IHD, JSGF]

  5. JSGF/XML-SRGS → ABNF
     • JSGF → ABNF
       • Trivial
       • Similar in syntax and structure to ABNF
       • Example: JSGF <A>=ab*; → ABNF <A>::=ab*
     • XML-SRGS → ABNF
       • Harder than JSGF
       • Different in syntax and structure from ABNF
       • Requires enumeration of certain repeat attributes
       • Example: <item repeat='1-2'> a b </item> → <S>::=(ab)|(abab)
       • Example: <item repeat='2-'> a b </item> → <S>::=abab(ab)*
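The repeat enumeration on slide 5 can be sketched in a few lines. This is a minimal illustration, not the authors' converter: the function name `expand_repeat` and the string-based output are assumptions, and the body is treated as an already-flattened token string such as "ab".

```python
def expand_repeat(body: str, repeat: str) -> str:
    """Rewrite an SRGS-style repeat value ("m", "m-n", or "m-") as an
    ABNF-like expression over `body`. Hypothetical helper for illustration."""
    low_text, dash, high_text = repeat.partition("-")
    low = int(low_text)

    if not dash:
        # Exact count "m": m copies of the body (ε when m is 0).
        return body * low if low > 0 else "ε"

    if high_text == "":
        # Open-ended "m-": m mandatory copies followed by a Kleene star.
        return body * low + f"({body})*"

    high = int(high_text)
    if high < low:
        raise ValueError(f"invalid repeat range: {repeat!r}")

    # Bounded "m-n": enumerate every allowed count as one alternative.
    alternatives = [f"({body * k})" if k > 0 else "ε" for k in range(low, high + 1)]
    return "|".join(alternatives)


print(expand_repeat("ab", "1-2"))  # (ab)|(abab)
print(expand_repeat("ab", "2-"))   # abab(ab)*
```

Bounded ranges grow linearly with the upper bound, which is presumably why only "certain" repeat attributes need enumeration; open-ended ranges map directly onto the Kleene star.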

  6. JSGF/XML-SRGS → ABNF
     • XML-SRGS → ABNF (continued)
       • Different weighting mechanisms (weight and repeat-prob attributes)
       • Example: a <item repeat="0-" repeat-prob=".45"> b </item>
       • Example: <one-of> <item weight=".4">c</item> <item weight=".6">d</item> </one-of>

  7. ABNF → BNF
     • Normalized BNF
       • Consists of rules of the following formats:
         • (RULE_NAME)::=(TERMINAL),(NON_TERMINAL)
         • (RULE_NAME)::=(NON_TERMINAL)
         • (RULE_NAME)::=ε
     • ABNF → BNF
       • Complicated
       • Accomplished using a recursive algorithm that extracts sets of normalized BNF rules from a set of ABNF rules
       • Break each rule into multiple rules at each top-level alternation. Recurse on each rule.
       • For each concatenation, Kleene star, or Kleene plus, extract a set of left symbols and a set of right symbols.
       • For n left symbols and m right symbols, create n × m connecting rules.
     • [Diagram: ABNF → BNF]
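The left/right symbol step on slide 7 can be sketched as follows. This is one possible reading, not the authors' algorithm: "left symbols" are taken to be the terminals that can end the left operand and "right symbols" the terminals that can begin the right operand. The tuple-based expression tree and the function names are assumptions, and each terminal is assumed to occur only once in the expression.

```python
# Expression nodes (hypothetical encoding):
#   ("sym", "a"), ("cat", x, y), ("alt", x, y), ("star", x), ("plus", x)

def nullable(node):
    """True if the expression can match the empty string."""
    kind = node[0]
    if kind == "sym":
        return False
    if kind == "cat":
        return nullable(node[1]) and nullable(node[2])
    if kind == "alt":
        return nullable(node[1]) or nullable(node[2])
    if kind == "star":
        return True
    return nullable(node[1])  # plus


def first(node):
    """Terminals that can begin a match (candidate right symbols)."""
    kind = node[0]
    if kind == "sym":
        return {node[1]}
    if kind == "cat":
        result = set(first(node[1]))
        if nullable(node[1]):
            result |= first(node[2])
        return result
    if kind == "alt":
        return first(node[1]) | first(node[2])
    return first(node[1])  # star, plus


def last(node):
    """Terminals that can end a match (candidate left symbols)."""
    kind = node[0]
    if kind == "sym":
        return {node[1]}
    if kind == "cat":
        result = set(last(node[2]))
        if nullable(node[2]):
            result |= last(node[1])
        return result
    if kind == "alt":
        return last(node[1]) | last(node[2])
    return last(node[1])  # star, plus


def connections(node):
    """All (left, right) terminal pairs that need a connecting rule."""
    kind = node[0]
    if kind == "sym":
        return set()
    if kind == "alt":
        return connections(node[1]) | connections(node[2])
    if kind == "cat":
        pairs = connections(node[1]) | connections(node[2])
        # n left symbols x m right symbols across the concatenation.
        return pairs | {(l, r) for l in last(node[1]) for r in first(node[2])}
    # star / plus: the end of the body loops back to its beginning.
    body = node[1]
    return connections(body) | {(l, r) for l in last(body) for r in first(body)}


# <A>::=ab* from slide 3
expr = ("cat", ("sym", "a"), ("star", ("sym", "b")))
print(sorted(connections(expr)))  # [('a', 'b'), ('b', 'b')]
```

Each resulting pair would then correspond to one connecting rule; turning the pairs into named normalized BNF rules is left out of this sketch.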

  8. BNF ↔ IHD
     • Each arc translates to a normalized BNF rule
     • Terminals correspond to nodes; concatenations correspond to arcs
     • [Diagram: BNF ↔ IHD]
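The node/arc correspondence on slide 8 can be illustrated with a generic digraph. This sketch does not reproduce the actual IHD file format; the rule encoding, the `START`/`END` markers, and the function names are all assumptions made for illustration. Each production is a `(terminal, next_rule)` pair, with `(None, other)` for a unit rule and `(None, None)` for ε.

```python
START, END = "<start>", "<end>"  # hypothetical entry and exit markers


def entry_nodes(rules, name, seen=None):
    """Nodes that can be reached first when expanding non-terminal `name`."""
    seen = set() if seen is None else seen
    if name in seen:
        return set()  # already expanded; avoids looping on unit-rule cycles
    seen.add(name)
    nodes = set()
    for terminal, nxt in rules.get(name, []):
        if terminal is None and nxt is None:
            nodes.add(END)                          # NAME ::= ε
        elif terminal is None:
            nodes |= entry_nodes(rules, nxt, seen)  # NAME ::= OTHER
        else:
            nodes.add((name, terminal))             # NAME ::= terminal, OTHER
    return nodes


def bnf_to_arcs(rules, start):
    """Return the set of arcs (source node, target node) of the digraph."""
    arcs = {(START, node) for node in entry_nodes(rules, start)}
    for name, productions in rules.items():
        for terminal, nxt in productions:
            if terminal is not None:
                # The node for this terminal connects to whatever can follow it.
                for target in entry_nodes(rules, nxt):
                    arcs.add(((name, terminal), target))
    return arcs


# <A>::=aB  <B>::=bB  <B>::=ε
rules = {"A": [("a", "B")], "B": [("b", "B"), (None, None)]}
for arc in sorted(bnf_to_arcs(rules, "A"), key=str):
    print(arc)
```

Nodes are keyed by `(rule name, terminal)` rather than by terminal alone, so the same word can appear at several places in the graph.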

  9. BNF → JSGF/XML-SRGS
     • Rule-by-rule
     • Trivial
     • BNF: <A>::=aB   <B>::=bB   <B>::=ε
     • JSGF: <A>=aB;   <B>=b|bB;
     • XML-SRGS: <rule id="a"> a <ruleref uri="#b"/> </rule> <rule id="b"> <one-of> <item> b <ruleref uri="#b"/> </item> <item> <ruleref special="NULL"/> </item> </one-of> </rule>
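A rule-by-rule emitter for slide 9 might look like the sketch below. It is an illustration under assumptions, not the ISIP tool: the rule encoding (`(terminal, next_rule)` pairs, `(None, None)` for ε) and the function names are hypothetical, the XML output omits the namespace, version, and root attributes a complete SRGS grammar requires, and the JSGF output writes the empty production with JSGF's special `<NULL>` rule.

```python
import xml.etree.ElementTree as ET


def production_into(parent, terminal, nxt):
    """Write one normalized production into an SRGS <rule> or <item> element."""
    if terminal is None and nxt is None:
        ET.SubElement(parent, "ruleref", special="NULL")  # ε
        return
    if terminal is not None:
        parent.text = terminal
    if nxt is not None:
        ET.SubElement(parent, "ruleref", uri=f"#{nxt}")


def bnf_to_srgs(rules):
    """Build a bare <grammar> element, one <rule> per non-terminal."""
    grammar = ET.Element("grammar")
    for name, productions in rules.items():
        rule = ET.SubElement(grammar, "rule", id=name)
        if len(productions) == 1:
            production_into(rule, *productions[0])
        else:
            # Several productions for one name become <one-of> alternatives.
            one_of = ET.SubElement(rule, "one-of")
            for terminal, nxt in productions:
                production_into(ET.SubElement(one_of, "item"), terminal, nxt)
    return grammar


def bnf_to_jsgf(rules):
    """Emit one JSGF rule definition per non-terminal."""
    lines = []
    for name, productions in rules.items():
        alternatives = []
        for terminal, nxt in productions:
            parts = []
            if terminal is not None:
                parts.append(terminal)
            if nxt is not None:
                parts.append(f"<{nxt}>")
            alternatives.append(" ".join(parts) if parts else "<NULL>")
        lines.append(f"<{name}> = {' | '.join(alternatives)};")
    return "\n".join(lines)


rules = {"a": [("a", "b")], "b": [("b", "b"), (None, None)]}
print(bnf_to_jsgf(rules))
print(ET.tostring(bnf_to_srgs(rules), encoding="unicode"))
```

Because every normalized rule has at most one terminal and one rule reference, each production maps to a single JSGF alternative or SRGS `<item>` without any restructuring.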

  10. Software Tools
      • ISIP Network Converter
        • Console tool to perform conversions to and from arbitrary grammar formats
      • ISIP Network Builder
        • Java-based graphical tool to design grammars as finite state machines
        • Can export grammars to JSGF, XML-SRGS, ABNF, BNF, and IHD
      • ISIP Language Model Tester
        • Console tool for testing grammars
        • Can generate valid sentences in a given grammar
        • Can parse sentences and determine whether they are accepted by a given grammar
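The two tester capabilities listed on slide 10 (generating sentences and checking acceptance) can be sketched over normalized BNF rules. This is not the ISIP Language Model Tester or its interface; the rule encoding (`(terminal, next_rule)` pairs, `(None, None)` for ε) and the function names `accepts` and `generate` are assumptions for illustration.

```python
import random


def accepts(rules, start, words):
    """Return True if the word sequence is accepted by the grammar."""

    def closure(names):
        # Follow unit rules (NAME ::= OTHER) without consuming a word.
        stack, seen = list(names), set(names)
        while stack:
            for terminal, nxt in rules.get(stack.pop(), []):
                if terminal is None and nxt is not None and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    current = closure({start})
    for word in words:
        current = closure({
            nxt
            for name in current
            for terminal, nxt in rules.get(name, [])
            if terminal == word
        })
    # Accepted if some active non-terminal has an ε production.
    return any((None, None) in rules.get(name, []) for name in current)


def generate(rules, start, rng, max_steps=50):
    """Random walk through the rules; returns a word list, or None if the
    walk does not reach an ε production within max_steps."""
    words, name = [], start
    for _ in range(max_steps):
        terminal, nxt = rng.choice(rules[name])
        if terminal is None and nxt is None:
            return words
        if terminal is not None:
            words.append(terminal)
        name = nxt
    return None


# <A>::=aB  <B>::=bB  <B>::=ε
rules = {"A": [("a", "B")], "B": [("b", "B"), (None, None)]}
print(accepts(rules, "A", ["a", "b", "b"]))  # True
print(accepts(rules, "A", ["b"]))            # False
print(generate(rules, "A", random.Random(0)))
```

Tracking a set of active non-terminals keeps acceptance linear in the sentence length even when several productions share the same terminal.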

  11. Summary
      • Future Work
        • Web-based front-end to speech recognition software
        • Mobile speech recognition
      • Public Domain Toolkit
        • Contains language model conversion tools
        • Public domain – available for download
