A Unified Language Model Architecture for Web-based Speech Recognition Grammars
Wesley Holland, Daniel May, Julie Baca, Georgios Lazarou, Joseph Picone
Center for Advanced Vehicular Systems, Mississippi State University
Speech Recognition
• Acoustic Model
  • Maps audio data to words or phonemes
• Language Model
  • Specifies the order in which a sequence of words or phonemes is likely to occur
  • Described using a grammar
Grammar Specifications
• Backus-Naur Form (BNF)
• Augmented BNF (ABNF)
• JSpeech Grammar Format (JSGF)
• Speech Recognition Grammar Specification (SRGS)
• ISIP Hierarchical Digraph (IHD)
The same language (a followed by zero or more b) in each format:
BNF: <A>::=aB   <B>::=bB   <B>::=ε
ABNF: <A>::=ab*
JSGF: <A>=a(b)*;
XML-SRGS: a <item repeat="0-"> b </item>
IHD: [graph figure, not recoverable from the slide text]
Conversion Design
• Goals
  • JSGF ↔ IHD
  • XML-SRGS ↔ IHD
  • Determination of equivalence
  • Grammar minimization
• Final Architecture
  [Figure: conversion paths among XML-SRGS, JSGF, ABNF, BNF, and IHD]
JSGF/XML-SRGS → ABNF
• JSGF → ABNF
  • Trivial
  • Similar in syntax and structure to ABNF
• XML-SRGS → ABNF
  • Harder than JSGF
  • Different in syntax and structure from ABNF
  • Requires enumeration of certain repeat attributes
Examples:
JSGF: <A>=ab*;   →   ABNF: <A>::=ab*
XML-SRGS: <item repeat="1-2"> a b </item>   →   ABNF: <S>::=(ab)|(abab)
XML-SRGS: <item repeat="2-"> a b </item>   →   ABNF: <S>::=abab(ab)*
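The repeat enumeration on this slide can be sketched in a few lines of Python. This is only an illustration that reproduces the two slide examples; the function name `expand_repeat` and the string-based representation are assumptions, not part of the ISIP tools.

```python
def expand_repeat(body: str, repeat: str) -> str:
    """Rewrite an SRGS repeat range as an ABNF-style expression.

    `body` is the repeated sequence (e.g. "ab"); `repeat` is the SRGS
    attribute value: "n", "m-n", or "m-" (unbounded).
    """
    if "-" not in repeat:
        low = high = int(repeat)
    else:
        low_text, high_text = repeat.split("-", 1)
        low = int(low_text)
        high = int(high_text) if high_text else None  # None means unbounded

    if high is None:
        # m or more: m mandatory copies followed by a Kleene star
        return body * low + f"({body})*"

    # Bounded range: enumerate every allowed count as an alternative
    alternatives = []
    for count in range(low, high + 1):
        alternatives.append(f"({body * count})" if count else "ε")
    return "|".join(alternatives)


print(expand_repeat("ab", "1-2"))  # (ab)|(abab)
print(expand_repeat("ab", "2-"))   # abab(ab)*
```

Bounded ranges have to be spelled out as alternatives because the ABNF notation used on the slides has only the star operator, which is why the slide calls this case harder than JSGF.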
JSGF/XML-SRGS → ABNF (continued)
• XML-SRGS → ABNF (continued)
  • Different weighting mechanisms (weight and repeat-prob attributes)
XML-SRGS example:
a <item repeat="0-" repeat-prob=".45"> b </item>
<one-of>
  <item weight=".4">c</item>
  <item weight=".6">d</item>
</one-of>
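To make the two weighting mechanisms concrete, the following sketch reads both attributes from the slide's fragment with Python's standard `xml.etree.ElementTree`. The enclosing `<rule id="s">` element and the helper name `collect_weights` are assumptions added so the fragment parses on its own; how the converter carries these values into ABNF is not stated on the slide.

```python
import xml.etree.ElementTree as ET

# The slide's fragment, wrapped in a hypothetical <rule> so it is well-formed XML
SRGS_FRAGMENT = """
<rule id="s">
  a <item repeat="0-" repeat-prob=".45"> b </item>
  <one-of>
    <item weight=".4">c</item>
    <item weight=".6">d</item>
  </one-of>
</rule>
"""


def collect_weights(xml_text):
    """Return (attribute, item text, value) for every weighted <item>."""
    root = ET.fromstring(xml_text)
    found = []
    for item in root.iter("item"):  # document order, nested items included
        text = (item.text or "").strip()
        if "weight" in item.attrib:
            # weight: relative likelihood of one alternative in a <one-of>
            found.append(("weight", text, float(item.attrib["weight"])))
        if "repeat-prob" in item.attrib:
            # repeat-prob: probability of each successive repetition
            found.append(("repeat-prob", text, float(item.attrib["repeat-prob"])))
    return found


for entry in collect_weights(SRGS_FRAGMENT):
    print(entry)
```

The two attributes attach to different constructs (alternatives versus repeats), so a converter has to treat them separately.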
ABNF → BNF
• ABNF → BNF
  • Complicated
  • Accomplished using a recursive algorithm that extracts sets of normalized BNF rules from a set of ABNF rules
• Normalized BNF
  • Consists of rules of the following formats:
    • (RULE_NAME)::=(TERMINAL),(NON_TERMINAL)
    • (RULE_NAME)::=(NON_TERMINAL)
    • (RULE_NAME)::=ε
• Algorithm
  • Break each rule into multiple rules at each top-level alternation. Recurse on each rule.
  • For each concatenation, Kleene star, or Kleene plus, extract a set of left symbols and a set of right symbols.
  • For n left symbols and m right symbols, create n × m connecting rules.
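One way to picture the left-symbol/right-symbol step is a Glushkov-style position construction, sketched below in Python. This is a swapped-in textbook technique used for illustration, not necessarily the authors' algorithm: it reads "left symbols" as the last symbols of the left operand and "right symbols" as the first symbols of the right operand. The tuple-based expression tree and the `Q<n>` rule names are assumptions of the sketch, and it never emits the (RULE_NAME)::=(NON_TERMINAL) form.

```python
from itertools import count


def analyse(node, ids, symbols, follow):
    """Return (nullable, first, last) for an expression tree.

    Nodes are tuples: ("sym", "a"), ("cat", x, y), ("alt", x, y),
    ("star", x), ("plus", x). Each symbol occurrence gets a position id.
    """
    kind = node[0]
    if kind == "sym":
        pos = next(ids)
        symbols[pos] = node[1]
        follow[pos] = set()
        return False, {pos}, {pos}
    if kind == "alt":
        n1, f1, l1 = analyse(node[1], ids, symbols, follow)
        n2, f2, l2 = analyse(node[2], ids, symbols, follow)
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        n1, f1, l1 = analyse(node[1], ids, symbols, follow)
        n2, f2, l2 = analyse(node[2], ids, symbols, follow)
        for left in l1:  # n left symbols x m right symbols
            follow[left] |= f2
        return n1 and n2, (f1 | f2) if n1 else f1, (l1 | l2) if n2 else l2
    if kind in ("star", "plus"):
        n, f, l = analyse(node[1], ids, symbols, follow)
        for left in l:  # loop back: last symbols connect to first symbols
            follow[left] |= f
        return (kind == "star") or n, f, l
    raise ValueError(f"unknown node kind: {kind}")


def to_normalized_bnf(expr, start="S"):
    """Emit rules of the form NAME ::= TERMINAL NON_TERMINAL or NAME ::= ε."""
    symbols, follow = {}, {}
    nullable, first, last = analyse(expr, count(1), symbols, follow)
    rules = [f"{start} ::= {symbols[p]} Q{p}" for p in sorted(first)]
    if nullable:
        rules.append(f"{start} ::= ε")
    for p in sorted(symbols):
        rules += [f"Q{p} ::= {symbols[q]} Q{q}" for q in sorted(follow[p])]
        if p in last:
            rules.append(f"Q{p} ::= ε")
    return rules


# ABNF <A>::=ab* as an expression tree
for rule in to_normalized_bnf(("cat", ("sym", "a"), ("star", ("sym", "b")))):
    print(rule)
```

For `ab*` this yields five rules rather than the three on the earlier slide; the result accepts the same language but is not minimal, which is where the separate grammar-minimization goal comes in.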
BNF ↔ IHD
• BNF ↔ IHD
  • Each arc translates to a normalized BNF rule
  • Terminals correspond to nodes; concatenations correspond to arcs
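A rough Python sketch of the node/arc idea follows. The slides do not describe the IHD file format, so this builds a plain node list and arc set instead; the `START`/`END` markers, the `(terminal, next_rule)` node naming, and the function name are all assumptions. Rules of the (RULE_NAME)::=(NON_TERMINAL) form are not handled.

```python
def bnf_to_digraph(rules, start):
    """Turn normalized BNF rules into a digraph with terminals on nodes.

    `rules` is a list of (name, terminal, next_name) tuples; an epsilon
    rule is written (name, None, None).
    """
    # One node per distinct (terminal, next_name) pair
    nodes = sorted({(t, nxt) for _, t, nxt in rules if t is not None})

    def sources(name):
        # Nodes after which rule `name` is expanded next
        found = [node for node in nodes if node[1] == name]
        return (["START"] if name == start else []) + found

    arcs = set()
    for name, terminal, nxt in rules:
        target = "END" if terminal is None else (terminal, nxt)
        for src in sources(name):
            arcs.add((src, target))
    return nodes, arcs


# <A>::=aB  <B>::=bB  <B>::=ε
RULES = [("A", "a", "B"), ("B", "b", "B"), ("B", None, None)]
nodes, arcs = bnf_to_digraph(RULES, "A")
print(nodes)
for arc in arcs:
    print(arc)
```

In this sketch one rule can produce several arcs (one per predecessor node), so it does not reproduce the slide's one-arc-per-rule correspondence exactly.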
BNF → JSGF/XML-SRGS
• BNF → JSGF/XML-SRGS
  • Rule-by-rule
  • Trivial
Example:
BNF: <A>::=aB   <B>::=bB   <B>::=ε
JSGF: <A>=aB;   <B>=bB|<NULL>;
XML-SRGS:
<rule id="a"> a <ruleref uri="#b"/> </rule>
<rule id="b">
  <one-of>
    <item> b <ruleref uri="#b"/> </item>
    <item> <ruleref special="NULL"/> </item>
  </one-of>
</rule>
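The rule-by-rule translation to JSGF might look like the Python sketch below. The tuple representation of rules and the function name `bnf_to_jsgf` are assumptions; the ε rule is written with `<NULL>`, JSGF's special rule that matches the empty string. Merging rules that share a name into one alternation is a choice of this sketch.

```python
def bnf_to_jsgf(rules):
    """Translate normalized BNF rules into JSGF rule definitions.

    `rules` is a list of (name, terminal, next_name) tuples; either of
    the last two may be None, and (name, None, None) is an epsilon rule.
    """
    grouped = {}  # rule name -> list of alternatives, in first-seen order
    for name, terminal, nxt in rules:
        if terminal is None and nxt is None:
            alternative = "<NULL>"
        else:
            parts = []
            if terminal is not None:
                parts.append(terminal)
            if nxt is not None:
                parts.append(f"<{nxt}>")
            alternative = " ".join(parts)
        grouped.setdefault(name, []).append(alternative)
    return [f"<{name}> = {' | '.join(alts)};" for name, alts in grouped.items()]


# <A>::=aB  <B>::=bB  <B>::=ε
for line in bnf_to_jsgf([("A", "a", "B"), ("B", "b", "B"), ("B", None, None)]):
    print(line)
# <A> = a <B>;
# <B> = b <B> | <NULL>;
```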
Software Tools
• ISIP Network Converter
  • Console tool to perform conversions to and from arbitrary grammar formats
• ISIP Network Builder
  • Java-based graphical tool to design grammars as finite state machines
  • Can export grammars to JSGF, XML-SRGS, ABNF, BNF, and IHD
• ISIP Language Model Tester
  • Console tool for testing grammars
  • Can generate valid sentences in a given grammar
  • Can parse sentences and determine whether they are accepted by a given grammar
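The two tester capabilities, generating sentences and checking acceptance, can be illustrated on normalized BNF with the Python sketch below. It is not the ISIP Language Model Tester and does not reflect its interface; the rule tuples and the function names `generate` and `accepts` are assumptions, and rules of the (RULE_NAME)::=(NON_TERMINAL) form are not supported.

```python
import random

# <A>::=aB  <B>::=bB  <B>::=ε, as (name, terminal, next_name) tuples
RULES = [("A", "a", "B"), ("B", "b", "B"), ("B", None, None)]


def generate(rules, start, rng, max_words=50):
    """Random walk through the rules until an epsilon rule is chosen."""
    name, words = start, []
    while len(words) <= max_words:
        # Assumes every reachable rule name has at least one rule
        _, terminal, nxt = rng.choice([r for r in rules if r[0] == name])
        if terminal is None:
            return words
        words.append(terminal)
        name = nxt
    raise RuntimeError("gave up: sentence exceeded max_words")


def accepts(rules, start, words):
    """Track every rule name the grammar could be in after each word."""
    current = {start}
    for word in words:
        current = {nxt for name, t, nxt in rules if name in current and t == word}
    # Accept only if some reachable rule name can end with epsilon
    return any(name in current and t is None for name, t, _ in rules)


sentence = generate(RULES, "A", random.Random(0))
print(sentence, accepts(RULES, "A", sentence))
print(accepts(RULES, "A", ["b", "a"]))  # False: must start with "a"
```

Tracking a set of rule names rather than a single one keeps the acceptance check correct even when several rules share a name and a terminal.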
Summary
• Future Work
  • Web-based front-end to speech recognition software
  • Mobile speech recognition
• Public Domain Toolkit
  • Contains language model conversion tools
  • Public domain – available for download