110 likes | 124 Views
Understand the tools, techniques, and specifications for converting speech recognition grammars. Learn about XML-SRGS, ABNF, BNF, IHD, JSGF, grammar minimization, and more.
E N D
Language Model Grammar Conversion XML ABNF IHD BNF JSGF BNF Wesley Holland, Julie Baca, Dhruva Duncan, Joseph Picone Center for Advanced Vehicular Systems Mississippi State University
Speech Recognition • Acoustic Model • Maps audio data to words or phonemes • Language Model • Specifies order in which a sequence of words or phonemes is likely to occur • Described using grammar
Grammar Specifications • Backus-Naur Form (BNF) • Augmented BNF (ABNF) • JSpeech Grammar Format (JSGF) • Speech Recognition Grammar Specification (SRGS) • ISIP Hierarchical Digraph (IHD) BNF ABNF JSGF <A>::=aB <B>::=bB <B>::=ε <A>::=ab* <A>=a(b)*; XML-SRGS IHD a <item repeat=“0-”> b </item>
Conversion Design • Goals • JSGF ↔ IHD • XML-SRGS ↔ IHD • Determination of equivalence • Grammar minimization • Final Architecture XML ABNF BNF IHD JSGF
JSGF/XML-SRGS → ABNF • JSGF→ABNF • Trivial • Similar in syntax and structure to ABNF • XML-SRGS →ABNF • Harder than JSGF • Different in syntax and structure from ABNF • Requires enumeration of certain repeat attributes XML-SRGS ABNF <item repeat=‘1-2’> a b </item> <S>::=(ab)|(abab) <item repeat=‘2-’> a b </item> <S>::=abab(ab)*
JSGF/XML-SRGS → ABNF • XML-SRGS →ABNF (continued) • Different weighting mechanisms (weight and repeat-prob attributes) a <item repeat=“0-” repeat-prob=“.45”> b </item> <one-of> <item weight=“.4”>c</item> <item weight=“.6”>d</item> </one-of>
ABNF → BNF • Normalized BNF • Consists of rules of the following formats: • (RULE_NAME)::=(TERMINAL),(NON_TERMINAL) • (RULE_NAME)::=(NON_TERMINAL) • (RULE_NAME)::=ε ABNF • Break rule into multiple rules at each top-level alternation. Recurse on each rule. • For each concatenation, Kleene star, or Kleene plus, extract a set of left symbols and a set of right symbols. • For n left symbols and m right symbols, create n x m connecting rules. • ABNF → BNF • Complicated • Accomplished using a recursive algorithm that extracts sets of normalized BNF rules from a set of ABNF rules BNF
BNF ↔ IHD • BNF ↔ IHD • Each arc translates to a normalized BNF • Terminals correspond to nodes; concatenations correspond to arcs BNF IHD
BNF → JSGF/XML-SRGS • BNF →JSGF/XML-SRGS • Rule-by-rule • Trivial XML-SRGS <rule id=“a”> a <ruleref uri=“#b”/> </rule> <rule id=“b”> <one-of> <item> b <ruleref uri=“#b”/> </item> <item> <ruleref special= “NULL”/> </item> </one-of> </rule> BNF JSGF <A>::=aB <B>::=bB <B>::=ε <A>=aB; <B>=b*;
Software Tools • ISIP Network Converter • Console tool to perform conversions to and from arbitrary grammar formats • ISIP Network Builder • Java-based graphical tool to design grammars as finite state machines • Can exports grammars to JSGF, XML-SRGS, ABNF, BNF, and IHD • ISIP Language Model Tester • Console tool for testing of grammars • Can generate valid sentences in a given grammar • Can parse sentences and determine if accepted by a given grammar.
Minimization • Minimization • Happens in BNF • Iterate over rule set, merging redundant rules • Rules can be merged if the non terminal of both rules reference the same terminal • Example: