Language Model Grammar Conversion
Wesley Holland
Intelligent Electronic Systems, Human and Systems Engineering
Department of Electrical and Computer Engineering
Grammar Specifications
• Backus-Naur Form (BNF)
• Augmented BNF (ABNF)
• JSpeech Grammar Format (JSGF)
• Speech Recognition Grammar Specification (SRGS)
• ISIP Hierarchical Digraph (IHD)
Example grammar in each format:
BNF:       <A>::=aB  <B>::=bB  <B>::=ε
ABNF:      <A>::=ab*
JSGF:      <A>=a(b)*;
XML-SRGS:  a <item repeat="0-"> b </item>
IHD:       (digraph figure)
Conversion Design
• Goals
  • JSGF ↔ IHD
  • XML-SRGS ↔ IHD
  • Determination of equivalence
  • Grammar minimization
• Final Architecture
  • (Figure: conversion paths among XML, ABNF, BNF, IHD, and JSGF)
JSGF/XML-SRGS → ABNF
• JSGF → ABNF
  • Trivial
  • Similar in syntax and structure to ABNF
• XML-SRGS → ABNF
  • Harder than JSGF
  • Different in syntax and structure from ABNF
  • Requires enumeration of certain repeat attributes
XML-SRGS                          ABNF
<item repeat='1-2'> a b </item>   <S>::=(ab)|(abab)
<item repeat='2-'> a b </item>    <S>::=abab(ab)*
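The repeat enumeration in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item body is already flattened to a string such as "ab", the repeat value has one of the forms "n", "n-m", or "n-", and the function name `expand_repeat` is hypothetical rather than part of any ISIP tool.

```python
import re


def expand_repeat(body, repeat):
    """Rewrite an SRGS-style repeat range as an ABNF-style right-hand side.

    body   -- the repeated sequence, already flattened (assumption), e.g. "ab"
    repeat -- "n" (exactly n), "n-m" (n to m), or "n-" (n or more)
    """
    match = re.fullmatch(r"(\d+)(-(\d*))?", repeat.strip())
    if match is None:
        raise ValueError(f"unsupported repeat value: {repeat!r}")
    low = int(match.group(1))

    if match.group(2) is None:
        # "n": exactly n copies; zero copies is the empty string.
        return body * low if low else "ε"

    if match.group(3) == "":
        # "n-": n mandatory copies followed by a Kleene star.
        return body * low + f"({body})*"

    high = int(match.group(3))
    if high < low:
        raise ValueError(f"empty repeat range: {repeat!r}")
    # "n-m": enumerate every allowed count as its own alternative.
    return "|".join(f"({body * k})" if k else "ε" for k in range(low, high + 1))


print("<S>::=" + expand_repeat("ab", "1-2"))  # <S>::=(ab)|(abab)
print("<S>::=" + expand_repeat("ab", "2-"))   # <S>::=abab(ab)*
```

Bounded ranges grow by one alternative per allowed count, which is why only some repeat attributes need enumeration: the open-ended form maps directly onto a Kleene star.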
ABNF → BNF
• ABNF → BNF
  • Complicated
  • Accomplished using a recursive algorithm that extracts sets of normalized BNF rules from a set of ABNF rules
• Normalized BNF
  • Consists of rules of the following formats:
    • (RULE_NAME)::=(TERMINAL),(NON_TERMINAL)
    • (RULE_NAME)::=(NON_TERMINAL)
    • (RULE_NAME)::=ε
• Conversion steps (ABNF to BNF)
  • Break each rule into multiple rules at each top-level alternation. Recurse on each rule.
  • For each concatenation, Kleene star, or Kleene plus, extract a set of left symbols and a set of right symbols.
  • For n left symbols and m right symbols, create n × m connecting rules.
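The first conversion step, breaking a rule at each top-level alternation, might look like the following Python sketch. It assumes that right-hand sides are plain strings, that parentheses are the only grouping construct, and that `|` marks alternation. The names `split_top_level` and `split_rule` are hypothetical, and the later steps (left and right symbol sets, connecting rules) are not covered here.

```python
def split_top_level(rhs):
    """Split a right-hand side at every '|' that is not inside parentheses."""
    parts = []
    current = []
    depth = 0
    for ch in rhs:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in rule")
        if ch == "|" and depth == 0:
            # A top-level alternation ends the current alternative.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError("unbalanced '(' in rule")
    parts.append("".join(current).strip())
    return parts


def split_rule(name, rhs):
    """Turn one rule into one rule per top-level alternative."""
    return [(name, alternative) for alternative in split_top_level(rhs)]


for rule_name, alternative in split_rule("S", "(ab)|(abab)"):
    print(f"<{rule_name}>::={alternative}")
```

Alternations nested inside parentheses are left untouched at this stage; a recursive converter would reach them when it descends into each group.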
BNF ↔ IHD
• Each arc translates to a normalized BNF rule
• Terminals correspond to nodes; concatenations correspond to arcs
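One way to picture the arc-to-rule mapping is the Python sketch below. The graph encoding is an assumption made for illustration, not the IHD file format: each node carries one terminal, unlabeled nodes "S" and "T" mark the start and end of the graph, an arc into a labeled node becomes a rule of the form (RULE_NAME)::=(TERMINAL),(NON_TERMINAL), and an arc into "T" becomes an ε rule. All names are hypothetical.

```python
def digraph_to_bnf(labels, arcs, stop="T"):
    """Emit one normalized BNF rule per arc.

    labels -- dict mapping node name to the terminal on that node
    arcs   -- list of (source, destination) node pairs
    Rules are (name, terminal, next_name) tuples; (name, None, None) is ε.
    """
    rules = []
    for source, destination in arcs:
        if destination == stop:
            rules.append((source, None, None))
        else:
            rules.append((source, labels[destination], destination))
    return rules


def format_rule(rule):
    """Render a rule tuple in the notation used on the slides."""
    name, terminal, next_name = rule
    if terminal is None:
        return f"<{name}>::=ε"
    return f"<{name}>::={terminal},<{next_name}>"


# Digraph for "a followed by zero or more b" (the ab* example).
LABELS = {"n1": "a", "n2": "b"}
ARCS = [("S", "n1"), ("n1", "n2"), ("n2", "n2"), ("n1", "T"), ("n2", "T")]

for rule in digraph_to_bnf(LABELS, ARCS):
    print(format_rule(rule))
```

The rules for n1 and n2 come out with identical right-hand sides, which is the kind of redundancy a later minimization pass can remove.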
Minimization
• Happens in BNF
• Iterate over the rule set, merging redundant rules
• Rules can be merged if the nonterminals of both rules reference the same terminal
• Example: (figure)
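The merge criterion on this slide is stated loosely, so the Python sketch below uses one possible interpretation: two nonterminals are redundant when they own exactly the same set of right-hand sides, and merging repeats until nothing changes. The tuple encoding (name, terminal, next_name), with (name, None, None) for ε, and the function name `merge_redundant` are assumptions for illustration. This simplified pass does not catch every equivalence a full state-minimization algorithm would.

```python
def merge_redundant(rules, start):
    """Merge nonterminals whose right-hand-side sets are identical."""
    rules = set(rules)
    while True:
        # Collect the set of right-hand sides owned by each nonterminal.
        bodies = {}
        for name, terminal, next_name in rules:
            bodies.setdefault(name, set()).add((terminal, next_name))

        # Group nonterminals that own exactly the same right-hand sides.
        groups = {}
        for name, body in bodies.items():
            groups.setdefault(frozenset(body), []).append(name)

        # Pick one representative per group, preferring the start symbol.
        rename = {}
        for names in groups.values():
            names.sort(key=lambda n: (n != start, n))
            for other in names[1:]:
                rename[other] = names[0]

        if not rename:
            return rules

        # Rewrite every rule with the representatives; duplicates collapse
        # automatically because the rules are kept in a set.
        rules = {
            (rename.get(name, name), terminal, rename.get(next_name, next_name))
            for name, terminal, next_name in rules
        }


RULES = {
    ("S", "a", "n1"),
    ("n1", "b", "n2"),
    ("n2", "b", "n2"),
    ("n1", None, None),
    ("n2", None, None),
}
for rule in sorted(merge_redundant(RULES, "S"), key=str):
    print(rule)
```

Here n1 and n2 both expand to "b,<n2>" or ε, so they collapse into one nonterminal and five rules become three.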
Software Tools
• ISIP Network Converter
  • Console tool to perform conversions to and from arbitrary grammar formats
• ISIP Network Builder
  • Java-based graphical tool to design grammars as finite state machines
  • Can export grammars to JSGF, XML-SRGS, ABNF, BNF, and IHD
• ISIP Language Model Tester
  • Console tool for testing grammars
  • Can generate valid sentences in a given grammar
  • Can parse sentences and determine whether they are accepted by a given grammar
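To make the two tester operations concrete, here is a minimal Python sketch of sentence generation and acceptance over normalized BNF rules. It is not the ISIP Language Model Tester and does not reflect its interface. The rule encoding (name, terminal, next_name), with (name, None, None) for ε and (name, None, next_name) for a rule that only names another nonterminal, is an assumption, as are the function names.

```python
def accepts(rules, start, words):
    """Return True if the word sequence can be derived from `start`."""

    def closure(names):
        # Follow rules of the form (RULE_NAME)::=(NON_TERMINAL).
        seen, stack = set(names), list(names)
        while stack:
            current = stack.pop()
            for name, terminal, next_name in rules:
                if name == current and terminal is None and next_name is not None:
                    if next_name not in seen:
                        seen.add(next_name)
                        stack.append(next_name)
        return seen

    active = closure({start})
    for word in words:
        active = closure({nxt for name, term, nxt in rules
                          if name in active and term == word})
        if not active:
            return False
    # Accept only if some active nonterminal can end with an ε rule.
    return any(name in active and term is None and nxt is None
               for name, term, nxt in rules)


def sentences(rules, start, max_words):
    """List every sentence of at most `max_words` words, shortest first."""
    results = set()
    frontier = [(start, ())]
    seen = set(frontier)
    while frontier:
        state, words = frontier.pop()
        for name, terminal, next_name in rules:
            if name != state:
                continue
            if terminal is None and next_name is None:
                results.add(words)  # ε rule: the sentence is complete
                continue
            extended = words if terminal is None else words + (terminal,)
            step = (next_name, extended)
            if len(extended) <= max_words and step not in seen:
                seen.add(step)
                frontier.append(step)
    return sorted(results, key=lambda w: (len(w), w))


# The BNF example from the Grammar Specifications slide: a followed by b*.
RULES = [("A", "a", "B"), ("B", "b", "B"), ("B", None, None)]

print(sentences(RULES, "A", 3))
print(accepts(RULES, "A", ["a", "b", "b"]), accepts(RULES, "A", ["b"]))
```

Bounding generation by sentence length keeps the enumeration finite even though the grammar itself accepts infinitely many sentences.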