ITCS 6010 GSL Grammars
What is a Grammar? • Specifies what can be said: all the possible sentences and phrases that can be recognized • Developer’s goal is to: • Predict the set of phrases • Encode the phrase set • Non-procedural • Written in Grammar Specification Language (GSL)
Writing a Good Grammar • Broad coverage • People express themselves in a variety of ways • Recognizer cannot recognize anything not in the grammar • But not too broad • Recognition accuracy can be adversely affected
Writing Grammars • Grammar writing is an iterative process • Make best guess • Collect data • Update grammar • Out-of-grammar (OOG) • Occurs when the user’s phrase cannot be parsed • An OOG rate of 5–10% is acceptable
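As a rough illustration of the "collect data" step, the OOG rate can be read as the share of collected utterances that the grammar does not cover. The sketch below is Python, not GSL. The phrase set and the utterances are made-up sample data, and exact string lookup stands in for the parsing a real recognizer would do.

```python
def oog_rate(utterances, in_grammar_phrases):
    """Return the fraction of utterances not covered by the phrase set."""
    if not utterances:
        return 0.0
    covered = {p.lower() for p in in_grammar_phrases}
    misses = sum(1 for u in utterances if u.lower().strip() not in covered)
    return misses / len(utterances)


# Hypothetical sample data, for illustration only.
phrases = {"charlotte", "departing from charlotte", "my departure city is charlotte"}
collected = ["Charlotte", "departing from Charlotte", "um Charlotte I guess", "Charlotte"]

rate = oog_rate(collected, phrases)
print(f"OOG rate: {rate:.0%}")  # one miss out of four utterances
```

A rate well above the 5–10% range would suggest another round of grammar updates.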
Writing Grammars (cont’d) • Focus on most common ways users will respond to question/prompt • DON’T attempt to figure out ALL possible responses – wasted effort • Two most common response types: • Information item • Literal response to question wording
Writing Grammars (cont’d) • Examples: • Question: What is the departure city? • Common responses: • Charlotte • My departure city is Charlotte • Departing from Charlotte • Question: What city would you like? • Common responses: • Charlotte • I’d like Charlotte • Departing from Charlotte
Writing Grammars (cont’d) • Important: • Word prompts carefully • Coordinate grammars and prompts • If prompt changes, change grammar to suit
Writing Grammars (cont’d) • Process for developing grammar: • Define dialog • Identify information items and define slots • Design prompts • Anticipate caller responses • Identify “core” and “filler” portions • Write GSL code
Defining Dialog • A good understanding of the dialog is required before the grammar is written • Answer the following questions: • What pieces of information are required to complete the task? • In what order will the information be requested? • Will one piece of information be requested at a time (directed dialog), or several pieces (mixed initiative)?
Information and Slot Identification • Allocate one slot for each piece of required information • Slot has: • Name • Value format • Value type
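One way to picture a slot definition is as a small record with the three properties listed above. The sketch below is Python, not GSL. The class name `SlotSpec`, the helper `check_value`, and the two sample slots are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotSpec:
    name: str          # slot name referenced by the grammar
    value_format: str  # human-readable description of how values look
    value_type: type   # Python type standing in for the slot's value type


# Hypothetical slots, invented for illustration only.
SLOTS = [
    SlotSpec("departure_city", "lowercase city name, e.g. 'charlotte'", str),
    SlotSpec("passenger_count", "whole number, e.g. 2", int),
]


def check_value(spec, value):
    """Return True if value has the type the slot expects."""
    return isinstance(value, spec.value_type)


print(check_value(SLOTS[0], "charlotte"))
print(check_value(SLOTS[1], "two"))
```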
Slot Example • Air travel application
Prompt Design • Design prompt before writing grammar • Prompt wording greatly affects user response • Match prompt to slot
Prompt Design Example • Directed dialog
Prompt Design Example • Mixed initiative
Anticipate Caller Response
Prompt: What city would you like to leave from?
• Charlotte [city name alone]
• I’d like to leave from Charlotte [literal response]
• Uh, Charlotte [initial hesitation]
• Charlotte, please [final “please”]
• (I’m) leaving from Charlotte; (I’m) departing from Charlotte [additional possibilities]
Prompt: What city would you like to fly to?
• New York [city name alone]
• I’m flying to New York [literal response]
• Um, New York [initial hesitation]
• New York, please [final “please”]
• (I’d) like to fly to New York; (I’m) going to New York [additional possibilities]
Identify Grammar Core and Filler • Core • Portion with most important meaning-bearing words • Highly reusable => defined as subgrammar • Filler • Depends largely on prompt wording
Core and Filler Example
• Core
    • CITY – Charlotte, Raleigh, New York, Miami, San Diego
    • DATE
    • TIME
• Filler (prompt: What city would you like to leave from?)
    CITY
    I’d like to leave from CITY
    Uh, CITY
    CITY, please
    (I’m) leaving from CITY
    (I’m) departing from CITY
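To see why the core is worth factoring out, it can help to count how many phrases the filler patterns produce once every city is plugged in. The sketch below is Python, not GSL. The `{city}` placeholder and the function name `expand_fillers` are illustrative choices, and the optional "(I’m)" is written out as two separate patterns.

```python
# Core: the city list from the slide, lowercased like GSL word tokens.
CITIES = ["charlotte", "raleigh", "new york", "miami", "san diego"]

# Filler patterns from the slide; "{city}" marks where the core goes.
FILLERS = [
    "{city}",
    "i'd like to leave from {city}",
    "uh {city}",
    "{city} please",
    "leaving from {city}",
    "i'm leaving from {city}",
    "departing from {city}",
    "i'm departing from {city}",
]


def expand_fillers(fillers, cities):
    """Return every phrase produced by pairing each filler with each city."""
    return [f.format(city=c) for f in fillers for c in cities]


phrases = expand_fillers(FILLERS, CITIES)
print(len(phrases))  # 8 fillers x 5 cities = 40
print(phrases[0])    # charlotte
```

Changing the prompt only changes `FILLERS`; the city list is reused as-is, which is the point of defining the core as a subgrammar.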
Grammar Specification Language • Create grammar file • Text file • File extension .grammar • Can contain more than one grammar definition • Grammar definition contains the grammar name and description
GSL (cont’d) • Grammar definition format: GrammarName GrammarDescription • GrammarName • Character string used to reference the grammar • Contains at least one uppercase letter (usually the first letter) • No more than 200 characters in length • Can contain: • Upper- and lower-case letters • Digits • - dash • _ underscore • ' single quote • @ at sign • . period
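The naming rules above can be checked mechanically. The sketch below is Python, not part of any GSL toolchain; the function name `is_valid_grammar_name` is hypothetical, and it encodes only the rules listed on this slide.

```python
import re

# Letters, digits, dash, underscore, single quote, at sign, period.
_ALLOWED = re.compile(r"[A-Za-z0-9\-_'@.]+")


def is_valid_grammar_name(name):
    """Check a name against the rules listed on the slide."""
    return (
        0 < len(name) <= 200
        and _ALLOWED.fullmatch(name) is not None
        and any(ch.isupper() for ch in name)
    )


print(is_valid_grammar_name("Sentence"))   # has an uppercase letter
print(is_valid_grammar_name("sentence"))   # no uppercase letter
print(is_valid_grammar_name("Bad Name"))   # contains a space
```

The uppercase requirement is what lets a grammar name be told apart from a word token, which must be lowercase.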
GSL (cont’d) • GrammarDescription • Recognizable word sequence or phrase • Consists of: • Word tokens • Grammar names • Operators • Word tokens and grammar names are separated by a white space character (space, tab, newline) • Word tokens are terminal symbols • Represent the actual words to be recognized, e.g. dog • Must be lowercase
GSL Code
    ; This is a simple grammar
    Sentence ( good morning )
• A semicolon (;) indicates a comment
• Sentence is the name of the grammar
OR Construction
    [A B C … D]    A or B or C or … or D
    Sentence ( good [ morning afternoon evening ] )
Operators
• (A B C): A followed by B, then C
    (good morning)
    matches: good morning
• [A B C]: A or B or C
    (good [morning afternoon evening night])
    matches: good morning, good afternoon, good evening, good night
Operators
• ?A: A is optional
    Command ( tell me my balance in checking ?please )
    matches: tell me my balance in checking; tell me my balance in checking please
• +A: one or more repetitions of A
    Sentence ( thanks +very much )
    matches: thanks very much; thanks very very much; thanks very … very much
• *A: zero or more repetitions of A
    Sentence ( thanks *very much )
    matches: thanks much; thanks very much; thanks very very much; …
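The five operators can be mimicked with a small enumerator that lists the phrases a grammar description accepts. The sketch below is Python, not GSL, and it is not how a recognizer works internally. The tuple encoding (`"seq"`, `"or"`, `"opt"`, `"plus"`, `"star"`), the function names, and the `max_repeat` cap are all assumptions made for illustration; the cap exists only because + and * would otherwise produce an endless list.

```python
from itertools import product


def expand_node(node, max_repeat=2):
    """Return the word sequences (as tuples) that a node can match.

    node is a lowercase word (str) or a tuple (op, *children), where op is
    "seq", "or", "opt", "plus", or "star". Repetition is capped at
    max_repeat so the result stays finite.
    """
    if isinstance(node, str):
        return [(node,)]
    op, *kids = node
    if op == "seq":  # (A B C)
        parts = [expand_node(k, max_repeat) for k in kids]
        return [sum(combo, ()) for combo in product(*parts)]
    if op == "or":  # [A B C]
        return [seq for k in kids for seq in expand_node(k, max_repeat)]
    if op == "opt":  # ?A
        return [()] + expand_node(kids[0], max_repeat)
    if op in ("plus", "star"):  # +A and *A
        base = expand_node(kids[0], max_repeat)
        start = 1 if op == "plus" else 0
        out = []
        for n in range(start, max_repeat + 1):
            out.extend(sum(combo, ()) for combo in product(base, repeat=n))
        return out
    raise ValueError(f"unknown operator: {op}")


def phrases(node, max_repeat=2):
    """Return the matched word sequences joined into strings."""
    return [" ".join(words) for words in expand_node(node, max_repeat)]


greeting = ("seq", "good", ("or", "morning", "afternoon", "evening"))
thanks_plus = ("seq", "thanks", ("plus", "very"), "much")
thanks_star = ("seq", "thanks", ("star", "very"), "much")

print(phrases(greeting))     # ['good morning', 'good afternoon', 'good evening']
print(phrases(thanks_plus))  # ['thanks very much', 'thanks very very much']
print(phrases(thanks_star))  # ['thanks much', 'thanks very much', 'thanks very very much']
```

Note that + and * apply only to the item that follows them, so `+very much` repeats "very", not "very much".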
Natural Language Interpretation • NL interpretation assigns meaning to word strings • Many utterances. . . • “withdraw fifteen hundred bucks from savings” • “take fifteen hundred out of savings” • “give me one thousand five hundred dollars from my savings account” • . . .may express the same meaning: <action "withdrawal"> <source_account "savings"> <amount 1500>
Interpretation • Slots are ... • Defined for the domain • command • amount • source • Associated with word strings in the grammar • Filled with values when the associated word string is recognized by NL Interpretation
Interpretation
• Define the relevant “slots” for the domain
    Slot                   Value
    command                "transfer"
    source-account         "savings"
    destination-account    "checking"
    amount                 125.10
• Example utterances:
    “Transfer one twenty five ten from savings to checking”
    “I want to transfer to checking from savings one hundred twenty five dollars and ten cents”
    “Please put a hundred twenty five dollars ten cents in checking from my savings account”
Slot-Filling Commands
• NL commands go between curly braces: { … }
• Commands “attach” to the preceding item, either a word or a grammar construction
• NL commands are part of the grammar file:
    Command
      ( withdraw from
        [ checking {<source_account "checking">}
          savings  {<source_account "savings">}
        ]
      ) {<action "withdrawal">}
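The effect of the Command grammar above can be imitated in a few lines: the inner commands fill source_account from whichever account word matched, and the outer command fills action once the whole phrase has matched. The sketch below is Python, not GSL; the function name `interpret_withdrawal` and the dictionary result are illustrative assumptions about how filled slots might be represented.

```python
def interpret_withdrawal(utterance):
    """Return filled slots for 'withdraw from <account>', or None if OOG."""
    words = utterance.lower().split()
    if len(words) != 3 or words[:2] != ["withdraw", "from"]:
        return None
    account = words[2]
    if account not in ("checking", "savings"):
        return None
    # The inner command fills source_account; the outer one fills action.
    return {"source_account": account, "action": "withdrawal"}


print(interpret_withdrawal("withdraw from savings"))
print(interpret_withdrawal("withdraw from brokerage"))  # out of grammar
```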
More About Grammars • Subgrammars • Return Commands • NL Functions
Subgrammars
• Subgrammars match a “part” of an utterance
    Account ( [ savings checking (money market) ] ?account )
• Subgrammars reduce redundancy
    Command
      [ ( tell me the balance in Account )
        ( transfer from Account to Account )
        ( withdraw from Account )
      ]
Return Commands and Variables • To associate a return value with a grammar: {return("checking")} • “return” is like other {} commands except no slot is filled; only the value is defined • Assignment: A higher-level grammar can store the returned value in a variable: • <Sub-grammar>:<variable_name> • Example: Account:acct results in the variable acct being set to the value returned by the grammar Account • Dereferencing: To access a variable’s value, preface the variable name with ‘$’.
Return Commands and Variables
    Command
      [ ( tell me the balance in Account:acct ) {<account $acct>}
        ( transfer from Account:src to Account:dest ) {<source-account $src> <dest-account $dest>}
        ( withdraw from Account:src ) {<source-account $src>}
      ]

    Account
      ( [ checking {return("checking")}
          savings {return("savings")}
          ( money market ) {return("money_market")}
        ]
        ?account
      )
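The flow of return values into variables and then into slots can be traced with a small imitation of the transfer branch. The sketch below is Python, not GSL. The helper names `match_account` and `interpret_transfer` are hypothetical, and the hand-written word matching is only a stand-in for what the recognizer does with the grammar.

```python
# Values returned by the Account subgrammar, keyed by the matched words.
ACCOUNT_RETURNS = {
    "checking": "checking",
    "savings": "savings",
    "money market": "money_market",
}


def match_account(words):
    """Mimic the Account subgrammar: return (value, words_consumed) or None."""
    for size in (2, 1):  # try "money market" before single-word names
        name = " ".join(words[:size])
        if len(words) >= size and name in ACCOUNT_RETURNS:
            used = size
            if words[used:used + 1] == ["account"]:  # the optional ?account
                used += 1
            return ACCOUNT_RETURNS[name], used
    return None


def interpret_transfer(utterance):
    """Mimic: ( transfer from Account:src to Account:dest )."""
    words = utterance.lower().split()
    if words[:2] != ["transfer", "from"]:
        return None
    src = match_account(words[2:])  # plays the role of Account:src
    if src is None:
        return None
    rest = words[2 + src[1]:]
    if rest[:1] != ["to"]:
        return None
    dest = match_account(rest[1:])  # plays the role of Account:dest
    if dest is None or len(rest) != 1 + dest[1]:
        return None
    # Equivalent of {<source-account $src> <dest-account $dest>}
    return {"source-account": src[0], "dest-account": dest[0]}


print(interpret_transfer("transfer from money market account to checking"))
```

Because Account returns "money_market" for the two-word phrase, the slot value is the same whether or not the caller adds the word "account".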
NL Functions
• Slot values and return values can be function calls
• Available functions:
    add       returns the sum of two integers
    sub       returns the result of subtracting the second integer from the first
    mul       returns the product of two integers
    div       returns the truncated integer result of dividing the first integer by the second (e.g., div(9 5) returns 1)
    neg       returns the negation of an integer
    strcat    returns the concatenation of two strings
• Arguments are separated by whitespace, not commas
• No space between function name and parenthesis
NL Functions
• Example:
    Digit  [ one {return(1)} two {return(2)} three {return(3)} ... ]
    Decade [ twenty {return(20)} thirty {return(30)} forty {return(40)} ... ]
    Number ( Decade:d1 Digit:d2 ) {<number add($d1 $d2)>}
• Matching the top-level grammar Number fills the slot number with the sum of NL variables d1 and d2
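The NL functions and the Number example can be mirrored in Python to check the arithmetic. This is a sketch, not GSL. The lookup tables contain only the words actually listed on the slide (the elided entries are left out), `interpret_number` is a hypothetical helper, and the behavior of `div` for negative operands (truncation toward zero) is an assumption, since the slide only gives div(9 5).

```python
def add(a, b):
    return a + b


def sub(a, b):
    return a - b


def mul(a, b):
    return a * b


def div(a, b):
    """Integer division truncated toward zero (assumed for negatives)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def neg(a):
    return -a


def strcat(a, b):
    return a + b


# Only the entries shown on the slide; the rest are elided there too.
DIGIT = {"one": 1, "two": 2, "three": 3}
DECADE = {"twenty": 20, "thirty": 30, "forty": 40}


def interpret_number(utterance):
    """Mimic: Number ( Decade:d1 Digit:d2 ) {<number add($d1 $d2)>}."""
    words = utterance.lower().split()
    if len(words) != 2 or words[0] not in DECADE or words[1] not in DIGIT:
        return None
    d1, d2 = DECADE[words[0]], DIGIT[words[1]]
    return {"number": add(d1, d2)}


print(div(9, 5))                         # 1, as on the slide
print(interpret_number("twenty three"))  # {'number': 23}
```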