The Practical Value of Statistics for Sentence Generation: The Perspective of the Nitrogen System
Irene Langkilde-Geary
How well do statistical n-grams make linguistic decisions?

Subject-Verb Agreement
I am 2797
I are 47
I is 14

Article-Noun Agreement
a trust 394     an trust 0     the trust 1355
a trusts 2      an trusts 0    the trusts 115

Singular vs Plural
their trust 28
their trusts 8

Word Choice
reliance 567    trust 6100
reliances 0     trusts 1083
More Examples

Relative pronoun
visitor who 9      visitors who 20
visitor which 0    visitors which 0
visitor that 9     visitors that 14

Singular vs Plural
visitor 575        visitors 1083

Verb Tense
admire 212
admired 211
admires 107

Preposition
in Japan 5413        to Japan 1196
came to 2443         arrived in 544
came in 1498         arrived to 35
came into 244        arrived into 0
came to Japan 7      arrived to Japan 0
came into Japan 1    arrived into Japan 0
came in Japan 0      arrived in Japan 4
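The counts above can be turned into a decision rule by preferring the alternative seen most often. The following is only a minimal sketch of that idea, not Nitrogen's actual code: the function name `pick` and the dictionary layout are assumptions made for illustration, while the counts themselves are copied from the slides.

```python
# N-gram counts copied from the slides; the dictionary layout is an assumption.
COUNTS = {
    "I am": 2797, "I are": 47, "I is": 14,
    "a trust": 394, "an trust": 0, "the trust": 1355,
    "visitors who": 20, "visitors which": 0, "visitors that": 14,
    "arrived in Japan": 4, "arrived to Japan": 0, "arrived into Japan": 0,
}


def pick(alternatives, counts=COUNTS):
    """Return the alternative with the highest count (hypothetical helper).

    Unseen phrases count as 0; on a tie the first alternative listed wins.
    """
    return max(alternatives, key=lambda phrase: counts.get(phrase, 0))


print(pick(["I is", "I are", "I am"]))
print(pick(["arrived to Japan", "arrived in Japan", "arrived into Japan"]))
```

A raw count comparison like this only works when the alternatives compete for the same slot; comparing phrases of different lengths would need normalized probabilities.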
Nitrogen takes a two-step approach
• Enumerate all possible expressions
• Rank them in order of probabilistic likelihood
Why two steps? They are independent.
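The independence of the two steps can be illustrated with a small sketch in which the enumerator knows nothing about the scorer. Everything here is hypothetical: the slot lists, the function names, and the toy scoring function (which just sums a few word-pair counts from the earlier slides) are assumptions for illustration, not the system's real components.

```python
from itertools import product

# A few word-pair counts from the earlier slides, used as a toy scorer.
PAIR_COUNTS = {
    ("visitor", "who"): 9, ("visitors", "who"): 20,
    ("visitor", "which"): 0, ("visitors", "which"): 0,
}


def enumerate_candidates(slots):
    """Step 1: build every sentence with one choice per slot."""
    return [" ".join(choice) for choice in product(*slots)]


def rank(candidates, score):
    """Step 2: order candidates best-first using any scoring function."""
    return sorted(candidates, key=score, reverse=True)


def toy_score(sentence):
    """Sum the counts of adjacent word pairs (an assumed, simplistic score)."""
    words = sentence.lower().split()
    return sum(PAIR_COUNTS.get(pair, 0) for pair in zip(words, words[1:]))


slots = [["Visitor", "Visitors"], ["which", "who"], ["admire", "admires"], ["Mount Fuji"]]
candidates = enumerate_candidates(slots)
ranked = rank(candidates, toy_score)
print(len(candidates), ranked[0])
```

Because `rank` accepts any scoring function, the toy scorer could be swapped for a full n-gram model without touching the enumeration step.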
Assigning probabilities
• Ngram model
  Formula for bigrams: $P(S) = P(w_1 \mid \text{START}) \cdot P(w_2 \mid w_1) \cdots P(w_n \mid w_{n-1})$
• Probabilistic syntax (current work)
  • A variant of probabilistic parsing models
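The bigram formula can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the three-sentence corpus is invented, the `<START>` token name is hypothetical, and add-alpha smoothing is an assumption (the slides do not say how unseen bigrams are handled). Log probabilities are summed instead of multiplying raw probabilities to avoid underflow.

```python
import math
from collections import Counter


def train_bigrams(sentences):
    """Count contexts and bigrams, prefixing each sentence with a start token."""
    context, bigram = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<START>"] + sentence.split()
        for prev, cur in zip(tokens, tokens[1:]):
            context[prev] += 1
            bigram[(prev, cur)] += 1
    return context, bigram


def log_prob(sentence, context, bigram, vocab_size, alpha=1.0):
    """log P(S) = sum over i of log P(w_i | w_{i-1}), with add-alpha smoothing."""
    tokens = ["<START>"] + sentence.split()
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigram[(prev, cur)] + alpha) / (context[prev] + alpha * vocab_size)
        total += math.log(p)
    return total


# Invented toy corpus, for illustration only.
CORPUS = [
    "visitors admire Mount Fuji",
    "visitors admire Japan",
    "a visitor admires Mount Fuji",
]
context, bigram = train_bigrams(CORPUS)
VOCAB_SIZE = len({word for s in CORPUS for word in s.split()})

good = log_prob("visitors admire Mount Fuji", context, bigram, VOCAB_SIZE)
bad = log_prob("visitors admires Mount Fuji", context, bigram, VOCAB_SIZE)
print(good > bad)
```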
Sample Results of Bigram model
Random path (out of a set of 11,664,000 semantically-related sentences):
Visitant which came into the place where it will be Japanese has admired that there was Mount Fuji.
Top three:
Visitors who came in Japan admire Mount Fuji .
Visitors who came in Japan admires Mount Fuji .
Visitors who arrived in Japan admire Mount Fuji .
Strengths
• Reflects the reality that 55% of dependencies are binary and hold between adjacent words (Stolcke et al. 1997)
• Embeds linear ordering constraints
Limitations of Bigram model

Example                                               Reason
Visitors come in Japan.                               A three-way dependency
He planned increase in sales.                         Part-of-speech ambiguity
A tourist who admire Mt. Fuji...                      Long-distance dependency
A dog eat/eats bone.                                  Previously unseen ngrams
I cannot sell their trust.                            Nonsensical head-arg relationship
The methods must be modified to the circumstances.    Improper subcat structure
Representation of enumerated possibilities (easily on the order of $10^{15}$ to $10^{32}$ or more)
• List
• Lattice
• Forest
• Issues
  • space/time constraints
  • redundancy
  • localization of dependencies
  • non-uniform weights of dependencies
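The space argument for a lattice over a flat list can be seen on a toy example: the number of sentences grows as the product of the choices at each position, while the number of lattice edges grows only as their sum. The lattice below and its node names are hypothetical, chosen for illustration; this is a sketch of the general idea, not Nitrogen's data structure.

```python
from functools import lru_cache

# Hypothetical toy lattice: node -> list of (word, next_node); "END" is final.
LATTICE = {
    "S0": [("Visitors", "S1"), ("Visitor", "S1")],
    "S1": [("who", "S2"), ("that", "S2"), ("which", "S2")],
    "S2": [("came", "S3"), ("arrived", "S3")],
    "S3": [("in", "S4"), ("to", "S4"), ("into", "S4")],
    "S4": [("Japan", "END")],
}


@lru_cache(maxsize=None)
def count_paths(node):
    """Number of distinct word sequences from node to END (memoized)."""
    if node == "END":
        return 1
    return sum(count_paths(nxt) for _, nxt in LATTICE[node])


def count_edges():
    """Number of edges actually stored in the lattice."""
    return sum(len(edges) for edges in LATTICE.values())


print(count_paths("S0"), count_edges())
```

Here 11 stored edges encode 36 sentences; a list representation would have to store all 36 explicitly.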
Number of phrases versus size (in bytes) for 15 sample inputs
Number of phrases versus time (in seconds) for 15 sample inputs
Generating from Templates and Meaning-based Inputs
INPUT → ( <label> <feature> VALUE )
VALUE → INPUT -OR- <label>
Labels are defined in:
• input
• user-defined lexicon
• WordNet-based lexicon (~ 100,000 concepts)
Example Input:
(a1 :template (a2 / “eat” :agent YOU :patient a3) :filler (a3 / |poulet| ))
Mapping Rules
• Recast one input to another
  • (implicitly providing varying levels of abstraction)
• Assign linear order to constituents
• Add missing info to under-specified inputs
Matching Algorithm
• Rule order determines priority. Generally:
  • Recasting < linear ordering < under-specification
  • High (more semantic) level of abstraction < low (more syntactic)
  • Distant position (adjuncts) from head < near (complements)
  • Basic properties < specialized
Recasting
(a1 :venue <venue> :cusine <cuisine> )
(a2 / |serve| :agent <venue> :patient <cuisine> )
(a2 / |have the quality of being| :domain (a3 / “food type” :possessed-by <venue>) :range (b1 / |cuisine|))
Recasting
(a1 :venue <venue> :region <region> )
(a2 / |serve| :agent <venue> :patient <cuisine> )
(a3 / |serve| :voice active :subject <venue> :object <cuisine> )
(a3 / |serve| :voice passive :subject <cuisine> :adjunct (b1 / <venue> :anchor |BY| ))
Linear ordering
(a3 / |serve| :voice active :subject <venue> :object <cuisine> )
<venue> (a4 / |serve| :voice active :object <cuisine> )
Under-specification
(a4 / |serve|)
(a6 / |serve| :cat noun)
(a5 / |serve| :cat verb)
Under-specification
(a4 / |serve|)
(a5 / |serve| :cat verb)
(a5 / |serve| :cat verb :tense past)
(a5 / |serve| :cat verb :tense present)
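The under-specification slides suggest that a missing feature is filled with every candidate value, producing one alternative per value. A rough sketch of that expansion follows. The dictionary encoding of an input, the function names, and the decision to add :tense only to verbs are assumptions made for illustration; the real rule format is not shown in the slides.

```python
def fill(spec, feature, values):
    """If feature is missing, return one copy of spec per value; else [spec]."""
    if feature in spec:
        return [spec]
    return [{**spec, feature: value} for value in values]


def expand(spec):
    """Expand an under-specified input into fully specified alternatives.

    Assumption: :cat is filled first, and :tense is only added for verbs.
    """
    alternatives = []
    for with_cat in fill(spec, ":cat", ["noun", "verb"]):
        if with_cat[":cat"] == "verb":
            alternatives.extend(fill(with_cat, ":tense", ["past", "present"]))
        else:
            alternatives.append(with_cat)
    return alternatives


for alternative in expand({"head": "serve"}):
    print(alternative)
```

Each alternative then becomes a separate path for the statistical ranker to choose between, which is why the client does not have to supply the missing features.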
Core features currently recognized by Nitrogen
Syntactic relations
:subject :object :dative :compl :pred :adjunct :anchor :pronoun :op :modal :taxis :aspect :voice :article
Functional relations
:logical-sbj :logical-obj :logical-dat :obliq1 :obliq2 :obliq3 :obliq2-of :obliq3-of :obliq1-of :attr :generalized-possesion :generalized-possesion-inverse
Semantic/Systemic Relations
:agent :patient :domain :domain-of :condition :consequence :reason :compared-to :quant :purpose :exemplifier :spatial-locating :temporal-locating :temporal-locating-of :during :destination :means :manner :role :role-of-agent :source :role-of-patient :inclusive :accompanier :sans :time :name :ord
Dependency relations
:arg1 :arg2 :arg3 :arg1-of :arg2-of :arg3-of
Properties used by Nitrogen
:cat [nn, vv, jj, rb, etc.]
:polarity [+, -]
:number [sing, plural]
:tense [past, present]
:person [1s 2s 3s 1p 2p 3p s p all]
:mood [indicative, pres-part, past-part, infinitive, to-inf, imper]
How many grammar rules needed for English?
Sentence → Constituent+
Constituent → Constituent+ OR Leaf
Leaf → Punc* FunctionWord* ContentWord FunctionWord* Punc*
FunctionWord → "and" OR "or" OR "to" OR "on" OR "is" OR "been" OR "the" OR ...
ContentWord → Inflection(RootWord,Morph)
RootWord → "dog" OR "eat" OR "red" OR ...
Morph → none OR plural OR third-person-singular ...
Computational Complexity
$\frac{x^2}{A^2} + \frac{y^2}{B^2} = 1$ ???
[Figure: plot with axes X and Y]
Advantages of a statistical approach for symbolic generation module
• Shifts focus from “grammatical” to “possible”
• Significantly simplifies knowledge bases
• Broadens coverage
• Potentially improves quality of output
• Dramatically reduces information demands on client
• Greatly increases robustness