Computational Parsing of the Navajo Verb
by Mans Hulden & S.T. Bischoff
mhulden@email.arizona.edu / bischoff@email.arizona.edu
SoRiiL (Society for Rational Inquiry in Linguistics)
Ling 538, Fall 2006
Tripartite Structure: Templates
Navajo verb tripartite structure (Young & Morgan 1987, 1993; Faltz 1998; among others):
Slots:    outer prefix | plural | object | inner prefix | subject | classifier | stem/mode
Domains:  disjunct | conjunct | stem
Tripartite Structure: Templates
outer prefix   plural   object   inner prefix   subject   classifier   stem/mode
di           + da     + y      + d            + íí      + ł          + jééʼ
(1) dideidííłjééʼ 'They made a fire.' (Faltz 1998:171.27)
NOTE: a phonological rule changes "da + y" to "dei".
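The derivation of example (1) can be sketched in a few lines of Python. This is only an illustration of the slot-filling idea plus the single "da + y" to "dei" rule stated in the note; the function names `apply_da_y_rule` and `surface` are hypothetical, and a real system would use a cascade of finite-state rules rather than list rewriting.

```python
import unicodedata

# Slot values for example (1), copied from the slide, in template order:
# outer prefix, plural, object, inner prefix, subject, classifier, stem/mode.
MORPHEMES = ["di", "da", "y", "d", "íí", "ł", "jééʼ"]


def apply_da_y_rule(morphemes):
    """Rewrite an adjacent 'da' + 'y' pair as the single unit 'dei'."""
    out = []
    i = 0
    while i < len(morphemes):
        is_da_y = (
            morphemes[i] == "da"
            and i + 1 < len(morphemes)
            and morphemes[i + 1] == "y"
        )
        if is_da_y:
            out.append("dei")
            i += 2  # consume both morphemes
        else:
            out.append(morphemes[i])
            i += 1
    return out


def surface(morphemes):
    """Apply the rule, then concatenate the slots into one word (NFC-normalized)."""
    return unicodedata.normalize("NFC", "".join(apply_da_y_rule(morphemes)))


print(surface(MORPHEMES))
```

Under these assumptions the printed word matches the surface form given in (1).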
Stem sets: Lexical Prefixes
Lexical Prefixes:
outer    inner
ha       d
na       j
bik'i    '
Dependencies: Subjects
A. (mode) P(s) : (classifier) 0 / ł
     sg        dpl
1    sé        siid
2    síní      soo
3    s or z
B. (mode) P(y) : (classifier) 0 / ł
     sg        dpl
1    íí        iid
2    ííní      oo
3    íí (z)
conjunct prefix
Dependencies: Subjects
outer prefix   plural   object   inner prefix   subject   classifier   stem/mode
                                                yí      + 0          + cha[P(y)]
(2) yícha 'I cried.' (P(y) mode) (Faltz 1998:91)
outer prefix   plural   object   inner prefix   subject   classifier   stem/mode
                                                sh      + 0          + cha[I]
(3) yishcha 'I am crying.' (I mode) (Faltz 1998:55)
Dependencies: Subjects
outer prefix   plural   object   inner prefix   subj   clsfr   stem/mode
ha           + da     + 0      + j              íí   + 0     + geed[P(y)]
(4) hadajíígeed 'Those guys dug it out.'
outer prefix   plural   object   inner prefix   subj   clsfr   stem/mode
ha           + da     + 0      + j              z    + 0     + geed[P(y)]
(5) hadajizgeed 'Those guys dug them out.'
Finite-state abstraction (similar to chapters 3-4 in J&M)
English:  happy+ADJ+SUPER  <->  Finite-state transducer  <->  happiest
Navajo:   [0+0|0+0+sh+0^cha]+I+S1  <->  Finite-state transducer  <->  yishcha
Computational implementation
• Breakup into three components:
Underlying forms:     [0+0|0+0+sh+0^cha]+I+S1
    ↓ Morphotactics (I-mode S1 subj = sh ...)
Intermediate forms:   [0+0|0+0+sh+0^cha]
    ↓ Phonological rules cascade (0 -> yi / ...)
Surface forms:        yishcha
Long-distance dependencies with FSA
A simple example: Prefix … Suffix, where Prefix n must match Suffix n
Problem: Rules/middle part is duplicated in grammar
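The duplication problem can be made concrete with a toy language. The sketch below counts the states of the minimal deterministic acceptor of a finite language by counting its distinct non-empty residual languages (the Myhill-Nerode construction). The prefixes, middles and suffixes are invented placeholders, not Navajo morphemes, and the helper name `minimal_dfa_states` is hypothetical.

```python
def minimal_dfa_states(language):
    """Count states of the minimal DFA (dead state excluded) of a finite language.

    Each state corresponds to one distinct non-empty residual language,
    i.e. the set of continuations that are still possible after some prefix.
    """
    residuals = set()
    for word in language:
        for i in range(len(word) + 1):
            prefix = word[:i]
            residuals.add(
                frozenset(w[i:] for w in language if w.startswith(prefix))
            )
    return len(residuals)


# Toy inventory (placeholders): three prefixes, a shared middle, three suffixes.
PREFIXES = ["a", "b", "c"]
MIDDLES = ["mn", "mo"]
SUFFIXES = ["x", "y", "z"]

# No dependency: any prefix may combine with any suffix.
free = {p + m + s for p in PREFIXES for m in MIDDLES for s in SUFFIXES}

# Long-distance dependency: prefix n must be followed by suffix n.
matched = {p + m + s for (p, s) in zip(PREFIXES, SUFFIXES) for m in MIDDLES}

print(minimal_dfa_states(free))     # 5
print(minimal_dfa_states(matched))  # 11
```

In the matched language the acceptor has to remember which prefix it saw while it reads the middle, so the middle states exist once per prefix. With three prefixes the toy acceptor grows from 5 to 11 states, and the growth scales with the number of prefix/suffix pairs.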
Potential solution
Need formalism that meets three criteria:
• Can be compiled into FSM representation
• Can express long-distance relationships in a compact manner, so grammar remains readable
• Can be efficiently applied at runtime in case complete FSM compilation turns out to be infeasible once grammar expands
Dependencies
outer prefix   plural   object   inner prefix   subj   clsfr   stem/mode
ha           + da     + 0      + j              z    + 0     + geed[P(y)]
Need concise formalism for describing the dependencies
Example: j in object slot is permitted iff subj=3rd person (z)
Morphotactics: Extended unification scheme
Basic unification with OP [FEAT VALUE]:
ha      ⊔[MODE yP]
geed    ⊔[MODE yP]
Morphotactics
Basic unification with OP [FEAT VALUE]:
j    ⊔[SUBJ 4]
z    ⊔[SUBJ 4]
*Allows for j without z
Morphotactics
Add "+" and "-" operators:
j    +[SUBJ 4]
z    ⊔[SUBJ 4]
+ requires [SUBJ 4] elsewhere
Morphotactics
Add logical connectives: "⋁" and "⋀"
j    +[SUBJ 4]
z    ⊔[SUBJ 3] ⋁ +[SUBJ 4]
Morphotactics
Extended unification scheme to avoid complex regular expressions:
• Operators: {⊔, +, -} (unification, coercion, exclusion)
• Logical connectives: {⋁, ⋀}
• Morphemes carry OP [FEAT VALUE] combinations
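One possible reading of the three operators can be sketched as a well-formedness check over a sequence of annotated morphemes. The slides do not spell out the exact semantics, so everything here is an assumption: disjunction is modelled as a set of admissible values, "+" is taken to be satisfied when some other morpheme carries a ⊔ annotation admitting the value, and "-" is taken to fail in that same situation. The names `Constraint`, `U`, `REQ`, `EXC` and `well_formed` are hypothetical, as is the annotation given to `sh`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    op: str            # "unify" (⊔), "require" (+) or "exclude" (-)
    feat: str
    values: frozenset  # disjunction of admissible values


def U(feat, *values):
    return Constraint("unify", feat, frozenset(values))


def REQ(feat, value):
    return Constraint("require", feat, frozenset([value]))


def EXC(feat, value):
    return Constraint("exclude", feat, frozenset([value]))


def well_formed(word):
    """word is a list of (form, [constraints]) pairs in template order."""
    # 1. Unification: all ⊔ annotations on a feature must share a value.
    unified = {}
    for _, constraints in word:
        for c in constraints:
            if c.op == "unify":
                unified[c.feat] = unified.get(c.feat, c.values) & c.values
    if any(not values for values in unified.values()):
        return False
    # 2. Coercion / exclusion: look at ⊔ annotations on *other* morphemes.
    for i, (_, constraints) in enumerate(word):
        elsewhere = [
            o
            for j, (_, others) in enumerate(word)
            if j != i
            for o in others
            if o.op == "unify"
        ]
        for c in constraints:
            hits = [o for o in elsewhere if o.feat == c.feat and c.values & o.values]
            if c.op == "require" and not hits:
                return False
            if c.op == "exclude" and hits:
                return False
    return True


# Annotations for ha, geed, j and z follow the slides; sh is an assumed example.
ha = ("ha", [U("MODE", "yP")])
geed = ("geed", [U("MODE", "yP")])
j = ("j", [REQ("SUBJ", 4)])
z = ("z", [U("SUBJ", 3, 4)])
sh = ("sh", [U("SUBJ", 1)])  # assumption: first-person subject

print(well_formed([ha, j, z, geed]))   # True: z licenses j
print(well_formed([ha, j, geed]))      # False: nothing supplies [SUBJ 4]
print(well_formed([ha, j, sh, geed]))  # False: sh cannot be [SUBJ 4]
print(well_formed([ha, z, geed]))      # True: z may occur without j
```

The point of the sketch is that the dependency between j and z is stated once, on the morphemes themselves, instead of being encoded by duplicating paths in a regular expression.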
Morpheme order
A verb V is an ordered concatenation of morphemes M1,...,Mn
We draw morphemes out of bins, one from each:
Simple concatenation machine V (bins of annotated morphemes):
0 …
z       +[DPL] ⊔[SUBJ 3 ⋁ 4] …
geed    ⊔[MODE yP] …
ha      ⊔[MODE yP] …
da      ⊔[DPL] ⊔[DSJNCT 1] …
j       +[SUBJ 4] …
Fragment of Unification machine U: ⊔[MODE X]  ¬⊔[MODE ¬X]  ¬⊔[MODE ]
(V ∩ U) = Complete morphotactics
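The intersection V ∩ U can be imitated on a very small scale by enumerating V and filtering it with U. This is a sketch only: it filters an explicit set of strings instead of intersecting automata, it uses two bins rather than the full template, it implements only the MODE fragment of U shown above, and the MODE annotations on the two `cha` stems are assumptions based on the tags cha[I] and cha[P(y)]. The function names are hypothetical.

```python
from itertools import product

# Each bin holds (form, mode) pairs; mode None means no ⊔[MODE ...] annotation.
BINS = [
    [("0", None), ("ha", "yP")],                    # outer prefix bin
    [("geed", "yP"), ("cha", "I"), ("cha", "yP")],  # stem bin (cha modes assumed)
]


def machine_v(bins):
    """Concatenation machine V: draw exactly one morpheme from each bin."""
    return set(product(*bins))


def machine_u(word):
    """MODE fragment of U: reject words whose morphemes disagree on MODE."""
    modes = {mode for _, mode in word if mode is not None}
    return len(modes) <= 1


def morphotactics(bins):
    """V ∩ U, computed by filtering the enumerated language of V."""
    return {word for word in machine_v(bins) if machine_u(word)}


print(len(machine_v(BINS)))      # 6 candidate sequences
print(len(morphotactics(BINS)))  # 5 survive; ha + cha[I] is rejected
```

Keeping V and U separate means the bins stay a plain list of morphemes per slot, while all co-occurrence restrictions live in the filter.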
Complexity
• Main grammar without stem-lexicon: ~70000 states as FST
• Handles ~8 billion word-forms
• Some remaining ambiguity/overgeneration, e.g. four analyses for hadajizgeed:
[ha+da|0j+0+z+0^geed]+Py+S4+DISTPL+O3
[ha+da|0j+0+z+0^geed]+Ps+S4+DISTPL+O3
[ha+da|j+0+z+0^geed]+Ps+S4+DISTPL
[ha+da|j+0+z+0^geed]+Py+S4+DISTPL
Thank you! Computational Parsing of the Navajo Verb by Mans Hulden & S.T. Bischoff University of Arizona mhulden@email.arizona.edu/bischoff@email.arizona.edu