
Computational Morphology and its Implications for Theoretical Morphology

Computational Morphology and its Implications for Theoretical Morphology. Richard Sproat University of Illinois at Urbana-Champaign PASCAL MorphoChallenge Venice April 12, 2006. “Item-and-arrangement” versus “Item and process”. Charles Hockett (1954) “Two models of grammatical description”:


Presentation Transcript


  1. Computational Morphology and its Implications for Theoretical Morphology Richard Sproat University of Illinois at Urbana-Champaign PASCAL MorphoChallenge Venice April 12, 2006

  2. “Item-and-arrangement” versus “Item-and-process”
     • Charles Hockett (1954), “Two models of grammatical description”:
       • Item-and-arrangement: words are composed of morphemes that are put together by a kind of “word syntax”
       • Item-and-process: words are built up via the application of rules that add phonological and morphosyntactic information
     Computational Morphology/Theoretical Morphology

  3. Stump’s classification
     • Affix is a lexical entry that introduces morphosyntactic features: hoot + s[3sg] (hoots = 3sg because of -s)
     • Affix introduced because of morphosyntactic features: Ø → s / hoot[3sg] (-s is introduced due to 3sg)

  4. Computational morphology
     • Nearly all morphological operations can be expressed in terms of regular relations.
       • Only possible exception is reduplication
     • Regular relations are relations over pairs of strings that can be constructed solely by the operations of:
       • Concatenation: if R, S are regular relations then so is R · S
       • Union: if R, S are regular relations then so is R ∪ S
       • Kleene closure: if R is a regular relation then so is R* (0 or more instances of R concatenated with itself)
     • Regular relations are closed under composition: if R, S are regular relations, then so is R ∘ S
     • Implemented with finite-state transducers
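
The four operations on this slide can be tried out on toy data. The following is a minimal sketch, assuming that a relation is represented as a finite Python set of (input, output) string pairs; real regular relations are generally infinite, so the Kleene closure here is cut off after a fixed number of repetitions. The function names are illustrative and do not come from any finite-state toolkit.

```python
from itertools import product

def concat(r, s):
    """Concatenation: join inputs with inputs and outputs with outputs."""
    return {(x1 + x2, y1 + y2) for (x1, y1), (x2, y2) in product(r, s)}

def union(r, s):
    """Union: any pair that is in r or in s."""
    return r | s

def closure(r, max_reps=2):
    """Kleene closure, truncated at max_reps repetitions (the true R* is infinite)."""
    result = {("", "")}
    power = {("", "")}
    for _ in range(max_reps):
        power = concat(power, r)
        result |= power
    return result

def compose(r, s):
    """Composition: feed the output side of r into the input side of s."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

R = {("a", "b")}
S = {("b", "c")}
print(sorted(concat(R, S)))   # [('ab', 'bc')]
print(sorted(union(R, S)))    # [('a', 'b'), ('b', 'c')]
print(sorted(closure(R)))     # [('', ''), ('a', 'b'), ('aa', 'bb')]
print(sorted(compose(R, S)))  # [('a', 'c')]
```

A finite-state toolkit would represent the same relations as transducers, so that infinite relations and exact closure become possible.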

  5. Transducers and composition (Johnson, 1972; Koskenniemi, 1983; Kaplan & Kay, 1994; Mohri & Sproat, 1996)
     • Consider 3-letter alphabet {a,b,c}
     • Given a rule a → b, the equivalent transducer is:
       [Figure: transducer mapping abbca to bbbcb]

  6. Another rule: b → c / _ b

  7. The two rules composed: a → b, then b → c / _ b
     [Figure: composed transducer mapping abbca to ccbcb]
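
The effect of the two rules and of their composition can be checked on the slide's own example string. This is a minimal sketch, assuming each rule is modeled as a Python string function and composition as function chaining; it reproduces the input/output behavior only and does not build a transducer. The contextual rule is assumed to apply simultaneously, which is what the regular-expression lookahead gives, since it inspects the input string rather than the partly rewritten one.

```python
import re

def rule_a_to_b(s):
    """a -> b, everywhere."""
    return s.replace("a", "b")

def rule_b_to_c(s):
    """b -> c / _ b, applied simultaneously across the string."""
    return re.sub(r"b(?=b)", "c", s)

def compose(*rules):
    """Chain rules left to right, standing in for transducer composition."""
    def composed(s):
        for rule in rules:
            s = rule(s)
        return s
    return composed

both = compose(rule_a_to_b, rule_b_to_c)
print(rule_a_to_b("abbca"))  # bbbcb
print(both("abbca"))         # ccbcb
```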

  8. Composition and morphology
     • Composition is the most general computational mechanism that handles morphological operations (Roark and Sproat, 2006)
     • Affixation (which is more typically handled using concatenation) can also be handled using composition
     • Composition and the other closure properties of regular relations imply that there is no fundamental difference between morphological theories.

  9. Affixation as composition
     [Figure: transducer that accepts any string over the alphabet and inserts b]
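
One way to see affixation as composition is to treat the affix as a relation of its own and compose the base with it. The sketch below is an illustration under stated assumptions: relations are finite sets of string pairs, "any string over the alphabet" is cut off at length 3, and b is assumed to be inserted at the end of the string (the slide does not say where).

```python
from itertools import product

ALPHABET = "abc"

def sigma_star(max_len):
    """All strings over ALPHABET up to max_len (a finite stand-in for Sigma*)."""
    return {"".join(p) for n in range(max_len + 1)
            for p in product(ALPHABET, repeat=n)}

def identity(strings):
    """The identity relation on a set of strings."""
    return {(s, s) for s in strings}

def compose(r, s):
    """Feed the output side of r into the input side of s."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

# The affix as a relation: any string w is mapped to w + "b".
insert_b = {(w, w + "b") for w in sigma_star(3)}

base = identity({"ac"})
print(compose(base, insert_b))  # {('ac', 'acb')}
```

Because the affix is a relation over the whole base, it can be narrowed to bases of a particular shape or made to rewrite parts of the base, which plain concatenation cannot express.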

  10. Is this Rube-Goldbergesque?
      • No! Because many affixes either impose requirements on their base or modify their base.
      • Cf. Yowlumne (aka Yawelmani) (Archangeli, 1984)

  11. Yowlumne gerundial -inay
      • -inay requires the template CVC(C)
      • Composing the base with k1 will modify the base and add [+GER]

  12. CVC(C)

  13. Some morphological operations
      • Subsegmental morphology
      • Truncation
      • Infixation
      • Root-and-pattern morphology
      • Reduplication
      • Morphomic requirements (Aronoff, 1994)
      • All of these can be handled using composition

  14. German diminutives

  15. Koasati truncation (Lombardi & McCarthy, 1991)

  16. Two kinds of infixation
      • Extrametrical infixation
        • E.g. Bontoc
      • Positively circumscribed infixation
        • E.g. Ulwa

  17. Bontoc infixation (Seidenadel, 1907)
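
The data for this slide did not survive in the transcript. As a hedged illustration of extrametrical infixation, the sketch below uses the commonly cited Bontoc pair fikas ~ fumikas and assumes that -um- is placed after a single stem-initial consonant, with the vowels taken to be a, e, i, o, u. The function name is hypothetical and a string rewrite stands in for the transducer.

```python
import re

def infix_um(stem):
    """Insert -um- after the stem-initial consonant (prefixed if the stem is vowel-initial)."""
    return re.sub(r"^([^aeiou]?)", r"\1um", stem, count=1)

print(infix_um("fikas"))  # fumikas
```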

  18. Ulwa infixation (CODIUL, 1989)

  19. Root & pattern morphology (McCarthy 1979): k t b

  20. Root & pattern morphology

  21. Root & pattern morphology: related approaches
      • Beesley & Karttunen (2000) propose an approach using compile-replace plus merge
      • Kiraz (2000) proposes a multitape solution
      • But all of these are equivalent to composition
      [Figure: d u u r i s aligned with d V V r V S]
      Surface form is a regular expression
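
The interdigitation of a consonantal root with a template can be illustrated directly on the forms that appear in these slides (root k t b, surface duuris). This is a minimal sketch, not the compile-replace, multitape, or composition analysis itself; it assumes a template written with C and V slots and a vocalism that supplies one vowel per nucleus, so that a VV sequence is filled by a single long vowel. The function name and the template CVCVC for k t b are assumptions made for the example.

```python
def interdigitate(root, template, vocalism):
    """Fill C slots with root consonants and each run of V slots with one vowel."""
    consonants = iter(root)
    vowels = iter(vocalism)
    out = []
    current_vowel = None
    for slot in template:
        if slot == "C":
            out.append(next(consonants))
            current_vowel = None          # leaving a vowel nucleus
        elif slot == "V":
            if current_vowel is None:     # entering a new nucleus
                current_vowel = next(vowels)
            out.append(current_vowel)
        else:
            raise ValueError(f"unknown slot {slot!r}")
    return "".join(out)

print(interdigitate("ktb", "CVCVC", "aa"))   # katab
print(interdigitate("drs", "CVVCVC", "ui"))  # duuris
```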

  22. Reduplication: Gothic (Wright 1910)
      • Prefix a syllable of the form (A)Cai to the stem, where C is a consonant position and A is an optional appendix
      • Copy the onset of the stem to the C position
      • If there is a pre-onset appendix /s/, copy this to the appendix position
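
The three steps on the Gothic slide can be sketched as a small string function. Several points are assumptions made for the example: /s/ counts as a pre-onset appendix only before p, t, or k; stems are given in plain ASCII without vowel length; and other changes in the attested forms (ablaut, final devoicing) are not modeled. The function names are illustrative.

```python
VOWELS = set("aeiou")
STOPS = set("ptk")

def gothic_prefix(stem):
    """Build the (A)Cai reduplicant for a stem."""
    appendix, onset = "", ""
    if stem[:1] == "s" and stem[1:2] in STOPS:
        appendix, onset = "s", stem[1]      # pre-onset appendix /s/ plus onset
    elif stem[:1] and stem[0] not in VOWELS:
        onset = stem[0]                     # plain onset consonant
    return appendix + onset + "ai"

def reduplicate(stem):
    """Prefix the reduplicant to the stem."""
    return gothic_prefix(stem) + stem

print(reduplicate("hald"))   # haihald
print(reduplicate("slep"))   # saislep
print(reduplicate("staut"))  # staistaut
```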

  23. Bambara reduplication (Culy, 1985)
      This is apparently beyond the power of finite-state methods.

  24. Factoring reduplication
      • Prosodic constraints
      • Copy verification transducer C

  25. Gothic index transducer

  26. Factoring reduplication
      • Then reduplication in Gothic can be modeled as: α ∘ C
      • More generally, one can model reduplication as the following composition, where P implements the prosodic constraints, C the copy constraints, and A optional phonological adjustments: P ∘ C ∘ A
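
The three-way factoring into P, C, and A can be imitated with a generate-and-filter pipeline. This is only a loose analogue and not the index-transducer construction from the slides: P is assumed to overgenerate a Cai prefix with any consonant, C keeps the candidates whose prefix consonant matches the stem onset, and A is the identity. The consonant inventory and the restriction to consonant-initial stems without an appendix are assumptions.

```python
CONSONANTS = "hlstdkp"

def P(stem):
    """Prosodic constraints: prefix Cai with any consonant in the C slot (overgenerates)."""
    return {(c + "ai", stem) for c in CONSONANTS}

def C(candidates):
    """Copy verification: keep candidates whose C slot matches the stem onset."""
    return {(prefix, stem) for prefix, stem in candidates if prefix[0] == stem[0]}

def A(candidates):
    """Optional phonological adjustments: none in this sketch, just join prefix and stem."""
    return {prefix + stem for prefix, stem in candidates}

def reduplicate(stem):
    """Apply P, then C, then A."""
    return A(C(P(stem)))

print(reduplicate("hald"))  # {'haihald'}
```

Separating the copy check from the prosodic shape is what leaves room for non-exact copies: a looser C, or a non-trivial A, changes the outcome without touching P.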

  27. Other approaches
      • Walther (2000a, 2000b) proposes a special kind of transducer involving
        • Repeat arcs: move backwards in a string and repeat
        • Skip arcs: skip over portions of the string
      • Cohen-Sygal & Wintner (forthcoming) introduce finite-state registered automata, extending FSAs with registers
      • These methods generally seem to presume exact copies

  28. Non-exact copies
      • Dakota (Inkelas & Zoll, 1999):

  29. Non-exact copies
      • Basic and modified stems in Sye (Inkelas & Zoll, 1999): “they will fall all over”

  30. Morphological Doubling Theory (Inkelas & Zoll, 1999)
      • In contradistinction to the more common “correspondence” theory:
        • Reduplication involves doubling at the morphosyntactic level
        • Phonological doubling is thus expected, but not required

  31. Gothic reduplication under Morphological Doubling Theory

  32. More
      • Composition also elegantly accounts for other phenomena such as prosodic circumscription (McCarthy and Prince, 1990) or morphomic requirements (Aronoff, 1994).
      • Composition of regular relations can model rules
      • It can also model affixation
      • It doesn’t matter if you describe affixation as lexical-incremental or inferential-realizational

  33. Morphomic requirements (Aronoff, 1994): Latin 3rd stem

  34. So?
      • 3rd stem is not morphologically uniform:
        • It differs across different verb classes and some verbs have idiosyncratic third stems
      • It is not semantically coherent:
        • Forms that require the 3rd stem are a motley crew
      • Yet there is clearly a notion of 3rd stem:
        • If you tell me the 3rd stem of a verb, I can tell you how the agentive noun, the supine, the perfect participle … are formed
      • 3rd stem has a purely morphological function

  35. 3rd stem is just prosodically induced affixation
      • Assume we have a transducer T that forms the 3rd stem of a verb:
        • of course, T will have to allow for a lot of idiosyncratic changes
      Σ* >3st:ε Σ*
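
The division of labor between T and the affixes that select the 3rd stem can be sketched as follows. The verbs and endings are standard Latin examples chosen for illustration and are not taken from the slides; T is assumed to be a plain lookup table, which stands in for a transducer with many idiosyncratic cases, and only citation endings are shown.

```python
# Illustrative third stems (a lookup table standing in for the transducer T).
THIRD_STEM = {
    "amāre": "amāt",
    "monēre": "monit",
    "capere": "capt",
    "audīre": "audīt",
}

# Affixes that select the third stem (citation endings only).
AFFIXES = {
    "agentive noun": "or",
    "supine": "um",
    "perfect participle": "us",
}

def T(verb):
    """Form the 3rd stem of a verb (here: look it up)."""
    return THIRD_STEM[verb]

def derive(verb, category):
    """Attach a 3rd-stem-selecting affix to the output of T."""
    return T(verb) + AFFIXES[category]

print(derive("amāre", "agentive noun"))        # amātor
print(derive("capere", "perfect participle"))  # captus
print(derive("monēre", "supine"))              # monitum
```

Once T is fixed, each affix needs to know nothing about verb classes, which is the sense in which the 3rd stem has a purely morphological function.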

  36. Summary so far
      • Most or all morphological operations can be handled with composition
      • We wish to show next that this fact, along with general properties of regular languages and relations, allows us to dispense with distinctions between morphological theories.

  37. Return to Stump (2001)
      • In (Roark & Sproat, 2006) we reanalyze Stump’s analyses of:
        • Sanskrit nominal declensions
        • Swahili verbal declensions
        • Breton double plurals
      • All of which purport to show the need for a realizational-inferential account.
      • Here we will consider:
        • A simple example from Beard & Volpe’s analysis of English agentive nominals
        • A quick overview of the Sanskrit case.

  38. English Agentive Nominals (cf. Beard & Volpe, 2005)
      • read-er, stand-ee, correspond-ent, record-ist, cook
      • ε → ent / [+ent][+noun,+agentive] Σ* __ $
      • Call the set of all agentive rules R
      • We can define a new ‘metarule’ R′ that is the union of all rules in R:

  39. Feature [+noun,+agentive]
      • Presumably this is also introduced by rule: call this rule M
      • Then given a base B, the base with that feature specification added is given by B ∘ M
      • Then the appropriate suffixed form is given by [B ∘ M] ∘ R′
      • But this can be written, by associativity, as B ∘ [M ∘ R′]
      • Finally, [M ∘ R′] can be precomposed; call this R′′

  40. So what?
      • R′′:
        • Introduces the morphosyntactic feature [+noun,+agentive]
        • Introduces the affixal morphology as appropriate to the base
      • In short, R′′ encodes a lexical-incremental model of morphology.
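
The argument about M, R′, and R′′ can be traced on the five example nouns. This is a minimal sketch, assuming a base is a (form, tags) pair and that rules are Python functions composed by chaining. Only the diacritic [+ent] appears in the slides; the other diacritics (+er, +ee, +ist, +zero) are hypothetical names introduced for the example.

```python
def M(base):
    """Rule M: add the morphosyntactic feature [+noun,+agentive] to the base."""
    form, tags = base
    return form, tags | {"+noun", "+agentive"}

SUFFIX_BY_DIACRITIC = {"+er": "er", "+ee": "ee", "+ent": "ent", "+ist": "ist", "+zero": ""}

def R_prime(base):
    """Union of the agentive rules: the suffix is chosen by the base's diacritic,
    and applies only if [+noun,+agentive] is present."""
    form, tags = base
    if not {"+noun", "+agentive"} <= tags:
        return base
    for diacritic, suffix in SUFFIX_BY_DIACRITIC.items():
        if diacritic in tags:
            return form + suffix, tags
    return base

def compose(f, g):
    """Apply f, then g."""
    return lambda x: g(f(x))

R_double_prime = compose(M, R_prime)   # the precomposed [M ∘ R′]

LEXICON = {"read": "+er", "stand": "+ee", "correspond": "+ent",
           "record": "+ist", "cook": "+zero"}
for form, diacritic in LEXICON.items():
    base = (form, frozenset({diacritic}))
    # [B ∘ M] ∘ R′ gives the same result as B ∘ [M ∘ R′]
    assert R_prime(M(base)) == R_double_prime(base)
    print(R_double_prime(base)[0])
```

The loop prints reader, standee, correspondent, recordist, and cook. The single function R_double_prime both adds the feature and spells out the affix, which is the lexical-incremental reading of what started as an inferential-realizational rule set.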

  41. Sanskrit declensions

  42. Sanskrit declensions

  43. Issues with Sanskrit
      • Nouns have two or three stems: strong, middle and (optionally) weakest
      • A different series of stem alternations cross-cuts this: guna, vrddhi, and zero:
        • “foot”: pād-, pad-, pd-
        • strong stems may be guna or vrddhi
        • middle stems may be zero, or a lexeme-specific stem
        • weakest stems may be zero or a lexeme-specific stem

  44. Sanskrit declensions [table not reproduced; labels: zero, guna]

  45. Sanskrit declensions [table not reproduced; labels: vrddhi, lexeme-class particular]

  46. Further issues
      • Stump argues for the Indexing Autonomy Hypothesis:
        • A stem’s index is independent of the form used for the stem
        • Sanskrit nominal declensions are morphomic in Aronoff’s sense
      • Also involved are rules of referral, whereby a particular form is systematically used to represent more than one slot in the paradigm.
        • For example, in Latin the ablative and dative plural in nominal paradigms are identical no matter what form is used for the particular paradigm
      • So we have several layers of complexity here, which would seem to make an “item-and-arrangement” approach impossible

  47. Computational analysis

  48. Refactoring
      But this is just an item-and-arrangement analysis

  49. Summary
      • Theoretical distinctions between different approaches to morphology seem to come down to the issue of how cleanly one can describe a given phenomenon.
      • But it is not clear that they relate to important differences in underlying mechanisms.

  50. Why morphological theory?
      • Morphology has tended to develop highly articulated theories that are (often) intended to represent the morphological component of some putative ‘language faculty’.
      • Need a set of mechanisms to account for complex morphological systems, e.g. Sanskrit.
      • Need to account for observed universals
        • These might relate to built-in predispositions, but equally well might relate to historical change; cf. Blevins (2004)
      • Linguistic phenomena are complex: how can children learn them?
        • Clearly relates to learning mechanisms
