Rules and analogy in Russian loanword adaptation and novel verb formation
Vsevolod Kapatsinski
Indiana University, Dept. of Linguistics & Cognitive Science Program, Speech Research Lab
vkapatsi@indiana.edu
LSA 2007
Russian stem extensions
• -i-: event → event+i+ ‘happen’
• -a-: eat → it+a+ ‘eat’
• Source: The Big Dictionary of Youth Slang, 2003
• Borrowed verbs
• New verbs formed from nouns
The questions
• How can we predict the choice of the stem extension?
• Is one extension applied by default?
  • Predicted by the Dual Mechanism Model (Pinker and Prince 1988, 1994)
• Locality effects
  • Analogical vs. schema-based accounts?
  • Do parts of the root adjacent to the root-suffix boundary influence suffix choice more than more distant parts of the root?
  • Do parts of the root that are not adjacent to the root-suffix boundary influence the choice of the suffix?
    • Unexpected under the Rule-Based Learner (Albright and Hayes 2003)
Phonotactics do not explain all the variation
• Can analogy to existing words predict the stem extension taken by a borrowed verb?
• Analogy:
  • The borrowed verb will take the stem extension of the majority of its neighbors.
  • Verbs are neighbors if their roots share at least 2/3 of their phonemes.
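The neighbor definition and majority vote above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: it assumes that "sharing phonemes" means the longest common subsequence of the two roots, measured against the longer root, and the toy lexicon (including which extension each root takes) is hypothetical.

```python
from collections import Counter

def lcs_length(a, b):
    """Length of the longest common subsequence of two phoneme lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def are_neighbors(root_a, root_b):
    """Assumed criterion: shared phonemes / length of the longer root >= 2/3."""
    a, b = root_a.split(), root_b.split()
    return 3 * lcs_length(a, b) >= 2 * max(len(a), len(b))

def predict_extension(target, lexicon):
    """Majority extension among the target's neighbors.

    Returns None when there are no neighbors or the vote is tied,
    i.e. when analogy makes no prediction.
    """
    votes = Counter(ext for root, ext in lexicon.items()
                    if root != target and are_neighbors(target, root))
    ranked = votes.most_common(2)
    if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
        return None
    return ranked[0][0]

# Hypothetical toy lexicon: roots as space-separated phonemes.
# The extensions assigned here are invented for illustration only.
TOY_LEXICON = {
    "k a d": "a", "k a z": "a", "k a p": "a",
    "k a j m": "i", "x a m": "i", "k u m": "i", "k i m": "i",
    "g r o m": "i",
}

print(predict_extension("k a m", TOY_LEXICON))   # neighbors vote 4 to 3
print(predict_extension("b l o g", TOY_LEXICON)) # no neighbors
```

With this toy lexicon, the target root kam has seven neighbors (grom shares only its final m and does not qualify), and a root with no neighbors yields no prediction.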
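Analogical predictions
[Figure: the target root kam surrounded by its neighbors kad, kaz, kap, kar, kak, kaj, kach, kajm, xam, kum, kim, labeled with the stem extensions -a and -i.]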
Similarity effect
[Figure; N=598 and N=1085.]
Final consonant as a predictor
• Target: kam
• Roots ending in m: kajm, xam, kum, grom, tom, wem, shtorm, skorom, kim, dum, xrom
• m → -i: 8/11; m → -a: 3/11
• Not just place of articulation: b → -i (41/54), p → -a (36/57)
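As a quick check on how strong the final-consonant predictor is, the counts reported on this slide can be turned into proportions. The sketch below uses only the three consonants for which counts are given; the function name is hypothetical.

```python
# Counts reported on the slide:
# final consonant -> (majority extension, verbs taking it, verbs with this final C)
REPORTED = {
    "m": ("i", 8, 11),
    "b": ("i", 41, 54),
    "p": ("a", 36, 57),
}

def final_consonant_baseline(final_c):
    """Return (predicted extension, share of verbs it covers) for a final consonant."""
    extension, count, total = REPORTED[final_c]
    return extension, count / total

for consonant in REPORTED:
    extension, share = final_consonant_baseline(consonant)
    print(f"{consonant}: -{extension} ({share:.1%})")
```

This gives roughly 72.7% for m, 75.9% for b, and 63.2% for p, so the majority extension for a final consonant leaves a sizable share of verbs unaccounted for.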
When analogy makes no prediction
• In 8.5% of verbs, analogy makes no prediction:
  • the numbers of neighbors taking each stem extension are equal, OR
  • there are no neighbors
• What determines stem extension choice then?
N=98 (5.5%)
• When there are equal numbers of neighbors rooting for -a and -i, coronals are not associated with either stem extension
• What about verbs that have no neighbors?
Number of neighbors = 0
N=59 (3%)
• When there are no neighbors, coronals are always followed by -i
Interim Summary
• Analogy accounts for 87% of the data, excluding velars
• Analogy performs better than specifying the final consonant
• Analogy predicts -i better than it predicts -a (93% vs. 70%)
• When there are no neighbors, coronals are always followed by -i
An issue for the Dual Mechanism Model
• Pinker and Prince (1988, 1994):
  • One suffix should be more productive than the other with novel lexical items that are not similar to existing ones
    • -i > -a after coronals, so -i is the default
  • This suffix is applied by default. Hence, analogy should be less able to predict when this suffix will occur.
    • Analogy is less able to predict the occurrence of -a, so -a is the default
• Possible accounts:
  • Analogy
  • Associations between parts of the root and suffixes
    • Associations should be stronger when the distance between the suffix and the part of the root is small
Do neighbors that don’t share the final C matter?
• Albright and Hayes (2003):
  • The only segment strings that can be associated with a suffix are uninterrupted segment strings that include the final segment
• Weaker version:
  • Suffixes can be associated with adjacent phonological chunks more strongly than with non-adjacent ones
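Testing the hypothesis of lack of non-local dependencies
[Figure: the target root KAM and its neighbors, with segments shared with the target capitalized: KAd, KAz, KAp, KAr, KAk, KAj, KAch, KAjM, xAM, KuM, KiM; neighbors are labeled with the stem extensions -a and -i.]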
Combining predictors
If we know
• What do most neighbors sharing the final C take?
• What do most words with this final C take?
Do we need to know
• What do most neighbors that do not share the final C take?
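The three questions above correspond to three predictors that can be computed for any target root. The sketch below is illustrative only: it assumes each predictor is coded as the proportion of -i among the relevant words, reuses an assumed longest-common-subsequence neighbor criterion, and runs on a hypothetical toy lexicon with invented extensions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two phoneme lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def are_neighbors(a, b):
    """Assumed criterion: shared phonemes / length of the longer root >= 2/3."""
    return 3 * lcs_length(a, b) >= 2 * max(len(a), len(b))

def share_of_i(extensions):
    """Proportion of -i in a list of extensions, or None if the list is empty."""
    return extensions.count("i") / len(extensions) if extensions else None

def predictors(target, lexicon):
    """Compute the three predictors for a target root (space-separated phonemes)."""
    t = target.split()
    final_c, sharing, not_sharing = [], [], []
    for root, ext in lexicon.items():
        r = root.split()
        if r == t:
            continue
        same_final = r[-1] == t[-1]
        if same_final:
            final_c.append(ext)          # every word with this final C
        if are_neighbors(t, r):
            (sharing if same_final else not_sharing).append(ext)
    return {
        "final_c": share_of_i(final_c),
        "neighbors_sharing_final_c": share_of_i(sharing),
        "neighbors_not_sharing_final_c": share_of_i(not_sharing),
    }

# Hypothetical toy lexicon; the extensions are invented for illustration.
TOY_LEXICON = {
    "k a d": "a", "k a z": "a", "k a p": "a",
    "k a j m": "i", "x a m": "i", "k u m": "i", "k i m": "i",
    "g r o m": "i", "l o m": "a",
}

print(predictors("k a m", TOY_LEXICON))
```

In this toy example the three predictors disagree for kam: words ending in m mostly take -i, the neighbors sharing m all take -i, and the neighbors not sharing m all take -a, which is the kind of case where the third predictor could add information.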
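Final consonant vs. final-sharing neighbors
• Previously, sharing just the final C was not enough to be considered neighbors
[Figure: the target root KAM with its final-sharing neighbors KAjM, xaM, KuM, KiM, and roots sharing only the final consonant: loM, groM, weM, greM, etc.]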
Non-local dependencies still important
• Logistic regression:
  • Final C: χ² = 31.0
  • Neighbors sharing final C: χ² = 329.8
  • Neighbors not sharing final C: χ² = 181.7
• All predictors are significant at p < .0005
• Local dependencies are stronger
• Non-local dependencies do exist
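For readers who want the model written out, one way to state a logistic regression with these three predictors is given below. The coding of the predictors and the symbols are assumptions; the slides do not specify how each predictor was coded or which test produced the χ² values.

```latex
% Assumed notation (not from the slides):
%   x_C  : what most words with this final C take
%   x_S  : what most neighbors sharing the final C take
%   x_N  : what most neighbors not sharing the final C take
\log \frac{P(\text{-i})}{1 - P(\text{-i})}
  = \beta_0 + \beta_C \, x_C + \beta_S \, x_S + \beta_N \, x_N
```

Under this formulation, the claim that non-local dependencies exist amounts to the coefficient for the third predictor differing reliably from zero once the other two are in the model.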
Conclusion
• Huge similarity effects for both stem extensions
  • All productive suffixes are sensitive to similarity
• But, after coronals:
  • -a is less predictable than -i based on analogy
  • -i is more productive than -a when there are no analogical models nearby
• Defining attributes of a DMM default are dissociable (cf. Kapatsinski 2005)
Conclusion
• -a is less predictable than -i based on analogy
• Possible reason:
  • There are more -i verbs than -a verbs in the lexicon
  • Thus, a given neighbor is more likely to bear -i than it is to bear -a
• Possible analogical solution:
  • The occurrence of an -a neighbor is more salient than the occurrence of an -i neighbor
Conclusion
• After coronals:
  • -i is more productive than -a when there are no analogical models nearby
  • -i and -a are equally productive when there are as many neighbors bearing -i as neighbors bearing -a
• Interpretation:
  • Use analogy whenever possible;
  • if both alternatives have equal support, then they are equally acceptable;
  • if there are no analogical models, use phonotactics
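The interpretation above reads as a three-step decision procedure, which can be sketched as follows. The function names are hypothetical, and the coronal inventory is a deliberately small illustrative set rather than a full list of Russian coronals. The slides only report the phonotactic preference after coronals, so the sketch returns no preference elsewhere.

```python
# Illustrative, incomplete set of coronal consonants (an assumption).
CORONALS = {"t", "d", "s", "z", "n", "l", "r"}

def phonotactic_default(final_c):
    """-i after coronals when no analogical model exists; unknown otherwise."""
    return "i" if final_c in CORONALS else None

def choose_extension(n_i, n_a, default=None):
    """Return the set of acceptable stem extensions.

    n_i, n_a: numbers of neighbors bearing -i and -a.
    default:  phonotactic fallback, used only when there are no neighbors.
    """
    if n_i > n_a:
        return {"i"}
    if n_a > n_i:
        return {"a"}
    if n_i > 0:
        # Equal, non-zero support: both alternatives are equally acceptable.
        return {"i", "a"}
    # No analogical models at all: fall back on phonotactics.
    return {default} if default else set()

print(choose_extension(4, 3))                            # analogy decides
print(choose_extension(2, 2))                            # tie
print(choose_extension(0, 0, phonotactic_default("d")))  # no neighbors, coronal
```

Returning a set rather than a single extension keeps the tie case explicit instead of forcing an arbitrary choice.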
Conclusion
• Analogy or schemas?
  • Activate similar words?
  • Activate sublexical chunks associated with suffixes?
• Locality effects support the schematic account (cf. Albright and Hayes 2003):
  • Dependencies between adjacent segments are easier to learn than dependencies between non-adjacent ones (e.g., Hudson Kam and Newport 2005)
• While adjacent dependencies are stronger, non-adjacent dependencies also seem to play a role in suffix choice (contra Albright and Hayes 2003).
Acknowledgements
• N.I.H. for financial support through a training grant to David Pisoni and the Speech Research Lab
• Tessa Bent, Adam Buchwald, Joan Bybee, and Susannah Levi for helpful discussion
References
Albright, A., and B. Hayes. 2003. Rules vs. analogy in English past tenses: A computational/experimental study. Cognition 90, 119-61.
Bybee, J. L. 1985. Morphology: A study of the relation between meaning and form. Benjamins.
Bybee, J. L. 1995. Regular morphology and the lexicon. Language and Cognitive Processes 10, 425-455.
Kapatsinski, V. M. 2005. Characteristics of a rule-based default are dissociable: Evidence against the Dual Mechanism Model. In S. Franks, F. Y. Gladney, and M. Tasseva-Kurtchieva, eds. Formal Approaches to Slavic Linguistics 13: The South Carolina Meeting, 136-46. Michigan Slavic Publications.
Pinker, S., and A. Prince. 1988. On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition 28, 73-193.
Pinker, S., and A. Prince. 1994. Regular and irregular morphology and the psychological status of rules of grammar. In S. D. Lima, R. L. Corrigan, and G. K. Iverson, eds. The reality of linguistic rules, 321-51. Benjamins.
Extracting the dependencies
• For a dependency between a part of the root and a suffix to be formed, many roots must share the same sublexical chunk and the same stem extension
• Is this the case?
• What are the major schemas?
• Are they all local?
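One simple way to look for candidate schemas is to count how many roots share a partially specified pattern together with a stem extension. The sketch below is an assumption-laden illustration, not the method behind the slides: a schema is a root with exactly one segment replaced by a wildcard, and the toy lexicon and its extensions are hypothetical.

```python
from collections import Counter

WILDCARD = "_"

def one_gap_schemas(root):
    """All patterns obtained by replacing exactly one segment with a wildcard."""
    segments = root.split()
    return [" ".join(segments[:i] + [WILDCARD] + segments[i + 1:])
            for i in range(len(segments))]

def schema_counts(lexicon):
    """Count roots per (schema, extension) pair."""
    counts = Counter()
    for root, ext in lexicon.items():
        for schema in one_gap_schemas(root):
            counts[(schema, ext)] += 1
    return counts

def includes_final_segment(schema):
    """True if the segment adjacent to the suffix is specified in the schema."""
    return schema.split()[-1] != WILDCARD

# Hypothetical toy lexicon; the extensions are invented for illustration.
TOY_LEXICON = {
    "k a d": "a", "k a z": "a", "k a p": "a",
    "x a m": "i", "k u m": "i", "k i m": "i",
}

for (schema, ext), n in schema_counts(TOY_LEXICON).most_common(3):
    print(schema, f"-{ext}", n, "final specified:", includes_final_segment(schema))
```

In this toy lexicon the best-supported schema is "k a _" with -a, which leaves the segment next to the suffix unspecified, while "k _ m" with -i does include the final segment.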
Separate networks for -a and -i verbs
[Figure: the target root kam and the roots kad, kaz, kap, kajm, kar, xam, kum, kak, kim, kaj, kach, arranged in separate networks for -i and -a verbs.]
Conclusion
• There are large clusters of verbs in the lexicon in which all verbs are similar to each other in exactly the same way, which could give rise to schema formation.
• Many such schemas would not involve sharing segments that are adjacent to the suffix.