
A Lexical Theory of Variation

Andries W. Coetzee. Workshop on Variation, Gradience and Frequency in Phonology, Stanford University, July 2007.

Presentation Transcript


  1. A Lexical Theory of Variation
     Andries W. Coetzee
     Workshop on Variation, Gradience and Frequency in Phonology
     Stanford University, July 2007

  2. Things that are known to influence variation
     • Grammar
       • Where: where variation appears and where it does not
       • Frequency: how often a process applies in a given context
     • Lexical frequency
       • Some variable processes affect frequent words more; others affect infrequent words more.
     • Extra-grammatical factors
       • Speech style, speech rate, etc.

  3. Existing theories of variation
     • Grammatical
       • Variable rules in the Labovian tradition (Labov 1972; Sankoff 1988)
       • Several OT models (Anttila 1997; Boersma and Hayes 2000; Coetzee 2006; Reynolds 1994)
       • Reasonably successful at accounting for the grammatical influence.
     • Usage-based/exemplar models (Bybee 2001, 2002; Pierrehumbert 2001)
       • Reasonably successful at accounting for the influence of lexical frequency.
     • Interaction between the two
       • Models that incorporate both are still largely absent.

  4. Structure of the presentation
     • Usage frequency and variation
     • The basics of the proposal
     • Phonetically motivated variation
     • Analogically motivated variation
     • Learning lexical distributions

  5. Usage Frequency and Variation

  6. Phonetically motivated variable process
     • Typical phonological process
     • Applies more often to lexical items with higher usage frequency
     • Example: t/d-deletion
       • Pre-C: west bank ~ wes bank
       • Pre-V: west end ~ wes end
       • Pre-##: west ~ wes
     [Data: Chicano English (Santa Ana 1991)]
     [Data: Influence of frequency (Bybee 2000:70)]

  7. Analogically motivated variable process
     • Usually some kind of regularization process: an irregular plural or past tense is replaced with the regular form
     • Applies more often to lexical items with lower usage frequency
     • Example: regularization of past tense verbs
       • Infrequent verbs are more likely to regularize (Hooper 1976:100; Bybee 1985:120, 2002:269; Bybee & Slobin 1982)
       • Kučera and Francis frequencies (1982) as calculated at www.iphod.com.
       • Also many examples from the historical literature (Phillips 1984, 2001 and references therein).

  8. The challenge
     • A formal theory of variation that:
       • Captures the role of grammar
         • Determines what kind of variation is possible
         • Influences the frequency of application
       • Captures the role of lexical frequency
         • A variable process applies differently to different lexical items.
         • Different kinds of processes are differently influenced by lexical frequency.

  9. The Proposal: Variation Through Lexical Indexation

  10. Variable lexical indexation
      • Lexically indexed constraints (Pater 1994, 2000; Itô & Mester 1995, 1999)
        • Allow a way in for lexical influence
        • Yet still keep control in the hands of the grammar
      • Variation through variable lexical class affiliation
        • Note that the grammar stays constant: what varies is the lexical class affiliation of lexical items. Variation is hence moved from the grammar into the lexicon.

  11. Lexical distribution functions
      • What determines the lexical class affiliation of a lexical item?
        • Each lexical item is stored with a probability density function.
        • Every time a lexical item is submitted to the grammar for evaluation, a value is chosen randomly along the x-axis of the distribution function.
        • The x-axis is divided into equally sized adjacent regions, one for each indexed version of the constraint.
      • Correlation between frequency and skewness of the distribution function:
        • Frequent lexical items = left-skewed function
        • Infrequent lexical items = right-skewed function
      [Figure: curves labelled average, low, high; x-axis regions L2, L1]
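
The sampling step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: it assumes the density is a Beta distribution on [0, 1] (the family named later in the talk), that the regions are numbered from right to left so that the rightmost region is L1 (as the figure labels suggest), and the function names and shape values are hypothetical.

```python
import random

def sample_lexical_class(alpha, beta, n_classes, rng=random):
    """Draw x from Beta(alpha, beta) and map it to one of n_classes regions.

    Assumption: the x-axis [0, 1] is cut into n_classes equal regions and
    numbered right to left, so the leftmost region is L<n_classes> and the
    rightmost region is L1.
    """
    x = rng.betavariate(alpha, beta)
    region = min(int(x * n_classes), n_classes - 1)  # 0 = leftmost region
    return n_classes - region

def class_counts(alpha, beta, n_classes=4, trials=10_000, seed=0):
    """Count how often each lexical class is chosen over many evaluations."""
    rng = random.Random(seed)
    counts = {k: 0 for k in range(1, n_classes + 1)}
    for _ in range(trials):
        counts[sample_lexical_class(alpha, beta, n_classes, rng)] += 1
    return counts

# Illustrative shape values only; they are not taken from the talk.
frequent = class_counts(alpha=5.0, beta=2.0)    # left skewed: mass on the right
infrequent = class_counts(alpha=2.0, beta=5.0)  # right skewed: mass on the left
```

Under these assumptions a frequent item lands mostly in the low-numbered classes and an infrequent item mostly in the high-numbered ones, while the grammar itself never changes.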

  12. Example 1: Phonetically Motivated Variation

  13. t/d-deletion again
      [Tables: Context; Frequency]
      • Grammar
        • Markedness constraints
          • *PRE-C: No t/d in the context C_#C
          • *PRE-V: No t/d in the context C_#V
          • *PRE-##: No t/d in the context C_##
          • Contextual licensing constraints à la Steriade (1997)
        • Four indexed versions of MAX.
        • Ranking: MAX-L4 >> *PRE-C >> MAX-L3 >> *PRE-V >> MAX-L2 >> *PRE-## >> MAX-L1

  14. The grammar in Pre-C condition
      Preservation if MAX-L4; deletion if MAX-L3, MAX-L2, MAX-L1

  15. The grammar in Pre-V condition
      Preservation if MAX-L4, MAX-L3; deletion if MAX-L2, MAX-L1

  16. The grammar in Pre-Pause condition
      Preservation if MAX-L4, MAX-L3, MAX-L2; deletion if MAX-L1
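
The three conditions above follow mechanically from the ranking MAX-L4 >> *PRE-C >> MAX-L3 >> *PRE-V >> MAX-L2 >> *PRE-## >> MAX-L1. A minimal sketch of that evaluation, assuming only two candidates (faithful and deleted) and only the constraints on the slides; the function and dictionary names are hypothetical:

```python
# Constraint ranking from highest to lowest, as given for t/d-deletion.
RANKING = ["MAX-L4", "*PRE-C", "MAX-L3", "*PRE-V", "MAX-L2", "*PRE-##", "MAX-L1"]

# Markedness constraint that is relevant in each phonological context.
MARKEDNESS = {"pre-C": "*PRE-C", "pre-V": "*PRE-V", "pre-pause": "*PRE-##"}

def evaluate(context, lexical_class):
    """Return the winning outcome for a word indexed to MAX-L<lexical_class>.

    The faithful candidate violates the contextual markedness constraint and
    the deletion candidate violates the indexed MAX constraint, so whichever
    of the two constraints is ranked higher decides the outcome.
    """
    max_rank = RANKING.index(f"MAX-L{lexical_class}")
    markedness_rank = RANKING.index(MARKEDNESS[context])
    return "preservation" if max_rank < markedness_rank else "deletion"

for context in MARKEDNESS:
    outcomes = {f"L{k}": evaluate(context, k) for k in (4, 3, 2, 1)}
    print(context, outcomes)
```

The ranking is fixed; the only thing that changes from one evaluation to the next is which indexed version of MAX the word is affiliated with.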

  17. Likelihood of deletion based on grammar alone
      Grammar: MAX-L4 >> *PRE-C >> MAX-L3 >> *PRE-V >> MAX-L2 >> *PRE-## >> MAX-L1
      Note that grammar determines:
      • What variation is observed: only a process that reduces markedness, only a process that is grammatically motivated.
      • How frequently the process applies in which context.
      But we still need to give the lexicon its due.

  18. The influence of lexical frequency
      Frequencies from Francis & Kučera (1982), calculated at www.iphod.com.
      [Figure: best, modest, vest; regions MAX-L4, MAX-L3, MAX-L2, MAX-L1; constraints *PRE-C, *PRE-V, *PRE-##]

  19. Example 2: Analogically Motivated Variation

  20. Regularization of the strong past tense in English
      • Specific examples from Kučera and Francis (1982) (www.iphod.com)
      • Irregular morphology/suppletion as allomorphy
        • Two morphological options for formation of the past tense.
        • Both options are input to the grammar, so that choice of one allomorph does not violate faithfulness relative to the other (Anttila 1997, Bonet 2004, Itô and Mester 2006, Kager 1996, Mascaró 1996, etc.).
      • Constraints
        • OO-FAITH: Some kind of paradigm uniformity (Benua 2000, Kenstowicz 1996, etc.)
        • USELISTED: The input of a candidate must be a single lexical entry (Zuraw 2000)

  21. The grammar
      • And the influence from the lexicon
      [Figure: dive, leap, speed; OO-FAITH-L2, OO-FAITH-L1, USELISTED]

  22. Lexical Distribution Functions

  23. What needs to be learned?
      • Grammar: ranking between constraints
      • Lexicon: lexical items, with their probabilistic distribution functions
      These are two separate learning problems, each with its own solution.
      • Learning the grammar
        • Well-developed learnability literature in OT (Tesar and Smolensky 1998, 2000, etc.), and specifically on learning an indexed grammar (Pater 2006, to appear).
        • I will therefore not dwell on this aspect here.
      • Learning the lexicon
        • Focus here on how the lexical distribution functions might be acquired.

  24. General properties of lexical distribution functions
      [Figure: MAX: L1; DEP: L1, L2, L3; IDENT[F]: L1, L2, L3, L4; curves labelled average, frequent, infrequent]

  25. General properties of lexical distribution functions
      Basic requirements
      • Minimum and maximum value.
      • Shape parameters that determine skewness
      Beta distribution (Evans, Hastings & Peacock 2000)
      • α = β: symmetric
      • α < β: right skewed
      • α > β: left skewed
      [Figure: curves labelled average, frequent, infrequent]
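
The link between the two shape parameters of the Beta distribution and the direction of skew can be checked with the standard closed-form skewness of that distribution. The sketch below is an illustration, not part of the talk; the function name and the example shape values are hypothetical.

```python
import math

def beta_skewness(alpha, beta):
    """Skewness of a Beta(alpha, beta) distribution (standard closed form)."""
    numerator = 2 * (beta - alpha) * math.sqrt(alpha + beta + 1)
    denominator = (alpha + beta + 2) * math.sqrt(alpha * beta)
    return numerator / denominator

# Zero skewness means symmetric, positive means right skewed (long right
# tail), negative means left skewed (long left tail).
for alpha, beta in [(2.0, 2.0), (2.0, 5.0), (5.0, 2.0)]:
    print(alpha, beta, beta_skewness(alpha, beta))
```

The sign of the result depends only on which of the two shape parameters is larger, which is what lets a single two-parameter family cover frequent, average and infrequent items.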

  26. A small scale simulation
      • IPhOD 1.3 (www.iphod.com)
        • 33,432 words, with CMU transcriptions and Kučera-Francis (KF) frequencies
        • Multiplied KF frequencies by 10 to avoid having to work with log(1) …
      • Calculated the following
        • Mean frequency of all words in IPhOD = 297.89; log(297.89) = 2.47.
        • Collected all words that end in [-Ct] or [-Cd], excluding past tense verbs, and took the log of the frequency for each of these.
      • Distribution functions: [figure]
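
The arithmetic on this slide is easy to reproduce. The sketch below assumes base-10 logarithms, which is the base that matches the reported value of 2.47 for a mean of 297.89; the helper name `scaled_log_frequency` is hypothetical, and how the log frequencies are then turned into shape parameters is not stated on the slide and is not attempted here.

```python
import math

def scaled_log_frequency(kf_count, scale=10):
    """Log10 of a Kučera-Francis count after multiplying it by `scale`."""
    return math.log10(kf_count * scale)

# A word with a raw count of 1 would have log10(1) = 0 without scaling.
lowest = scaled_log_frequency(1)

# Log of the mean frequency reported on the slide.
log_mean = math.log10(297.89)
print(lowest, round(log_mean, 2))
```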

  27. A small scale simulation
      [Figure: aghast, most, modest, best, vest; regions MAX-L4, MAX-L3, MAX-L2, MAX-L1; constraints *PRE-C, *PRE-V, *PRE-##]

  28. How well do the predictions line up with reality?
      • Once the values of α and β for a word are known, it is easy to calculate the likelihood of an x-value falling in a specific range along the x-axis, and hence the likelihood of deletion in each of the three contexts for each word.
      • Using this, I ran a simulation, feeding each [-Ct] and [-Cd] word through the grammar, according to its frequency in IPhOD.
      [Table: Phonological context (value in brackets is ratio to Pre-C) (Santa Ana 1991)]
      [Table: Frequency (value in brackets is ratio to > 35/million) (Bybee 2000)]
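
The calculation in the first bullet can be sketched as follows. This is a rough illustration rather than the simulation reported in the talk: it assumes a Beta density on [0, 1] with both shape parameters at least 1, four equal regions laid out left to right as L4, L3, L2, L1 (following the figure labels), plain midpoint-rule integration instead of a library CDF, and made-up shape values. All function names are hypothetical.

```python
import math

def beta_pdf(x, alpha, beta):
    """Density of Beta(alpha, beta) at x, for 0 < x < 1."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1) * (1 - x) ** (beta - 1)

def region_probability(lo, hi, alpha, beta, steps=2000):
    """Midpoint-rule estimate of the probability that x falls in [lo, hi]."""
    width = (hi - lo) / steps
    total = sum(beta_pdf(lo + (i + 0.5) * width, alpha, beta) for i in range(steps))
    return total * width

def class_probabilities(alpha, beta):
    """Probability of each MAX index, assuming regions L4, L3, L2, L1 left to right."""
    bounds = [0.0, 0.25, 0.5, 0.75, 1.0]
    return {4 - i: region_probability(bounds[i], bounds[i + 1], alpha, beta)
            for i in range(4)}

def deletion_probabilities(alpha, beta):
    """Deletion likelihood per context, given which MAX indices lose in each."""
    p = class_probabilities(alpha, beta)
    return {
        "pre-C": p[3] + p[2] + p[1],  # only MAX-L4 outranks *PRE-C
        "pre-V": p[2] + p[1],         # MAX-L4 and MAX-L3 outrank *PRE-V
        "pre-pause": p[1],            # only MAX-L1 is outranked by *PRE-##
    }

print(deletion_probabilities(5.0, 2.0))  # illustrative "frequent" word
print(deletion_probabilities(2.0, 5.0))  # illustrative "infrequent" word
```

Whatever the shape values, this setup yields at least as much deletion before a consonant as before a vowel, and at least as much before a vowel as before a pause, because the set of classes that trigger deletion shrinks from one context to the next.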

  29. How can this be refined further?
      • Currently, the lexical distribution functions are determined purely on the basis of lexical frequency. But we know that different dialects show different deletion rates.
        • Either different dialects have different lexical frequencies,
        • or there are other parameters that can be set independently of lexical frequency.
          • Maybe some constant is added to or subtracted from the mean?
            • Added = more words become “infrequent” = more conservative dialect.
            • Subtracted = more words become “frequent” = more deletion.
          • Maybe the lexical space can be warped, i.e. the regions along the x-axis that correspond to lexical classes are not of equal size.
          • Maybe lexical distribution functions are best-fit functions, i.e. learn a function that would result in the correct deletion rate … but then we lose the connection between usage frequency and deletion rates.

  30. Conclusion

  31. Conclusion
      • Existing grammatical models of variation do not allow the lexicon enough opportunity to play a role (Pierrehumbert 2001).
        • p. 138: “A second challenge arises from the fact that the differential phonetic outcomes relate specifically to word frequency. Standard generative models do not encode word frequency. They treat the word frequency effects … as matters of linguistic performance rather than linguistic competence. Thus the intrusion of word frequency into a traditional area of linguistics, namely to conditioning of allophony, is not readily accommodated in the classical generative viewpoint.”
        • p. 148: “The exemplar model is the only current model which has these properties.”
      • Purely usage-based models probably do not allow the grammar enough say.
        • Bybee (2000:73): “… it does mean that there is no variable rule of t/d-deletion. Rather there is a gradual process of shortening or reducing the lingual gesture …”
        • Bybee (2002:268): “If we take linguistic behavior to be highly practiced neuromotor activity … then we can view reductive sound changes as the result of the automation of linguistic production. It is well known that repeated neuromotor patterns become more efficient as they are practiced; transitions are smoothed by the anticipatory overlap of gestures, and unnecessary or extreme gestures decrease in magnitude or are omitted.”
      • LTV is an attempt to do both. Does it succeed?

  32. References
      Anttila, Arto. 1997. Deriving variation from grammar. In Frans Hinskens, Roeland van Hout and Leo Wetzels, eds. Variation, Change and Phonological Theory. Amsterdam: John Benjamins. p. 35-68.
      Benua, Laura. 2000. Phonological Relations Between Words. New York: Garland.
      Boersma, Paul and Bruce Hayes. 2000. Empirical tests of the Gradual Learning Algorithm. Linguistic Inquiry, 32:45-86.
      Bonet, Eulàlia. 2004. Morph insertion and allomorphy in Optimality Theory. International Journal of English Studies, 4:73-104.
      Bybee, Joan L. 1985. Morphology: A Study of the Relation Between Meaning and Form. Amsterdam: Benjamins.
      Bybee, Joan L. 2000. The phonology of the lexicon: evidence from lexical diffusion. In Michael Barlow and Suzanne Kemmer, eds. Usage-Based Models of Language. Stanford: CSLI Publications. p. 65-85.
      Bybee, Joan. 2001. Phonology and Language Use. Cambridge: Cambridge University Press.
      Bybee, Joan. 2002. Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language Variation and Change, 14:261-290.
      Bybee, Joan L. and Dan I. Slobin. 1982. Rules and schemas in the development and use of the English past tense. Language, 58:265-289.
      Coetzee, Andries W. 2006. Variation as accessing “non-optimal” candidates. Phonology, 23:337-385.
      Hooper, Joan B. 1976. Word frequency in lexical diffusion and the source of morphological change. In William M. Christie, ed. Current Progress in Historical Linguistics. Amsterdam: North-Holland Publishing Co. p. 95-105.
      Itô, Junko and Armin Mester. 1995. The core-periphery structure of the lexicon and constraints on reranking. In J. Beckman, S. Urbanczyk, and L. Walsh, eds. University of Massachusetts Occasional Papers in Linguistics 18: Papers in Optimality Theory. Amherst: GLSA. p. 181-209.
      Itô, Junko and Armin Mester. 1999. The structure of the phonological lexicon. In Natsuko Tsujimura, ed. The Handbook of Japanese Linguistics. Malden: Blackwell. p. 62-100.
      Itô, Junko and Armin Mester. 2006. Indulgentia parentum filiorum pernicies: Lexical allomorphy in Latin and Japanese. In Eric Bakovic, Junko Ito, and John McCarthy, eds. Wondering at the Natural Fecundity of Things: Essays in Honor of Alan Prince. Paper 9. (http://repositories.cdlib.org/lrc/prince/9).
      Kager, René. 1996. On affix allomorphy and syllable counting. In Ursula Kleinhenz, ed. Interfaces in Phonology. Berlin: Akademie Verlag. p. 155-171.
      Kenstowicz, Michael. 1996. Base-identity and uniform exponence: alternatives to cyclicity. In J. Durand and B. Laks, eds. Current Trends in Phonology: Models and Methods. Paris-X and Salford: University of Salford Publications. p. 363-393.
      Labov, William. 1972. The internal evolution of linguistic rules. In Robert P. Stockwell and Ronald K.S. Macaulay, eds. Linguistic Change and Generative Theory. Bloomington: Indiana University Press. p. 101-171.
      Mascaró, Joan. 1996. External allomorphy as emergence of the unmarked. In Jacques Durand and Bernard Laks, eds. Current Trends in Phonology: Models and Methods. Salford, Manchester: European Studies Research Institute, University of Salford. p. 473-483.

  33. References
      Pater, Joe. 1994. Against the underlying specification of an ‘exceptional’ English stress pattern. Toronto Working Papers in Linguistics, 13:95-121.
      Pater, Joe. 2000. Non-uniformity in English secondary stress: the role of ranked and lexically specific constraints. Phonology, 17:237-274.
      Pater, Joe. 2006. The Locus of Exceptionality: Morpheme-Specific Phonology as Constraint Indexation. In L. Bateman, M. O'Keefe, E. Reilly, and A. Werle, eds. University of Massachusetts Occasional Papers in Linguistics 32: Papers in Optimality Theory III. Amherst: GLSA. p. 259-296.
      Pater, Joe. to appear. Morpheme-specific phonology: constraint indexation and inconsistency resolution. In Steve Parker, ed. Phonological Argumentation. London: Equinox Publishers.
      Phillips, Betty S. 1984. Word frequency and the actuation of sound change. Language, 60:320-342.
      Phillips, Betty S. 2001. Lexical diffusion, lexical frequency, and lexical analysis. In Joan Bybee and Paul Hopper, eds. Frequency and the Emergence of Linguistic Structure. Amsterdam: John Benjamins. p. 123-136.
      Pierrehumbert, Janet. 2001. Exemplar dynamics: Word frequency, lenition, and contrast. In Joan Bybee and Paul Hopper, eds. Frequency and the Emergence of Linguistic Structure. Amsterdam: John Benjamins. p. 137-157.
      Reynolds, Bill. 1994. Variation and Phonological Theory. Ph.D. dissertation, University of Pennsylvania.
      Sankoff, David. 1988. Variable rules. In Ulrich Ammon, Norbert Dittmar and Klaus J. Mattheier, eds. Sociolinguistics: An International Handbook of the Science of Language and Society. Berlin & New York: Walter de Gruyter. p. 984-997.
      Santa Ana, Otto. 1991. Phonetic Simplification Processes in the English of the Barrio: A Cross-Generational Sociolinguistic Study of the Chicanos of Los Angeles. Ph.D. dissertation, University of Pennsylvania.
      Steriade, Donca. 1997. Phonetics in Phonology: The Case of Laryngeal Neutralization. Ms. UCLA.
      Tesar, Bruce and Paul Smolensky. 1998. Learnability in Optimality Theory. Linguistic Inquiry, 29:229-268.
      Tesar, Bruce and Paul Smolensky. 2000. Learnability in Optimality Theory. Cambridge, MA: MIT Press.
      Zuraw, Kie. 2000. Patterned Exceptions in Phonology. Ph.D. dissertation, UCLA.

  34. The end
