Li6 Phonology and Morphology. Rules.
Lecture plan • Key point of tension between symbolic rationalists and numerical reductionists: • Do humans extract generalisations from the data in their perceptual worlds? • Put differently, is the mind a Turing Machine or a recurrent switching network? • Evidence for rules • What form rules take • Degree of specificity • Formalism
Turing machine vs switch network • Turing machine: +memory, +rules/algorithms/generalisations • Switch network: -memory, -rules/algorithms/generalisations
Arguments for Turing machine (or against connectionism) • Gallistel 2006 • Dead reckoning • Bee dances • Temporal learning in conditioning experiments • truly random control (Rescorla 1968) • Blocking (Kamin 1969) • Minsky and Papert 1969 on 2-layer networks: • Exclusive OR (thanks to Marc) • Can’t correctly indicate at its output neuron (or neurons) whether there are an even or an odd number of neurons firing in its input layer • Berent et al. 2006 on plurals in English compounds • Vaux • MSCs, as we’ll see later
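The exclusive-OR point can be illustrated with a minimal Python sketch. It is not taken from Minsky and Papert; the function names and the weight grid are assumptions made for illustration. It searches for a single linear threshold unit (input layer wired directly to one output neuron) that computes AND, OR, and XOR over two binary inputs. The grid search only illustrates the result; the general impossibility of XOR follows from the inequalities the weights would have to satisfy.

```python
from itertools import product

def threshold_unit(w1, w2, bias):
    """Return a function computing one linear threshold unit on two binary inputs."""
    return lambda x1, x2: int(w1 * x1 + w2 * x2 + bias > 0)

def find_unit(target):
    """Search a small weight grid for a unit that matches `target` on all four inputs."""
    grid = [v / 2 for v in range(-8, 9)]  # -4.0 ... 4.0 in steps of 0.5 (arbitrary choice)
    for w1, w2, bias in product(grid, repeat=3):
        unit = threshold_unit(w1, w2, bias)
        if all(unit(a, b) == target(a, b) for a, b in product((0, 1), repeat=2)):
            return (w1, w2, bias)
    return None  # no setting on the grid computes the target function

and_unit = find_unit(lambda a, b: a & b)   # some weight setting exists
or_unit = find_unit(lambda a, b: a | b)    # some weight setting exists
xor_unit = find_unit(lambda a, b: a ^ b)   # no weight setting exists
```

XOR is the two-input case of the even/odd (parity) problem mentioned on the slide.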
UR→SR mappings and rules • We saw in lecture 1 that humans store both abstract underlying representations (URs) and more concrete surface representations (SRs) • How does one get from one type of representation to the other? • Hypothesis 1: each is simply memorized • Hypothesis 2: UR→SR mappings encoded in associative/connectionist network • Hypothesis 3a: All URs are transformed into SRs (and perhaps vice versa) by an ordered series of rules • Hypothesis 3b: Only regular UR→SR mappings involve rules Why favor this one?
Generalisation by animals Gallistel, C. 2003. Conditioning from an information processing perspective. Behavioural Processes 61.3:1234 1-13.
Generalisation by infants • Marcus et al 1999 • Question • Do infants extract linguistic generalisations, and in what form? • Method • 16 infants randomly assigned to one of two groups, each familiarized with 2-minute speech sample • ABA group: 3 reps of each of 16 3-word sentences from ABA grammar (ga ti ga, li na li, etc.) • ABB group: same with ABB grammar (ga ti ti, etc.) • After habituation, testing on sentences of 3 novel nonce words • Test sentences varied as to whether they were consistent or inconsistent with the grammar of the habituation sentences. • Because none of the test words appeared in the habituation phase, infants could not distinguish the test sentences based on transitional probabilities, and because the test sentences were the same length and were generated by a computer, the infants could not distinguish them based on statistical properties such as number of syllables or prosody. • Results • The infants attend longer to sentences with unfamiliar structures. • Conclusions • “Results suggest that infants can represent, extract, and generalize abstract algebraic rules.” • Figure: Mean time spent looking in the direction of the consistent and inconsistent stimuli in each condition for experiments 1, 2, and 3.
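What an "abstract algebraic rule" amounts to here can be sketched in a few lines of Python. This is an illustration of the logic of the design, not the authors' model; the function names and the test words are assumptions. A sentence is reduced to its pattern of identity relations (ABA, ABB), so a test sentence made entirely of novel words can still be judged consistent or inconsistent with the habituation grammar.

```python
def pattern(sentence):
    """Map a sentence to its abstract identity pattern, e.g. 'ga ti ga' -> 'ABA'."""
    labels = {}
    out = []
    for word in sentence.split():
        if word not in labels:
            # Assign the next unused letter to each new word type.
            labels[word] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[len(labels)]
        out.append(labels[word])
    return "".join(out)

# Habituation sentences quoted on the slide (ABA group).
habituation = ["ga ti ga", "li na li"]
grammar = {pattern(s) for s in habituation}

def consistent(test_sentence):
    """True if the test sentence has a pattern seen during habituation."""
    return pattern(test_sentence) in grammar
```

Because `pattern` never looks at which words occur, only at whether they repeat, word-level statistics from habituation cannot be what distinguishes the two kinds of test sentence.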
Conclusions about generalisation extraction • Ample evidence that humans extract generalisations from patterns of data in the real world • These are directly captured in rules • These are not captured insightfully (or sometimes at all) by switch-network models (surface constraints, connectionist networks)
Internal evidence • A typical line of argumentation • When does glottalization occur? • sat • Atlantic, atmosphere, coat-tails • tap, atrocious • Since glottalization/unrelease is predictable, we don’t want this to be part of the underlying representation, under the assumption that speakers don’t store redundant information. • If this is the case, we need a rule to glottalize stops in the appropriate environments. • What form should this rule take?
External evidence • Productivity • Child and adult Wug tests, e.g. Pinker and Ullmann on novel plurals • Speech therapy • Click girl undoing her problem with lightning quickness • Syllable deletion in speech errors • unanímity [junnImRi] → unámity [junQmRi] • treméndously [tHrmEndsli] → trémenly [tHrEmnli] • specifícity [spEsfIsRi] → specífity [spEsIfRi] • What is the error in each case? • We need a rule to assign a new stress in these words; if there were no rules, we should expect the forms to be stressless • What sort of rules do we need to account for the outcome of these errors? • First-language acquisition phenomena • Over-regularization (goed for went, etc.) • Transfer in second-language acquisition • Speakers have trouble suppressing L1 rules • Japanese/Korean palatalization, epenthesis • English aspiration • Hard to explain this phenomenon (L1 non-suppression) without rules
What about extraction of generalisations from less clear patterns? Morphophonemic rules Static patterns
What about generalisations that have exceptions? • English Vowel Shift is productive for some speakers for some vowels • Cena 1978, Jaeger 1980, McCawley 1986 • Pierrehumbert 2002 • English /k/ → [s] / _ i in Latinate contexts • electric-ity vs cheek-y *chee[s]y • Is the rule active, or just a historical remnant? • Method • ADJ ← N (back formation) • In Pierre’s entire career as a curator, he had never before seen such a perfect example of hovacity. It was an electrifyingly ______ sculpture. • N ← ADJ (forward formation) • Before Pierre stood an electrifyingly hovac sculpture. In his entire career as curator, he had never before seen such a perfect example of ______. • Results • The alternation was productive, but only for Latinate and semi-Latinate targets.
What about generalisations that show no alternations? • Esper 1925 • Test subjects break up nonce words into morphemes based on phonotactics of their L1 • Moreton 1999 • Speakers have active knowledge of constraint on monosyllables ending in lax vowel, which they use in speech perception • Pater and Tessier 2003 • toy grammars easier to acquire when their alternations conform to phonotactic generalizations in their L1 • Dell et al 2000 • speech errors conform to phonotactics of data in toy language • Cebrian 2002 • native English speakers, and Catalan learners of English, use this restriction in interpreting the morphological composition of nonce words. • Vaux 2003 • Productivity of MSCs • Kaun and Harrison 1999 on Tuvan reduplication…
Tuvan overwriting reduplication • Common assumption among phonologists: • Non-alternating structure is stored as such in underlying forms. • Alternating structure is not stored in URs. • Alternation Condition (Kiparsky ‘68), Lexicon Optimization (P&S ‘93) • Kaun and Harrison 1999: • Observation: Tuvan vowel harmony (VH): all vowels in a root agree with respect to [back] • Question: does vowel harmony apply to non-alternating forms? • Method: teach subjects Jocular Reduplication; see if new V triggers root harmony • Replace first vowel of root with [a]: nom ‘book’ → nom-nam • If root vowel is [a], replace it with [u]: at ‘name’ → at-ut • Results: harmonic forms reharmonize, disharmonic forms don’t • Harmonic words: idik ‘boot’ → idik-adık (not *adik) • Disharmonic words: mašina ‘car’ → mašina-mušina (*mušı/una)
Tuvan overwriting reduplication • Conclusions: • Disharmonic forms are fully specified underlyingly • Harmonic forms are not (“Free Ride”, McCarthy 2004) • Theoretical implication: • Generalisations can be formed over non-alternating phonological material • Diagram: idik carries a single [-bk] specification; mašina carries one specification per vowel: a [+bk], i [-bk], a [+bk]
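The reduplication pattern and the conclusion drawn from it can be sketched in Python. This is a toy model under stated assumptions, not Kaun and Harrison's analysis: the front/back vowel pairings in `FRONT_TO_BACK` and all function names are assumptions made for illustration, and only the four forms cited on the slides are checked. Harmonic roots are treated as unspecified for [back] after the first vowel, so they take their value from the overwriting vowel; disharmonic roots keep their stored vowels.

```python
# Assumed front/back vowel pairs (illustrative, not from the slides).
FRONT_TO_BACK = {"i": "ı", "e": "a", "ü": "u", "ö": "o"}
BACK = set(FRONT_TO_BACK.values())
VOWELS = set(FRONT_TO_BACK) | BACK

def is_harmonic(root):
    """True if all vowels in the root agree in backness."""
    vowels = [c for c in root if c in VOWELS]
    return len({v in BACK for v in vowels}) <= 1

def reduplicant(root):
    """Overwrite the first vowel with a (or u if it is a); reharmonize only harmonic roots."""
    harmonic = is_harmonic(root)
    out, seen_first = [], False
    for c in root:
        if c not in VOWELS:
            out.append(c)
        elif not seen_first:
            out.append("u" if c == "a" else "a")
            seen_first = True
        elif harmonic:
            # The overwriting vowel (a or u) is always back, so later vowels become back.
            out.append(FRONT_TO_BACK.get(c, c))
        else:
            out.append(c)  # disharmonic roots: vowels are stored, so they stay as they are
    return "".join(out)

def jocular(root):
    """Return the root followed by its jocular reduplicant."""
    return f"{root}-{reduplicant(root)}"
```

On this sketch `jocular("idik")` reharmonizes while `jocular("mašina")` does not, mirroring the contrast reported in the results.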
The formal statement of rules Rules take the general form A → B / X _ Y • A target of rule, an element in UR • → becomes • B what the segment containing A becomes • / in the environment of • _ position of the target A • X element left-adjacent to A (can be absent) • Y element right-adjacent to A (can be absent) • # word boundary • Ø zero/nothing • /X/ underlying form • [X] surface form • <X> stray segment • (X) optional segment • α,β,γ variables
Key rule types • Insertion • Ø → A / B _ C • Insert A between any BC sequence • Ø → A / _# • Insert A word-finally • Deletion • A → Ø / B _ C • Delete A between B and C • A → Ø / #_ • Delete A word-initially • Alpha Rule • [αX] → [-αX] / B _ ]σ • Invert the feature specification for X when it occurs after B at the end of a syllable
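The insertion and deletion schemas above can be made concrete with a short Python sketch. It is a simplification under stated assumptions: segments are single characters, the environment elements X and Y are single segments or `#`, and the function name `apply_rule` is hypothetical. The empty string stands for Ø. Alpha rules are not covered, since they need feature representations rather than atomic symbols.

```python
import re

def apply_rule(form, target, change, left="", right=""):
    """Apply target -> change / left _ right. '' stands for Ø, '#' for a word boundary."""
    # Translate the left environment into a regex assertion.
    if left == "#":
        left_pat = "^"
    elif left:
        left_pat = f"(?<={re.escape(left)})"
    else:
        left_pat = ""
    # Translate the right environment into a regex assertion.
    if right == "#":
        right_pat = "$"
    elif right:
        right_pat = f"(?={re.escape(right)})"
    else:
        right_pat = ""
    pattern = left_pat + re.escape(target) + right_pat
    # A function replacement avoids backslash interpretation in `change`.
    return re.sub(pattern, lambda m: change, form)

inserted_medial = apply_rule("tabc", "", "a", left="b", right="c")  # Ø -> a / b _ c
inserted_final = apply_rule("tak", "", "a", right="#")              # Ø -> a / _ #
deleted_medial = apply_rule("bac", "a", "", left="b", right="c")    # a -> Ø / b _ c
deleted_initial = apply_rule("stop", "s", "", left="#")             # s -> Ø / # _
```

The four calls correspond one-to-one to the insertion and deletion examples on the slide, with arbitrary toy strings as inputs.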
Desiderata in rules • Keys: • Elsewhere Case = UR • Use as few rules as possible • This includes trying to collapse rules dealing with (seemingly) separate phenomena, such as the English plural and other voice assimilation processes • Be as general as possible • E.g. try “stops” rather than “{p t}” • Be as predictive as possible • a rule that merely describes the facts is essentially useless • The last two points normally boil down to the same thing (use as few features as possible, etc.) • Linguists generally temper their rule formulations with consideration of what is (typologically) plausible
Choosing a UR • Relevance of Elsewhere Case • English aspiration • Insertion of material is less common than deletion • Generalisation: avoid insertion/creation of arbitrary elements • Consideration of rule typology • Final devoicing, palatalization, etc. • An example • Sound X occurs only at the ends of words, while sound Y occurs anywhere but at the ends of words. Which of the following rules is most likely to be involved? • X → Y / ___ # • Y → X / ___ # • X → Y everywhere but / ___ #
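The reasoning behind the example can be sketched in Python: tabulate where each sound occurs and take the sound with the broader distribution as the Elsewhere Case, hence the UR. The word list is a hypothetical toy data set using the slide's placeholder symbols X and Y, and the function names are assumptions.

```python
def environments(words, sound):
    """Collect the positions ('initial', 'medial', 'final') in which `sound` occurs."""
    envs = set()
    for word in words:
        for i, seg in enumerate(word):
            if seg != sound:
                continue
            if i == len(word) - 1:
                envs.add("final")
            elif i == 0:
                envs.add("initial")
            else:
                envs.add("medial")
    return envs

def elsewhere_case(words, a, b):
    """Return whichever of two sounds occurs in more kinds of position."""
    return a if len(environments(words, a)) > len(environments(words, b)) else b

# Hypothetical toy lexicon: X only word-finally, Y everywhere else.
toy_words = ["YaX", "aYa", "YoYaX"]
underlying = elsewhere_case(toy_words, "X", "Y")
```

With Y as the UR, the restricted sound is derived by a single rule with a simple environment, Y → X / ___ #, rather than by a rule that must list every non-final position.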
Use as few rules as possible • English aspiration • p → pʰ, t → tʰ, k → kʰ: 3 rules • [-voice, -cont] → [+spread glottis]: 1 rule
Use as few features as possible • Voicing neutralization • Russian voiced obstruents become voiceless word-finally • Voiced obstruents = [+voice, -sonorant, +consonantal…] • Relatively specific formulation: • [+voice, -son, +cons] → [-voice, -son, +cons] / _ # • More general/predictive formulation, using fewer features: • [-son] → [-voice] / _ # • voiceless obstruents vacuously undergo the rule
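The general formulation [-son] → [-voice] / _ # can be sketched in Python with segments represented as feature bundles. The inventory below is a deliberately tiny, assumed feature system (four features, a handful of segments in simplified transliteration), chosen only so that each bundle is unique; it is not a full analysis of Russian.

```python
# Toy feature system (an assumption for illustration, not a complete inventory).
INVENTORY = {
    "p": {"son": "-", "voice": "-", "cont": "-", "place": "lab"},
    "b": {"son": "-", "voice": "+", "cont": "-", "place": "lab"},
    "t": {"son": "-", "voice": "-", "cont": "-", "place": "cor"},
    "d": {"son": "-", "voice": "+", "cont": "-", "place": "cor"},
    "k": {"son": "-", "voice": "-", "cont": "-", "place": "dor"},
    "g": {"son": "-", "voice": "+", "cont": "-", "place": "dor"},
    "s": {"son": "-", "voice": "-", "cont": "+", "place": "cor"},
    "z": {"son": "-", "voice": "+", "cont": "+", "place": "cor"},
    "m": {"son": "+", "voice": "+", "cont": "-", "place": "lab"},
    "r": {"son": "+", "voice": "+", "cont": "+", "place": "cor"},
    "a": {"son": "+", "voice": "+", "cont": "+", "place": "low"},
    "o": {"son": "+", "voice": "+", "cont": "+", "place": "mid"},
}

def final_devoicing(word):
    """[-son] -> [-voice] / _ # : set [voice] to '-' on a word-final obstruent."""
    last = dict(INVENTORY[word[-1]])  # copy so the inventory is not modified
    if last["son"] == "-":
        last["voice"] = "-"           # vacuous if the obstruent is already voiceless
    for symbol, feats in INVENTORY.items():
        if feats == last:
            return word[:-1] + symbol
    raise ValueError("no segment in the toy inventory has the resulting features")
```

For example, `final_devoicing("sad")` gives `sat`, while `final_devoicing("kot")` returns `kot` unchanged: the rule applies to the final /t/ vacuously, as the slide notes.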
Spanish spirantization
Noun               definite                 gloss
banca [baŋka]      la banca [la βaŋka]      bank
demora [demoɾa]    la demora [la ðemoɾa]    delay
gana [gana]        la gana [la ɣana]        desire
• What are the segments targeted by the rule? • In what environment(s) do they undergo the rule? • The set of sounds that undergoes this change is the voiced stops, i.e. the natural class of [+consonantal, -sonorant, -continuant, +voice] segments. • The set of sounds produced by the rule is the voiced fricatives, i.e. the natural class of [+consonantal, -sonorant, +continuant, +voice] segments. • The set of sounds that triggers the change is vowels and r, i.e. the natural class of [+continuant] segments. • We could therefore say: • [+cons, -son, -cont, +voice] → [+cons, -son, +cont, +voice] / [+cont] _ • However, we want to be as general and efficient as possible. Therefore: • [+voice] → [+continuant] / [+continuant] _
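The spirantization pattern can be sketched in Python. This sketch implements the more specific statement of the rule (voiced stops become voiced fricatives after a [+continuant] segment) over atomic symbols rather than features; the names `SPIRANT`, `CONTINUANTS`, and `spirantize` are assumptions, and the rule is assumed to apply across the word boundary in article + noun phrases, as the data require.

```python
SPIRANT = {"b": "β", "d": "ð", "g": "ɣ"}   # voiced stop -> voiced fricative
CONTINUANTS = set("aeiouɾ")                 # triggers: vowels and r

def spirantize(phrase):
    """Spirantize a voiced stop when the preceding segment is [+continuant]."""
    out = []
    prev = None  # last real segment seen, ignoring spaces between words
    for seg in phrase:
        if seg in SPIRANT and prev in CONTINUANTS:
            out.append(SPIRANT[seg])
        else:
            out.append(seg)
        if seg != " ":
            prev = seg
    return "".join(out)
```

Phrase-initial stops have no preceding segment and therefore surface unchanged, which is why the bare nouns keep their stops.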
English plural formation • Formation of regular plurals of nouns in English: cat : cat[s] dog : dog[z] ash : ash[əz] • Possible analyses: 1. Memorize each word and its plural form. 2. Memorize 3 plural endings; assign each word to class 1, 2, or 3. 3. {plural} → • [s] after {p t k ...} • [z] after {b d g ...} • [əz] after … 4. several general rules (holding over domains broader than the plural): • Plural selection: {plural} → /-z/ • Epenthesis: Ø → [ə] / _ <C> cf. knish • Voicing Assimilation: [-son] → [αvoice] / _ [αvoice] ]σ cf. fif-th • Predictions? • Analyses 1 and 2 predict that speakers will be unable to deal with foreign and made-up words.
What does each model predict? Some theories of English plural formation • rule-based • [+pl] → [-z] / {aeioubdgmnŋð…} _ ; [-s] / {ptkθf} _ ; [-əz] / {szčĵšž} _ • [+pl] → [-z] / [+voice, -strident] _ ; [-s] / [-voice, -strident] _ ; [-əz] / [+strident] _ • [+pl] → [-əz] / [+strident] _ ; [-s] / [-voice] _ ; [-z] / elsewhere • rule 1: [+pl] → /-z/ ; rule 2: Ø → [ə] / _ <C> ; rule 3: [+cons] → [-voice] / [-voice] _ • probabilistic (analogical, connectionist) • wug + PL → 70% of g-final words take -z → 70% wugz • wug + PL → 70% of g-final words take -z → 100% wugz • memory-based unordered ordered
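The ordered three-rule theory (rule 1, rule 2, rule 3) can be sketched in Python. Two simplifications are assumptions of this sketch rather than claims of the slides: the stray-consonant environment of the epenthesis rule is approximated as "after a sibilant", and the segment sets are small illustrative lists. Stems are lists of segments so that affricates count as one unit.

```python
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}

def pluralize(stem):
    """Derive a regular plural from a list of segments by three ordered rules."""
    form = stem + ["z"]            # rule 1: [+pl] -> /-z/
    if form[-2] in SIBILANTS:      # rule 2: epenthesis (approximated as: after a sibilant)
        form.insert(-1, "ə")       # insert schwa before the suffix consonant
    if form[-2] in VOICELESS:      # rule 3: devoice the suffix after a voiceless segment
        form[-1] = "s"
    return form

cats = pluralize(["k", "æ", "t"])
dogs = pluralize(["d", "ɒ", "g"])
ashes = pluralize(["æ", "ʃ"])
wugs = pluralize(["w", "ʌ", "g"])
```

The ordering matters: because epenthesis applies first, the schwa separates the suffix from the voiceless sibilant in the plural of ash, so devoicing does not apply. The nonce stem wug is handled without any stored plural, which is the prediction that separates the rule-based theories from pure memorization.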
Avoiding insertion/creation • Proto-Polynesian *C → Ø / _ # • Synchronic analysis: • Passive /-ia/, gerundive /-aŋa/ • V → Ø / V _ • Better to have C-deletion rule than to have many allomorphs for the passive and the gerundive • The allomorphy analysis also incorrectly predicts the existence of roots selecting -tia but -maŋa • NB Maori actually did later choose the allomorphy analysis, and then made -tia its default form Hale, Ken. 1973. Deep-surface canonical disparities in relation to analysis and change: An Australian example. Current Trends in Linguistics 11:401-458.
References
Armbruster, Thomas. 1978. The Psychological Reality of the Vowel Shift and Laxing Rules. Dissertation Abstracts International 39:1516A-17A.
Aske, Jon. 1990. Disembodied Rules versus Patterns in the Lexicon: Testing the Psychological Reality of Spanish Stress Rules. In Hall, Kira; Koenig, Jean-Pierre; Meacham, Michael; Reinman, Sondra; Sutton, Laurel A. (eds.), Proceedings of the Sixteenth Annual Meeting of the Berkeley Linguistics Society, February 16-19, 1990: General Session and Parasession on the Legacy of Grice, 30-45. Berkeley: Berkeley Ling. Soc.
Berent, Iris, Steven Pinker, G. Ghavami, and S. Murphy. 2006. The Dislike of Regular Plurals in Compounds: Phonological Familiarity or Morphological Constraint? Manuscript, Harvard University.
Bernstein Ratner, N. 1984. Phonological rule usage in mother-child speech. Journal of Phonetics 12:245-254.
Cena, R. 1978. When is a phonological generalization psychologically real? Bloomington: Indiana University Linguistics Club.
Dell, Gary, Reed, K.D., Adams, D.R., & Meyer, A. 2000. Speech errors, phonotactic constraints, and implicit learning: A study of the role of experience in language production. Journal of Experimental Psychology: Learning, Memory, and Cognition 6:1355-1367.
Gallistel, C. Randy. 2003. Conditioning from an information processing perspective. Behavioural Processes 61.3:1234 1-13.
Gallistel, C. Randy. 2006. The nature of learning and the functional architecture of the brain. In Q. Jing, et al (Eds) Psychological Science Around the World, vol 1. Proceedings of the 28th International Congress of Psychology. Sussex: Psychology Press.
Hale, Ken. 1973. Deep-surface canonical disparities in relation to analysis and change: An Australian example. Current Trends in Linguistics 11:401-458.
Hauser, Marc, Daniel Weiss, and Gary Marcus. 2002. Rule learning by cotton-top tamarins. Cognition 86:B15–B22.
Hetzron, Robert. 1972. The Shape of a Rule and Diachrony. Bulletin of the School of Oriental and African Studies 35.3:451-475.
Iverson, Greg. 1994. The Reality of Linguistic Rules (Studies in Language Companion Series, no. 26), ed. with S. Lima & R. Corrigan. Amsterdam: John Benjamins.
Janda, Richard, Brian Joseph, and Neil Jacobs. 1992. Systematic hyperforeignisms as maximally external evidence for linguistic rules. In Iverson et al, The Reality of Linguistic Rules.
Marcus, Gary, S. Vijayan, S. Bandi Rao, and P. Vishton. 1999. Rule learning by seven-month-old infants. Science 283.5398.
Minsky, Marvin and Seymour Papert. 1969. Perceptrons. Cambridge: MIT Press.
Moreton, Elliott. 1999. Evidence for phonological grammar in speech perception. In: J. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, and A. C. Bailey (eds.), Proceedings of the 14th International Congress of Phonetic Sciences, San Francisco, pp. 2215-2217.
Pater, J. and A.-M. Tessier. 2003. Phonotactic Knowledge and the Acquisition of Alternations. In M.J. Solé, D. Recasens, and J. Romero (eds.) Proceedings of the 15th International Congress on Phonetic Sciences, Barcelona. 1777-1180.
Pierrehumbert 2002, an unnatural process. LabPhon 8.
Pinker and Prince. 1994. Regular and irregular morphology and the psychological status of rules of grammar. In: S. D. Lima, R. L. Corrigan, G. K. Iverson (eds.), The reality of linguistic rules, 321-51. Amsterdam: Benjamins.
Pinker and Ullmann
Trammell, Robert. 1978. The Psychological Reality of Underlying Forms and Rules for Stress. Journal of Psycholinguistic Research 7:79-94.
Truly random control • Shows that: • statistical correlations of the sort “if CS then US” do not drive generalisation formation • Categorical generalisations can be extracted from gradient distributions • vs 33% response as one might expect for Group 1
Blocking • Shows that something beyond statistical association is taking place
Esper 1925 • Method • Ss learn names of 16 objects, each having one of four different shapes and one of four different colors • Ss trained on 14 object-name associations but tested on 16 to see if they generalize what they learned • 3 experimental conditions: • names presented to Group 1: • naslig, sownlig, nasdeg, sowndeg, where nas- and sown- coded color and -lig and -deg coded shape • Since these names consisted of two phonologically legal morphemes, this group could simplify their task by learning not 16 names but 8 morphemes (if they could discover them) plus the simple rule that the color morpheme preceded the shape morpheme in each name. • Names presented to Group 2: • bi-morphemic names, as with Group 1 • unlike group 1, the morphemes were not phonologically legal for English, e.g., nulgen, nuzgub, pelgen, pezgub (where nu- and pe- were color morphemes and -lgen and -zgub were shape morphemes, the latter two violating English morpheme structure constraints) • Names presented to Group 3 (a control group): • names with no morphemic structure • no recourse but to learn 16 idiosyncratic names • Results • As expected, group 1 learned their names much faster and more accurately than group 3. • Performance of Group 2 was similar to (and marginally worse than) that of group 3 • Analysis of the errors of group 2, including how they generalized what they’d learned to the two object-name associations excluded from the training session, revealed that they tried to make phonologically legal morphemes from the ill-formed ones. • Demonstrates (i) psychological reality of MSCs; (ii) ability to conduct morphological analysis
Korean borrowing of Coda [t] • Korean word-final [t̚] ← /t, tʰ, t’, č, čʰ, č’, s, s’/ • Surface word-final postvocalic [t] in loans and nonce words invariably assigned to /s/ (Martin 1992, Kang 1998, Hayes 1998, Iverson & Lee 2004) • supermarket nom. [supəmakʰet̚], dat. [supəmakʰese] • What appears to be involved in the Korean case is that speakers know that surface word-final [t]s most often come from underlying /s/ in their native lexicon, and they therefore assign all new words to the same pattern.