Dick Hudson Manchester, March 2009. Why memory matters in English grammar. Memory. Long-term memory Short-term memory aka ‘working memory’ – used for thinking limited capacity: ‘7 ± 2’ Maybe working memory is the currently active area of long-term memory. Memory as a network.
Dick Hudson, Manchester, March 2009 • Why memory matters in English grammar
Memory • Long-term memory • Short-term memory • aka ‘working memory’ – used for thinking • limited capacity: ‘7 ± 2’ • Maybe working memory is the currently active area of long-term memory
Memory as a network • Long-term memory is a network • Evidence: activation spills onto neighbours • Evidence: • priming of neighbouring words • speech errors are wrongly selected neighbours • But the network’s not just language • ‘cognitive linguistics’
Network activity • Node activation • activation takes energy, and is limited • keeping a node active is expensive • Node creation • essential for processing experience • also expensive • Node binding • expensive as it tends to confuse similar nodes
Activation • [Diagram label: ‘active!’]
Node building • [Diagram label: ‘new!!’]
Tokens and types • Memory must include temporary tokens as well as permanent types. • Tokens are different from types • different properties, e.g. time, speaker • even conflicting properties, e.g. mispelings • But tokens are also very expensive, • because they’re the focus of attention.
Tokens in syntactic theory • What tokens can we afford? • At least one token per word • At least one dependency token per word • But do we really need more? • e.g. for we: a word, and a DP? • Phrases are expensive • so they need really strong evidence! e.g. five tokens here
Dependency structure • Just one token node per word • And one per dependency • e.g. “Dependency grammar is very ancient.” [Diagram: dependency arrows labelled p, a, s, a over the words.]
Phrase structure • One token per word, plus: • one token per phrase-mother. • one part-whole relation per word or phrase. • e.g. “Phrase structure is very young.”
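The cost of phrase structure • [Diagram: phrase-structure tree for “Phrase structure is very young.”, with phrase nodes ‘phrase structure is very young’, ‘phrase structure’, ‘is very young’ and ‘very young’ above the word nodes ‘Phrase’, ‘structure’, ‘is’, ‘very’ and ‘young’.] • VERY expensive!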
So what? (1) • Tokens are expensive (for memory resources) as long as they’re active. • So the sooner they de-activate, the better. • Tokens can de-activate sooner in dependency structure than in phrase structure. • So dependency structure is psychologically more plausible.
Dependency distance • How long must a word token stay active? • Till it’s linked as dependent to a ‘parent’. • What’s the cost of keeping it active? • The other tokens that are active at the same time. • I.e. cost of W = number of words between W and its parent. = dependency distance
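The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `dependency_distances`, the `heads` list encoding, and the parse given for the example sentence are assumptions, not part of the talk.

```python
from typing import List, Optional


def dependency_distances(heads: List[Optional[int]]) -> List[Optional[int]]:
    """Return, for each word, the number of words between it and its parent.

    heads[i] is the position of word i's parent, or None if word i is the
    root (a root has no parent, so its distance is undefined).
    """
    distances: List[Optional[int]] = []
    for position, head in enumerate(heads):
        if head is None:
            distances.append(None)
        else:
            # Adjacent words are 1 position apart and have 0 words between them.
            distances.append(abs(position - head) - 1)
    return distances


words = ["Dependency", "grammar", "is", "very", "ancient"]
# Assumed parse (hypothetical encoding): Dependency -> grammar, grammar -> is,
# very -> ancient, ancient -> is; "is" is the root.
heads = [1, 2, None, 4, 2]

dd = dependency_distances(heads)
for word, distance in zip(words, dd):
    print(word, "N/A" if distance is None else distance)
```

Under this assumed parse only “ancient” is separated from its parent (by “very”), so it is the only word with a distance above 0.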
An example • “Dependency grammar is very ancient.” • [Diagram: dependencies labelled p, a, s, a, with dependency distances 1, 0, 0, 0; N/A for ‘is’.]
Long subjects and dependency distance • “This is the dog that chased the cat that caught the rat that ate the cheese that lay in the house that Jack built.” max dd = 0 • “The dog that chased the cat that caught the rat that ate the cheese that lay in the house that Jack built is this one.” max dd = 21
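The figure of 21 for the long-subject version can be checked by counting words. The sketch below assumes that the head of the subject is the determiner “The” (an assumption about the analysis, not something stated on the slide); if the noun “dog” were taken as the head instead, the count would be one lower.

```python
sentence = (
    "The dog that chased the cat that caught the rat that ate the cheese "
    "that lay in the house that Jack built is this one."
)
words = sentence.rstrip(".").split()

verb = words.index("is")  # position of the main verb

# Assumption: the subject's head is the determiner "The" at position 0.
dd_from_determiner = verb - 0 - 1

# Alternative assumption: the subject's head is the noun "dog" at position 1.
dd_from_noun = verb - words.index("dog") - 1

print(dd_from_determiner, dd_from_noun)
```

Either way, the head of the subject has to stay active across roughly twenty intervening words before it can be linked to its verb.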
So what? (2) • Long subjects are expensive because their head competes for activation with all the other words between it and the verb. • Dependency distance measures this precisely. • Ed Gibson (MIT) has independently developed a similar measure.
Learning syntax • Dependency patterns can only be learned from active tokens. • Most words in casual speech have dd = 0. • 74.2% in the Penn Treebank • 63% for adults in CHILDES • only 1-4% have dd > 4. • Every English dependency allows dd = 0.
So what? (3) • Learning dependency patterns is easy. • Adjacent but non-dependent words are (by definition) random, and have no lasting effect. • Non-adjacent but dependent words don’t matter because the same patterns can always be learned from easier examples. • So most of syntax is easy to learn as data. • inducing generalizations is more tricky.
Typology • Why are SVO languages so common? • SOV = 45%, SVO = 35%, VSO = 10% (± 5%) • Each order has some benefits. • For SVO, it’s low dependency distance. • SVO: min dd = 0 • VSO: min dd = 1 • SOV: min dd = 1
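The three values can be reproduced with a toy calculation. The sketch assumes the simplest possible clause, in which S and O are single words that both depend directly on V; the helper name `max_dd` is hypothetical.

```python
def max_dd(order: str) -> int:
    """Largest dependency distance in a three-word clause.

    Assumes S and O are single words that each depend directly on V.
    """
    verb = order.index("V")
    return max(abs(order.index(dependent) - verb) - 1 for dependent in "SO")


results = {order: max_dd(order) for order in ("SVO", "VSO", "SOV")}
print(results)
```

Only SVO puts the two dependents on opposite sides of the verb, so only there are both adjacent to their parent; in VSO and SOV one dependent is always separated from the verb by the other.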
Moreover, … • noun: big book about linguistics • adjective: very happy to see you • preposition: just before Christmas
So what? (4) • One of the pressures on languages is to minimize dependency distances. • If words allow two dependents, dd is 0 if the dependents are on opposite sides. • This is possible in all English word classes, not just in verbs. • Maybe SVO is part of a more general pattern which reduces memory load. ‘consistently mixed’
Long subjects • Long subjects are hard to produce. • The head word may de-activate before the verb is produced, hence frequent non-agreement examples: “… the accuracy of the quotes have not been disputed.” [Diagram label: ‘nearest active N’] • Long subjects are also hard to understand.
Why is it-extraposition helpful? • [Diagram: dependency distances 10 and 1 marked on ‘that extraposed sentences are easier to process than their unextraposed equivalents’, ‘is clear’ and ‘It’.] • The extraposed version is more complex but easier.
Dependency structures for it-extraposition • “It’s clear that extraposed sentences are easier to process than their unextraposed equivalents.” max dd = 2 • “That extraposed sentences are easier to process than their unextraposed equivalents is clear.” max dd = 10
Other tactics to help memory • Extraposition from NP: “Two people who were on the pavement died” • ‘Heavy NP shift’: “yesterday I saw something that would have made even you laugh” • Topicalisation: “we sat down to rest and have a light snack when we got there” • [Diagram labels: 8, ‘anaphoric distance’, 8, 3, 13, 3]
Grammaticality and weight • These special strategies override normal rules. • But they’re only allowed for ‘heavy’ (or otherwise memory-heavy) structures. • *I rang up her. • I rang up the girl who ….. • So grammarians need a theory of memory.
Thank you • The theory is called Word Grammar: www.phon.ucl.ac.uk/home/dick/wg.htm • This slide show can be found at www.phon.ucl.ac.uk/home/dick/talks.htm
So what? (5) • English grammar has evolved to minimize demands on memory. • basic word order (consistently mixed) • special orders for overriding the basic order. • Grammaticality depends on memory load as well as on grammar.