Mapping meaning onto use: a Pattern Dictionary of English Verbs. Patrick Hanks, Faculty of Informatics, Masaryk University, Brno, Czech Republic (hanks@fi.muni.cz). University of Wolverhampton, August 4 2008.
Outline of the talk • What is a pattern dictionary? • Distinguishing norms from exploitations • Pattern dictionary and pattern grammar • Measuring collocations • Pattern dictionary and FrameNet • Conclusions
What is a pattern dictionary? • A semantically motivated inventory of word uses and their meanings. • Shows comparative frequency of each pattern of a polysemous word. • Meanings are associated with patterns, not with words. • The colligational preferences of a word are part of its patterns. • Driven by Corpus Pattern Analysis (CPA).
Patterns, not senses • Meanings taken from WordNet or a dictionary do not yield reliable data for disambiguating senses (Ide and Wilks 2005). • WordNet lists synonym sets and other semantic relations – but not senses. • WordNet did not do contrastive analysis of word senses. • In standard dictionaries, word senses are not mutually exclusive. • There is much fuzzy overlap between senses – which may be OK for sophisticated human users, but not for learners or computers. • The patterns of all and only the normal uses of a lexical item are (normally) mutually exclusive. • However, teasing them out from corpus data is hard.
Norms and exploitations • A pattern dictionary aims to record all and only the normal uses of each word. • Exploitation of norms is a subject for separate analysis. • Types of ‘exploitation’ include creative metaphor, ellipsis, and anomalous arguments. Consider: • The goat ate the newspaper. • The verb eat has a preference for nouns of semantic type [[Food]] in the direct object clause role. • ‘[[Animate]] eat [[Document]]’ is not a normal pattern of English. • Compare John devoured the newspaper. • ‘[[Human]] devour [[Document]]’ is a normal pattern of English. It is a conventional metaphor.
Specifically, ... The Pattern Dictionary of English Verbs • will list all normal patterns of each verb lemma in the BNC, • providing a benchmark for identification of norms in other corpora: • by time period: patterns in historical corpora, future corpora. • by region: e.g. patterns in American English. • by domain: e.g. ‘[[Human]] abate [[Problem = Nuisance]]’ is a domain-specific norm in legal jargon; • abate is normally intransitive in all normal, non-legal uses.
A typical pattern dictionary entry
• irritate
PATTERN 1 (90%): [[Anything]] irritate [[Human]]
IMPLICATURE: [[Anything]] causes [[Human]] to feel mildly annoyed.
PATTERN 2 (8%): [[Stuff]] irritate [[Body Part]]
IMPLICATURE: [[Stuff]] causes [[Body Part]] to become inflamed and somewhat painful.
• Notes: Both patterns are transitive (V n), but they have different meanings. They are distinguished by the semantic types of the nouns. Getting the right level of semantic generalization for each n is hard.
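The irritate entry can be sketched as a small data structure. This is only an illustration, not the format used by the Pattern Dictionary itself: the class name `Pattern`, the helper `match_pattern`, and the toy noun lexicon `TOY_TYPES` are all hypothetical, and the matcher looks only at the semantic type of the direct object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    number: int
    share: float       # comparative frequency from the entry, as a fraction
    subject: str       # semantic type of the subject
    verb: str
    obj: str           # semantic type of the direct object
    implicature: str


# The two patterns shown in the entry for "irritate".
IRRITATE = [
    Pattern(1, 0.90, "Anything", "irritate", "Human",
            "[[Anything]] causes [[Human]] to feel mildly annoyed."),
    Pattern(2, 0.08, "Stuff", "irritate", "Body Part",
            "[[Stuff]] causes [[Body Part]] to become inflamed and somewhat painful."),
]

# Hypothetical toy lexicon (noun -> semantic type); not taken from the talk.
TOY_TYPES = {
    "teacher": "Human",
    "neighbour": "Human",
    "skin": "Body Part",
    "eyes": "Body Part",
}


def match_pattern(patterns, object_noun, types=TOY_TYPES):
    """Return the first pattern whose object type matches the noun's type, or None."""
    noun_type = types.get(object_noun)
    for pattern in patterns:
        if pattern.obj == noun_type:
            return pattern
    return None


if __name__ == "__main__":
    for noun in ("teacher", "skin", "table"):
        hit = match_pattern(IRRITATE, noun)
        print(noun, "->", hit.implicature if hit else "no normal pattern found")
```

A fuller matcher would also check the subject type ([[Stuff]] versus [[Anything]]) and would need a real ontology in place of the toy lexicon.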
Semantic type vs. contextual role • Mr Woods sentenced Bailey to {death | five years | life imprisonment}. • [[Human 1]] sentence [[Human 2]] to [[Event]] • Semantic type: [[Human]] • Semantic roles: [[Human 1 = Judge]], [[Human 2 = Convicted Criminal]], [[Event (= Time Period) = Punishment]] • Semantic type is an intrinsic property of a lexical item. • Contextual role is assigned by the verb (and/or other elements in the context).
Nouns and verbs • The apparatus required for analysing nouns is different from that required for predicators (verbs, adjectives, prepositions). • Nouns are grouped into lexical sets in relation to the predicators that they normally colligate with. • Typically, the lexical sets are united by a semantic type. • A shallow ontology of nouns (grouped by their semantic type) is therefore part of the apparatus of a pattern dictionary. • Semantic typing in real texts is more complex than might be expected from invented examples. • Lexical sets include alternations, parts, and attributes of types. • Example: calm (next slide)
What would an empirically well-founded ontology be like? • It would have to take account of verb-specific alternations, parts, and attributes of semantic types. • For example, Pattern 2 (of 8) for calm, verb, is: [[Human 1 | Event]] calm [[Human 2]] • Alternation of Human (2 – direct object): [[Animal]] • Parts of Human (2 – direct object): nerves • Attributes: fear, anxiety, agitation, .... [[Emotion]]
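One way to picture a lexical set with alternations, parts, and attributes is the sketch below, built on the direct object slot of calm, Pattern 2. The class `LexicalSet` and its method `relation` are hypothetical names chosen for illustration, and the semantic types passed in for individual nouns are assumptions, not output of any real ontology.

```python
from dataclasses import dataclass, field


@dataclass
class LexicalSet:
    semantic_type: str
    alternations: set = field(default_factory=set)  # alternative semantic types
    parts: set = field(default_factory=set)         # nouns naming parts of the type
    attributes: set = field(default_factory=set)    # nouns naming attributes of the type

    def relation(self, noun, noun_type=None):
        """Say how a noun relates to this lexical set, or return None."""
        if noun_type == self.semantic_type:
            return "type"
        if noun_type in self.alternations:
            return "alternation"
        if noun in self.parts:
            return "part"
        if noun in self.attributes:
            return "attribute"
        return None


# Direct object of calm, Pattern 2, using only the items listed on the slide.
CALM_OBJECT = LexicalSet(
    semantic_type="Human",
    alternations={"Animal"},
    parts={"nerves"},
    attributes={"fear", "anxiety", "agitation"},
)

if __name__ == "__main__":
    print(CALM_OBJECT.relation("child", "Human"))    # type
    print(CALM_OBJECT.relation("horse", "Animal"))   # alternation
    print(CALM_OBJECT.relation("nerves"))            # part
    print(CALM_OBJECT.relation("fear"))              # attribute
```

The point of the design is that the set is specific to one argument slot of one verb pattern, so the same semantic type can carry different alternations, parts, and attributes elsewhere.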
Pattern Grammar • Hunston and Francis (2000): Pattern Grammar: a corpus-driven approach to the lexical grammar of English • “One of the most important observations in a corpus-driven description of English is that patterns and meanings are connected.” • PG is founded on real texts and is a real attempt at empirically valid generalizations.
How is a Pattern Dictionary different from Pattern Grammar? • Pattern Grammar seeks similarities – words with similar meanings grouped together according to syntactic similarity. • By contrast, a pattern dictionary seeks systematic differences: • In particular, the differences in pattern that pick out different meanings of a polysemous word.
Apparatus • The Pattern Grammar has an admirably simple apparatus. • target word; part of speech categories; word order; and certain function words (mainly prepositions). • Simplicity can be overdone. • To represent the distinctive features of meaning in use, we also need at least: • Systematic analysis and categorization of colligations • Lexical items grouped by semantic type • Valencies – a.k.a. clause roles (S P O C A will do)
execute Example sentence: Private Joseph Byers was the first Kitchener volunteer to be executed. In the Pattern Dictionary (but not in PG) semantic types distinguish this sense from other “V n” patterns of the same verb, e.g. ‘execute an order’.
enlist Example sentence: He was 17 and under age when he enlisted in the 1st Royal Scots Fusiliers. Pattern Grammar and Pattern Dictionary agree in contrasting this sense with other patterns such as “[[Human]] enlist [[Assistance]]” (V n).
go Example sentence: His inexperience and the horrors he witnessed caused him to go absent without leave. This is a light verb (“delexical verb” in Sinclair’s terminology), with many patterns. The “adj” in PG is a Subject Complement. The small lexical set, {absent | AWOL}, in the Pattern Dictionary activates a particular meaning of go, contrasting with other patterns of go having a Subject Complement, e.g. go {mad | bananas}.
plead Example sentence: Byers pleaded guilty. The adj in this pattern is an Object Complement. The Object Complement is populated by a lexical set of just two possible (normal) items. (“plead innocent” is plausible but not idiomatic.)
fire Example sentence: … the firing squad had fired wide to avoid killing the youth. The “adj” in this sentence has the clause role of Adjunct or (in my terminology) Adverbial of Direction, as in: The police fired into the crowd. They fired over their heads.
Collocations (1) • Hoey (2005), quoting Partington (1998), distinguishes textual, statistical, and psychological definitions of ‘collocation’. • The statistical definition: “The relationship a lexical item has with items that appear with greater than random probability in its textual context” (Hoey 1991). • That’s exactly right. • But it does not say how “greater than random probability” is to be measured. • Later: “The [statistical] definition says nothing interesting about the phenomenon.” (Hoey 2005). • This is exactly wrong. • To understand how word meaning works, measuring the statistics of collocation is both essential and interesting.
How to Measure Collocations? Various statistical tools are available, e.g.: • Mutual information (“MI”; Church and Hanks 1990): tends to favor content words as collocates. • t-score: tends to favor function words as collocates. • Log likelihood ratios (e.g. Dunning 1993): results in collocational analysis are intuitively unsatisfactory. • Word Sketch Engine (Kilgarriff, Rychlý, et al., 2004): measures salience scores for pre-determined colligational patterns; uses Pointwise Mutual Information. Take your pick – but it must be done, one way or the other.
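The contrast between mutual information and t-score can be made concrete with a small calculation. The sketch below uses the formulations common in corpus linguistics, PMI = log2(f_xy · N / (f_x · f_y)) and t = (f_xy − f_x · f_y / N) / √f_xy; whether these are exactly the variants intended in the talk is an assumption, and all the counts are invented for illustration.

```python
import math


def pmi(f_xy, f_x, f_y, n):
    """Pointwise mutual information (base 2) from raw corpus counts."""
    return math.log2(f_xy * n / (f_x * f_y))


def t_score(f_xy, f_x, f_y, n):
    """t-score: observed minus expected co-occurrences, scaled by sqrt(observed)."""
    expected = f_x * f_y / n
    return (f_xy - expected) / math.sqrt(f_xy)


# Invented counts: a corpus of 1,000,000 tokens and a node word occurring 1,000 times.
N = 1_000_000
NODE = 1_000

# A rare content-word collocate versus a frequent function-word collocate.
rare = {"f_xy": 20, "f_y": 50}
frequent = {"f_xy": 200, "f_y": 50_000}

if __name__ == "__main__":
    for label, c in (("rare content word", rare), ("frequent function word", frequent)):
        print(label,
              "PMI = %.2f" % pmi(c["f_xy"], NODE, c["f_y"], N),
              "t = %.2f" % t_score(c["f_xy"], NODE, c["f_y"], N))
```

With these invented counts, PMI ranks the rare content word higher while the t-score ranks the frequent function word higher, which is the tendency the slide describes.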
Pattern Dictionary and FrameNet CPA investigates syntagmatic criteria for distinguishing different meanings of polysemous words, in a “semantically shallow” way. FrameNet: • expresses the deep semantics of situations (frames); • proceeds frame by frame, not word by word; • analyses situations in terms of frame elements; • studies meaning differences and similarities between different words in a frame; • does not explicitly study meaning differences of polysemous words; • does not analyse corpus data systematically, but goes fishing in corpora for examples in support of hypotheses; • has problems grouping words into frames, and misses some; • has no established inventory of frames; • has no criteria for completeness of a lexical entry.
Construction Grammar (1) • Focus on meaning, not just well-formedness. • Challenges reductionist theories of language • Meaning is associated with constructions. • Anything from a word to a clause can be a construction. • Example: ‘she slept her way to the top.’ • Sleep is not normally a goal-achievement verb. • But in this sentence, it is coerced into being one by the construction “[V] one’s way to [[Status]]”. • This meaning is not arrived at by a concatenation of the meanings of the lexical items of which the sentence is composed.
Construction Grammar (2) • So far so good – but Construction Grammar is in the speculative tradition. It is not based on analysis of evidence. • It is based largely on made-up examples, many of which are bizarre, e.g. The gardener watered the flowers flat. • Corpus evidence shows that the verb water does not normally participate in the resultative construction. • A distinction between normal usage and exploitation of norms is needed. • Abnormal examples are conducive to distortions in the theory. • CG needs corpus analysis. • Some sort of synthesis between PG and CG is desirable.
Theoretical consequences and practical applications (1) Pedagogical: • Anyone acquiring a language must learn competence in two kinds of rule-governed linguistic behaviour: • How to use words normally • How to exploit the norms (creative metaphors, ellipsis, etc.) • A pattern dictionary gives comparative frequency of patterns. • A lexical syllabus could select only primary norms. • “Primary norms” are a) high-frequency norms and b) concrete norms. • In error analysis: what norm was aimed at? • If learners are exploiting norms creatively, do you (the teacher) really want them to?
Theoretical consequences and practical applications (2) For theoretical linguistics: • Are some grammars better than others for representing how words are used to make meanings? ‘S → NP VP’: a confusion of language with logic? • The third argument (‘adjunct’, ‘adverbial’): • CPA shows that a new grammar of adverbials is needed. • Metaphor analysis: • CPA distinguishes conventional metaphors from exploitations. • Ontologies: • The relationship between a possible ontology of words in use and scientific conceptual ontologies such as WordNet.
Theoretical consequences and practical applications (3) • For computational linguistics and AI: • Improving machine translation • Getting the pattern right is more likely to select the right translation. • Parsing and word-class tagging: • CLAWS achieves ~90% accuracy in word-class tagging in BNC • CPA reveals some systematic errors in CLAWS tagging. • Anaphora resolution: • He found a glass of water on the table and drank it. • ‘[[Animate]] drink [[Liquid]]’ selects water, rather than glass or table, as the antecedent of it (the direct object of drank).
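The anaphora example can be sketched as a filter over candidate antecedents, using the object type required by the pattern ‘[[Animate]] drink [[Liquid]]’. This is a toy illustration under stated assumptions: the function `resolve_pronoun` and the lexicon `TOY_TYPES` are hypothetical, and the semantic types assigned to glass and table are guesses made for the example.

```python
# Hypothetical toy lexicon (noun -> semantic type); not taken from the talk.
TOY_TYPES = {
    "glass": "Container",
    "water": "Liquid",
    "table": "Furniture",
}


def resolve_pronoun(candidates, required_type, types):
    """Keep the candidate antecedents whose semantic type fits the verb's object slot."""
    return [noun for noun in candidates if types.get(noun) == required_type]


# "He found a glass of water on the table and drank it."
# The pattern for drink requires a direct object of type Liquid.
antecedents = resolve_pronoun(["glass", "water", "table"], "Liquid", TOY_TYPES)

if __name__ == "__main__":
    print(antecedents)
```

A real system would combine such a filter with syntactic and discourse cues, since more than one candidate may satisfy the type constraint.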
Conclusions • Goal: to work out how people use words to make meanings. • Building a framework for this purpose requires: • Clause role analysis of corpus data • Statistical analysis of colligations in a corpus. The result: An inventory of mutually exclusive patterns of usage of polysemous verbs, with a meaning (or “implicature”) attached to each pattern. A “shimmering ontology” of lexical sets of nouns: • Each lexical set consists not only of a semantic type, but also of: • a) prototypical members, b) canonical members, c) ad-hoc members, d) parts, and e) attributes.
Thanks • To you, for listening, • To the late John Sinclair and the (still extant) James Pustejovsky, who have inspired this approach, • To Karel Pala, Pavel Rychlý, Adam Rambousek, and Adam Kilgarriff, who have created tools that make this kind of analysis possible, • and to the Academy of Sciences of the Czech Republic (project T100300419) and the Czech Ministry of Education (National Research Program II project 2C06009), for, in part, funding the research.