Learn about the relationship between denotation and collocation in computational semantics, exploring assumptions, models, and empirical investigations. Discover the nuances of adjective-noun frequencies and grammaticality judgments.
Investigating adjective denotation and collocation • Ann Copestake • Computer Laboratory, University of Cambridge
Outline • introduction: compositional semantics, GL and semantic space models; denotation and collocation • distribution of `magnitude’ adjectives • hypotheses about adjective denotation and collocation • semi-productivity
Themes • semi-productivity: extending paper in GL 2001 to phrases • statistical and symbolic models interacting • generation as well as analysis • computational account
Different branches of computational semantics • compositional semantics: capture syntax, (some) closed-class words and (some) morphology • e.g., every x [ dog’(x) -> bark’(x)] • large coverage grammars as testbed for GL (constructions, composition, underspecification) • lexical semantics, e.g., • GL (interacts with compositional semantics) • WordNet • meaning postulates etc. • semantic space models, e.g., • LSA • Schütze (1995) • Lin (multiple papers), Pado and Lapata (2003)
semantic spaces • acquired from corpora • generally, collect vectors of words which co-occur with the target • more sophisticated models incorporate syntactic relationships
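The vector-collection step can be sketched as follows. This is a minimal illustration, not any of the cited models: the window size, the raw-count weighting and the names `cooccurrence_vectors` and `cosine` are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it.

    `sentences` is an iterable of token lists. The symmetric window and
    raw counts are illustrative choices, not taken from the talk.
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Two nouns that occur in the same contexts (for instance after the same adjective) end up with similar vectors, which is why such a space will also pick up collocational effects.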
Semantic space models and compositional semantics? • do spaces correspond to predicates in compositional semantics? e.g., bark’ • attractions • automatic acquisition • similarity metrics, priming • fuzziness, meaning variation, sense clustering • statistical approximation to real world knowledge? (but fallacy with parse selection techniques) • problems • classical lexical semantic relations (hyponymy etc) aren’t captured well • can’t do inference • sensitivity to domain/corpus • role of collocation?
Denotation: assumptions • Truth-conditional, logically formalisable (in principle), refers to `real world’ (extension) • Not necessarily decomposable: natural kinds (dog’ – canis familiaris), natural predicates • Naive physics, biology, etc • Computationally: specification of meaning that interfaces with non-linguistic components • Selectional restrictions? • bark’(x) -> dog’(x) or seal’(x) or ...
Collocation: assumptions • Significant co-occurrences of words in syntactically interesting relationships • `syntactically interesting’: for this talk, attributive adjectives and the nouns they immediately precede • `significant’: statistically significant (but on what assumptions about baseline?) • Compositional, no idiosyncratic syntax etc (as opposed to multiword expression) • About language rather than the real world
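One common way to make `significant’ concrete is pointwise mutual information (PMI) against an independence baseline. The sketch below is only one possible choice of baseline, which is exactly the assumption the slide questions; the function name `pmi_scores` and the input format (a list of adjective-noun pairs) are hypothetical.

```python
import math
from collections import Counter


def pmi_scores(pairs):
    """Return PMI (in bits) for each observed (adjective, noun) pair.

    Baseline assumption: adjective and noun are chosen independently,
    so the expected count is count(adj) * count(noun) / total.
    """
    pair_counts = Counter(pairs)
    adj_counts = Counter(adj for adj, _ in pairs)
    noun_counts = Counter(noun for _, noun in pairs)
    total = len(pairs)

    scores = {}
    for (adj, noun), observed in pair_counts.items():
        expected = adj_counts[adj] * noun_counts[noun] / total
        scores[(adj, noun)] = math.log2(observed / expected)
    return scores
```

A positive score means the pair is more frequent than the independence baseline predicts, and a negative score means it is less frequent. PMI is unreliable for very low counts, so in practice a frequency threshold or a different association measure may be preferable.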
Collocation versus denotation • Whether an unusually frequent word pair is a collocation or not depends on assumptions about denotation: fix denotation to investigate collocation • Empirically: investigations using WordNet synsets (Pearce, 2001) • Anti-collocation: words that might be expected to go together but tend not to • e.g., ? flawless behaviour (Cruse, 1986), ? big rain (unless explained by denotation) • e.g., buy house is predictable on basis of denotation, shake fist is not
Collocation and denotation investigations • can this notion of collocation be made precise, empirically testable? • assumptions about denotation determine whether something is a collocation • semantic space models will include collocational effects • initial, very preliminary, investigations with magnitude adjectives • attributive adjectives: can get corpus data without parsing • only one argument to consider
Distribution of `magnitude’ adjectives: summary • some very frequent adjectives have magnitude-related meanings (e.g., heavy, high, big, large) • basic meaning with simple concrete entities • extended meaning with abstract nouns, non-concrete physical entities (high taxation, heavy rain) • extended uses more common than basic • not all magnitude adjectives – e.g. tall • nouns tend to occur with a limited subset of these extended adjectives • some apparent semantic groupings of nouns which go with particular adjectives, but not easily specified
Distribution • Investigated the distribution of heavy, high, big, large, strong, great, major with the most common co-occurring nouns in the BNC • Nouns tend to occur with up to three of these adjectives with high frequency and with low or zero frequency with the rest • My intuitive grammaticality judgments correlate with these frequencies, but allow for some unseen combinations and disallow a few observed but very infrequent ones • big, major and great are grammatical with many nouns (but not frequent with most), strong and heavy are ungrammatical with most nouns, high and large are intermediate
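Because attributive adjectives immediately precede their nouns, counts of this kind can be gathered from part-of-speech tagged text without parsing. The following is a sketch under stated assumptions: the input is a flat list of (word, tag) pairs, the tags "ADJ" and "NOUN" are placeholders rather than the BNC's own tagset, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

MAGNITUDE_ADJECTIVES = {"heavy", "high", "big", "large", "strong", "great", "major"}


def adjective_noun_counts(tagged_tokens):
    """Count magnitude adjectives that immediately precede a noun.

    Returns a mapping noun -> Counter of adjectives. Tag names are
    placeholders; substitute whatever the tagged corpus actually uses.
    """
    counts = defaultdict(Counter)
    for (w1, t1), (w2, t2) in zip(tagged_tokens, tagged_tokens[1:]):
        adj, noun = w1.lower(), w2.lower()
        if t1 == "ADJ" and t2 == "NOUN" and adj in MAGNITUDE_ADJECTIVES:
            counts[noun][adj] += 1
    return counts


def frequent_adjectives(counts, noun, min_count=10):
    """Adjectives seen with `noun` at least `min_count` times, sorted by name."""
    seen = counts.get(noun, Counter())
    return sorted(adj for adj, c in seen.items() if c >= min_count)
```

Adjacency alone will misattribute the adjective in noun-noun compounds (the adjective is credited to the first noun), so the counts are an approximation.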
heavy: groupings? • magnitude: dew, rainstorm, downpour, rain, rainfall, snowfall, fall, snow, shower; frost, spindrift; clouds, mist, fog; flow, flooding, bleeding, period, traffic; demands, reliance, workload, responsibility, emphasis, dependence; irony, sarcasm, criticism; infestation, soiling; loss, price, cost, expenditure, taxation, fine, penalty, damages, investment; punishment, sentence; fire, bombardment, casualties, defeat, fighting; burden, load, weight, pressure; crop; advertising; use, drinking • magnitude of verb: drinker, smoker • magnitude related? odour, perfume, scent, smell, whiff; lunch; sea, surf, swell
high: groupings? • magnitude: esteem, status, regard, reputation, standing, calibre, value, priority; grade, quality, level; proportion, degree, incidence, frequency, number, prevalence, percentage; volume, speed, voltage, pressure, concentration, density, performance, temperature, energy, resolution, dose, wind; risk, cost, price, rate, inflation, tax, taxation, mortality, turnover, wage, income, productivity, unemployment, demand • magnitude of verb: earner
heavy and high • 50 nouns in BNC with the extended magnitude use of heavy with frequency 10 or more • 160 such nouns with high • Only 9 such nouns with both adjectives: price, pressure, investment, demand, rainfall, cost, costs, concentration, taxation
Basic adjective denotation • with simple concrete objects: • high’(x) => zdim(x) > norm(zdim, type(x), c) • heavy’(x) => wt(x) > norm(wt, type(x), c) • where zdim is distance on the vertical and wt is weight (measure functions, MF) • norm(MF, class, context) is some standard for MF for class in context • high’ also requires a selectional restriction: not animate
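The measure-function formulation can be made concrete with a toy sketch. Several things here are assumptions for illustration only: entities are plain dictionaries with hypothetical keys, the context is simply a list of entities, and norm is taken to be the mean over entities of the same type, whereas the slide only says it is "some standard". The slide states one-way implications; the code uses the right-hand sides as tests.

```python
from statistics import mean


def zdim(entity):
    """Measure function: extent on the vertical (hypothetical key)."""
    return entity["height_m"]


def wt(entity):
    """Measure function: weight (hypothetical key)."""
    return entity["weight_kg"]


def norm(measure, kind, context):
    """Standard for `measure` among entities of type `kind` in `context`.

    Assumption: the standard is the mean; any other contextual
    standard could be substituted.
    """
    return mean(measure(e) for e in context if e["type"] == kind)


def high(x, context):
    # Selectional restriction from the slide: high' does not apply to animates.
    return (not x["animate"]) and zdim(x) > norm(zdim, x["type"], context)


def heavy(x, context):
    return wt(x) > norm(wt, x["type"], context)
```

For example, a 3 m wall counts as high in a context where the walls are 1 m, 2 m and 3 m, since 3 exceeds the mean of 2, while a tall person is excluded by the animacy restriction regardless of height.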
Metaphor • Different metaphors for different nouns (cf. Lakoff et al.) • `high’ nouns measured with an upright scale: e.g., temperature: temperature is rising • `heavy’ nouns metaphorically like burden: e.g., workload: her workload is weighing on her • Empirical account of distribution? • predictability of noun classes? high volume? high and heavy taxation • adjective denotation for inference etc? via literal denotation? • Discussed again at end of talk
Possible empirical accounts of distribution • Difference in denotation between `extended’ uses of adjectives • Grammaticized selectional restrictions/preferences • Lexical selection • stipulate Magn function with nouns (Meaning-Text Theory) • Semi-productivity / collocation • plus semantic back-off
Computational semantics perspective • Require workable account of denotation: not too difficult to acquire, not over-specific • Require account of distribution for generation • Robustness and completeness • Can’t assume pragmatics / real world knowledge does the difficult bits!
Denotation account of distribution • Denotation of adjective simply prevents it being possible with the noun. • heavy and high have different denotations • heavy’(x) => MF(x) > norm(MF,type(x),c) & (precipitation(x) or cost(x) or flow(x) or consumption(x) or ...) (where rain(x) -> precipitation(x) and so on) • But: messy disjunction or multiple senses, open-ended, unlikely to be tractable. • e.g., heavy shower only for rain sense, not bathroom sense • Not falsifiable, but no motivation other than distribution. • Dictionary definitions can be seen as doing this (informally), but none account for observed distribution.
Selectional restrictions and distribution • Assume the adjectives have the same denotation • Distribution via features in the lexicon • e.g., literal high selects for [ANIMATE false ] • approach used in the LinGO ERG for in/on in temporal expressions • grammaticized, so doesn’t need to be determined by denotation (though assume consistency) • can utilise qualia structure • Problem: can’t find a reasonable set of cross-cutting features! • Stipulative approach possible, but unattractive.
Lexical selection • MTT approach • noun specifies its Magn adjective • in Mel’čuk and Polguère (1987), Magn is a function, but could modify to make it a set, or vary meanings • stipulative: if we’re going to do this, why not use a corpus directly?
Collocational account of distribution • all the adjectives share a denotation corresponding to magnitude (more details later), distribution differences due to collocation, soft rather than hard constraints • linguistically: • adjective-noun combination is semi-productive • denotation and syntax allow heavy esteem etc, but speakers are sensitive to frequencies, prefer more frequent phrases with same meaning • cf morphology and sense extension: Briscoe and Copestake (1999) • blocking (but weaker than with morphology) • anti-collocations as reflection of semi-productivity
Collocational account of distribution • computationally, fits with some current practice: • filter adjective-noun realisations according to n-grams (statistical generation – e.g., Langkilde and Knight) • use of co-occurrences in WSD • back-off techniques
Collocational vs denotational differences • [Figure not recoverable; surviving labels: heavy, high, low, "Denotation difference", "Collocation difference"]
Back-off and analogy • back-off: decision for infrequent noun with no corpus evidence for specific magnitude adjective • based on productivity of adjective: number of nouns it occurs with • default to big • back-off also sensitive to word clusters • e.g., heavy spindrift because spindrift is semantically similar to snow • semantic space models: i.e., group according to distribution with other words • hence, adjective has some correlation with semantics of the noun
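The back-off strategy on this slide can be sketched as a three-step choice. This is a hedged illustration, not a tested model (the talk itself notes that none of this is implemented yet): the count table, the `neighbours` mapping from a noun to semantically similar nouns (which could come from a semantic space model), and all function names are hypothetical.

```python
from collections import Counter


def most_productive(adjectives, bigram_counts):
    """Adjective occurring with the largest number of distinct nouns, or None."""
    types = Counter(adj for (adj, _noun), c in bigram_counts.items()
                    if c > 0 and adj in adjectives)
    return types.most_common(1)[0][0] if types else None


def choose_magnitude_adjective(noun, adjectives, bigram_counts,
                               neighbours=None, default="big"):
    """Pick a magnitude adjective for `noun`, backing off when evidence is missing."""
    # 1. Direct corpus evidence for this noun.
    direct = Counter({adj: bigram_counts.get((adj, noun), 0) for adj in adjectives})
    best, count = direct.most_common(1)[0]
    if count > 0:
        return best

    # 2. Back off to semantically similar nouns (e.g. spindrift -> snow).
    pooled = Counter()
    for other in (neighbours or {}).get(noun, []):
        for adj in adjectives:
            pooled[adj] += bigram_counts.get((adj, other), 0)
    if pooled and pooled.most_common(1)[0][1] > 0:
        return pooled.most_common(1)[0][0]

    # 3. Fall back on the most productive adjective; the slide names big.
    return most_productive(adjectives, bigram_counts) or default
```

With counts in which heavy snow is attested and spindrift is listed as a neighbour of snow, the unseen noun spindrift receives heavy, while a noun with no evidence and no neighbours receives whichever adjective combines with the most distinct nouns.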
Metaphor again • extended metaphor idea is consistent with idea that clusters for backoff are based on semantic space • words cluster according to how they co-occur • e.g., high words cluster with rise words? • but this doesn’t require that we interpret high literally and then coerce
More details: denotation of extended adjective uses • mass: e.g., rain, and some plural e.g., casualties • cf much, many • inherent measure: e.g., grade, percentage, fine • other: e.g., rainstorm, defeat, bombardment • attribute in qualia has Magn – heavy rainstorm equivalent to storm with heavy rain • also heavy drinker etc
More details • Different uses cross-cut adjective distinction and domain categories • Want to have single extended sense and some form of co-composition • Further complications: nouns with temporal duration • heavy rain – not the same as persistent rain • heavy fighting but heavy drinking • how much of this do we have to encode specifically?
Connotation • heavy often has negative connotations • heavy fine but not ? heavy reward etc • heavy taxation versus high taxation • consistent with the semantic cluster / extended metaphor idea
Necessary experiments • None of this is tested yet! • Specify denotation, check for accuracy • Implement semi-productivity model with back-off • Determine predictability of adjective based on noun alone • Extension to other adjectives? Magnitude adjectives may be more lexical than others.
Conclusions • Testing collocational account of distribution requires fixing denotation • Magnitude adjectives: assume same denotation • more complex denotations would need different experiments • Semi-productivity at the phrasal level • Back-off account is crucial
Some final comments • denotation, selectional restriction, collocation: choice between mechanisms? • n-grams for language models for speech recognition • variants of semantic space models that are less sensitive to collocation effects? • can we `remove’ collocation?