Generation of Referring Expressions (GRE) The Incremental Algorithm (IA) Dale & Reiter (1995)
The task: GRE • NLG can have different kinds of inputs: • ‘Flat’ data (collections of atoms, e.g., in the tables of a database) • Logically complex data • In both cases, unfamiliar constants may be used, and this is sometimes unavoidable
No familiar constant available: 1. The referent has a familiar name, but it's not unique, e.g., 'John Smith' 2. The referent has no familiar name: trains, furniture, trees, atomic particles, … (In such cases, databases use database keys, e.g., 'Smith$73527$', 'TRAIN-3821') 3. Similar: sets of objects.
Natural languages are too economical to have a proper name for everything • Names may not even be the most appropriate choice • So speakers/NLG systems have to invent ways of referring to things, e.g., 'the 7:38 Trenton express'
Older work on GRE • Winograd (1972) – the SHRDLU system, and especially • Appelt (1985) – the KAMP system: trying to understand reference as part of speech acts: • How can REs sometimes add information? • Why can RE1 be more relevant than RE2? Dale and Reiter isolate GRE as a separate task, and focus on simple cases
Dale & Reiter: best description fulfils Gricean maxims. • (Quality:) list properties truthfully • (Quantity:) list sufficient properties to allow hearer to identify referent – but not more • (Relevance:) use properties that are of interest in themselves * • (Manner:) be brief * Slightly different from D&R 1995
D&R’s expectation: • Violation of a maxim leads to implicatures. • For example, • [Quantity] ‘the pitbull’ (when there is only one dog). • There’s just one problem: …
…people don’t speak this way. For example: • [Quantity] ‘the red chair’ (when there is only one red object in the domain). • [Quantity] ‘I broke my arm’ (when I have two). In general, empirical work shows much redundancy. Similar for other maxims, e.g., • [Quality] ‘the man with the martini’ (Donnellan)
Consider the following formalization: Full Brevity (FB): Never use more than the minimal number of properties required for identification (Dale 1989) An algorithm:
Dale 1989: • Check whether 1 property is enough • Check whether 2 properties are enough • … etc., until success {a minimal description is generated} or failure {no description is possible}
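To make the search concrete, here is a minimal Python sketch of a Full Brevity search; the data structures and property names are illustrative assumptions, not Dale's original implementation.

```python
# Illustrative Full Brevity search: try combinations of size 1, then 2, then 3, ...
# Properties are modelled as a dict from property name to its extension (set of objects).
from itertools import combinations

def full_brevity(target, domain, properties):
    """Return a shortest identifying combination of properties, or None if none exists."""
    candidates = [p for p, ext in properties.items() if target in ext]
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            referents = set(domain)
            for p in combo:                      # intersect the chosen extensions
                referents &= properties[p]
            if referents == {target}:            # the combination singles out the target
                return list(combo)
    return None                                  # failure: no distinguishing description

# Hypothetical mini-domain:
props = {'desk': {'a', 'b'}, 'chair': {'c', 'd', 'e'}, 'Swedish': {'a', 'c'}}
print(full_brevity('a', {'a', 'b', 'c', 'd', 'e'}, props))   # -> ['desk', 'Swedish']
```

The loop over all combinations of candidate properties is exactly where the exponential blow-up discussed next comes from.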
Problem: exponential complexity • In the worst case, this algorithm would have to inspect all combinations of properties: n properties ⇒ 2^n combinations. Some algorithms may be faster, but … • Theoretical result: any FB algorithm must be exponential in the number of properties.
D&R conclude that Full Brevity cannot be achieved in practice. • They designed an algorithm that only approximates Full Brevity: the Incremental Algorithm (IA).
Psycholinguistic inspiration behind IA (e.g., Pechmann 1989; overview in Levelt 1989) • Speakers often include “unnecessary modifiers” in their referring expressions • Speakers often start describing a referent before they have seen all distractors (as shown by eye-tracking experiments) • Some Attributes (e.g., Colour) seem more likely to be noticed and used than others • Some Attributes (e.g., Type) contribute strongly to a Gestalt. Gestalts help readers identify referents. (“The red thing” vs. “the red bird”) Let’s start with a simplified version of IA, which uses properties rather than <Attribute:Value> pairs. Type and head nouns are ignored, for now.
Incremental Algorithm (informal): • Properties are considered in a fixed order: P = <p1, …, pn> • A property is included if it is ‘useful’: true of the target; false of some distractors • Stop when done; so earlier properties have a greater chance of being included (e.g., a perceptually salient property) • The order is therefore called the preference order.
r = individual to be described • P = list of properties, in preference order • p = a property in P • L = properties in the generated description (Recall: we’re not worried about realization today)
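A minimal Python sketch of this simplified IA, assuming each property is given together with its extension (the set of objects it is true of); the representation is illustrative.

```python
def incremental_algorithm(r, P, domain):
    """Simplified IA. P: list of (property, extension) pairs, in preference order."""
    L = []                                    # properties selected so far
    distractors = set(domain) - {r}
    for prop, ext in P:
        if r in ext and (distractors - ext):  # true of r, and rules out >= 1 distractor
            L.append(prop)
            distractors &= ext                # only distractors sharing the property survive
            if not distractors:
                return L                      # success: r is uniquely identified
    return None                               # failure: distractors remain after all of P

# Illustrative call, anticipating the furniture example on the next slides:
P = [('desk', {'a', 'b'}), ('chair', {'c', 'd', 'e'}),
     ('Swedish', {'a', 'c'}), ('Italian', {'b', 'd', 'e'})]
print(incremental_algorithm('a', P, {'a', 'b', 'c', 'd', 'e'}))   # -> ['desk', 'Swedish']
```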
P = <desk (ab), chair (cde), Swedish (ac), Italian (bde), dark (ade), light (bc), grey (a), 100£ (ac), 150£ (bd), 250£ ({}), wooden ({}), metal (abcde), cotton (d)> Domain = {a,b,c,d,e}. Now describe: a = <...> d = <...> e = <...>
P = <desk (ab), chair (cde), Swedish (ac), Italian (bde), dark (ade), light (bc), grey (a), 100£ (ac), 150£ (bd), 250£ ({}), wooden ({}), metal (abcde), cotton (d)> Domain = {a,b,c,d,e}. Now describe: a = <desk (ab), Swedish (ac)> d = <chair, Italian, 150£> (Non-minimal) e = <chair, Italian, ...> (Impossible)
Incremental Algorithm • It’s a hill-climbing algorithm: ever-better approximations of a successful description. • ‘Incremental’ implies no backtracking. • It does not always use the minimal number of properties.
Incremental Algorithm • Logical completeness: a unique description is found in finite time if one exists • Question: is the IA logically complete? • Computational complexity: assume that testing for usefulness takes constant time. Then worst-case time complexity is O(n_p), where n_p is the number of properties in P.
Better approximation of Full Brevity (D&R 1995) • Attribute + Value model: properties grouped together as in the original example: Origin: Sweden, Italy, ... Colour: dark, grey, ... • Optimization within the set of Values of the same Attribute
Incremental Algorithm, using Attributes and Values • r = individual to be described • A = list of Attributes, in preference order • Def: Vi,j = Value j of Attribute i • L = properties in the generated description
FindBestValue(r, A): - Find the Values of A that are true of r, while removing some distractors (if these don’t exist, go to the next Attribute) - Within this set, select the Value that removes the largest number of distractors (NB: discriminatory power) - If there’s a tie, select the most general one - If there’s still a tie, select an arbitrary one
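A hedged Python sketch of FindBestValue and the Attribute/Value version of the IA. Since this sketch has no subsumption hierarchy, "most general" is approximated by the size of a Value's extension; that simplification is an assumption of the sketch, not D&R's definition.

```python
def find_best_value(r, values, distractors):
    """values: list of (value_name, extension) pairs for a single Attribute."""
    useful = [(v, ext) for v, ext in values if r in ext and (distractors - ext)]
    if not useful:
        return None                           # go on to the next Attribute
    # Most distractors removed first; ties broken by generality (larger extension here).
    useful.sort(key=lambda ve: (-len(distractors - ve[1]), -len(ve[1])))
    return useful[0]

def incremental_algorithm_av(r, attributes, domain):
    """attributes: list of (attribute_name, [(value_name, extension), ...]), in preference order."""
    L = []
    distractors = set(domain) - {r}
    for _attr, values in attributes:
        best = find_best_value(r, values, distractors)
        if best is not None:
            L.append(best[0])
            distractors &= best[1]
            if not distractors:
                return L
    return None
```

On the furniture example that follows, this sketch reproduces {desk, American} for a and {desk, European} for b.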
Example: D = {a,b,c,d,f,g} • Type: furniture (abcd), desk (ab), chair (cd) • Origin: Europe (bdfg), USA (ac), Italy (bd) Describe a: {desk, American} (furniture removes fewer distractors than desk) Describe b: {desk, European} (European is more general than Italian) N.B. This disregards relevance, etc.
This is a better approximation of Full Brevity. But is it a better algorithm? • Question 1: Is it true that all Values of an Attribute are (roughly) equally preferred? • If the colour of a car is pink, this is more notable than if it’s white • Question 2: Doesn’t the new algorithm sometimes fail unnecessarily?
About question 2 • Exercise: Construct an example where no description is found, although one exists. • Hint: Let Attribute have Values whose extensions overlap.
Example: D = {a,b,c,d,e,f} • Contains: wood (abe), plastic (acdf) • Colour: grey (ab), yellow (cd) Describe a: {wood, grey, ...} - Failure (wood removes more distractors than plastic) Compare: Describe a: {plastic, grey} - Success
Conclusion • The version of IA that uses <Attribute, Value> format allows the use of simple ontological information (e.g., Italian ⊂ European) • But grouping properties into Attributes makes it difficult to model the “unusualness” of a property • And the idea of using discriminatory power leads to logical incompleteness. • The IA is therefore (?) often used in its simpler form, without the <Attribute, Value> format
Complexity of the algorithm n_d = nr. of distractors, n_l = nr. of properties in the description, n_v = nr. of Values (over all Attributes). According to D&R: O(n_d · n_l) (typical running time). Alternative assessment: O(n_v) (worst-case running time).
Minor complication: Head nouns • Another way in which human descriptions are nonminimal • A description needs a Noun, but not all properties are expressed as Nouns • Example: Suppose Colour was the most-preferred Attribute, and suppose target = a
Colours: dark (ade), light (bc), grey (a) • Type: furniture (abcde), desk (ab), chair (cde) • Origin: Sweden (ac), Italy (bde) • Price: 100 (ac), 150 (bd), 250 ({}) • Contains: wood ({}), metal (abcde), cotton (d) target = a Describe a: {grey} ‘The grey’? (Not in English)
D&R’s repair: • Assume that Values of the Attribute Type can be expressed in a Noun. • After the core algorithm: - check whether Type is represented. - if not, then add the best Value of the Type Attribute to the description
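A small sketch of that post-check, assuming L and the Type Attribute use the same (value, extension) format as the earlier sketches. Picking the most specific Type value true of r is a stand-in of my own for D&R's "best Value" (they would prefer the basic-level value).

```python
def add_head_noun(L, r, type_values):
    """Post-check: make sure some Value of the Type Attribute appears in L."""
    type_names = {v for v, _ext in type_values}
    if not type_names & set(L):                          # no Type value was selected
        true_of_r = [(v, ext) for v, ext in type_values if r in ext]
        if true_of_r:
            # Stand-in for "best Value": the most specific one (smallest extension).
            L.append(min(true_of_r, key=lambda ve: len(ve[1]))[0])
    return L
```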
Versions of Dale and Reiter’s Incremental Algorithm (IA) have often been implemented • Still the starting point for many new algorithms. • But how human-like is the output of the IA really? The paper does not contain an evaluation of the algorithms discussed
Comments on the algorithm • Redundancy exists, but not for principled reasons, e.g., for - marking topic changes, etc. (corpus work by Pam Jordan et al.) - making it easy to find the referent (experimental work by Paraboni et al.)
Limitations of the algorithm • Targets are individual objects, never sets. What changes when target = {a,b,c} ? • Incremental algorithm uses only conjunctions of atomic properties. No negations, disjunctions, etc.
Limitations of D&R • No relations with other objects, e.g., ‘the orange on the table’. • Differences in salience are not taken into account. • When we say “the dog”, does this mean that there is only one dog in the world? • Language realization is disregarded.
Limitations of D&R • Calculation of complexity is iffy • Role of “typical” run time and length of description is unclear • Greedy Algorithm (GA) dismissed even though it has polynomial complexity • GA: always choose the property that removes the maximum number of distractors (sketched below)
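For comparison, a sketch of the greedy heuristic in the same illustrative format as the earlier sketches; it runs in polynomial time, but is not guaranteed to find a minimal description.

```python
def greedy_heuristic(r, properties, domain):
    """Repeatedly add the property of r that removes the most remaining distractors."""
    L = []
    distractors = set(domain) - {r}
    candidates = {p: ext for p, ext in properties.items() if r in ext}
    while distractors:
        best = max(candidates, key=lambda p: len(distractors - candidates[p]), default=None)
        if best is None or not (distractors - candidates[best]):
            return None                       # nothing removes a distractor: failure
        L.append(best)
        distractors &= candidates[best]
        del candidates[best]
    return L
```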
More fundamental features • Speaker and Hearer have shared knowledge • This knowledge can be formalised using atomic statements • Foundations were left unformalised, e.g. • Closed-World Assumption • Unique Name Assumption • The aim of GRE is to identify the target referent uniquely. (I.e., the aim is to construct a “distinguishing description” of the referent.)
Discussion: How bad is it for a GRE algorithm to take exponential time choosing the best RE? • How do human speakers cope? • More complex types of referring expressions ⇒ the problem becomes even harder • Restrict to combinations whose length is less than x ⇒ the problem is no longer exponential. • Example: descriptions containing at most n properties (Full Brevity)
Linguist’s view • We don’t pretend to mirror psychologically correct processes. (It’s enough if GRE output is correct). • So why worry if our algorithms are slow?
Mathematicians’ view • The structure of a problem becomes clear when no restrictions are imposed. Practical addition: • What if the input does not conform to these restrictions? (GRE does not control its own input!)
A compromise view • Compare with Description Logic: - increasingly complex algorithms … - that tackle larger and larger fragments of logic … - and whose complexity is ‘conservative’ • When looking at more complex phenomena, take care not to slow down generation of simple cases too much
A note on the history of the IA • Appelt (1985) did not focus on distinguishing descriptions • did not describe an algorithm in detail • suggested attempting properties one by one • cited the Gricean maxims • suggested that the shortest description may not always be the best one