Generation of Referring Expressions (GRE) Reading: Dale & Reiter (1995) (key paper in this area)
The task: GRE • NLG can have different kinds of inputs: • ‘Flat’ data (collections of atoms, e.g., in the tables of a database) • Logically complex data • In both cases, unfamiliar constants may be used, and this is sometimes unavoidable
No familiar constant available: 1. The referent has a familiar name, but it’s not unique, e.g., ‘John Smith’. 2. The referent has no familiar name: trains, furniture, trees, atomic particles, … (In such cases, databases use database keys, e.g., ‘Smith$73527$’, ‘TRAIN-3821’.) 3. Similar: sets of objects (lecture 4).
Natural languages are too economical to have a proper name for everything • Names may not even be the most appropriate choice • So, speakers/NLG systems have to invent ways of referring to things. E.g., ‘the 7:38 Trenton express’ • Note: the problem arises whether the referent is a token or a type
GRE tries to find the best description • GRE is a microcosm of NLG: e.g., it determines • which properties to express (Content Determination) • which syntactic configuration to use (Syntactic Realization) • which words to choose (Lexical Choice)
This lecture: • Simplification 1: Content Determination only (until lecture 5). • Simplification 2: Definite descriptions only (pronouns, demonstratives, etc., are disregarded; until tomorrow)
Dale & Reiter (1995): best description fulfills the Gricean maxims. E.g., • (Quality:) list properties truthfully • (Quantity:) list sufficient properties to allow hearer to identify referent – but not more • (Relevance:) use properties that are of interest in themselves * • (Manner:) be brief * Slightly different from D&R 1995
D&R’s expectation: • Violation of a maxim leads to implicatures. • For example, • [Quantity] ‘the pitbull’ (when there is only one dog). • [Manner] ‘Get the cordless drill that’s in the toolbox’ (Appelt). • There’s just one problem: …
…people don’t speak this way. For example, • [Manner] ‘the red chair’ (when there is only one red object in the domain). • [Manner/Quantity] ‘I broke my arm’ (when I have two). In general, empirical work shows much redundancy. Similar for other maxims, e.g., • [Quality] ‘the man with the martini’ (Donnellan)
Example situation [Figure: five pieces of furniture, labelled Swedish or Italian, with prices: a, £100; b, £150; c, £100; d, £150; e, £?]
Formalized in a KB • Type: furniture (abcde), desk (ab), chair (cde) • Origin: Sweden (ac), Italy (bde) • Colours: dark (ade), light (bc), grey (a) • Price: 100 (ac), 150 (bd), 250 ({}) • Contains: wood ({}), metal (abcde), cotton (d) Assumption: all this is shared knowledge.
Violations of … • Manner: * ‘The £100 grey Swedish desk which is made of metal’ (description of a) • Relevance: ‘The cotton chair is a fire hazard.’ ? ‘Then why not buy the Swedish chair?’ (descriptions of d and c respectively)
In fact, there is a second problem with Manner. Consider the following formalization: Full Brevity: Never use more than the minimal number of properties required for identification (Dale 1989) An algorithm:
Dale 1989: • Check whether 1 property is enough • Check whether 2 properties are enough • … etc., until success (a minimal description is generated) or failure (no description is possible)
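The search just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Dale's own implementation: the names `KB`, `DOMAIN` and `full_brevity` are hypothetical, the knowledge base is the furniture example from the earlier slide, and combinations of equal size are tried in the order in which the properties happen to be listed.

```python
from itertools import combinations

# Knowledge base from the example: property -> set of objects it is true of.
# (Names and layout are illustrative assumptions, not from Dale 1989.)
KB = {
    "furniture": set("abcde"), "desk": set("ab"), "chair": set("cde"),
    "Swedish": set("ac"), "Italian": set("bde"),
    "dark": set("ade"), "light": set("bc"), "grey": set("a"),
    "100": set("ac"), "150": set("bd"), "250": set(),
    "wood": set(), "metal": set("abcde"), "cotton": set("d"),
}
DOMAIN = set("abcde")


def full_brevity(target, kb=KB, domain=DOMAIN):
    """Return a smallest tuple of properties that singles out target, or None."""
    # Only properties that are true of the target can appear in a description.
    true_of_target = [p for p, ext in kb.items() if target in ext]
    # Try 1 property, then 2 properties, etc., until success or failure.
    for size in range(1, len(true_of_target) + 1):
        for combo in combinations(true_of_target, size):
            referents = set(domain)
            for prop in combo:
                referents &= kb[prop]
            if referents == {target}:
                return combo
    return None


print(full_brevity("a"))  # ('grey',)
print(full_brevity("c"))  # ('chair', 'Swedish')
print(full_brevity("e"))  # None: e cannot be distinguished from d
```

The outer loop is what makes the result minimal: no combination of size k is tried before every smaller combination has failed.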
Problem: exponential complexity • In the worst case, this algorithm has to inspect all combinations of properties: n properties give 2^n combinations. • Recall: one grain of rice on square one; twice as many on each subsequent square. • Some algorithms may be faster, but … • Theoretical result: finding a minimal description is NP-hard (D&R 1995), so no known algorithm avoids a worst-case running time that is exponential in the number of properties.
D&R conclude that Full Brevity cannot be achieved in practice. • They designed an algorithm that only approximates Full Brevity: the Incremental Algorithm.
Incremental Algorithm (informal): • Properties are considered in a fixed order: P = <P1, P2, …, Pn> • A property is included if it is ‘useful’: true of target; false of some distractors • Stop when done; so earlier properties have a greater chance of being included. (E.g., a perceptually salient property) • Therefore called preference order.
• r = individual to be described • P = list of properties, in preference order • P_i is a property in P • L = properties in generated description (Recall: we’re not worried about realization today)
P = <furniture (abcde), desk (ab), chair (cde), Swedish (ac), Italian (bde), dark (ade), light (bc), grey (a), £100 (ac), £150 (bd), £250 ({}), wooden ({}), metal (abcde), cotton (d)> Domain = {a,b,c,d,e}. Now describe: a = <...> d = <...> e = <...>
P = <furniture (abcde), desk (ab), chair (cde), Swedish (ac), Italian (bde), dark (ade), light (bc), grey (a), £100 (ac), £150 (bd), £250 ({}), wooden ({}), metal (abcde), cotton (d)> Domain = {a,b,c,d,e}. Now describe: a = <desk, Swedish> d = <chair, Italian, £150> (Nonminimal) e = <chair, Italian, ...> (Impossible)
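The exercise can be checked with a minimal Python sketch of the Incremental Algorithm. The names `P`, `DOMAIN` and `incremental` are illustrative assumptions. ‘Useful’ is read here as in D&R (1995): a property must be true of the target and false of at least one distractor that has not yet been removed. On that reading ‘dark’ is skipped for d and e, because once ‘chair’ and ‘Italian’ have been chosen the only remaining distractor is itself dark.

```python
# Properties in preference order, each with its extension
# (the set of objects it is true of). Names are illustrative.
P = [
    ("furniture", set("abcde")), ("desk", set("ab")), ("chair", set("cde")),
    ("Swedish", set("ac")), ("Italian", set("bde")),
    ("dark", set("ade")), ("light", set("bc")), ("grey", set("a")),
    ("100", set("ac")), ("150", set("bd")), ("250", set()),
    ("wooden", set()), ("metal", set("abcde")), ("cotton", set("d")),
]
DOMAIN = set("abcde")


def incremental(r, properties=P, domain=DOMAIN):
    """Return (L, success): the chosen properties and whether r is singled out."""
    distractors = set(domain) - {r}
    L = []
    for name, extension in properties:
        if not distractors:
            break  # stop when done
        # Useful = true of r and false of at least one remaining distractor.
        if r in extension and distractors - extension:
            L.append(name)
            distractors &= extension  # no backtracking: L only ever grows
    return L, not distractors


print(incremental("a"))  # (['desk', 'Swedish'], True)
print(incremental("d"))  # (['chair', 'Italian', '150'], True)
print(incremental("e"))  # (['chair', 'Italian'], False)
```

For d the result has three properties although ‘cotton’ alone would do, which is the nonminimality the exercise is meant to show.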
[ An aside: shared info will be assumed to be complete and uncontroversial. Consider • Speaker: [[Student]] = {a,b,…} • Hearer: [[Student]] = {a,c,…} Does this make a referable? ]
Incremental Algorithm • It’s a hillclimbing algorithm: ever better approximations of a successful description. • ‘Incremental’ means no backtracking. • Not always the minimal number of properties.
Incremental Algorithm • Logical completeness: A unique description is found in finite time if one exists. (Given reasonable assumptions, see van Deemter 2002) • Computational complexity: Assume that testing for usefulness takes constant time. Then worst-case time complexity is O(n_p), where n_p is the number of properties in P.
Better approximation of Full Brevity (D&R 1995) • Attribute + Value model: Properties grouped together as in original example: Origin: Sweden, Italy, ... Colour: dark, grey, ... • Optimization within the set of properties based on the same Attribute
Incremental Algorithm, using Attributes and Values • r = individual to be described • A = list of Attributes, in preference order • Def: V_{i,j} = Value i of Attribute j • L = properties in generated description
FindBestValue(r,A): - Find Values of A that are true of r, while removing some distractors (If these don’t exist, go to next Attribute) - Within this set, select the Value that removes the largest number of distractors - If there’s a tie, select the most general one - If there’s still a tie, select an arbitrary one
Example: D = {a,b,c,d,f,g} • Type: furniture (abcd), desk (ab), chair (cd) • Origin: Europe (bdfg), USA (ac), Italy (bd) Describe a: {desk, American} (furniture removes fewer distractors than desk) Describe b: {desk, European} (European is more general than Italian) N.B. This disregards relevance, etc.
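The two descriptions in this example can be reproduced with a short Python sketch of the Attribute + Value version. The names `A`, `D`, `find_best_value` and `describe` are illustrative assumptions, and ‘most general’ is approximated by the size of a Value's extension, which is a simplification rather than part of D&R's definition.

```python
# Attributes in preference order; each maps its Values to their extensions.
A = [
    ("Type", {"furniture": set("abcd"), "desk": set("ab"), "chair": set("cd")}),
    ("Origin", {"Europe": set("bdfg"), "USA": set("ac"), "Italy": set("bd")}),
]
D = set("abcdfg")


def find_best_value(r, values, distractors):
    """Pick the Value true of r that removes most distractors; None if none helps."""
    best = None
    for name, ext in values.items():
        removed = len(distractors - ext)
        if r not in ext or removed == 0:
            continue
        # Tie-break on generality, approximated here by extension size (assumption).
        key = (removed, len(ext))
        # Strict '>' keeps the first Value on a full tie, i.e. an arbitrary choice.
        if best is None or key > best[0]:
            best = (key, name, ext)
    return best


def describe(r, attributes=A, domain=D):
    """Return (L, success) using at most one Value per Attribute."""
    distractors = set(domain) - {r}
    L = []
    for _attr, values in attributes:
        if not distractors:
            break
        best = find_best_value(r, values, distractors)
        if best is not None:
            _key, name, ext = best
            L.append(name)
            distractors &= ext
    return L, not distractors


print(describe("a"))  # (['desk', 'USA'], True)
print(describe("b"))  # (['desk', 'Europe'], True)
```

For b, ‘Europe’ and ‘Italy’ each remove the one remaining distractor, so the tie-break on generality decides in favour of ‘Europe’.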
P.S. Note the similarity with Van Rooy & Dekker’s semantics of answers: Let A and B be truthful answers to a question, then A is a better answer than B iff Utility(A) > Utility(B), or Utility(A) = Utility(B) and B ⊂ A. (More about this in the next lecture …)
Exercise on Logical Completeness: Construct an example where no description is found, although one exists. • Hint: Let an Attribute have Values whose extensions overlap.
Example: D = {a,b,c,d,e,f} • Contains: wood (abe), plastic (acdf) • Colour: grey (ab), yellow (cd) Describe a: {wood, grey, ...} - Failure (wood removes more distractors than plastic) Compare: Describe a: {plastic, grey} - Success
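A small Python check makes the failure concrete. It is a sketch under assumptions: the names `A`, `D`, `greedy_describe` and `referents` are hypothetical, and the domain is taken to include e, since ‘wood’ is listed as true of e.

```python
A = [
    ("Contains", {"wood": set("abe"), "plastic": set("acdf")}),
    ("Colour", {"grey": set("ab"), "yellow": set("cd")}),
]
D = set("abcdef")  # assumption: e belongs to the domain, since wood is true of e


def greedy_describe(r, attributes=A, domain=D):
    """One Value per Attribute, chosen greedily by number of distractors removed."""
    distractors = set(domain) - {r}
    chosen = []
    for _attr, values in attributes:
        # Values that are true of r and remove at least one remaining distractor.
        useful = {v: ext for v, ext in values.items()
                  if r in ext and distractors - ext}
        if not useful:
            continue
        best = max(useful, key=lambda v: len(distractors - useful[v]))
        chosen.append(best)
        distractors &= useful[best]
    return chosen, not distractors


def referents(names, attributes=A, domain=D):
    """Objects in the domain of which all the named Values are true."""
    lookup = {v: ext for _attr, values in attributes for v, ext in values.items()}
    result = set(domain)
    for name in names:
        result &= lookup[name]
    return result


print(greedy_describe("a"))             # (['wood', 'grey'], False)
print(referents(["wood", "grey"]))      # a and b remain: not distinguishing
print(referents(["plastic", "grey"]))   # {'a'}: a distinguishing description exists
```

The greedy choice of ‘wood’ cannot be undone, so the algorithm never reaches the description that combines ‘plastic’ with ‘grey’.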
Complexity of the algorithm • n_d = number of distractors • n_l = number of properties in the description • n_v = number of Values (for all Attributes) • Alternative assessment: O(n_v) (worst-case running time) • According to D&R: O(n_d n_l) (typical running time)
Minor complication: Head nouns • Another way in which human descriptions are nonminimal • A description needs a Noun, but not all properties are expressed as Nouns • Example: Suppose Colour was the most-preferred Attribute, and target = a
Colours: dark (ade), light (bc), grey (a) • Type: furniture (abcde), desk (ab), chair (cde) • Origin: Sweden (ac), Italy (bde) • Price: 100 (ac), 150 (bd), 250 ({}) • Contains: wood ({}), metal (abcde), cotton (d) target = a Describe a: {grey} ‘The grey’? (Not in English)
D&R’s repair: • Assume that Values of the Attribute Type can be expressed in a Noun. • After the core algorithm: - check whether Type is represented. - if not, then add the best Value of the Type Attribute to the description
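The repair step can be sketched as a post-processing function in Python. The names `TYPE_VALUES` and `add_head_noun` are hypothetical, and the slide does not say how the ‘best’ Value is chosen once no distractors remain; here it is read as the most specific Type Value true of the target (smallest extension), which is an assumption made only for this illustration.

```python
# Values of the Type Attribute with their extensions (from the example KB).
TYPE_VALUES = {"furniture": set("abcde"), "desk": set("ab"), "chair": set("cde")}


def add_head_noun(r, description, type_values=TYPE_VALUES):
    """Append a Type Value to the description if it does not contain one yet."""
    if any(prop in type_values for prop in description):
        return list(description)  # Type is already represented
    candidates = [v for v, ext in type_values.items() if r in ext]
    if not candidates:
        return list(description)  # no Type Value is true of r
    # 'Best' is read as the most specific Value true of r (assumption).
    best = min(candidates, key=lambda v: len(type_values[v]))
    return list(description) + [best]


print(add_head_noun("a", ["grey"]))             # ['grey', 'desk']
print(add_head_noun("a", ["desk", "Swedish"]))  # ['desk', 'Swedish']
```

With the repair, {grey} becomes {grey, desk}, which can be realized as ‘the grey desk’.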
Versions of Dale and Reiter’s Incremental Algorithm have often been implemented • Still the starting point for many new algorithms. (See later lectures.) • Worth reading!
Limitations of the algorithm • Redundancy does not arise for principled reasons, e.g., for - marking topic changes, etc. (Corpus work by Pam Jordan et al.) - making it easy to find the referent (Experimental work by Paraboni et al. - Next lecture)
Limitations of the algorithm • Targets are individual objects, never sets. What changes when target = {a,b,c}? (Lecture 4) • Incremental algorithm uses only conjunctions of atomic properties. No negations, disjunctions, etc. (Lecture 4)
Limitations of the algorithm • No relations with other objects, e.g., ‘the orange on the table’. (Lecture 3) • Differences in salience are not taken into account. (Lecture 3) • Language realization is disregarded. (Lecture 5)
Discussion: How bad is it for a GRE algorithm to take exponential time? • More complex types of referring expressions ⇒ the problem becomes even harder • Restrict to combinations whose length is less than x ⇒ the problem is not exponential. • Example: descriptions containing at most n properties (Full Brevity)
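The effect of a length restriction can be made concrete by counting candidate descriptions. The sketch below uses hypothetical function names: it compares the number of non-empty property sets of size at most k with the number of all non-empty subsets. For a fixed k the first count grows polynomially in n (roughly as n^k), while the second grows as 2^n.

```python
from math import comb


def bounded_combinations(n, k):
    """Number of non-empty property sets of size at most k drawn from n properties."""
    return sum(comb(n, i) for i in range(1, min(k, n) + 1))


def all_combinations(n):
    """Number of non-empty subsets of n properties."""
    return 2 ** n - 1


for n in (10, 20, 40):
    print(n, bounded_combinations(n, 3), all_combinations(n))
```

With 20 properties and descriptions of at most 3 properties there are 1350 candidates to check, against 1048575 without the restriction.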
However: • “Mathematicians’ view”: the structure of a problem shows when no restrictions are imposed. • What if the input does not conform with these restrictions? (GRE does not control its own input!)
Compare with Description Logic: - Increasingly complex algorithms … - that tackle larger and larger fragments of logic … - and whose complexity is ‘conservative’ • Question: how do human speakers cope?