Defining Comparison in a Computational Linguistics Framework Maria Milosavljevic Intelligent Interactive Technologies Group, CSIRO Mathematical and Information Sciences North Ryde, NSW http://www.cmis.csiro.au/Maria.Milosavljevic/
Overview • The Basic Ideas • Language Technology • Some Definitions • An Ontology of Comparisons • Comparison in Context • Conclusions and Future Directions
The Basic Idea • Learning is incremental… we augment existing knowledge with new knowledge in order to maximise the extent to which the new knowledge coheres with our existing knowledge
The User’s Knowledge • Teaching about an entity should capitalise on the user’s existing knowledge, in order to: • maximise the coherence of the hearer’s conception of that entity • prevent the hearer from forming misconceptions about that entity • Most NLG systems utilise a model of the user’s knowledge to prevent repetition • It should also be used to build on her existing understanding
Natural Language Technology (diagram) • Natural Language Analysis: Text to ‘Meaning’ • Natural Language Generation: ‘Meaning’ to Text
Objectives of Text Generation • Reduce information overload by constructing appropriate presentations on-demand • Tailor text to the individual’s knowledge, needs, abilities, situation, language, previous interactions, etc. • Decrease document construction and maintenance costs: texts are updated as underlying knowledge changes
Some Definitions • A property p of an entity is an ordered pair <a, v> consisting of an attribute a and its corresponding value v, for example <colour, red>. • A focused entity is the topic of a text, or the entity being discussed in a text.
Some Definitions • A proposition is a predication of a property to an entity, or the relationship which holds between two entities (for example, <part-of, mouth-piece, clarinet>). • A description of a focused entity is defined as the linguistic realisation of a set of one or more propositions, the purpose of which is to allow the hearer to build a mental model of the focused entity.
Definition: Comparative proposition • A comparative proposition is a proposition which states the existence of a difference or a similarity between two entities. For example, the comparative proposition below states that there is a difference between the entities dromedary camel and bactrian camel. Note that the attributes match (number-of-humps). This is important in order to draw similarities and differences together. (difference (hasprop dromedary-camel (number-of-humps 1)), (hasprop bactrian-camel (number-of-humps 2)))
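A minimal Python sketch of the comparative proposition defined above. The names HasProp, ComparativeProposition, compare and render are illustrative assumptions and not part of the original framework; only the rendered notation follows the slide. The sketch enforces the requirement that the two attributes match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HasProp:
    """Predication of a property <attribute, value> to an entity."""
    entity: str
    attribute: str
    value: object


@dataclass(frozen=True)
class ComparativeProposition:
    """States a difference or a similarity between two entities."""
    kind: str  # "difference" or "similarity"
    left: HasProp
    right: HasProp


def compare(left: HasProp, right: HasProp) -> ComparativeProposition:
    # The attributes must match, otherwise the two values are not comparable.
    if left.attribute != right.attribute:
        raise ValueError("attributes must match to be compared")
    kind = "similarity" if left.value == right.value else "difference"
    return ComparativeProposition(kind, left, right)


def render(cp: ComparativeProposition) -> str:
    # Reproduce the bracketed notation used on the slide.
    def hp(p: HasProp) -> str:
        return f"(hasprop {p.entity} ({p.attribute} {p.value}))"
    return f"({cp.kind} {hp(cp.left)}, {hp(cp.right)})"


humps = compare(
    HasProp("dromedary-camel", "number-of-humps", 1),
    HasProp("bactrian-camel", "number-of-humps", 2),
)
print(render(humps))
```

Deriving the kind from the two values is one possible design choice; a system could equally store the kind explicitly.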
Definition: Comparison • A comparative clause is the linguistic realisation of a comparative proposition. • A comparison is a set of one or more comparative propositions which together express the differences and/or similarities between two entities. • A comparative text is the linguistic realisation of a comparison. • For convenience, we will also use the term comparison to mean comparative text. • A comparator entity is the entity which is being compared to the focused entity within a comparative text.
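As a hedged illustration of a comparison as a set of comparative propositions, the sketch below pairs up the attributes that a focused entity and a comparator entity share. The function name build_comparison and the attribute names are hypothetical; the alligator and crocodile values are taken from the generated example text shown in this talk.

```python
def build_comparison(focused_props, comparator_props):
    """Return (kind, attribute, focused value, comparator value) tuples
    for every attribute the two entities have in common."""
    comparison = []
    shared = focused_props.keys() & comparator_props.keys()
    for attribute in sorted(shared):
        focused_value = focused_props[attribute]
        comparator_value = comparator_props[attribute]
        kind = "similarity" if focused_value == comparator_value else "difference"
        comparison.append((kind, attribute, focused_value, comparator_value))
    return comparison


# Focused entity and comparator entity (attribute names are assumptions).
alligator = {"family": "Crocodylidae", "snout": "broad, flat, rounded", "length_m": 3.75}
crocodile = {"family": "Crocodylidae", "snout": "narrow", "length_m": 5.25}

comparison = build_comparison(alligator, crocodile)
for proposition in comparison:
    print(proposition)
```

Attributes that only one of the two entities has are skipped, since no comparative proposition can be formed without a matching attribute.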
Definition: Uni/Bi-Focal • A uni-focal comparison is a comparative text which has one primary focused entity. “Hearing aids have the same basic components as any public-address system, but all the components are miniature and the amplified sound is delivered to the ear of the hearing-aid user only.” • A bi-focal comparison is a comparative text which has two equally-important foci.
Bi-focal Comparison Rabbits and Hares, common name for certain small, furry mammals with long ears and short tails. Although the names rabbit and hare are often used interchangeably, in zoological classification the species called rabbits are characterised by the helplessness of their offspring, which are born naked and with closed eyes, and by their gregarious habit of living in colonies in underground burrows. (The exception is the cottontail of North America, which does not dig burrows; its nest is on the surface, usually in dense vegetation, and it is not social.) Species designated zoologically as hares are born furred and with open eyes, and the adults merely construct a simple nest and rarely live socially. Furthermore, the hare is generally larger than the rabbit and has longer ears with characteristic black markings. Moreover, the skulls of rabbits and hares are distinctly different. ... (Encarta Encyclopedia)
Definition: Multi-Focal • An n-focal comparison is a comparative text which has n equally-important foci. We will use the term multi-focal comparison to refer to a comparison with more than two foci. “The buffeo, the smallest dolphin, is less than 1.2 m (less than 4 ft) long; the largest, the bottle-nosed dolphin, reaches a length of 3 m (10 ft). The killer whale is considered a dolphin despite its much greater length of 9 m (30 ft).”
Ontology of Comparisons (diagram) • Comparative text • Whole text > Bi-focal > Direct comparison • Partial text • Labels: user-initiated, system-initiated
Direct Comparison 1 Rabbits and Hares, common name for certain small, furry mammals with long ears and short tails. Although the names rabbit and hare are often used interchangeably, in zoological classification the species called rabbits are characterised by the helplessness of their offspring, which are born naked and with closed eyes, and by their gregarious habit of living in colonies in underground burrows. (The exception is the cottontail of North America, which does not dig burrows; its nest is on the surface, usually in dense vegetation, and it is not social.) Species designated zoologically as hares are born furred and with open eyes, and the adults merely construct a simple nest and rarely live socially. Furthermore, the hare is generally larger than the rabbit and has longer ears with characteristic black markings. Moreover, the skulls of rabbits and hares are distinctly different. ... (Encarta Encyclopedia)
Direct Comparison 2 Microsoft Carpoint Example: Comparison between a BMW and an Audi
Choose a model         1998 BMW 3-Series          1998 Audi A4
Price Range            $21,390 - $41,500          $23,790 - $30,040
Airbags                Driver, Passenger, Side    Driver, Passenger, Side
Choose a trim          318i                       2.8
Base Price (MSRP)      $26,150                    $28,390
Base Invoice           $22,930                    $24,944
Destination Charge     $570                       $500
Driver Airbag          Standard                   Standard
Passenger Airbag       Standard                   Standard
...
Definition: Direct • A direct comparison is a bi-focal comparative text which exists as an entire text, and whose purpose is to: (i) highlight that the two foci exist and are highly similar; (ii) describe the foci; and (iii) distinguish the two foci.
Ontology of Comparisons (diagram) • Comparative text • Whole text > Bi-focal > Direct comparison • Partial text > Multi-focal > Significant type comparison • Labels: user-initiated, system-initiated • Annotation: Objective: distinguish
Significant Type Comparison In colder climates, ground squirrels commonly hibernate; tree squirrels do not. (Encarta Encyclopedia) The feathers of the male bird may be different in appearance from those of the female bird of the same species. (Encarta Encyclopedia)
Definition: Significant Type • The significant types of an entity are the partitionings of that entity into groups or parts of some kind. For example, an animal class can be partitioned into groups such as: male and female; captive and free; and young and adult, or into sub-parts such as head, body and tail, and so on. • A significant type comparison is a multi-focal comparative text which is used within a description of a focused entity in order to: (i) inform the reader of the presence of some or all of the significant types of the focused entity; and (ii) provide the most relevant distinction(s) between these significant types.
Ontology of Comparisons (diagram) • Comparative text • Whole text > Bi-focal > Direct comparison • Partial text > Multi-focal > Significant type comparison • Partial text > Uni-focal > Domain-based (Comparator: potential confusor; Objective: 1. misconception prevention 2. express uniqueness) • Labels: user-initiated, system-initiated • Annotation: Objective: distinguish
Definition: Domain-based • A domain-based comparison is a uni-focal comparative text which occurs within a description of a focused entity, and which draws the hearer's attention to another similar entity within the domain in order to: (i) exemplify the uniqueness or non-uniqueness of the focused entity; and (ii) prevent the hearer from forming misconceptions about the similarity or otherwise of the two entities.
Ontology of Comparisons (diagram) • Comparative text • Whole text > Bi-focal > Direct comparison • Partial text > Multi-focal > Significant type comparison • Partial text > Uni-focal > Domain-based (Comparator: potential confusor; Objective: 1. misconception prevention 2. express uniqueness) > Set Complement comparison • Labels: user-initiated, system-initiated • Annotation: Objective: distinguish
Set Complement Comparison ... the claws are short and lack the sheath that covers retracted claws in other cat species. (Grolier Encyclopedia) Being a cylindrical pipe stopped at one end, the clarinet overblows to the interval of a 12th above the fundamental pitch (unlike flutes and oboes, which overblow to the octave). (Encarta Encyclopedia)
Definition: Set Complement • A contrast set is any form of grouping to which the focused entity belongs, such as its parent class or supertype in a generalisation hierarchy. • A set complement comparison is a domain-based comparison, between a focused entity and its complement in a contrast set to which it belongs. • NOTE: In order to determine the uniqueness of the focused entity in the contrast set, we need to compare the focused entity to its complement in the contrast set, since the focused entity is not different to itself.
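The note above can be sketched in Python as follows: the focused entity is compared against every other member of its contrast set, and a property counts as unique when no member of the complement shares it. The function names and the toy knowledge base are assumptions for illustration; the overblowing values echo the clarinet example, and the "family" attribute is added only to show a shared property.

```python
def set_complement(contrast_set, focused):
    """All members of the contrast set except the focused entity."""
    return [entity for entity in contrast_set if entity != focused]


def unique_properties(focused, contrast_set, properties):
    """Properties of the focused entity that no entity in its
    complement shares (same attribute with the same value)."""
    others = set_complement(contrast_set, focused)
    unique = {}
    for attribute, value in properties[focused].items():
        if all(properties[other].get(attribute) != value for other in others):
            unique[attribute] = value
    return unique


# Toy knowledge base (attribute names are hypothetical).
properties = {
    "clarinet": {"family": "woodwind", "overblows-to": "12th"},
    "flute": {"family": "woodwind", "overblows-to": "octave"},
    "oboe": {"family": "woodwind", "overblows-to": "octave"},
}
contrast_set = ["clarinet", "flute", "oboe"]

clarinet_unique = unique_properties("clarinet", contrast_set, properties)
print(clarinet_unique)
```

Excluding the focused entity from the comparison is what makes the uniqueness test meaningful, since every property trivially matches itself.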
Ontology of Comparisons (diagram) • Comparative text • Whole text > Bi-focal > Direct comparison • Partial text > Multi-focal > Significant type comparison • Partial text > Uni-focal > Domain-based (Comparator: potential confusor; Objective: 1. misconception prevention 2. express uniqueness) > Set Complement comparison, Clarificatory comparison • Labels: user-initiated, system-initiated • Annotation: Objective: distinguish
Clarificatory Comparison Track bikes are similar in appearance and construction to road racing bicycles, except that they lack brakes, have no variable gear mechanism, and weigh about 7 to 9 kg (about 15 to 20 lbs). Mountain bikes are built to withstand the rigorous conditions of off-road riding. Although their frames are commonly constructed of the same materials as other racing bikes, they have sturdier tubing. (Encarta Encyclopedia) Sheep are hollow-horned ruminants belonging to the genus Ovis, suborder Ruminata, family Bovidae. Similar to goats, sheep differ in their stockier bodies, the presence of scent glands in face and hind feet, and the absence of beards in the males. Domesticated sheep are also more timid and prefer to flock and follow a leader. (Grolier Encyclopedia)
Definition: Clarificatory • A potential confusor of a focused entity is an entity which is highly similar to the focused entity, and which the hearer might confuse for the focused entity. • A clarificatory comparison is a domain-based comparison, between a focused entity and its potential confusor. The purpose of the comparison is to distinguish the focused entity clearly from the potential confusor, thus preventing the hearer from forming misconceptions about the similarity (or otherwise) of the entities.
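One way a system might look for potential confusors is to rank other entities by how many properties they share with the focused entity. The sketch below uses a Jaccard overlap of <attribute, value> pairs with a hypothetical threshold; both the measure and the toy data are assumptions, not the method of the original work, and property overlap is only a proxy for what a particular hearer might actually confuse.

```python
def similarity(props_a, props_b):
    """Jaccard overlap of the <attribute, value> pairs of two entities."""
    pairs_a, pairs_b = set(props_a.items()), set(props_b.items())
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 0.0


def potential_confusors(focused, kb, threshold=0.5):
    """Entities at least `threshold` similar to the focused entity,
    most similar first."""
    scored = [
        (similarity(kb[focused], props), name)
        for name, props in kb.items()
        if name != focused
    ]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


def distinguishing_attributes(props_a, props_b):
    """Shared attributes on which the two entities differ."""
    shared = props_a.keys() & props_b.keys()
    return sorted(a for a in shared if props_a[a] != props_b[a])


# Toy knowledge base (attribute names and the horse entry are illustrative).
kb = {
    "sheep": {"family": "Bovidae", "ruminant": True, "horns": "hollow", "beard-in-males": False},
    "goat": {"family": "Bovidae", "ruminant": True, "horns": "hollow", "beard-in-males": True},
    "horse": {"family": "Equidae", "ruminant": False, "horns": "none", "beard-in-males": False},
}

confusors = potential_confusors("sheep", kb)
print(confusors, distinguishing_attributes(kb["sheep"], kb["goat"]))
```

A clarificatory comparison would then realise the distinguishing attributes, as the sheep and goat passage does with beards.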
Ontology of Comparisons (diagram) • Comparative text • Whole text > Bi-focal > Direct comparison • Partial text > Multi-focal > Significant type comparison • Partial text > Uni-focal > Domain-based (Comparator: potential confusor; Objective: 1. misconception prevention 2. express uniqueness) > Set Complement comparison, Clarificatory comparison • Partial text > Uni-focal > Familiarity-based (Comparator: known; Objective: better understanding) • Labels: user-initiated, system-initiated • Annotation: Objective: distinguish
Definition: Familiarity-based • A familiarity-based comparison is a uni-focal comparative text which occurs within a description of a focused entity, and which draws the hearer's attention to the similarities and/or differences between the focused entity and another entity with which the hearer is familiar, in order to allow the hearer to form a conceptual model of the focused entity more easily.
Ontology of Comparisons (diagram) • Comparative text • Whole text > Bi-focal > Direct comparison • Partial text > Multi-focal > Significant type comparison • Partial text > Uni-focal > Domain-based (Comparator: potential confusor; Objective: 1. misconception prevention 2. express uniqueness) > Set Complement comparison, Clarificatory comparison • Partial text > Uni-focal > Familiarity-based (Comparator: known; Objective: better understanding) > Like-entity comparison • Labels: user-initiated, system-initiated • Annotation: Objective: distinguish
Like-entity Comparison Sheep are hollow-horned ruminants belonging to the genus Ovis, suborder Ruminata, family Bovidae. Similar to goats, sheep differ in their stockier bodies, the presence of scent glands in face and hind feet, and the absence of beards in the males. Domesticated sheep are also more timid and prefer to flock and follow a leader. (Grolier Encyclopedia) All spiders are alike in some ways. Spiders have eight legs. Their bodies have two parts. Some people think that spiders are insects. But insects have six legs, and their bodies have three parts. Spiders and insects are two different kinds of animals. (National Geographic Encyclopedia K-2)
Definition: Like-entity • A like-entity comparison is a familiarity-based comparison between the focused entity and a highly similar comparator entity.
Ontology of Comparisons (diagram) • Comparative text • Whole text > Bi-focal > Direct comparison • Partial text > Multi-focal > Significant type comparison • Partial text > Uni-focal > Domain-based (Comparator: potential confusor; Objective: 1. misconception prevention 2. express uniqueness) > Set Complement comparison, Clarificatory comparison • Partial text > Uni-focal > Familiarity-based (Comparator: known; Objective: better understanding) > Like-entity comparison, Illustrative comparison • Labels: user-initiated, system-initiated • Annotation: Objective: distinguish
Illustrative Comparison Tachyglossus aculeatus, found in many habitats across Australia and Tasmania, is 35 to 53 cm long and has spines like a hedgehog's. (Encyclopedia Britannica) They are about the size of a large cat and have long, bushy tails, a shaggy brown coat, and large ears. (Aye-aye, Encarta Encyclopedia) Slightly larger than chinchillas, the mountain viscachas have long, rabbitlike ears and a long squirrel-like tail. (Encarta Encyclopedia)
Definition: Illustrative • An illustrative comparison is a familiarity-based comparison whose purpose is to enhance the hearer's understanding of an attribute of the focused entity, by gauging the value for that attribute against the value of the same attribute for another entity which the hearer is familiar with.
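A rough Python sketch of how an illustrative comparison might be chosen: among the entities the hearer is assumed to know, pick the one whose value for the attribute is closest to the focused entity's value, then gauge the two values against each other. The function names, the 10% tolerance, and all numeric lengths are hypothetical and chosen only for illustration.

```python
def illustrative_comparator(value, familiar):
    """Name of the familiar entity whose attribute value is closest
    to the focused entity's value."""
    return min(familiar, key=lambda name: abs(familiar[name] - value))


def gauge(value, reference, tolerance=0.1):
    """Describe `value` relative to `reference`; values within the
    relative tolerance count as roughly equal."""
    if abs(value - reference) <= tolerance * reference:
        return "about the same as"
    return "greater than" if value > reference else "less than"


# Hypothetical body lengths in cm for entities in the user model.
familiar_lengths = {"mouse": 9, "large cat": 46, "horse": 240}
focused_length = 44  # hypothetical length of the focused entity

comparator = illustrative_comparator(focused_length, familiar_lengths)
relation = gauge(focused_length, familiar_lengths[comparator])
print(comparator, relation)
```

A realiser could map the returned relation onto phrasing such as "about the size of a large cat", as in the aye-aye example.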
Ontology of Comparisons (diagram) • Comparative text • Whole text > Bi-focal > Direct comparison • Partial text > Multi-focal > Significant type comparison • Partial text > Uni-focal > Domain-based > Set Complement comparison, Clarificatory comparison • Partial text > Uni-focal > Familiarity-based > Like-entity comparison, Illustrative comparison
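The ontology can be encoded as a small tree. The sketch below places each comparison type according to the definitions in this talk (direct comparisons are bi-focal whole texts; significant type comparisons are multi-focal and occur within a description; domain-based and familiarity-based comparisons are uni-focal and occur within a description). The nested-dict encoding and the path_to helper are assumptions for illustration.

```python
ONTOLOGY = {
    "Comparative text": {
        "Whole text": {
            "Bi-focal": {"Direct comparison": {}},
        },
        "Partial text": {
            "Multi-focal": {"Significant type comparison": {}},
            "Uni-focal": {
                "Domain-based": {
                    "Set Complement comparison": {},
                    "Clarificatory comparison": {},
                },
                "Familiarity-based": {
                    "Like-entity comparison": {},
                    "Illustrative comparison": {},
                },
            },
        },
    }
}


def path_to(node, tree=ONTOLOGY, trail=()):
    """Depth-first search for `node`; returns the list of labels from
    the root down to the node, or None if it is not in the tree."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == node:
            return list(here)
        found = path_to(node, children, here)
        if found:
            return found
    return None


print(" > ".join(path_to("Clarificatory comparison")))
```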
Context of Discourse (diagram; Firth 1957, Hymes 1962, Lewis 1972, Brown & Yule 1983) • Discourse • Participants: Speaker, Hearer, Audience • Setting: Time, Place • Topic • Event • Objects • Message form • Purpose • Background Knowledge • Previous Discourse • Comparison types shown: Familiarity-based comparison, Domain-based comparison
Objects and the Hearer • Relationship to the focused entity: • similarity • relatedness • spatial proximity • Hearer information: • goals • knowledge • perceivability
Creating the Context (diagram) • Opportunistic Links • User Knowledge • Distinguishing Characteristics • Liken • Distinguish
Example - Animal Domain (diagram) • Animal Domain • Domestic Cat • Domestic Dog • Human • Cheetah • Leopard, Cat Class • Distinguishing Characteristics • Liken • Distinguish
Example - Jewellery Domain (diagram) • Jewellery Domain • Jewel Instance • Other jewels in this case • Discourse History • Jewels sharing some property • Opportunistic Links • Supertype classes • Similar jewels • Distinguishing Characteristics • Liken • Distinguish
Example • The Alligator is a member of the Crocodylidae Family that has a broad, flat, rounded snout. It is similar in appearance to the related Crocodile. The Crocodile is a member of the Crocodylidae Family that has a narrow snout. The Crocodile is much longer than the Alligator (5.25 m vs 3.75 m). The Alligator has longer teeth on the lower jaw which cannot be seen when its mouth is closed whereas the Crocodile has one longer tooth on each side of the lower jaw which can be seen sticking up when its jaw is closed.
Conclusions • Analysis of types of comparison • Ontology & Definitions • Description via comparison • Improving hearer’s conceptual coherence • Preventing hearer misconceptions • Comparisons in context