CPSC 503 Computational Linguistics Natural Language Generation Lecture 20 Giuseppe Carenini CPSC503 Spring 2004
Understanding ↔ Generation: Knowledge-Formalisms Map (from Discourse (English) up to the Intended meaning)
• Pragmatics, Discourse and Dialogue: AI planners
• Semantics: Logical formalisms (First-Order Logics)
• Syntax: Rule systems (features and unification)
• Morphology: State Machines
CPSC503 Spring 2004
NLG Systems (see handout)
Inputs (Communicative Goals, Domain Knowledge, Context Knowledge) → NLG System → Text
Examples
• FOG – Input: numerical data about future weather. Output: textual weather forecasts
• IDAS – Input: KB describing a machine (e.g., a bike) and the user's level of expertise. Output: hypertext help messages
• ModelExplainer – Input: OO model. Output: textual description of aspects of the model
• STOP – Input: user history and attitudes toward smoking. Output: personalized smoking-cessation letter
CPSC503 Spring 2004
GEA: the Generator of Evaluative Arguments CPSC503 Spring 2004
Four Basic Types of Arguments
• Factual Argument (e.g., Canada is the only country outside of Asia to record SARS-related deaths)
• Causal Argument (e.g., Travelers from Hong Kong brought SARS to Toronto…)
• Recommendation (e.g., You should not go to China in the next few weeks…)
• Evaluative Argument (e.g., Some Asian governments were inefficient in stopping the SARS outbreak…)
CPSC503 Spring 2004
Sample Textual Evaluative Arguments
• Single entity: House-A is great! Although it is somewhat old, the house is spacious and is in an excellent location.
• Comparison: Vancouver is better than Seattle. There is less crime. Also, social services are more accessible.
CPSC503 Spring 2004
Evaluative Arguments: Importance
Natural Language Generation theory: a model of an argument type that is pervasive in natural human communication.
Applications:
• Advisors, personal assistants
• Recommendation systems
• Critiquing systems
CPSC503 Spring 2004
Limitations of Previous Research [Ardissono and Goy 99] [Chu-Carroll and Carberry 1998] [Elhadad 95] [Kolln 95] [Klein 94] [Morik 89] • Focus on specific aspects of generation • Selection of content • Realization of content into language • Lack of systematic evaluation • proof-of-concept system • analyzed on a few examples CPSC503 Spring 2004
Methodology • Develop evaluative argument generator • complete • integrate and extend previous work • Develop evaluation framework • Perform experiment within framework to test generator CPSC503 Spring 2004
Outline • Generator of Evaluative Arguments (GEA) • Evaluation Framework • Experiment CPSC503 Spring 2004
Generator Architecture
Communicative Goal (User (dis)likes entity to some degree)
→ Text Planner (Content Selection and Organization; Knowledge Sources: User Model, Domain Model; Communicative Strategies)
→ Text Plan
→ Text Micro-planner and Sentence Generator (Content Realization; Linguistic Knowledge Sources: Lexicon, Grammar)
→ English Text
CPSC503 Spring 2004
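To make the two-stage split concrete, here is a toy Python sketch with the same stage boundaries; every function name and the trivial stage bodies are hypothetical illustrations, not GEA's actual code or API.

# A toy, purely illustrative pipeline with the same stage boundaries as the
# architecture above; all names and the trivial stage bodies are hypothetical.

def text_planner(goal, user_model, domain_model):
    # content selection and organization -> a (nested) text plan
    return {"claim": goal, "evidence": ["evidence-1", "evidence-2"]}

def microplanner(text_plan):
    # content realization, step 1: abstract sentence specifications
    return [text_plan["claim"]] + text_plan["evidence"]

def sentence_generator(spec):
    # content realization, step 2: surface English
    return spec.capitalize() + "."

def generate(goal, user_model=None, domain_model=None):
    plan = text_planner(goal, user_model, domain_model)
    return " ".join(sentence_generator(s) for s in microplanner(plan))

print(generate("user likes house-a"))

The point is only the division of labour: the planner decides what to say and in what order; the micro-planner and sentence generator decide how to say it.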
GEA User Model
Argumentation theory tells us [Miller 96, Mayberry 96]:
• Supporting (opposing) evidence depends on the values and preferences of the audience
• Evidence is arranged according to importance (i.e., strength of support or opposition)
• Concise: only important evidence is included
So the User Model must …
• Represent the values and preferences of the user
• Enable identification of supporting and opposing evidence
• Provide a measure of evidence importance
… and it must be possible to elicit it in practice.
CPSC503 Spring 2004
Model of User's Preferences: Additive Multi-attribute Value Function (AMVF)
• From Decision Theory and Psychology (Consumer's Behavior)
• Can be elicited in practice [Edwards and Barron 1994]
Example (User-1): a tree of OBJECTIVES with weights on the edges and COMPONENT VALUE FUNCTIONS at the leaves:
House Value → Location (0.7), Amenities (0.3)
Location → Neighborhood (0.4), Park-Distance (0.6)
Amenities → Deck-Size (0.8), Porch-Size (0.2)
CPSC503 Spring 2004
AMVF Application (User-1 evaluating House-A)
Component values: Neighborhood (Westend) = 0.6 (+), Park-Distance (0.5 km) = 0.9 (+), Deck-Size (20 m2) = 0.25 (-), Porch-Size (36 m2) = 0.6 (+)
Weighted up the tree: Location = 0.4×0.6 + 0.6×0.9 = 0.78 (+); Amenities = 0.8×0.25 + 0.2×0.6 = 0.32 (-)
Overall: House Value = 0.7×0.78 + 0.3×0.32 ≈ 0.64
Legend: + likes it, - does not like it
CPSC503 Spring 2004
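As a sanity check on the arithmetic, the following short Python sketch (my own representation, not GEA's) evaluates the AMVF recursively, v(node) = sum over children of weight × child value, and reproduces House-A's overall value of 0.64 from the leaf values above.

# Illustrative only: a minimal AMVF evaluator (not GEA's implementation).
# Weights and leaf values are the User-1 / House-A numbers from the slides.

USER1_TREE = ("house-value", [          # (objective, [(weight, subtree), ...])
    (0.7, ("location", [
        (0.4, ("neighborhood", [])),
        (0.6, ("park-distance", [])),
    ])),
    (0.3, ("amenities", [
        (0.8, ("deck-size", [])),
        (0.2, ("porch-size", [])),
    ])),
])

HOUSE_A = {  # component value functions already applied to House-A's attributes
    "neighborhood": 0.6,   # Westend
    "park-distance": 0.9,  # 0.5 km
    "deck-size": 0.25,     # 20 m2
    "porch-size": 0.6,     # 36 m2
}

def amvf(node, leaf_values):
    name, children = node
    if not children:                      # leaf: look up the component value
        return leaf_values[name]
    return sum(w * amvf(child, leaf_values) for w, child in children)

print(round(amvf(USER1_TREE, HOUSE_A), 2))   # 0.64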
Supporting and Opposing Evidence (User-1, House-A)
Each objective o stands in an evidence relation to its parent Parent(o): aspects the user likes (+) supply supporting evidence, aspects the user dislikes (-) supply opposing evidence.
• Supporting: Neighborhood (0.6), Park-Distance = 0.5 km (0.9), Location (0.78), Porch-Size = 36 m2 (0.6)
• Opposing: Deck-Size = 20 m2 (0.25), Amenities (0.32)
(House Value for House-A: 0.64)
Legend: + likes it / Supporting, - does not like it / Opposing
CPSC503 Spring 2004
Measure of Importance [Klein 94] (User-1, House-A)
Each piece of evidence receives a numeric measure of how strongly it supports or opposes the evaluation of its parent (plot on the slide: the measure as a function of the objective's value v_o on [0, 1]):
• Location: 0.55, Neighborhood: 0.24, Park-Distance (0.5 km): 0.54
• Amenities: 0.2, Deck-Size (20 m2): 0.6, Porch-Size (36 m2): 0.12
Legend: + likes it / Supporting, - does not like it / Opposing
CPSC503 Spring 2004
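The numbers above are consistent with a simple reading of the measure: a child objective's importance is its local weight times its value when it supports the parent's evaluation, and times (1 - value) when it opposes it, taking "supports" to mean a value above 0.5. That reading is my reconstruction from the figures, not a quotation of Klein's definition; the sketch below merely reproduces the slide's numbers under that assumption.

# Reconstruction, not GEA code: importance of objective o w.r.t. its parent,
# assuming importance = weight * value for supporting evidence (value > 0.5)
# and weight * (1 - value) for opposing evidence.

CHILDREN = {  # parent -> [(local weight, child)]
    "house-value": [(0.7, "location"), (0.3, "amenities")],
    "location":    [(0.4, "neighborhood"), (0.6, "park-distance")],
    "amenities":   [(0.8, "deck-size"), (0.2, "porch-size")],
}

VALUES = {  # User-1's values for House-A (from the AMVF slide)
    "neighborhood": 0.6, "park-distance": 0.9, "location": 0.78,
    "deck-size": 0.25, "porch-size": 0.6, "amenities": 0.32,
}

def importance(weight, value):
    supporting = value > 0.5
    return round(weight * (value if supporting else 1 - value), 2), supporting

for parent, kids in CHILDREN.items():
    for w, child in kids:
        imp, sup = importance(w, VALUES[child])
        print(f"{child:14s} {'supports' if sup else 'opposes'} {parent}: {imp}")
# neighborhood 0.24, park-distance 0.54, location 0.55, amenities 0.2,
# deck-size 0.6, porch-size 0.12 -- matching the numbers on the slide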
Why AMVF? - summary An AMVF • Represents user’s values and preferences • Enables identification of supporting and opposing evidence • Provides measure of evidence importance • Evidence arranged according to importance • Concise arguments can be generated • Can be elicited in practice CPSC503 Spring 2004
GEA Architecture
Communicative Goal (User (dis)likes entity to some degree)
→ Text Planner (Content Selection and Organization; Knowledge Sources: User Model = AMVF, Domain Model; Communicative Strategies)
→ Text Plan
→ Text Micro-planner and Sentence Generator (Content Realization; Linguistic Knowledge Sources: Lexicon, Grammar)
→ English Text
CPSC503 Spring 2004
Argumentative Strategy [Carenini and Moore INLG-2000]
Based on guidelines from argumentation theory [Miller 96, Mayberry 96]
Selection: include only "important" evidence (i.e., above a threshold on the z-scores of the measure of importance)
Organization:
(1) Main claim (e.g., "This house is interesting")
(2) Opposing evidence
(3) Most important supporting evidence
(4) Further supporting evidence, ordered by importance with the strongest last
The strategy is applied recursively to supporting evidence (sketched below).
CPSC503 Spring 2004
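Below is a compact sketch of this selection-and-ordering scheme on a toy evidence representation of my own (name, importance, supports-flag, sub-evidence); the z-score threshold and data structures are illustrative, not GEA's internals.

# Illustrative sketch of the argumentative strategy above (not GEA code).
from statistics import mean, pstdev

def select_and_order(evidence, z_threshold=0.0):
    # evidence: list of (name, importance, supports, sub_evidence) tuples
    if not evidence:
        return []
    scores = [imp for _, imp, _, _ in evidence]
    m, sd = mean(scores), pstdev(scores) or 1.0
    keep = [e for e in evidence if (e[1] - m) / sd > z_threshold]   # selection
    opposing = [e for e in keep if not e[2]]
    supporting = sorted((e for e in keep if e[2]), key=lambda e: e[1])
    # (2) opposing evidence first, (3) strongest supporting item next,
    # (4) remaining supporting items by importance with the strongest last
    ordered = opposing + ([supporting[-1]] + supporting[:-1] if supporting else [])
    # the strategy is applied recursively to each item's sub-evidence
    return [(name, imp, sup, select_and_order(sub, z_threshold))
            for name, imp, sup, sub in ordered]

evidence = [("location", 0.55, True, [("park-distance", 0.54, True, []),
                                      ("neighborhood", 0.24, True, [])]),
            ("amenities", 0.20, False, [])]
print(select_and_order(evidence, z_threshold=-2))  # low threshold keeps everything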
Sample GEA Text Plan (decomposition, ordering, and rhetorical relations)
EVALUATIVE ARGUMENT
  MAIN-CLAIM: (VALUE (House-A) 0.72)
    SUPPORTING EVIDENCE / SUB-CLAIM: (VALUE (Location) 0.7)
      OPPOSING EVIDENCE: (VALUE (distance-from-park 1.8m) 0.3)
      SUPPORTING EVIDENCE: (VALUE (distance-from-rap-trans 0.5 mi) 0.75)
      SUPPORTING EVIDENCE: (VALUE (distance-from-work 1mi) 0.75)
CPSC503 Spring 2004
GEA Architecture
Communicative Goal (User (dis)likes entity to some degree)
→ Text Planner (Content Selection and Organization; Knowledge Sources: User Model = AMVF, Domain Model; Communicative Strategies = Argumentative Strategy)
→ Text Plan
→ Text Micro-planner and Sentence Generator (Content Realization; Linguistic Knowledge Sources: Lexicon, Grammar)
→ English Text
CPSC503 Spring 2004
Text Micro-Planner
• Aggregation: combining multiple propositions into a single sentence [Shaw 98]
• Scalar adjectives (e.g., nice, far, convenient) [Elhadad 93]
• Discourse cues (e.g., although, because, in fact) [Knott 96; Di Eugenio, Moore and Paolucci 97]
• Pronominalization: deciding whether to use a pronoun to refer to an entity (centering [Grosz, Joshi and Weinstein 95])
CPSC503 Spring 2004
Aggregation (Logical Forms) • Conjunction via shared participants “House B-11 is far from a shopping area” + “House B-11 is far from public transportation” = “House B-11 is far from a shopping area and public transportation”. • Syntactic embedding “House B-11 offers a nice view” + “House B-11 offers a view on the river” = “House B-11 offers a nice view on the river”. CPSC503 Spring 2004
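A toy sketch of the first kind of aggregation, conjunction via shared participants, on drastically simplified (subject, predicate, object) propositions; GEA operates on richer logical forms, so this is only meant to convey the idea.

# Toy illustration of aggregation by shared participants (not GEA's logical forms).

def aggregate(p1, p2):
    # If two propositions share subject and predicate, conjoin their objects.
    (s1, v1, o1), (s2, v2, o2) = p1, p2
    if (s1, v1) == (s2, v2):
        return (s1, v1, f"{o1} and {o2}")
    return None

p1 = ("House B-11", "is far from", "a shopping area")
p2 = ("House B-11", "is far from", "public transportation")
print(" ".join(aggregate(p1, p2)))
# House B-11 is far from a shopping area and public transportation

Syntactic embedding (the second kind of aggregation on the slide) is not covered by this sketch.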
Scalar Adjective Selection
The adjective is selected from the value of the objective being described (e.g., HOUSE-LOCATION, with children HAS_PARK_DISTANCE, HAS_COMMUTING_DISTANCE, HAS_SHOPPING_DISTANCE; similarly HOUSE-AMENITIES, ...):
• Value > 0.8: "The house has an excellent location"
• 0.65 < Value < 0.8: "… a convenient …"
• 0.5 < Value < 0.65: "… a reasonable …"
• 0.35 < Value < 0.5: "… an average …"
• 0.2 < Value < 0.35: "… a bad …"
• Value < 0.2: "… a terrible …"
CPSC503 Spring 2004
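The adjective scale transcribes directly into a small lookup; how boundary values are handled is not stated on the slide, so the use of >= at each threshold below is an assumption.

# Direct transcription of the adjective scale above into a small helper
# (illustrative; GEA's actual lexicalization machinery is richer than this).

SCALE = [(0.8, "excellent"), (0.65, "convenient"), (0.5, "reasonable"),
         (0.35, "average"), (0.2, "bad")]

def scalar_adjective(value):
    # assumption: boundaries resolved with >=; the slide uses strict inequalities
    for threshold, adjective in SCALE:
        if value >= threshold:
            return adjective
    return "terrible"   # Value < 0.2

print(scalar_adjective(0.78))   # convenient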
Discourse Cue Selection
A cue is chosen from the relation type, the type of nesting, and the typed ordering of constituents in the text plan. Sample entries:
• Rel-type CONCESSION, nesting ROOT, ordering ("CORE" "CONCESSION" "EVIDENCE") or …: cue "Although" (placed on the contributor)
• Nested under EVIDENCE, ordering ("CORE" "CONCESSION" "EVIDENCE"): cue "Even though" (placed on the contributor)
• (further entries for EVIDENCE and SEQUENCE)
CPSC503 Spring 2004
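If the table is read as a lookup keyed (at least) on relation type and nesting, a minimal sketch looks like the following; the key structure and the two entries are a guess at the table's shape based on the fragments above, not GEA's actual cue table.

# Purely illustrative cue lookup; keys and coverage are assumptions, populated
# only with the two entries visible on the slide.

CUE_TABLE = {
    ("CONCESSION", "ROOT"):     "Although",     # placed on the contributor
    ("CONCESSION", "EVIDENCE"): "Even though",  # placed on the contributor
}

def discourse_cue(rel_type, nesting):
    return CUE_TABLE.get((rel_type, nesting))

print(discourse_cue("CONCESSION", "ROOT"))  # Although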
Pronominalization
Centering tells us: the entity providing the link between utterances is preferentially realized as a pronoun (within a discourse segment).
• Successive references: always a pronoun
• First reference in a segment: a pronoun only if both conditions hold:
  • the segment boundary is explicitly marked by a discourse cue
  • no pronoun was used in the previous sentence
CPSC503 Spring 2004
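The rule translates almost literally into code; the function below is an illustrative restatement, with argument names of my own choosing.

# Illustrative sketch of the pronominalization rule above (not GEA code).

def use_pronoun(first_in_segment, boundary_marked_by_cue, pronoun_in_prev_sentence):
    # Decide whether a reference to the centered entity should be a pronoun.
    if not first_in_segment:
        return True                       # successive references: always a pronoun
    # first reference in a segment: pronoun only if both conditions hold
    return boundary_marked_by_cue and not pronoun_in_prev_sentence

print(use_pronoun(first_in_segment=True,
                  boundary_marked_by_cue=True,
                  pronoun_in_prev_sentence=False))   # True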
Output of MicroPlanning: Lexicalized Functional Descriptions (LFDs)
Example: “House-B11 is close to shops and reasonably close to work”
((CAT CLAUSE)
 (PROCESS ((TYPE ASCRIPTIVE) (MODE ATTRIBUTIVE)((POLARITY POSITIVE(EPISTEMIC-MODALITY NONE)))
 (PARTICIPANTS
  ((CARRIER ((CAT NP)(COMPLEX APPOSITION) (RESTRICTIVE YES)
    (DISTINCT ((AND ((CAT COMMON)(DENOTATION ZERO-ARTICLE-THING)(HEAD ((LEX "house"))))
               ((CAT PROPER) (LEX "B-11")))(CDR NONE))))
   (ATTRIBUTE (AND((CAT AP)(HEAD ((CAT ADJ)(LEX "close")))
     (QUALIFIER ((CAT PP) (PREP ((CAT PREP) (LEX "to")))
      (NP((CAT COMMON) (NUMBER PLURAL)(DEFINITE NO) (HEAD ((CAT NOUN) (LEX "shop")))))))))
    ((CAT AP)(HEAD ((CAT ADJ)(LEX "reasonably close")))
     (QUALIFIER ((CAT PP) (PREP ((CAT PREP) (LEX "to")))
      (NP ((CAT COMMON)(DEFINITE NO) (HEAD ((CAT NOUN)(LEX "work"))))))) )))))))))))
CPSC503 Spring 2004
Last Step: Sentence Generator
• Unify LFDs with a large grammar of English (FUF/SURGE [Elhadad 93, Robin 94])
  • fill in syntactic constraints (e.g., agreement, ordering)
  • choose closed-class words (e.g., prepositions, articles)
• Apply morphology
• Linearize as English sentences
CPSC503 Spring 2004
GEA Highlights • GEA implements a computational model of generating evaluative arguments • All aspects covered in a principled way: • argumentation theory • decision theory • computational linguistics CPSC503 Spring 2004
Outline • Generator of Evaluative Arguments (GEA) • Evaluation Framework • Experiment CPSC503 Spring 2004
Evaluation Framework: Task Efficacy [Carenini INLG-2000]
Precondition: the user model has been elicited.
Subtask 1: the user is presented with information about a set of alternatives, selects the preferred N alternatives, and orders them by preference into a Hot List (1st best, 2nd best, ..., nth best).
Subtask 2: a NewInstance is created and the user is presented with an evaluative argument about it; the user decides whether to include it in the Hot List (YES/NO) and, if so, where; at the end the user fills out a final questionnaire.
CPSC503 Spring 2004
Selection Task in Real-Estate
Why real estate?
• Requires no background or expertise
• But still presents a challenging decision task
Instructions to subjects:
• Move to a new town
• Buy a house
• Use the system for data exploration
CPSC503 Spring 2004
Data Exploration System (screenshot) CPSC503 Spring 2004
Argument is presented… (screenshot) CPSC503 Spring 2004
Measures of Effectiveness
• Behavior and attitude change
  • Record of user actions
  • Whether or not the user adopts the new instance
  • Position in the Hot List
• Final questionnaire
  • How much the user likes the new instance
  • How much the user likes the instances in the Hot List
• Others (final questionnaire)
  • Decision confidence
  • Decision rationale
SAMPLE SELF-REPORT: "How would you judge the new house? The more you like the house the closer you should put a cross to 'good choice'"
bad choice: ___ : ___ : ___ : ___ : __ : ___ : ___ : ___ : ___: good choice
The cross (X) on this scale is converted into a Satisfaction Z-score.
CPSC503 Spring 2004
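A z-score simply standardizes a rating against a mean and standard deviation. The sketch below assumes, purely for illustration, that each subject's rating of the new house is standardized against that same subject's ratings of other houses; the slides name the measure but not the reference set.

# Sketch of a Satisfaction Z-score computation. The choice of reference set
# (the same subject's ratings of other houses) is an assumption made here
# for illustration; the slides only name the measure.
from statistics import mean, stdev

def satisfaction_z(new_house_rating, other_ratings):
    return (new_house_rating - mean(other_ratings)) / stdev(other_ratings)

print(round(satisfaction_z(8, [5, 6, 7, 6, 5]), 2))  # ratings on the 9-point scale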
Outline • Generator of Evaluative Arguments (GEA) • Evaluation Framework • Experiment CPSC503 Spring 2004
Two Empirical Questions[Carenini and Moore IJCAI-2001, ACL-2000] • Argument content, structure and phrasing tailored to user-specific AMVF, but . . . • Does this tailoring actually contribute to argument effectiveness? • Arguments should be concise. • Conciseness can be varied, but…. • What is the optimal level of conciseness? CPSC503 Spring 2004
Experimental Conditions • Tailored-Concise (~ 50% of objectives) • Tailored-Verbose (~ 80% of objectives) • Non-Tailored-Concise (~ 50% of objectives) • No-Argument CPSC503 Spring 2004
Experimental Hypotheses
Pairwise comparisons among the four conditions: following the two empirical questions above, the tailored arguments were hypothesized to be more effective than the non-tailored and no-argument conditions ('>'), while the effect of conciseness and the remaining comparisons were left as open questions ('?').
Tailored-Verbose, Tailored-Concise, Non-Tailored-Concise, No-Argument
CPSC503 Spring 2004
Experimental Procedure
40 subjects (10 per condition)
PHASE 1: online questionnaire to acquire preferences (AMVF with 19 objectives, 3 layers) [Edwards and Barron 1994]
PHASE 2: each subject is randomly assigned to a condition, interacts with the evaluation framework, and fills out the final questionnaire
CPSC503 Spring 2004
Experiment Results: Satisfaction Z-score, Decision Confidence, Decision Rationale CPSC503 Spring 2004
Results: Satisfaction Z-score
Tailored-Concise produced significantly higher satisfaction than each of the other conditions:
• vs. Tailored-Verbose: p = 0.02, effect size 0.8
• vs. Non-Tailored-Concise: p = 0.04, effect size 0.9
• vs. No-Argument: p = 0.03, effect size 0.8
CPSC503 Spring 2004
Summary
Generator of Evaluative Arguments (GEA): generates concise arguments tailored to a model of the user's preferences (AMVF)
Evaluation Framework
• Basic decision tasks
• Evaluates a wide range of generation techniques
Experiment
• Tailoring to the AMVF is effective
• Differences in conciseness influence effectiveness
CPSC503 Spring 2004
Future Work (in 2001)
Argument Generator:
• More complex textual arguments
• Speech
• Other domains
• Other languages
• Arguments combining text and graphics
More experiments to test all extensions
(→ AT&T MATCH system, next slide)
CPSC503 Spring 2004
Multimodal Access to City Help (MATCH)
(AT&T: Johnston, Ehlen, Bangalore, Walker, Stent, Maloor and Whittaker 2002)
Multimodal interface
• Portable Fujitsu tablet
• Input: pen (deictic gestures) and speech
• Output: text, speech and graphics
CPSC503 Spring 2004
MATCH Example
User: “Show me Italian restaurants in the West Village”
User: “Recommend/Compare”
MATCH generates responses using techniques inspired by GEA
Evaluation (lab study, argument-quality judgments):
• Users prefer tailored responses
• Future: field study
CPSC503 Spring 2004
Next Time (Wed 8:30 sharp!)
Project update: 5-minute presentation in class
• Brief description of the research problem you are targeting
• Describe your original research plan
• Describe/justify any change to your original plan
• Describe what parts of your (new) plan you have:
  • completed
  • are currently working on
  • still have left to do
• For the parts of the plan you still have to work on, give an estimate of how much time each step will take
• Any other info you feel appropriate...
CPSC503 Spring 2004