Discourse theories and technologies Dan Cristea “Al. I. Cuza” University of Iasi, Faculty of Computer Science and Romanian Academy, Institute of Theoretical Computer Science dcristea@infoiasi.ro Borovets, sept. 2003
Content
Introduction
• What is discourse? Text and discourse. Coherence and cohesion.
Theories
• attentional state theory
• rhetorical structure theory
• centering theory
• veins theory
Technologies
• segmentation of discourse
• Marcu’s parser
• VT parser
Related issues on discourse
• anaphora resolution, summarisation, information extraction
What is discourse?
Longman:
1. a serious speech or piece of writing on a particular subject: Professor Grant delivered a long discourse on aspects of moral theology.
2. serious conversation between people: You can’t expect meaningful discourse when you two disagree so violently.
3. the language used in particular kinds of speech or writing: scientific discourse.
Text and discourse
Syntactically, a discourse is more than a single sentence.
A text is not a discourse! But it becomes a discourse the very moment it is read or listened to by a human... or a machine.
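Time and discourse
Discourse has a dynamic nature.
Time axes: real time, discourse time, story time.
[Figure: two events, 1 and 2, placed on the three axes. They appear in the order 1, 2 on the real-time and discourse-time axes and in the order 2, 1 on the story-time axis; the axis ticks read 800, 920, 1000, 1030.]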
Cohesion and coherence
A text has cohesion when its parts closely correlate.
A text is coherent when it makes sense, with respect to an accepted setting, real or virtual.
Interpretation of discourse
[Figure: diagram of discourse interpretation, with the labels: text; knowledge base; knowledge about the language; knowledge about the world; knowledge about the situation; knowledge about the author.]
Discourse phenomena: interruptions and flash-backs
E: Now attach the pull rope to the top of the engine. By the way, did you buy gasoline today?
A: Yes. I got some when I bought the new lawnmower wheel. I forgot to take the gas with me, so I bought a new one.
E: Did it cost much?
A: No, and we could use another anyway to keep with the tractor.
E: OK, how far have you got? Did you get it attached?
from [Allen, 1987]
Discourse phenomena: pop-overs
E: Now attach the pull rope to the top of the engine. By the way, did you buy gasoline today?
A: Yes. I got some when I bought the new lawnmower wheel. I forgot to take the gas with me, so I bought a new one.
E: Did it cost much?
A: No, and we could use another anyway to keep with the tractor.
E: OK, how far have you got? Did you get it attached?
from [Allen, 1987]
Discourse phenomena: inference load and pronoun use
Why is it that some discourses seem more difficult to understand than others?
Why do we use pronouns as we do?
Discourse theories?
A sub-domain of Computational Linguistics that searches for the laws governing discourse and for models that make automated analysis, representation and generation of discourse possible.
Discourse theories
• attentional state theory
• rhetorical structure theory
• centering theory
• veins theory
Attentional state theory (AST)
Barbara Grosz & Candace Sidner, 1987
• Models the linguistic structure of discourse
• Gives an account of intentions and how they are combined
• Explains the shift of attention during discourse interpretation, and referentiality, in terms of discourse structure
• 3 components
AST: 1st component
• a linguistic structure:
  • several sentences are aggregated into the same segment
  • segments display a recursive structure
AST: 2nd component
• an intentional structure:
  • a segment communicates an intention; it has a goal to accomplish in the reader;
  • the goals of the component segments contribute to the realisation of the goal of the overall segment;
  • two types of relations between segment goals: dominance and satisfaction-precedence
AST: 2nd component
Relations: dominance
DSP A dominates DSP AA: the intention associated with DSP AA contributes to the satisfaction of the intention associated with DSP A. (DSP: discourse segment purpose.)
[Figure: segment tree with root A; children AA, AB, AC; AAA and AAB under AA; ABA and ABB under AB.]
AST: 2nd component
Relations: satisfaction-precedence
DSP AA satisfaction-precedes DSP AB: DSP AA must be satisfied before DSP AB.
[Figure: segment tree with root A; children AA, AB, AC; AAA and AAB under AA; ABA and ABB under AB.]
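The two relations between segment goals can be sketched in code. The following is a minimal Python sketch, not the authors' formalisation: it assumes the segment tree drawn on the slides (A; AA, AB, AC; AAA, AAB; ABA, ABB), treats dominance as transitive, and uses left-to-right sibling order as a stand-in for satisfaction-precedence (the slides state this only for AA and AB). The names `CHILDREN`, `parent_of`, `dominates` and `satisfaction_precedes` are illustrative.

```python
# Hypothetical encoding of the slide's segment tree: parent -> ordered children.
CHILDREN = {
    "A": ["AA", "AB", "AC"],
    "AA": ["AAA", "AAB"],
    "AB": ["ABA", "ABB"],
}


def parent_of(segment):
    """Return the parent of a segment, or None for the root."""
    for parent, kids in CHILDREN.items():
        if segment in kids:
            return parent
    return None


def dominates(upper, lower):
    """True if the DSP of `upper` dominates the DSP of `lower`.

    Assumption: dominance is read off the tree, so `upper` must be a
    proper ancestor of `lower`.
    """
    node = parent_of(lower)
    while node is not None:
        if node == upper:
            return True
        node = parent_of(node)
    return False


def satisfaction_precedes(first, second):
    """True if `first` and `second` are siblings and `first` comes earlier.

    Assumption: sibling order stands in for satisfaction-precedence.
    """
    parent = parent_of(first)
    if parent is None or parent != parent_of(second):
        return False
    kids = CHILDREN[parent]
    return kids.index(first) < kids.index(second)


print(dominates("A", "AA"))               # the slide's dominance example
print(satisfaction_precedes("AA", "AB"))  # the slide's precedence example
```

Both calls print `True`. A real analysis would take the two relations from an annotator rather than derive them from tree shape.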
AST: 3rd component
• an attentional state
  • to each segment corresponds a space of entities under focus
  • these spaces have the dynamics of a stack
[Figure, animated over the segment tree A; AA, AB, AC; AAA, AAB; ABA, ABB. As each segment is entered, the stack of focus spaces holds, bottom to top:
SA
SA, SAA
SA, SAA, SAAA
SA, SAA, SAAB
SA, SAB
SA, SAB, SABA
SA, SAB, SABB
SA, SAC]
AST: 3rd component
• an attentional state
  • accessibility is modeled by top-down access in the stack
[Figure: while segment ABB is open, the stack holds, top to bottom, SABB, SAB, SA.]
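The stack dynamics shown on these slides can be reproduced with a short simulation. This is a minimal Python sketch under stated assumptions: the segment tree is the one drawn on the slides, a focus space is pushed when its segment opens and popped when it closes, and the names `CHILDREN`, `focus_stack_trace` and `accessible` are illustrative.

```python
# Hypothetical encoding of the slide's segment tree: parent -> ordered children.
CHILDREN = {
    "A": ["AA", "AB", "AC"],
    "AA": ["AAA", "AAB"],
    "AB": ["ABA", "ABB"],
}


def focus_stack_trace(root):
    """Walk the segment tree depth-first and record the stack of focus
    spaces (bottom first) each time a segment is entered."""
    trace = []
    stack = []

    def visit(segment):
        stack.append("S" + segment)   # push the focus space on entering
        trace.append(list(stack))     # snapshot of the current stack
        for child in CHILDREN.get(segment, []):
            visit(child)
        stack.pop()                   # pop it when the segment closes

    visit(root)
    return trace


def accessible(stack):
    """Focus spaces in the order they are searched: top of the stack first."""
    return list(reversed(stack))


for state in focus_stack_trace("A"):
    print(state)
```

The printed states run from `['SA']` through `['SA', 'SAB', 'SABB']` to `['SA', 'SAC']`, matching the sequence of slides. For the state in which ABB is open, `accessible` returns `['SABB', 'SAB', 'SA']`, so entities in SAA are no longer reachable.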
AST: pluses
• Discourse structure:
  • a proposal for discourse structure (an example)
  • stack behavior models hierarchical relationships among text segments
• Reference: accounted for by accessibility in stack
• Interruptions
• Flash-backs
AST: interruptions
E: Now attach the pull rope to the top of the engine. By the way, did you buy gasoline today?
A: Yes. I got some when I bought the new lawnmower wheel. I forgot to take the gas with me, so I bought a new one.
E: Did it cost much?
A: No, and we could use another anyway to keep with the tractor.
E: OK, how far have you got? Did you get it attached?
from [Allen, 1987]
An interruption is a discourse segment whose DSP is neither dominated nor satisfaction-preceded by the DSP of the immediately preceding segment.
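The definition of an interruption reads directly as a predicate over the two relations. Below is a minimal Python sketch, assuming the relations are given as explicit sets of pairs; the function name `is_interruption` and the segment labels `rope`, `pull_step` and `gasoline` are hypothetical, and `pull_step` is an invented sub-segment used only for contrast.

```python
def is_interruption(segment, previous, dominance, precedence):
    """Apply the slide's definition: `segment` is an interruption if its DSP
    is neither dominated nor satisfaction-preceded by the DSP of `previous`.

    `dominance` and `precedence` are sets of (x, y) pairs meaning
    "x dominates y" and "x satisfaction-precedes y".
    """
    pair = (previous, segment)
    return pair not in dominance and pair not in precedence


# Hypothetical labels for the lawnmower dialogue: the rope-attaching segment,
# an invented sub-step of it, and the gasoline exchange.
dominance = {("rope", "pull_step")}
precedence = set()

print(is_interruption("gasoline", "rope", dominance, precedence))
print(is_interruption("pull_step", "rope", dominance, precedence))
```

The first call prints `True` (the gasoline exchange is unrelated to the rope segment) and the second prints `False` (a dominated sub-step is not an interruption).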
AST: interruptions
[Figure: the dialogue split into two segments.
Interrupted segment: E: Now attach the pull rope to the top of the engine. … E: OK, how far have you got? Did you get it attached? …
Interrupting segment: By the way, did you buy gasoline today? A: Yes. I got some when I bought the new lawnmower wheel. I forgot to take the gas with me, so I bought a new one. E: Did it cost much? A: No, and we could use another anyway to keep with the tractor.]
AST: flashbacks
A flashback is a particular kind of interruption whose DSP satisfaction-precedes the interrupted segment or a segment that dominates the interrupted segment.
…
OK. Now how do I say that Bill is...
Whoops I forgot about ABC. I need an individual concept for the company ABC.
…
Now back to Bill. How do I say that Bill is an employee of ABC?
From [Grosz & Sidner, 1987]
[Figure: the utterances are labeled with the focus spaces Sinit, SBill and SABC; successive stack states are drawn over the focus spaces Sinit, SBill, SABC and SFB.]
AST: minuses
• The stack mechanism fails for certain dominant/dominated segment configurations when granularity is sufficiently fine
• Does not accommodate left satellites
AST doesn’t accommodate left satellites
a. Jack and Sue went to buy a new lawn mower
b. since their old one was stolen.
c. Sue had seen the men who took it and
d. had chased them down the street,
e. but they'd driven away in a truck.
f. After looking in the store
g. they realized they couldn't afford a new one.
h. By the way, Jack lost his job last month
i. so he's been short of cash recently.
j. He has been looking for a new one,
k. but so far hasn't had any luck.
l. Anyway, they finally found a used one at a garage sale.
Allen, 1993
AST doesn’t accommodate left satellites
[Figure: the same text grouped into three segments: {a, b, f, g, l}, {c, d, e} and {h, i, j, k}.]
Attentional state stack
[Figure, animated: the focus space of the main segment grows from a to a,b, then to a,b,f,g, then to a,b,f,g,l. The focus space for c,d,e is pushed on top of a,b and popped before f; the focus space for h,i,j,k is pushed on top of a,b,f,g and popped before l.]
Problem: a finer granularity
• a. Jack and Sue went to buy a new lawn mower
• g. they realized they couldn't afford a new one.
• l. Anyway, they finally found a used one at a garage sale.
b. since their old one was stolen.
f. After looking in the store
Problem
[Figure: successive stack states, each listed bottom to top: a | a, b | a, b, c,d,e | a, f | a,g]
Rhetorical structure theory
William Mann & Sandra Thompson, 1987
Basics
• text span: an uninterrupted linear interval of text
• relation: holds between two non-overlapping spans, called nucleus and satellite
• a nucleus is more important than a satellite (deletion and substitution tests)
• relations: hypotactic (nucleus + satellite) and paratactic (2 nuclei)
• scheme: integrates two or more text spans by a relation (like grammar rules)
• RST analyses are trees
• they reflect a judge’s interpretation (and can therefore be subjective)
RST schemes
[Figure: two schemes. In the first, a relation links a satellite text span to a nucleus text span. In the second, a relation links two nucleus text spans.]
RST schemes: equivalences
[Figure: equivalent scheme diagrams combining relation1 and relation2.]
RST schemes: equivalences
[Figure: equivalent scheme diagrams built from a single repeated relation.]
relation EVIDENCE
(N: nucleus, S: satellite, W: writer, R: reader)
constraint on N: R might not believe N to a degree satisfactory to W
constraint on S: R believes S or finds it credible
effect: R’s belief of N is increased
1. The program as published for calendar year 1980 really works.
2. In only a few minutes, I entered all the figures from my 1980 tax return
3. and got a result which agreed with my hand calculations to the penny.
[Figure: tree over span 1-3, in which EVIDENCE relates unit 1 and span 2-3.]
relation CONCESSION
constraint on N: W has positive regard for the situation presented in N
constraint on S: W is not claiming that the situation presented in S doesn’t hold
constraint on the combination N+S: W acknowledges a potential incompatibility between the situations presented in N and S; W regards the situations presented in N and S as compatible
effect: R’s positive regard for the situation presented in N is increased
1. Although Dioxin is toxic to certain animals,
2. evidence is lacking that it has any serious long-term effects on human beings.
[Figure: tree over span 1-2, in which CONCESSION relates units 1 and 2.]
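A hypotactic relation and the deletion test mentioned under the RST basics can be illustrated on the Dioxin example. This is a minimal Python sketch, not an RST toolkit: the class names `Unit` and `Relation`, the field `satellite_first`, and the functions `full_text` and `nuclei_only` are all illustrative. It assumes unit 2 is the nucleus and the "Although" clause is the satellite, which is how the CONCESSION constraints read for this text.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Unit:
    """An elementary text span."""
    text: str


@dataclass
class Relation:
    """A hypotactic RST relation: one nucleus and one satellite."""
    name: str
    nucleus: Union[Unit, "Relation"]
    satellite: Union[Unit, "Relation"]
    satellite_first: bool = False  # True when the satellite precedes the nucleus in the text


def full_text(node):
    """Concatenate all units in text order."""
    if isinstance(node, Unit):
        return node.text
    parts = [full_text(node.nucleus), full_text(node.satellite)]
    if node.satellite_first:
        parts.reverse()
    return " ".join(parts)


def nuclei_only(node):
    """Deletion test: drop every satellite and keep only the nuclei."""
    if isinstance(node, Unit):
        return node.text
    return nuclei_only(node.nucleus)


# Assumed analysis of the slide's example: unit 2 is N, unit 1 is S.
concession = Relation(
    name="CONCESSION",
    nucleus=Unit("evidence is lacking that it has any serious long-term effects on human beings."),
    satellite=Unit("Although Dioxin is toxic to certain animals,"),
    satellite_first=True,
)

print(full_text(concession))
print(nuclei_only(concession))
```

Deleting the satellite leaves the nucleus, which still carries the writer's main claim. That is the intuition behind calling the nucleus more important than the satellite.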
relation CIRCUMSTANCE
constraint on N: none
constraint on S: S presents a situation
constraint on the combination N+S: S sets a framework (spatial or temporal) within which R is intended to interpret the situation presented in N
effect: R recognizes that the situation presented in S provides the framework for interpreting N
1. Probably the most extreme case of Visitors Fever I ever witnessed was a few summers ago
2. when I visited relatives in the Midwest.
[Figure: tree over span 1-2, in which CIRCUMSTANCE relates units 1 and 2.]
A more complex example
1. Farmington Police had to help control traffic recently
2. when hundreds of people lined up to be among the first applying for jobs at the yet-to-open Marriott Hotel.
3. The hotel’s help-wanted announcement – for 300 openings – was a rare opportunity for many unemployed.
4. The people waiting in line carried a message of claims that the jobless could be employed if only they showed enough moxie.
5. Every rule has exceptions,
6. but the tragic and too-common tableaux of hundreds of people snake-lining up for any task with a paycheck illustrates a lack of jobs,
7. not laziness.
[Figure: RST tree over span 1-7, with the spans 1-3, 4-7, 2-3, 5-7 and 6-7 and the relations background, evidence, volitional result, concession, circumstance and antithesis.]
RST relations
Presentational (intentional)
• Motivation, Antithesis, Background, Enablement, Evidence, Justify, Concession
Subject matter (informational)
• Elaboration, Circumstance, Solutionhood, Volitional Cause, Volitional Result, Non-Volitional Cause, Non-Volitional Result, Purpose, Condition, Otherwise, Interpretation, Evaluation, Restatement, Summary, Sequence, Contrast
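Problem: multiple interpretations [Moore & Pollack, 1992]
1. Come back at 5:00.
2. Then we can go to the hardware store before it closes.
3. This way we can finish the bookshelves tonight.
[Figure: two analyses of the same text. Informational level: a tree built from two condition relations, one over units 1 and 2 and one joining that span with unit 3. Intentional level: a tree built from two motivation relations, one over units 2 and 3 and one joining unit 1 with that span.]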
Any other complaints?
• no indication of referentiality
• how many relations?
• how are relations discovered?
• ...
How distant are AST & RST?
• Moser & Moore (1996) and Marcu (1997):
  • granularity: AST - undefined, RST - fine (clause level)
  • structure: trees
  • internal nodes: relations (AST: 2, RST: 28, Hobbs, Knott: hierarchy of relations)
Centering - a theory of local discourse coherence
• Joshi, A.K. and Weinstein, S., 1981: “Control of Inference: Role of Some Aspects of Discourse-Structure Centering”
• Grosz, B.; Joshi, A.K. and Weinstein, S., 1986: “Towards a computational theory of discourse interpretation”
• Brennan, S.E.; Friedman, M.W. and Pollard, C.J., 1987: “A Centering approach to pronouns”
• Grosz, B.; Joshi, A.K. and Weinstein, S., 1995: “Centering: A framework for modeling the local coherence of discourse”
• Strube, M. and Hahn, U., 1996: “Functional Centering”
• Walker, M.A.; Joshi, A.K. and Prince, E.F. (eds.), 1997: “Centering in Discourse”
• Kameyama, M., 1997: “Intrasentential Centering: A Case Study”
Goals of the theory
• explains why certain texts are more difficult to process than others
• explains why we use pronouns as we do
• anchors a practical approach to anaphora resolution