This talk outlines the syntax of discourse functions and applications including anaphora resolution, summarization, and sentence condensation. It delves into basic discourse functions, typology of encoding, and LFG approaches. Examples in various languages illustrate these concepts. The presentation covers structural encoding, preverbal focus, DF markers, and the interaction of syntax with DFs. It also explores intonation, combinations of markers and positions, and the mapping proposal of clause-prominence and sentence structure.
F-Structures, Information Structure, and Discourse Structure Tracy H. King Annie Zaenen PARC PARC
Talk Outline • Information Structure: Syntax of discourse functions • Applications: Anaphora resolution • Discourse structure • Applications: Summarization and Sentence Condensation • Conclusions
Information Structure: Syntax of discourse functions • Basic discourse functions • Typology of encoding • LFG approaches
Basic discourse functions • DFs encode and divide up the information structure of the sentence. • DFs are notoriously difficult to define • Topic/Theme/Given • Focus/Rheme/New • Contrastiveness • What to do with non-DF information, e.g. background information?
Example: Clefts • It is the [box]Focus [that]Topic I opened. • Construction encodes focus of the clefted constituent. • The referent of that constituent is the topic of the subordinate clause. • The ‘relative clause’ material is ‘presupposed’. • Question-answer pairs are often used to determine DFs. • What did you open? It was the box that I opened.
Basic discourse functions • Here we focus on: • how to encode these • what they can be used for • The choice of relevant DFs depends on what they are needed for.
Typology of encoding • Structural position • initial • preverbal • Discourse markers/particles • Intonation • Combinations of these
Structural encoding • Position indicates discourse function. • Language specific • Topics are initial • Foci are pre- or postverbal • Background information is postverbal • Constructions: clefts • Subject as default topic • LFG: designated c-structure position
Initial topics • Object marker on the verb • Anaphoric agreement • The OM is the object • Chichewa (Bresnan & Mchombo 1987)
  Alenje zi-ná-wá-lum-a njuchi.
  hunters SM-past-OM-bite-indic bees
  `The bees bit them, the hunters.'
Preverbal focus • Turkish (Enç 1991)
  bu kitab-i Hasan ban-a ver-di
  this book-acc Hasan I-dat give-past
  `This book Hasan gave to ME.'
DF markers • Morphemes can mark DF • Japanese wa • Hindi (Sharma 2003) • hI exclusive contrastive focus (only) • bhI inclusive contrastive focus (also) • tO contrastive topic
Hindi example • Exclusive focus:
  rAdha=ne=hI baccho=kO kahAnI sunAyI
  Radha=erg=Foc children=ACC story hear
  `It was (only) Radha who told the children a story'
• Contrastive topic:
  mOmbattI=tO milI, lEkin abh mAchis gum gayE
  candle=Top found but now match lost go
  `The candle was found but now the matches are lost.'
Intonation • Most DFs have a specific intonation associated with them • Intonation alone can signal a DF • Did you see Mary or John? I saw JOHN. • It was a RED hat that I wore.
Combinations • Most positionally and marker-signaled DFs also have intonation marking. • Can combine position and marker • ay inversion in Tagalog (Kroeger 1993) • ay marker as head of I • SpecIP is Topic=Subj or Focus=non-Subj
  Ni lapis ay hindi nagdala si=Rosa
  even pencil AY not bring nom=Rosa
  `Even a pencil Rosa didn't bring.'
LFG approaches • Syntax-DF interactions • F-structure vs. I-structure • OT-LFG
Syntax-DF interactions • Subcategorized DFs • Predicates can subcategorize for DFs. • C-structure annotations • C-structure nodes can be associated with DFs, similar to GF assignment in configurational languages.
Subcategorized DFs • Malay Topic (Alsagoff 1992) • verb affix identifies Topic and equates it with a GF • meng-: (↑ TOP)=(↑ SUBJ) • di-: (i) (↑ TOP)=(↑ SUBJ) (ii) < (↑ SUBJ) (↑ OBL) > [SUBJ = log obj, OBL = log subj] • 0-: (↑ TOP)=(↑ OBJ)
Malay example
  Miriam MENG-cubit doktor itu
  Miriam MENG-pinch doctor the
  `Miriam pinched the doctor.'
• MENG-cubit: (↑ PRED)='pinch< (↑ SUBJ), (↑ OBJ)> (↑ TOP)'
• F-structure: [ PRED 'pinch< (↑ SUBJ), (↑ OBJ)> (↑ TOP)', SUBJ [ PRED 'Miriam' ], TOP [ ], OBJ [ PRED 'doctor' ] ]
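A minimal sketch of how the affix equations above can be read. It assumes f-structures are modelled as nested Python dicts and that an equation such as (↑ TOP)=(↑ SUBJ) amounts to structure sharing, so both attributes point to the same object. The names `AFFIX_TOPIC_GF` and `apply_affix` are hypothetical; the table only restates the three affixes listed for Malay.

```python
# Hypothetical table: the grammatical function each Malay verbal affix
# equates with TOP, restating the three equations given for Malay.
AFFIX_TOPIC_GF = {"meng-": "SUBJ", "di-": "SUBJ", "0-": "OBJ"}


def apply_affix(fstructure, affix):
    """Add TOP to the f-structure as the very same object as the chosen GF."""
    gf = AFFIX_TOPIC_GF[affix]
    fstructure["TOP"] = fstructure[gf]  # structure sharing, not a copy
    return fstructure


f = {
    "PRED": "pinch<SUBJ,OBJ>",
    "SUBJ": {"PRED": "Miriam"},
    "OBJ": {"PRED": "doctor"},
}
apply_affix(f, "meng-")
print(f["TOP"] is f["SUBJ"])  # True
```

Using object identity rather than a copy mirrors the idea that TOP and SUBJ are one f-structure with two names, so information added under either attribute is visible under both.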
Chichewa and Tagalog topic • CP → NP C', with NP annotated (↑ TOPIC)=↓ • S → NP VP, with NP annotated (↑ TOPIC)=↓ • F-structure: [ TOPIC […], SUBJ [ PRED 'pro' ], PRED 'X<SUBJ,…>' ] (anaphoric binding) • Chichewa: Bresnan & Mchombo 1987 • Tagalog: Kroeger 1993
Urdu preverbal focus • VP → XP V', with XP annotated (↑ FOCUS)=↓ • Urdu: Butt & King 1996
C- to F-structure Mapping proposal • Clause-Prominence of DFs: DF adjuncts (i.e., in adjoined positions) must be clause-prominent, occurring either at an edge of the clause or adjacent to the head of the clause. (Bresnan 2001:192) • [Figure: adjunction trees with node labels XP, YP (DF), ZP (Adjunct)]
Mapping proposal • Specifiers of functional categories are the grammatical discourse functions (Topic, Focus, Subj). (Bresnan 2001:102) • [Figure: tree with node labels FP, SpecFP (DF), F']
Intonation • Much work has been done on this association • Steedman (2000) on Categorial Grammar • Less in LFG • Bengali and the syntax-prosody mapping (Butt and King 1998) • Russian clause-final focus (King 1995) • Integration of prosody into the LFG projection architecture needs more exploration.
Discourse markers • Constructive case/morphology approach (Sharma 2003) • hI: (FOC ↑) • [Figure: X(P) dominating X(P) and the discourse clitic Cl-disc hI, annotated (FOC ↑)]
F-structure vs. I-structure • DFs are often represented in the f-structure. • Malay subcategorizes for Topics • Chichewa incorporated pronouns • Scope of DFs may conflict with that of GFs. • project DFs into an I(nformation)-structure
DF-GF mismatches • F-structure: [ PRED 'eat<SUBJ,OBJ>', SUBJ [ PRED 'Mary' ], OBJ [ PRED 'cake' ], TNS past ] • VP focus: Mary [F ate the cake]. • How can the focus be represented? • Form I-structure constituents.
OT-LFG approaches • OT constraints for encoding of DFs (Choi 1999) • [New]-X: Place [+New] in a salient position X • [Prom]-X: Place [+Prom] in a salient position X • Languages • rank these constraints • define possible instantiations of X
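The effect of constraint ranking can be illustrated with a small sketch of standard OT evaluation: each candidate has a violation count per constraint, and candidates are compared constraint by constraint from the highest-ranked down. The function name `optimal`, the candidate labels, and all violation counts are hypothetical illustrations, not data from Choi (1999).

```python
def optimal(candidates, ranking):
    """Return the candidate with the best violation profile under the ranking.

    Profiles are compared constraint by constraint, highest-ranked first,
    so one violation of a higher constraint outweighs any number of
    violations of lower-ranked ones.
    """
    def profile(name):
        return tuple(candidates[name].get(c, 0) for c in ranking)

    return min(candidates, key=profile)


# Hypothetical violation counts for two competing word orders.
candidates = {
    "new_in_X": {"New": 0, "Prom": 1},   # [+New] element sits in position X
    "prom_in_X": {"New": 1, "Prom": 0},  # [+Prom] element sits in position X
}

print(optimal(candidates, ["New", "Prom"]))  # new_in_X
print(optimal(candidates, ["Prom", "New"]))  # prom_in_X
```

The same candidate set yields different winners under the two rankings, which is the sense in which languages differ by ranking the constraints.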
Summary: Syntax of DFs • DFs can be encoded by: • structural position • morphological markers • intonation • Linguistic theories need a way to capture these interactions • Much LFG work on structural position and morphological markers • Are F and T the only elements worth distinguishing? • Need more work on integrating generalizations about intonation • Need more work on how syntactic distinctions relate to semantic and pragmatic concepts
Form and function relation • A radical proposal: • Prince: the relation between syntax and pragmatics is as arbitrary as that between sound and word meaning • Cross-language variation: • e.g. functions of Left-dislocation in Yiddish and English are different (Prince) • Functions of clefting and topicalization are different across Germanic languages • Functions of Left-Dislocations (or Contrastive topicalization) and Right dislocations in Romance languages and in Germanic are different (see e.g. Lambrecht 1981 on Spoken French). • Not a one-to-one correspondence between form and function
Talk Outline • Information Structure: Syntax of discourse functions • Applications: Anaphora resolution • Discourse structure • Applications: Summarization and Sentence Condensation • Conclusions
Applications for Discourse Functions • Anaphora resolution • DFs determine saliency • Saliency partially determines resolution
Anaphora Resolution • Have a sentence with pronouns or referring NPs (the president) • Want to know what they refer to • some restrictions are purely syntactic: (most) reflexives refer to Subjects • others are heuristic: prefer closer referents; prefer high-saliency referents
Role of Discourse Functions • Topic, and topic shift, are relevant for anaphora • Centering theory and its variants • have an ordered list of salient elements • have a referring expression • first salient element to match features is the antecedent • update the list based on this
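The list-based procedure in the bullets above can be sketched as follows. This is a simplified illustration, not an implementation of any particular centering variant. It assumes that mentions arrive ordered by grammatical function (subject first), that gender is the only feature checked, and that two pronouns in one clause may not share an antecedent. The names `Mention`, `resolve_utterance`, and `run` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    form: str
    gender: str              # "f", "m" or "n"
    is_pronoun: bool = False


def resolve_utterance(mentions, salience):
    """Resolve one utterance against the salience list.

    Returns the referents (entity tuples, or None if unresolved) and the
    updated salience list: entities of this utterance first, then the rest.
    """
    referents = []
    for m in mentions:
        if m.is_pronoun:
            # First salient entity whose features match and that is not
            # already used by another argument of the same clause.
            antecedent = next(
                (e for e in salience if e[1] == m.gender and e not in referents),
                None,
            )
            referents.append(antecedent)
        else:
            referents.append((m.form, m.gender))
    realized = [r for r in referents if r is not None]
    return referents, realized + [e for e in salience if e not in realized]


def run(discourse):
    salience, out = [], []
    for utterance in discourse:
        referents, salience = resolve_utterance(utterance, salience)
        out.append([r[0] if r else None for r in referents])
    return out


she = Mention("she", "f", True)
her = Mention("her", "f", True)
discourse = [
    [Mention("Brennan", "f"), Mention("AR", "n")],  # Brennan drives an AR.
    [she],                                          # She drives too fast.
    [Mention("Friedman", "f"), her],                # Friedman races her on weekends.
    [she, Mention("Laguna Seca", "n")],             # She drives to Laguna Seca.
    [she, her],                                     # She often beats her.
]
result = run(discourse)
for referents in result:
    print(referents)
```

On this five-sentence input the sketch resolves "she" to Brennan in the second sentence and to Friedman in the last two, because Friedman moves to the top of the list once she is realized as a subject.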
Anaphora resolution example
  Brennan drives an AR. (Brennan=Old, AR=New)
  She drives too fast. (She=Brennan=Old)
  Friedman races her on weekends. (Friedman=Old, Brennan=Old, Her=Brennan=Old)
  She drives to Laguna Seca. (She=Friedman=Old)
  She often beats her. (She=Friedman=Old, Her=Brennan=Old)
• Discourse functions determine correct anaphora resolution.
Pro-Drop and Anaphora Resolution • Pro-drop is (partly) licensed by DFs • Already established topics are more likely to be pro-dropped • Centering theory: • Continue and Smooth-shift transitions favor null subjects • Chinese (Song 2003) • Yiddish (Prince 1998)
Summary: Anaphora resolution • DFs are essential for determining anaphora resolution • Pro-drop is licensed in part by IS • But a lot remains to be worked out.
Talk Outline • Information Structure: Syntax of discourse functions • Applications: Anaphora Resolution • Discourse structure • Applications: Summarization and Sentence Condensation • Conclusions
Discourse Structure • A simple model • Its relation to syntax
A too simple idea • [Figure: tree with a discourse node D over a sequence of sentence nodes S]
Progression and elaboration • Joan got up early. She showered. Then she made some tea. … • Mary is a model professor. Last year she wrote ten papers. She also advised 20 doctoral students and she was a member of the Committee on Women in Science.
A still very simple idea • Discourse progresses sentence by sentence, or subparts elaborate on previous parts • [Figure: tree with a D node over S nodes and a nested D node]
One type of discourse tree (Linguistic Discourse Model) • S over a, b: [John fell.]a [Bill pushed him.]b • C over a, b: [Bill pushed John.]a [He fell.]b • a and b are BDUs (Basic Discourse Units) • A BDU basically corresponds to a segment with an event variable in its semantics.
BDU Relations • Not all types of relations can be classified as belonging to the subordinating or the coordinating type. • We will ignore the rest here. • Some elements in a sentence can explicitly indicate what type of relation we have, e.g. ‘because’ is a subordination relation. • They will be called “operator segments.”
How do discourse trees relate to sentence syntax trees? • Some textual elements guide the discourse tree construction. • A BDU is not necessarily a complete sentence or vice versa.
Sentence does NOT equal BDU • [The man dove into the pool.]a [It was warm and soothing]b and [he decided to remain for a little longer than usual.]c • [Figure: discourse tree with nodes C and S over the BDUs a, b, c]
ADJUNCT clauses • [Joan left]a because [she was tired.]b • Three segments: two BDUs and one operator • [Figure: S node over a and b]
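A minimal sketch of a data structure for such trees, assuming only what the slides state: leaves are BDUs, inner nodes are subordinations (S) or coordinations (C), and an operator segment such as "because" can be recorded on the relation it signals. The class and function names (`BDU`, `Relation`, `bdus`, `bracket`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class BDU:
    label: str
    text: str


@dataclass
class Relation:
    kind: str                                   # "S" or "C"
    children: List[Union["Relation", BDU]] = field(default_factory=list)
    operator: Optional[str] = None              # operator segment, if any


def bdus(node):
    """Collect the BDUs under a node in left-to-right text order."""
    if isinstance(node, BDU):
        return [node]
    return [b for child in node.children for b in bdus(child)]


def bracket(node):
    """Render a tree as a labelled bracketing such as S(a, b)."""
    if isinstance(node, BDU):
        return node.label
    return f"{node.kind}({', '.join(bracket(c) for c in node.children)})"


joan = Relation(
    "S",
    [BDU("a", "Joan left"), BDU("b", "she was tired.")],
    operator="because",
)
print(bracket(joan))                                   # S(a, b)
print(len(bdus(joan)) + (1 if joan.operator else 0))   # 3
```

The second printed value counts the two BDUs plus the operator, matching the three segments of the "Joan left because she was tired" example.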
Textual elements that guide the construction of discourse trees • Hypothesis 1: Subordinating conjunctions indicate discourse subordination. • Needs checking: it is often true, but is it always true?
Textual elements cont. • Hypothesis 2: tense and aspect • John dove into the pool. The water was warm and soothing. • John Smith was wearing a long coat. It looked brand new. • Stative predicates do not push the discourse forward and often indicate subordination. • English is not very rich in this type of indicator. • perfective/imperfective distinctions are more explicit in other languages, e.g. French (see e.g. Asher and Lascarides 2003).
Textual elements cont. • Hypothesis 3: pronominalization • John Smith was wearing a long coat. It looked brand new. • Often the ‘promotion’ of (the referent of) an OBJ or an OBL to a SUBJ in the following sentence reflects a discourse subordination. (Polanyi et al. 2004) • But: John hit Bill. He fell. • The tense and aspect information takes precedence.
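The three hypotheses can be read as an ordered cascade of cues. The sketch below is only one possible reading. It assumes that aspect is given as "stative", "eventive", or None when unknown, that pronominal promotion is consulted only when aspect gives no answer, and that everything without a subordination cue is labelled "coordination" (a simplification, since not all relations fall into these two types). The function name `guess_relation` and its parameters are hypothetical.

```python
def guess_relation(subordinating_conjunction=False, aspect=None,
                   promotes_nonsubject=False):
    """Guess how the second segment attaches to the first.

    aspect: "stative", "eventive", or None when unknown (an assumption
    of this sketch, not a claim about any existing system).
    """
    if subordinating_conjunction:   # Hypothesis 1
        return "subordination"
    if aspect == "stative":         # Hypothesis 2
        return "subordination"
    if aspect == "eventive":        # aspect takes precedence over Hypothesis 3
        return "coordination"
    if promotes_nonsubject:         # Hypothesis 3, used only as a fallback
        return "subordination"
    return "coordination"           # default label, a simplification


# "John Smith was wearing a long coat. It looked brand new."
print(guess_relation(aspect="stative", promotes_nonsubject=True))
# "John hit Bill. He fell."
print(guess_relation(aspect="eventive", promotes_nonsubject=True))
```

The second call shows the precedence stated above: the object Bill is promoted to subject, but the eventive predicate "fell" moves the discourse forward, so the sketch does not treat it as a subordination.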
What is the role of Information Structure in the construction of Discourse trees? • [John Smith]T1 was wearing [a long coat]F1. [It]T2 looked brand new. • Focus-1 --> Topic-2 • [John]T1 likes [sweets]F1. [He]T2 eats [three dishes of ice cream]F2 and [five chocolate bars]F2 every day. • Topic-1 --> Topic-2 (cf. centering theory ‘shifts’) • In Discourse Structure both are subordinations