F-Structures, Information Structure, and Discourse Structure. Tracy H. King, Annie Zaenen, PARC
Talk Outline • Information Structure: Syntax of discourse functions • Applications: Anaphora resolution • Discourse structure • Applications: Summarization and Sentence Condensation • Conclusions
Information Structure: Syntax of discourse functions • Basic discourse functions • Typology of encoding • LFG approaches
Basic discourse functions • DFs encode and divide up the information structure of the sentence. • DFs are notoriously difficult to define • Topic/Theme/Given • Focus/Rheme/New • Contrastiveness • What to do with non-DF information, e.g. background information?
Example: Clefts • It is the [box]Focus [that]Topic I opened. • Construction encodes focus of the clefted constituent. • The referent of that constituent is the topic of the subordinate clause. • The ‘relative clause’ material is ‘presupposed’. • Question-answer pairs are often used to determine DFs. • What did you open? It was the box that I opened.
Basic discourse functions • Here we focus on: • how to encode these • what they can be used for • Choice of relevant DFs depends on what they are needed for.
Typology of encoding • Structural position • initial • preverbal • Discourse markers/particles • Intonation • Combinations of these
Structural encoding • Position indicates discourse function. • Language specific • Topics are initial • Foci are pre- or postverbal • Background information is postverbal • Constructions: clefts • Subject as default topic • LFG: designated c-structure position
Initial topics • Object marker on the verb • Anaphoric agreement • The OM is the object • Chichewa (Bresnan & Mchombo 1987) Alenje zi-ná-wá-lu-ma njuchi. hunters SM-past-OM-bite-indic bees `The bees bit them, the hunters.'
Preverbal focus • Turkish (Enç 1991) bu kitab-i Hasan ban-a ver-dir this book-acc Hasan I-dat give `This book Hasan gave to ME.'
DF markers • Morphemes can mark DF • Japanese wa • Hindi (Sharma 2003) • hI exclusive contrastive focus (only) • bhI inclusive contrastive focus (also) • tO contrastive topic
Hindi example • Exclusive focus: rAdha=ne=hI baccho=kO kahAnI sunAyI Radha=erg=Foc children=ACC story hear `It was (only) Radha who told the children a story' • Contrastive topic: mOmbattI=tO milI, lEkin abh mAchis gum gayE candle=Top found but now match lost go `The candle was found but now the matches are lost.'
Intonation • Most DFs have a specific intonation associated with them • Intonation alone can signal a DF Did you see Mary or John? I saw JOHN. It was a RED hat that I wore.
Combinations • Most positionally and marker-signaled DFs also have intonation marking. • Can combine position and marker • ay inversion in Tagalog (Kroeger 1993) ay marker as head of I SpecIP is Topic=Subj or Focus=non-Subj • Ni lapis ay hindi nagdala si=Rosa even pencil AY not bring nom=Rosa `Even a pencil Rosa didn't bring.'
LFG approaches • Syntax-DF interactions • F-structure vs. I-structure • OT-LFG
Syntax-DF interactions • Subcategorized DFs • Predicates can subcategorize for DFs. • C-structure annotations • C-structure nodes can be associated with DFs, similar to GF assignment in configurational languages.
Subcategorized DFs • Malay Topic (Alsagoff 1992) • verb affix identifies Topic and equates it with a GF • meng-: (↑TOP)=(↑SUBJ) • di-: (i) (↑TOP)=(↑SUBJ) (ii) <(↑SUBJ), (↑OBL)>, with SUBJ as the logical object and OBL as the logical subject • Ø-: (↑TOP)=(↑OBJ)
Malay example • Miriam MENG-cubit doktor itu Miriam MENG-pinch doctor the `Miriam pinched the doctor.' • MENG-cubit: (↑PRED)='pinch<(↑SUBJ), (↑OBJ)> (↑TOP)' • F-structure: [ PRED 'pinch<(↑SUBJ), (↑OBJ)> (↑TOP)', SUBJ [ PRED 'Miriam' ], TOP [ ], OBJ [ PRED 'doctor' ] ]
Chichewa and Tagalog topic • Trees: CP → NP C' and S → NP VP, with the NP annotated (↑TOPIC)=↓ in both • F-structure: [ TOPIC […], SUBJ [ PRED 'pro' ], PRED 'X<SUBJ,…>' ], with anaphoric binding between TOPIC and 'pro' • Chichewa: Bresnan & Mchombo 1987 • Tagalog: Kroeger 1993
Urdu preverbal focus • VP → XP V', with XP annotated (↑FOCUS)=↓ • Urdu: Butt & King 1996
C- to F-structure Mapping proposal • Clause-Prominence of DFs: DF adjuncts (i.e., in adjoined positions) must be clause-prominent, occurring either at an edge of the clause or adjacent to the head of the clause. (Bresnan 2001:192) • [Tree: XP with an adjoined YP labeled DF and an adjoined ZP labeled Adjunct]
Mapping proposal • Specifiers of functional categories are the grammatical discourse functions (Topic, Focus, Subj). (Bresnan 2001:102) • [Tree: FP → SpecFP F', with SpecFP labeled DF]
Intonation • Much work is done on this association • Steedman (2000) on Categorial Grammar • Less in LFG • Bengali and the syntax-prosody mapping (Butt and King 1998) • Russian clause-final focus (King 1995) • Integration of prosody into the LFG projection architecture needs more exploration.
Discourse markers • Constructive case/morphology approach (Sharma 2003) • hI: (FOC ↑) • [Tree: X(P) → X(P) Cl-disc, with hI under Cl-disc annotated (FOC ↑)]
F-structure vs. I-structure • DFs are often represented in the f-structure. • Malay subcategorizes for Topics • Chichewa incorporated pronouns • Scope of DFs may conflict with that of GFs. • project DFs into an I(nformation)-structure
DF-GF mismatches • VP focus: Mary [F ate the cake]. • F-structure: [ PRED 'eat<SUBJ,OBJ>', SUBJ [ PRED 'Mary' ], OBJ [ PRED 'cake' ], TNS past ] • How can the focus be represented? • Form I-structure constituents.
OT-LFG approaches • OT constraints for encoding of DFs (Choi 1999) • [New]-X: Place [+New] in a salient position X • [Prom]-X: Place [+Prom] in a salient position X • Languages • rank these constraints • define possible instantiations of X
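A minimal sketch of how ranked constraints of this kind pick a word order. It assumes that X is a single linear position (index 0, clause-initial), that candidates are all orderings of three constituents, and that a violation is counted for each marked constituent outside X. The class and function names and the candidate set are hypothetical; this illustrates ranked-constraint evaluation in general and is not Choi's analysis.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Constituent:
    form: str
    new: bool = False   # [+New]
    prom: bool = False  # [+Prom]


def violations(order, feature, x):
    """Count constituents carrying `feature` that are not in position x."""
    return sum(1 for i, c in enumerate(order) if getattr(c, feature) and i != x)


def optimal(candidates, ranking, x=0):
    """Return the candidate with the best violation profile under `ranking`."""
    def profile(order):
        # Tuples compare lexicographically, which models strict ranking.
        return tuple(violations(order, feature, x) for feature in ranking)
    return min(candidates, key=profile)


subject = Constituent("S", prom=True)
obj = Constituent("O", new=True)
verb = Constituent("V")
candidates = list(permutations([subject, obj, verb]))

prom_first = optimal(candidates, ranking=("prom", "new"))
new_first = optimal(candidates, ranking=("new", "prom"))
print([c.form for c in prom_first], [c.form for c in new_first])
```

Reranking the same two constraints changes which constituent wins position X. Ties between equally good candidates are broken by candidate order here, which is an artifact of the sketch.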
Summary: Syntax of DFs • DFs can be encoded by: • structural position • morphological markers • intonation • Linguistic theories need a way to capture these interactions • Much LFG work on structural position and morphological markers • Are Focus and Topic the only elements worth distinguishing? • Need more work on integrating generalizations about intonation • Need more work on how syntactic distinctions relate to semantic and pragmatic concepts
Form and function relation • A radical proposal: • Prince: the relation between syntax and pragmatics is as arbitrary as that between sound and word meaning • Cross-language variation: • e.g. functions of Left-dislocation in Yiddish and English are different (Prince) • Functions of clefting and topicalization are different across Germanic languages • Functions of Left-Dislocations (or Contrastive topicalization) and Right dislocations in Romance languages and in Germanic are different (see e.g. Lambrecht 1981 on Spoken French). • Not a one-to-one correspondence between form and function
Talk Outline • Information Structure: Syntax of discourse functions • Applications: Anaphora resolution • Discourse structure • Applications: Summarization and Sentence Condensation • Conclusions
Applications for Discourse Functions • Anaphora resolution • DFs determine saliency • Saliency partially determines resolution
Anaphora Resolution • Have a sentence with pronouns or referring NPs (the president) • Want to know what they refer to • some restrictions are purely syntactic: (most) reflexives refer to Subjects • others are heuristic: prefer closer referents; prefer high-saliency referents
Role of Discourse Functions • Topic, and topic shift, are relevant for anaphora • Centering theory and its variants • have an ordered list of salient elements • have a referring expression • first salient element to match features is the antecedent • update the list based on this
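The four steps listed above can be sketched as follows. This is a deliberately simplified illustration, not a full centering model: there are no backward- or forward-looking centers and no transition rankings. The `Entity` class, the feature names, and the move-to-front update rule are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    gender: str  # "f", "m", or "n"
    number: str = "sg"


def resolve(features, salience):
    """Return the first entity on the salience list that matches all features."""
    for entity in salience:
        if all(getattr(entity, key) == value for key, value in features.items()):
            return entity
    return None


def update(salience, mentioned):
    """Hypothetical update rule: entities just mentioned move to the front."""
    rest = [e for e in salience if e not in mentioned]
    return list(mentioned) + rest


brennan = Entity("Brennan", "f")
ar = Entity("AR", "n")
salience = [brennan, ar]  # state after "Brennan drives an AR."

# "She drives too fast."
she = resolve({"gender": "f", "number": "sg"}, salience)
salience = update(salience, [she])

# A later "it" skips Brennan (feature mismatch) and finds the AR.
it = resolve({"gender": "n"}, salience)
print(she.name, it.name)
```

Feature matching alone cannot separate two candidates with the same gender and number. In that case the ordering of the salience list, which is where discourse functions come in, decides the antecedent.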
Anaphora resolution example • Brennan drives an AR. (Brennan=Old, AR=New) • She drives too fast. (She=Brennan=Old) • Friedman races her on weekends. (Friedman=Old, Brennan=Old, Her=Brennan=Old) • She drives to Laguna Seca. (She=Friedman=Old) • She often beats her. (She=Friedman=Old, Her=Brennan=Old) • Discourse functions determine correct anaphora resolution.
Pro-Drop and Anaphora Resolution • Pro-drop is (partly) licensed by DFs • Already established topics are more likely to be pro-dropped • Centering theory: • Continue and Smooth-shift transitions favor null subjects • Chinese (Song 2003) • Yiddish (Prince 1998)
Summary: Anaphora resolution • DFs are essential for determining anaphora resolution • Pro-drop is licensed in part by IS • But a lot remains to be worked out.
Talk Outline • Information Structure: Syntax of discourse functions • Applications: Anaphora Resolution • Discourse structure • Applications: Summarization and Sentence Condensation • Conclusions
Discourse Structure • A simple model • Its relation to syntax
A too simple idea • [Tree: a discourse node D dominating a flat sequence of five S nodes]
Progression and elaboration • Joan got up early. She showered. Then she made some tea. … • Mary is a model professor. Last year she wrote ten papers. She also advised 20 doctoral students and she was a member of the Committee on Women in Science.
A still very simple idea • Discourse progresses sentence by sentence, or • Subparts elaborate on previous parts • [Tree: D dominating S nodes and an embedded D, which in turn dominates further S nodes]
One type of discourse tree (Linguistic Discourse Model) • Subordination (S → a b): John fell. Bill pushed him. • Coordination (C → a b): Bill pushed John. He fell. • a and b are BDUs (Basic Discourse Units) • A BDU basically corresponds to a segment with an event variable in its semantics.
BDU Relations • Not all types of relations can be classified as belonging to the subordinating or the coordinating type. • We will ignore the rest here. • Some elements in a sentence can explicitly indicate what type of relation we have, e.g. ‘because’ signals a subordination relation. • They will be called “operator segments.”
How do discourse trees relate to sentence syntax trees? • Some textual elements guide the discourse tree construction. • A BDU is not necessarily a complete sentence or vice versa.
Sentence does NOT equal BDU • [The man dove into the pool.]a [It was warm and soothing]b and [he decided to remain for a little longer than usual.]c • [Tree: C → S c, with S → a b]
ADJUNCT clauses • [Joan left]a because [she was tired.]b • Three segments: two BDUs and one operator • [Tree: S → a b]
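The segmentation of this example into two BDUs and one operator can be sketched as a small data structure. The class names, the one-item operator list, and the string-splitting strategy are assumptions for illustration. A real segmenter would work on a syntactic analysis such as an f-structure rather than on raw strings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BDU:
    label: str
    text: str


@dataclass
class DiscourseNode:
    relation: str              # "S" = subordination, "C" = coordination
    children: List[BDU]
    operator: Optional[str] = None


# Hypothetical operator list; only 'because' is taken from the example.
SUBORDINATING_OPERATORS = ("because",)


def segment(sentence):
    """Split a sentence at the first known operator into two BDUs.

    Returns a subordination node, or a single BDU if no operator is found.
    """
    for op in SUBORDINATING_OPERATORS:
        left, found, right = sentence.partition(f" {op} ")
        if found:
            bdus = [BDU("a", left.strip()), BDU("b", right.strip())]
            return DiscourseNode("S", bdus, operator=op)
    return BDU("a", sentence.strip())


tree = segment("Joan left because she was tired.")
n_segments = len(tree.children) + (1 if tree.operator else 0)
print(tree.relation, [b.text for b in tree.children], n_segments)
```

Keeping the operator on the node rather than inside either BDU mirrors the three-way split on the slide: the operator is a segment of its own and labels the relation between a and b.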
Textual elements that guide the construction of discourse trees • Hypothesis 1: Subordinating conjunctions indicate discourse subordination. • Needs checking: it is often true but is it always true?
Textual elements cont. • Hypothesis 2: tense and aspect • John dove into the pool. The water was warm and soothing. • John Smith was wearing a long coat. It looked brand new. • Stative predicates do not push the discourse forward and often indicate subordination. • English is not very rich in this type of indicator. • perfective/imperfective distinctions are more explicit in other languages (e.g. French). (e.g. Asher and Lascarides, 2003)
Textual elements cont. • Hypothesis 3: pronominalization • John Smith was wearing a long coat. It looked brand new. • Often the ‘promotion’ of (the referent of) an OBJ or an OBL to a SUBJ in the following sentence reflects a discourse subordination. (Polanyi et al. 2004) • But • John hit Bill. He fell. • The tense and aspect information takes precedence.
What is the role of Information Structure in the construction of Discourse trees? • [John Smith]T1 was wearing [a long coat]F1. [It]T2 looked brand new. Focus-1 --> Topic-2 • [John]T1 likes [sweets]F1. [He]T2 eats [three dishes of ice cream]F2 and [five chocolate bars]F2 every day. Topic-1 --> Topic-2 (cf. centering theory ‘shifts’) • In Discourse Structure both are subordinations