Information Structure and Prosody
LING 575
Ruth Morrison

Papers
• Prevost, 1996
  • Context-sensitive prosody in text generation using information structure
• Calhoun et al., 2005
  • Human annotation of information structure
• Hirschberg, 1990
  • Improving prosody in TTS using a variety of features

Information Structure: Newness
• New information
  • Discourse elements introduced during the utterance that cannot be inferred
• Mediated information
  • Discourse elements that were not mentioned previously but can be inferred
• Old information
  • Previously mentioned discourse elements

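
The three-way distinction above can be sketched as a small labeling routine. This is only an illustration of the definitions on this slide, not the annotation scheme of any of the papers: the function name `label_discourse` and the `inferable_from` table (which says what can be inferred from an already-mentioned element) are hypothetical.

```python
from enum import Enum


class Newness(Enum):
    NEW = "new"
    MEDIATED = "mediated"
    OLD = "old"


def label_discourse(elements, inferable_from):
    """Label each discourse element as new, mediated, or old.

    `inferable_from` is a hypothetical lookup table mapping a mentioned
    element to the set of elements a listener could infer from it.
    """
    mentioned = set()
    labels = []
    for element in elements:
        # Everything inferable from what has been mentioned so far.
        inferable = set()
        for seen in mentioned:
            inferable |= inferable_from.get(seen, set())

        if element in mentioned:
            label = Newness.OLD        # previously mentioned
        elif element in inferable:
            label = Newness.MEDIATED   # not mentioned, but inferable
        else:
            label = Newness.NEW        # neither mentioned nor inferable
        labels.append((element, label))
        mentioned.add(element)
    return labels


if __name__ == "__main__":
    for element, label in label_discourse(
        ["car", "engine", "car", "bicycle"], {"car": {"engine"}}
    ):
        print(element, label.value)
```

In this toy run, "engine" counts as mediated only because the assumed table says it is inferable from "car"; a real system would need a far richer notion of inference.
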
Information Structure: Theme and Rheme
• Theme
  • The part of the utterance which contains previous information and ties the utterance to previous discourse
• Rheme
  • The part of the utterance which contributes new information

Information Structure: Contrastive Focus or Kontrast
• Special focus placed on elements to contrast them with others in the previous or current phrase
• Occurs within theme and rheme
• Some disagreement about terminology and theory, according to Calhoun et al.

Some previously-used indicators of prosody (Hirschberg)
• “Accent is predictable (if you're a mind-reader)” (Bolinger, 1972), “mind-reading attempts continue”
• Newness (“attentional state”)
• Open versus closed class words
• Cue phrases (“now”, “well”, “by the way”)
• Syntactic information
• Lexical stresses of compounds
• “Discourse-based indicators of contrastiveness”

Prevost (1996): Overview
• System for describing items from a knowledge base with generated spoken language
• Need to incorporate information from theme/rheme and contrastive focus to generate adequate prosody

Theme/Rheme and Intonation
• Theme and rheme have prototypical tunes
  • Theme: L+H* LH%
  • Rheme: H* LL%
• Good enough for simple declarative sentences
• But where is the pitch accent? (*)

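
As a minimal sketch of how the prototypical tunes could be attached to text: the tune labels are the ones on this slide, while the function `annotate`, the example words, and the output layout are made up for illustration. The position of the pitch accent is passed in by hand as `focus_index`, so the slide's open question (where the accent goes) is deliberately left unanswered here.

```python
# Prototypical tunes from the slide: (pitch accent, boundary tone).
THEME_TUNE = ("L+H*", "LH%")
RHEME_TUNE = ("H*", "LL%")


def annotate(words, focus_index, tune):
    """Place the pitch accent after one word and the boundary tone at the end.

    `focus_index` is supplied by the caller; choosing it is the hard part
    and is not modeled in this sketch.
    """
    if not 0 <= focus_index < len(words):
        raise ValueError("focus_index must point at a word in the phrase")
    accent, boundary = tune
    marked = [
        f"{word} {accent}" if i == focus_index else word
        for i, word in enumerate(words)
    ]
    marked[-1] += f" {boundary}"
    return " ".join(marked)


if __name__ == "__main__":
    print(annotate(["the", "british", "amplifier"], 1, THEME_TUNE))
    print(annotate(["produces", "clean", "treble"], 1, RHEME_TUNE))
```
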
Generation
• Uses information about previous discourse context, previously given information
• Three phases: content generation, sentence planning, sentence realization
• Determines theme and rheme (in content generation), contrastive focus (in sentence planning)
• Surface form of sentence is computed from semantic form

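
The three phases can be pictured as a toy pipeline. This is a heavily simplified sketch and not Prevost's implementation: a flat word list stands in for the semantic form, the theme is assumed to be the leading run of previously mentioned words, and contrastive focus is assumed to fall wherever an alternative sentence from the context differs. All names (`SentencePlan`, `generate`, and the phase functions) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SentencePlan:
    words: list                                # toy stand-in for a semantic form
    theme_len: int = 0                         # words[:theme_len] is the theme
    focus: set = field(default_factory=set)    # indices with contrastive focus


def content_generation(words, mentioned):
    """Phase 1: split into theme (leading previously mentioned words) and rheme."""
    theme_len = 0
    for word in words:
        if word not in mentioned:
            break
        theme_len += 1
    return SentencePlan(words=list(words), theme_len=theme_len)


def sentence_planning(plan, alternatives):
    """Phase 2: mark contrastive focus where any alternative differs."""
    for alternative in alternatives:
        for i, (word, other) in enumerate(zip(plan.words, alternative)):
            if word != other:
                plan.focus.add(i)
    return plan


def sentence_realization(plan):
    """Phase 3: compute the surface string, upper-casing focused words."""
    shown = [w.upper() if i in plan.focus else w for i, w in enumerate(plan.words)]
    theme = " ".join(shown[:plan.theme_len])
    rheme = " ".join(shown[plan.theme_len:])
    return f"[theme: {theme}] [rheme: {rheme}]"


def generate(words, mentioned, alternatives):
    plan = content_generation(words, mentioned)
    plan = sentence_planning(plan, alternatives)
    return sentence_realization(plan)


if __name__ == "__main__":
    print(generate(
        ["the", "british", "amplifier", "produces", "clean", "treble"],
        mentioned={"the", "british", "amplifier"},
        alternatives=[["the", "american", "amplifier", "produces", "muddy", "treble"]],
    ))
```

Note that focus lands inside both the theme and the rheme in this example, which matches the earlier point that contrastive focus can occur in either.
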
Criticisms/Discussion
• Better suited to complex information-state dialog systems with limited domains than to general TTS as in Hirschberg
• Theory seems solid, but no quantitative evaluation is shown
• Is more human-like prosody really extrinsically better?