Information Structure and Prosody

LING 575, Ruth Morrison. Papers covered: Prevost (1996), context-sensitive prosody in text generation using information structure; Calhoun et al. (2005), human annotation of information structure; Hirschberg (1990), improving prosody in TTS using a variety of features.


Presentation Transcript


  1. LING 575 Ruth Morrison Information Structure and Prosody

  2. Papers • Prevost, 1996: Context-sensitive prosody in text generation using information structure • Calhoun et al., 2005: Human annotation of information structure • Hirschberg, 1990: Improving prosody in TTS using a variety of features

  3. Information Structure: Newness • New information: discourse elements introduced during the utterance that cannot be inferred • Mediated information: discourse elements that were not mentioned previously but can be inferred • Old information: previously mentioned discourse elements
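The three-way distinction on this slide can be read as a simple decision rule. The sketch below is only an illustration of that rule, not a method from any of the three papers: the function names and the `inferable_from` lookup table are hypothetical, and real annotation (as in Calhoun et al.) relies on human judgment rather than a fixed table.

```python
from enum import Enum


class Newness(Enum):
    OLD = "old"
    MEDIATED = "mediated"
    NEW = "new"


def classify_newness(entity, mentioned, inferable_from):
    """Label one discourse entity given the discourse so far.

    mentioned: set of entities already mentioned.
    inferable_from: hypothetical dict mapping an entity to the set of
        entities it can be inferred from (e.g. "door" from "house").
    """
    if entity in mentioned:
        return Newness.OLD  # previously mentioned
    if inferable_from.get(entity, set()) & mentioned:
        return Newness.MEDIATED  # not mentioned, but inferable
    return Newness.NEW  # neither mentioned nor inferable


def label_discourse(entities, inferable_from):
    """Walk through entities in order, labeling each and then recording it."""
    mentioned = set()
    labels = []
    for entity in entities:
        labels.append((entity, classify_newness(entity, mentioned, inferable_from)))
        mentioned.add(entity)
    return labels
```

For example, with the assumed table `{"door": {"house"}}`, the sequence house, door, house comes out as new, mediated, old.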

  4. Information Structure: Theme and Rheme • Theme: the part of the utterance which contains previous information and ties the utterance to previous discourse • Rheme: the part of the utterance which contributes new information

  5. Information Structure: Contrastive Focus or Kontrast • Special focus placed on elements to contrast them with others in the previous or current phrase • Occurs within theme and rheme • Some disagreement about terminology and theory, according to Calhoun et al.

  6. Example (from Prevost)

  7. Example (from Calhoun et al.)

  8. Some previously-used indicators of prosody (Hirschberg) • “Accent is predictable (if you're a mind-reader)” (Bolinger, 1972), “mind-reading attempts continue” • Newness (“attentional state”) • Open versus closed class words • Cue phrases (“now”, “well”, “by the way”) • Syntactic information • Lexical stresses of compounds • “Discourse-based indicators of contrastiveness”

  9. Prevost (1996): Overview • System for describing items from a knowledge base with generated spoken language • Need to incorporate information from theme/rheme and contrastive focus to generate adequate prosody

  10. Theme/Rheme and Intonation • Theme and rheme have prototypical tunes • Theme: L+H* LH% • Rheme: H* LL% • Good enough for simple declarative sentences • But where is the pitch accent? (*)
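As a rough illustration of how the prototypical tunes attach to a phrase, the sketch below places the pitch accent on one word and the boundary tone at the end of the phrase. It is a minimal sketch under stated assumptions, not Prevost's implementation: `TUNES`, `annotate_phrase`, and the bracket notation are hypothetical, and the position of the accented word is simply passed in as `focus_index`, which is exactly the question the slide leaves open.

```python
# Prototypical tunes from the slide: (pitch accent, boundary tone).
TUNES = {
    "theme": ("L+H*", "LH%"),
    "rheme": ("H*", "LL%"),
}


def annotate_phrase(words, role, focus_index):
    """Mark the word at focus_index with the pitch accent for this role
    and append the boundary tone after the last word.

    focus_index is an assumed input; choosing it is the hard part.
    """
    pitch_accent, boundary = TUNES[role]
    marked = [
        f"{word}[{pitch_accent}]" if i == focus_index else word
        for i, word in enumerate(words)
    ]
    return " ".join(marked) + f" {boundary}"
```

With invented example words, `annotate_phrase(["produces", "clean", "treble"], "rheme", 2)` yields `produces clean treble[H*] LL%`.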

  11. Finding contrastive focus

  12. Generation • Uses information about previous discourse context, previously given information • Three phases: content generation, sentence planning, sentence realization • Determines theme and rheme (in content generation), contrastive focus (in sentence planning) • Surface form of sentence is computed from semantic form

  13. Example Output

  14. Criticisms/Discussion • Better suited to complex information-state dialogue systems with limited domains than to general-purpose TTS as in Hirschberg • Theory seems solid, but no quantitative evaluation is shown • Is more human-like prosody really extrinsically better?
