This article explores the challenges of representing meaningful speech variation, such as pitch range, loudness, and speech rate. Different schemes and models for capturing intonational variation are discussed, including linear and superpositional models. The advantages and disadvantages of these models are also examined.
Representing Intonational Variation Julia Hirschberg CS 4706
Today • How can we represent meaningful speech variation s.t. we can communicate this to others? • Expanded vs. compressed pitch range? • Louder vs. softer speech? • Faster vs. slower speech? • Differences in intonational prominence? • Differences in intonational phrasing? • Differences in pitch contours?
Schemes for Representing Intonational Variation • An early proposal: Joshua Steele • Language Learning Approaches • / IS it INteresting / • / d’you feel ANGry? / • / WHAT’S the PROBlem? / (McCarthy, 1991:106) • How can we capture all and only the meaningful intonational variation for a given language unambiguously?
Intonation Models • No commonly agreed upon model for one language, let alone all • Researchers work in different traditions and focus on different aspects of intonation • Different models may arise from different types of data • Auditory • Acoustic • Perceptual • …
Intonation Models • Auditory: • ESL-orientated; empirical data scarce; even trained listeners do not always agree on what they hear • Acoustic: • Distinction between linguistically relevant and irrelevant details in acoustic signal • Perceptual approach • Experimental data, often w/ manipulated f0 • Hard to design experiments with naïve listeners which give adequate control over parameters used in making decisions
Intonation models • Basic division into linear and superpositional models • Linear models: intonation involves a succession of individual choices from an intonation lexicon • Superpositional models: the intonation of an utterance involves a combination of local and utterance-sized components • Speakers may combine aspects of linear and superpositional models in the production of intonation
Intonation Models • Linear or tone sequence models • British school (Kingdon ’58, O’Connor & Arnold ’73, Cruttenden ’97): based on auditory analysis • American School (Pierrehumbert ’80, ToBI): mainly acoustic analysis • Dutch school (‘t Hart, Collier and Cohen 1990): perceptual data • Superpositional models (Fujisaki 1983, Möbius et al. 1993): acoustic/physiological
Superpositional models • Pitch pattern of intonation modeled with two components: a phrase component and an accent component • Phrase has a basic shape, and pitch movements for individual accents are superimposed on that shape • [Figure: phrase component plus accent component = contour for “Apples, oranges and tomatoes”]
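The superposition idea can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any cited system: the response shapes follow the commonly cited Fujisaki-style formulation (components summed in the log-f0 domain), and every number here (base frequency, `alpha`, `beta`, `ceiling`, onset times, amplitudes) is an illustrative assumption.

```python
import math

def phrase_component(t, onset, amplitude, alpha=2.0):
    """Slow rise-and-decay phrase shape starting at `onset` (seconds)."""
    x = t - onset
    if x < 0:
        return 0.0
    return amplitude * alpha ** 2 * x * math.exp(-alpha * x)

def accent_step(x, beta=20.0, ceiling=0.9):
    """Smoothed step response; two of these bound one accent hump."""
    if x < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * x) * math.exp(-beta * x), ceiling)

def accent_component(t, start, end, amplitude):
    """Local hump between `start` and `end`, laid over the phrase shape."""
    return amplitude * (accent_step(t - start) - accent_step(t - end))

def f0_contour(times, base_hz, phrases, accents):
    """Add phrase and accent components to the base value in log-f0."""
    contour = []
    for t in times:
        log_f0 = math.log(base_hz)
        log_f0 += sum(phrase_component(t, on, amp) for on, amp in phrases)
        log_f0 += sum(accent_component(t, s, e, amp) for s, e, amp in accents)
        contour.append(math.exp(log_f0))
    return contour

# Assumed values: one phrase command and three accent commands,
# roughly one per item in a three-item list.
times = [i * 0.01 for i in range(250)]
phrases = [(0.0, 0.5)]
accents = [(0.2, 0.5, 0.4), (0.9, 1.2, 0.3), (1.7, 2.1, 0.3)]

phrase_only = f0_contour(times, 100.0, phrases, [])
with_accents = f0_contour(times, 100.0, phrases, accents)
```

Comparing `phrase_only` with `with_accents` shows the division of labor: the phrase component alone gives the utterance-sized shape, and the accent commands add local humps on top of it.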
Good for modeling declination • Declination: downtrend in f0 over the course of an utterance • Best seen as statistical abstraction: if one takes f0 measurements from enough utterances, over time, a downtrend in f0 will emerge • Example utterances: “Lily and Rosa thought this was divine. Prince William was gorgeous and he was looking for a bride. They dreamed of wedding bells.”
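One way to make the "statistical abstraction" concrete is to pool f0 measurements from several utterances and fit a straight line; a negative slope is the downtrend. The sketch below uses an ordinary least-squares fit, and the measurements are made up purely for illustration.

```python
def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical (time in s, f0 in Hz) measurements from three utterances.
# Individual utterances rise and fall locally; the trend only shows
# up once the points are pooled.
utterances = [
    [(0.2, 210), (0.8, 230), (1.4, 195), (2.0, 180)],
    [(0.1, 190), (0.7, 205), (1.5, 200), (2.2, 170)],
    [(0.3, 220), (0.9, 200), (1.6, 210), (2.4, 175)],
]
pooled = [point for utterance in utterances for point in utterance]
slope, intercept = fit_line(pooled)  # slope is in Hz per second
```

A straight line is only the simplest possible choice here; the same pooling idea works with any trend model.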
Superpositional models • Advantages • Good at modeling declination in intonation languages • Successful in speech synthesis for languages like Japanese (little variation in accent type, e.g.) • Capture prosodic structure in languages which have both tone and intonation (e.g. Mandarin) • Disadvantages • All contours must be modeled with an accent and a phrase component • Many Standard American English (SAE) contours cannot be captured easily
Intonation contours cannot be modeled as sequences of prosodic events • No account of different accent types, or variations in phrase endings • No notation system which allows users to share observations from large speech corpora or to compare contours • A method primarily for synthesis, analysis of speech production
Tone sequence models • General assumption: intonation is generated from sequences of (possibly) categorically different and phonologically distinctive accents • Two types of models within the group of tone sequence models: Type 1: Intonation made up of sequences of pitch movements Type 2: Intonation made up of sequences of pitch levels or targets
Two types of tone-sequence model • Type 1: based on pitch movements (the British School, the Dutch School) • Type 2: based on pitch levels or targets, H and L (the American School)
Tone Sequence Models • Overall shape of the intonation phrase is not a component of the models • Model is a succession of independent accent and boundary tone choices from an intonation lexicon • Do not model phrase-level phenomena (e.g. declination, pitch range, nuclear accent)
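A rough sketch of a level-based (Type 2) tone sequence: each choice is an independent H or L target, and the contour is whatever connects successive targets. The two-tone inventory, the Hz values, and the use of linear interpolation are all assumptions made for illustration; none of them come from a particular school's analysis.

```python
# Hypothetical two-level tone inventory mapped to f0 values in Hz.
TONE_HZ = {"H": 220.0, "L": 140.0}

def targets_to_contour(targets, step=0.05):
    """Linearly interpolate f0 between successive (time, tone) targets."""
    contour = []
    for (t0, tone0), (t1, tone1) in zip(targets, targets[1:]):
        f0_start, f0_end = TONE_HZ[tone0], TONE_HZ[tone1]
        n = max(1, round((t1 - t0) / step))  # samples in this stretch
        for i in range(n):
            frac = i / n
            contour.append((t0 + frac * (t1 - t0),
                            f0_start + frac * (f0_end - f0_start)))
    last_time, last_tone = targets[-1]
    contour.append((last_time, TONE_HZ[last_tone]))
    return contour

# Each entry is one independent choice from the inventory.
sequence = [(0.0, "L"), (0.4, "H"), (1.0, "L"), (1.2, "H")]
contour = targets_to_contour(sequence)
```

Because every H maps to the same value wherever it occurs, the sketch has no declination or pitch-range variation, which mirrors the point that phrase-level phenomena are not part of these models.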
The British School • Tone sequence model and pitch movement analysis (e.g. falling vs. rising intonation) • Auditory model: teaching English as a second language • O’Connor and Arnold 1972: • Earliest textbook for English instruction that tells the user which contour is appropriate in which context • No empirical evidence • British school analyses applied to English, German, Dutch, French, …
Concepts in the British School • Basic unit of intonational description: intonation phrase (tone unit) • Delimited by pauses, phrase-final lengthening, pitch movement • Syllables within a tone unit can be stressed or accented • telephone • Accented syllables are stressed and pitch prominent
Accent • Stressed syllable: has a full vowel and is perceived as involving a rhythmic beat • Pitch prominence: • syllable produced with moving pitch, or • syllable part of a pitch jump from a preceding syllable or onto a following syllable, or • syllable at a point in the utterance where the direction of pitch movement changes (e.g. from rising to falling)
Pitch Prominence • Syllable produced with moving pitch • Syllable part of a pitch jump from a preceding syllable or onto a following syllable • Syllable at a point in utterance where direction of pitch movement changes • [Figure: pitch contours over “the girl in the garden” illustrating the three cases]
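The three criteria can be read as a simple checklist over per-syllable f0 values. The sketch below assumes each syllable is summarized by a hypothetical (start Hz, end Hz) pair, and the thresholds `move_hz`, `jump_hz`, and `turn_hz` are arbitrary illustrative choices; the criteria themselves are auditory, so any numeric cutoff is an approximation.

```python
def midpoint(syllable):
    """Mean of a syllable's start and end f0."""
    start_hz, end_hz = syllable
    return (start_hz + end_hz) / 2.0

def is_pitch_prominent(syllables, i, move_hz=10.0, jump_hz=15.0, turn_hz=3.0):
    """Apply the three pitch-prominence criteria to syllable `i`."""
    start_hz, end_hz = syllables[i]
    # 1. Moving pitch within the syllable.
    if abs(end_hz - start_hz) >= move_hz:
        return True
    # 2. Part of a jump from the preceding or onto the following syllable.
    if i > 0 and abs(start_hz - syllables[i - 1][1]) >= jump_hz:
        return True
    if i + 1 < len(syllables) and abs(syllables[i + 1][0] - end_hz) >= jump_hz:
        return True
    # 3. Turning point: the pitch trend changes direction here.
    if 0 < i < len(syllables) - 1:
        before = midpoint(syllables[i]) - midpoint(syllables[i - 1])
        after = midpoint(syllables[i + 1]) - midpoint(syllables[i])
        if before * after < 0 and min(abs(before), abs(after)) >= turn_hz:
            return True
    return False

# Made-up values: a gentle stepwise rise up to "gar", then a fall.
labels = ["the", "girl", "in", "the", "gar", "den"]
f0 = [(120.0, 120.0), (125.0, 125.0), (130.0, 130.0),
      (135.0, 135.0), (140.0, 140.0), (135.0, 135.0)]
prominent = [labels[i] for i in range(len(f0)) if is_pitch_prominent(f0, i)]
```

With these values only "gar" qualifies, and it does so through the third criterion alone. Note that under a literal reading of the second criterion, both syllables on either side of a jump count as part of it, which is how the sketch treats them.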
An example • There’s a POINT where you have to CLEAN it and I think it’s HOrrible...
Intonation Phrase Structure • Intonational phrases have an internal structure • Structure determined by location of accents in an IP • Each accent defines the beginning of a prosodic constituent
Intonation phrase structure • Two types of accent unit in the British School: • Prenuclear accent units; also called the Head • Nuclear accent units; also called the Nucleus • The nuclear accent unit is the last accent unit in the IP • The head comprises all prenuclear accent units
Intonation phrase structure • Example: But JOHN’s never BEEN to Jamaica • [Figure: the example divided into prehead, prenuclear accent unit (‘Head’) and nuclear accent unit (‘Nucleus’), with stressed syllables marked]
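The structure described on these slides can be sketched as a small parsing step: each accent opens a new accent unit, the last unit is the nucleus, the earlier units form the head, and anything before the first accent is the prehead. This sketch works at the word level rather than the syllable level, takes the accent marks as given input, and assumes every intonation phrase has at least one accent; all three are simplifications.

```python
def parse_intonation_phrase(words):
    """Split (word, accented) pairs into prehead, head, and nucleus."""
    accent_positions = [i for i, (_, accented) in enumerate(words) if accented]
    if not accent_positions:
        # Assumption of this sketch: no accent means no nucleus to find.
        raise ValueError("expected at least one accented word")
    prehead = [word for word, _ in words[:accent_positions[0]]]
    # Each accent unit runs from one accent up to the next accent.
    boundaries = accent_positions + [len(words)]
    units = [[word for word, _ in words[start:end]]
             for start, end in zip(boundaries, boundaries[1:])]
    return {"prehead": prehead, "head": units[:-1], "nucleus": units[-1]}

# Accent marks follow the capitalization in the slide's example.
example = [("But", False), ("JOHN's", True), ("never", False),
           ("BEEN", True), ("to", False), ("Jamaica", False)]
structure = parse_intonation_phrase(example)
# structure["prehead"] is ["But"]
# structure["head"] is [["JOHN's", "never"]]
# structure["nucleus"] is ["BEEN", "to", "Jamaica"]
```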
Six nuclear choices in English • Falling • Rising • Rising-falling • Falling-rising • Level • Rising-falling-rising • [Figure: each contour drawn over the word “Jamaica”]
Strengths and Weaknesses • How are accents, prominence defined? How are they related to segments? Too many options…. • Are prenuclear accents qualitatively different from nuclear accents? What is the evidence? • Does each pitch accent begin a new ‘prosodic unit’ in the phrase? What is the evidence?
Next Class • The American School and Laboratory Phonology • ToBI • Read the ToBI conventions • Listen to the ToBI training data or cardinal examples • Bring your laptop and headphones to class