A cross-linguistic study on perception of length contrast in Finnish and Japanese
January 7, 2010
84th Annual Meeting of the Linguistic Society of America
Kenji Yoshida, Kenneth de Jong (Department of Linguistics, Indiana University, Bloomington)
Pia-Maria Päiviö (Department of Slavic Languages and Literature, University of Toronto)
Acknowledgements
• Financial Support
  - International Scholarship Award, Finlandia Foundation
• The experiment in Helsinki
  - Reijo Aulanko & Marjut Mäenpää at Department of Speech Sciences, University of Helsinki
  - Seppo Kittilä at Department of Linguistics, University of Helsinki
• The experiment in Japan
  - Donna Erickson, Takuya Oomae at Showa University of Music
  - Yosuke Igarashi at University of Hiroshima
Research Interests
• Prosodic typology: Can languages with a quantity distinction differ in how they categorize quantity?
• Contextual effect in speech perception: How does language-specific knowledge about speech sounds affect quantity categorization?
Approach: examine speech perception in two "quantity languages" with different language-specific knowledge (Finnish & Japanese)
Finnish and Japanese seem to be similar in quantity contrast (Ham, 2001: 213)
[Figure: Geminate / Single]
Finnish and Japanese are different
• Word prosody
  - Finnish: Fixed stress on the initial syllable
  - Japanese: Lexical pitch accent associated with any mora in a word
• Temporal organization
  - Finnish: Tendency toward equal total duration of disyllabic feet (Suomi, 2005: 297)
  - Japanese: Tendency for words with the same number of moras to have about the same overall duration (Port et al., 1987: 1581)
• Acoustic cues for quantity other than duration
  - Finnish: F0 fall as a cue for “long” vowel (Järvikivi et al., 2007)
  - Both: Robust "covariants" for geminates (Idemaru & Guion, 2008; Doty et al., 2007)
Contextual variation in vowel duration in Finnish and Japanese
• Finnish: Vowel duration is conditioned by the structure of the word-initial syllable (Suomi, 2005, etc.)
  - When the word-initial syllable is CV, V2 is longer (the "half-long" vowel)
  - Half-long vowel = longer vowel in the 2nd σ (relative to CVC__ or CVV__)
• Japanese: Vowel duration is conditioned by the quantity of the following consonant (Ofuka et al., 2005); it is "anti-compensatory" with following consonants
  - Before single, … (C)V.CV …: Shorter
  - Before geminate, … (C)VC.CV …: Longer
Two hypotheses: The effect of language-specific phonetic knowledge
Contextual variation of vowel duration (half-long in FIN / anti-compensation in JPN) …
• Is cancelled out (perceptually compensated): "Cancellation"
  - A strong version of "Acoustic invariance" (e.g., Hirata & Whiton, 2005: "the duration of one part of an utterance will have a consistent relationship with the duration of another part of the same utterance, leading to a constant ratio")
• Takes effect (shifts categorical boundary): "Contextual effect"
  - E.g., Listener’s experience of durational covariance shifts the criterion of length categorization (Kingston et al., 2009)
  - Anti-compensatory vowel duration shifts the single/geminate boundary in Japanese (Ofuka et al., 2005)
Experiment: 2AFC (non)word identification (minimal pairs by p ~ pp)
• Talkers: one Finnish speaker and one Japanese speaker; silent intervals inserted (7 steps) to create the acoustic stimuli ('mata')
• Stimulus set (N = 84): 6 nonsense words (3 minimal pairs: p ~ pp); 2 talkers × 6 words × 7 steps = 84 stimulus types
• Listeners: Finnish (N = 22) and Japanese (N = 17); (non)word identification
• 84 stimulus types × 9 responses each / participant
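The stimulus arithmetic above can be checked with a short sketch. The six word forms are inferred from the words named elsewhere in the slides (matapana, matappana, and the man- and mana- contexts), and the file-name pattern generalizes the single example 'f_matapana_s2.wav'; the "j" prefix for the Japanese talker is a hypothetical label, not something the slides state.

```python
from itertools import product

# "f" appears in the example file name; "j" is an assumed label for the Japanese talker.
TALKERS = ("f", "j")

# Three minimal pairs (p ~ pp); the pp members of the man-/mana- pairs are inferred.
WORDS = (
    "matapana", "matappana",
    "mantapana", "mantappana",
    "manatapana", "manatappana",
)

STEPS = range(1, 8)          # 7 steps of inserted silence
RESPONSES_PER_STIMULUS = 9   # responses per stimulus type per participant


def stimulus_types():
    """Return one hypothetical file name per talker x word x step combination."""
    return [f"{t}_{w}_s{s}.wav" for t, w, s in product(TALKERS, WORDS, STEPS)]


stimuli = stimulus_types()
trials_per_participant = len(stimuli) * RESPONSES_PER_STIMULUS
print(len(stimuli), trials_per_participant)
```

Under these assumptions each participant contributes 84 × 9 identification responses.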
Design of the acoustic stimuli
• Three minimal pairs (nonsense for both languages)
(1) Effect of preceding vowel (ma- vs. man- for Finnish, p- vs. pp- for Japanese)
• Cancellation: No shift of category boundary for the L1 stimuli
• Contextual effect: Shift of category boundary for the L1 stimuli
  - L2? A contextual effect is expected only for Finnish listeners hearing Japanese stimuli, but in the opposite direction to Japanese because of anti-compensation in Japanese
(2) Effect of location within a word (ma- vs. mana-)
• Cancellation / Contextual effect: The same expectation for the mana- case as for the ma- case
• No location effect is expected (relevant only for Finnish)
Data Analysis: Slope and 50% threshold
• Logistic regression was performed for each speaker and each of the 6 original words (matapana, matappana…)
• % of pp identification is modelled as a function of x, where
  - x = stimulus number (1~7)
  - a = slope of the identification function
  - b = 50% threshold (p → pp)
[Figure: example identification functions (Fin01, J-talker); y-axis: proportion of geminate response; x-axis: duration of silent interval, short to long]
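A minimal sketch of how a slope and a 50% threshold can be estimated from identification counts. It assumes the two-parameter logistic form P(pp) = 1 / (1 + exp(-a(x - b))), which matches the definitions of a and b but is not spelled out on the slide. The response counts below are invented for illustration, and the hand-written Newton-Raphson fit stands in for whatever routine the authors used (the reference list mentions R).

```python
import math


def fit_logistic(steps, pp_counts, n_per_step, iterations=50, tol=1e-10):
    """Fit P(pp) = 1 / (1 + exp(-a * (x - b))) by Newton-Raphson.

    Returns (a, b): a is the slope and b the 50% threshold, in step units.
    Assumes the data are not perfectly separable.
    """
    b0, b1 = 0.0, 0.0  # intercept and slope of the linear predictor b0 + b1 * x
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k in zip(steps, pp_counts):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = k - n_per_step * p        # observed minus expected pp count
            w = n_per_step * p * (1.0 - p)    # binomial variance weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * d = g in closed form.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # a * (x - b) = b1 * x + b0, so a = b1 and b = -b0 / b1.
    return b1, -b0 / b1


# Hypothetical data: number of "pp" responses out of 9 at each of the 7 steps.
steps = [1, 2, 3, 4, 5, 6, 7]
pp_counts = [0, 0, 1, 4, 8, 9, 9]
a, b = fit_logistic(steps, pp_counts, n_per_step=9)
print(f"slope a = {a:.2f}, 50% threshold b = {b:.2f} (step units)")
```

A larger a means a sharper category boundary; a larger b means the listener needs a longer silent interval before reporting pp.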
Results 1.1: Finnish stimuli vs. Japanese stimuli (Finnish listeners)
[Figure: two panels, Slope and Threshold, each comparing Finnish stimuli with Japanese stimuli]
• Slope: Sharper slope for FIN stimuli: man- [t(43) = 6.14, p < .0001], mana- [t(43) = 3.74, p < .001]
• Threshold: No significant difference [ps > .013] (α = .0125)
Results 1.2: Finnish stimuli vs. Japanese stimuli (Japanese listeners)
[Figure: two panels, Slope and Threshold, each comparing Finnish stimuli with Japanese stimuli]
• Slope: No significant difference [ps > .033]
• Threshold: Later threshold for JPN stimuli: ma- [t(33) = –6.65, p < .0001], man- [t(33) = –2.70, p < .0107], mana- [t(33) = –3.97, p < .001]
Results 2.1: pp-original vs. p-original (Finnish listeners)
[Figure: two panels, Slope and Threshold, each comparing pp-original with p-original stimuli]
• Slope: No significant effect [ps > .122]
• Threshold: Earlier threshold for the stimuli created from the pp-original: FIN [t(65) = –5.06, p < .0001], JPN [t(65) = –3.84, p < .0001]
Results 2.2: pp-original vs. p-original (Japanese listeners)
[Figure: two panels, Slope and Threshold, each comparing pp-original with p-original stimuli]
• Slope: No significant effect [ps > .280]
• Threshold: Earlier threshold for the stimuli created from the pp-original: FIN [t(65) = –6.82, p < .0001], JPN [t(65) = –8.26, p < .0001]
Results 3.1: ma- (CV) vs. man- (CVC) (Finnish listeners)
[Figure: two panels, Slope and Threshold, each comparing man- with ma-]
• Slope: Sharper slope for ma- for the Japanese stimuli: FIN [t(43) = 1.16, p = .253], JPN [t(43) = –3.25, p = .002]
• Threshold: Later threshold for ma- for the Finnish stimuli: FIN [t(43) = –3.90, p < .0001], JPN [t(43) = –0.91, p = .370]
Results 3.2: ma- (CV) vs. man- (CVC) (Japanese listeners)
[Figure: two panels, Slope and Threshold, each comparing man- with ma-]
• Slope: No significant effect [ps > .099]
• Threshold: No significant effect [ps > .075]
Results 4.1: ma- (CV on 3rd σ) vs. mana- (CV on 4th σ) (Finnish listeners)
[Figure: two panels, Slope and Threshold, each comparing mana- with ma-]
• Slope: No significant effect
• Threshold: Later threshold for ma- for the Finnish stimuli: FIN [t(43) = –3.40, p < .01], JPN [t(43) = –2.08, p = .044]
Results 4.2: ma- (CV on 3rd σ) vs. mana- (CV on 4th σ) (Japanese listeners)
[Figure: two panels, Slope and Threshold, each comparing mana- with ma-]
• Slope: No significant effect [ps > .346]
• Threshold: No significant effect [ps > .072]
Summary of the effects (Finnish listeners)
[Slope]
(1) Easier to categorize FIN stimuli
(2) Easier to categorize JPN stimuli in the ma- condition compared to the man- condition
[Threshold]
(3) The acoustic cues in the original signal are useful (even for JPN stimuli, no observable effect of anti-compensation)
(4) Start hearing geminate later when
  - (a) the first syllable is CV and
  - (b) the target syllable is in the 3rd syllable
(Restricted) Contextual effect
Summary of the effects (Japanese listeners)
[Slope]
No significant effect
[Threshold]
(1) Start hearing geminate earlier for the Finnish talker
(2) The acoustic cues in the original signal are useful (even for FIN stimuli)
Replicate the contextual effect (Ofuka et al., 2005)
The pattern of threshold shift when Finnish listeners hear Finnish stimuli (restricted contextual effect)
[Figure: syllable structure (σ1, σ2) of mantapana, matapana, and manatapana, marking the longer (half-long) vowel and the target p; annotation: p has to be longer to be identified as pp]
• Despite the longer preceding vowel, the threshold shifted later only for the ma- case.
• The threshold shift may not be explained directly by the "listener’s experience of durational covariance" of the preceding vowel (Kingston et al., 2009), nor by the pattern of secondary stress (Karvonen, 2005)
A possible reason for the threshold shifts: relevance of the moraic structure of words
[Figure: mora structure (μ1 to μ5) of mantapana, matapana, and manatapana (from Suomi et al., 2003: 128)]
• The target 'p(~pp)' is at the third mora only for the ma- case
• The initial two morae have been argued to be the segmental domain of durational realization of stress and F0 realization of accent (Suomi, 2005: 304).
• Should consonants be acoustically longer to be perceived as geminate at the initial position of the second bi-moraic unit?
• An example of domain-initial strengthening (Cho & Keating, 2001)
Conclusions 1: Prosodic typology
• Length perception by Finnish and Japanese listeners is quite different
• The only common effect for Finnish and Japanese listeners is the original source effect (threshold: pp-original < p-original)
  - Despite the different prosodic types of Finnish and Japanese, some acoustic covariates of the single/geminate distinction are shared
• Finnish stimuli are more likely to be heard as "geminates" by Japanese listeners (than Japanese stimuli)
  - Language-specificity in acoustic covariates of the single/geminate contrast (Doty et al., 2007: 2740)
Conclusions 2: Contextual effect
• The effect of contextual variation in vowel duration is not totally offset (cancelled out) by language-specific phonetic knowledge
  - Contextual effect for both FIN and JPN
• The effect of non-local context at a more abstract level (word morphological structure) can override that of the local, acoustic contextual effect
  - Relevance of moraic structure (FIN)
• When the contextual effect is taken into consideration, the differences among quantity languages may be further elucidated, and eventually the range of possible quantity contrasts in speech may become clearer
The end THANK YOU
References 1
Aoyama, K. (2001). "A psycholinguistic perspective on Finnish and Japanese prosody." Dordrecht: Kluwer Publishers.
Boersma, P. & Weenink, D. (2009). "Praat: doing phonetics by computer. (Version 5.1.08)." Retrieved June 20, 2009, from http://www.praat.org/
Cho, T. & Keating, P. (2001). "Articulatory and acoustic studies on domain-initial strengthening in Korean," Journal of Phonetics, 29, 155-190.
Doty, S. C., Idemaru, K. & Guion, S. (2007). "Singleton and geminate stop in Finnish – acoustic correlates," Proceedings of the 8th Annual Conference of the International Speech Communication Association. Antwerp, Belgium, pp. 2737-2740.
Forster, K. & Forster, J. (2003). "DMDX: A Windows display program with millisecond accuracy," Behavior Research Methods, Instruments, & Computers, 35(1), 116-124.
Ham, W. H. (2001). "Phonetic and phonological aspects of geminate timing." New York: Routledge.
Hirata, Y. & Whiton, J. (2005). "Effects of speaking rate on the single/geminate stop distinction in Japanese," Journal of the Acoustical Society of America, 118(3), 1647-1660.
References 2
Idemaru, K. & Guion, S. (2008). "Acoustic covariants of length contrast in Japanese stops," Journal of the International Phonetic Association, 38(2), 167-186.
Järvikivi, J., Aalto, D., Aulanko, R. & Vainio, M. (2007). "Perception of vowel length: tonality cues categorization even in a quantity language," Proceedings of 16th ICPhS, Saarbrücken, pp. 693-696.
Karvonen, D. (2005). "Word Prosody in Finnish," Doctoral dissertation, UC Santa Cruz.
Kingston, J., Kawahara, S., Chambless, D., Mash, D. & Brenner-Alsop, E. (2009). "Contextual effects on the perception of duration," Journal of Phonetics, 37, 297-320.
Lehtonen, J. (1970). "Aspects of Quantity in Standard Finnish," Jyväskylä: Jyväskylä University Press.
Port, R., Dalby, J. & O'Dell, M. (1987). "Evidence for mora timing in Japanese," Journal of the Acoustical Society of America, 81(5), 1574-1585.
R Development Core Team (2009). "R: A language and environment for statistical computing," R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org.
References 3
Ofuka, E., Mori, Y. & Kiritani, S. (2005). "Perception of Japanese geminate stop: the effect of the duration of the preceding / following vowels," Journal of the Phonetic Society of Japan, 9(2), 59-65. (in Japanese)
Suomi, K. (2005). "Temporal conspiracies for a tonal end: Segmental durations and accentual f0 movement in a quantity language," Journal of Phonetics, 33, 291-309.
Suomi, K., Toivonen, J. & Ylitalo, R. (2003). "Durational and tonal correlates of accent in Finnish," Journal of Phonetics, 31, 113-138.
Some difference has been found between Finnish and Japanese in production (Aoyama, 2001)
• /hana/ vs. /hanna/ spoken in isolation
• Examined the proportion of the nasal against the total word duration (excluding the initial /h/)
• "the distinction between single and geminate nasals appears to be acoustically clearer in Finnish than in Japanese" (p.42).
• But this may be due to the effect of anti-compensatory variation of vowel duration (slide #6)
No clear difference has been found between Finnish and Japanese in perception (Aoyama, 2001)
• Finnish listeners heard the stimuli created from a Finnish word (hanna); Japanese listeners heard a Japanese word (hanna)
• "Finnish speakers have a narrower bandwidth of categorical boundary" (Aoyama, 2001: 63)
• Slope: FIN = 1.55, JPN = 1.39
• Threshold: FIN = 105.7 (ms.), JPN = 106.8
Bandwidth = the region of ambiguous responses (20 – 80 % “long”)
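To make the link between slope and bandwidth concrete, here is a small sketch that assumes a logistic identification function P(long) = 1 / (1 + exp(-a(x - b))). The slides do not say how Aoyama (2001) parameterized slope or in what units, so the function below only illustrates the general relationship: solving for the 20% and 80% points gives a width of 2·ln(4)/a, so a steeper slope yields a narrower ambiguous region.

```python
import math


def bandwidth_20_80(slope):
    """Width of the 20%-80% "long" response region for a logistic with this slope.

    P = 0.8 when a * (x - b) = ln(4) and P = 0.2 when a * (x - b) = -ln(4),
    so the two points are 2 * ln(4) / a apart, in the units of x.
    """
    return 2.0 * math.log(4.0) / slope


# Under the assumed parameterization, the steeper of two slopes gives the narrower band.
print(bandwidth_20_80(1.55) < bandwidth_20_80(1.39))
```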
Examples of the acoustic stimuli (insertion of silence intervals)
• matapana original (FIN): p = 75 msec.
• Stimulus 'f_matapana_s2.wav'
• matappana original (FIN): pp = 148 msec.
• Stimulus 'f_matapana_s6.wav'
Variation in vowel duration in the acoustic stimuli
[Figure: duration of the vowel before p~pp and the vowel after p~pp, for Finnish and Japanese. Annotations: half-long, longer vowel after p (Finnish); longer vowel before pp (Japanese)]
Participants
• Speakers (provide acoustic stimuli)
  - Finnish (female, 28, Imatra)
  - Japanese (female, 32, Kawasaki)
• Listeners (provide identification judgments)
  - 22 native speakers of Finnish (Age: 20 ~ 58, Median = 31.2)
  - 17 native speakers of Japanese (Age: 19 ~ 30, Median = 21.0)
Data Analysis: Correction of the parameters
• The range of silence intervals of the acoustic stimuli
  - Varies between the min. and max. of the original words (mean of 6 tokens)
  - Differs across talkers / minimal pairs (matapana~matappana: JPN 69.1~140.9, FIN 75.3~148.1)
• Slopes and thresholds are corrected for the raw duration
• Extremely large or small threshold values were truncated (as they are not reliable estimates, but exert a strong influence on statistical tests): negative values → 50 msec.; values above 200 → 200 msec.
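The correction can be sketched as follows. The slides give the endpoints of each continuum and the truncation limits but not the spacing of the 7 steps, so equal spacing between the minimum and maximum is an assumption here, and the function names are illustrative only.

```python
def step_to_ms(step, min_ms, max_ms, n_steps=7):
    """Convert a (possibly fractional) step value to msec., assuming equal spacing."""
    step_size = (max_ms - min_ms) / (n_steps - 1)
    return min_ms + (step - 1) * step_size


def slope_per_ms(slope_per_step, min_ms, max_ms, n_steps=7):
    """Rescale a slope from per-step units to per-msec. units (same assumption)."""
    return slope_per_step * (n_steps - 1) / (max_ms - min_ms)


def truncate_threshold(threshold_ms):
    """Apply the stated rule: negative values -> 50 msec., values above 200 -> 200 msec."""
    if threshold_ms < 0:
        return 50.0
    if threshold_ms > 200:
        return 200.0
    return threshold_ms


# Finnish matapana~matappana continuum from the slide: 75.3 to 148.1 msec.
FIN_MIN, FIN_MAX = 75.3, 148.1
print(step_to_ms(1, FIN_MIN, FIN_MAX), step_to_ms(7, FIN_MIN, FIN_MAX))
print(truncate_threshold(-12.0), truncate_threshold(230.0), truncate_threshold(120.0))
```

Values between 0 and 50 msec. are left unchanged because the slide only mentions negative values and values above 200.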