Explore a new approach based on sonority to automatically classify linguistic rhythm types. Discover the statistical models and evidence supporting this methodology in language discrimination studies.
Sonority as a Basis for Rhythmic Class Discrimination Antonio Galves, USP. Jesus Garcia, USP. Denise Duarte, USP and UFGo. Charlotte Galves, UNICAMP.
What we do • Our goal: a new approach to the problem of finding acoustic correlates of the rhythmic classes. • Main ingredient: a rough measure of sonority defined directly from the spectrogram of the signal. • Major advantage: can be implemented in an entirely automatic way, with no need of previous hand-labelling of the acoustic signal.
Our main result Applied to the same linguistic samples considered in RNM (Ramus, Nespor and Mehler 1999), our approach produces the same clusters, corresponding to the three conjectured rhythmic classes.
RNM revisited Striking features • Linear correlation between ΔC and %V (-0.93). • Clustering into three groups.
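The two RNM quantities named above can be illustrated with a minimal sketch. It assumes the usual reading of %V as the share of total duration taken up by vocalic intervals and of ΔC as the standard deviation of consonantal interval durations. The function name, the input layout, the choice of the population standard deviation and the toy durations are all assumptions made for illustration; none of them comes from RNM.

```python
from statistics import pstdev

def percent_v_and_delta_c(intervals):
    """Compute (%V, deltaC) from hand-labelled intervals.

    intervals: list of (label, duration_in_seconds), label is "V" or "C".
    Hypothetical input layout, chosen for this sketch only.
    """
    vowels = [d for label, d in intervals if label == "V"]
    consonants = [d for label, d in intervals if label == "C"]
    total = sum(vowels) + sum(consonants)
    percent_v = 100.0 * sum(vowels) / total
    # Population standard deviation is an assumption of this sketch.
    delta_c = pstdev(consonants)
    return percent_v, delta_c

# Made-up durations for illustration only (not data from RNM).
toy_sentence = [("C", 0.08), ("V", 0.10), ("C", 0.12), ("V", 0.06), ("C", 0.04)]
percent_v, delta_c = percent_v_and_delta_c(toy_sentence)
```

Both quantities require a prior segmentation into vocalic and consonantal intervals, which is the hand-labelling step the sonority approach is meant to avoid.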
A parametric probabilistic model for RNM Duarte et al. (2001) propose a parametric family of probability distributions that closely fit the data in RNM. This has two advantages: • It provides deeper insight into the phenomena. • It makes it possible to perform statistical inference, i.e. to extend results from the sample (data set) to the population (the set of all potential sentences).
The probabilistic model • The durations of the successive consonantal intervals are independent and identically distributed random variables. • The duration of each consonantal interval is distributed according to a Gamma distribution. • Different languages have Gamma distributions with different standard deviations. • The standard deviation is constant for all languages belonging to the same rhythmic class. • The standard deviations of different classes are different.
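The model assumptions listed above can be made concrete with a small simulation. The sketch below draws independent Gamma-distributed durations and recovers the standard deviation with method-of-moments estimates. The shape and scale values are hypothetical, and method of moments is used here only for simplicity; the estimation procedure of Duarte et al. (2001) is not described in this text and may differ.

```python
import random
from statistics import mean, pstdev

def simulate_consonantal_intervals(shape, scale, n, seed=0):
    """Draw n i.i.d. Gamma(shape, scale) durations (in seconds)."""
    rng = random.Random(seed)
    return [rng.gammavariate(shape, scale) for _ in range(n)]

def gamma_moment_estimates(durations):
    """Method-of-moments estimates of (shape, scale, standard deviation).

    For a Gamma distribution, mean = shape * scale and
    variance = shape * scale**2, which gives the two formulas below.
    """
    m = mean(durations)
    sd = pstdev(durations)
    shape = (m / sd) ** 2
    scale = sd * sd / m
    return shape, scale, sd

# Hypothetical parameters, not values reported for any language.
sample = simulate_consonantal_intervals(shape=4.0, scale=0.02, n=20000)
est_shape, est_scale, est_sd = gamma_moment_estimates(sample)
```

Under the model, languages of the same rhythmic class would share the value estimated by `est_sd`, while their other parameters may differ.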
Statistical evidence for the clustering • The model enables testing the hypothesis that the eight languages are clustered in three groups. • The hypothesis that the standard deviations of the Gamma distributions are constant within classes and differ among classes is compatible with the data presented in RNM.
Estimated standard deviations of the Gamma distribution for the consonantal intervals
Problems for RNM (1) • RNM is based on a hand-labelled segmentation, which is time-consuming and depends on decisions that are difficult to reproduce in a homogeneous way. • This is a problem for linguists.
Problems for RNM (2) • Newborn babies discriminate rhythmic groups from signal filtered at 400 Hz (Mehler et al. 1996). At this frequency, it is impossible to fully discriminate consonants and vowels. • ΔC depends on a complex computation. • This is a problem for babies!
Sonority as a basis for rhythmic class discrimination • The results of Mehler et al. (1996) strongly suggest that the discrimination of rhythmic classes by babies relies not on a fine-grained distinction between vowels and consonants, but on a coarse-grained perception of sonority in opposition to obstruency. • A natural conjecture is that the identification of rhythmic classes must be possible using a rough measure of sonority.
A rough measure of sonority Goal: to define a function that maps local windows of the signal onto the interval [0,1]. This function should assign • values close to 1 to spans displaying regular patterns, characteristic of the sonorant regions of the signal, • values close to 0 to regions characterized by high obstruency.
Technical specifications • The function s(t) is based on the spectrogram of the signal. • Values of the spectrogram are estimated with a 25 ms Gaussian window. • The step unit of the function is 2 ms. • Computations are made with Praat (http://www.praat.org).
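The windowing parameters above can be illustrated outside Praat. The following is a minimal pure-Python sketch of a power spectrogram with a 25 ms Gaussian window and a 2 ms step. It is not Praat's implementation: the window width parameter (`sigma = n / 6`), the naive DFT, the function names and the 400 Hz test tone are all assumptions made for illustration.

```python
import cmath
import math

def gaussian_window(n):
    """Gaussian taper of n samples; sigma = n / 6 is an assumption."""
    mid = (n - 1) / 2.0
    sigma = n / 6.0
    return [math.exp(-0.5 * ((i - mid) / sigma) ** 2) for i in range(n)]

def power_spectrogram(signal, sample_rate, window_s=0.025, step_s=0.002):
    """Return a list of frames; each frame lists power per frequency bin."""
    n = int(round(window_s * sample_rate))           # samples per window
    hop = max(1, int(round(step_s * sample_rate)))   # samples per step
    win = gaussian_window(n)
    n_bins = n // 2 + 1                              # 0 Hz up to Nyquist
    frames = []
    for start in range(0, len(signal) - n + 1, hop):
        seg = [signal[start + i] * win[i] for i in range(n)]
        row = []
        for k in range(n_bins):
            # Naive DFT: slow, but dependency-free and easy to check.
            acc = sum(seg[i] * cmath.exp(-2j * math.pi * k * i / n)
                      for i in range(n))
            row.append(abs(acc) ** 2)
        frames.append(row)
    return frames

# Toy input: 50 ms of a 400 Hz tone sampled at 4 kHz (illustration only).
sample_rate = 4000
tone = [math.sin(2 * math.pi * 400 * i / sample_rate) for i in range(200)]
frames = power_spectrogram(tone, sample_rate)
peak_bin = max(range(len(frames[0])), key=frames[0].__getitem__)
peak_hz = peak_bin * sample_rate / 100   # the window holds 100 samples here
```

For real recordings an FFT-based routine would be far faster; the naive transform is kept here only so that the sketch needs nothing beyond the standard library.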
Definition of the function s(t) p_t(f) = re-normalized power spectrum for frequency f around time t. This re-normalization makes p_t a probability measure. A regular pattern characteristic of sonorant spans will produce a sequence of probability measures which are close in the sense of relative entropy. This suggests defining the sonority function s(t) from the relative entropy between the measures at neighbouring times (the formula is not reproduced here).
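Because the defining formula is not reproduced in this text, the following is only a sketch of one way to turn the idea into code; it should not be read as the authors' definition. It re-normalizes each spectrogram frame into a probability measure, computes the relative entropy between consecutive frames, and maps it to [0,1] with the hypothetical rule `1 - min(1, h)`. The smoothing constant `eps`, the use of a single preceding frame and all function names are assumptions.

```python
import math

def normalize(power_row, eps=1e-12):
    """Turn one power-spectrum frame into a probability measure.

    eps (an assumption of this sketch) keeps every bin strictly positive
    so that the logarithm below is always defined.
    """
    total = sum(power_row) + eps * len(power_row)
    return [(p + eps) / total for p in power_row]

def relative_entropy(p, q):
    """Relative entropy (Kullback-Leibler divergence) of p with respect to q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def rough_sonority(power_frames):
    """Hypothetical s(t): near 1 when consecutive spectra are alike, near 0 otherwise."""
    probs = [normalize(row) for row in power_frames]
    s = []
    for t in range(1, len(probs)):
        h = max(0.0, relative_entropy(probs[t], probs[t - 1]))
        s.append(1.0 - min(1.0, h))
    return s

# Two identical frames give sonority 1; two disjoint spectra give 0.
steady = rough_sonority([[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]])
abrupt = rough_sonority([[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]])
```

The clipping at 1 is only a convenient way to meet the stated goal of values in [0,1]; other monotone mappings would satisfy the same requirement.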
Explaining the estimators • The first estimator is the sample mean of the function s(t). • δS measures how important the high-obstruency regions are in the sample. This is because the values of p_t, and consequently of s(t), typically present large variations when t belongs to intervals with high obstruency.
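The two estimators can be sketched as follows. The sample mean is unambiguous; the formula for δS is not given in this text, so the sketch assumes one plausible form, the mean absolute difference between successive values of s(t), which is large when s(t) varies strongly. That choice, the function name and the toy values are assumptions.

```python
from statistics import mean

def sonority_estimators(s):
    """Return (sample mean of s, assumed deltaS) for a sequence s(t).

    deltaS is taken here as the mean absolute successive difference,
    which is an assumption of this sketch, not a formula from the source.
    """
    s_mean = mean(s)
    delta_s = mean(abs(b - a) for a, b in zip(s, s[1:]))
    return s_mean, delta_s

# Made-up sonority values for illustration only.
toy_s = [1.0, 1.0, 0.2, 0.8, 1.0]
s_mean, delta_s = sonority_estimators(toy_s)
```

With this reading, a stretch of steady high sonority contributes almost nothing to δS, while the dip to 0.2 and the recovery account for nearly all of it.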
Extra statistical features • The distance between the first and third quartile increases from Japanese to Dutch. In other terms, the dispersion of sonority increases from mora-timed to stress-timed languages. • The empirical probability of having sonority smaller than 0.3 also increases from Japanese to Dutch. • This reinforces the idea present in Duarte et al. (2001) that the relevant information to discriminate among rhythmic classes is contained in the less sonorant part of the signal.
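The two extra features above, the interquartile distance and the empirical probability of sonority below 0.3, can be computed directly from a sequence of sonority values. The sketch below uses Python's `statistics.quantiles`; the quartile convention (`method="inclusive"`), the function name and the toy values are assumptions, since the text does not say how the quartiles were computed.

```python
from statistics import quantiles

def dispersion_features(s, threshold=0.3):
    """Return (interquartile distance, empirical P(s < threshold))."""
    q1, _median, q3 = quantiles(s, n=4, method="inclusive")
    iqr = q3 - q1
    p_low = sum(1 for value in s if value < threshold) / len(s)
    return iqr, p_low

# Made-up sonority values for illustration only.
toy_s = [0.1, 0.2, 0.5, 0.9, 1.0]
iqr, p_low = dispersion_features(toy_s)
```

Both numbers grow when more of the signal sits in the low-sonority range, which is the pattern described above from Japanese to Dutch.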
Distribution of the eight considered languages on the ( ,%V) plane
Conclusions • The main purpose of this presentation was to show that the relevant evidence about rhythmic classes can be automatically retrieved from the acoustic signal, through a rough measure of sonority. • In addition, our statistics are based on a coarse-grained treatment of the speech signal, which is likely to be closer to the linguistic reality of early acquisition.
This work is part of the Project RHYTHMIC PATTERNS, PARAMETER SETTING AND LANGUAGE CHANGE, funded by Fapesp (grant no 98/03382-0). • http://www.ime.usp.br/~tycho