Presentation on Timbre Similarity. Alexandre Savard, March 2006.
Content: Introduction; Measurement of timbre; Measurement of similarity; Systems Evaluation; Recent developments; Conclusion.
Introduction: Incomplete timbre definition. Timbre is a fundamental dimension of sound. It has too often been described only as the dimension of sound that lets a listener distinguish between two sounds that have the same pitch and the same loudness.
Introduction: Incomplete timbre definition. An efficient operational definition of timbre has not yet been achieved. Previous research has demonstrated the multidimensional nature of timbre. Existing timbre research has compared the timbre similarity of single instrumental notes.
Introduction: Physical features of timbre. Attack transients; spectral flux; spectral gravity centre (spectral centroid); harmonicity ratio; spectral/temporal envelope. Other factors: pitch and loudness.
Introduction: Global timbre. A local definition of timbre is of little use for electronic music distribution or music recommendation systems. Researchers instead use the concept of “global” timbre, which attributes a single timbre quality to an entire piece. This idea only makes sense if the piece shows little variation in texture and instrumentation.
Measurement of timbre: Mel-Frequency Cepstrum Coefficient. Measures used: Mel-Frequency Cepstrum Coefficients (MFCC); spectral gravity centre; spectral envelope; spectral flux. These measures are combined into a “feature vector”.
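As a rough illustration of how two of the listed measures could be computed and combined, the sketch below derives a spectral gravity centre (centroid) and a spectral flux from magnitude spectra and appends them to a list of MFCCs. The function names, the half-spectrum bin layout and the unnormalised Euclidean form of the flux are assumptions made for this example; definitions of spectral flux vary between authors.

```python
import math

def spectral_centroid(magnitudes, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one half-spectrum frame."""
    n_bins = len(magnitudes)
    # Assumption: bins are evenly spaced from 0 Hz to the Nyquist frequency.
    freqs = [k * (sample_rate / 2.0) / (n_bins - 1) for k in range(n_bins)]
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total

def spectral_flux(prev_magnitudes, magnitudes):
    """Euclidean distance between two consecutive magnitude spectra."""
    return math.sqrt(sum((b - a) ** 2
                         for a, b in zip(prev_magnitudes, magnitudes)))

def feature_vector(prev_magnitudes, magnitudes, sample_rate, mfccs):
    """Concatenate MFCCs with centroid and flux into one feature vector."""
    return list(mfccs) + [
        spectral_centroid(magnitudes, sample_rate),
        spectral_flux(prev_magnitudes, magnitudes),
    ]
```

A frame whose energy sits in a single bin has its centroid exactly at that bin's frequency, which is a quick sanity check for the bin layout.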
Measurement of timbre: Mel-Frequency Cepstrum Coefficient. MFCCs measure variations of the spectral envelope. Their computation maps linear frequencies onto the psychoacoustically based Mel scale. The result is an ordered sequence of coefficients. Low-order coefficients describe slow variations of the spectral envelope across frequency; high-order coefficients describe fast variations.
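The MFCC steps described on this slide can be sketched for a single frame as follows. This is a minimal illustration, not the implementation used in the cited work: the Mel formula 2595 log10(1 + f/700) is one common variant, the naive DFT stands in for an FFT, no analysis window is applied, and all function names and default values are assumptions.

```python
import cmath
import math

def hz_to_mel(f_hz):
    """Map a linear frequency in Hz to the Mel scale (one common formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (fine for a short demo frame)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def mel_filterbank_energies(power, sample_rate, n_filters):
    """Sum the power spectrum under triangular filters spaced evenly in Mel."""
    nyquist = sample_rate / 2.0
    n_bins = len(power)
    # n_filters triangles need n_filters + 2 edge points.
    edges = [mel_to_hz(hz_to_mel(nyquist) * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    energies = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        energy = 0.0
        for k, p in enumerate(power):
            f = k * nyquist / (n_bins - 1)
            if lo < f <= mid:
                energy += p * (f - lo) / (mid - lo)
            elif mid < f < hi:
                energy += p * (hi - f) / (hi - mid)
        energies.append(energy)
    return energies

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=8):
    """Return the first n_coeffs cepstral coefficients of one frame."""
    energies = mel_filterbank_energies(power_spectrum(frame),
                                       sample_rate, n_filters)
    log_e = [math.log(e + 1e-10) for e in energies]  # small floor avoids log(0)
    coeffs = []
    for c in range(n_coeffs):
        # DCT-II of the log filterbank energies.
        coeffs.append(sum(le * math.cos(math.pi * c * (j + 0.5) / n_filters)
                          for j, le in enumerate(log_e)))
    return coeffs
```

Coefficient 0 mostly tracks overall level; the following low-order coefficients capture the broad shape of the envelope, which is why a small number of them is often kept.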
Measurement of Similarity: Similarity metric. A metric is applied to calculate the distance between two representations and so determine how similar the music is. The metric should be related to the strategy humans use when judging timbre similarity.
Measurement of Similarity: Gaussian Mixture Model. MFCC analysis produces a large number of coefficients. A more compact representation is needed to handle those results.
Measurement of Similarity: Gaussian Mixture Model. A GMM is composed of one or more Gaussian probability distributions (components). The distance between GMMs can be seen as a measurement of similarity. To compare two songs, samples are drawn from the model of each song and their probabilities are computed.
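To make the idea of a compact mixture representation concrete, here is a minimal sketch that fits a GMM to one-dimensional data with the expectation-maximisation (EM) algorithm. It is a simplification for illustration: real MFCC frames are multi-dimensional, and the quantile-based initialisation, the variance floor and the function names are assumptions of this example, not details from the slides.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(data, n_components=3, n_iter=50):
    """Fit a 1-D GMM with EM; returns a list of (weight, mean, variance)."""
    ordered = sorted(data)
    n = len(ordered)
    # Assumption: start the means at evenly spaced quantiles of the data.
    means = [ordered[int((k + 0.5) * n / n_components)]
             for k in range(n_components)]
    mean_all = sum(ordered) / n
    var_all = max(sum((x - mean_all) ** 2 for x in ordered) / n, 1e-6)
    variances = [var_all] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-12
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2
                        for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor keeps the density finite
    return list(zip(weights, means, variances))
```

However many frames a song contains, the fitted model is only a handful of numbers per component, which is the compactness the slide refers to.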
Measurement of Similarity: Gaussian Mixture Model. The “distance” between GMMs can be seen as a measurement of similarity. It reflects how much change is necessary to obtain samples of the second song from the model of the first one. The higher the probabilities of those samples, the higher the similarity.
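One way to turn this idea into a number is to draw samples from each model and compare how likely they are under the other model. The sketch below uses a symmetrised log-likelihood difference on one-dimensional mixtures; the exact formula, the (weight, mean, variance) tuple layout and the function names are assumptions for illustration rather than a transcription of the cited papers.

```python
import math
import random

def sample_gmm(gmm, n, rng):
    """Draw n samples from a 1-D GMM given as (weight, mean, variance) tuples."""
    weights = [w for w, _, _ in gmm]
    picks = rng.choices(gmm, weights=weights, k=n)
    return [rng.gauss(mean, math.sqrt(var)) for _, mean, var in picks]

def log_likelihood(samples, gmm):
    """Sum of log densities of the samples under the mixture."""
    total = 0.0
    for x in samples:
        p = sum(w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
                for w, m, v in gmm)
        total += math.log(max(p, 1e-300))  # floor avoids log(0)
    return total

def gmm_distance(gmm_a, gmm_b, n_samples=100, seed=0):
    """Sampling-based distance: small when each model explains the other's samples."""
    rng = random.Random(seed)
    s_a = sample_gmm(gmm_a, n_samples, rng)
    s_b = sample_gmm(gmm_b, n_samples, rng)
    return (log_likelihood(s_a, gmm_a) + log_likelihood(s_b, gmm_b)
            - log_likelihood(s_a, gmm_b) - log_likelihood(s_b, gmm_a))
```

With this form, high cross-probabilities shrink the distance, matching the slide's statement that higher probabilities mean higher similarity. The number of samples trades accuracy of the estimate against computation time.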
Measurement of Similarity: Gaussian Mixture Model. J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.
Measurement of Similarity: Different approaches. Neural Networks; Hidden Markov Models; Gaussian Mixture Models; Self-Organizing Maps.
Systems Evaluation: Evaluation criteria. Timbre similarity judgments are based on a set of objective and subjective perceptual, cognitive and cultural aspects. Measures are highly dependent on the music present in the database.
Systems Evaluation: Objective evaluation. The objective evaluation of a timbral similarity measure is problematic. The metadata of a given database include descriptions of the artist and the genre; however, timbre quality is usually not described.
Systems Evaluation: Subjective evaluation. Requires conducting a psychoacoustical survey. Deciding whether two songs have a similar timbre can be uncertain, because timbre similarity is an ill-defined concept.
Recent Developments: Aucouturier and Pachet (2002). Each song is segmented using fixed 50 ms windows. Eight MFCCs characterize each segment. The Gaussian Mixture Model is composed of three Gaussian probability distributions. 100 random samples are drawn for the similarity measurement.
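The segmentation step can be sketched as a simple framing function. Non-overlapping frames and the dropping of a trailing partial frame are assumptions of this example, since the slide does not say whether windows overlap; the constant names are hypothetical and only restate the settings listed above.

```python
# Settings taken from the slide; the names themselves are hypothetical.
WINDOW_SECONDS = 0.05      # fixed 50 ms windows
N_MFCC = 8                 # MFCCs per segment
N_GMM_COMPONENTS = 3       # Gaussians in the mixture
N_DISTANCE_SAMPLES = 100   # random samples for the similarity measurement

def frame_signal(samples, sample_rate, window_seconds=WINDOW_SECONDS):
    """Split a signal into consecutive fixed-length frames.

    Assumption: frames do not overlap and a trailing partial frame is dropped.
    """
    frame_len = int(round(sample_rate * window_seconds))
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len]
            for i in range(n_frames)]
```

At a 44.1 kHz sampling rate a 50 ms window holds 2205 samples, so one second of audio yields 20 frames, each of which would then be reduced to eight MFCCs.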
Recent Developments: Aucouturier and Pachet (2002). J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.
Recent Developments: Aucouturier and Pachet (2004). Finding the best set of parameters: sampling rate of the music signal; number of MFCCs extracted from each frame of data; number of components used in the GMM; the distance sample rate used to estimate the likelihood of one model given another; window size.
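A search over these five parameters can be organised as a grid of candidate settings. The sketch below only enumerates the combinations; the candidate values are purely illustrative placeholders, not the values explored in the cited study, and the key names are hypothetical.

```python
from itertools import product

# Illustrative candidate values only; these are NOT taken from the paper.
PARAMETER_GRID = {
    "sample_rate_hz": [11025, 22050, 44100],
    "n_mfcc": [8, 10, 20],
    "n_gmm_components": [3, 10, 20],
    "distance_sample_rate": [100, 1000],
    "window_ms": [20, 50],
}

def iter_configs(grid):
    """Yield one dict per combination of parameter values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(iter_configs(PARAMETER_GRID))
```

Each configuration would be used to build the models and then scored with whatever evaluation measure is available. The number of combinations grows multiplicatively with each added value, so an exhaustive search quickly becomes expensive.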
Recent Developments: Aucouturier and Pachet (2004). J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.
Recent Developments: Aucouturier and Pachet (2004). Alternative similarity measurements using Earth Mover’s Distance and Hidden Markov Models were tested. Those techniques did not improve performance. This suggests that there could be a ceiling on the performance of techniques based on timbre similarity.
Recent Developments: Liu and Huang (2002). Developed an algorithm for the singing voice. Used MFCCs as well as GMMs for their timbre representation. The audio signal is segmented according to the phonemes in the singing.
Recent Developments: Logan and Salomon (2001). Characterized timbre with MFCCs. Used K-means clustering instead of a GMM. Calculated similarity using the Earth Mover’s Distance.
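The combination of clustering and Earth Mover's Distance can be illustrated on one-dimensional data, where the EMD has a simple closed form. This is a deliberately reduced sketch: the cited work clusters multi-dimensional MFCC frames, whereas here each song is a list of scalar features, the ground distance is the absolute difference, and all function names are hypothetical.

```python
def kmeans_1d(data, k, n_iter=20):
    """Cluster scalars with Lloyd's algorithm; returns (centroid, weight) pairs."""
    ordered = sorted(data)
    # Assumption: start the centroids at evenly spaced quantiles of the data.
    centroids = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    # The weight of a cluster is the fraction of points assigned to it.
    return [(centroids[i], len(c) / len(data)) for i, c in enumerate(clusters)]

def emd_1d(sig_a, sig_b):
    """Earth Mover's Distance between two 1-D signatures of equal total weight."""
    events = sorted([(pos, w) for pos, w in sig_a] +
                    [(pos, -w) for pos, w in sig_b])
    cost = 0.0
    carried = 0.0          # mass still to be moved to the right
    prev = events[0][0]
    for pos, w in events:
        cost += abs(carried) * (pos - prev)
        carried += w
        prev = pos
    return cost
```

Here `emd_1d` measures how much mass has to be moved, and how far, to turn one song's cluster signature into the other's. A song whose features are all shifted by one unit therefore ends up at a distance of about one from the original.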
Bibliography
J. Aucouturier, F. Pachet, and M. Sandler. 2004. “The way it sounds”: Timbre models for analysis and retrieval of music signals. IEEE Transactions on Multimedia.
J. Aucouturier and F. Pachet. 2004. Improving timbre similarity: How high’s the sky? Proceedings of the International Conference on Music Information Retrieval.
J. Aucouturier and F. Pachet. 2002. Music similarity measures: What’s the use? Proceedings of the International Conference on Music Information Retrieval.
C. Liu and C. Huang. 2002. A singer identification technique for content-based classification of MP3 music objects. Proceedings of the Conference on Information and Knowledge Management.
B. Logan and A. Salomon. 2001. A music similarity function based on signal analysis. Proceedings of the International Conference on Multimedia and Expo.