
Pitch-dependent Musical Instrument Identification and Its Application to Musical Sound Ontology






Presentation Transcript


  1. Pitch-dependent Musical Instrument Identification and Its Application to Musical Sound Ontology Tetsuro Kitahara*, Masataka Goto**, Hiroshi G. Okuno* (*Grad. Sch’l of Informatics, Kyoto Univ.; **PRESTO JST / Nat’l Inst. Adv. Ind. Sci. & Tech.) IEA/AIE-2003 (24th June 2003 in UK)

  2. Today’s talk • Musical Instrument Identification • Difficulty: The pitch dependency of timbre • Solution: Approximating it as a function of F0 • Experiments • Musical Sound Ontology • A hierarchy of musical instrument sounds • Systematically constructed by C5.0

  3. 1. What is musical instrument identification? • To obtain the names of musical instruments from sounds (acoustical signals). • Useful for automatic music transcription, music information retrieval, etc. Feature extraction (e.g. decay speed, spectral centroid), then classification: ŵ = argmax_w p(w|X) = argmax_w p(X|w) p(w) [Diagram: comparing p(X|w_piano) and p(X|w_flute) to output <inst>piano</inst>]
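The Bayes decision rule on this slide can be sketched with ordinary multivariate normal models (the pitch-independent baseline). The two features, means, covariances, and priors below are invented for illustration; only the argmax over p(X|w) p(w) reflects the slide.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical per-instrument models: (mean vector, covariance, prior).
# The numbers are illustrative stand-ins, not trained values.
models = {
    "piano": (np.array([0.2, 0.8]), np.eye(2) * 0.05, 0.5),
    "flute": (np.array([0.7, 0.3]), np.eye(2) * 0.05, 0.5),
}

def identify(x):
    """Return argmax_w [log p(X|w) + log p(w)] over the modelled instruments."""
    def log_posterior(item):
        mean, cov, prior = item[1]
        return multivariate_normal.logpdf(x, mean, cov) + np.log(prior)
    return max(models.items(), key=log_posterior)[0]

print(identify(np.array([0.25, 0.75])))  # near the piano mean -> "piano"
```

With equal priors and equal covariances, the rule reduces to picking the nearest class mean, which is why the feature vector close to the piano model wins.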

  4. 2. What is the difficulty? The pitch dependency of timbre. e.g. Low-pitch piano sounds decay slowly; high-pitch piano sounds decay fast. [Waveform plots over 0–3 s: (a) Pitch = C2 (65.5Hz), (b) Pitch = C6 (1048Hz)]

  5. 2. What is the difficulty? The pitch dependency of timbre. e.g. Low-pitch piano sound = slow decay; high-pitch piano sound = fast decay. [Waveform plots over 0–3 s: (a) Pitch = C2 (65.5Hz), (b) Pitch = C6 (1048Hz)] In previous studies, the pitch dependency was pointed out, but has not been dealt with.

  6. 3. How is the pitch dependency coped with? Our solution: Approximate the pitch dependency of each feature as a function of fundamental frequency (F0).

  7. 3. How is the pitch dependency coped with? Our solution: Approximate the pitch dependency of each feature as a function of fundamental frequency (F0), i.e. modelling how each feature varies according to F0.

  8. 3. How is the pitch dependency coped with? An F0-dependent multivariate normal distribution has the following two parameters: the F0-dependent mean function, which captures the pitch dependency (i.e. the position of the distribution at each F0), and the F0-normalized covariance, which captures the non-pitch dependency.

  9. 4. Musical instrument identification using the F0-dependent multivariate normal distribution. The identification method has the following four steps: 1st step: Feature extraction. 2nd step: Dimensionality reduction. 3rd step: Parameter estimation. Final step: Applying the Bayes decision rule.

  10. 4. Musical instrument identification using the F0-dependent multivariate normal distribution. (1st) Feature extraction: 129 features, defined by consulting the literature, are extracted. (1) Spectral centroid (which captures the brightness of tones). [Spectra of piano and flute with the spectral centroid marked]
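A minimal sketch of the spectral centroid computation, as the magnitude-weighted mean frequency of the spectrum; the sample rate and test tones below are arbitrary choices, not values from the paper.

```python
import numpy as np

def spectral_centroid(signal, sr):
    """Magnitude-weighted mean frequency of the spectrum (brightness cue)."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return np.sum(freqs * spectrum) / np.sum(spectrum)

sr = 16000
t = np.arange(sr) / sr               # one second of audio
low = np.sin(2 * np.pi * 220 * t)    # dark tone, centroid ~220 Hz
high = np.sin(2 * np.pi * 3520 * t)  # bright tone, centroid ~3520 Hz
print(spectral_centroid(low, sr), spectral_centroid(high, sr))
```

A brighter tone concentrates energy at higher frequencies, so its centroid is higher; that is the distinction the piano/flute spectra on the slide illustrate.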

  11. 4. Musical instrument identification using the F0-dependent multivariate normal distribution. (1st) Feature extraction: 129 features, defined by consulting the literature, are extracted. (2) Decay speed of power. [Power envelopes: flute — not decayed; piano — decayed]
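One way to sketch the decay-speed feature is as the slope of log power over time; the frame size, envelopes, and test tones below are invented for illustration.

```python
import numpy as np

def decay_speed(signal, sr, frame=1024):
    """Slope of log frame power over time; more negative = faster decay."""
    n = len(signal) // frame
    power = np.array([
        np.mean(signal[i * frame:(i + 1) * frame] ** 2) for i in range(n)
    ])
    times = (np.arange(n) + 0.5) * frame / sr
    slope, _ = np.polyfit(times, np.log(power + 1e-12), 1)
    return slope

sr = 16000
t = np.arange(3 * sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
piano_like = tone * np.exp(-2.0 * t)  # decaying envelope (struck string)
flute_like = tone                     # sustained envelope (blown pipe)
print(decay_speed(piano_like, sr), decay_speed(flute_like, sr))
```

The decaying tone yields a clearly negative slope while the sustained tone stays near zero, matching the "decayed" vs. "not decayed" contrast on the slide.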

  12. 4. Musical instrument identification using the F0-dependent multivariate normal distribution. (2nd) Dimensionality reduction: the dimensionality of the feature space is reduced by the following two methods. 129-dimensional feature space → PCA (principal component analysis, with a proportion value of 99%) → 79-dimensional feature space → LDA (linear discriminant analysis) → 18-dimensional feature space.
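The PCA-then-LDA pipeline can be sketched with scikit-learn on synthetic stand-in data (the real 129 features extracted from the RWC sounds are not reproduced here). LDA can project to at most n_classes − 1 = 18 dimensions, which matches the slide's final space.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Synthetic stand-in: 600 sounds x 129 features, 19 instrument classes.
X = rng.normal(size=(600, 129))
y = rng.integers(0, 19, size=600)
X += y[:, None] * 0.1  # give the classes some separation

# PCA keeping 99% of the variance, as on the slide.
X_pca = PCA(n_components=0.99).fit_transform(X)

# LDA projects to at most (n_classes - 1) = 18 dimensions.
X_lda = LinearDiscriminantAnalysis(n_components=18).fit_transform(X_pca, y)
print(X.shape[1], X_pca.shape[1], X_lda.shape[1])
```

On the real features the 99%-variance criterion happened to keep 79 components; on random synthetic data the retained count will differ, but the 18-dimensional LDA output is fixed by the 19-class problem.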

  13. 4. Musical instrument identification using the F0-dependent multivariate normal distribution. (3rd) Parameter estimation: first, the F0-dependent mean function is approximated as a cubic polynomial.

  14. 4. Musical instrument identification using the F0-dependent multivariate normal distribution. (3rd) Parameter estimation: second, the F0-normalized covariance is obtained by subtracting the F0-dependent mean from each feature (eliminating the pitch dependency).
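The two estimation steps above can be sketched with NumPy for a single feature of one instrument; the polynomial trend, F0 range, and noise level below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in: one feature vs. F0 (log scale) for one instrument,
# with a pitch-dependent trend plus observation noise.
f0 = rng.uniform(100.0, 2000.0, size=300)
x = np.log(f0)
features = 0.5 * x - 0.02 * x**2 + rng.normal(scale=0.1, size=300)

# Step 1: F0-dependent mean function, approximated as a cubic polynomial.
coeffs = np.polyfit(x, features, deg=3)
mean_fn = np.poly1d(coeffs)

# Step 2: subtract the F0-dependent mean from each feature, then estimate
# the F0-normalized (co)variance from the residuals.
residuals = features - mean_fn(x)
f0_normalized_var = np.var(residuals)

print(f0_normalized_var)  # close to the injected noise variance, 0.1**2 = 0.01
```

With the full 18-dimensional features this becomes a cubic fit per dimension and a covariance matrix of the residual vectors, but the idea is the same: the fit captures the pitch dependency, the residual spread captures everything else.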

  15. 4. Musical instrument identification using the F0-dependent multivariate normal distribution. (Final) The Bayes decision rule: the instrument ŵ satisfying ŵ = argmax_w [log p(X|w; f) + log p(w; f)] is determined as the result. p(X|w; f) is the probability density function of the F0-dependent multivariate normal distribution, defined by the F0-dependent mean function and the F0-normalized covariance.

  16. 5. Experimental Conditions • Database: a subset of RWC-MDB-I-2001 • Consists of solo tones of 19 real instruments over their entire pitch ranges. • Contains 3 individuals and 3 intensities for each instrument. • Contains normal articulation only. • The total number of sounds is 6,247. • Evaluated with 10-fold cross-validation. • Performance is evaluated both at the individual-instrument level and at the category level.

  17. 6. Experimental Results Recognition rates: 79.73% (at individual level), 90.65% (at category level). Improvement: 4.00% (individual), 2.45% (category). Error reduction (relative): 16.48% (individual), 20.67% (category). Category level: 8 classes; individual level: 19 classes.

  18. 6. Experimental Results The recognition rates of the following 6 instruments were improved by more than 7%. Piano: the most improved (74.21% → 83.27%), because the piano has a wide pitch range.

  19. 7. Musical sound ontology • A hierarchy of musical instrument sounds • Important for various applications, e.g. category-level musical instrument recognition (such as strings, wind instruments) and support for music composition (or arrangement). • However, its systematic construction has not been reported. • We report the result of constructing an acoustics-based musical sound ontology using the C5.0 decision tree program.
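C5.0 itself is a proprietary program; as a sketch of the idea, a CART decision tree (scikit-learn) can induce a category hierarchy from acoustic features. The category names and feature values below are invented stand-ins, not the paper's data.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(2)

# Synthetic stand-in features (decay speed, spectral centroid) for three
# hypothetical categories of instrument sounds.
X = np.vstack([
    rng.normal([-3.0, 1000.0], [0.5, 100.0], size=(50, 2)),  # fast-decaying
    rng.normal([0.0, 2000.0], [0.5, 100.0], size=(50, 2)),   # sustained, bright
    rng.normal([0.0, 800.0], [0.5, 100.0], size=(50, 2)),    # sustained, dark
])
y = np.repeat(["struck", "air", "bowed"], 50)

# The induced tree doubles as a hierarchy: each split groups sounds by an
# acoustic property rather than by sounding mechanism.
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["decay_speed", "spectral_centroid"]))
```

Printing the tree shows the learned hierarchy directly, which is the sense in which a decision-tree program can "systematically construct" a sound ontology.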

  20. 7. Musical sound ontology

  21. 7. Musical sound ontology It differs from the conventional hierarchy.

  22. 7. Musical sound ontology Acoustic characteristics depend on the pitch as well as the sounding mechanism.

  23. 7. Musical sound ontology This hierarchy was known to musicians experientially, but had not previously been constructed by computer.

  24. 8. Conclusions • We proposed a method for musical instrument identification that takes into consideration the pitch dependency of timbre. → Recognition rate improved: 75.73% → 79.73% • We reported the construction of a musical sound ontology based on acoustic characteristics. • Future work: • Evaluation against mixtures of sounds • Development of application systems using the proposed method.

  25. Temporal mean of kurtosis of spectral peaks If the power of the non-harmonic components is stronger, the kurtosis of the spectral peaks becomes higher. → This feature captures how many non-harmonic components are contained in the spectrum. [Spectrum with harmonic spectral peaks and non-harmonic components marked]
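The effect described here can be illustrated with a toy weighted-kurtosis computation over the bins around one spectral peak; the peak shape, window width, and noise-floor level below are invented, not the paper's definition.

```python
import numpy as np

def weighted_kurtosis(values, weights):
    """Kurtosis of a spectral peak, treating magnitudes as weights over bins."""
    w = weights / np.sum(weights)
    mean = np.sum(w * values)
    var = np.sum(w * (values - mean) ** 2)
    return np.sum(w * (values - mean) ** 4) / var**2

# Frequency bins around one spectral peak (offsets from the peak centre).
bins = np.arange(-50, 51, dtype=float)
peak = np.exp(-bins**2 / (2 * 2.0**2))  # sharp harmonic peak
noisy_peak = peak + 0.01                # same peak over a non-harmonic floor

print(weighted_kurtosis(bins, peak), weighted_kurtosis(bins, noisy_peak))
```

The floor puts extra weight in the tails far from the peak centre, so the noisy peak's kurtosis comes out higher, matching the direction stated on the slide.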

  26. Recognition rates at category level Error reduction: 35%, 8%, 23%, 33%, 20%, 13%, 15%, 8%. • Recognition rates for all categories were improved. • Recognition rates for Piano, Guitar, Strings: 96.7%

  27. Why we adopted Bayes: Bayes vs. k-NN Compared: Bayes (18 dim; PCA+LDA), Bayes (79 dim; PCA only), Bayes (18 dim; PCA only), 3-NN (18 dim; PCA+LDA), 3-NN (79 dim; PCA only), 3-NN (18 dim; PCA only). • PCA+LDA+Bayes achieved the best performance. • 18 dimensions is better than 79: the number of training data is not enough for 79 dimensions. • The use of LDA improved the performance: LDA considers separation between classes.

  28. Why we adopted Bayes: Bayes vs. k-NN Compared: Bayes (18 dim; PCA+LDA), Bayes (79 dim; PCA only), Bayes (18 dim; PCA only), 3-NN (18 dim; PCA+LDA), 3-NN (79 dim; PCA only), 3-NN (18 dim; PCA only). Jain’s guideline (1982): having 5 to 10 times as many training data as the number of dimensions seems to be good practice. • PCA+LDA+Bayes achieved the best performance. • 18 dimensions is better than 79: the number of training data is not enough for 79 dimensions. • The use of LDA improved the performance: LDA considers separation between classes.

  29. Relationship between training data and dimensionality 14 dim. (85%), 18 dim. (88%), 20 dim. (89%), 23 dim. (90%), 32 dim. (93%), 41 dim. (95%), 52 dim. (97%), 79 dim. (99%) (Hughes’s peaking phenomenon) • At 23 dimensions, the performance peaked. • All results without LDA were worse than those with LDA.

  30. Conventional hierarchy (sounding-mechanism-based)
