A Wavelet-Based Approach to the Discovery of Themes and Motives in Melodies
Gissel Velarde and David Meredith
Aalborg University, Department of Architecture, Design & Media Technology
EuroMAC, September 2014
We present
• A computational method submitted to the MIREX 2014 Discovery of Repeated Themes & Sections task
• The results on the monophonic version of the JKU Patterns Development Database
The idea behind the method
• In the context of pattern discovery in monophonic pieces:
  • Given a good melodic structure in terms of segments, it should be possible to gather similar segments into clusters and rank their salience within the piece.
Considerations
• “A good melodic structure in terms of segments”
  • is one that comes close to the ground-truth analysis (see Collins, 2014)
  • The ground-truth analysis specifies certain segments or patterns
  • These patterns can be overlapping and hierarchical
Considerations
• We also consider other aspects of the problem:
  • representation,
  • segmentation,
  • measuring similarity,
  • clustering of segments, and
  • ranking segments according to salience
The method
• Follows and extends our approach to melodic segmentation and classification based on filtering with the Haar wavelet (Velarde, Weyde and Meredith, 2013)
• Uses the idea of computing a similarity matrix for “window connectivity information” from a generic motif discovery algorithm for sequential data (Jensen, Styczynski, Rigoutsos and Stephanopoulos, 2006)
Wavelet transform
• The wavelet coefficients of the pitch vector v for scale s and shift u are defined as the inner product of v with the wavelet at that scale and shift.
• A family of functions is obtained by translations and dilations of the mother wavelet.
Haar wavelet
• The mother wavelet used is the Haar wavelet.
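As a sketch of the definitions named on this slide, written in their standard textbook form: the normalization factor 1/√s and the support [0, 1) of the Haar wavelet are assumptions here, not taken from the slide.

```latex
W_v(u, s) = \langle v, \psi_{s,u} \rangle
          = \int_{-\infty}^{\infty} v(t)\, \psi_{s,u}(t)\, dt,
\qquad
\psi_{s,u}(t) = \frac{1}{\sqrt{s}}\, \psi\!\left(\frac{t - u}{s}\right),
\qquad
\psi(t) =
\begin{cases}
  1  & 0 \le t < 1/2 \\
  -1 & 1/2 \le t < 1 \\
  0  & \text{otherwise}
\end{cases}
```

Here ψ is the Haar mother wavelet, s dilates it and u translates it; larger s responds to slower pitch movement.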
Representation
[Figure panels: representation of Velarde et al. (2013); new representation]
First stage segmentation
[Figure panels: segmentation of Velarde et al. (2013); new segmentation]
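A minimal Python sketch of the wavelet-coefficient representation and of zero-crossing segmentation may help here. It is an illustration under assumptions, not the authors' implementation: the function names, the discrete approximation of the inner product, the sign convention and the toy pitch signal are all hypothetical.

```python
import math


def haar_coefficients(pitch, scale):
    """Inner product of a sampled pitch signal with a Haar wavelet at each shift.

    `scale` is the wavelet support in samples and is assumed to be even.
    Assumed convention: first half weighted +1, second half -1,
    normalized by 1/sqrt(scale).
    """
    half = scale // 2
    norm = 1.0 / math.sqrt(scale)
    coeffs = []
    for u in range(len(pitch) - scale + 1):
        first = sum(pitch[u:u + half])
        second = sum(pitch[u + half:u + scale])
        coeffs.append(norm * (first - second))
    return coeffs


def zero_crossings(coeffs):
    """Indices where the sign of the coefficients changes, skipping exact zeros."""
    crossings = []
    prev_sign = 0
    for i, c in enumerate(coeffs):
        sign = (c > 0) - (c < 0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                crossings.append(i)
            prev_sign = sign
    return crossings


# Toy signal (hypothetical): MIDI pitch 60, then 64, then 60, four samples each.
toy_pitch = [60] * 4 + [64] * 4 + [60] * 4
toy_coeffs = haar_coefficients(toy_pitch, scale=4)
toy_boundaries = zero_crossings(toy_coeffs)
```

The coefficients are negative while the melody rises and positive while it falls, so the single sign change marks the turning point of the contour. Each such index would serve as a segment boundary.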
Segmentation
• First stage segmentation: constant segmentation, wavelet zero-crossings or modulus maxima
• Comparison: distance matrix given a measure; binarized distance matrix given a threshold
• Concatenation: contiguous similar diagonal segments are concatenated
• Comparison: distance matrix given a measure
• Clustering: by agglomerative clusters from an agglomerative hierarchical cluster tree
• Ranking: criterion is the sum of the length of occurrences
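The comparison, concatenation and ranking steps listed above can be sketched in Python as follows. This is a hedged illustration rather than the submitted system: all function names are hypothetical, "similar" is assumed to mean "distance at or below the threshold", segments are assumed to have equal length, and the clustering step is left out.

```python
def city_block(a, b):
    """City-block (L1) distance between two equal-length segments."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(segments, dist=city_block):
    """Pairwise distances between all segments."""
    return [[dist(a, b) for b in segments] for a in segments]


def binarize(matrix, threshold):
    """1 where two segments count as similar (distance <= threshold), else 0."""
    return [[1 if d <= threshold else 0 for d in row] for row in matrix]


def diagonal_runs(binary, min_len=2):
    """Concatenate contiguous similar cells along each upper diagonal.

    A run (i, j, n) means segments i..i+n-1 resemble segments j..j+n-1.
    """
    n = len(binary)
    runs = []
    for offset in range(1, n):
        i = 0
        while i + offset < n:
            if binary[i][i + offset]:
                start = i
                while i + offset < n and binary[i][i + offset]:
                    i += 1
                if i - start >= min_len:
                    runs.append((start, start + offset, i - start))
            else:
                i += 1
    return runs


def rank_clusters(clusters):
    """Order clusters by the summed length of their occurrences, largest first.

    Each cluster is a list of (start, end) occurrences.
    """
    return sorted(
        clusters,
        key=lambda occs: sum(end - start for start, end in occs),
        reverse=True,
    )


# Toy data (hypothetical): segments 0 and 1 recur as segments 3 and 4.
toy_segments = [[0, 2], [2, 0], [1, 1], [0, 2], [2, 0], [5, 5]]
toy_binary = binarize(distance_matrix(toy_segments), threshold=0)
toy_runs = diagonal_runs(toy_binary)
```

In the toy data the only run lies on the diagonal at offset 3, so the two short segments are merged into one longer repeated unit before the second comparison.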
Parameter combinations
We tested the following parameter combinations:
• MIDI pitch
• Sampling rate: 16 samples per quarter note (qn)
• Representation:
  • normalized pitch signal, wavelet coefficients, wavelet coefficients modulus
  • Representation scale: 1 qn
• Segmentation:
  • constant segmentation, zero-crossings, modulus maxima
  • Segmentation scale: 1 and 4 qn
• Threshold for concatenation: 0, 0.1, 1
• Distances:
  • city-block, Euclidean, DTW
• Agglomerative clusters from an agglomerative hierarchical cluster tree
  • Number of clusters: 7
• Ranking criterion: sum of the length of occurrences
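One way to enumerate such a parameter grid is sketched below in Python. The values come from the list above; the key names, the single `distance` field and the full Cartesian product are assumptions, since the slide does not say whether every combination was run or whether the two comparison stages used different distances.

```python
from itertools import product

# Varied parameters as listed on the slide; key names are illustrative.
GRID = {
    "representation": [
        "normalized pitch signal",
        "wavelet coefficients",
        "wavelet coefficients modulus",
    ],
    "segmentation": ["constant", "zero-crossings", "modulus maxima"],
    "segmentation_scale_qn": [1, 4],
    "concatenation_threshold": [0, 0.1, 1],
    "distance": ["city-block", "Euclidean", "DTW"],
}

# Parameters held fixed across all runs.
FIXED = {
    "samples_per_qn": 16,
    "representation_scale_qn": 1,
    "n_clusters": 7,
    "ranking": "sum of the length of occurrences",
}


def parameter_combinations(grid=GRID, fixed=FIXED):
    """Yield one configuration dict per point of the (assumed) full grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        config = dict(fixed)
        config.update(zip(keys, values))
        yield config


all_configs = list(parameter_combinations())
```

Under these assumptions the grid has 3 × 3 × 2 × 3 × 3 points; the number of runs actually performed is not stated in the source.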
Evaluation
• As described at MIREX 2014: Discovery of Repeated Themes & Sections
  • establishment precision, establishment recall, and establishment F1 score;
  • occurrence precision, occurrence recall, and occurrence F1 score;
  • three-layer precision, three-layer recall, and three-layer F1 score;
  • runtime, first five target proportion and first five precision;
  • standard precision, recall, and F1 score
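Of the measures listed, the standard precision, recall and F1 score are the simplest to illustrate. The Python sketch below assumes a pattern counts as found only when it matches a ground-truth pattern exactly; the function name and the toy patterns are hypothetical, and the establishment, occurrence and three-layer variants are not covered.

```python
def standard_prf(ground_truth, discovered):
    """Standard precision, recall and F1 over exactly matching patterns.

    Patterns must be hashable, e.g. frozensets of (onset, pitch) pairs.
    """
    truth = set(ground_truth)
    found = set(discovered)
    hits = len(truth & found)
    precision = hits / len(found) if found else 0.0
    recall = hits / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example (hypothetical): four annotated patterns, three returned, two correct.
a = frozenset({(0, 60), (1, 62)})
b = frozenset({(2, 64), (3, 65)})
c = frozenset({(4, 67), (5, 69)})
d = frozenset({(6, 71), (7, 72)})
x = frozenset({(8, 60), (9, 59)})
toy_p, toy_r, toy_f1 = standard_prf([a, b, c, d], [a, b, x])
```

With two of three returned patterns correct and two of four annotated patterns found, precision is 2/3, recall is 1/2 and F1 is 4/7.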
Results
• On the monophonic version of the JKU Patterns Development Database:
  • J. S. Bach, Fugue BWV 889,
  • Beethoven's Sonata Op. 2, No. 1, Movement 3,
  • Chopin's Mazurka Op. 24, No. 4,
  • Gibbons's Silver Swan, and
  • Mozart's Sonata K.282, Movement 2.
• We selected the best combinations according to representation and segmentation.
Results
Fig. 1. Mean F1 score: mean(F1_est, F1_occ(c=.75), F1_3, F1_occ(c=.5)).
Results
Fig. 2. Standard F1 score.
Results
Fig. 3. Mean runtime per piece.
Our MIREX Submissions VM1 and VM2
• Combinations selected based on
  • mean F1 score: mean(F1_est, F1_occ(c=.75), F1_3, F1_occ(c=.5))
  • standard F1 score
• VM1 differs from VM2 in the following parameters:
  • Normalized pitch signal representation
  • Constant segmentation at the scale of 1 qn
  • Threshold for concatenation 0.1
• VM2 differs from VM1 in the following parameters:
  • Wavelet coefficients representation filtered at the scale of 1 qn
  • Modulus maxima segmentation at the scale of 4 qn
  • Threshold for concatenation 1
Our MIREX Submissions
Table 1. Results of VM1 on the JKU Patterns Development Database.
Table 2. Results of VM2 on the JKU Patterns Development Database.
• Three-layer F1 (χ²(1) = 1.8, p = 0.1797): no significant difference
• Standard F1 (χ²(1) = 4, p = 0.045): VM1 preferred
• Runtime (χ²(1) = 5, p = 0.0253): VM2 preferred
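The reported p-values are consistent with the upper-tail probability of a chi-square distribution with one degree of freedom. The slide does not name the test that produced the statistics, so the Python sketch below only checks that relationship; the function name is hypothetical.

```python
import math


def chi2_upper_tail_1dof(x):
    """P(X > x) for a chi-square variable X with 1 degree of freedom.

    With 1 degree of freedom, X is the square of a standard normal variable,
    so the tail probability reduces to erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# Statistics taken from the slide.
p_three_layer = chi2_upper_tail_1dof(1.8)
p_standard = chi2_upper_tail_1dof(4.0)
p_runtime = chi2_upper_tail_1dof(5.0)
```

At the conventional 0.05 level the first value is not significant and the other two are, which matches the slide's reading of the three comparisons.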
Example: Bach's Fugue BWV 889 prototypical pattern
Observations
• Among the parameters, those of the segmentation stage make the most difference to the results
• In the first stage segmentation
  • The size of the scale affects the standard measures and the runtimes
• In the first comparison
  • Zero-crossings segmentation works best with DTW
  • DTW is much more expensive to compute
Observations
• In the comparison after segmentation, city-block is dominant
  • DTW in the comparison after segmentation does not appear in the best combinations
  • Possibly because there is no ritardando or accelerando in this dataset and/or representation
• For standard measures and a smaller segmentation scale
  • Pitch signal works better than the wavelet representation
• For non-standard measures and a larger segmentation scale
  • Modulus maxima performs slightly better than zero-crossings and constant segmentation
Conclusions
• Our novel wavelet-based method outperforms the methods reported by Meredith (2013) and Nieto & Farbood (2013) on the monophonic version of the JKU PDD training dataset, scoring higher on precision, recall and F1 score, and reporting faster runtimes.
Conclusions
• Among the parameters, those of the segmentation stage make the most difference to the results
• A small scale for first stage segmentation should be preferred for higher values of the standard measures, and a large scale should be preferred for shorter runtimes
• City-block distance should be preferred for the comparison after segmentation
References
[1] T. Collins. “MIREX 2014 competition: Discovery of repeated themes and sections”, 2014. http://www.music-ir.org/mirex/wiki/2014:Discovery_of_Repeated_Themes_%26_Sections. Accessed on 12 May 2014.
[2] K. Jensen, M. Styczynski, I. Rigoutsos and G. Stephanopoulos. “A generic motif discovery algorithm for sequential data”, Bioinformatics, 22:1, pp. 21-28, 2006.
[3] D. Meredith. “COSIATEC and SIATECCompress: Pattern discovery by geometric compression”, Competition on Discovery of Repeated Themes and Sections, MIREX 2013, Curitiba, Brazil, 2013.
[4] O. Nieto and M. Farbood. “Discovering Musical Patterns Using Audio Structural Segmentation Techniques”, Competition on Discovery of Repeated Themes and Sections, MIREX 2013, Curitiba, Brazil, 2013.
[5] G. Velarde, T. Weyde and D. Meredith. “An approach to melodic segmentation and classification based on filtering with the Haar-wavelet”, Journal of New Music Research, 42:4, pp. 325-345, 2013.