Tree structured and combined methods for comparing metered polyphonic music. Kjell Lemström, David Rizo Valero, José Manuel Iñesta. CMMR’08, May 21, 2008.
Outline
• Objectives
• State of the art
• Tree representation of monodies and polyphonic songs
• Comparison of trees for obtaining similarities between songs
• Geometric methods
• Combination of methods
• Experiments
• Conclusions and work lines
Melodic comparison (symbolic)
Given the sequences of notes in the scores … are those tunes the same?
Target • Polyphonic music comparison of whole songs
Approaches to polyphonic comparison
• Convert into monophonic
  • Use sequence comparison
• Adapted text retrieval methods
  • PROMS: Clausen et al. ’00
  • Doraisamy and Rüger ’04: n-grams
• Geometric methods
  • Lubiw and Tanur ’04
  • Ukkonen, Lemström and Mäkinen ’03
+ CMMR’08 session MUSR: Music Retrieval papers
Tree representation for monodies
Tree construction process (Rizo et al. ’03)
• Based on the logarithmic nature of music notation
• Each tree level is a subdivision of the upper level
• Leaf labels can be any pitch magnitude
• Rests are coded the same way as notes
• Duration is implicitly coded in the tree structure
[Figure: a 4/4 bar subdivided level by level: whole (4 beats), halves (2+2), quarters (4×1), eighths (8×½); axes: initial time and duration; example notes F, C, E, G]
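The construction rules above can be sketched in a few lines. This is a minimal illustration, not the procedure from Rizo et al. ’03: it assumes a strictly binary subdivision of the bar, a hypothetical `(label, onset, duration)` note format measured in beats, and a hypothetical `min_length` cut-off that stops the recursion.

```python
from fractions import Fraction

def build_bar_tree(notes, start, length, min_length=Fraction(1, 8)):
    """Build a subdivision tree for the time span [start, start + length).

    notes: list of (label, onset, duration) tuples in beats (assumed format).
    Returns a label for a leaf, or a two-element list of subtrees.
    """
    end = start + length
    overlapping = [n for n in notes if n[1] < end and n[1] + n[2] > start]
    if not overlapping:
        return "rest"  # rests are coded the same way as notes
    for label, onset, duration in overlapping:
        if onset <= start and onset + duration >= end:
            return label  # one note fills the whole span: leaf
    if length <= min_length:
        return overlapping[0][0]  # assumed cut-off: keep the first note
    half = length / 2  # assumption: binary subdivision only
    return [build_bar_tree(notes, start, half, min_length),
            build_bar_tree(notes, start + half, half, min_length)]

# Hypothetical 4/4 bar: C (half), E (quarter), G and F (eighths).
bar = [("C", Fraction(0), Fraction(2)), ("E", Fraction(2), Fraction(1)),
       ("G", Fraction(3), Fraction(1, 2)), ("F", Fraction(7, 2), Fraction(1, 2))]
tree = build_bar_tree(bar, Fraction(0), Fraction(4))
print(tree)  # ['C', ['E', ['G', 'F']]]
```

The depth of each leaf encodes its duration: the half note sits one level below the root, the eighth notes three levels below.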
Tree representation
Representation of whole melodies
• The complete melody (all bars) is a forest (all trees)
• Bars can be grouped sequentially or hierarchically
[Figure: sequential grouping of the bar trees, with leaves labeled by note names]
Polyphonic tree representation
• Process repeated for each voice: replace single labels with sets
• Propagate from the bottom using set union
• Actually, the interval from the tonic is represented in the tree, using tree tonality guessing (Rizo et al. ’06)
[Figure: voices merged into one tree; node labels such as {C}, {F}, {G}, {C,G}, {C,F,G} and {C,G,E} propagate up to the root {C,E,F,G}; as intervals from the tonic: {0}, {5}, {0,7}, {0,5,7}, {0,4,5,7}]
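A small sketch of the bottom-up propagation with set union, followed by the conversion to intervals from the tonic. The `(own_labels, children)` node format and the tree shape are assumptions chosen to reproduce the sets on the slide; the tonic is given by hand here rather than guessed from the tree.

```python
# Pitch classes of the natural notes (semitones above C).
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def propagate(node):
    """Return the union of a node's own labels and all its descendants' labels.

    node: (own_labels, children), a hypothetical format for this sketch.
    """
    own, children = node
    labels = set(own)
    for child in children:
        labels |= propagate(child)
    return labels

def intervals_from_tonic(labels, tonic):
    """Map note names to intervals (in semitones, mod 12) above the tonic."""
    return {(PITCH_CLASS[p] - PITCH_CLASS[tonic]) % 12 for p in labels}

# Assumed tree shape: one node carrying {C,E,G}, one carrying {G} over leaves C and F.
left = ({"C", "E", "G"}, [])
right = ({"G"}, [({"C"}, []), ({"F"}, [])])
root = (set(), [left, right])

print(sorted(propagate(right)))                            # ['C', 'F', 'G']
print(sorted(intervals_from_tonic(propagate(root), "C")))  # [0, 4, 5, 7]
```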
Polyphonic tree representation
• Better tree summarization: use duration importance (rhythmic weights), so that labels become multisets
• Rhythmic weight = $2^{h-l}$, where h = tree height and l = node level
• Example: l = 1: {C=3,E=2,F=1,G=4}; l = 2: {C=2,E=2,G=2} and {C=1,F=1,G=2}; l = 3: {C=1} and {F=1}
• Using the Krumhansl-Schmuckler profiles along with the rhythmic weights was also tested: worse results
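The rhythmic weights can be checked against the numbers on this slide with a short sketch. The `(own_labels, children)` node format and the tree shape are assumptions; the weight $2^{h-l}$ is taken from the slide, with the root at level 1.

```python
from collections import Counter

def height(node):
    """Number of levels in the tree rooted at node."""
    own, children = node
    return 1 + max((height(c) for c in children), default=0)

def weighted_multiset(node, h, level=1):
    """Sum rhythmic weights 2**(h - level) of every label at or below node."""
    own, children = node
    result = Counter({label: 2 ** (h - level) for label in own})
    for child in children:
        result += weighted_multiset(child, h, level + 1)
    return result

# Assumed tree shape, chosen to match the multisets shown on the slide.
left = ({"C", "E", "G"}, [])
right = ({"G"}, [({"C"}, []), ({"F"}, [])])
root = (set(), [left, right])

h = height(root)  # 3
print(dict(weighted_multiset(right, h, level=2)))  # C=1, F=1, G=2
print(dict(weighted_multiset(root, h)))            # C=3, E=2, F=1, G=4
```

A label that sits higher in the tree lasts longer, so it contributes more to the summary stored at the root.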
Comparing songs
• Compare songs = compare trees
• Approaches
  • Classical tree edit distances
    • Shasha
    • Selkow
  • Use only the information of the roots
    • Sequence edit distance
    • Longest Common Subsequence
Tree comparison
• Use only the information in the roots
• Roots contain the summary of their children after propagation
• RootED and LCRS:
  • Let $a$ be a level of tree $T$; compose a sequence $S_a(T)$ with all the nodes at that level in the forest
  • RootED and LCRS use level 1 (the roots)
• Distance between two songs $A$ and $B$ at levels $a$ and $b$:
  • $d(A,B,a,b) = \mathrm{stringDistance}(S_a(A), S_b(B))$, or
  • $d(A,B,a,b) = \mathrm{LCS}(S_a(A), S_b(B))$
• Complexity at level 1: $O(|\mathit{bars}_A| \cdot |\mathit{bars}_B|)$
[Figure: labels of the root of each tree for bars 1 to N, e.g. {C=0.3, E=0.2, G=0.5}, {F=1, G=1, A=1, B=0.2}, {C=0.6, F=0.2}, {C=0.3, E=0.1, G=1}]
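The root-sequence comparison can be sketched as a standard edit distance and a standard LCS over the per-bar root multisets. The substitution cost used here (`overlap_cost`, a weighted Jaccard distance) is a hypothetical stand-in, not the cost used by RootED, and the LCS matches two roots only when they are exactly equal, which is also an assumption. The two example songs reuse multisets from the slide, but their assignment to songs is invented.

```python
def overlap_cost(x, y):
    """Hypothetical substitution cost: 1 - weighted Jaccard similarity."""
    keys = sorted(set(x) | set(y))
    largest = sum(max(x.get(k, 0), y.get(k, 0)) for k in keys)
    if largest == 0:
        return 0.0
    smallest = sum(min(x.get(k, 0), y.get(k, 0)) for k in keys)
    return 1.0 - smallest / largest

def sequence_edit_distance(seq_a, seq_b, sub_cost=overlap_cost, indel_cost=1.0):
    """Dynamic-programming edit distance, O(len(seq_a) * len(seq_b))."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * indel_cost
    for j in range(1, cols):
        d[0][j] = j * indel_cost
    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(d[i - 1][j] + indel_cost,
                          d[i][j - 1] + indel_cost,
                          d[i - 1][j - 1] + sub_cost(seq_a[i - 1], seq_b[j - 1]))
    return d[-1][-1]

def lcs_length(seq_a, seq_b):
    """Length of the longest common subsequence (exact equality of roots)."""
    table = [[0] * (len(seq_b) + 1) for _ in range(len(seq_a) + 1)]
    for i, x in enumerate(seq_a, 1):
        for j, y in enumerate(seq_b, 1):
            table[i][j] = (table[i - 1][j - 1] + 1 if x == y
                           else max(table[i - 1][j], table[i][j - 1]))
    return table[-1][-1]

# One root multiset per bar (values from the slide, grouping into songs assumed).
song_a = [{"C": 0.3, "E": 0.2, "G": 0.5}, {"F": 1, "G": 1, "A": 1, "B": 0.2},
          {"C": 0.6, "F": 0.2}]
song_b = [{"C": 0.3, "E": 0.1, "G": 1}, {"C": 0.6, "F": 0.2}]
print(sequence_edit_distance(song_a, song_b), lcs_length(song_a, song_b))
```

Both tables have one cell per pair of bars, which matches the $O(|\mathit{bars}_A| \cdot |\mathit{bars}_B|)$ complexity stated for level 1.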
Multiset substitution cost
• Define a multiset as a vector:
  • Index = interval from the tonic
  • Value = cardinality
  • E.g. {C=1, G=4, B=2} is defined as [1,0,0,0,0,0,0,4,0,0,0,2]
• Multiset substitution cost between multisets X and Y, represented by vectors v and w
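The vector encoding can be reproduced with a short helper. This sketch assumes a tonic of C, which is what the slide's example implies, and only handles natural note names; the substitution cost itself is not reproduced here.

```python
# Pitch classes of the natural notes (semitones above C).
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def multiset_to_vector(multiset, tonic="C"):
    """Encode a multiset of note names as a 12-dimensional vector.

    Index = interval from the tonic in semitones, value = cardinality.
    """
    vector = [0] * 12
    for pitch, count in multiset.items():
        interval = (PITCH_CLASS[pitch] - PITCH_CLASS[tonic]) % 12
        vector[interval] += count
    return vector

print(multiset_to_vector({"C": 1, "G": 4, "B": 2}))
# [1, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 2]
```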
Graphical representations
• P1, P2, P3 algorithms from Ukkonen, Lemström and Mäkinen ’03
• P2v5, P2v6: indexed versions of P2 (not published yet)
Method combination • Dissimilarity measure for a method = distance between songs • Combined dissimilarity measure = combination of distances between songs • Combination = sum of normalized distances
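A minimal sketch of the combination step. The slide only says that normalized distances are summed; the min-max normalization used below is an assumption, as are the function names and the example numbers.

```python
def normalize(distances):
    """Min-max scale one method's distances to [0, 1] (assumed normalization)."""
    low, high = min(distances), max(distances)
    if high == low:
        return [0.0 for _ in distances]
    return [(d - low) / (high - low) for d in distances]

def combine(distances_per_method):
    """Sum the normalized distances of several methods, candidate by candidate."""
    normalized = [normalize(d) for d in distances_per_method]
    return [sum(column) for column in zip(*normalized)]

# Distances from one query to three candidate songs under two hypothetical methods.
method_1 = [0.0, 5.0, 10.0]
method_2 = [2.0, 2.5, 4.0]
print(combine([method_1, method_2]))  # [0.0, 0.75, 2.0]
```

Normalizing first keeps a method with a large numeric range from dominating the sum.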
Experiments
• Corpora:
  • ICPS (68 files):
    • 7 different polyphonic incipits: Schubert’s Ave Maria, Ravel’s Bolero, Alouette, Happy Birthday, Frère Jacques, Jingle Bells, When The Saints Go Marching In
    • Covers made up of polyphonic piano files + “Band in a Box” variations
  • VAR (78 files):
    • Bach Goldberg Variations
    • Bach English Suites variations
    • Some Tchaikovsky variations
Evaluation method
• Leave-one-out
  • All-against-all: each song S is compared with the rest of the songs; the result is an ordered list with the most similar songs first
• Accuracy
  • Top-recognition rate (TRRn): percentage of queries for which a version of the song S appears among the top n slots
  • Success rate = TRR0
  • Precision-at-|class|
    • |class| = number of versions of the same song
• Times
  • Exclude preprocessing times: preprocessing is performed only once, at system startup
• Averages: all results are averages over all queries
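The evaluation measures can be sketched from a full distance matrix. Several details are assumptions: `n` counts slots starting at 1 (the slide's own indexing may differ), |class| is taken as the number of other versions of the query song, and the matrix and labels are invented for illustration.

```python
def rank_others(query, distances):
    """Indices of all other songs, most similar (smallest distance) first."""
    others = [j for j in range(len(distances)) if j != query]
    return sorted(others, key=lambda j: distances[query][j])

def top_recognition_rate(distances, labels, n):
    """Fraction of queries with a version of the same song in the top n slots."""
    hits = 0
    for q in range(len(labels)):
        top = rank_others(q, distances)[:n]
        hits += any(labels[j] == labels[q] for j in top)
    return hits / len(labels)

def precision_at_class(distances, labels):
    """Mean fraction of same-song versions within the top |class| slots."""
    scores = []
    for q in range(len(labels)):
        size = labels.count(labels[q]) - 1  # assumed: other versions only
        if size == 0:
            continue  # a song without other versions cannot be evaluated
        top = rank_others(q, distances)[:size]
        scores.append(sum(labels[j] == labels[q] for j in top) / size)
    return sum(scores) / len(scores)

# Hypothetical corpus: two versions each of songs "a" and "b".
labels = ["a", "a", "b", "b"]
distances = [[0, 1, 4, 5],
             [1, 0, 2, 6],
             [4, 2, 0, 3],
             [5, 6, 3, 0]]
print(top_recognition_rate(distances, labels, 1))  # 0.75
print(top_recognition_rate(distances, labels, 2))  # 1.0
print(precision_at_class(distances, labels))       # 0.75
```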
Results: ICPS
[Figures: success rate of the combined method; time and success rate]
Results: VAR
[Figures: success rate of the combined method; success rate]
Top-recognition-rate: ICPS Combined method gets a good result
Top-recognition-rate: VAR Combined method is the best one: combined methods are more robust
Conclusions and work lines
• Very hard task when the MIDI files are real ones
• Preprocess songs: use automatic tonal analysis + tree propagation to remove non-important notes in songs
• Improve results by combining a larger number of different classifiers
• Tune the tree comparison measures: submitted
• Add the fast LCS implementation from Hyyrö ’04
• Add confidence values to LCS
• Include meter extraction methods to build the trees
[Figure labels: Query, MIDI]
Melody = sequence of notes
• String representation + string distances (Mongeau and Sankoff ’90, Lemström 2000)
  • e.g. GGAGCBGG, GAGAGGCBB
• Symbols are combinations of pitch × rhythm
  • Pitch can be: absolute pitch, pitch class, interval from tonic, interval, contour, high-def contour, nothing
  • Rhythm can be: absolute, inter-onset interval, inter-onset ratio, contour, nothing
  • e.g.: (G4,8)(G4,8)(A4,4)(G4,4)(C5,4)(B4,2)(G4,8)(G4,8)
• Best comparison results using intervals with no rhythm information
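To illustrate two of the pitch encodings listed above, the following sketch derives the interval and contour sequences of the example melody. The helper names are hypothetical, only natural note names are parsed, and the MIDI convention C4 = 60 is assumed.

```python
# Pitch classes of the natural notes (semitones above C).
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def midi_number(name):
    """Convert a name such as 'G4' to a MIDI number (C4 = 60, naturals only)."""
    letter, octave = name[:-1], int(name[-1])
    return 12 * (octave + 1) + PITCH_CLASS[letter]

def intervals(names):
    """Successive pitch differences in semitones."""
    pitches = [midi_number(n) for n in names]
    return [b - a for a, b in zip(pitches, pitches[1:])]

def contour(names):
    """Direction of each step: 1 up, -1 down, 0 repeated note."""
    return [(i > 0) - (i < 0) for i in intervals(names)]

melody = ["G4", "G4", "A4", "G4", "C5", "B4", "G4", "G4"]
transposed = ["C5", "C5", "D5", "C5", "F5", "E5", "C5", "C5"]

print(intervals(melody))                           # [0, 2, -2, 5, -1, -4, 0]
print(contour(melody))                             # [0, 1, -1, 1, -1, -1, 0]
print(intervals(melody) == intervals(transposed))  # True
```

The interval sequence is unchanged by transposition, which is one reason interval-based symbols are attractive for comparing versions of the same tune.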
String distances
• Drawbacks of the comparison without rhythm
• Wrong results with:
  • Same melodic content and different rhythm
  • Too many ornament notes
[Figure: score excerpts with their edit distances; examples labeled Hungarian dance, Schubert, Godfather theme]
Tree representation
Tree construction process
• Rules (Rizo et al., 2003)
• Propagation and pruning
[Figure: tree as initially coded from the score, with leaf labels s, F, F, A, G, and the same tree once a maximum pruning level is defined]
Tree representation
Melodic similarity metrics
• Tree edit distance (Zhang & Shasha, 1989): the distance $d(t_1,t_2)$ is computed as the cost of the operations needed to transform one tree into the other
• Weighted operations: insertion, deletion, replacement
• Tree edit distance complexity: $O(|T_1|\,|T_2|\,h(T_1)\,h(T_2))$
• The previous pruning process helps to overcome this complexity
(Zhang & Shasha, “Simple fast algorithms for the editing distance between trees...”. SIAM J. Comput., 18(6): 1245-1262, 1989)
[Figure: two example trees t1 and t2 with leaves labeled C, G and A, and their distance d(t1,t2)]
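The idea of an ordered tree edit distance can be shown with a compact memoized recursion over forests. This is a naive sketch of the textbook recurrence, not the Zhang & Shasha algorithm and without its complexity bound; it assumes unit costs for all three operations and a hypothetical `(label, children)` tuple format.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Edit distance between two ordered forests (tuples of (label, children))."""
    if not f1 and not f2:
        return 0
    if not f1:
        _, children = f2[-1]
        return forest_distance(f1, f2[:-1] + children) + 1  # insert a node
    if not f2:
        _, children = f1[-1]
        return forest_distance(f1[:-1] + children, f2) + 1  # delete a node
    (label1, children1), (label2, children2) = f1[-1], f2[-1]
    return min(
        # delete the root of the rightmost tree of f1
        forest_distance(f1[:-1] + children1, f2) + 1,
        # insert the root of the rightmost tree of f2
        forest_distance(f1, f2[:-1] + children2) + 1,
        # match both roots, replacing the label if it differs
        forest_distance(children1, children2)
        + forest_distance(f1[:-1], f2[:-1])
        + (label1 != label2),
    )

def tree_edit_distance(t1, t2):
    """Unit-cost edit distance between two ordered labeled trees."""
    return forest_distance((t1,), (t2,))

t1 = ("C", (("A", ()), ("G", ())))
t2 = ("C", (("A", ()),))
print(tree_edit_distance(t1, t2))  # 1
```

Pruning the trees beforehand shrinks both forests, which reduces the number of subproblems this recursion has to solve.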