
Similarity Matrix Processing for Music Structure Analysis

Yu Shiu, Hong Jeng, C.-C. Jay Kuo. ACM Multimedia 2006.


Presentation Transcript


  1. Similarity Matrix Processing for Music Structure Analysis • Yu Shiu, Hong Jeng, C.-C. Jay Kuo • ACM Multimedia 2006

  2. System Framework

  3. Pitch Class Profile (PCP) • The PCP vector is a 12-dimensional vector that gives the relative intensities of the 12 pitch classes {C, C#, D, D#, E, F, F#, G, G#, A, A#, B} • Each PCP vector is normalized to unit length
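As a rough illustration of the PCP definition above, the sketch below folds a list of spectral peaks into 12 pitch-class bins and normalizes the result to unit length. The function name `pcp_vector`, the (frequency, magnitude) input format, and the A4 = 440 Hz tuning reference are assumptions made for this example; the slides do not say how the authors map spectral energy to pitch classes.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pcp_vector(peaks, ref_a4=440.0):
    """Fold (frequency_hz, magnitude) pairs into a unit-norm 12-bin PCP.

    Hypothetical helper: the tuning reference and the nearest-semitone
    rounding are assumptions, not taken from the slides.
    """
    bins = [0.0] * 12
    for freq, mag in peaks:
        if freq <= 0.0:
            continue
        midi = 69 + 12 * math.log2(freq / ref_a4)  # MIDI note number, A4 = 69
        bins[round(midi) % 12] += mag              # MIDI 60 (C4) falls in bin 0
    norm = math.sqrt(sum(b * b for b in bins))
    return [b / norm for b in bins] if norm > 0 else bins

# A C major triad (C4, E4, G4) with equal magnitudes
pcp = pcp_vector([(261.63, 1.0), (329.63, 1.0), (392.00, 1.0)])
active = [PITCH_CLASSES[i] for i, v in enumerate(pcp) if v > 0]
```

For the triad above, only the C, E and G bins are non-zero and the vector has unit length.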

  4. Pitch Class Profile (PCP)

  5. Measure-based Similarity Matrix • Previous work: the similarity matrix is built from frames of a pre-defined window size, which yields a large matrix and makes further processing more expensive • This paper: the measure is used as the element of the similarity matrix

  6. Measure-based Similarity Matrix • PCP vector generation: the window size is set to the duration of one half beat • Onset detection: the onset signal is the change in spectral content between adjacent sliding windows, each 20 ms long with 50% overlap
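One way to picture the onset signal is as a spectral-flux curve. The sketch below uses 20 ms windows with 50% overlap, as on the slide, and sums the increases in magnitude between adjacent windows. Counting only increases (half-wave rectification), the rectangular window, and the naive DFT are assumptions chosen to keep the example dependency-free; the slides do not specify the exact measure of spectral change.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but needs no libraries)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def onset_signal(samples, sample_rate, win_sec=0.020, overlap=0.5):
    """Positive spectral change between each pair of adjacent windows."""
    win = int(round(win_sec * sample_rate))
    hop = max(1, int(round(win * (1.0 - overlap))))
    prev, flux = None, []
    for start in range(0, len(samples) - win + 1, hop):
        spec = magnitude_spectrum(samples[start:start + win])
        if prev is not None:
            # assumption: count only increases in magnitude (half-wave rectified)
            flux.append(sum(max(c - p, 0.0) for c, p in zip(spec, prev)))
        prev = spec
    return flux

# Toy input: 0.1 s of silence followed by 0.1 s of a 440 Hz tone at 4 kHz
rate = 4000
signal = [0.0] * 400 + [math.sin(2 * math.pi * 440 * t / rate) for t in range(400)]
flux = onset_signal(signal, rate)
peak_index = flux.index(max(flux))  # lands where the tone starts
```

On real audio an FFT routine would replace `magnitude_spectrum`; the toy sample rate is kept low only so the naive DFT stays fast.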

  7. Measure-based Similarity Matrix • The autocorrelation function (ACF) of the onset signal is computed to determine the beat period • Example: at 100 BPM a beat lasts 600 ms, so a half beat lasts 300 ms • This is longer than the window size commonly used in previous work
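The beat-period step can be sketched as picking the lag that maximizes the autocorrelation of the onset signal. The lag search range, the unnormalized ACF, and the 10 ms frame spacing (20 ms windows at 50% overlap) are assumptions for this example; `beat_period_frames` and `half_beat_ms` are hypothetical helper names.

```python
def beat_period_frames(onset, min_lag, max_lag):
    """Lag (in onset frames) with the largest raw autocorrelation."""
    def acf(lag):
        return sum(onset[t] * onset[t + lag] for t in range(len(onset) - lag))
    return max(range(min_lag, max_lag + 1), key=acf)

def half_beat_ms(bpm):
    """Duration of half a beat in milliseconds."""
    return 60000.0 / bpm / 2.0

# Toy onset signal: one impulse every 60 frames, 10 ms per frame
frame_ms = 10
onset = [1.0 if t % 60 == 0 else 0.0 for t in range(600)]
period_ms = beat_period_frames(onset, min_lag=30, max_lag=100) * frame_ms
bpm = 60000.0 / period_ms
window_ms = half_beat_ms(bpm)
```

With impulses 600 ms apart the estimate is 100 BPM, which reproduces the 300 ms half-beat window from the slide. A raw ACF is biased toward short lags on real signals, so a practical version would likely normalize it or restrict the lag range to plausible tempi.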

  8. Measure-based Similarity Matrix • N successive PCP vectors are grouped into one measure • Since PCP vectors are non-negative unit vectors, the similarity s_ij between measures i and j satisfies 0 <= s_ij <= 1 • Dynamic time warping (DTW) can be used to enhance the s_ij value
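The formula for s_ij did not survive in this transcript. A minimal sketch, assuming s_ij is the mean inner product of position-aligned PCP vectors in measures i and j, shows why the value stays between 0 and 1: each inner product of two non-negative unit vectors lies in that range, and so does their average. The function names are hypothetical.

```python
def measure_similarity(measure_a, measure_b):
    """Mean inner product of position-aligned PCP vectors (assumed form of s_ij)."""
    if len(measure_a) != len(measure_b):
        raise ValueError("measures must hold the same number of PCP vectors")
    dots = [sum(x * y for x, y in zip(u, v)) for u, v in zip(measure_a, measure_b)]
    return sum(dots) / len(dots)

def similarity_matrix(measures):
    """Square matrix with one row and one column per measure."""
    return [[measure_similarity(a, b) for b in measures] for a in measures]

def one_hot(index):
    """Unit PCP vector with all energy in one pitch class (toy data)."""
    return [1.0 if i == index else 0.0 for i in range(12)]

c, g = one_hot(0), one_hot(7)
measures = [[c] * 8, [g] * 8, [c] * 4 + [g] * 4]  # N = 8 half beats per measure
msm = similarity_matrix(measures)
```

Here `msm[0][0]` is 1.0, `msm[0][1]` is 0.0, and `msm[0][2]` is 0.5, since the third measure shares its first half with the first measure.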

  9. Dynamic Time Warping
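The slide for this step carries no text, so the following is only a generic DTW sketch, not the authors' formulation. It aligns the PCP sequences of two measures, using 1 minus the inner product as the local distance, and reports the mean similarity along the cheapest alignment path. The step pattern, the distance, and the path-length normalization are all assumptions.

```python
def dtw_similarity(measure_a, measure_b):
    """Align two PCP sequences with DTW; return mean similarity on the best path."""
    n, m = len(measure_a), len(measure_b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    steps = [[0] * (m + 1) for _ in range(n + 1)]  # path length behind cost[i][j]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dot = sum(x * y for x, y in zip(measure_a[i - 1], measure_b[j - 1]))
            prev_cost, prev_steps = min(
                (cost[i - 1][j - 1], steps[i - 1][j - 1]),  # match
                (cost[i - 1][j], steps[i - 1][j]),          # stretch b
                (cost[i][j - 1], steps[i][j - 1]),          # stretch a
            )
            cost[i][j] = prev_cost + (1.0 - dot)
            steps[i][j] = prev_steps + 1
    return 1.0 - cost[n][m] / steps[n][m]

def one_hot(index):
    """Unit PCP vector with all energy in one pitch class (toy data)."""
    return [1.0 if i == index else 0.0 for i in range(12)]

c, g = one_hot(0), one_hot(7)
shifted = dtw_similarity([c, c, g, g], [c, g, g, g])  # chord change arrives early
unrelated = dtw_similarity([c, c], [g, g])
```

In the first pair the chord change happens one half beat earlier in the second measure. A position-by-position comparison would score 0.75, while the warped alignment scores 1.0, which illustrates how DTW can raise the similarity of measures that differ only in timing.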

  10. Measure-based Similarity Matrix • After this simplification, a 3-minute song at 100 BPM (300 beats, i.e. 75 measures of four beats) yields a 75 × 75 measure-based similarity matrix (MSM) • The MSM reveals chord similarity more than melody similarity
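The 75 × 75 figure can be checked with a few lines of arithmetic. Four beats per measure is an assumption here: the slide does not state the meter, but it is the value that reproduces the quoted size.

```python
def measures_in_song(duration_sec, bpm, beats_per_measure=4):
    """Number of whole measures in a song (meter is an assumed parameter)."""
    beats = duration_sec / 60.0 * bpm
    return int(beats // beats_per_measure)

n_measures = measures_in_song(180, 100)   # 3 minutes at 100 BPM
matrix_cells = n_measures * n_measures    # entries in the measure-based matrix
```

This gives 75 measures and 5,625 matrix entries.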

  11. Two MSM Examples • Johnny Cash’s Hurt repeatedly uses the chord succession {Am, Am, C, D} in the 1st and 3rd sections and {G, A, F, C} in the 2nd and 4th sections • The Beatles’ Yesterday has no chord succession with a short period. Its musical form is P = {I V V C V C V O}

  12. Detection of Local Similarity • Local similarity is detected with a 2D moving window

  13. Detection of Local Similarity • The 2D window is moved along the diagonal of the MSM

  14. Detection of Long Range Similarity • The Viterbi algorithm is used to find segments with consecutive large similarity values along the 45-degree direction • The output of the second module, which provides the chord-succession similarity, can be exploited to enhance long-range similarity detection

  15. Detection of Long Range Similarity • The x-axis of the MSM is interpreted as “time” and the y-axis as “state”

  16. Detection of Long Range Similarity • “Scores” are used instead of “probabilities” • The score of a path is defined as the product of the similarity values of all states on the path and the scores of all its state transitions

  17. Detection of Long Range Similarity • The transition scores satisfy PT0 > PT1, which guarantees a preference for the 45-degree direction • The larger the ratio PT0/PT1, the more strongly the path is favored to proceed along the 45-degree direction • In our experiment, the ratio PT0/PT1 is chosen to be 1.5
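The score-based Viterbi search described on the last few slides can be sketched as follows. The sketch assumes that PT0 is the transition score for a step along the 45-degree direction (state j to j + 1 as time advances by one) and PT1 the score for any other transition; the transcript does not define them explicitly. Working with log scores and flooring similarities at a small `eps` are implementation choices of this example, and the function name is hypothetical.

```python
import math

def viterbi_diagonal(sim, pt0=1.5, pt1=1.0, eps=1e-9):
    """Best-scoring path through sim[t][j], favoring steps j -> j + 1.

    Path score = product of visited similarities and transition scores,
    computed in the log domain. The meaning of pt0/pt1 is an assumption.
    """
    n_time, n_state = len(sim), len(sim[0])
    log_q = [[math.log(max(s, eps)) for s in sim[0]]]
    back = [[-1] * n_state]
    for t in range(1, n_time):
        row, ptr = [], []
        for j in range(n_state):
            # score of arriving at state j from every previous state k
            cands = [log_q[t - 1][k] + math.log(pt0 if j == k + 1 else pt1)
                     for k in range(n_state)]
            k_best = max(range(n_state), key=cands.__getitem__)
            row.append(cands[k_best] + math.log(max(sim[t][j], eps)))
            ptr.append(k_best)
        log_q.append(row)
        back.append(ptr)
    # back-track from the best final state
    j = max(range(n_state), key=log_q[-1].__getitem__)
    path = [j]
    for t in range(n_time - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    return path[::-1]

# Toy matrix: a stripe of high similarity two states above the main diagonal
sim = [[0.9 if j == t + 2 else 0.1 for j in range(6)] for t in range(4)]
path = viterbi_diagonal(sim)
```

On the toy matrix the recovered path follows the stripe, visiting states 2, 3, 4 and 5 at times 0 to 3.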

  18. Detection of Long Range Similarity • Pruning with chord succession information • Sections with repetitive chord successions of a certain period should be similar to sections with the same period • Each measure is tagged with a period value p

  19. Detection of Long Range Similarity

  20. Post-processing • Back-tracking starts from the state j that gives the highest Q(L, j) at time L • Segments shorter than φ measures are removed; in our implementation, φ = 8 • Segments whose mean similarity value is less than a threshold τ are removed, where τ = mean + standard deviation of all s_ij
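The two pruning rules above can be sketched in a few lines. Representing a segment as the list of similarity values along its path is an assumption of this example, as is the use of the population standard deviation; the slides do not say which estimator is used.

```python
import statistics

def similarity_threshold(sim):
    """tau = mean + standard deviation over all matrix entries."""
    values = [s for row in sim for s in row]
    return statistics.mean(values) + statistics.pstdev(values)  # population std assumed

def filter_segments(segments, tau, phi=8):
    """Keep segments with at least phi measures and mean similarity >= tau."""
    return [seg for seg in segments
            if len(seg) >= phi and statistics.mean(seg) >= tau]

tau = similarity_threshold([[1.0, 0.2], [0.2, 1.0]])   # toy 2 x 2 matrix
segments = [[0.9] * 10, [0.9] * 5, [0.3] * 12]          # hypothetical candidates
kept = filter_segments(segments, tau=0.6)
```

Of the three candidate segments, only the first survives: the second is shorter than 8 measures and the third has a mean similarity below the threshold.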

  21. Post-processing • A segment is divided if its two corresponding sections in the song overlap each other, or if the similarity values differ significantly before and after some point in the segment • If sections conflict, the one with the higher similarity value has priority in keeping its boundaries • For songs in verse-chorus form, similarity values are clustered into two classes, and the class with high similarity values is taken to be the chorus
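The slides do not name the clustering method used to split similarity values into two classes. As a stand-in, the sketch below runs a one-dimensional k-means with k = 2, started from the minimum and maximum values, and flags members of the higher class. The function name and the sample values are hypothetical.

```python
def two_class_split(values, iterations=20):
    """1-D k-means with k = 2; returns True for members of the high class."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        high = [v for v in values if abs(v - hi) < abs(v - lo)]
        low = [v for v in values if abs(v - hi) >= abs(v - lo)]
        if not high or not low:
            break  # all values fall in one class; nothing to separate
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    return [abs(v - hi) < abs(v - lo) for v in values]

# Hypothetical mean similarity values of five repeated sections
section_similarity = [0.62, 0.60, 0.91, 0.88, 0.65]
is_chorus = two_class_split(section_similarity)
```

With these values the third and fourth sections fall into the high-similarity class and would be labeled as chorus candidates.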

  22. Experiment • The test collection contains 120 pop, country and rock songs from after the 1960s • 100 of them are in verse-chorus form and 20 are in AAA or other forms • The audio is mono, sampled at 22,050 Hz with 16 bits per sample

  23. Experimental Results • The pattern extraction of a song is counted as correct if all patterns in the song are extracted, without distinguishing between verse and chorus • The correct detection rate is 112/120 = 93.33%

  24. Experimental Results
