Robust Temporal and Spectral Modeling for Query By Melody • Shai Shalev, Hebrew University • Yoram Singer, Hebrew University • Nir Friedman, Hebrew University • Shlomo Dubnov, Ben-Gurion University
Problem Setting • Query: a melody • Database of real recordings • Find: performances of the queried melody
Challenge • Find performances of the queried melody independent of: • Tempo • Performing instrument • Dynamics • Expression • Accompaniment
Related Work • A. Ghias, et al. “Query by humming” • A. S. Durey and M. A. Clements. “Melody spotting using hidden Markov models” • C. Raphael. “Automatic segmentation of acoustic musical signals using HMMs” • B. Doval and X. Rodet. “Fundamental frequency estimation using a new harmonic matching method”
Overview of Solution • Employ a statistical framework • Align a melody to a performance using explicit tempo modeling • Employ a maximum likelihood model for the spectrum of a note given the note’s pitch value • Find the best alignment of a melody to a performance using dynamic programming
Statistical Framework • Input: a melody query and a database of real recordings • Query engine: for each recording, find [equation not recovered] • Output: a ranked list of recordings, ordered according to [equation not recovered]
Melody Modeling • [Figure: graphical model relating Tempo, Melody, Aligned Melody, and Sound. Legend: hidden variable, observed variable.]
Tempo Modeling • Sequence of scaling factors (one per note) • Model tempo as a first-order Markov model • Use a log-normal distribution to model the conditional probability of the tempo
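The tempo model above can be illustrated with a minimal sketch. It assumes that each scaling factor is log-normally distributed around the previous note's scaling factor, with a spread `sigma` chosen here purely for illustration; the function name, the centering on the previous value, and the value of `sigma` are assumptions, not details taken from the slides.

```python
import math

def log_tempo_sequence_prob(tempos, sigma=0.3):
    """Log-probability of a sequence of per-note scaling factors.

    Assumption (not from the slides): given the previous factor, the
    current factor is log-normal with log-mean log(prev) and log-scale sigma.
    The first factor is treated as given (no prior term is added).
    """
    total = 0.0
    for prev, cur in zip(tempos, tempos[1:]):
        z = (math.log(cur) - math.log(prev)) / sigma
        # Log of the log-normal density evaluated at `cur`.
        total += -0.5 * z * z - math.log(cur * sigma * math.sqrt(2.0 * math.pi))
    return total

# A steady tempo should score higher than an erratic one.
steady = log_tempo_sequence_prob([1.0, 1.0, 1.0])
erratic = log_tempo_sequence_prob([1.0, 2.0, 0.5])
print(steady > erratic)
```

Because each factor depends only on its predecessor, the sequence probability factorizes into one term per note transition, which is what makes the model first-order.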
Spectral Modeling (cont.) • Estimate the amplitude of each harmonic and the global variance of the noise using the maximum likelihood principle • Resulting signal-to-noise likelihood function:
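The slide's own likelihood formula is not reproduced here. As a rough illustration of the idea, the sketch below fits a harmonic-plus-noise model to one frame by maximum likelihood. It assumes white Gaussian noise and harmonics that fall on whole numbers of cycles per frame (so the sinusoids are orthogonal and the amplitude estimates reduce to projections). These simplifications, and the name `harmonic_log_likelihood`, are assumptions for illustration only and are not claimed to match the HSN or HIN models.

```python
import math

def harmonic_log_likelihood(frame, f0_bin, n_harmonics):
    """Maximized Gaussian log-likelihood of `frame` given a pitch.

    f0_bin: fundamental frequency in whole cycles per frame (assumption).
    The ML amplitudes are projections onto cos/sin at each harmonic; the
    ML noise variance is the residual energy divided by the frame length.
    """
    n = len(frame)
    total_energy = sum(x * x for x in frame)
    harmonic_energy = 0.0
    for h in range(1, n_harmonics + 1):
        k = h * f0_bin
        if k >= n / 2:  # stay strictly below the Nyquist bin
            break
        c = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        s = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        # Energy explained by the ML fit of this harmonic.
        harmonic_energy += 2.0 * (c * c + s * s) / n
    noise_var = max((total_energy - harmonic_energy) / n, 1e-12)
    # Plugging the ML variance back in leaves a function of the
    # residual (noise) energy only.
    return -0.5 * n * (math.log(2 * math.pi * noise_var) + 1.0)

# Synthetic frame: a tone at 4 cycles per frame plus its second harmonic
# and a small deterministic perturbation standing in for noise.
n = 64
frame = [math.cos(2 * math.pi * 4 * t / n)
         + 0.5 * math.cos(2 * math.pi * 8 * t / n)
         + 0.01 * ((t * 7) % 5 - 2)
         for t in range(n)]
print(harmonic_log_likelihood(frame, 4, 2) > harmonic_log_likelihood(frame, 5, 2))
```

In this sketch the likelihood depends on the frame only through the ratio of residual energy to frame length, so a pitch hypothesis whose harmonics capture more of the signal energy receives a higher score.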
Finding the best melody-performance alignment • Dynamic programming procedure • Recurse over the tempo and end-time of the previous note • Complexity grows with the number of possible tempo values, the number of notes, and the length of the signal
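A dynamic program of this kind might look like the following sketch. It is not the authors' procedure: it assumes the melody starts at frame 0, that tempo takes values from a small discrete grid, that tempo transitions are scored with a log-normal density, and that a table `frame_loglik[t][p]` of per-frame log-likelihoods for each pitch is already available. All names are hypothetical.

```python
import math

def log_tempo_transition(prev, cur, sigma):
    """Log-normal transition score between tempo values (assumed form)."""
    z = (math.log(cur) - math.log(prev)) / sigma
    return -0.5 * z * z - math.log(cur * sigma * math.sqrt(2.0 * math.pi))

def best_alignment(durations, pitches, frame_loglik, tempos, sigma=0.3):
    """Return (best score, end frame of each note) for one recording.

    State after each note: (end frame, tempo index). Each note's length in
    frames is its nominal duration times the tempo scaling factor.
    """
    n_frames = len(frame_loglik)
    # Prefix sums per pitch give the score of any frame segment in O(1).
    prefix = {p: [0.0] for p in set(pitches)}
    for p, sums in prefix.items():
        for t in range(n_frames):
            sums.append(sums[-1] + frame_loglik[t][p])

    prev = {(0, None): 0.0}  # assumption: the melody starts at frame 0
    back = []
    for d, p in zip(durations, pitches):
        cur, ptr = {}, {}
        for (t0, s0), score0 in prev.items():
            for s, tempo in enumerate(tempos):
                t1 = t0 + max(1, round(d * tempo))
                if t1 > n_frames:
                    continue
                trans = 0.0 if s0 is None else log_tempo_transition(tempos[s0], tempo, sigma)
                score = score0 + trans + prefix[p][t1] - prefix[p][t0]
                if score > cur.get((t1, s), float("-inf")):
                    cur[(t1, s)] = score
                    ptr[(t1, s)] = (t0, s0)
        prev = cur
        back.append(ptr)

    if not prev:
        return float("-inf"), []
    state = max(prev, key=prev.get)
    best = prev[state]
    ends = []
    for ptr in reversed(back):  # follow back-pointers to recover end frames
        ends.append(state[0])
        state = ptr[state]
    return best, ends[::-1]

# Toy recording: frames 0-1 match pitch 0, frames 2-3 match pitch 1.
loglik = [[0.0, -5.0], [0.0, -5.0], [-5.0, 0.0], [-5.0, 0.0], [-5.0, -5.0], [-5.0, -5.0]]
score, ends = best_alignment([2, 2], [0, 1], loglik, tempos=[0.5, 1.0, 2.0])
print(ends)
```

For every note, the loops visit each reachable (end frame, tempo) state and each candidate tempo once, which is why the cost depends on the tempo grid size, the number of notes, and the signal length.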
Experimental Results • Queries: 50 melodies from opera arias (from MIDI files) • Database: over 800 performances of opera arias performed by over 50 tenors with full orchestral accompaniment • Compared our variable-tempo (VT) model vs. fixed-tempo (FT) and locally-fixed-tempo (LFT) models • Compared our Harmonic with Scaled Noise (HSN) spectral model vs. Harmonic with Independent Noise (HIN) model
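Evaluation Measures • [Figure: likelihood value (vertical axis) against the index of the performance in the ranked list (horizontal axis, 1 to 5). Each performance is marked + or -. The example is annotated with Oerr = 0 and Cov = 3 - 2.]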
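The slides do not spell out the definitions of the three measures, so the following sketch uses common ranking-metric conventions as an assumption: one-error is 1 when the top-ranked performance is not a correct one; coverage is the rank of the last correct performance minus the number of correct performances; average precision is the mean of the precision values at each correct performance. The function names are hypothetical, and each function assumes the list contains at least one correct performance.

```python
def one_error(relevant):
    """1 if the top-ranked item is not a correct performance, else 0."""
    return 0 if relevant[0] else 1

def coverage(relevant):
    """Rank of the last correct item minus the number of correct items.

    Assumed definition: it is 0 when all correct items are ranked first.
    """
    last_rank = max(rank for rank, r in enumerate(relevant, start=1) if r)
    return last_rank - sum(relevant)

def average_precision(relevant):
    """Mean of precision-at-rank over the ranks of the correct items."""
    hits, total = 0, 0.0
    for rank, r in enumerate(relevant, start=1):
        if r:
            hits += 1
            total += hits / rank
    return total / hits

# Ranked list of five performances; True marks a correct performance.
ranking = [True, False, True, False, False]
print(one_error(ranking), coverage(ranking), average_precision(ranking))
```

Under these assumed definitions, lower one-error and coverage are better, while higher average precision is better.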
Summary of Results • One Error of VT+HSN: 8% • Average Precision of VT+HSN: 95% • Coverage of VT+HSN: 0.21
Results

                         HSN                     HIN
Length    Tempo    AvgP   Cov     Oerr     AvgP   Cov     Oerr
25 Sec.   VT       0.95    0.21   0.08     0.92    0.40   0.10
25 Sec.   LFT      0.66    5.90   0.46     0.63    5.98   0.48
25 Sec.   FT       0.34   20.69   0.77     0.33   22.46   0.79
15 Sec.   VT       0.86    1.75   0.19     0.83    3.02   0.19
15 Sec.   LFT      0.66    8.10   0.44     0.66    8.15   0.42
15 Sec.   FT       0.38   19.83   0.71     0.36   19.08   0.73
5 Sec.    VT       0.51   10.67   0.65     0.46   11.83   0.69
5 Sec.    LFT      0.43   17.33   0.69     0.37   17.94   0.75
5 Sec.    FT       0.38   22.96   0.69     0.35   21.67   0.75

HSN and HIN are the spectral distribution models; VT, LFT, and FT are the tempo models.
Future Work • More data • Other genres of music • Alternative spectral distribution models using supervised learning methods • Use alignment results for separating a soloist from the accompaniment