Finding a single voice in music Christine Smit April 26, 2007
Outline
• Introduction
• Classification Strategies:
  • Counting silent frequency bins
  • Pitch cancellation
  • MFCCs
• Trading recall for precision
• What worked and what didn’t
Introduction What am I doing?
What is a ‘single voice’? • a single note sounding at a time
Why do this? single voice finder + instrument identifier = instrument sample library
What are the data sets?
• training set: 10 1-minute samples
• test set: 10 1-minute test samples
• 25% single voice, 75% multi-voice/silence
• mixture of classical and folk music
What characterizes a single voice?
[Figure: three consecutive segments labeled non-solo, solo, non-solo]
Strategy #1: Silence detection
music -> find silence -> silence counts -> silent -> raw classification -> HMM?
Nothing really worked
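The slides do not say how silent frequency bins were counted, so the following is only a minimal sketch of the general idea, assuming that a single note leaves most of the spectrum empty while several simultaneous notes fill more of it. The function names, the naive DFT, the 1% relative threshold, and the synthetic test tones are all assumptions made for illustration, not the author's method.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (fine for short frames)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def count_silent_bins(frame, rel_threshold=0.01):
    """Count bins whose magnitude is below rel_threshold times the peak.

    The relative threshold is an assumed choice, not taken from the slides.
    """
    mags = magnitude_spectrum(frame)
    peak = max(mags)
    if peak == 0.0:
        return len(mags)  # an all-zero frame: every bin is silent
    return sum(1 for m in mags if m < rel_threshold * peak)

def tone(bin_index, n=128, amp=1.0):
    """Synthetic sinusoid that falls exactly on one DFT bin."""
    return [amp * math.sin(2 * math.pi * bin_index * t / n) for t in range(n)]

def mix(*signals):
    """Sample-wise sum of equal-length signals."""
    return [sum(vals) for vals in zip(*signals)]

# One note with one overtone versus three notes with one overtone each
single = mix(tone(8), tone(16, amp=0.5))
multi = mix(tone(8), tone(16, amp=0.5),
            tone(11), tone(22, amp=0.5),
            tone(5), tone(10, amp=0.5))

single_count = count_silent_bins(single)
multi_count = count_silent_bins(multi)
print(single_count, multi_count)
```

On clean synthetic tones the single note leaves more bins silent than the mixture. Real recordings have noise, reverberation, and partials that fall between bins, which may be part of why this strategy did not work out.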
Strategy #2: Pitch Cancellation
music -> filter pitch -> filtered music -> single voice? -> raw classification -> HMM -> final classification
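The slides do not specify the filter, so here is a hedged sketch of one common way to cancel a pitch: subtract a copy of the signal delayed by one pitch period (a simple comb filter) and measure how much energy is left. If a single periodic voice is present, almost nothing should remain; if several pitches sound together, no single delay removes them all. The function name, the lag search range, and the synthetic signals are assumptions for illustration.

```python
import math

def cancellation_residual(frame, min_lag, max_lag):
    """Smallest normalized residual energy over all candidate lags.

    For each lag, computes sum((x[n] - x[n-lag])^2) divided by the
    combined energy of the frame and its delayed copy. Values near 0
    mean one delay cancels the signal almost completely. Returns
    float('inf') for an all-zero frame.
    """
    best = float("inf")
    for lag in range(min_lag, max_lag + 1):
        resid = 0.0
        energy = 0.0
        for n in range(max_lag, len(frame)):
            diff = frame[n] - frame[n - lag]
            resid += diff * diff
            energy += frame[n] ** 2 + frame[n - lag] ** 2
        if energy > 0.0:
            best = min(best, resid / energy)
    return best

N = 400  # frame length in samples (assumed)

# One voice: fundamental with period 50 samples plus its second harmonic
single = [math.sin(2 * math.pi * n / 50) + 0.5 * math.sin(4 * math.pi * n / 50)
          for n in range(N)]

# Two voices: periods of 50 and 37 samples sounding together
mixed = [math.sin(2 * math.pi * n / 50) + math.sin(2 * math.pi * n / 37)
         for n in range(N)]

single_resid = cancellation_residual(single, 20, 100)
mixed_resid = cancellation_residual(mixed, 20, 100)
print(single_resid, mixed_resid)
```

A cutoff on this residual would then produce the raw per-frame "single voice?" decision. The cutoff value itself is a tuning parameter that the slides do not give.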
Strategy #3: MFCCs
music -> MFCC -> 13 features -> GMM -> likelihood -> HMM -> final classification
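Both the pitch cancellation and MFCC pipelines end with an HMM that turns raw per-frame decisions into a final classification. The slides do not describe the model, so the sketch below assumes a two-state HMM (0 = other, 1 = single voice) decoded with the Viterbi algorithm, which mainly removes isolated one-frame flips. The probabilities `p_stay` and `p_correct` are made-up illustrative values, not numbers from the talk.

```python
import math

def viterbi_smooth(raw, p_stay=0.9, p_correct=0.8):
    """Smooth a 0/1 sequence with a two-state HMM via Viterbi decoding.

    p_stay: assumed probability of staying in the same state between frames.
    p_correct: assumed probability that a raw label matches the true state.
    """
    def log_emit(state, obs):
        return math.log(p_correct if state == obs else 1.0 - p_correct)

    def log_trans(prev, cur):
        return math.log(p_stay if prev == cur else 1.0 - p_stay)

    if not raw:
        return []

    # score[s] = best log-probability of any path ending in state s
    score = [math.log(0.5) + log_emit(s, raw[0]) for s in (0, 1)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in raw[1:]:
        new_score, pointers = [], []
        for cur in (0, 1):
            cands = [score[prev] + log_trans(prev, cur) for prev in (0, 1)]
            best_prev = 0 if cands[0] >= cands[1] else 1
            pointers.append(best_prev)
            new_score.append(cands[best_prev] + log_emit(cur, obs))
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

raw = [0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1]
smoothed = viterbi_smooth(raw)
print(smoothed)
```

With these assumed probabilities, the isolated 1 near the start and the isolated 0 inside the later run are both overruled by their neighbours, leaving one clean switch from "other" to "single voice".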
Quick reminder
• Precision = out of the stuff we got, how much of it was right? (Are Google’s results relevant?)
• Recall = out of all the right stuff, how much did we get? (If I asked Google for the UN, did I get all of the UN’s websites?)
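As a small worked example of these two definitions, the sketch below computes precision and recall from per-frame labels. The function name and the toy labels are invented for illustration. Returning 0.0 when a denominator is empty is an assumed convention.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    false_neg = sum(1 for p, a in zip(predicted, actual) if not p and a)
    # Assumed convention: report 0.0 when nothing was retrieved / nothing was relevant
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall

# Toy frame labels (made up): True = single voice
actual    = [True, True, True, True, False, False, False, False]
predicted = [True, False, False, False, False, False, False, False]

p, r = precision_recall(predicted, actual)
print(p, r)  # 1.0 0.25
```

Here the classifier flags only one frame and is right about it, so precision is perfect while recall is low: it found one of the four single-voice frames.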
Precision is important • If I have a large enough database, I can afford to have relatively low recall. But I want high precision so what I do get is what I want.
Strategy #2: Pitch Cancellation
music -> filter pitch -> filtered music -> single voice? -> raw classification -> HMM -> final classification
Tweak: Cutoff
Strategy #3: MFCCs
music -> MFCC -> 13 features -> GMM -> likelihood -> HMM -> final classification
Tweak: Probabilities
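Tweaking a cutoff or a probability to trade recall for precision can be illustrated as a threshold sweep over per-frame scores. This is a sketch under assumptions: the slides do not say how the operating point was chosen, and the function name, scores, and labels below are invented. The sweep returns the lowest threshold that still meets a target precision, which is also the one with the highest recall among those that qualify.

```python
def pick_threshold(scores, labels, target_precision):
    """Return (threshold, precision, recall) for the lowest threshold whose
    precision meets the target, or None if no threshold does.

    scores: per-frame confidence that the frame is single voice.
    labels: 1 for true single-voice frames, 0 otherwise.
    """
    total_pos = sum(labels)
    best = None
    # Walk thresholds from strict to lenient; recall can only grow
    for t in sorted(set(scores), reverse=True):
        kept = [lab for s, lab in zip(scores, labels) if s >= t]
        tp = sum(kept)
        precision = tp / len(kept)  # kept is never empty: t is one of the scores
        recall = tp / total_pos if total_pos else 0.0
        if precision >= target_precision:
            best = (t, precision, recall)
    return best

# Made-up scores and ground truth for eight frames
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

strict = pick_threshold(scores, labels, 0.9)
lenient = pick_threshold(scores, labels, 0.5)
print(strict)   # (0.9, 1.0, 0.5)
print(lenient)  # (0.3, 0.5, 1.0)
```

Demanding 90% precision keeps only the two most confident frames and halves recall. Relaxing the target to 50% recovers every single-voice frame at the cost of many false alarms.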
Conclusion
• Silence detection really didn’t work out.
• MFCCs + GMM is really just as good as pitch cancellation.
• At 90% precision, I get about 25% recall.
Acknowledgements
Many thanks to Professor Ellis for his assistance on this project.