Content-based Music Retrieval from Acoustic Input (CBMR)
Outline • What is CBMR? • Methods • Signal processing • Similarity comparison • Experiment results • Demo • Future work
What is CBMR? • CBMR: Content-based Music Retrieval • Traditional database query: text-based or SQL-based • Our goal: music retrieval by singing/humming
Related Work • Query by humming by Ghias, Logan, and Chamberlin in 1995 • Autocorrelation pitch detection • 183 songs in database • MELDEX system by New Zealand Digital Library Project in 1996 • Gold/Rabiner algorithm (800 songs) • Sing ‘la’ or ‘ta’ when transposition • Karaoke song recognizer by J.F. Wang in 1997 • Novel pitch detection • 50 songs in database
Flowchart • On-line processing: Microphone signal input (sampling at 11 kHz) -> Filtering -> Pitch tracking -> Post signal processing -> Mid-level representation -> Similarity comparison -> Query results (ranked song list) • Off-line processing: Songs database -> MIDI message extraction -> Similarity comparison
Original Wave Input • Song: 小雨中的回憶 • Format: 11025 Hz, 8 bits, mono
Single Frame • Frame size: 512 points/frame • Overlap: 340 points between consecutive frames [Figure: zoomed-in view of a single frame and its overlap]
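The framing step can be sketched as follows. This is a minimal illustration, assuming the signal is a plain list of samples and that a trailing partial frame is dropped (the slides do not say how the tail is handled); the function name `split_into_frames` is hypothetical. With 512 points per frame and 340 points of overlap, consecutive frames start 172 samples apart, which is about 15.6 ms at 11025 Hz.

```python
def split_into_frames(samples, frame_size=512, overlap=340):
    """Split a sample sequence into overlapping frames.

    Assumption: a trailing partial frame is discarded.
    """
    hop = frame_size - overlap  # 172 samples between frame starts
    if hop <= 0:
        raise ValueError("overlap must be smaller than frame_size")
    last_start = len(samples) - frame_size
    return [samples[start:start + frame_size]
            for start in range(0, last_start + 1, hop)]


# One second of dummy samples at 11025 Hz
one_second = list(range(11025))
frames = split_into_frames(one_second)
print(len(frames), len(frames[0]), frames[1][0])
```

Each frame then yields one pitch estimate, so one second of input gives roughly 64 pitch points.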
Pitch Tracking • Range • E2 - C6 • 82 Hz - 1047 Hz • Method • Auto-correlation
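A rough sketch of auto-correlation pitch tracking on one frame, restricted to the 82 Hz - 1047 Hz range above. It is not the authors' implementation: the function name `autocorr_pitch`, the plain (unnormalized) auto-correlation, and the "pick the largest peak" rule are assumptions made for illustration.

```python
import math


def autocorr_pitch(frame, sample_rate=11025, f_min=82.0, f_max=1047.0):
    """Estimate pitch (Hz) from the auto-correlation peak; None if no peak."""
    lag_min = max(1, int(sample_rate / f_max))               # shortest period searched
    lag_max = min(len(frame) - 1, int(sample_rate / f_min))  # longest period searched
    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Auto-correlation of the frame with itself shifted by `lag` samples
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else None


# Synthetic 220 Hz sine, one 512-point frame
tone = [math.sin(2 * math.pi * 220 * n / 11025) for n in range(512)]
pitch = autocorr_pitch(tone)
```

Because lags are whole samples, the estimate is quantized; the resolution gets coarser toward the high end of the range.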
Center Clipping • Clipping limits are set to r% of the absolute maximum of the auto-correlation data [Figure: panels (a), (b), (c)]
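Center clipping can be sketched as below. The slides do not give the value of r or the exact clipping variant, so the 30% default and the "subtract the limit outside the clipped band" form are assumptions; `center_clip` is a hypothetical name.

```python
def center_clip(values, ratio=0.3):
    """Zero out values within +/- limit and pull the rest toward zero.

    limit = ratio * max(|values|); ratio=0.3 is an assumed example value.
    """
    if not values:
        return []
    limit = ratio * max(abs(v) for v in values)
    clipped = []
    for v in values:
        if v > limit:
            clipped.append(v - limit)
        elif v < -limit:
            clipped.append(v + limit)
        else:
            clipped.append(0.0)  # inside the clipping band
    return clipped


result = center_clip([1.0, -0.2, 0.5, -1.0], ratio=0.3)
```

Small values near zero are removed, which tends to leave only the dominant peaks for the pitch decision.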
Post Signal Processing • Remove abrupt (outlier) pitch points and short notes • Downsampling and smoothing • Frequency to semitone • Semitone: a musical scale based on A440
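The frequency-to-semitone conversion can be written as a short formula. The sketch below assumes the MIDI convention in which A440 is note number 69; the slides only state that the scale is based on A440, so the offset of 69 is an assumption.

```python
import math


def hz_to_semitone(freq_hz, a4_hz=440.0, a4_number=69):
    """Convert a frequency in Hz to a (fractional) semitone number.

    Assumes MIDI-style numbering where A440 maps to 69.
    """
    return a4_number + 12.0 * math.log2(freq_hz / a4_hz)


low = hz_to_semitone(82.41)     # E2, lower end of the tracking range
high = hz_to_semitone(1046.50)  # C6, upper end of the tracking range
print(round(low), round(high))
```

Under this numbering the tracking range E2 - C6 spans semitones 40 to 84, and a pitch contour in semitones can be compared directly with MIDI note numbers.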
Similarity Comparison • Goal • Find the most similar MIDI file • Challenges • Tempo variance: handled by dynamic time warping (DTW) • Tune variance: handled by key transposition
Compare by DTW [Figure: the wave file's pitch contour aligned with the MIDI file by DTW]
Dynamic Time Warping (DTW) [Figure: DTW grid with input t on the i axis and reference r on the j axis, showing t(i-1), t(i), r(j-1), r(j) and the search window]
DTW (cont.) • Local distance: dist(i,j) = |t(i) - r(j)| • if ( t(i) == Rest && r(j) == Rest ) dist(i,j) = 0; • elseif ( t(i) == Rest || r(j) == Rest ) dist(i,j) = restWeight;
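A minimal DTW sketch using the rest-aware local distance above. Several details are assumptions, since the slides do not specify them: rests are marked with `None`, `rest_weight` defaults to 2.0, the allowed steps are the standard three (left, down, diagonal), and no search window is applied.

```python
REST = None  # assumed marker for a rest in a semitone sequence


def local_dist(a, b, rest_weight=2.0):
    """Local distance between two pitch points, with special rest handling."""
    if a is REST and b is REST:
        return 0.0
    if a is REST or b is REST:
        return rest_weight
    return abs(a - b)


def dtw_distance(t, r, rest_weight=2.0):
    """Accumulated DTW distance between input t and reference r."""
    inf = float("inf")
    rows, cols = len(t), len(r)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            d = local_dist(t[i - 1], r[j - 1], rest_weight)
            # Extend the cheapest of the three neighboring partial paths
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[rows][cols]


same_tune = dtw_distance([60, 60, 62, 64], [60, 62, 62, 64])
with_rest = dtw_distance([60, REST, 62], [60, 62])
```

Here `same_tune` is zero because the two sequences differ only in note durations, while `with_rest` pays `rest_weight` once for the unmatched rest.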
Key Transposition • Mean shift • Binary search in the searching area • O(N) -> O(log N) [Figure: searching area around the mean]
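One way to read "mean shift plus binary search" is sketched below: first shift the query so its mean matches the reference, then repeatedly try a shift on either side of the current best and halve the step. This interpretation, the names `best_transposition` and `mean_abs_diff`, and the defaults `span=2.0` and `steps=5` are all assumptions. A simple mean absolute difference stands in for the DTW distance so the example stays short.

```python
def mean(values):
    return sum(values) / len(values)


def mean_abs_diff(a, b):
    """Stand-in distance for equal-length sequences (not DTW)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def best_transposition(query, reference, distance=mean_abs_diff, span=2.0, steps=5):
    """Return (shift, distance) after a mean shift and a halving search."""
    shift = mean(reference) - mean(query)  # mean shift: align the averages
    best = distance([q + shift for q in query], reference)
    for _ in range(steps):
        for candidate in (shift - span, shift + span):
            d = distance([q + candidate for q in query], reference)
            if d < best:
                best, shift = d, candidate
        span /= 2  # narrow the searching area
    return shift, best


reference = [60, 62, 64, 62, 60, 67]
query = [55, 57, 59, 57, 55, 56]  # sung 5 semitones low, last note wrong
shift, dist = best_transposition(query, reference)
```

The wrong last note pulls the mean shift to 6 semitones, and the search then recovers a shift of 5. The number of distance evaluations is 1 + 2 * steps, instead of one per candidate key.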
Score Function • m : length of match string • n : length of input string • e : DTW distance • A = 0.8 • B = 0.6
Experiment Environment • 290 wave files • Wave length: 5 - 8 sec • Wave format: PCM, 11025 Hz, 8 bits, mono • Environment • Celeron 450 with 128 MB RAM under Matlab 5.3 • Database • 493 MIDI files
Experiment Result (Pie) Total time: 4589 sec (15.8 sec per wave)
Experiment Result (Pie) - With Rest Total time: 7893 sec (27.2 sec per wave)
How to Accelerate? • Branch and bound • O(N) -> O(ln N) • Triangle inequality • d(a,b) + d(b,c) ≥ d(a,c) • Hierarchical • 2 phase • 3/32 sec • 2/32 sec
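The triangle-inequality idea can be illustrated with a pivot-based nearest-neighbor search. From d(a,b) + d(b,c) ≥ d(a,c) it follows that |d(q,p) - d(x,p)| is a lower bound on d(q,x), so a candidate x can be skipped when that bound is already no better than the best match so far. The sketch uses an L1 distance, which is a true metric; DTW distances do not satisfy the triangle inequality in general, so applying this to DTW is only approximate. All names here are hypothetical.

```python
def l1(a, b):
    """L1 distance between equal-length sequences (a true metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_with_pivot(query, items, pivot, dist=l1):
    """Return (best_index, best_distance, number_of_full_evaluations)."""
    to_pivot = [dist(x, pivot) for x in items]  # could be precomputed off-line
    q_to_pivot = dist(query, pivot)
    best_index, best_dist, evaluated = None, float("inf"), 0
    for index, item in enumerate(items):
        lower_bound = abs(q_to_pivot - to_pivot[index])
        if lower_bound >= best_dist:
            continue  # cannot beat the current best, skip the full distance
        evaluated += 1
        d = dist(query, item)
        if d < best_dist:
            best_index, best_dist = index, d
    return best_index, best_dist, evaluated


items = [[0, 0], [10, 10], [1, 1], [20, 20]]
found = nearest_with_pivot([1, 2], items, pivot=[0, 0])
print(found)
```

In this toy case only two of the four full distances are computed, and the result matches a brute-force search.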
Experiment Result (Pie) - 3/32 sec Total time: 2358 sec (8.9 sec per wave)
Experiment Result (Pie) - 2 Phase Total time: 3006 sec (11.2 sec per wave)
Error Analysis • MIDI error • Singing error • Low pitch • Broken vocalism • Noise
Future Work • Reduce computation time • Better similarity comparison • Different comparison unit • Hardware acceleration • Better searching algorithm • Steadier pitch tracking algorithm • Noise handling