Audio Tempo Extraction Presenter: Simon de Leon Date: February 9, 2006 Course: MUMT611
Agenda • Introduction • Algorithm • Onset extraction • Periodicity detection • Temporal estimation of beat locations • Examples • Conclusion • Discussion
Introduction • Tempo extraction is useful for • Automatic rhythm alignment • Beat-driven effects • Cut & paste operations in audio editing • Tempo extraction in general is mature for straightforward, rhythmic music (rock, rap, reggae, etc.) • The challenge is to be accurate across the widest range of genres
Introduction • We will focus on the winning algorithm for MIREX 2005 [1] • The top algorithms belong to the class that performs the following: • Time-freq. analysis to determine beat onset • Pitch detection and autocorrelation techniques for periodicity estimation • Evaluation of algorithms is difficult due to different perceptions of rhythm
Algorithm • Divided into three sections • Onset extraction • Where are the exact locations of the musically salient features? • Periodicity estimation • What is the tempo of the beats found? • Temporal estimation of beat locations • We found the onset locations in the spectral domain, but not all of them are necessarily beats
Algorithm – Onset extraction • Idea is that the beat onsets correspond with • Note changes • Harmonic changes • Percussive events • Define spectral energy flux • Time derivative of the frequency component magnitudes • Technique of [1] assumes onsets correspond to the fastest change of frequency component magnitudes
Algorithm – Onset extraction • Step 1: Take STFT of signal • Step 2: Take time derivative of frequency components (spectral energy flux) • a) Low-pass filter STFT magnitude • b) Apply logarithmic compression [2] • c) Pass through FIR filter differentiator [3] • Step 3: Use dynamic threshold and remove the smallest onset spectral energy flux “spikes” from previous step
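The three steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation from [1]: the function name `onset_detection_function`, the moving-average low-pass filter, the first-difference differentiator, the half-wave rectification with summation over bins, the running-median threshold and all default parameter values are assumptions chosen for brevity.

```python
import numpy as np

def onset_detection_function(x, frame_len=1024, hop=256, smooth_len=5,
                             compression=1000.0, median_len=17):
    """Return one non-negative onset-strength value per STFT frame."""
    # Step 1: STFT magnitude from Hann-windowed, overlapping frames.
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))  # shape (n_frames, n_bins)

    # Step 2a: low-pass each frequency channel over time (moving average,
    # an assumed stand-in for the filter used in [1]).
    kernel = np.ones(smooth_len) / smooth_len
    smoothed = np.apply_along_axis(
        lambda channel: np.convolve(channel, kernel, mode="same"), 0, mag)

    # Step 2b: logarithmic compression.
    compressed = np.log1p(compression * smoothed)

    # Step 2c: first-difference FIR differentiator along the time axis.
    diff = np.diff(compressed, axis=0, prepend=compressed[:1])

    # Assumption: keep only energy increases and sum over frequency bins.
    flux = np.maximum(diff, 0.0).sum(axis=1)

    # Step 3: dynamic threshold (running median, median_len should be odd);
    # anything at or below the threshold is clipped to zero.
    half = median_len // 2
    padded = np.pad(flux, half, mode="edge")
    threshold = np.array([np.median(padded[i:i + median_len])
                          for i in range(len(flux))])
    return np.maximum(flux - threshold, 0.0)
```

The output is sampled at the frame rate (sample rate divided by `hop`), so a later periodicity stage would work on that much slower signal rather than on the audio itself.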
Algorithm – Onset extraction • Top left: Piano signal. Bottom left: STFT • Top right: Spectral energy flux. Bottom right: Detection function
Algorithm – Onset extraction • Top left: Violin signal. Bottom left: STFT • Top right: Spectral energy flux. Bottom right: Detection function
Algorithm – Periodicity Detection • Two techniques studied in [1] • Spectral product • Autocorrelation function • Assume tempo T is between 60 bpm and 200 bpm • Spectral product • Step 1) Take FFT of detection function • Step 2) For each frequency, multiply its spectral magnitude by the magnitudes at all of its integer multiples • Step 3) The frequency with the largest product is taken as the frequency of periodicity
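A minimal sketch of the spectral product, assuming the detection function is available as a NumPy array sampled at `frame_rate` frames per second. The function name `tempo_spectral_product`, the zero-padding factor, the mean removal and the cut-off at `n_harmonics` multiples (instead of all integer multiples) are illustrative assumptions, not details taken from [1].

```python
import numpy as np

def tempo_spectral_product(odf, frame_rate, n_harmonics=4,
                           bpm_range=(60.0, 200.0), n_fft=None):
    """Estimate tempo in bpm from a detection function via spectral product."""
    # Step 1: FFT magnitude of the detection function
    # (zero-padded for a finer frequency grid, mean removed to suppress DC).
    n_fft = n_fft or 8 * len(odf)
    spectrum = np.abs(np.fft.rfft(odf - np.mean(odf), n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / frame_rate)

    # Only frequencies inside the assumed tempo range are candidates.
    lo, hi = bpm_range[0] / 60.0, bpm_range[1] / 60.0
    candidates = np.flatnonzero((freqs >= lo) & (freqs <= hi))
    candidates = candidates[candidates * n_harmonics < len(spectrum)]

    # Step 2: multiply the magnitude at each candidate bin by the
    # magnitudes at its integer multiples (bins 2k, 3k, ...).
    product = np.ones(len(candidates))
    for m in range(1, n_harmonics + 1):
        product *= spectrum[candidates * m]

    # Step 3: the largest product marks the periodicity frequency.
    best = candidates[np.argmax(product)]
    return 60.0 * freqs[best]
```

For example, a detection function with one spike every 0.5 s should come out near 120 bpm, because 2 Hz and its multiples all carry energy while 1 Hz and its odd multiples do not.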
Algorithm – Periodicity Detection • Autocorrelation function • Classical periodicity estimation, slightly outperforms spectral product method • It is the cross-correlation of a signal with itself • Three largest peaks of the autocorrelation are analyzed for a multiplicity relationship
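A simplified sketch of the autocorrelation approach, again assuming a detection function sampled at `frame_rate` frames per second. It picks the single strongest lag inside the 60 to 200 bpm range; the analysis of the three largest peaks for a multiplicity relationship is not reproduced here. The function name `tempo_autocorrelation` and the mean removal are assumptions.

```python
import numpy as np

def tempo_autocorrelation(odf, frame_rate, bpm_range=(60.0, 200.0)):
    """Estimate tempo in bpm from the strongest autocorrelation lag."""
    x = odf - np.mean(odf)

    # Autocorrelation = cross-correlation of the signal with itself;
    # keep the non-negative lags 0 .. N-1.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]

    # Convert the tempo range to a lag range (in frames).
    min_lag = int(np.ceil(frame_rate * 60.0 / bpm_range[1]))
    max_lag = int(np.floor(frame_rate * 60.0 / bpm_range[0]))
    max_lag = min(max_lag, len(acf) - 1)

    # Simplification: take the single largest value in range instead of
    # comparing the three largest peaks.
    lag = min_lag + int(np.argmax(acf[min_lag:max_lag + 1]))
    return 60.0 * frame_rate / lag
```

Restricting the search to the lag range already removes many half-tempo and double-tempo candidates, which is part of what the multiplicity check is meant to resolve.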
Algorithm – Beat location • Given the tempo extracted from previous steps, we need to align the beat in phase • Step 1) Create pulse train q(t) with period T derived from the periodicity algorithm • Step 2) Find phase by cross-correlating q(t) with detection function, evaluating only at indices corresponding to detection function maxima • Step 3) For successive beats in an analysis window, simply add T and search for peak in detection function in vicinity • Repeat (2) to re-align phase if peak not found
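The beat-location steps can be sketched as follows, assuming the detection function is a NumPy array and the period T is given as a whole number of frames. The function name `track_beats`, the `tolerance` search radius and the fallback to the expected position when no peak is found (instead of repeating step 2) are simplifying assumptions.

```python
import numpy as np

def track_beats(odf, period, tolerance=3):
    """Return beat positions (frame indices) for a given beat period."""
    n = len(odf)

    # Step 2: candidate phases are the local maxima of the detection
    # function within the first period.
    candidates = [i for i in range(min(period, n))
                  if odf[i] > 0
                  and odf[i] >= odf[max(i - 1, 0)]
                  and odf[i] >= odf[min(i + 1, n - 1)]]
    if not candidates:
        return []

    # Steps 1-2: summing odf[p::period] is the cross-correlation of the
    # detection function with a unit pulse train of that period at lag p.
    phase = max(candidates, key=lambda p: odf[p::period].sum())

    # Step 3: step forward by one period and snap to the nearest peak.
    beats = [phase]
    while beats[-1] + period < n:
        expected = beats[-1] + period
        lo = max(expected - tolerance, 0)
        hi = min(expected + tolerance + 1, n)
        local = lo + int(np.argmax(odf[lo:hi]))
        # Simplification: with no peak nearby, keep the expected position.
        beats.append(local if odf[local] > 0 else expected)
    return beats
```

Snapping each beat to the nearest peak lets the tracker absorb small timing deviations, but a `tolerance` that is too large risks locking onto off-beat onsets.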
Examples • Let’s listen to some demos of the algorithm in action • Jazz – very good to good • Rock – very good • Classical – very bad to good • Soul – very good • Latin – satisfactory to good
Conclusion • This algorithm represents the state of the art in tempo extraction, with the majority of the work focusing on onset detection • Problem areas • Long fading attacks and decays produce false onsets • Many instruments playing continuously with no stable regions produce too many false onsets • Cannot keep up when the tempo varies quickly
Conclusion • Results from [1] indicate roughly • 80-90% accuracy for classical, jazz, rock • 90-100% for Latin, pop, reggae, soul, rap, techno • Results from [1] using MIREX database • Correct tempo found 95% of the time
Discussion • Can evaluation methods be improved? How can we avoid the subjective nature of tempo perception? • Any suggestions on how we might improve the onset detection algorithm? How about the periodicity algorithm?
References [1] Alonso, Miguel, Bertrand David, and Gael Richard. 2004. Tempo and Beat Estimation of Musical Signals. Proceedings of the 5th International Conference on Music Information Retrieval. [2] Klapuri, Anssi. 1999. Sound Onset Detection by Applying Psychoacoustic Knowledge. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing: 3089-3092. [3] Proakis, John G., and Dimitris K. Manolakis. 1995. Digital Signal Processing: Principles, Algorithms and Applications. 3rd Ed. New York: Prentice Hall.