ISCA Tutorial and Research Workshop on Statistical And Perceptual Audition (SAPA) 2010. Informed Source Separation of Orchestra and Soloist Using Masking and Unmasking. By Yushen Han and Christopher Raphael, School of Informatics and Computing, Indiana University Bloomington.
ISCA Tutorial and Research Workshop on Statistical And Perceptual Audition (SAPA) 2010
Informed Source Separation of Orchestra and Soloist Using Masking and Unmasking
By Yushen Han and Christopher Raphael, School of Informatics and Computing, Indiana University Bloomington
Saturday 25 September 2010, Makuhari, Japan
Motivation – Musical Source Separation In General
• To extract the orchestral accompaniment from any desired recording, given the score; can be used in an automatic accompaniment system (e.g. for a piano concerto) or for karaoke
• To isolate one chosen instrument from an ensemble; can be used in performance analysis of the soloist
Motivation – This Project
• Problem: separation procedures damage each of the separated sources
• Objective: to address the degradation of the separation results
• Strategy: exploit the information redundancy of the musical audio within each source
Overview
• Motivation and introduction – system diagram
• Previous – Separation by Spectrogram Masking
• Recent – Repair by Spectrogram Unmasking
  • harmonicity hypothesis tested by Kalman phase smoothing
  • repair by amplitude inference and harmonic transposition
• Examples with evaluation by spectrogram reassignment
• Related work – ISS?
• Conclusion and Discussion
Informed Source Separation System Diagram
[Figure: system diagram from System Input to System Output. Block labels: Audio; Score; Audio-score Alignment; Separation by Spectrogram Masking; 2D Spectral Modeling; Note sample library; Note sample models; Note-wise audio reconstruction; Phase Estimation by Kalman Smoothing; Harmonicity Hypothesis; Amplitude Inference; Evaluation According to BASS; Desoloed Audio (Damaged); Desoloed Audio (Repaired). Method tags: EM = Expectation-Maximization, DP = Dynamic Programming, ML = Machine Learning (binary classification), HPSS = Harmonic-Percussive Separation. Legend: Mostly Previous Work; Recent Development; Focus of This Paper.]
Separation “Informed” by Score Following
[Figure labels: solo, accompaniment]
Previous: (Binary) Spectrogram Masking
• Short-time Fourier Transform (STFT)
• Complementary binary masks (hard binary masks)
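The masking idea on this slide can be illustrated with a minimal sketch. It assumes the STFT is already available as a list of frames of complex coefficients; the function name, the toy values and the mask are hypothetical and not taken from the talk. Each time-frequency cell is assigned to exactly one source, so the two masked spectrograms add back to the mixture.

```python
def apply_complementary_masks(spectrogram, solo_mask):
    """Split a complex spectrogram with a binary mask and its complement.

    spectrogram: list of frames, each a list of complex STFT coefficients.
    solo_mask:   same shape, 1 where a cell is assigned to the solo,
                 0 where it is assigned to the accompaniment.
    """
    solo, accompaniment = [], []
    for frame, mask_row in zip(spectrogram, solo_mask):
        solo.append([x * m for x, m in zip(frame, mask_row)])
        accompaniment.append([x * (1 - m) for x, m in zip(frame, mask_row)])
    return solo, accompaniment


# Toy 2-frame, 3-bin spectrogram (hypothetical values, not from the paper).
X = [[1 + 1j, 0.5j, 2.0], [0.3, 1 - 2j, 0.1j]]
M = [[1, 0, 1], [0, 1, 0]]
S, A = apply_complementary_masks(X, M)
```

Because the masks are hard, any cell where both sources have energy is given entirely to one of them, which is the kind of damage the repair stage later addresses.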
Previous: 2D Note-based Model
• A “template” function q_m for the note model indexed by m
Phase Estimation
• Amplitude-Phase Decoupling Model: the signal at the hth harmonic is described by an amplitude, which is slowly varying, and a phase, which is locally linear in s up to a small correction term
• Phase unwrapping
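Phase unwrapping can be sketched as follows. This is a generic textbook procedure, not necessarily the one used in the talk: each step between consecutive samples is shifted by a multiple of 2π so that it falls in [-π, π). The demo phase (0.9 rad per sample plus an offset of 0.2) is a made-up example of a locally linear phase.

```python
import math


def unwrap(phases):
    """Remove 2*pi jumps so that consecutive samples differ by less than pi."""
    if not phases:
        return []
    out = [phases[0]]
    for p in phases[1:]:
        d = p - out[-1]
        # Shift the step into [-pi, pi) before accumulating it.
        d -= 2 * math.pi * math.floor((d + math.pi) / (2 * math.pi))
        out.append(out[-1] + d)
    return out


# Hypothetical linear phase, observed only modulo 2*pi.
true_phase = [0.9 * s + 0.2 for s in range(50)]
wrapped = [math.atan2(math.sin(t), math.cos(t)) for t in true_phase]
unwrapped = unwrap(wrapped)
```

The sketch only works when the true phase advance per sample stays below π in magnitude; larger advances alias and cannot be recovered this way.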
State-space Model for Phase (cont.)
• The idea of using a state-space model to estimate phase should be credited to A. Taylan Cemgil.
For the observable phase sequence at the hth harmonic, we introduce a state vector with an unobservable component. As time progresses (discretely), the state vector propagates via the state equation, where the state transition matrix governs the sinusoidal movement of the phase according to the average phase advance at harmonic h over a relatively long period (s0, s1), and w(s) is an unobservable, zero-mean random (state) perturbation.
Illustration of State-space Model for Phase Estimation
[Figure: the state moving from s to s + 1 in the (x1, x2) plane]
The state is x(s) = (x1(s), x2(s))^T, but we only observe y = x1. The observation equation y(s) = H(s) x(s) + r(s) connects the observed and the unobservable, where H(s) = [1 0] and r(s) = 0 is the degenerate random (observation) perturbation.
Kalman Smoothing
Following the state-space model, we can obtain the amplitude and the phase. The state estimates can be computed by a Kalman filter, but since the phase estimation is offline, we can also update them backward by Kalman smoothing, so that they incorporate the observations that were not yet “available” at sample t in the forward pass.
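The equations themselves did not survive in this transcript, so the following is only a sketch of one model that fits the description. The symbols A_h and \bar\omega_h are assumed names, and the rotation form of the transition matrix is an assumption suggested by the phrase "sinusoidal movement"; only H(s) = [1 0] and r(s) = 0 are stated on the slide.

```latex
\begin{aligned}
x(s+1) &= A_h\,x(s) + w(s), &
A_h &= \begin{pmatrix} \cos\bar\omega_h & -\sin\bar\omega_h \\ \sin\bar\omega_h & \cos\bar\omega_h \end{pmatrix}, &
\bar\omega_h &= \frac{\theta_h(s_1) - \theta_h(s_0)}{s_1 - s_0}, \\
y(s) &= H(s)\,x(s) + r(s), &
H(s) &= \begin{pmatrix} 1 & 0 \end{pmatrix}, &
r(s) &= 0 .
\end{aligned}
```

Under this reading, x1 is the observed component, x2 is its unobserved quadrature partner, and the pair rotates by the average phase advance per sample.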
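A minimal sketch of the forward filter and backward smoother follows. It assumes a rotation matrix as the state transition, H = [1 0], zero observation noise and isotropic state noise q·I; the transition matrix, `q`, `p0` and all function names are assumptions for illustration, not the authors' implementation. Amplitude and phase are read off the smoothed state as its length and angle.

```python
import math


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_vec(a, v):
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]


def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def kalman_smooth(y, omega, q=1e-4, p0=10.0):
    """Kalman filter plus Rauch-Tung-Striebel smoother for a rotating 2-D state.

    Assumed model: x(s+1) = A x(s) + w(s), y(s) = x1(s), with A a rotation
    by omega radians per sample and Cov(w) = q * I.
    """
    if not y:
        return []
    c, s = math.cos(omega), math.sin(omega)
    A = [[c, -s], [s, c]]
    At = [[c, s], [-s, c]]
    x_pred, P_pred = [0.0, 0.0], [[p0, 0.0], [0.0, p0]]
    preds, filts = [], []
    for obs in y:
        # Update with H = [1 0] and r = 0: the gain is P[:, 0] / P[0][0],
        # so x1 is set to the observation and its variance drops to zero.
        k1 = P_pred[1][0] / P_pred[0][0]
        x_f = [obs, x_pred[1] + k1 * (obs - x_pred[0])]
        P_f = [[0.0, 0.0], [0.0, P_pred[1][1] - k1 * P_pred[0][1]]]
        preds.append((x_pred, P_pred))
        filts.append((x_f, P_f))
        # Predict the next state.
        x_pred = mat_vec(A, x_f)
        APAt = mat_mul(mat_mul(A, P_f), At)
        P_pred = [[APAt[0][0] + q, APAt[0][1]], [APAt[1][0], APAt[1][1] + q]]
    # Backward pass: fold later observations into earlier estimates.
    smoothed = [None] * len(y)
    smoothed[-1] = filts[-1][0]
    for t in range(len(y) - 2, -1, -1):
        x_f, P_f = filts[t]
        x_p, P_p = preds[t + 1]
        C = mat_mul(mat_mul(P_f, At), inverse(P_p))
        corr = mat_vec(C, [smoothed[t + 1][0] - x_p[0],
                           smoothed[t + 1][1] - x_p[1]])
        smoothed[t] = [x_f[0] + corr[0], x_f[1] + corr[1]]
    return smoothed


def amplitude_and_phase(states):
    """Amplitude is the state length, phase (wrapped) is the state angle."""
    return [(math.hypot(x1, x2), math.atan2(x2, x1)) for x1, x2 in states]


# Hypothetical test signal: one harmonic with constant amplitude 0.8.
omega, amp, phi = 0.3, 0.8, 0.5
y = [amp * math.cos(omega * n + phi) for n in range(200)]
states = kalman_smooth(y, omega)
estimates = amplitude_and_phase(states)
```

The backward pass matters most near the start of the note: the forward filter knows nothing about x2 at the first sample, while the smoother recovers it from the samples that follow.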
Phase Estimation and Pairwise Unwrapped Phase Difference
[Figure: clarinet, pitch G#3 (written A#3 on Bb clarinet), over a crescendo]
Application of Phase Estimation – Using Pairwise Unwrapped Phase Difference to Test the Harmonicity Hypothesis
By projecting the unwrapped phase θi(s) from harmonic i to harmonic j, we visualize the unwrapped phase difference between harmonics in woodwinds and strings to test the harmonicity hypothesis.
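One plausible reading of "projecting" is to scale the unwrapped phase of harmonic i by j/i and subtract it from the unwrapped phase of harmonic j. This is an assumption, since the slide's formula is not in the transcript. If the partials are exactly harmonic, the difference stays constant over time; a drift indicates inharmonicity. The function names and the synthetic phases below are hypothetical.

```python
def projected_phase_difference(theta_i, theta_j, i, j):
    """Return d(s) = theta_j(s) - (j / i) * theta_i(s) for unwrapped phases."""
    return [tj - (j / i) * ti for ti, tj in zip(theta_i, theta_j)]


def spread(values):
    """Range of the difference; close to zero when the partials are harmonic."""
    return max(values) - min(values)


# Synthetic unwrapped phases (made-up numbers): the fundamental advances by
# 0.05 rad per sample; one third partial is exactly harmonic, the other is
# slightly sharp at 3.02 times the fundamental.
n = 200
theta_1 = [0.05 * s for s in range(n)]
theta_3_harmonic = [3 * 0.05 * s + 0.4 for s in range(n)]
theta_3_sharp = [3.02 * 0.05 * s + 0.4 for s in range(n)]

d_harmonic = projected_phase_difference(theta_1, theta_3_harmonic, 1, 3)
d_sharp = projected_phase_difference(theta_1, theta_3_sharp, 1, 3)
```

Plotting d(s) against s for each pair of harmonics gives the kind of visual test the slide describes; here the spread is used as a one-number summary.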
Damage Repair
Experiment
• A 45-second excerpt from the 2nd movement of Ravel’s Piano Concerto in G major
Excerpt from the 2nd movement of Ravel’s piano concerto in G major
Related work
• BSS
• NMF (non-negative “part-based representation” in NMF)
• Latent variable decomposition by Raj, Smaragdis
• Other score-guided separation by Dubnov
• “Informed Source Separation” using watermark by Parvaix
• Harmonic/Percussive Sound Separation (HPSS) by Sagayama, Ono
• Physical Acoustics, Fletcher
Conclusion and Future Work
• Harmonic-wise information redundancy, expressed in both amplitude and phase, can be used to infer “partially” damaged notes
• Creating a framework to perform separation/repair on a large scale with synthesized ground truth and with the BASS performance measurement by E. Vincent
• (coming soon) xavier.informatics.indiana.edu/~yushan/SAPA2010
FINE Thank you for your attention