Informed Source Separation of Orchestra and Soloist Using Masking and Unmasking

ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition (SAPA) 2010
By Yushen Han and Christopher Raphael, School of Informatics and Computing, Indiana University Bloomington
Saturday 25 September 2010, Makuhari, Japan

Motivation – Musical Source Separation in General
• To extract the orchestral accompaniment from any desired recording, given the score; this can be used in an automatic accompaniment system (e.g. for a piano concerto) or for karaoke
• To isolate one chosen instrument from an ensemble; this can be used in performance analysis of the soloist

Motivation – This Project
• Problem: separation procedures damage each of the separated sources
• Objective: to address the degradation of the separation results
• Strategy: exploit the information redundancy of the musical audio within each source

Overview
• Motivation and introduction – system diagram
• Previous – Separation by Spectrogram Masking
• Recent – Repair by Spectrogram Unmasking
  • harmonicity hypothesis tested by Kalman phase smoothing
  • repair by amplitude inference and harmonic transposition
• Examples with evaluation by spectrogram reassignment
• Relevant works – ISS?
• Conclusion and Discussion

Informed Source Separation System Diagram
[Figure: system diagram from system input to system output. Block labels: Score; Audio-score Alignment; Separation by Spectrogram Masking; Note-wise audio reconstruction; Note sample models; 2D Spectral Modeling; Note sample library; Phase Estimation by Kalman Smoothing; Harmonicity Hypothesis; Amplitude Inference; Evaluation According to BASS; Desoloed Audio (Damaged); Desoloed Audio (Repaired). Techniques labeled in the diagram: Expectation-Maximization (EM); Dynamic Programming (DP); Machine Learning (ML, binary classification); Harmonic-Percussive Separation (HPSS). Legend: Mostly Previous Work; Recent Development; Focus of This Paper.]

Score Following

Separation “Informed” by Score Following
[Figure: panels labeled solo and accompaniment]

Previous: (Binary) Spectrogram Masking
• Short-time Fourier Transform (STFT)
• Complementary binary masks (hard binary masks)
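The masking step on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `apply_complementary_masks`, the list-of-lists STFT layout, and the toy values are all assumptions made for the example. It only shows that a hard binary mask and its complement assign every time-frequency bin to exactly one source, so the two masked spectrograms add back up to the original.

```python
def apply_complementary_masks(stft, solo_mask):
    """Split an STFT with a hard binary mask and its complement.

    stft      -- list of frames, each a list of complex STFT bins
    solo_mask -- same shape; 1 where a bin is assigned to the solo, else 0
    Returns (solo, accompaniment), two spectrograms of the same shape.
    """
    solo, accomp = [], []
    for frame, mask_row in zip(stft, solo_mask):
        # Each bin goes entirely to one source (hard decision).
        solo.append([x * m for x, m in zip(frame, mask_row)])
        accomp.append([x * (1 - m) for x, m in zip(frame, mask_row)])
    return solo, accomp


# Toy 2-frame, 3-bin "spectrogram"; the values are made up for illustration.
stft = [[1 + 1j, 2 + 0j, 0.5j], [3 + 0j, 1 - 1j, 2j]]
solo_mask = [[1, 0, 0], [0, 1, 1]]
solo, accomp = apply_complementary_masks(stft, solo_mask)
```

Because the decision is all-or-nothing per bin, any bin where both sources had energy is lost to one of them, which is the damage the later slides set out to repair.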

Previous: 2D Note-based Model
• A “template” function q_m for the note model indexed by m

Phase Estimation
• Amplitude-Phase Decoupling Model: the signal at the hth harmonic is described by an amplitude and a phase. The amplitude is slowly varying; the phase is locally linear in s up to a small correction term.
• Phase unwrapping
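Phase unwrapping, named on this slide, can be sketched in a few lines. This is a generic textbook version under the assumption that the true phase changes by less than π between consecutive samples; the function name `unwrap` and the test ramp are illustrative, not taken from the talk.

```python
import math


def unwrap(phases):
    """Remove 2*pi jumps so that consecutive samples differ by at most pi."""
    out = [phases[0]]
    for p in phases[1:]:
        d = p - out[-1]
        # Subtract the nearest multiple of 2*pi from the step.
        d -= 2 * math.pi * round(d / (2 * math.pi))
        out.append(out[-1] + d)
    return out


# A phase that is exactly linear in s, wrapped into (-pi, pi] and recovered.
true_phase = [0.5 * s for s in range(40)]
wrapped = [math.atan2(math.sin(t), math.cos(t)) for t in true_phase]
unwrapped = unwrap(wrapped)
```

After unwrapping, a locally linear phase shows up as a straight line in s, which is what makes the small correction term visible.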

State-space Model for Phase (cont.)
• The idea of using a state-space model to estimate phase should be credited to A. Taylan Cemgil.
• For the observable phase sequence at the hth harmonic, we introduce a state vector with an unobservable component.
• As time s progresses (discretely), the state vector propagates via the state equation, where the state transition matrix governs the sinusoidal movement of the phase according to the average phase advance at harmonic h over a relatively long period (s0, s1), and w(s) is an unobservable, zero-mean random (state) perturbation.

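Illustration of State-space Model for Phase Estimation
[Figure: the state moving from time s to s + 1 in the (x1, x2) plane]
The state is x(s) = (x1(s), x2(s))^T, but we only observe y = x1. The observation equation y(s) = H(s) x(s) + r(s) connects the observed and the unobservable, where H(s) = [1 0] and r(s) = 0 is the degenerate random (observation) perturbation.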
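The equations on these two slides did not survive in the transcript. One way to write a model consistent with the text is sketched below. The symbol A_h for the state transition matrix, its rotation form, and the formula for the average phase advance ω_h (with θ_h the unwrapped phase at harmonic h) are assumptions made for illustration; only H(s) = [1 0], r(s) = 0, and the roles of w(s), s0, and s1 come from the slides.

```latex
\begin{aligned}
x(s+1) &= A_h\, x(s) + w(s), \qquad
A_h = \begin{pmatrix} \cos\omega_h & -\sin\omega_h \\ \sin\omega_h & \cos\omega_h \end{pmatrix}, \\
y(s) &= H(s)\, x(s) + r(s), \qquad
H(s) = \begin{pmatrix} 1 & 0 \end{pmatrix}, \quad r(s) = 0, \\
\omega_h &= \frac{\theta_h(s_1) - \theta_h(s_0)}{s_1 - s_0}.
\end{aligned}
```

Under this reading, x1 is the observed component, x2 is its unobserved quadrature partner, and the pair rotates by ω_h per sample, which is what "sinusoidal movement" suggests.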

Kalman Smoothing
• Following the state-space model, we can obtain the amplitude and phase.
• The state estimates can be computed by a Kalman filter, but since the phase estimation is offline, we can also update the state estimates backward by Kalman smoothing, incorporating the observations that were not yet “available” at a given sample in the forward pass.
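The forward filter and backward smoothing pass can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' code: the state transition is taken to be a rotation by a known phase advance `omega`, the backward pass is the standard Rauch-Tung-Striebel smoother, and the noise variances `q` and `r` are illustrative values (a tiny `r` stands in for the degenerate r(s) = 0 to keep the arithmetic well conditioned). Amplitude and phase are read off the smoothed state as its length and angle.

```python
import math


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inv2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def kalman_smooth(y, omega, q=1e-4, r=1e-6):
    """Kalman filter plus RTS smoother for x = (x1, x2), observing y = x1."""
    c, s = math.cos(omega), math.sin(omega)
    A = [[c, -s], [s, c]]      # assumed rotation-type transition matrix
    At = [[c, s], [-s, c]]     # its transpose
    x = [[y[0]], [0.0]]
    P = [[1.0, 0.0], [0.0, 1.0]]
    xf, Pf, xp, Pp = [], [], [], []
    for obs in y:
        xp.append(x)
        Pp.append(P)
        # Measurement update with H = [1 0].
        gain = [P[0][0] / (P[0][0] + r), P[1][0] / (P[0][0] + r)]
        innov = obs - x[0][0]
        x = [[x[0][0] + gain[0] * innov], [x[1][0] + gain[1] * innov]]
        P = [[P[i][j] - gain[i] * P[0][j] for j in range(2)] for i in range(2)]
        xf.append(x)
        Pf.append(P)
        # Time update (prediction for the next sample).
        x = mat_mul(A, x)
        AP = mat_mul(mat_mul(A, P), At)
        P = [[AP[0][0] + q, AP[0][1]], [AP[1][0], AP[1][1] + q]]
    # Backward pass: fold later observations into earlier estimates.
    xs = xf[:]
    for t in range(len(y) - 2, -1, -1):
        G = mat_mul(mat_mul(Pf[t], At), inv2(Pp[t + 1]))
        diff = [[xs[t + 1][i][0] - xp[t + 1][i][0]] for i in range(2)]
        corr = mat_mul(G, diff)
        xs[t] = [[xf[t][i][0] + corr[i][0]] for i in range(2)]
    amp = [math.hypot(v[0][0], v[1][0]) for v in xs]
    phase = [math.atan2(v[1][0], v[0][0]) for v in xs]
    return amp, phase


# Synthetic single harmonic with constant amplitude 2.0 (made-up test signal).
omega = 0.2
y = [2.0 * math.cos(omega * n + 0.3) for n in range(200)]
amp, phase = kalman_smooth(y, omega)
```

The returned phase is wrapped to (-π, π]; it would still need unwrapping before harmonics are compared.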

Phase Estimation and Pairwise Unwrapped Phase Difference
[Figure: clarinet, pitch G#3 (written A#3 on Bb clarinet), over a crescendo]

Application of Phase Estimation – Using Pairwise Unwrapped Phase Difference to Test the Harmonicity Hypothesis
• By projecting the unwrapped phase θi(s) from harmonic i to harmonic j, we visualize the unwrapped phase difference between harmonics in woodwinds and strings to test the harmonicity hypothesis.
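One plausible reading of "projecting" is to scale the unwrapped phase of harmonic i by j / i and subtract the unwrapped phase of harmonic j. The sketch below follows that reading, which is an assumption rather than something the slide spells out; the function name and the synthetic phases are illustrative. If the harmonics are exact integer multiples of the fundamental, the difference stays constant over time; drift indicates a departure from harmonicity.

```python
def projected_phase_difference(theta_i, theta_j, i, j):
    """Scale the unwrapped phase of harmonic i to harmonic j, minus theta_j."""
    return [ti * j / i - tj for ti, tj in zip(theta_i, theta_j)]


# Made-up unwrapped phases: a fundamental and two candidate third harmonics.
theta_1 = [0.1 * s + 0.2 for s in range(100)]
theta_3_harmonic = [0.3 * s + 1.0 for s in range(100)]      # exactly 3x
theta_3_inharmonic = [0.303 * s + 1.0 for s in range(100)]  # slightly sharp

flat = projected_phase_difference(theta_1, theta_3_harmonic, 1, 3)
drifting = projected_phase_difference(theta_1, theta_3_inharmonic, 1, 3)
```

Here `flat` stays at a constant offset while `drifting` moves steadily away from it, which is the kind of contrast a plot of the pairwise difference would make visible.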

Amplitude Inference: Sampling a Note

Amplitude Inference by Conditional Expectation

Damage Repair

Experiment
• A 45-second excerpt from the 2nd movement of Ravel’s Piano Concerto in G major

Excerpt from the 2nd movement of Ravel’s piano concerto in G major

Relevant works
• BSS
• NMF (non-negative “part-based representation” in NMF)
• Latent variable decomposition by Raj, Smaragdis
• Other score-guided separation by Dubnov
• “Informed Source Separation” using watermark by Parvaix
• Harmonic/Percussive Sound Separation (HPSS) by Sagayama, Ono
• Physical acoustics, Fletcher

Conclusion and Future Work
• Harmonic-wise information redundancy, expressed in both amplitude and phase, can be used to infer “partially” damaged notes
• Creating a framework to perform separation/repair on a large scale with synthesized ground truth and with the BASS performance measurement by E. Vincent
• (coming soon) xavier.informatics.indiana.edu/~yushan/SAPA2010

FINE Thank you for your attention
