Refinement in FTLP-HNM system for Speech Enhancement

Qin Yan
Communication & Multimedia Signal Processing Group
School of Engineering and Design, Brunel University
23 Nov, 2005

Outline
• Review of the FTLP-HNM system
• Parameter estimation of HNM (incl. pitch/harmonic tracking in noise)
• Objective results of pitch tracking, harmonic tracking and the FTLP-HNM system
• Demo of enhanced speech from old archive recordings

Overview of FTLP-HNM Speech Enhancement System

Harmonic plus Noise Model
In HNM, speech is decomposed into two parts: a harmonic part and a noise part.
• Harmonic part: L(t) denotes the number of harmonics included in the harmonic part and ω0 denotes the pitch frequency.
• Noise part: h is a time-varying autoregressive (AR) model and b is white Gaussian noise.
• Synthesized speech: the sum of the harmonic part and the noise part.
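The decomposition can be illustrated with a minimal sketch for a single short frame. It assumes constant harmonic amplitudes, a constant pitch and fixed AR coefficients within the frame, which is a simplification of the time-varying model on the slide. The function name `synthesize_hnm` and all of its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def synthesize_hnm(amps, f0, ar_coeffs, noise_gain, n_samples, fs, rng):
    """Return (speech, harmonic, noise) for one frame.

    Assumption: amplitudes, pitch f0 (Hz) and AR coefficients are held
    constant over the frame; the slide's model lets them vary with time.
    """
    t = np.arange(n_samples) / fs
    # Harmonic part: sum of L sinusoids at integer multiples of the pitch.
    harmonic = np.zeros(n_samples)
    for l, a in enumerate(amps, start=1):
        harmonic += a * np.cos(2 * np.pi * l * f0 * t)
    # Noise part: white Gaussian noise b shaped by an all-pole (AR) filter,
    # n[m] = g * b[m] - sum_k a_k * n[m - k].
    b = rng.standard_normal(n_samples)
    noise = np.zeros(n_samples)
    for m in range(n_samples):
        acc = noise_gain * b[m]
        for k in range(1, len(ar_coeffs) + 1):
            if m - k >= 0:
                acc -= ar_coeffs[k - 1] * noise[m - k]
        noise[m] = acc
    return harmonic + noise, harmonic, noise

rng = np.random.default_rng(0)
speech, harmonic, noise = synthesize_hnm(
    amps=[1.0, 0.5, 0.25], f0=200.0, ar_coeffs=[-0.5],
    noise_gain=0.05, n_samples=400, fs=8000, rng=rng)
```

With a 200 Hz pitch, 400 samples and an 8 kHz sampling rate, the frame holds a whole number of pitch periods, so the strongest FFT bin of the harmonic part falls exactly on the pitch.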

HNM - Pitch Tracking
• The error function is defined in the frequency domain.
• In noisy conditions the error function is modified to include SNR-dependent weights; the weighting function W(l) is SNR-dependent.
• NOTE:
• The input speech frame is bandpass filtered to eliminate the parts which don't contain explicit harmonics.
• For each speech frame, the tracker outputs several pitch candidates (N=3), and the Viterbi algorithm then generates the final pitch tracks.
• It might be useful to have candidates from both this method and the traditional autocorrelation method.
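The candidate selection step can be sketched as a small Viterbi search. The slides do not give the cost terms, so this sketch assumes that each candidate carries a local cost (for example its value of the error function) and that the transition cost is the absolute log ratio between consecutive pitch values. The function name `viterbi_pitch_track` and the `jump_weight` parameter are hypothetical.

```python
import numpy as np

def viterbi_pitch_track(candidates, local_costs, jump_weight=1.0):
    """Pick one pitch candidate per frame by minimising total path cost.

    candidates, local_costs: arrays of shape (frames, N), pitch in Hz (> 0).
    Assumed transition cost: jump_weight * |log(f_t / f_{t-1})|.
    """
    cands = np.asarray(candidates, dtype=float)
    costs = np.asarray(local_costs, dtype=float)
    n_frames, n_cands = cands.shape
    total = costs[0].copy()
    back = np.zeros((n_frames, n_cands), dtype=int)
    for t in range(1, n_frames):
        # trans[i, j]: cost of moving from candidate i (frame t-1) to j (frame t)
        trans = jump_weight * np.abs(
            np.log(cands[t][None, :] / cands[t - 1][:, None]))
        scores = total[:, None] + trans
        back[t] = np.argmin(scores, axis=0)
        total = scores[back[t], np.arange(n_cands)] + costs[t]
    # Backtrack from the cheapest final state.
    path = np.zeros(n_frames, dtype=int)
    path[-1] = int(np.argmin(total))
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return cands[np.arange(n_frames), path]

# Frame 2 has its lowest local cost at 202 Hz (an octave error).
cands = [[100.0, 200.0, 50.0], [101.0, 202.0, 50.5], [99.0, 198.0, 300.0]]
costs = [[0.1, 0.5, 0.6], [0.3, 0.2, 0.6], [0.1, 0.5, 0.7]]
track = viterbi_pitch_track(cands, costs)
```

In this toy input a frame-by-frame minimum would jump to 202 Hz in the middle frame, while the path search keeps the track near 100 Hz because the octave jump is penalised.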

Results of Pitch Tracking
Figure - Comparison of the performance of different pitch tracking methods for speech in (a) train noise and (b) car noise, from 0 dB SNR to clean.

HNM - Harmonic Tracking
Block diagram (labels only): noisy speech, FFT, peak picking, harmonic track candidates, tracking, harmonic frequency bin tracks, pitch tracking, VAD, noise model, harmonic magnitude smoothed by Kalman filter.
• The data structure of the harmonic track candidates has been improved, which speeds up the whole system.
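The Kalman smoothing of a harmonic magnitude track can be sketched with a scalar filter. The slides do not state the state model, so this sketch assumes a random-walk model for the magnitude of one harmonic across frames, with hand-picked process and observation variances. The function name `kalman_smooth_magnitude` and both variance parameters are hypothetical, and the code is a causal forward filter only.

```python
import numpy as np

def kalman_smooth_magnitude(observed, process_var=1e-3, obs_var=9e-2):
    """Scalar Kalman filter over one harmonic's magnitude track.

    Assumed model: x[k] = x[k-1] + w,  y[k] = x[k] + v,
    with var(w) = process_var and var(v) = obs_var.
    """
    observed = np.asarray(observed, dtype=float)
    est = np.empty_like(observed)
    x, p = observed[0], obs_var      # initialise from the first observation
    est[0] = x
    for k in range(1, len(observed)):
        p = p + process_var                  # predict: uncertainty grows
        gain = p / (p + obs_var)             # Kalman gain
        x = x + gain * (observed[k] - x)     # correct with the new magnitude
        p = (1.0 - gain) * p
        est[k] = x
    return est

rng = np.random.default_rng(0)
true_mag = np.ones(200)
noisy_mag = true_mag + 0.3 * rng.standard_normal(200)
smoothed = kalman_smooth_magnitude(noisy_mag)
```

A smaller `process_var` gives a smoother track that reacts more slowly to genuine magnitude changes; the values here are illustrative only.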

Results of Harmonic Tracking in Clean Speech
Figure - An illustration of pitch tracks of a speech segment at a sampling frequency of 8 kHz.

Results of Harmonic Tracking in Noisy Speech
Figure panels: harmonic recovery, pitch recovery.

Synthesis of Excitation by HNM
Separate expressions are given for the voiced excitation and the unvoiced excitation, where b(m) is unit white Gaussian noise, e(m) is the original excitation and a denotes the phases of the original excitation.

Results of Speech Enhancement
Enhanced speech is synthesized by inverse filtering the HNM residual with the cleaned LP shape.
Figure - Comparison of the harmonicity of the MMSE and FTLP-HNM systems on speech in train noise at different SNRs.
Figure - Performance of MMSE and FTLP-HNM on speech in train noise at different SNR levels.
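The synthesis step can be sketched under one possible reading: the HNM residual (excitation) drives the all-pole filter 1/A(z) built from the cleaned LP coefficients. That reading is an assumption, as are the function names `lp_synthesis` and `lp_residual` and the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p.

```python
import numpy as np

def lp_synthesis(excitation, lp_coeffs):
    """All-pole filter 1/A(z): s[m] = e[m] - sum_k a_k * s[m - k]."""
    e = np.asarray(excitation, dtype=float)
    a = np.asarray(lp_coeffs, dtype=float)
    s = np.zeros_like(e)
    for m in range(len(e)):
        acc = e[m]
        for k in range(1, len(a) + 1):
            if m - k >= 0:
                acc -= a[k - 1] * s[m - k]
        s[m] = acc
    return s

def lp_residual(signal, lp_coeffs):
    """FIR analysis filter A(z): e[m] = s[m] + sum_k a_k * s[m - k]."""
    s = np.asarray(signal, dtype=float)
    a = np.asarray(lp_coeffs, dtype=float)
    e = s.copy()
    for k in range(1, len(a) + 1):
        e[k:] += a[k - 1] * s[:-k]
    return e

# Hypothetical cleaned LP coefficients; poles at 0.5 and 0.4, so the filter is stable.
clean_lp = [-0.9, 0.2]
excitation = np.random.default_rng(0).standard_normal(50)
enhanced = lp_synthesis(excitation, clean_lp)
```

Applying `lp_residual` to the output with the same coefficients returns the excitation, which is a quick check that the two filters are inverses of each other.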

Demo (1): Persian speech of the Iranian king Mozaffareddin Shah (original speech, enhanced speech)

Demo (2): Florence Nightingale, 1890 (original speech, enhanced speech)
